Merge pull request #391 from JaikumarSRajan/master
RELEASE-1.13 Issue #SC-543 feat: Organisation External Id ETL Jobs documentation
Basreena authored Feb 27, 2019
2 parents 84c9d72 + 2750247 commit f972aae
Showing 3 changed files with 150 additions and 0 deletions.
50 changes: 50 additions & 0 deletions developer-docs/installation/org_channel_migration.md
@@ -0,0 +1,50 @@
---
title: Organisation channel migration
page_title: Organisation channel migration
description: Organisation channel migration
published: true
allowSearch: true
---

## Overview
From release 1.13 onwards, Sunbird captures the channel of the tenant organisation for every sub-organisation. This value is stored in the `channel` column of the `organisation` table in the Cassandra database. Sub-organisations created before release 1.13 do not have a channel value. To keep the data consistent, run this migration to set the channel value for all existing sub-organisations based on their root organisation ID.

## Prerequisites

To run the migration script, ensure you have:

1. Access to the Cassandra database
2. A backup of the sunbird keyspace in the Cassandra database

## Configuration Parameters
Pass the following parameters as arguments to the channel value migration job:

S.No. | Parameter | Description | Example
-------|-----------|-------------|---------
1 | sunbird_cassandra_server | Cassandra DB IP Address| 198.168.1.1
2 | sunbird_cassandra_port | Cassandra DB Port Number | 9042
3 | sunbird_cassandra_username* | Cassandra DB User Name | username
4 | sunbird_cassandra_password* | Cassandra DB Password | password
5 | sunbird_cassandra_keyspace | Cassandra DB Keyspace Name | demodb
6 | sunbird_org_channel_migration_log_file | Path to the CSV file where migration logs are stored | /home/channel_migration_log.csv

> Note: If authentication is not required, pass `""` for the parameters marked with an asterisk (username and password).

## Migration Script

To migrate the channel value for organisations:

1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgMigrationUpdateChannelBin.zip), which contains the script for channel value migration.

2. Run the following command to migrate the data:
<pre>
OrgMigrationUpdateChannel_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_cassandra_keyspace="{sunbird_cassandra_keyspace}" --context_param sunbird_org_channel_migration_log_file="{sunbird_org_channel_migration_log_file}"
</pre>
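
Because the command is long, it can help to assemble it from shell variables and review it before running it. The following is a minimal sketch of such a dry run, not part of the shipped scripts. The helper name `build_args` is hypothetical, and every value is a placeholder taken from the Example column of the table above.

```shell
#!/bin/sh
# Dry run: assemble the channel migration command and print it for review.
# Every value below is a placeholder, not a real connection detail.
set -eu

sunbird_cassandra_server="198.168.1.1"
sunbird_cassandra_port="9042"
sunbird_cassandra_username=""   # empty string: authentication not required
sunbird_cassandra_password=""   # empty string: authentication not required
sunbird_cassandra_keyspace="demodb"
sunbird_org_channel_migration_log_file="/home/channel_migration_log.csv"

# Print one ' --context_param name="value"' pair per parameter.
# printf reuses its format string for each remaining name/value pair.
build_args() {
  printf ' --context_param %s="%s"' \
    sunbird_cassandra_server "$sunbird_cassandra_server" \
    sunbird_cassandra_port "$sunbird_cassandra_port" \
    sunbird_cassandra_username "$sunbird_cassandra_username" \
    sunbird_cassandra_password "$sunbird_cassandra_password" \
    sunbird_cassandra_keyspace "$sunbird_cassandra_keyspace" \
    sunbird_org_channel_migration_log_file "$sunbird_org_channel_migration_log_file"
}

# Print the command instead of executing it.
echo "OrgMigrationUpdateChannel_run.sh$(build_args)"
```

Once the printed command looks right, it can be copied and run as described in step 2.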

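On successful migration, the log is available in the configured file `{sunbird_org_channel_migration_log_file}`. To check that every organisation now has a channel value, run the following queries and compare the results. Because `count(channel)` counts only the rows in which `channel` is not null, the two counts match when every organisation has a channel value.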

- Query to fetch the number of organisations:
  `select count(*) from organisation;`
- Query to fetch the number of organisations with a channel value:
  `select count(channel) from organisation;`
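
The queries above can also be run non-interactively with `cqlsh`. The sketch below assumes that `cqlsh` is installed and that it prints the count on a line of its own, as in its default table layout. The helper names `extract_count` and `run_count` are hypothetical, and the sample text is made up so that the parsing step can be tried without a database.

```shell
#!/bin/sh
# Sketch: pull the numeric result out of cqlsh output for a count query.

# Print the first line that consists only of digits (plus surrounding blanks).
extract_count() {
  awk '/^[ \t]*[0-9]+[ \t]*$/ { print $1; exit }'
}

# Hypothetical helper: run one count query through cqlsh.
# Arguments: host, port, keyspace, query. Add -u/-p if authentication is required.
run_count() {
  cqlsh -k "$3" -e "$4" "$1" "$2" | extract_count
}

# Offline demonstration with sample text shaped like cqlsh output (not real data).
sample=' count
-------
   128

(1 rows)'
printf '%s\n' "$sample" | extract_count
```

Calling `run_count` once per query gives two numbers. If they are equal, every organisation has a channel value.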

50 changes: 50 additions & 0 deletions developer-docs/installation/org_external_identity_migration.md
@@ -0,0 +1,50 @@
---
title: Organisation external identity migration
page_title: Organisation external identity migration
description: Organisation external identity migration
published: true
allowSearch: true
---

## Overview
From release 1.13 onwards, Sunbird stores the external ID details of organisations in a separate table (`org_external_identity`) in the Cassandra database. For organisations created before this release, the external ID details were stored in the `organisation` table. Running the migration script moves those details to the new table to keep the data consistent.

## Prerequisites

To run the migration script, ensure you have:

1. Access to the Cassandra database
2. A backup of the sunbird keyspace in the Cassandra database

## Configuration Parameters
Pass the following parameters as arguments to the organisation external identity migration job:

S.No. | Parameter | Description | Example
-------|-----------|-------------|---------
1 | sunbird_cassandra_server | Cassandra DB IP Address| 198.168.1.1
2 | sunbird_cassandra_port | Cassandra DB Port Number | 9042
3 | sunbird_cassandra_username* | Cassandra DB User Name | username
4 | sunbird_cassandra_password* | Cassandra DB Password | password
5 | sunbird_cassandra_keyspace | Cassandra DB Keyspace Name | demodb
6 | sunbird_org_externalid_migration_log_file | Path to the CSV file where migration logs are stored | /home/externalid_migration_log.csv

> Note: If authentication is not required, pass `""` for the parameters marked with an asterisk (username and password).

## Migration Script

To migrate the external identity values of organisations:

1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgExternalIdentityMigrationBin.zip), which contains the script for external identity migration.

2. Run the following command to migrate the data:
<pre>
OrgExternalIdentityMigration_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_cassandra_keyspace="{sunbird_cassandra_keyspace}" --context_param sunbird_org_externalid_migration_log_file="{sunbird_org_externalid_migration_log_file}"
</pre>

On successful migration, the log is available in the configured file `{sunbird_org_externalid_migration_log_file}`. To check that all external IDs have been populated in the `org_external_identity` table, run the following queries and compare the results.

- Query to fetch the number of organisations with an external ID:
  `select count(externalid) from organisation;`
- Query to fetch the number of records in the `org_external_identity` table:
  `select count(*) from org_external_identity;`
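
The two results can be compared with a few lines of shell. This is a sketch under one assumption: the migration writes one `org_external_identity` row for each organisation that has an `externalid`, so the counts are expected to be equal. The function name `check_counts` and the sample numbers are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the counts returned by the two verification queries.
# Assumption: one org_external_identity row per organisation with an externalid.
check_counts() {
  source_count=$1   # result of: select count(externalid) from organisation;
  target_count=$2   # result of: select count(*) from org_external_identity;
  if [ "$source_count" -eq "$target_count" ]; then
    echo "OK: counts match ($source_count)"
  else
    echo "MISMATCH: organisation=$source_count org_external_identity=$target_count"
    return 1
  fi
}

# Example with made-up counts.
check_counts 250 250
```

A non-zero return status signals a mismatch, in which case the migration log file is the first place to look.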

50 changes: 50 additions & 0 deletions developer-docs/installation/sync_org.md
@@ -0,0 +1,50 @@
---
title: Syncing Organisation Data
page_title: Syncing Organisation Data
description: Details on how to sync organisation's data
keywords: sync, org sync, org sync job
allowSearch: true
published: true
---

## Overview

In Sunbird, all write operations are done in Cassandra and all read operations are done through Elasticsearch. Currently, after a write operation completes in Cassandra, the data is written asynchronously to Elasticsearch.
Running a Cassandra migration affects organisation data, such as organisation records and their external IDs. This document describes the script that syncs organisation data from Cassandra to Elasticsearch.

## Prerequisites

To sync organisation data, ensure you have:

1. Access to the Cassandra database
2. An API key to access the Sync API

## Configuration Parameters

Pass the following parameters as arguments for the organisation sync job:

S.No. | Parameter | Description | Example
-------|-----------|-------------|---------
1 | sunbird_cassandra_server | The IP address of the Cassandra DB. This parameter is used to identify the server on which the Cassandra DB runs. The system uses the details provided to connect to the database.| 198.168.1.1
2 | sunbird_cassandra_port | The port number of the Cassandra DB. This parameter is used to identify the port on which the Cassandra DB runs. The system uses the details provided to connect to the database.| 9042
3 | sunbird_cassandra_username* | The user name for the Cassandra DB. This parameter is used to authenticate the user accessing the DB. | username
4 | sunbird_cassandra_password* | The password for the Cassandra DB. This parameter is used to authenticate the user accessing the DB.| password
5 | sunbird_sync_api_endpoint | The endpoint of the [Sync API](http://docs.sunbird.org/latest/apis/datasyncapi/#tag/Data-Sync-API(s)). | {{domain}}/api/data/v1/index/sync
6 | sunbird_sync_api_key | The API Key to access the Sync API. | As23456789zws34567w234
7 | sunbird_sync_block_size | The number of org records to be synced per API call. | 1000
8 | sunbird_sync_sleep_time | The time interval in milliseconds between API calls. | 5000

> Note: If access to Cassandra does not require authentication, pass `""` for the **username** and **password** parameters (marked with an asterisk).

## Syncing Organisation Data

To sync organisation data from Cassandra to Elasticsearch:

1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgSyncBin.zip), which contains the script to sync organisation data.

2. Run the following command to sync all organisation data:

````
OrgSync_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_sync_api_endpoint="{sunbird_sync_api_endpoint}" --context_param sunbird_sync_api_key="{sunbird_sync_api_key}" --context_param sunbird_sync_block_size="{sunbird_sync_block_size}" --context_param sunbird_sync_sleep_time="{sunbird_sync_sleep_time}"
````
On completion, refer to the generated success and failure logs.
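
When choosing values for `sunbird_sync_block_size` and `sunbird_sync_sleep_time`, a rough estimate of the number of API calls and the total waiting time can be useful. The sketch below assumes that the job sends one API call per block and sleeps only between consecutive calls. Neither point is confirmed here, so treat the result as an estimate. The function name `estimate_sync` and the organisation count are hypothetical.

```shell
#!/bin/sh
# Sketch: estimate API calls and total sleep time for one sync run.
# Assumptions: one API call per block, one sleep between consecutive calls.
estimate_sync() {
  total_orgs=$1     # number of organisation records to sync (made-up input)
  block_size=$2     # sunbird_sync_block_size
  sleep_ms=$3       # sunbird_sync_sleep_time, in milliseconds
  # Ceiling division: a partly filled last block still needs its own call.
  calls=$(( (total_orgs + block_size - 1) / block_size ))
  pauses=0
  if [ "$calls" -gt 1 ]; then
    pauses=$(( calls - 1 ))
  fi
  echo "calls=$calls sleep_ms_total=$(( pauses * sleep_ms ))"
}

# 2500 organisations with the example values from the table above.
estimate_sync 2500 1000 5000
```

With these inputs the sketch reports three calls and 10000 ms of sleep in total. The time taken by the API calls themselves is not included.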
