Merge pull request #391 from JaikumarSRajan/master
RELEASE-1.13 Issue #SC-543 feat: Organisation External Id ETL Jobs documentation
Showing 3 changed files with 150 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
--- | ||
title: Organisation channel migration | ||
page_title: Organisation channel migration | ||
description: Organisation channel migration | ||
published: true | ||
allowSearch: true | ||
--- | ||
|
||
## Overview | ||
From release 1.13 onwards, Sunbird captures the channel details of the tenant organisation for all sub-organisations. This data is stored in the channel column of the organisation table in the Cassandra database. Sub-organisations created before this release do not have channel values. To ensure data consistency, run this migration to set the channel value for all existing sub-organisations based on their root organisation ID. | ||
|
||
## Prerequisites | ||
|
||
To run the migration script, ensure you have: | ||
|
||
1. Access to the Cassandra database. | ||
2. A backup of the sunbird keyspace in the Cassandra database. | ||
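
One common way to take such a backup is a Cassandra snapshot. The following is a minimal sketch, assuming the keyspace is named `sunbird` (as in the prerequisite above) and that `nodetool` is available on each Cassandra node. The tag format and the `RUN` switch are hypothetical and not part of the Sunbird tooling. By default the script only prints the command; set `RUN=1` to execute it.

```shell
#!/bin/sh
# Assumptions (not from the Sunbird docs): keyspace name, tag format, RUN switch.
KEYSPACE="${KEYSPACE:-sunbird}"
SNAPSHOT_TAG="${SNAPSHOT_TAG:-pre-migration-$(date +%Y%m%d)}"

# Print the snapshot command for a given tag and keyspace.
snapshot_cmd() {
  printf 'nodetool snapshot -t %s %s\n' "$1" "$2"
}

# Show what would be run.
snapshot_cmd "$SNAPSHOT_TAG" "$KEYSPACE"

# Execute only when explicitly requested. A snapshot covers the local node
# only, so repeat this on every node of the cluster.
if [ "${RUN:-0}" = "1" ]; then
  nodetool snapshot -t "$SNAPSHOT_TAG" "$KEYSPACE"
fi
```

Keeping the tag in a variable makes it easier to locate or clear the snapshot once the migration has been verified.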
|
||
## Configuration Parameters | ||
Pass the following parameters as arguments to the channel value migration job: | ||
|
||
S.No. | Parameter | Description | Example | ||
-------|-----------|-------------|--------- | ||
1 | sunbird_cassandra_server | Cassandra DB IP Address| 198.168.1.1 | ||
2 | sunbird_cassandra_port | Cassandra DB Port Number | 9042 | ||
3 | sunbird_cassandra_username* | Cassandra DB User Name | username | ||
4 | sunbird_cassandra_password* | Cassandra DB Password | password | ||
5 | sunbird_cassandra_keyspace | Cassandra DB Keyspace Name | demodb | ||
6 | sunbird_org_channel_migration_log_file | Path to the CSV file where migration logs are stored | /home/channel_migration_log.csv | ||
|
||
> Note: If authentication is not required, pass `""` for the parameters marked with an asterisk (*), that is, the username and password. | ||
## Migration Script | ||
|
||
To migrate the channel value for the organisations: | ||
|
||
1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgMigrationUpdateChannelBin.zip), which contains the script for the channel value migration. | ||
|
||
2. Run the following command to migrate the data: | ||
<pre> | ||
OrgMigrationUpdateChannel_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_cassandra_keyspace="{sunbird_cassandra_keyspace}" --context_param sunbird_org_channel_migration_log_file="{sunbird_org_channel_migration_log_file}" | ||
</pre> | ||
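
Because the command is long, it can help to keep the values in variables and build the argument list in one place. The following is a minimal sketch under stated assumptions: the variable names, the `DRY_RUN` switch, the keyspace name `sunbird`, and the log path are illustrative and not part of the migration package. Only the script name and the `--context_param` keys come from the command above. Empty username and password values correspond to passing `""` when authentication is not required.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace them with your own.
CASSANDRA_SERVER="${CASSANDRA_SERVER:-198.168.1.1}"
CASSANDRA_PORT="${CASSANDRA_PORT:-9042}"
CASSANDRA_USERNAME="${CASSANDRA_USERNAME:-}"   # empty: no authentication
CASSANDRA_PASSWORD="${CASSANDRA_PASSWORD:-}"   # empty: no authentication
CASSANDRA_KEYSPACE="${CASSANDRA_KEYSPACE:-sunbird}"
MIGRATION_LOG_FILE="${MIGRATION_LOG_FILE:-/tmp/channel_migration_log.csv}"

run_channel_migration() {
  # Build the argument list as positional parameters of this function.
  set -- \
    --context_param "sunbird_cassandra_server=${CASSANDRA_SERVER}" \
    --context_param "sunbird_cassandra_port=${CASSANDRA_PORT}" \
    --context_param "sunbird_cassandra_username=${CASSANDRA_USERNAME}" \
    --context_param "sunbird_cassandra_password=${CASSANDRA_PASSWORD}" \
    --context_param "sunbird_cassandra_keyspace=${CASSANDRA_KEYSPACE}" \
    --context_param "sunbird_org_channel_migration_log_file=${MIGRATION_LOG_FILE}"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (default): print the script name and one argument per line.
    printf '%s\n' "./OrgMigrationUpdateChannel_run.sh" "$@"
  else
    ./OrgMigrationUpdateChannel_run.sh "$@"
  fi
}

run_channel_migration
```

Run it with `DRY_RUN=0` from the directory where the archive was extracted once the printed arguments look right.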
|
||
On successful migration, the log is available in the configured file {sunbird_org_channel_migration_log_file}. To cross-check that every organisation has a channel value, run the following queries. The two counts should match, because `count(channel)` counts only the rows with a non-null channel. | ||
|
||
- Query to fetch the number of organisations | ||
```select count(*) from organisation;``` | ||
- Query to fetch the number of organisations with a channel value | ||
```select count(channel) from organisation;``` | ||
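
The comparison of the two counts can be scripted. The following is a minimal sketch, assuming the queries are run through `cqlsh` with its `-k` (keyspace) and `-e` (execute) options and that the result is printed in the usual tabular form with the number on a line of its own. The helper names `extract_count` and `counts_match` are hypothetical, and the sample output is made up for illustration.

```shell
#!/bin/sh
# Pull the first digits-only line out of cqlsh-style tabular output.
extract_count() {
  grep -E '^[[:space:]]*[0-9]+[[:space:]]*$' | head -n 1 | tr -d '[:space:]'
}

# Succeed only when both counts are equal.
counts_match() {
  [ "$1" -eq "$2" ]
}

# Against a real cluster the counts could be fetched like this (assumed host,
# port, and keyspace variables):
#   total="$(cqlsh "$HOST" "$PORT" -k "$KEYSPACE" \
#     -e 'select count(*) from organisation;' | extract_count)"
#   with_channel="$(cqlsh "$HOST" "$PORT" -k "$KEYSPACE" \
#     -e 'select count(channel) from organisation;' | extract_count)"

# Illustrative run on made-up output.
sample=' count
-------
   128

(1 rows)'
total="$(printf '%s\n' "$sample" | extract_count)"
with_channel="$total"

if counts_match "$total" "$with_channel"; then
  echo "OK: all $total organisations have a channel value"
else
  echo "MISMATCH: $with_channel of $total organisations have a channel value"
fi
```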
|
50 changes: 50 additions & 0 deletions
developer-docs/installation/org_external_identity_migration.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
--- | ||
title: Organisation external identity migration | ||
page_title: Organisation external identity migration | ||
description: Organisation external identity migration | ||
published: true | ||
allowSearch: true | ||
--- | ||
|
||
## Overview | ||
From release 1.13 onwards, Sunbird captures the external ID details of organisations in a separate table (org_external_identity) in the Cassandra database. For organisations created before this release, the external ID details were captured in the organisation table. Running the migration script moves those details to the new table to ensure data consistency. | ||
|
||
## Prerequisites | ||
|
||
To run the migration script, ensure you have: | ||
|
||
1. Access to the Cassandra database. | ||
2. A backup of the sunbird keyspace in the Cassandra database. | ||
|
||
## Configuration Parameters | ||
Pass the following parameters as arguments to the organisation external identity migration job: | ||
|
||
S.No. | Parameter | Description | Example | ||
-------|-----------|-------------|--------- | ||
1 | sunbird_cassandra_server | Cassandra DB IP Address| 198.168.1.1 | ||
2 | sunbird_cassandra_port | Cassandra DB Port Number | 9042 | ||
3 | sunbird_cassandra_username* | Cassandra DB User Name | username | ||
4 | sunbird_cassandra_password* | Cassandra DB Password | password | ||
5 | sunbird_cassandra_keyspace | Cassandra DB Keyspace Name | demodb | ||
6 | sunbird_org_externalid_migration_log_file | Path to the CSV file where migration logs are stored | /home/externalid_migration_log.csv | ||
|
||
> Note: If authentication is not required, pass `""` for the parameters marked with an asterisk (*), that is, the username and password. | ||
## Migration Script | ||
|
||
To migrate the external identity values for the organisations: | ||
|
||
1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgExternalIdentityMigrationBin.zip), which contains the script for the external identity migration. | ||
|
||
2. Run the following command to migrate the data: | ||
<pre> | ||
OrgExternalIdentityMigration_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_cassandra_keyspace="{sunbird_cassandra_keyspace}" --context_param sunbird_org_externalid_migration_log_file="{sunbird_org_externalid_migration_log_file}" | ||
</pre> | ||
|
||
On successful migration, the log is available in the configured file {sunbird_org_externalid_migration_log_file}. To cross-check whether all external IDs have been populated in the org_external_identity table, use the following queries: | ||
|
||
- Query to fetch the number of organisations with an external ID | ||
```select count(externalid) from organisation;``` | ||
- Query to fetch the number of records in the org_external_identity table | ||
```select count(*) from org_external_identity;``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
--- | ||
title: Syncing Organisation Data | ||
page_title: Syncing Organisation Data | ||
description: Details on how to sync organisation's data | ||
keywords: sync, org sync, org sync job | ||
allowSearch: true | ||
published: true | ||
--- | ||
|
||
## Overview | ||
|
||
In Sunbird, all write operations are performed in Cassandra and all read operations are served through Elasticsearch. Currently, after a write operation completes in Cassandra, the data is written asynchronously to Elasticsearch. | ||
Running a Cassandra migration affects organisation data, such as the organisation external ID. This document describes the script that syncs organisation data from Cassandra to Elasticsearch. | ||
|
||
## Prerequisites | ||
|
||
To sync organisation data, ensure you have: | ||
|
||
1. Access to the Cassandra database. | ||
2. An API key to access the Sync API. | ||
|
||
## Configuration Parameters | ||
|
||
Pass the following parameters as arguments for the organisation sync job: | ||
|
||
S.No. | Parameter | Description | Example | ||
-------|-----------|-------------|--------- | ||
1 | sunbird_cassandra_server | The IP address of the Cassandra DB. This parameter is used to identify the server on which the Cassandra DB runs. The system uses the details provided to connect to the database.| 198.168.1.1 | ||
2 | sunbird_cassandra_port | The port number of the Cassandra DB. This parameter is used to identify the port on which the Cassandra DB runs. The system uses the details provided to connect to the database.| 9042 | ||
3 | sunbird_cassandra_username* | The user name for the Cassandra DB. This parameter is used to authenticate the user accessing the DB. | [email protected] | ||
4 | sunbird_cassandra_password* | The password for the Cassandra DB. This parameter is used to authenticate the user accessing the DB.| password | ||
5 | sunbird_sync_api_endpoint | The endpoint of the [Sync API](http://docs.sunbird.org/latest/apis/datasyncapi/#tag/Data-Sync-API(s)). | {{domain}}/api/data/v1/index/sync | ||
6 | sunbird_sync_api_key | The API Key to access the Sync API. | As23456789zws34567w234 | ||
7 | sunbird_sync_block_size | The number of org records to be synced per API call. | 1000 | ||
8 | sunbird_sync_sleep_time | The time interval in milliseconds between API calls. | 5000 | ||
|
||
> Note: If access to Cassandra does not require authentication, pass `""` as the value for the **username** and **password** parameters. | ||
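
The block size and sleep time together determine how many API calls a sync makes and how long it spends waiting. The following is a rough estimate only, assuming the job sleeps once between consecutive API calls; that behaviour, the helper name `estimate_sync`, and the organisation count of 4500 are assumptions for illustration, not documented behaviour of the sync job.

```shell
#!/bin/sh
# Estimate the number of API calls and the total wait time in milliseconds.
# Arguments: total organisations, block size, sleep time in milliseconds.
estimate_sync() {
  total_orgs=$1
  block_size=$2
  sleep_ms=$3
  # Ceiling division: a partially filled last block still needs one call.
  calls=$(( (total_orgs + block_size - 1) / block_size ))
  if [ "$calls" -gt 0 ]; then
    # Assumption: one pause between each pair of consecutive calls.
    wait_ms=$(( (calls - 1) * sleep_ms ))
  else
    wait_ms=0
  fi
  echo "$calls $wait_ms"
}

# 4500 organisations, block size 1000, sleep time 5000 ms
estimate_sync 4500 1000 5000   # prints: 5 20000
```

With the example values from the table, 4500 organisations would take 5 calls and at least 20 seconds of waiting, not counting the time each call takes.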
## Syncing Organisation Data | ||
|
||
To sync organisation data from Cassandra to Elasticsearch: | ||
|
||
1. Extract the archive file (sunbird-utils/cassandra-migration-etl/r1.13/OrgSyncBin.zip), which contains the script to sync the organisation data. | ||
|
||
2. Run the following command to sync all organisation data: | ||
|
||
```` | ||
OrgSync_run.sh --context_param sunbird_cassandra_server="{sunbird_cassandra_server}" --context_param sunbird_cassandra_port="{sunbird_cassandra_port}" --context_param sunbird_cassandra_username="{sunbird_cassandra_username}" --context_param sunbird_cassandra_password="{sunbird_cassandra_password}" --context_param sunbird_sync_api_endpoint="{sunbird_sync_api_endpoint}" --context_param sunbird_sync_api_key="{sunbird_sync_api_key}" --context_param sunbird_sync_block_size="{sunbird_sync_block_size}" --context_param sunbird_sync_sleep_time="{sunbird_sync_sleep_time}" | ||
```` | ||
On completion, refer to the generated success and failure logs. |