fivetran · fivetran-abdulsalam · Jun 18, 2024
diff --git a/development-guide.md b/development-guide.md
@@ -105,12 +105,14 @@ This operation should report all columns in the destination table, including Fiv
 - This operation might be requested for a table that does not exist in the destination. In that case, it should NOT fail, simply ignore the request and return `success = true`.
 - `utc_delete_before` has millisecond precision.
 
-#### WriteBatchRequest
+#### WriteBatchRequest 
 - `replace_files` is for `upsert` operation where the rows should be inserted if they don't exist or updated if they do. Each row will always provide values for all columns. Set the `_fivetran_synced` column in the destination with the values coming in from the csv files.
 
 - `update_files` is for `update` operation where modified columns have actual values whereas unmodified columns have the special value `unmodified_string` in `CsvFileParams`. Soft-deleted rows will arrive in here as well. Update the `_fivetran_synced` column in the destination with the values coming in from the csv files.
 
-- `delete_files` is for `hard delete` operation. Use primary key columns (or `_fivetran_id` system column for primary-keyless tables) to perform `DELETE FROM`.
+- `delete_files` is for `hard delete` operation. Use primary key columns (or `_fivetran_id` system column for primary-keyless tables) to perform `DELETE FROM`.   
+
+> Note: To handle `replace_files`, `update_files`, and `delete_files` in history mode, follow the [How to Handle History Mode Batch Files](how-to-handle-history-mode-batch-files.md) guide.
 
 Also, Fivetran will deduplicate operations such that each primary key will show up only once in any of the operations
 

diff --git a/how-to-handle-history-mode-batch-files.md b/how-to-handle-history-mode-batch-files.md
@@ -0,0 +1,221 @@
+# What is History Mode
+
+History mode lets you capture every available version of each record from Fivetran source connectors.
+To keep all versions of a record, three system columns are added to tables with history mode enabled.
+
+
+Column | Type | Description
+--- | --- | ---
+_fivetran_active | Boolean | TRUE if it is the currently active record. FALSE if it is a historical version of the record. Only one version of the record can be TRUE.
+_fivetran_start | TimeStamp | The time when the record was first created or modified in the source.
+_fivetran_end | TimeStamp | The value for this column depends on whether the record is active. If the record is not active, the `_fivetran_end` value is the `_fivetran_start` of the next version of the record minus 1 millisecond. If the record is deleted, the value is the same as the timestamp of the delete operation. If the record is active, `_fivetran_end` is set to the maximum TIMESTAMP value.
+
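As an illustration of the `_fivetran_end` and `_fivetran_active` rules in the table above, here is a minimal Python sketch. It assumes the versions of one record are known only by their start times and that the record has not been deleted; `TMAX`, `ONE_MS`, and `with_history_columns` are hypothetical names, not part of the SDK.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for the destination's maximum TIMESTAMP value.
TMAX = datetime(9999, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)


def with_history_columns(starts):
    """Return (_fivetran_start, _fivetran_end, _fivetran_active) for each version."""
    ordered = sorted(starts)
    rows = []
    for i, start in enumerate(ordered):
        is_last = i == len(ordered) - 1
        # A historical version ends 1 ms before the next version starts;
        # the newest version is active and ends at TMAX.
        end = TMAX if is_last else ordered[i + 1] - ONE_MS
        rows.append((start, end, is_last))
    return rows


t1 = datetime(2024, 6, 18, 10, 0, 0, tzinfo=timezone.utc)
t2 = datetime(2024, 6, 18, 11, 0, 0, tzinfo=timezone.utc)
versions = with_history_columns([t1, t2])
```

Here the first version ends at `t2` minus 1 millisecond and is inactive, while the second version is active and ends at `TMAX`.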
+
+## Points to remember in history mode
+
+- In `WriteBatchRequest`, we pass a new boolean field, `history_mode`, which indicates whether the connector is in history mode.
+- If the existing table is not empty, the batch file also includes a boolean column, `_fivetran_earliest`. If an `upsert` contains multiple versions of the same record in one flush, `_fivetran_earliest` is TRUE for the version with the earliest `_fivetran_start` and FALSE for the other versions.
+- For each `replace`, `update` and `delete` batch file, DELETE the existing records in the destination table whose `_fivetran_start` is greater than or equal to the `_fivetran_start` of the matching records in the batch file (refer to [Example 1](#example-1) and [Example 2](#example-2)).
+
+> IMPORTANT: Do not add the `_fivetran_earliest` column to the destination table. It is provided for convenience to easily identify the earliest record and can be used to optimize the data load query.
+
+See the following example of a `replace` batch file in history mode:
+
+Id(PK) | COL1    | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_earliest
+---|---------|---------------------| --- |------------------| --- 
+1 | abc     | T1                  | T2-1  | FALSE            | TRUE
+2 | xyz     | T1                  | TMAX | TRUE             | TRUE
+1 | pqr     | T2                  | T3-1 | FALSE            | FALSE
+1 | def | T3                  | TMAX                | TRUE             | FALSE
+
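One possible way for a destination to use `_fivetran_earliest` is sketched below in Python. It assumes batch rows have already been parsed into dictionaries with integer millisecond timestamps; `split_batch`, the lowercase column keys, and the numeric values standing in for T1, T2, T3, and TMAX are all hypothetical.

```python
TMAX = 10**15  # hypothetical stand-in for the maximum TIMESTAMP value
T1, T2, T3 = 1000, 2000, 3000  # illustrative millisecond timestamps


def split_batch(batch_rows):
    """Return (earliest start per Id, rows to load without `_fivetran_earliest`)."""
    earliest_start = {
        row["id"]: row["_fivetran_start"]
        for row in batch_rows
        if row["_fivetran_earliest"]
    }
    # The helper column must never reach the destination table.
    rows_to_load = [
        {k: v for k, v in row.items() if k != "_fivetran_earliest"}
        for row in batch_rows
    ]
    return earliest_start, rows_to_load


# The `replace` batch file shown above, as parsed rows.
batch = [
    {"id": 1, "col1": "abc", "_fivetran_start": T1, "_fivetran_end": T2 - 1,
     "_fivetran_active": False, "_fivetran_earliest": True},
    {"id": 2, "col1": "xyz", "_fivetran_start": T1, "_fivetran_end": TMAX,
     "_fivetran_active": True, "_fivetran_earliest": True},
    {"id": 1, "col1": "pqr", "_fivetran_start": T2, "_fivetran_end": T3 - 1,
     "_fivetran_active": False, "_fivetran_earliest": False},
    {"id": 1, "col1": "def", "_fivetran_start": T3, "_fivetran_end": TMAX,
     "_fivetran_active": True, "_fivetran_earliest": False},
]
earliest_start, rows_to_load = split_batch(batch)
```

The `earliest_start` mapping gives one cut-off timestamp per primary key, which is the value the DELETE and close-out steps need.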
+## How to Handle Replaces, Updates and Deletes
+
+### Replace
+
+#### Example 1
+This example describes a case where the `_fivetran_start` column value of the destination row is less than `_fivetran_start` of the matching row in the batch file.
+
+Suppose the existing table in the destination is as follows:
+
+Id(PK) | COL1 | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- |------|----| --- | --- | --- | ---
+1 | abc  | 1  |T1 | T2-1 | FALSE | T100
+1 | pqr | 2  |  T2 | TMAX | TRUE | T101
+2 | mno | 3  | T2  | TMAX | TRUE | T103
+
+At the source, new records are added:
+
+Id(PK) | COL1 | COL2 | Timestamp | Type
+--- | --- | --- |-----------| ---
+1 | def |1 | T3        | Upserted
+1 | ghi | 1 |  T4       | Upserted
+
+The replace batch file will be as follows:
+
+Id(PK) | COL1 | COL2  | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_earliest | _fivetran_synced
+--- |------|-------|---------------------| --- | --- | --- | ---
+1 | def  | 1     | T3                  | T4-1 | FALSE | TRUE | T104
+1 | ghi  | 1| T4                  | TMAX | TRUE | FALSE | T105
+
+
+The final destination table will be as follows:
+
+Id(PK) | COL1 | COL2   | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- |---|--------|---------------------| --- |------------------| ---
+1  | abc | 1      | T1                  | T2-1 | FALSE            | T100
+1 | pqr | 2      | T2                  | T3-1 | FALSE            | T101
+2  | mno  | 3  | T2                  | TMAX | TRUE             | T103
+1 | def | 1 |T3 | T4-1 | FALSE            | T104
+1  | ghi | 1 | T4 | TMAX | TRUE             | T105
+
+**Explanation:**
+- We got new records for id = 1. 
+- We check for the corresponding earliest record (`_fivetran_earliest` = TRUE), and delete the existing records from the destination table where the `_fivetran_start` column value is greater than or equal to the `_fivetran_start` column value of the matching rows in the batch file. In this example, there are no such records.
+- We set the value of `_fivetran_end` of the active record in the destination table to `_fivetran_start`-1 of the `_fivetran_earliest` record of the batch file.
+- We set the `_fivetran_active` column value for the above updated record to FALSE.
+- We insert new records in the destination table _as is_, excluding the `_fivetran_earliest` column.
+
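The steps in the explanation above can be sketched in Python as follows. This is an in-memory illustration under stated assumptions, not the SDK's implementation: rows are dictionaries, timestamps are integer milliseconds, and `row`, `apply_replace`, `TMAX`, and the values of `T1` to `T4` are hypothetical. A real destination would typically express the same three steps as SQL statements.

```python
TMAX = 10**15  # hypothetical stand-in for the maximum TIMESTAMP value
T1, T2, T3, T4 = 1000, 2000, 3000, 4000  # illustrative millisecond timestamps


def row(id_, col1, col2, start, end, active, synced, earliest=None):
    """Build a row dict; `_fivetran_earliest` is present only on batch rows."""
    r = {"id": id_, "col1": col1, "col2": col2, "_fivetran_start": start,
         "_fivetran_end": end, "_fivetran_active": active,
         "_fivetran_synced": synced}
    if earliest is not None:
        r["_fivetran_earliest"] = earliest
    return r


def apply_replace(dest, batch):
    """Apply a history-mode replace batch to an in-memory destination table."""
    earliest = {r["id"]: r["_fivetran_start"]
                for r in batch if r["_fivetran_earliest"]}
    # Step 1: delete destination rows at or after the earliest batch start.
    kept = [r for r in dest
            if not (r["id"] in earliest
                    and r["_fivetran_start"] >= earliest[r["id"]])]
    # Step 2: close the active record just before the earliest batch row.
    result = []
    for r in kept:
        if r["id"] in earliest and r["_fivetran_active"]:
            r = dict(r, _fivetran_end=earliest[r["id"]] - 1,
                     _fivetran_active=False)
        result.append(r)
    # Step 3: insert batch rows as is, without `_fivetran_earliest`.
    for r in batch:
        result.append({k: v for k, v in r.items()
                       if k != "_fivetran_earliest"})
    return result


dest = [row(1, "abc", 1, T1, T2 - 1, False, 100),
        row(1, "pqr", 2, T2, TMAX, True, 101),
        row(2, "mno", 3, T2, TMAX, True, 103)]
batch = [row(1, "def", 1, T3, T4 - 1, False, 104, earliest=True),
         row(1, "ghi", 1, T4, TMAX, True, 105, earliest=False)]
final = apply_replace(dest, batch)
```

With the Example 1 data, `final` holds five rows: the `pqr` row is closed at `T3 - 1`, the row for Id = 2 is untouched, and the two batch rows are appended.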
+#### Example 2
+
+This example describes a case where the `_fivetran_start` column value of the destination row is greater than or equal to the `_fivetran_start` of the matching row in the batch file.
+
+Suppose the existing table in the destination is as follows:
+
+Id(PK) | COL1 | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- |---|--------|---------------------| --- |------------------| ---
+1  | xyz | 4 | T1 | T3-1 | FALSE            | T100
+1  | abc | 1 | T3 | T4-1 | FALSE            | T100 
+1  | pqr | 2 | T4 | TMAX | TRUE             | T101
+2  | mno | 3 | T4 | TMAX | TRUE             | T103
+
+At the source, new records are added:
+
+Id(PK) | COL1 | COL2  | Timestamp  | Type
+--- | --- | --- | --- | ---
+1 | ghi | 1  | T2  | Upserted
+
+
+
+The replace batch file will be as follows:
+
+Id(PK) | COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_earliest | _fivetran_synced
+--- | --- | --- | --- | --- | --- | --- | ---
+1  | ghi | 1 | T2 | TMAX | TRUE | TRUE | T104
+
+The final destination table will be as follows:
+
+Id(PK) |  COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- | --- | --- | --- | --- | --- | --- 
+1  | ghi | 1 | T2 | TMAX | TRUE | T104
+1  | xyz | 4 | T1 | T3-1 | FALSE | T100
+2  | mno | 3 | T4 | TMAX | TRUE | T103
+
+**Explanation:**
+- We got a new record for id = 1.
+- We check for the corresponding earliest record (`_fivetran_earliest` = TRUE), and delete existing records from the destination table where the `_fivetran_start` of the destination row is greater than or equal to the `_fivetran_start` of the matching row in the batch file. In this example, we have such records, so we delete the records with id = 1, `_fivetran_start` = T3 and id = 1, `_fivetran_start` = T4.
+- `_fivetran_end` of the active record in the destination table is set to `_fivetran_start`-1 of the `_fivetran_earliest` record of the batch file.
+- We set the `_fivetran_active` column value for the above updated record to FALSE.
+- We insert new records _as is_ excluding the `_fivetran_earliest` column.
+
+### Updates
+
+Suppose the existing table in the destination is as follows:
+
+Id(PK) | COL1 | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- | --- | --- | --- | --- | --- | --- 
+1  | abc  | 1  | T1  | T2-1  | FALSE  | T100
+1 | pqr | 2 | T2 | TMAX | TRUE | T101
+2 | mno | 3 | T2 | TMAX | TRUE | T103
+
+
+At the source, the record with Id = 1 is updated:
+
+Id(PK) | COL1 | Timestamp  | Type
+--- | --- | --- | ---
+1 | xyz | T3 | Updated
+
+
+
+Then the record with Id = 2 is updated:
+
+Id(PK) |  COL2  | Timestamp  | Type
+--- | --- | --- | ---
+2 | 1000 | T4 | Updated
+
+Lastly, the record with Id = 1 is updated again:
+
+Id(PK) |  COL1  | Timestamp  | Type
+--- | --- | --- | ---
+1  | def  | T5  | Updated
+
+
+
+The update batch file will be as follows:
+
+
+Id(PK) | COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_earliest | _fivetran_synced
+--- | --- | --- | --- | --- | --- | --- | ---
+1  | xyz | | T3| T5-1 | FALSE | TRUE | T107
+2 | | 1000 | T4 | TMAX | TRUE | TRUE | T108
+1 | def | | T5 | TMAX | TRUE | FALSE | T109
+
+
+The final destination table will be as follows:
+
+Id(PK) |  COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- | --- | --- | --- | --- | --- | ---
+1  | abc  | 1  | T1  | T2-1  | FALSE  | T100
+1  | pqr | 2 | T2 | T3-1 | FALSE | T101
+2  | mno | 3 | T2 | T4-1 | FALSE | T103
+1  | def | 2 | T5 | TMAX | TRUE | T109
+1  | xyz | 2 | T3 | T5-1 | FALSE | T107
+2  | mno | 1000 | T4 | TMAX | TRUE | T108
+
+
+
+**Explanation:**
+- In the batch file, we got records with id = 1 and id = 2.
+- We set the values of unmodified columns to the values of the active record. In this example, for id = 2, we did not get a COL1 value, so we set COL1 to `mno` (the COL1 value of the active record).
+- We set `_fivetran_end` of the active record in the destination table to `_fivetran_start`-1 of the `_fivetran_earliest` record of the batch file.
+- We set the `_fivetran_active` column value for the above updated record to FALSE.
+- We set other columns _as is_ from the batch file in the destination table except the `_fivetran_earliest` column.
+
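A minimal Python sketch of the update handling described above follows. It rests on several assumptions that are not in the original: rows are dictionaries with integer millisecond timestamps, `UNMODIFIED` stands in for the `unmodified_string` marker, no destination row starts at or after the batch's earliest row (so the DELETE step is omitted), and a later version in the same batch inherits unmodified columns from the version before it. The names `row` and `apply_update` are hypothetical.

```python
TMAX = 10**15  # hypothetical stand-in for the maximum TIMESTAMP value
T1, T2, T3, T4, T5 = 1000, 2000, 3000, 4000, 5000  # illustrative timestamps
UNMODIFIED = None  # hypothetical marker; real files carry `unmodified_string`


def row(id_, col1, col2, start, end, active, synced, earliest=None):
    """Build a row dict; `_fivetran_earliest` is present only on batch rows."""
    r = {"id": id_, "col1": col1, "col2": col2, "_fivetran_start": start,
         "_fivetran_end": end, "_fivetran_active": active,
         "_fivetran_synced": synced}
    if earliest is not None:
        r["_fivetran_earliest"] = earliest
    return r


def apply_update(dest, batch, data_columns=("col1", "col2")):
    """Apply a history-mode update batch to an in-memory destination table."""
    result = [dict(r) for r in dest]
    # The row each Id inherits from: initially the destination's active record.
    base_rows = {r["id"]: r for r in result if r["_fivetran_active"]}
    for b in sorted(batch, key=lambda r: r["_fivetran_start"]):
        new_row = {k: v for k, v in b.items() if k != "_fivetran_earliest"}
        base = base_rows.get(b["id"])
        if base is not None:
            # Fill unmodified columns from the previous version.
            for col in data_columns:
                if new_row[col] is UNMODIFIED:
                    new_row[col] = base[col]
            # Only the earliest batch row closes the destination's active record.
            if b["_fivetran_earliest"]:
                base["_fivetran_end"] = b["_fivetran_start"] - 1
                base["_fivetran_active"] = False
        base_rows[b["id"]] = new_row
        result.append(new_row)
    return result


dest = [row(1, "abc", 1, T1, T2 - 1, False, 100),
        row(1, "pqr", 2, T2, TMAX, True, 101),
        row(2, "mno", 3, T2, TMAX, True, 103)]
batch = [row(1, "xyz", UNMODIFIED, T3, T5 - 1, False, 107, earliest=True),
         row(2, UNMODIFIED, 1000, T4, TMAX, True, 108, earliest=True),
         row(1, "def", UNMODIFIED, T5, TMAX, True, 109, earliest=False)]
final = apply_update(dest, batch)
```

With the example data, `final` contains the same six rows as the final destination table above, in a different order.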
+
+### Deletes
+
+Suppose the existing table in the destination is as follows:
+
+Id(PK) | COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- | --- | --- | --- | --- | --- | ---
+1  | abc | 1  | T1  | T2-1 | FALSE | T100
+1  | pqr | 2 | T2 | TMAX | TRUE | T101
+2 | mno | 3 | T2 | TMAX | TRUE | T103
+
+
+
+At the source, a record is deleted:
+
+
+Id(PK) | Timestamp  | Type
+--- | --- | ---
+1  | T3 | Deleted
+
+
+The delete batch file will be as follows:
+
+Id(PK) | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_earliest | _fivetran_synced
+--- | --- |---------------|------| --- | ---
+1  | | T3-1          |  | TRUE | T104
+
+
+The final destination table will be as follows:
+
+Id(PK) | COL1  | COL2 | _fivetran_start(PK) | _fivetran_end | _fivetran_active | _fivetran_synced
+--- | --- | --- | --- | --- |------------------| ---
+1  | abc  | 1  | T1  | T2-1  | FALSE            | T100
+1  | pqr | 2 | T2 | T3-1 | FALSE            | T101
+2  | mno | 3 | T2 | TMAX | TRUE             | T103
+
+**Explanation:**
+- We set the `_fivetran_active` column value to FALSE for the active record and set the `_fivetran_end` column value to `T3-1`.
+
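The delete handling can be sketched in Python as below. As an illustration only, it assumes rows are dictionaries with integer millisecond timestamps and that the delete batch row carries the `_fivetran_end` value to apply; `apply_delete`, `TMAX`, and the values of `T1` to `T3` are hypothetical.

```python
TMAX = 10**15  # hypothetical stand-in for the maximum TIMESTAMP value
T1, T2, T3 = 1000, 2000, 3000  # illustrative millisecond timestamps


def apply_delete(dest, delete_batch):
    """Close the active record of every Id listed in the delete batch."""
    ends = {r["id"]: r["_fivetran_end"] for r in delete_batch}
    result = []
    for r in dest:
        if r["id"] in ends and r["_fivetran_active"]:
            # No row is removed: the active version is marked as historical.
            r = dict(r, _fivetran_end=ends[r["id"]], _fivetran_active=False)
        result.append(r)
    return result


dest = [
    {"id": 1, "col1": "abc", "col2": 1, "_fivetran_start": T1,
     "_fivetran_end": T2 - 1, "_fivetran_active": False, "_fivetran_synced": 100},
    {"id": 1, "col1": "pqr", "col2": 2, "_fivetran_start": T2,
     "_fivetran_end": TMAX, "_fivetran_active": True, "_fivetran_synced": 101},
    {"id": 2, "col1": "mno", "col2": 3, "_fivetran_start": T2,
     "_fivetran_end": TMAX, "_fivetran_active": True, "_fivetran_synced": 103},
]
delete_batch = [
    {"id": 1, "_fivetran_start": None, "_fivetran_end": T3 - 1,
     "_fivetran_active": None, "_fivetran_earliest": True,
     "_fivetran_synced": 104},
]
final = apply_delete(dest, delete_batch)
```

In this sketch the row count stays at three, and only the `pqr` row changes: it becomes inactive with `_fivetran_end` set to `T3 - 1`.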
+