Releases: cloudera-labs/hms-mirror
hms-mirror
Max Reducers setting, when needed, was a double under certain conditions. Cast to INT to allow Hive to set value in session.
hms-mirror
Handle data-size issues when stats are not available.
Added more traps for hdpHive3 (lack-of-features check) around hive.optimize.sort.dynamic.partition.threshold.
hms-mirror
For extremely large tables with a lot of partitions, we fixed the max reducer calculations to match the need based on the distribution.
hms-mirror
Fixed some casting issues while setting dynamic partitions and max reducers.
hms-mirror
Features:
- Auto-Tuning (`-at`) - Introduction of basic stats regarding file counts/sizes for large tables. For migrations using SQL, we'll make adjustments to DISTRIBUTE BY and Tez Groupings to provide more efficient, balanced migrations with better-optimized file sizes after migration. #53
- Additional table filters (`-tfs|--table-filter-size-limit` and `-tfp|--table-filter-partition-count-limit`) that check a table's data size and partition count limits can also be applied to narrow the range of tables you'll process. #55
- Add property to tables migrated with "STORAGE_MIGRATION" to identify and filter them out from future runs. #56
- `-cto|--compress-text-output` option and additional session level settings using basic stats.
- HDP3 scenario that doesn't support MANAGEDLOCATION element in database properties. #52
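The table filters can be pictured with a small sketch. It assumes that a table is skipped when its data size or partition count exceeds the corresponding limit, and that an unset limit is not enforced. The `TableStats` type, the `filter_tables` function, and the unit of `data_size` are hypothetical. They are not taken from hms-mirror.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class TableStats:
    name: str
    data_size: int        # same unit as the size limit; the unit is an assumption
    partition_count: int


def filter_tables(
    tables: Iterable[TableStats],
    size_limit: Optional[int] = None,
    partition_count_limit: Optional[int] = None,
) -> List[TableStats]:
    """Keep only tables within the given limits; None disables a limit."""
    kept = []
    for table in tables:
        if size_limit is not None and table.data_size > size_limit:
            continue  # over the size limit: not processed
        if partition_count_limit is not None and table.partition_count > partition_count_limit:
            continue  # too many partitions: not processed
        kept.append(table)
    return kept
```

With both limits set, only tables that satisfy both checks remain in the processing list.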
Fixes:
- AVRO Schema Only Fix. #58
- Cleanup messaging around legacy config settings.
- Fix/Added `dbRegEx` command line parameter. #57
NOTE: Configuration Breaking Change. If you see a note saying `A configuration element is no longer valid, progress. Please remove the element from the configuration yaml and try again.` with `Caused by: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "tblRegEx"`, please remove the properties `dbRegEx`, `tblRegEx` and `tblExcludeRegEx` from the config yaml.
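For configs that are maintained by script, the cleanup asked for in the note could look like the sketch below. It assumes the YAML has already been loaded into plain Python dicts and lists (for example with a third-party YAML library, which is left out here). Because the release note does not say where in the file the three properties live, the hypothetical `strip_deprecated` helper removes them at any depth.

```python
from typing import Any

# Property names taken from the release note above.
DEPRECATED_KEYS = {"dbRegEx", "tblRegEx", "tblExcludeRegEx"}


def strip_deprecated(node: Any) -> Any:
    """Return a copy of a parsed config with the deprecated keys removed."""
    if isinstance(node, dict):
        # Drop deprecated keys and recurse into the remaining values.
        return {
            key: strip_deprecated(value)
            for key, value in node.items()
            if key not in DEPRECATED_KEYS
        }
    if isinstance(node, list):
        return [strip_deprecated(item) for item in node]
    return node  # scalars are returned unchanged
```

The function builds a new structure and leaves its input untouched, so the original config can be kept as a backup before the cleaned version is written out.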
hms-mirror
Support for HDP Hive 3 anomalies regarding locations.
hms-mirror
Features:
- Prevent user from setting -wd and -ewd to the same value.
- Exit process if the database commands aren't successful. #48
- Global Location Map option, which overrides -rdl if a match is found. #50
- ForceExternalLocations added to address bugs in HDP Hive 3 where the Database LOCATION is NOT honored for new tables and falls back to the warehouse directory. #51
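Two of the features above lend themselves to a short sketch: the -wd/-ewd check and the Global Location Map taking precedence over the -rdl result. Everything here is an assumption made for illustration. The function names are hypothetical, the map is modelled as source-prefix to target-prefix pairs with longest-prefix matching, and `default_location` stands in for whatever -rdl would otherwise produce.

```python
from typing import Dict, Optional


def check_warehouse_dirs(wd: Optional[str], ewd: Optional[str]) -> None:
    """Reject configurations where -wd and -ewd point at the same directory."""
    if wd is not None and ewd is not None and wd.rstrip("/") == ewd.rstrip("/"):
        raise ValueError("-wd and -ewd must not be set to the same value")


def resolve_location(
    original: str,
    global_location_map: Dict[str, str],
    default_location: str,
) -> str:
    """Apply the first matching map entry; otherwise fall back to the default."""
    # Assumption: longer prefixes are checked first so the most specific entry wins.
    for prefix in sorted(global_location_map, key=len, reverse=True):
        if original.startswith(prefix):
            return global_location_map[prefix] + original[len(prefix):]
    return default_location  # no match: the -rdl style default applies
```

The fallback is expressed as a plain argument so the override rule stays visible: a map match wins, and only a miss reaches the default.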
Change internal structure of property overrides.
Fix prefixed DB in cleanup SQL. #46
Fixed dbprefix for dc. #36
hms-mirror
STORAGE_MIGRATION checks for:
- Previously Migrated
- Old Artifact Tables
- Validations when using -dc (table name mismatches to directory name)
- Table Status Rollups in Report
- Simplified options for -rdl, -dc, when using STORAGE_MIGRATION
- Support for STORAGE_MIGRATION in same Namespace (used to migrate/organize data to another directory)
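The -dc validation listed above (table name versus directory name) can be sketched as a comparison between the table name and the last path segment of its location. This is an assumed reading of the check, not hms-mirror's implementation. The function name is hypothetical, and the case-insensitive comparison is a choice made for the example.

```python
import posixpath
from urllib.parse import urlparse


def location_matches_table(table_name: str, location: str) -> bool:
    """Return True when the final directory of the location equals the table name."""
    # Keep only the path part so scheme and namespace (e.g. hdfs://ns1) are ignored.
    path = urlparse(location).path.rstrip("/")
    directory = posixpath.basename(path)
    # Assumption: compare case-insensitively.
    return directory.lower() == table_name.lower()
```

For instance, a table named `orders` located at `hdfs://ns1/warehouse/db1.db/orders_v2` would be flagged as a mismatch by this check.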