diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md index 1305f8799f0c3..8fe862ed2d3e3 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md @@ -26,31 +26,51 @@ under the License. ## Description -This statement is used to cancel rebalancing disks of specified backends with high priority +The `CANCEL REBALANCE DISK` statement is used to cancel the high-priority disk data balancing for Backend (BE) nodes. This statement has the following functionalities: -Grammar: +- It can cancel the high-priority disk balancing for specified BE nodes. +- It can cancel the high-priority disk balancing for all BE nodes in the entire cluster. +- After cancellation, the system will still balance the disk data of BE nodes using the default scheduling method. -ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; +## Syntax -Explain: +```sql +ADMIN CANCEL REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` -1. This statement only indicates that the system no longer rebalance disks of specified backends with high priority. The system will still rebalance disks by default scheduling. +## Optional Parameters -## Example +**1. `":"`** -1. Cancel High Priority Disk Rebalance of all of backends of the cluster +> Specifies the list of BE nodes for which the high-priority disk balancing needs to be canceled. +> +> Each node consists of a hostname (or IP address) and a heartbeat port. +> +> If this parameter is not specified, it will cancel the high-priority disk balancing for all BE nodes. -ADMIN CANCEL REBALANCE DISK; +## Access Control Requirements -2. 
Cancel High Priority Disk Rebalance of specified backends +Users executing this SQL command must have at least the following permissions: -ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | + +## Usage Notes -## Keywords +- This statement only indicates that the system will no longer prioritize balancing the disk data of specified BEs; however, the system will still balance BE's disk data using the default scheduling method. +- After executing this command, any previously set high-priority balancing strategy will become immediately invalid. -ADMIN,CANCEL,REBALANCE DISK +## Examples -## Best Practice +- Cancel high-priority disk balancing for all BEs in the cluster: + ```sql + ADMIN CANCEL REBALANCE DISK; + ``` +- Cancel high-priority disk balancing for specified BEs: +```sql +ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md index 005f3e50c949a..e21b80cb8ea90 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md @@ -1,6 +1,6 @@ --- { - "title": "CANCEL REPAIR", + "title": "CANCEL REPAIR TABLE", "language": "en" } --- @@ -27,29 +27,59 @@ under the License. ## Description -This statement is used to cancel the repair of the specified table or partition with high priority +The `CANCEL REPAIR TABLE` statement is used to cancel high-priority repairs for a specified table or partition. 
This statement has the following functionalities: -grammar: +- It can cancel high-priority repairs for an entire table. +- It can cancel high-priority repairs for specified partitions. +- It does not affect the system's default replica repair mechanism. + +## Syntax ```sql -ADMIN CANCEL REPAIR TABLE table_name[ PARTITION (p1,...)]; +ADMIN CANCEL REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -illustrate: +## Required Parameters + +**1. ``** + +> Specifies the name of the table for which the repair is to be canceled. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names for which the repair is to be canceled. +> +> If this parameter is not specified, it will cancel high-priority repairs for the entire table. + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: -1. This statement simply means that the system will no longer repair shard copies of the specified table or partition with high priority. Replicas are still repaired with the default schedule. +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -## Example +## Usage Notes - 1. Cancel high priority repair +- This statement only cancels high-priority repairs and does not stop the system's default replica repair mechanism. +- After cancellation, the system will still repair replicas using the default scheduling method. +- If there is a need to re-establish high-priority repairs, the `ADMIN REPAIR TABLE` command can be used. +- The effects of this command take place immediately after execution. 
- ```sql - ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1); - ``` +## Examples -## Keywords +- Cancel high-priority repairs for an entire table: - ADMIN, CANCEL, REPAIR + ```sql + ADMIN CANCEL REPAIR TABLE tbl; + ``` -## Best Practice +- Cancel high-priority repairs for specified partitions: + ```sql + ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1, p2); + ``` diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md index 21dea061c04b1..bdf90dcad4d0c 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md @@ -24,44 +24,54 @@ under the License. --> +## Description +The `REBALANCE DISK` statement is used to optimize the data distribution on Backend (BE) nodes. This statement has the following functionalities: -### Name +- It can perform data balancing for specified BE nodes. +- It can balance data across all BE nodes in the entire cluster. +- It prioritizes balancing the data of specified nodes, regardless of the overall balance state of the cluster. -ADMIN REBALANCE DISK +## Syntax -## Description +```sql +ADMIN REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` -This statement is used to try to rebalance disks of the specified backends first, no matter if the cluster is balanced +## Optional Parameters -Grammar: +**1. `":"`** -``` -ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; -``` +> Specifies the list of BE nodes that need to be balanced. +> +> Each node consists of a hostname (or IP address) and a heartbeat port. +> +> If this parameter is not specified, it will balance all BE nodes. -Explain: +## Access Control Requirements -1. 
This statement only means that the system attempts to rebalance disks of specified backends with high priority, no matter if the cluster is balanced. -2. The default timeout is 24 hours. Timeout means that the system will no longer rebalance disks of specified backends with high priority. The command settings need to be reused. +Users executing this SQL command must have at least the following permissions: -## Example +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -1. Attempt to rebalance disks of all backends +## Usage Notes -``` +- The default timeout for this command is 24 hours. After this period, the system will no longer prioritize balancing the disk data of specified BEs. To continue balancing, the command needs to be executed again. +- Once the disk data balancing for a specified BE node is completed, the high-priority setting for that node will automatically become invalid. +- This command can be executed even when the cluster is in an unbalanced state. + +## Examples + +- Balance data across all BE nodes in the cluster: + +```sql ADMIN REBALANCE DISK; ``` -2. Attempt to rebalance disks oof the specified backends +- Balance data for two specified BE nodes: -``` +```sql ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); ``` - -## Keywords - -ADMIN,REBALANCE,DISK - -## Best Practice - diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md index 3e37dbd7b0b5a..3d6e3c08519cc 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md @@ -27,32 +27,68 @@ under the License. 
## Description -statement used to attempt to preferentially repair the specified table or partition +The `REPAIR TABLE` statement is used to prioritize the repair of replicas for a specified table or partition. This statement has the following functionalities: -grammar: +- It can repair all replicas of an entire table. +- It can repair replicas of specified partitions. +- It performs replica repairs with high priority. +- It supports setting a repair timeout. + +## Syntax ```sql -ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)] +ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -illustrate: +## Required Parameters + +**1. ``** + +> Specifies the name of the table that needs to be repaired. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names that need to be repaired. +> +> If this parameter is not specified, it will repair all partitions of the entire table. + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: -1. This statement only means to let the system try to repair the shard copy of the specified table or partition with high priority, and does not guarantee that the repair can be successful. Users can view the repair status through the SHOW REPLICA STATUS command. -2. The default timeout is 14400 seconds (4 hours). A timeout means that the system will no longer repair shard copies of the specified table or partition with high priority. Need to re-use this command to set +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -## Example +## Usage Notes -1. Attempt to repair the specified table +- This statement indicates that the system will attempt to repair the specified replicas with high priority, but it does not guarantee successful repairs. 
+- The default timeout is set to 14,400 seconds (4 hours). +- After the timeout, the system will no longer prioritize the repair of specified replicas. +- If a repair times out, the command needs to be executed again to continue the repair process. +- The progress of repairs can be monitored using the `SHOW REPLICA STATUS` command. +- This command does not affect the normal replica repair mechanism of the system; it merely elevates the priority of repairs for the specified table or partition. - ADMIN REPAIR TABLE tbl1; +## Examples -2. Try to repair the specified partition +- Repair all replicas of an entire table: - ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ```sql + ADMIN REPAIR TABLE tbl1; + ``` -## Keywords +- Repair replicas of specified partitions: - ADMIN, REPAIR, TABLE + ```sql + ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ``` -## Best Practice +- Check the repair progress: + ```sql + SHOW REPLICA STATUS FROM tbl1; + ``` diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md index 028232af4e8a6..2847303572e3e 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md @@ -24,55 +24,86 @@ specific language governing permissions and limitations under the License. --> +## Description +The `SET TABLE STATUS` statement is used to manually set the status of an OLAP table. This statement has the following functionalities: +- It only supports setting the status of OLAP tables. +- It can modify the table status to a specified target state. +- It is used to resolve task blocking caused by the table status. 
-## Description +**Supported States**: + +| State | Description | +|-------------------|--------------------------------------| +| NORMAL | Indicates that the table is in a normal state. | +| ROLLUP | Indicates that the table is undergoing a ROLLUP operation. | +| SCHEMA_CHANGE | Indicates that the table is undergoing a schema change. | +| BACKUP | Indicates that the table is undergoing a backup operation. | +| RESTORE | Indicates that the table is undergoing a restore operation. | +| WAITING_STABLE | Indicates that the table is waiting for a stable state. | -This statement is used to set the state of the specified table. Only supports OLAP tables. +## Syntax -This command is currently only used to manually set the OLAP table state to the specified state, allowing some jobs that are stuck by the table state to continue running. +```sql +ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]); +``` -grammar: +Where: ```sql -ADMIN SET TABLE table_name STATUS - PROPERTIES ("key" = "value", ...); + + : "state" + + + : "NORMAL" + | "ROLLUP" + | "SCHEMA_CHANGE" + | "BACKUP" + | "RESTORE" + | "WAITING_STABLE" ``` -The following properties are currently supported: +## Required Parameters -1. "state":Required. Specifying a target state then the state of the OLAP table will change to this state. +**1. ``** -> The current target states include: -> -> 1. NORMAL -> 2. ROLLUP -> 3. SCHEMA_CHANGE -> 4. BACKUP -> 5. RESTORE -> 6. WAITING_STABLE -> -> If the current state of the table is already the specified state, it will be ignored. +> Specifies the name of the table for which the status needs to be set. +> +> The table name must be unique within its database. -**Note: This command is generally only used for emergency fault repair, please proceed with caution.** +**2. `PROPERTIES ("state" = "")`** -## Example +> Specifies the target status of the table. +> +> The "state" property must be set, and its value must be one of the supported states. -1. 
Set the state of table tbl1 to NORMAL. +## Access Control Requirements -```sql -admin set table tbl1 status properties("state" = "NORMAL"); -``` +Users executing this SQL command must have at least the following permissions: -2. Set the state of table tbl2 to SCHEMA_CHANGE +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -```sql -admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE"); -``` +## Usage Notes + +- This command is intended for emergency fault recovery; please use it with caution. +- It only supports OLAP tables and does not support other types of tables. +- If the table is already in the target state, this command will be ignored. +- Improper state settings may lead to system anomalies; it is recommended to use this command under technical support guidance. +- After modifying the status, it is advisable to monitor the system's operational status promptly. + +## Examples + +- Set the table status to NORMAL: -## Keywords + ```sql + ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL"); + ``` - ADMIN, SET, TABLE, STATUS +- Set the table status to SCHEMA_CHANGE: -## Best Practice \ No newline at end of file + ```sql + ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE"); + ``` diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md index d2d5c794c9d19..ee2e631eef106 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md @@ -24,28 +24,170 @@ specific language governing permissions and limitations under the License. 
--> - - ## Description -This statement is used to view the data skew of a table or a partition. - -grammar: - -`SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (p1)];` - - - -1. Only one partition must be specified. For non-partitioned tables, the partition name is the same as the table name. -2. The result will show row count and data volume of each bucket under the specified partition, and the proportion of the data volume of each bucket in the total data volume. - -## Example - -1. View the data skew of the table - - ` SHOW DATA SKEW FROM db1.test PARTITION(p1);` - -## Keywords - -SHOW, DATA, SKEW - +The `SHOW DATA SKEW` statement is used to view the data skew of a table or partition. This statement has the following functionalities: + +- It can display the data distribution of the entire table. +- It can display the data distribution of specified partitions. +- It shows the row count, data volume, and percentage for each bucket. +- It supports both partitioned and non-partitioned tables. + +## Syntax + +```sql +SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ]; +``` + +## Required Parameters + +**1. `FROM [.]`** + +> Specifies the name of the table to be viewed. The database name can be included. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names to be viewed. +> +> If this parameter is not specified, it will display the data distribution for all partitions in the table. +> +> For non-partitioned tables, the partition name is the same as the table name. 
+ +## Return Values + +| Column Name | Description | +|------------------|--------------------------------------| +| PartitionName | Partition name | +| BucketIdx | Bucket index number | +| AvgRowCount | Average row count | +| AvgDataSize | Average data size (in bytes) | +| Graph | Visualization chart of data distribution | +| Percent | Percentage of this bucket's data volume relative to total data volume | + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data distribution is displayed along two dimensions: partition and bucket. +- The Graph column uses the character `>` to visually represent the data distribution ratio. +- Percentages are accurate to two decimal places. +- For non-partitioned tables, the partition name in the query result is the same as the table name. 
+ +## Examples + +- Create a partitioned table and view its data distribution: + + ```sql + CREATE TABLE test_show_data_skew + ( + id int, + name string, + pdate date + ) + PARTITION BY RANGE(pdate) + ( + FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + +- View data distribution for the entire table: + + + ```sql + SHOW DATA SKEW FROM test_show_data_skew; + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230417 | 0 | 0 | 0 | | 00.00 % | + | p_20230417 | 1 | 0 | 0 | | 00.00 % | + | p_20230417 | 2 | 0 | 0 | | 00.00 % | + | p_20230417 | 3 | 0 | 0 | | 00.00 % | + | p_20230417 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + | p_20230419 | 0 | 0 | 0 | | 00.00 % | + | p_20230419 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.96 % | + | p_20230419 | 2 | 0 | 0 | | 00.00 % | 
+ | p_20230419 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.04 % | + | p_20230419 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + + View data distribution for specified partitions: + + ```sql + SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418); + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + +- View data distribution for a non-partitioned table: + + ```sql + CREATE TABLE test_show_data_skew2 + ( + id int, + name string, + pdate date + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + + ```sql + SHOW DATA SKEW FROM test_show_data_skew2; + ``` + + 
```text + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % | + | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % | + | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % | + | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % | + | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + ``` \ No newline at end of file diff --git a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md index 5efddf21d5e69..9c9614ec3ac93 100644 --- a/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md +++ b/docs/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md @@ -24,37 +24,92 @@ specific language governing permissions and limitations under the License. --> +## Description +The `SHOW DATA` statement is used to display information about data volume, replica count, and row statistics. This statement has the following functionalities: -## Description +- It can display the data volume and replica count for all tables in the current database. +- It can show the data volume, replica count, and row statistics for a specified table's materialized views. +- It can display the quota usage of the database. +- It supports sorting by data volume, replica count, etc. + +## Syntax -This statement is used to display the amount of data, the number of replicas, and the number of statistical rows. +```sql +SHOW DATA [ FROM [.] 
] [ ORDER BY ]; +``` -grammar: +Where: ```sql -SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +order_by_clause: + [ ASC | DESC ] [ , [ ASC | DESC ] ... ] ``` -illustrate: +## Optional Parameters -1. If the FROM clause is not specified, the data volume and number of replicas subdivided into each table under the current db will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the table and all materialized views. +**1. `FROM [.]`** -2. If the FROM clause is specified, the data volume, number of copies and number of statistical rows subdivided into each materialized view under the table will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the corresponding materialized view. The number of statistical rows is the number of statistical rows for all partitions of the corresponding materialized view. +> Specifies the name of the table to view. The database name can be included. +> +> If this parameter is not specified, it will display data information for all tables in the current database. -3. When counting the number of rows, the one with the largest number of rows among the multiple copies shall prevail. +**2. `ORDER BY `** -4. The `Total` row in the result set represents the total row. The `Quota` line represents the quota set by the current database. The `Left` line indicates the remaining quota. +> Specifies the sorting method for the result set. +> +> Any column can be sorted in ascending (ASC) or descending (DESC) order. +> +> Supports multi-column combination sorting. -5. If you want to see the size of each Partition, see `help show partitions`. +## Return Values -6. You can use ORDER BY to sort on any combination of columns. 
+Depending on different query scenarios, the following result sets are returned: -## Example +- When the `FROM` clause is not specified (displaying database-level information): -1. Display the data size and RecycleBin size of each database by default. +| Column Name | Description | +|------------------|--------------------------------------| +| DbId | Database ID | +| DbName | Database name | +| Size | Total data volume of the database | +| RemoteSize | Remote storage data volume | +| RecycleSize | Recycle bin data volume | +| RecycleRemoteSize| Recycle bin remote storage volume | - ``` +- When the `FROM` clause is specified (displaying table-level information): + +| Column Name | Description | +|------------------|--------------------------------------| +| TableName | Table name | +| IndexName | Index (materialized view) name | +| Size | Data size | +| ReplicaCount | Replica count | +| RowCount | Row statistics (shown only when viewing a specific table) | + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data volume statistics include the total data volume of all replicas. +- The replica count includes all partitions and replicas of all materialized views for the table. +- When counting rows, it considers the maximum row count among multiple replicas. +- The `Total` row in the result set indicates aggregated data. +- The `Quota` row in the result set indicates the current quota set for the database. +- The `Left` row in the result set indicates remaining quota. +- If you need to view the size of each partition, use the `SHOW PARTITIONS` command. 
+ +## Examples + +- Display data volume information for all databases: + + ```sql SHOW DATA; ``` @@ -64,74 +119,62 @@ illustrate: +-------+-----------------------------------+--------+------------+-------------+-------------------+ | 21009 | db1 | 0 | 0 | 0 | 0 | | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 | - | 0 | information_schema | 0 | 0 | 0 | 0 | - | 22010 | regression_test | 0 | 0 | 0 | 0 | - | 1 | mysql | 0 | 0 | 0 | 0 | - | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 | - | 10002 | __internal_schema | 46182 | 0 | 0 | 0 | | Total | NULL | 118946 | 0 | 0 | 0 | +-------+-----------------------------------+--------+------------+-------------+-------------------+ ``` -2. Display the data volume, replica number, aggregate data volume and aggregate replica number of each table in a database. +- Display data volume information for all tables in the current database: - ```sql - USE db1; - SHOW DATA; - ``` + ```sql + USE db1; + SHOW DATA; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | tbl1 | 900.000 B | 6 | - | tbl2 | 500.000 B | 3 | - | Total | 1.400 KB | 9 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1021.921 GB | 1073741815 | - +-----------+-------------+--------------+ - ``` + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | tbl1 | 900.000 B | 6 | + | tbl2 | 500.000 B | 3 | + | Total | 1.400 KB | 9 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1021.921 GB | 1073741815 | + +-----------+-------------+--------------+ + ``` -3. 
Display the subdivided data volume, the number of replicas and the number of statistical rows of the specified table under the specified db +- Display detailed data volume information for a specified table: - ```sql - SHOW DATA FROM example_db.test; - ``` + ```sql + SHOW DATA FROM example_db.test; + ``` - ``` - +-----------+-----------+-----------+--------------+----------+ - | TableName | IndexName | Size | ReplicaCount | RowCount | - +-----------+-----------+-----------+--------------+----------+ - | test | r1 | 10.000MB | 30 | 10000 | - | | r2 | 20.000MB | 30 | 20000 | - | | test2 | 50.000MB | 30 | 50000 | - | | Total | 80.000 | 90 | | - +-----------+-----------+-----------+--------------+----------+ - ``` + ```text + +-----------+-----------+-----------+--------------+----------+ + | TableName | IndexName | Size | ReplicaCount | RowCount | + +-----------+-----------+-----------+--------------+----------+ + | test | r1 | 10.000MB | 30 | 10000 | + | | r2 | 20.000MB | 30 | 20000 | + | | test2 | 50.000MB | 30 | 50000 | + | | Total | 80.000MB | 90 | | + +-----------+-----------+-----------+--------------+----------+ + ``` -4. It can be combined and sorted according to the amount of data, the number of copies, the number of statistical rows, etc. 
+- Sort by replica count in descending order and by data volume in ascending order: - ```sql - SHOW DATA ORDER BY ReplicaCount desc,Size asc; - ``` + ```sql + SHOW DATA ORDER BY ReplicaCount DESC, Size ASC; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | table_c | 3.102 KB | 40 | - | table_d | .000 | 20 | - | table_b | 324.000 B | 20 | - | table_a | 1.266 KB | 10 | - | Total | 4.684 KB | 90 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1024.000 GB | 1073741734 | + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | table_c | 3.102 KB | 40 | + | table_d | .000 | 20 | + | table_b | 324.000 B | 20 | + | table_a | 1.266 KB | 10 | + | Total | 4.684 KB | 90 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1024.000 GB | 1073741734 | +-----------+-------------+--------------+ ``` - -## Keywords - - SHOW, DATA - -## Best Practice - diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md index 6e0e49f043da1..49055fc93fdf4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md @@ -22,35 +22,53 @@ specific language governing permissions and limitations under the License. --> +## 描述 +`CANCEL REBALANCE DISK` 语句用于取消优先均衡 BE(Backend)节点的磁盘数据。该语句具有以下功能: +- 可以取消指定 BE 节点的优先磁盘均衡 +- 可以取消整个集群所有 BE 节点的优先磁盘均衡 +- 取消后系统仍会以默认调度方式均衡 BE 的磁盘数据 +## 语法 -## 描述 +```sql +ADMIN CANCEL REBALANCE DISK [ ON ( ":" [, ... 
] ) ]; +``` -该语句用于取消优先均衡 BE 的磁盘 +## 可选参数 -语法: +**1. `":"`** -ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; +> 指定需要取消优先磁盘均衡的 BE 节点列表。 +> +> 每个节点由主机名(或 IP 地址)和心跳端口组成。 +> +> 如果不指定此参数,则取消所有 BE 节点的优先磁盘均衡。 -说明: +## 权限控制 -1. 该语句仅表示系统不再优先均衡指定 BE 的磁盘数据。系统仍会以默认调度方式均衡 BE 的磁盘数据。 +执行此 SQL 命令的用户必须至少具有以下权限: -## 示例 +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | -1. 取消集群所有 BE 的优先磁盘均衡 +## 注意事项 - ADMIN CANCEL REBALANCE DISK; +- 该语句仅表示系统不再优先均衡指定 BE 的磁盘数据,系统仍会以默认调度方式均衡 BE 的磁盘数据。 +- 执行该命令后,之前设置的优先均衡策略将立即失效。 -2. 取消指定 BE 的优先磁盘均衡 - - ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +## 示例 -## 关键词 +- 取消集群所有 BE 的优先磁盘均衡: - ADMIN,CANCEL,REBALANCE,DISK + ```sql + ADMIN CANCEL REBALANCE DISK; + ``` -### 最佳实践 +- 取消指定 BE 的优先磁盘均衡: +```sql +ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md new file mode 100644 index 0000000000000..40e6416309bb7 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md @@ -0,0 +1,84 @@ +--- +{ + "title": "CANCEL REPAIR TABLE", + "language": "zh-CN" +} +--- + + + +## 描述 + +`CANCEL REPAIR TABLE` 语句用于取消对指定表或分区的高优先级修复。该语句具有以下功能: + +- 可以取消整个表的高优先级修复 +- 可以取消指定分区的高优先级修复 +- 不影响系统默认的副本修复机制 + +## 语法 + +```sql +ADMIN CANCEL REPAIR TABLE [ PARTITION ( [, ...]) ]; +``` + +## 必选参数 + +**1. ``** + +> 指定要取消修复的表名。 +> +> 表名在其所在的数据库中必须唯一。 + +## 可选参数 + +**1. 
`PARTITION ( [, ...])`** + +> 指定要取消修复的分区名称列表。 +> +> 如果不指定此参数,则取消整个表的高优先级修复。 + +## 权限控制 + +执行此 SQL 命令的用户必须至少具有以下权限: + +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | + +## 注意事项 + +- 该语句仅取消高优先级修复,不会停止系统的默认副本修复机制 +- 取消后,系统仍会以默认调度方式修复副本 +- 如果需要重新设置高优先级修复,可以使用 `ADMIN REPAIR TABLE` 命令 +- 该命令执行后立即生效 + +## 示例 + +- 取消整个表的高优先级修复: + + ```sql + ADMIN CANCEL REPAIR TABLE tbl; + ``` + +- 取消指定分区的高优先级修复: + + ```sql + ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1, p2); + ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md deleted file mode 100644 index 436ee2a115bc5..0000000000000 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md +++ /dev/null @@ -1,58 +0,0 @@ ---- -{ - "title": "CANCEL REPAIR", - "language": "zh-CN" -} ---- - - - - - - - -## 描述 - -该语句用于取消以高优先级修复指定表或分区 - -语法: - -```sql -ADMIN CANCEL REPAIR TABLE table_name[ PARTITION (p1,...)]; -``` - -说明: - -1. 该语句仅表示系统不再以高优先级修复指定表或分区的分片副本。系统仍会以默认调度方式修复副本。 - -## 示例 - - 1. 
取消高优先级修复 - - ```sql - ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1); - ``` - -## 关键词 - - ADMIN, CANCEL, REPAIR - -### 最佳实践 - diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md index 96ffdb2d4b0f0..4692dea074f4a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md @@ -26,37 +26,52 @@ under the License. ## 描述 -该语句用于尝试优先均衡指定的 BE 磁盘数据 +REBALANCE DISK 语句用于优化 BE(Backend)节点上的数据分布。该语句具有以下功能: -语法: +- 可以针对指定的 BE 节点进行数据均衡 +- 可以对整个集群的所有 BE 节点进行数据均衡 +- 优先均衡指定节点的数据,不受集群整体均衡状态的限制 - ``` - ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; - ``` +## 语法 -说明: +```sql +ADMIN REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` - 1. 该语句表示让系统尝试优先均衡指定 BE 的磁盘数据,不受限于集群是否均衡。 - 2. 默认的 timeout 是 24 小时。超时意味着系统将不再优先均衡指定的 BE 磁盘数据。需要重新使用该命令设置。 - 3. 指定 BE 的磁盘数据均衡后,该 BE 的优先级将会失效。 +## 可选参数 -## 示例 +**1. `":"`** + +> 指定需要进行数据均衡的 BE 节点列表。 +> +> 每个节点由主机名(或 IP 地址)和心跳端口组成。 +> +> 如果不指定此参数,则对所有 BE 节点进行均衡。 + +## 权限控制 -1. 尝试优先均衡集群内的所有 BE +执行此 SQL 命令的用户必须至少具有以下权限: - ``` - ADMIN REBALANCE DISK; - ``` +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | -2. 
尝试优先均衡指定 BE +## 注意事项 - ``` - ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); - ``` +- 命令的默认超时时间为 24 小时。超时后,系统将不再优先均衡指定的 BE 磁盘数据。如需继续均衡,需要重新执行该命令。 +- 当指定 BE 节点的磁盘数据均衡完成后,该节点的优先均衡设置将自动失效。 +- 该命令可以在集群非均衡状态下执行。 + +## 示例 -## 关键词 +- 对集群内所有 BE 节点进行数据均衡: - ADMIN,REBALANCE,DISK +```sql +ADMIN REBALANCE DISK; +``` -### 最佳实践 +- 对指定的两个 BE 节点进行数据均衡: +```sql +ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md index 1bcde56584a2c..1c3756409e6e7 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md @@ -24,36 +24,71 @@ specific language governing permissions and limitations under the License. --> - - ## 描述 -语句用于尝试优先修复指定的表或分区 +`REPAIR TABLE` 语句用于优先修复指定表或分区的副本。该语句具有以下功能: -语法: +- 可以修复整个表的所有副本 +- 可以修复指定分区的副本 +- 以高优先级进行副本修复 +- 支持设置修复超时时间 + +## 语法 ```sql -ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)] +ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -说明: +## 必选参数 -1. 该语句仅表示让系统尝试以高优先级修复指定表或分区的分片副本,并不保证能够修复成功。用户可以通过 `SHOW REPLICA STATUS` 命令查看修复情况。 -2. 默认的 timeout 是 14400 秒 (4 小时)。超时意味着系统将不再以高优先级修复指定表或分区的分片副本。需要重新使用该命令设置 +**1. ``** -## 示例 +> 指定需要修复的表名。 +> +> 表名在其所在的数据库中必须唯一。 + +## 可选参数 + +**1. `PARTITION ( [, ...])`** + +> 指定需要修复的分区名称列表。 +> +> 如果不指定此参数,则修复整个表的所有分区。 -1. 
尝试修复指定表 +## 权限控制 + +执行此 SQL 命令的用户必须至少具有以下权限: + +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | + +## 注意事项 + +- 该语句仅表示系统会尝试以高优先级修复指定的副本,不保证一定能修复成功 +- 默认超时时间为 14400 秒(4 小时) +- 超时后系统将不再以高优先级修复指定的副本 +- 如果修复超时,需要重新执行该命令来继续修复 +- 可以通过 `SHOW REPLICA STATUS` 命令查看修复进度 +- 该命令不会影响系统的正常副本修复机制,仅提升指定表或分区的修复优先级 + +## 示例 - ADMIN REPAIR TABLE tbl1; +- 修复整个表的副本: -2. 尝试修复指定分区 + ```sql + ADMIN REPAIR TABLE tbl1; + ``` - ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); +- 修复指定分区的副本: -## 关键词 + ```sql + ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ``` - ADMIN, REPAIR, TABLE +- 查看修复进度: -### 最佳实践 + ```sql + SHOW REPLICA STATUS FROM tbl1; + ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md index ef911204cbd6c..d538391c5f435 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md @@ -24,59 +24,86 @@ specific language governing permissions and limitations under the License. 
--> +## 描述 +`SET TABLE STATUS` 语句用于手动设置 OLAP 表的状态。该语句具有以下功能: +- 仅支持 OLAP 表的状态设置 +- 可以将表状态修改为指定的目标状态 +- 用于解除因表状态导致的任务阻塞 +**支持的状态**: -## 描述 +| 状态 | 说明 | +|------|------| +| NORMAL | 表示表处于正常状态 | +| ROLLUP | 表示表正在进行 ROLLUP 操作 | +| SCHEMA_CHANGE | 表示表正在进行 Schema 变更 | +| BACKUP | 表示表正在进行备份 | +| RESTORE | 表示表正在进行恢复 | +| WAITING_STABLE | 表示表正在等待稳定状态 | -该语句用于设置指定表的状态,仅支持 OLAP 表。 +## 语法 -该命令目前仅用于手动将 OLAP 表状态设置为指定状态,从而使得某些由于表状态被阻碍的任务能够继续运行。 +```sql +ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]); +``` -语法: +其中: ```sql -ADMIN SET TABLE table_name STATUS - PROPERTIES ("key" = "value", ...); + + : "state" + + + : "NORMAL" + | "ROLLUP" + | "SCHEMA_CHANGE" + | "BACKUP" + | "RESTORE" + | "WAITING_STABLE" ``` -目前支持以下属性: +## 必选参数 -1. "state":必需。指定一个目标状态,将会修改 OLAP 表的状态至此状态。 +**1. ``** -> 当前可修改的目标状态包括: -> -> 1. NORMAL -> 2. ROLLUP -> 3. SCHEMA_CHANGE -> 4. BACKUP -> 5. RESTORE -> 6. WAITING_STABLE -> -> 如果表的状态已经是指定的状态,则会被忽略。 +> 指定要设置状态的表名。 +> +> 表名在其所在的数据库中必须唯一。 -**注意:此命令一般只用于紧急故障修复,请谨慎操作。** +**2. `PROPERTIES ("state" = "")`** -## 示例 +> 指定表的目标状态。 +> +> 必须设置 "state" 属性,且值必须是支持的状态之一。 -1. 设置表 tbl1 的状态为 NORMAL。 +## 权限控制 -```sql -admin set table tbl1 status properties("state" = "NORMAL"); -``` +执行此 SQL 命令的用户必须至少具有以下权限: -2. 
设置表 tbl2 的状态为 SCHEMA_CHANGE。 +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | -```sql -admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE"); -``` +## 注意事项 -## 关键词 +- 此命令仅用于紧急故障修复,请谨慎操作 +- 仅支持 OLAP 表,不支持其他类型的表 +- 如果表已经处于目标状态,该命令将被忽略 +- 不当的状态设置可能会导致系统异常,建议在技术支持指导下使用 +- 修改状态后,建议及时观察系统运行情况 - ADMIN, SET, TABLE, STATUS +## 示例 -### 最佳实践 +- 将表状态设置为 NORMAL: + ```sql + ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL"); + ``` +- 将表状态设置为 SCHEMA_CHANGE: + ```sql + ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE"); + ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md index 8f29d62f21396..5fd1a8154aa1a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md @@ -24,47 +24,93 @@ specific language governing permissions and limitations under the License. --> +## 描述 +`SHOW DATA SKEW` 语句用于查看表或分区的数据倾斜情况。该语句具有以下功能: +- 可以查看整个表的数据分布情况 +- 可以查看指定分区的数据分布情况 +- 展示各个分桶的数据行数、数据量及其占比 +- 支持分区表和非分区表 +## 语法 -## 描述 +```sql +SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ]; +``` + +## 必选参数 + +**1. `FROM [.]`** + +> 指定要查看的表名。可以包含数据库名称。 +> +> 表名在其所在的数据库中必须唯一。 + +## 可选参数 + +**1. 
`PARTITION ( [, ...])`** + +> 指定要查看的分区名称列表。 +> +> 如果不指定此参数,则展示表中所有分区的数据分布情况。 +> +> 对于非分区表,分区名称同表名。 + +## 返回值 + +| 列名 | 说明 | +|------|------| +| PartitionName | 分区名称 | +| BucketIdx | 分桶索引号 | +| AvgRowCount | 平均行数 | +| AvgDataSize | 平均数据大小(字节) | +| Graph | 数据分布可视化图表 | +| Percent | 该分桶数据量占总数据量的百分比 | -该语句用于查看表或某个分区的数据倾斜情况。 +## 权限控制 -语法: +执行此 SQL 命令的用户必须至少具有以下权限: -SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)]; +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| SELECT | 表(Table) | 需要对查看的表有 SELECT 权限 | -说明: +## 注意事项 -1. 结果将展示指定分区下,各个分桶的数据行数,数据量,以及每个分桶数据量在总数据量中的占比。 -2. 对于非分区表,查询结果中分区名称同表名。 +- 数据分布情况按照分区和分桶两个维度展示 +- Graph 列使用字符 `>` 直观展示数据分布比例 +- 百分比精确到小数点后两位 +- 对于非分区表,查询结果中分区名称同表名 ## 示例 -1. 分区表场景 +- 创建分区表并查看数据分布: -* 建表语句 ```sql CREATE TABLE test_show_data_skew ( - id int, - name string, - pdate date - ) - PARTITION BY RANGE(pdate) + id int, + name string, + pdate date + ) + PARTITION BY RANGE(pdate) ( - FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY - ) + FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY + ) DISTRIBUTED BY HASH(id) BUCKETS 5 PROPERTIES ( - "replication_num" = "1" + "replication_num" = "1" ); ``` -* 查询整表的数据倾斜情况 - ```sql - mysql> SHOW DATA SKEW FROM test_show_data_skew; + + 查看整表数据分布: + + ```sql + SHOW DATA SKEW FROM test_show_data_skew; + ``` + + ```text +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ @@ -90,9 +136,14 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)]; | p_20230419 | 4 | 0 | 0 | | 00.00 % | 
+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ ``` -* 查询指定分区的数据倾斜情况 + +- 查看指定分区的数据分布: + ```sql - mysql> SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418); + SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418); + ``` + + ```text +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ @@ -109,37 +160,26 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)]; +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ ``` -2. 
非分区表场景 +- 查看非分区表的数据分布: -* 建表语句 ```sql CREATE TABLE test_show_data_skew2 ( - id int, - name string, + id int, + name string, pdate date - ) + ) DISTRIBUTED BY HASH(id) BUCKETS 5 PROPERTIES ( "replication_num" = "1" ); ``` -* 查询整表的数据倾斜情况 - ```sql - mysql> SHOW DATA SKEW FROM test_show_data_skew2; - +----------------------+-----------+-------------+-------------+---------------------------+---------+ - | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | - +----------------------+-----------+-------------+-------------+---------------------------+---------+ - | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % | - | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % | - | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % | - | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % | - | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | - +----------------------+-----------+-------------+-------------+---------------------------+---------+ - + ```sql + SHOW DATA SKEW FROM test_show_data_skew2; + ``` - mysql> SHOW DATA SKEW FROM test_show_data_skew2 PARTITION(test_show_data_skew2); + ```text +----------------------+-----------+-------------+-------------+---------------------------+---------+ | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | +----------------------+-----------+-------------+-------------+---------------------------+---------+ @@ -150,9 +190,3 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)]; | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | +----------------------+-----------+-------------+-------------+---------------------------+---------+ ``` - -## 关键词 - - SHOW,DATA,SKEW - -### 最佳实践 \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md index 1a46dfef222ec..65002bb7af7c8 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md @@ -24,38 +24,92 @@ specific language governing permissions and limitations under the License. --> +## 描述 +`SHOW DATA` 语句用于展示数据量、副本数量以及统计行数信息。该语句具有以下功能: +- 可以展示当前数据库下所有表的数据量和副本数量 +- 可以展示指定表的物化视图数据量、副本数量和统计行数 +- 可以展示数据库的配额使用情况 +- 支持按照数据量、副本数量等进行排序 -## 描述 +## 语法 -该语句用于展示数据量、副本数量以及统计行数。 +```sql +SHOW DATA [ FROM [.] ] [ ORDER BY ]; +``` -语法: +其中: ```sql -SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +order_by_clause: + [ ASC | DESC ] [ , [ ASC | DESC ] ... ] ``` -说明: +## 可选参数 + +**1. `FROM [.]`** + +> 指定要查看的表名。可以包含数据库名称。 +> +> 如果不指定此参数,则展示当前数据库下所有表的数据信息。 + +**2. `ORDER BY `** + +> 指定结果集的排序方式。 +> +> 可以对任意列进行升序(ASC)或降序(DESC)排序。 +> +> 支持多列组合排序。 + +## 返回值 + +根据不同查询场景,返回以下结果集: + +- 不指定 FROM 子句时(展示数据库级别信息): -1. 如果不指定 FROM 子句,则展示当前 db 下细分到各个 table 的数据量和副本数量。其中数据量为所有副本的总数据量。而副本数量为表的所有分区以及所有物化视图的副本数量。 +| 列名 | 说明 | +|------|------| +| DbId | 数据库 ID | +| DbName | 数据库名称 | +| Size | 数据库总数据量 | +| RemoteSize | 远程存储数据量 | +| RecycleSize | 回收站数据量 | +| RecycleRemoteSize | 回收站远程存储数据量 | -2. 如果指定 FROM 子句,则展示 table 下细分到各个物化视图的数据量、副本数量和统计行数。其中数据量为所有副本的总数据量。副本数量为对应物化视图的所有分区的副本数量。统计行数为对应物化视图的所有分区统计行数。 +- 指定 FROM 子句时(展示表级别信息): -3. 统计行数时,以多个副本中,行数最大的那个副本为准。 +| 列名 | 说明 | +|------|------| +| TableName | 表名 | +| IndexName | 索引(物化视图)名称 | +| Size | 数据大小 | +| ReplicaCount | 副本数量 | +| RowCount | 统计行数(仅在查看具体表时显示)| -4. 结果集中的 `Total` 行表示汇总行。`Quota` 行表示当前数据库设置的配额。`Left` 行表示剩余配额。 +## 权限控制 -5. 如果想查看各个 Partition 的大小,请参阅 `help show partitions`。 +执行此 SQL 命令的用户必须至少具有以下权限: -6. 
可以使用 ORDER BY 对任意列组合进行排序。 +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| SELECT | 表(Table) | 需要对查看的表有 SELECT 权限 | + +## 注意事项 + +- 数据量统计包含所有副本的总数据量 +- 副本数量包含表的所有分区以及所有物化视图的副本数量 +- 统计行数时,以多个副本中行数最大的那个副本为准 +- 结果集中的 `Total` 行表示汇总数据 +- 结果集中的 `Quota` 行表示当前数据库设置的配额 +- 结果集中的 `Left` 行表示剩余配额 +- 如果需要查看各个 Partition 的大小,请使用 `SHOW PARTITIONS` 命令 ## 示例 -1. 默认展示各个 db 的汇总数据量,RecycleBin 中的数据量 +- 展示所有数据库的数据量信息: - ``` + ```sql SHOW DATA; ``` @@ -65,23 +119,18 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +-------+-----------------------------------+--------+------------+-------------+-------------------+ | 21009 | db1 | 0 | 0 | 0 | 0 | | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 | - | 0 | information_schema | 0 | 0 | 0 | 0 | - | 22010 | regression_test | 0 | 0 | 0 | 0 | - | 1 | mysql | 0 | 0 | 0 | 0 | - | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 | - | 10002 | __internal_schema | 46182 | 0 | 0 | 0 | | Total | NULL | 118946 | 0 | 0 | 0 | +-------+-----------------------------------+--------+------------+-------------+-------------------+ ``` -2. 展示特定 db 的各个 table 的数据量,副本数量,汇总数据量和汇总副本数量。 +- 展示当前数据库下所有表的数据量信息: ```sql USE db1; SHOW DATA; ``` - ``` + ```text +-----------+-------------+--------------+ | TableName | Size | ReplicaCount | +-----------+-------------+--------------+ @@ -93,13 +142,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +-----------+-------------+--------------+ ``` -3. 展示指定 db 的下指定表的细分数据量、副本数量和统计行数 +- 展示指定表的详细数据量信息: ```sql SHOW DATA FROM example_db.test; ``` - ``` + ```text +-----------+-----------+-----------+--------------+----------+ | TableName | IndexName | Size | ReplicaCount | RowCount | +-----------+-----------+-----------+--------------+----------+ @@ -110,13 +159,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +-----------+-----------+-----------+--------------+----------+ ``` -4. 
可以按照数据量、副本数量、统计行数等进行组合排序 +- 按照副本数量降序、数据量升序排序: ```sql - SHOW DATA ORDER BY ReplicaCount desc,Size asc; + SHOW DATA ORDER BY ReplicaCount DESC, Size ASC; ``` - ``` + ```text +-----------+-------------+--------------+ | TableName | Size | ReplicaCount | +-----------+-------------+--------------+ @@ -129,10 +178,3 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; | Left | 1024.000 GB | 1073741734 | +-----------+-------------+--------------+ ``` - -## 关键词 - - SHOW, DATA - -### 最佳实践 - diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md index 24db22309fe88..49055fc93fdf4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md @@ -22,34 +22,53 @@ specific language governing permissions and limitations under the License. --> +## 描述 +`CANCEL REBALANCE DISK` 语句用于取消优先均衡 BE(Backend)节点的磁盘数据。该语句具有以下功能: +- 可以取消指定 BE 节点的优先磁盘均衡 +- 可以取消整个集群所有 BE 节点的优先磁盘均衡 +- 取消后系统仍会以默认调度方式均衡 BE 的磁盘数据 -## 描述 +## 语法 -该语句用于取消优先均衡 BE 的磁盘 +```sql +ADMIN CANCEL REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` -语法: +## 可选参数 -ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; +**1. `":"`** -说明: +> 指定需要取消优先磁盘均衡的 BE 节点列表。 +> +> 每个节点由主机名(或 IP 地址)和心跳端口组成。 +> +> 如果不指定此参数,则取消所有 BE 节点的优先磁盘均衡。 -1. 该语句仅表示系统不再优先均衡指定 BE 的磁盘数据。系统仍会以默认调度方式均衡 BE 的磁盘数据。 +## 权限控制 -## 示例 +执行此 SQL 命令的用户必须至少具有以下权限: - 1. 
取消集群所有 BE 的优先磁盘均衡 +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | - ADMIN CANCEL REBALANCE DISK; +## 注意事项 - 2. 取消指定 BE 的优先磁盘均衡 +- 该语句仅表示系统不再优先均衡指定 BE 的磁盘数据,系统仍会以默认调度方式均衡 BE 的磁盘数据。 +- 执行该命令后,之前设置的优先均衡策略将立即失效。 - ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +## 示例 -## 关键词 +- 取消集群所有 BE 的优先磁盘均衡: - ADMIN,CANCEL,REBALANCE,DISK + ```sql + ADMIN CANCEL REBALANCE DISK; + ``` -## 最佳实践 +- 取消指定 BE 的优先磁盘均衡: +```sql +ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md index 73d2da92ba027..40e6416309bb7 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md @@ -24,34 +24,61 @@ specific language governing permissions and limitations under the License. --> - - - ## 描述 -该语句用于取消以高优先级修复指定表或分区 +`CANCEL REPAIR TABLE` 语句用于取消对指定表或分区的高优先级修复。该语句具有以下功能: + +- 可以取消整个表的高优先级修复 +- 可以取消指定分区的高优先级修复 +- 不影响系统默认的副本修复机制 -语法: +## 语法 ```sql -ADMIN CANCEL REPAIR TABLE table_name[ PARTITION (p1,...)]; +ADMIN CANCEL REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -说明: +## 必选参数 -1. 该语句仅表示系统不再以高优先级修复指定表或分区的分片副本。系统仍会以默认调度方式修复副本。 +**1. ``** -## 示例 +> 指定要取消修复的表名。 +> +> 表名在其所在的数据库中必须唯一。 + +## 可选参数 + +**1. `PARTITION ( [, ...])`** - 1. 
取消高优先级修复 +> 指定要取消修复的分区名称列表。 +> +> 如果不指定此参数,则取消整个表的高优先级修复。 - ```sql - ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1); - ``` +## 权限控制 + +执行此 SQL 命令的用户必须至少具有以下权限: + +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | + +## 注意事项 + +- 该语句仅取消高优先级修复,不会停止系统的默认副本修复机制 +- 取消后,系统仍会以默认调度方式修复副本 +- 如果需要重新设置高优先级修复,可以使用 `ADMIN REPAIR TABLE` 命令 +- 该命令执行后立即生效 + +## 示例 -## 关键词 +- 取消整个表的高优先级修复: - ADMIN, CANCEL, REPAIR + ```sql + ADMIN CANCEL REPAIR TABLE tbl; + ``` -## 最佳实践 +- 取消指定分区的高优先级修复: + ```sql + ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1, p2); + ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md index e0aeb7e13344c..4692dea074f4a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md @@ -26,37 +26,52 @@ under the License. ## 描述 -该语句用于尝试优先均衡指定的 BE 磁盘数据 +REBALANCE DISK 语句用于优化 BE(Backend)节点上的数据分布。该语句具有以下功能: -语法: +- 可以针对指定的 BE 节点进行数据均衡 +- 可以对整个集群的所有 BE 节点进行数据均衡 +- 优先均衡指定节点的数据,不受集群整体均衡状态的限制 - ``` - ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; - ``` +## 语法 -说明: +```sql +ADMIN REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` - 1. 该语句表示让系统尝试优先均衡指定 BE 的磁盘数据,不受限于集群是否均衡。 - 2. 默认的 timeout 是 24 小时。超时意味着系统将不再优先均衡指定的 BE 磁盘数据。需要重新使用该命令设置。 - 3. 指定 BE 的磁盘数据均衡后,该 BE 的优先级将会失效。 +## 可选参数 -## 示例 +**1. `":"`** + +> 指定需要进行数据均衡的 BE 节点列表。 +> +> 每个节点由主机名(或 IP 地址)和心跳端口组成。 +> +> 如果不指定此参数,则对所有 BE 节点进行均衡。 + +## 权限控制 -1. 
尝试优先均衡集群内的所有 BE +执行此 SQL 命令的用户必须至少具有以下权限: - ``` - ADMIN REBALANCE DISK; - ``` +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | -2. 尝试优先均衡指定 BE +## 注意事项 - ``` - ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); - ``` +- 命令的默认超时时间为 24 小时。超时后,系统将不再优先均衡指定的 BE 磁盘数据。如需继续均衡,需要重新执行该命令。 +- 当指定 BE 节点的磁盘数据均衡完成后,该节点的优先均衡设置将自动失效。 +- 该命令可以在集群非均衡状态下执行。 + +## 示例 -## 关键词 +- 对集群内所有 BE 节点进行数据均衡: - ADMIN,REBALANCE,DISK +```sql +ADMIN REBALANCE DISK; +``` -## 最佳实践 +- 对指定的两个 BE 节点进行数据均衡: +```sql +ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md index af32175ca8671..1c3756409e6e7 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md @@ -24,38 +24,71 @@ specific language governing permissions and limitations under the License. --> +## 描述 +`REPAIR TABLE` 语句用于优先修复指定表或分区的副本。该语句具有以下功能: +- 可以修复整个表的所有副本 +- 可以修复指定分区的副本 +- 以高优先级进行副本修复 +- 支持设置修复超时时间 +## 语法 -## 描述 +```sql +ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ]; +``` -语句用于尝试优先修复指定的表或分区 +## 必选参数 -语法: +**1. ``** -```sql -ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)] -``` +> 指定需要修复的表名。 +> +> 表名在其所在的数据库中必须唯一。 -说明: +## 可选参数 -1. 该语句仅表示让系统尝试以高优先级修复指定表或分区的分片副本,并不保证能够修复成功。用户可以通过 `SHOW REPLICA STATUS` 命令查看修复情况。 -2. 默认的 timeout 是 14400 秒 (4 小时)。超时意味着系统将不再以高优先级修复指定表或分区的分片副本。需要重新使用该命令设置 +**1. `PARTITION ( [, ...])`** -## 示例 +> 指定需要修复的分区名称列表。 +> +> 如果不指定此参数,则修复整个表的所有分区。 -1. 
尝试修复指定表 +## 权限控制 + +执行此 SQL 命令的用户必须至少具有以下权限: + +| 权限(Privilege) | 对象(Object) | 说明(Notes) | +| :---------------- | :------------- | :-------------------------------------- | +| ADMIN | 系统 | 用户必须拥有 ADMIN 权限才能执行该命令 | + +## 注意事项 + +- 该语句仅表示系统会尝试以高优先级修复指定的副本,不保证一定能修复成功 +- 默认超时时间为 14400 秒(4 小时) +- 超时后系统将不再以高优先级修复指定的副本 +- 如果修复超时,需要重新执行该命令来继续修复 +- 可以通过 `SHOW REPLICA STATUS` 命令查看修复进度 +- 该命令不会影响系统的正常副本修复机制,仅提升指定表或分区的修复优先级 + +## 示例 - ADMIN REPAIR TABLE tbl1; +- 修复整个表的副本: -2. 尝试修复指定分区 + ```sql + ADMIN REPAIR TABLE tbl1; + ``` - ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); +- 修复指定分区的副本: -## 关键词 + ```sql + ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ``` - ADMIN, REPAIR, TABLE +- 查看修复进度: -## 最佳实践 + ```sql + SHOW REPLICA STATUS FROM tbl1; + ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md index 26d4d561240ff..d538391c5f435 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md @@ -24,59 +24,86 @@ specific language governing permissions and limitations under the License. 
--> +## 描述 +`SET TABLE STATUS` 语句用于手动设置 OLAP 表的状态。该语句具有以下功能: +- 仅支持 OLAP 表的状态设置 +- 可以将表状态修改为指定的目标状态 +- 用于解除因表状态导致的任务阻塞 +**支持的状态**: -## 描述 +| 状态 | 说明 | +|------|------| +| NORMAL | 表示表处于正常状态 | +| ROLLUP | 表示表正在进行 ROLLUP 操作 | +| SCHEMA_CHANGE | 表示表正在进行 Schema 变更 | +| BACKUP | 表示表正在进行备份 | +| RESTORE | 表示表正在进行恢复 | +| WAITING_STABLE | 表示表正在等待稳定状态 | -该语句用于设置指定表的状态,仅支持 OLAP 表。 +## 语法 -该命令目前仅用于手动将 OLAP 表状态设置为指定状态,从而使得某些由于表状态被阻碍的任务能够继续运行。 +```sql +ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]); +``` -语法: +其中: ```sql -ADMIN SET TABLE table_name STATUS - PROPERTIES ("key" = "value", ...); + + : "state" + + + : "NORMAL" + | "ROLLUP" + | "SCHEMA_CHANGE" + | "BACKUP" + | "RESTORE" + | "WAITING_STABLE" ``` -目前支持以下属性: +## 必选参数 -1. "state":必需。指定一个目标状态,将会修改 OLAP 表的状态至此状态。 +**1. ``** -> 当前可修改的目标状态包括: -> -> 1. NORMAL -> 2. ROLLUP -> 3. SCHEMA_CHANGE -> 4. BACKUP -> 5. RESTORE -> 6. WAITING_STABLE -> -> 如果表的状态已经是指定的状态,则会被忽略。 +> 指定要设置状态的表名。 +> +> 表名在其所在的数据库中必须唯一。 -**注意:此命令一般只用于紧急故障修复,请谨慎操作。** +**2. `PROPERTIES ("state" = "")`** -## 示例 +> 指定表的目标状态。 +> +> 必须设置 "state" 属性,且值必须是支持的状态之一。 -1. 设置表 tbl1 的状态为 NORMAL。 +## 权限控制 -```sql -admin set table tbl1 status properties("state" = "NORMAL"); -``` +执行此 SQL 命令的用户必须至少具有以下权限: -2. 
Set the status of table tbl2 to SCHEMA_CHANGE.
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
-```sql
-admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE");
-```
+## Usage Notes
-## Keywords
+- This command is only intended for emergency fault repair. Use it with caution
+- Only OLAP tables are supported; other table types are not
+- If the table is already in the target status, the command is ignored
+- An improper status setting may cause system anomalies; it is recommended to use this command under the guidance of technical support
+- After changing the status, it is recommended to monitor the system promptly
-    ADMIN, SET, TABLE, STATUS
+## Examples
-## Best Practice
+- Set the table status to NORMAL:
+  ```sql
+  ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL");
+  ```
+- Set the table status to SCHEMA_CHANGE:
+  ```sql
+  ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE");
+  ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
index b52cadb13122e..5fd1a8154aa1a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
@@ -24,45 +24,93 @@ specific language governing permissions and limitations
 under the License.
 -->
+## Description
+The `SHOW DATA SKEW` statement is used to view the data skew of a table or partition. This statement has the following functionalities:
+- It can show the data distribution of an entire table
+- It can show the data distribution of specified partitions
+- It shows the row count and data size of each bucket, and the proportion each accounts for
+- It supports both partitioned and non-partitioned tables
-## Description
+## Syntax
+
+```sql
+SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ];
+```
+
+## Required Parameters
+
+**1. `FROM [.]`**
+
+> Specifies the name of the table to view. The database name may be included.
+>
+> The table name must be unique within its database.
+
+## Optional Parameters
+
+**1. 
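`PARTITION ( [, ...])`**
+
+> Specifies the list of partition names to view.
+>
+> If this parameter is not specified, the data distribution of all partitions in the table is shown.
+>
+> For a non-partitioned table, the partition name is the same as the table name.
+
+## Return Value
-    This statement is used to view the data skew of a table or a partition.
+| Column | Description |
+|------|------|
+| PartitionName | Partition name |
+| BucketIdx | Bucket index |
+| AvgRowCount | Average row count |
+| AvgDataSize | Average data size (bytes) |
+| Graph | Visual chart of the data distribution |
+| Percent | Percentage of the total data size held by this bucket |
-    Syntax:
+## Access Control Requirements
-        SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
+Users executing this SQL command must have at least the following permissions:
-    Notes:
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| SELECT | Table | SELECT privilege is required on the table being viewed |
-        1. The result shows, for the specified partitions, the row count and data size of each bucket, and the proportion of each bucket's data size in the total data size.
-        2. For a non-partitioned table, the partition name in the query result is the same as the table name.
+## Usage Notes
+
+- The data distribution is shown along two dimensions: partition and bucket
+- The Graph column uses the `>` character to visualize the data distribution ratio
+- Percentages are precise to two decimal places
+- For a non-partitioned table, the partition name in the query result is the same as the table name
 ## Examples
-1. Partitioned table scenario
-* Table creation statement
+- Create a partitioned table and view its data distribution:
+
     ```sql
     CREATE TABLE test_show_data_skew
     (
-        id int,
-        name string,
-        pdate date
-    )
-    PARTITION BY RANGE(pdate)
+        id int,
+        name string,
+        pdate date
+    )
+    PARTITION BY RANGE(pdate)
     (
-        FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY
-    )
+        FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY
+    )
     DISTRIBUTED BY HASH(id) BUCKETS 5
     PROPERTIES
     (
-        "replication_num" = "1"
+        "replication_num" = "1"
     );
     ```
-* Query the data skew of the entire table
-    ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew;
+
+    View the data distribution of the entire table:
+
+    ```sql
+    SHOW DATA SKEW FROM test_show_data_skew;
+    ```
+
+    ```text
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
@@ -88,9 +136,14 @@ under the License.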
     | p_20230419 | 4 | 0 | 0 | | 00.00 % |
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     ```
-* Query the data skew of specified partitions
+
+- View the data distribution of specified partitions:
+
     ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);
+    SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);
+    ```
+
+    ```text
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
@@ -107,37 +160,26 @@ under the License.
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     ```
-2. 
Non-partitioned table scenario
+- View the data distribution of a non-partitioned table:
-* Table creation statement
     ```sql
     CREATE TABLE test_show_data_skew2
     (
-        id int,
-        name string,
+        id int,
+        name string,
         pdate date
-    )
+    )
     DISTRIBUTED BY HASH(id) BUCKETS 5
     PROPERTIES
     (
         "replication_num" = "1"
     );
     ```
-* Query the data skew of the entire table
-    ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew2;
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-    | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-    | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % |
-    | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % |
-    | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % |
-    | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % |
-    | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-
+    ```sql
+    SHOW DATA SKEW FROM test_show_data_skew2;
+    ```
-    mysql> SHOW DATA SKEW FROM test_show_data_skew2 PARTITION(test_show_data_skew2);
+    ```text
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
@@ -148,9 +190,3 @@ under the License.
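     | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
     ```
-
-## Keywords
-
-    SHOW,DATA,SKEW
-
-## Best Practice
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
index 7150e6ce85862..65002bb7af7c8 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
@@ -26,59 +26,111 @@ under the License.
 ## Description
-This statement is used to show the data size, replica count, and row count statistics.
+The `SHOW DATA` statement is used to show data size, replica count, and row count statistics. This statement has the following functionalities:
-Syntax:
+- It can show the data size and replica count of all tables in the current database
+- It can show the data size, replica count, and row count of the materialized views of a specified table
+- It can show the quota usage of the database
+- It supports sorting by data size, replica count, and other columns
+
+## Syntax
+
+```sql
+SHOW DATA [ FROM [.] ] [ ORDER BY ];
+```
+
+Where:
 ```sql
-SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
+order_by_clause:
+    [ ASC | DESC ] [ , [ ASC | DESC ] ... ]
 ```
-Notes:
+## Optional Parameters
+
+**1. `FROM [.]`**
+
+> Specifies the name of the table to view. The database name may be included.
+>
+> If this parameter is not specified, data information for all tables in the current database is shown.
+
+**2. `ORDER BY `**
+
+> Specifies how the result set is sorted.
+>
+> Any column can be sorted in ascending (ASC) or descending (DESC) order.
+>
+> Sorting by a combination of multiple columns is supported.
+
+## Return Value
+
+Depending on the query scenario, the following result sets are returned:
+
+- When the FROM clause is not specified (database-level information):
-1. If the FROM clause is not specified, the data size and replica count are shown for each table in the current database. The data size is the total data size of all replicas. The replica count covers all partitions and all materialized views of the table.
+| Column | Description |
+|------|------|
+| DbId | Database ID |
+| DbName | Database name |
+| Size | Total data size of the database |
+| RemoteSize | Data size in remote storage |
+| RecycleSize | Data size in the recycle bin |
+| RecycleRemoteSize | Remote-storage data size in the recycle bin |
-2. If the FROM clause is specified, the data size, replica count, and row count are shown for each materialized view of the table. The data size is the total data size of all replicas. The replica count covers all partitions of the corresponding materialized view. The row count covers all partitions of the corresponding materialized view.
+- When the FROM clause is specified (table-level information):
-3. 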
When counting rows, the replica with the largest row count among the multiple replicas is used.
+| Column | Description |
+|------|------|
+| TableName | Table name |
+| IndexName | Index (materialized view) name |
+| Size | Data size |
+| ReplicaCount | Replica count |
+| RowCount | Row count (shown only when viewing a specific table) |
-4. In the result set, the `Total` row is the summary row, the `Quota` row is the quota set for the current database, and the `Left` row is the remaining quota.
+## Access Control Requirements
-5. To view the size of each partition, see `help show partitions`.
+Users executing this SQL command must have at least the following permissions:
-6. ORDER BY can be used to sort by any combination of columns.
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| SELECT | Table | SELECT privilege is required on the table being viewed |
+
+## Usage Notes
+
+- The data size covers the total data size of all replicas
+- The replica count covers all partitions and all materialized views of the table
+- When counting rows, the replica with the largest row count among the multiple replicas is used
+- The `Total` row in the result set is the summary
+- The `Quota` row in the result set is the quota set for the current database
+- The `Left` row in the result set is the remaining quota
+- To view the size of each partition, use the `SHOW PARTITIONS` command
 ## Examples
-1. By default, show the total data size of each database and the data size in the RecycleBin
+- Show data size information for all databases:
     ```sql
     SHOW DATA;
     ```
-    ```sql
+    ```
     +-------+-----------------------------------+--------+------------+-------------+-------------------+
     | DbId | DbName | Size | RemoteSize | RecycleSize | RecycleRemoteSize |
     +-------+-----------------------------------+--------+------------+-------------+-------------------+
     | 21009 | db1 | 0 | 0 | 0 | 0 |
     | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 |
-    | 0 | information_schema | 0 | 0 | 0 | 0 |
-    | 22010 | regression_test | 0 | 0 | 0 | 0 |
-    | 1 | mysql | 0 | 0 | 0 | 0 |
-    | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 |
-    | 10002 | __internal_schema | 46182 | 0 | 0 | 0 |
     | Total | NULL | 118946 | 0 | 0 | 0 |
     +-------+-----------------------------------+--------+------------+-------------+-------------------+
     ```
-2. Show the data size and replica count of each table in a specific database, along with the total data size and total replica count.
+- Show data size information for all tables in the current database:
     ```sql
     USE db1;
     SHOW DATA;
     ```
-    ```
+    ```text
     +-----------+-------------+--------------+
     | TableName | Size | ReplicaCount |
     +-----------+-------------+--------------+
@@ -90,13 +142,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     +-----------+-------------+--------------+
     ```
-3. 
Show the detailed data size, replica count, and row count of a specified table in a specified database
+- Show detailed data size information for a specified table:
     ```sql
     SHOW DATA FROM example_db.test;
     ```
-    ```
+    ```text
     +-----------+-----------+-----------+--------------+----------+
     | TableName | IndexName | Size | ReplicaCount | RowCount |
     +-----------+-----------+-----------+--------------+----------+
@@ -107,13 +159,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     +-----------+-----------+-----------+--------------+----------+
     ```
-4. Sorting can combine data size, replica count, row count, and so on
+- Sort by replica count in descending order and data size in ascending order:
     ```sql
-    SHOW DATA ORDER BY ReplicaCount desc,Size asc;
+    SHOW DATA ORDER BY ReplicaCount DESC, Size ASC;
     ```
-    ```
+    ```text
     +-----------+-------------+--------------+
     | TableName | Size | ReplicaCount |
     +-----------+-------------+--------------+
@@ -126,10 +178,3 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     | Left | 1024.000 GB | 1073741734 |
     +-----------+-------------+--------------+
     ```
-
-## Keywords
-
-    SHOW, DATA
-
-## Best Practice
-
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md
index 6e0e49f043da1..49055fc93fdf4 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md
@@ -22,35 +22,53 @@ specific language governing permissions and limitations
 under the License.
 -->
+## Description
+The `CANCEL REBALANCE DISK` statement is used to cancel the high-priority disk data balancing for Backend (BE) nodes. This statement has the following functionalities:
+- It can cancel the high-priority disk balancing for specified BE nodes
+- It can cancel the high-priority disk balancing for all BE nodes in the entire cluster
+- After cancellation, the system still balances the disk data of BE nodes using the default scheduling method
+## Syntax
-## Description
+```sql
+ADMIN CANCEL REBALANCE DISK [ ON ( ":" [, ... ] ) ];
+```
-This statement is used to cancel rebalancing disks of specified backends with high priority
+## Optional Parameters
-Syntax:
+**1. 
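`":"`**
-ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)];
+> Specifies the list of BE nodes for which the high-priority disk balancing needs to be canceled.
+>
+> Each node consists of a hostname (or IP address) and a heartbeat port.
+>
+> If this parameter is not specified, the high-priority disk balancing is canceled for all BE nodes.
-Notes:
+## Access Control Requirements
-1. This statement only indicates that the system no longer rebalances the disks of the specified BEs with high priority. The system still rebalances BE disk data using the default scheduling.
+Users executing this SQL command must have at least the following permissions:
-## Examples
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
-1. Cancel the high-priority disk balancing for all BEs in the cluster
+## Usage Notes
-    ADMIN CANCEL REBALANCE DISK;
+- This statement only indicates that the system no longer prioritizes balancing the disk data of the specified BEs; the system still balances BE disk data using the default scheduling method.
+- After this command is executed, any previously set high-priority balancing strategy becomes invalid immediately.
-2. Cancel the high-priority disk balancing for specified BEs
-
-    ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");
+## Examples
-## Keywords
+- Cancel high-priority disk balancing for all BEs in the cluster:
-    ADMIN,CANCEL,REBALANCE,DISK
+  ```sql
+  ADMIN CANCEL REBALANCE DISK;
+  ```
-### Best Practice
+- Cancel high-priority disk balancing for specified BEs:
+```sql
+ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md
new file mode 100644
index 0000000000000..40e6416309bb7
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md
@@ -0,0 +1,84 @@
+---
+{
+ "title": "CANCEL REPAIR TABLE",
+ "language": "zh-CN"
+}
+---
+
+
+
+## Description
+
+The `CANCEL REPAIR TABLE` statement is used to cancel high-priority repairs for a specified table or partition. This statement has the following functionalities:
+
+- It can cancel the high-priority repair of an entire table
+- It can cancel the high-priority repair of specified partitions
+- It does not affect the system's default replica repair mechanism
+
+## Syntax
+
+```sql
+ADMIN CANCEL REPAIR TABLE [ PARTITION ( [, ...]) ];
+```
+
+## Required Parameters
+
+**1. ``**
+
+> Specifies the name of the table for which the repair is to be canceled.
+>
+> The table name must be unique within its database.
+
+## Optional Parameters
+
+**1. 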
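`PARTITION ( [, ...])`**
+
+> Specifies the list of partition names for which the repair is to be canceled.
+>
+> If this parameter is not specified, the high-priority repair is canceled for the entire table.
+
+## Access Control Requirements
+
+Users executing this SQL command must have at least the following permissions:
+
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
+
+## Usage Notes
+
+- This statement only cancels the high-priority repair; it does not stop the system's default replica repair mechanism
+- After cancellation, the system still repairs replicas using the default scheduling method
+- To set high-priority repair again, use the `ADMIN REPAIR TABLE` command
+- This command takes effect immediately after execution
+
+## Examples
+
+- Cancel the high-priority repair of an entire table:
+
+  ```sql
+  ADMIN CANCEL REPAIR TABLE tbl;
+  ```
+
+- Cancel the high-priority repair of specified partitions:
+
+  ```sql
+  ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1, p2);
+  ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md
deleted file mode 100644
index 436ee2a115bc5..0000000000000
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR.md
+++ /dev/null
@@ -1,58 +0,0 @@
----
-{
- "title": "CANCEL REPAIR",
- "language": "zh-CN"
-}
----
-
-
-
-
-
-
-
-## Description
-
-This statement is used to cancel the repair of the specified table or partition with high priority
-
-Syntax:
-
-```sql
-ADMIN CANCEL REPAIR TABLE table_name[ PARTITION (p1,...)];
-```
-
-Notes:
-
-1. This statement only indicates that the system no longer repairs the tablet replicas of the specified table or partition with high priority. The system still repairs replicas using the default scheduling.
-
-## Examples
-
- 1. 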
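Cancel the high-priority repair
-
-    ```sql
-    ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1);
-    ```
-
-## Keywords
-
-    ADMIN, CANCEL, REPAIR
-
-### Best Practice
-
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md
index 96ffdb2d4b0f0..4692dea074f4a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md
@@ -26,37 +26,52 @@ under the License.
 ## Description
-This statement is used to try to rebalance the disk data of the specified BEs with high priority
+The REBALANCE DISK statement is used to optimize the data distribution on Backend (BE) nodes. This statement has the following functionalities:
-Syntax:
+- It can balance data for specified BE nodes
+- It can balance data for all BE nodes in the entire cluster
+- It balances the data of the specified nodes with priority, regardless of the overall balance state of the cluster
-  ```
-  ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)];
-  ```
+## Syntax
-Notes:
+```sql
+ADMIN REBALANCE DISK [ ON ( ":" [, ... ] ) ];
+```
-  1. This statement tells the system to try to rebalance the disk data of the specified BEs with priority, regardless of whether the cluster is balanced.
-  2. The default timeout is 24 hours. After the timeout, the system no longer rebalances the disk data of the specified BEs with priority. The command must be run again to set it.
-  3. Once the disk data of a specified BE is balanced, the priority of that BE expires.
+## Optional Parameters
-## Examples
+**1. `":"`**
+
+> Specifies the list of BE nodes whose data needs to be balanced.
+>
+> Each node consists of a hostname (or IP address) and a heartbeat port.
+>
+> If this parameter is not specified, all BE nodes are balanced.
+
+## Access Control Requirements
-1. Try to rebalance all BEs in the cluster with priority
+Users executing this SQL command must have at least the following permissions:
-  ```
-  ADMIN REBALANCE DISK;
-  ```
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
-2. 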
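Try to rebalance the specified BEs with priority
+## Usage Notes
-  ```
-  ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");
-  ```
+- The default timeout of the command is 24 hours. After the timeout, the system no longer rebalances the disk data of the specified BEs with priority. To continue balancing, run the command again.
+- Once the disk data of a specified BE node has been balanced, the priority balancing setting of that node expires automatically.
+- This command can be executed while the cluster is unbalanced.
+
+## Examples
-## Keywords
+- Balance data for all BE nodes in the cluster:
-    ADMIN,REBALANCE,DISK
+```sql
+ADMIN REBALANCE DISK;
+```
-### Best Practice
+- Balance data for two specified BE nodes:
+```sql
+ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md
index 1bcde56584a2c..1c3756409e6e7 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md
@@ -24,36 +24,71 @@ specific language governing permissions and limitations
 under the License.
 -->
-
-
 ## Description
-This statement is used to try to repair the specified table or partition with priority
+The `REPAIR TABLE` statement is used to prioritize the repair of replicas for a specified table or partition. This statement has the following functionalities:
-Syntax:
+- It can repair all replicas of an entire table
+- It can repair replicas of specified partitions
+- It performs replica repairs with high priority
+- It supports setting a repair timeout
+
+## Syntax
 ```sql
-ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)]
+ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ];
 ```
-Notes:
+## Required Parameters
-1. This statement only tells the system to try to repair the tablet replicas of the specified table or partition with high priority; it does not guarantee that the repair succeeds. Users can check the repair status with the `SHOW REPLICA STATUS` command.
-2. The default timeout is 14400 seconds (4 hours). After the timeout, the system no longer repairs the tablet replicas of the specified table or partition with high priority. The command must be run again to set it
+**1. ``**
-## Examples
+> Specifies the name of the table that needs to be repaired.
+>
+> The table name must be unique within its database.
+
+## Optional Parameters
+
+**1. `PARTITION ( [, ...])`**
+
+> Specifies the list of partition names that need to be repaired.
+>
+> If this parameter is not specified, all partitions of the entire table are repaired.
-1. 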
Attempt to repair the specified table
+## Access Control Requirements
+
+Users executing this SQL command must have at least the following permissions:
+
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
+
+## Usage Notes
+
+- This statement only indicates that the system will attempt to repair the specified replicas with high priority; it does not guarantee that the repair succeeds
+- The default timeout is 14400 seconds (4 hours)
+- After the timeout, the system no longer repairs the specified replicas with high priority
+- If the repair times out, run this command again to continue the repair
+- Repair progress can be checked with the `SHOW REPLICA STATUS` command
+- This command does not affect the system's normal replica repair mechanism; it only raises the repair priority of the specified table or partitions
+
+## Examples
-    ADMIN REPAIR TABLE tbl1;
+- Repair the replicas of an entire table:
-2. Attempt to repair the specified partitions
+  ```sql
+  ADMIN REPAIR TABLE tbl1;
+  ```
-    ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2);
+- Repair the replicas of specified partitions:
-## Keywords
+  ```sql
+  ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2);
+  ```
-    ADMIN, REPAIR, TABLE
+- Check the repair progress:
-### Best Practice
+  ```sql
+  SHOW REPLICA STATUS FROM tbl1;
+  ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md
index ef911204cbd6c..d538391c5f435 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md
@@ -24,59 +24,86 @@ specific language governing permissions and limitations
 under the License.
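 -->
+## Description
+The `SET TABLE STATUS` statement is used to manually set the status of an OLAP table. This statement has the following functionalities:
+- It only supports setting the status of OLAP tables
+- It can change the table status to a specified target status
+- It is used to unblock tasks that are blocked by the table status
+**Supported statuses**:
-## Description
+| Status | Description |
+|------|------|
+| NORMAL | The table is in the normal state |
+| ROLLUP | The table is undergoing a ROLLUP operation |
+| SCHEMA_CHANGE | The table is undergoing a schema change |
+| BACKUP | The table is being backed up |
+| RESTORE | The table is being restored |
+| WAITING_STABLE | The table is waiting to reach a stable state |
-This statement is used to set the status of the specified table. Only OLAP tables are supported.
+## Syntax
-This command is currently only used to manually set an OLAP table to a specified status, so that tasks blocked by the table status can continue to run.
+```sql
+ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]);
+```
-Syntax:
+Where:
 ```sql
-ADMIN SET TABLE table_name STATUS
-    PROPERTIES ("key" = "value", ...);
+
+  : "state"
+
+
+  : "NORMAL"
+  | "ROLLUP"
+  | "SCHEMA_CHANGE"
+  | "BACKUP"
+  | "RESTORE"
+  | "WAITING_STABLE"
 ```
-The following properties are currently supported:
+## Required Parameters
-1. "state": Required. Specifies a target status; the status of the OLAP table will be changed to this status.
+**1. ``**
-> The target statuses that can currently be set include:
->
-> 1. NORMAL
-> 2. ROLLUP
-> 3. SCHEMA_CHANGE
-> 4. BACKUP
-> 5. RESTORE
-> 6. WAITING_STABLE
->
-> If the table is already in the specified status, the command is ignored.
+> Specifies the name of the table whose status is to be set.
+>
+> The table name must be unique within its database.
-**Note: This command is generally only used for emergency fault repair. Use it with caution.**
+**2. `PROPERTIES ("state" = "")`**
-## Examples
+> Specifies the target status of the table.
+>
+> The "state" property must be set, and its value must be one of the supported statuses.
-1. Set the status of table tbl1 to NORMAL.
+## Access Control Requirements
-```sql
-admin set table tbl1 status properties("state" = "NORMAL");
-```
+Users executing this SQL command must have at least the following permissions:
-2. 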
Set the status of table tbl2 to SCHEMA_CHANGE.
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| ADMIN | System | The user must have ADMIN privileges to execute this command |
-```sql
-admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE");
-```
+## Usage Notes
-## Keywords
+- This command is only intended for emergency fault repair. Use it with caution
+- Only OLAP tables are supported; other table types are not
+- If the table is already in the target status, the command is ignored
+- An improper status setting may cause system anomalies; it is recommended to use this command under the guidance of technical support
+- After changing the status, it is recommended to monitor the system promptly
-    ADMIN, SET, TABLE, STATUS
+## Examples
-### Best Practice
+- Set the table status to NORMAL:
+  ```sql
+  ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL");
+  ```
+- Set the table status to SCHEMA_CHANGE:
+  ```sql
+  ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE");
+  ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
index 8f29d62f21396..5fd1a8154aa1a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md
@@ -24,47 +24,93 @@ specific language governing permissions and limitations
 under the License.
 -->
+## Description
+The `SHOW DATA SKEW` statement is used to view the data skew of a table or partition. This statement has the following functionalities:
+- It can show the data distribution of an entire table
+- It can show the data distribution of specified partitions
+- It shows the row count and data size of each bucket, and the proportion each accounts for
+- It supports both partitioned and non-partitioned tables
+## Syntax
-## Description
+```sql
+SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ];
+```
+
+## Required Parameters
+
+**1. `FROM [.]`**
+
+> Specifies the name of the table to view. The database name may be included.
+>
+> The table name must be unique within its database.
+
+## Optional Parameters
+
+**1. 
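`PARTITION ( [, ...])`**
+
+> Specifies the list of partition names to view.
+>
+> If this parameter is not specified, the data distribution of all partitions in the table is shown.
+>
+> For a non-partitioned table, the partition name is the same as the table name.
+
+## Return Value
+
+| Column | Description |
+|------|------|
+| PartitionName | Partition name |
+| BucketIdx | Bucket index |
+| AvgRowCount | Average row count |
+| AvgDataSize | Average data size (bytes) |
+| Graph | Visual chart of the data distribution |
+| Percent | Percentage of the total data size held by this bucket |
-This statement is used to view the data skew of a table or a partition.
+## Access Control Requirements
-Syntax:
+Users executing this SQL command must have at least the following permissions:
-SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| SELECT | Table | SELECT privilege is required on the table being viewed |
-Notes:
+## Usage Notes
-1. The result shows, for the specified partitions, the row count and data size of each bucket, and the proportion of each bucket's data size in the total data size.
-2. For a non-partitioned table, the partition name in the query result is the same as the table name.
+- The data distribution is shown along two dimensions: partition and bucket
+- The Graph column uses the `>` character to visualize the data distribution ratio
+- Percentages are precise to two decimal places
+- For a non-partitioned table, the partition name in the query result is the same as the table name
 ## Examples
-1. Partitioned table scenario
+- Create a partitioned table and view its data distribution:
-* Table creation statement
     ```sql
     CREATE TABLE test_show_data_skew
     (
-        id int,
-        name string,
-        pdate date
-    )
-    PARTITION BY RANGE(pdate)
+        id int,
+        name string,
+        pdate date
+    )
+    PARTITION BY RANGE(pdate)
     (
-        FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY
-    )
+        FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY
+    )
     DISTRIBUTED BY HASH(id) BUCKETS 5
     PROPERTIES
     (
-        "replication_num" = "1"
+        "replication_num" = "1"
     );
     ```
-* Query the data skew of the entire table
-    ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew;
+
+    View the data distribution of the entire table:
+
+    ```sql
+    SHOW DATA SKEW FROM test_show_data_skew;
+    ```
+
+    ```text
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
@@ -90,9 +136,14 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
     | p_20230419 | 4 | 0 | 0 | | 00.00 % |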
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     ```
-* Query the data skew of specified partitions
+
+- View the data distribution of specified partitions:
+
     ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);
+    SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);
+    ```
+
+    ```text
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
@@ -109,37 +160,26 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
     +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
     ```
-2. 
Non-partitioned table scenario
+- View the data distribution of a non-partitioned table:
-* Table creation statement
     ```sql
     CREATE TABLE test_show_data_skew2
     (
-        id int,
-        name string,
+        id int,
+        name string,
         pdate date
-    )
+    )
     DISTRIBUTED BY HASH(id) BUCKETS 5
     PROPERTIES
     (
         "replication_num" = "1"
     );
     ```
-* Query the data skew of the entire table
-    ```sql
-    mysql> SHOW DATA SKEW FROM test_show_data_skew2;
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-    | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-    | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % |
-    | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % |
-    | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % |
-    | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % |
-    | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
-    +----------------------+-----------+-------------+-------------+---------------------------+---------+
-
+    ```sql
+    SHOW DATA SKEW FROM test_show_data_skew2;
+    ```
-    mysql> SHOW DATA SKEW FROM test_show_data_skew2 PARTITION(test_show_data_skew2);
+    ```text
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
     | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
@@ -150,9 +190,3 @@ SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
     | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
     +----------------------+-----------+-------------+-------------+---------------------------+---------+
     ```
-
-## Keywords
-
-    SHOW,DATA,SKEW
-
-### Best Practice
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
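b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
index 1a46dfef222ec..65002bb7af7c8 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md
@@ -24,38 +24,92 @@ specific language governing permissions and limitations
 under the License.
 -->
+## Description
+The `SHOW DATA` statement is used to show data size, replica count, and row count statistics. This statement has the following functionalities:
+- It can show the data size and replica count of all tables in the current database
+- It can show the data size, replica count, and row count of the materialized views of a specified table
+- It can show the quota usage of the database
+- It supports sorting by data size, replica count, and other columns
-## Description
+## Syntax
-This statement is used to show the data size, replica count, and row count statistics.
+```sql
+SHOW DATA [ FROM [.] ] [ ORDER BY ];
+```
-Syntax:
+Where:
 ```sql
-SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
+order_by_clause:
+    [ ASC | DESC ] [ , [ ASC | DESC ] ... ]
 ```
-Notes:
+## Optional Parameters
+
+**1. `FROM [.]`**
+
+> Specifies the name of the table to view. The database name may be included.
+>
+> If this parameter is not specified, data information for all tables in the current database is shown.
+
+**2. `ORDER BY `**
+
+> Specifies how the result set is sorted.
+>
+> Any column can be sorted in ascending (ASC) or descending (DESC) order.
+>
+> Sorting by a combination of multiple columns is supported.
+
+## Return Value
+
+Depending on the query scenario, the following result sets are returned:
+
+- When the FROM clause is not specified (database-level information):
-1. If the FROM clause is not specified, the data size and replica count are shown for each table in the current database. The data size is the total data size of all replicas. The replica count covers all partitions and all materialized views of the table.
+| Column | Description |
+|------|------|
+| DbId | Database ID |
+| DbName | Database name |
+| Size | Total data size of the database |
+| RemoteSize | Data size in remote storage |
+| RecycleSize | Data size in the recycle bin |
+| RecycleRemoteSize | Remote-storage data size in the recycle bin |
-2. If the FROM clause is specified, the data size, replica count, and row count are shown for each materialized view of the table. The data size is the total data size of all replicas. The replica count covers all partitions of the corresponding materialized view. The row count covers all partitions of the corresponding materialized view.
+- When the FROM clause is specified (table-level information):
-3. When counting rows, the replica with the largest row count among the multiple replicas is used.
+| Column | Description |
+|------|------|
+| TableName | Table name |
+| IndexName | Index (materialized view) name |
+| Size | Data size |
+| ReplicaCount | Replica count |
+| RowCount | Row count (shown only when viewing a specific table) |
-4. In the result set, the `Total` row is the summary row, the `Quota` row is the quota set for the current database, and the `Left` row is the remaining quota.
+## Access Control Requirements
-5. To view the size of each partition, see `help show partitions`.
+Users executing this SQL command must have at least the following permissions:
-6. 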
ORDER BY can be used to sort by any combination of columns.
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :-------------------------------------- |
+| SELECT | Table | SELECT privilege is required on the table being viewed |
+
+## Usage Notes
+
+- The data size covers the total data size of all replicas
+- The replica count covers all partitions and all materialized views of the table
+- When counting rows, the replica with the largest row count among the multiple replicas is used
+- The `Total` row in the result set is the summary
+- The `Quota` row in the result set is the quota set for the current database
+- The `Left` row in the result set is the remaining quota
+- To view the size of each partition, use the `SHOW PARTITIONS` command
 ## Examples
-1. By default, show the total data size of each database and the data size in the RecycleBin
+- Show data size information for all databases:
-    ```
+    ```sql
     SHOW DATA;
     ```
@@ -65,23 +119,18 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     +-------+-----------------------------------+--------+------------+-------------+-------------------+
     | 21009 | db1 | 0 | 0 | 0 | 0 |
     | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 |
-    | 0 | information_schema | 0 | 0 | 0 | 0 |
-    | 22010 | regression_test | 0 | 0 | 0 | 0 |
-    | 1 | mysql | 0 | 0 | 0 | 0 |
-    | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 |
-    | 10002 | __internal_schema | 46182 | 0 | 0 | 0 |
     | Total | NULL | 118946 | 0 | 0 | 0 |
     +-------+-----------------------------------+--------+------------+-------------+-------------------+
     ```
-2. Show the data size and replica count of each table in a specific database, along with the total data size and total replica count.
+- Show data size information for all tables in the current database:
     ```sql
     USE db1;
     SHOW DATA;
     ```
-    ```
+    ```text
     +-----------+-------------+--------------+
     | TableName | Size | ReplicaCount |
     +-----------+-------------+--------------+
@@ -93,13 +142,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     +-----------+-------------+--------------+
     ```
-3. Show the detailed data size, replica count, and row count of a specified table in a specified database
+- Show detailed data size information for a specified table:
     ```sql
     SHOW DATA FROM example_db.test;
     ```
-    ```
+    ```text
     +-----------+-----------+-----------+--------------+----------+
     | TableName | IndexName | Size | ReplicaCount | RowCount |
     +-----------+-----------+-----------+--------------+----------+
@@ -110,13 +159,13 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...];
     +-----------+-----------+-----------+--------------+----------+
     ```
-4. 
可以按照数据量、副本数量、统计行数等进行组合排序 +- 按照副本数量降序、数据量升序排序: ```sql - SHOW DATA ORDER BY ReplicaCount desc,Size asc; + SHOW DATA ORDER BY ReplicaCount DESC, Size ASC; ``` - ``` + ```text +-----------+-------------+--------------+ | TableName | Size | ReplicaCount | +-----------+-------------+--------------+ @@ -129,10 +178,3 @@ SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; | Left | 1024.000 GB | 1073741734 | +-----------+-------------+--------------+ ``` - -## 关键词 - - SHOW, DATA - -### 最佳实践 - diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md index 9c4ecfffe24be..8fe862ed2d3e3 100644 --- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md +++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md @@ -3,6 +3,7 @@ "title": "CANCEL REBALANCE DISK", "language": "en" } + --- - ## Description -statement used to attempt to preferentially repair the specified table or partition +The `REPAIR TABLE` statement is used to prioritize the repair of replicas for a specified table or partition. This statement has the following functionalities: + +- It can repair all replicas of an entire table. +- It can repair replicas of specified partitions. +- It performs replica repairs with high priority. +- It supports setting a repair timeout. -grammar: +## Syntax ```sql -ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)] +ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -illustrate: +## Required Parameters -1. This statement only means to let the system try to repair the shard copy of the specified table or partition with high priority, and does not guarantee that the repair can be successful. 
Users can view the repair status through the SHOW REPLICA STATUS command. -2. The default timeout is 14400 seconds (4 hours). A timeout means that the system will no longer repair shard copies of the specified table or partition with high priority. Need to re-use this command to set +**1. ``** -## Examples +> Specifies the name of the table that needs to be repaired. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names that need to be repaired. +> +> If this parameter is not specified, it will repair all partitions of the entire table. -1. Attempt to repair the specified table +## Access Control Requirements - ADMIN REPAIR TABLE tbl1; +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | + +## Usage Notes + +- This statement indicates that the system will attempt to repair the specified replicas with high priority, but it does not guarantee successful repairs. +- The default timeout is set to 14,400 seconds (4 hours). +- After the timeout, the system will no longer prioritize the repair of specified replicas. +- If a repair times out, the command needs to be executed again to continue the repair process. +- The progress of repairs can be monitored using the `SHOW REPLICA STATUS` command. +- This command does not affect the normal replica repair mechanism of the system; it merely elevates the priority of repairs for the specified table or partition. + +## Examples -2. 
Try to repair the specified partition +- Repair all replicas of an entire table: - ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ```sql + ADMIN REPAIR TABLE tbl1; + ``` -## Keywords +- Repair replicas of specified partitions: - ADMIN, REPAIR, TABLE + ```sql + ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ``` -## Best Practice +- Check the repair progress: + ```sql + SHOW REPLICA STATUS FROM tbl1; + ``` diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md index 4b8b4f0266889..2847303572e3e 100644 --- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md +++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md @@ -24,53 +24,86 @@ specific language governing permissions and limitations under the License. --> - ## Description -This statement is used to set the state of the specified table. Only supports OLAP tables. +The `SET TABLE STATUS` statement is used to manually set the status of an OLAP table. This statement has the following functionalities: + +- It only supports setting the status of OLAP tables. +- It can modify the table status to a specified target state. +- It is used to resolve task blocking caused by the table status. -This command is currently only used to manually set the OLAP table state to the specified state, allowing some jobs that are stuck by the table state to continue running. +**Supported States**: -grammar: +| State | Description | +|-------------------|--------------------------------------| +| NORMAL | Indicates that the table is in a normal state. | +| ROLLUP | Indicates that the table is undergoing a ROLLUP operation. | +| SCHEMA_CHANGE | Indicates that the table is undergoing a schema change. 
| +| BACKUP | Indicates that the table is undergoing a backup operation. | +| RESTORE | Indicates that the table is undergoing a restore operation. | +| WAITING_STABLE | Indicates that the table is waiting for a stable state. | + +## Syntax ```sql -ADMIN SET TABLE table_name STATUS - PROPERTIES ("key" = "value", ...); +ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]); ``` -The following properties are currently supported: +Where: -1. "state":Required. Specifying a target state then the state of the OLAP table will change to this state. +```sql + + : "state" + + + : "NORMAL" + | "ROLLUP" + | "SCHEMA_CHANGE" + | "BACKUP" + | "RESTORE" + | "WAITING_STABLE" +``` -> The current target states include: -> -> 1. NORMAL -> 2. ROLLUP -> 3. SCHEMA_CHANGE -> 4. BACKUP -> 5. RESTORE -> 6. WAITING_STABLE -> -> If the current state of the table is already the specified state, it will be ignored. +## Required Parameters -**Note: This command is generally only used for emergency fault repair, please proceed with caution.** +**1. ``** -## Examples +> Specifies the name of the table for which the status needs to be set. +> +> The table name must be unique within its database. -1. Set the state of table tbl1 to NORMAL. +**2. `PROPERTIES ("state" = "")`** -```sql -admin set table tbl1 status properties("state" = "NORMAL"); -``` +> Specifies the target status of the table. +> +> The "state" property must be set, and its value must be one of the supported states. -2. Set the state of table tbl2 to SCHEMA_CHANGE +## Access Control Requirements -```sql -admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE"); -``` +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. 
| + +## Usage Notes + +- This command is intended for emergency fault recovery; please use it with caution. +- It only supports OLAP tables and does not support other types of tables. +- If the table is already in the target state, this command will be ignored. +- Improper state settings may lead to system anomalies; it is recommended to use this command under technical support guidance. +- After modifying the status, it is advisable to monitor the system's operational status promptly. + +## Examples + +- Set the table status to NORMAL: -## Keywords + ```sql + ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL"); + ``` - ADMIN, SET, TABLE, STATUS +- Set the table status to SCHEMA_CHANGE: -## Best Practice \ No newline at end of file + ```sql + ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE"); + ``` diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md index afcff2218f2b4..ee2e631eef106 100644 --- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md +++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md @@ -24,28 +24,170 @@ specific language governing permissions and limitations under the License. --> - ## Description - This statement is used to view the data skew of a table or a partition. +The `SHOW DATA SKEW` statement is used to view the data skew of a table or partition. This statement has the following functionalities: - grammar: +- It can display the data distribution of the entire table. +- It can display the data distribution of specified partitions. +- It shows the row count, data volume, and percentage for each bucket. +- It supports both partitioned and non-partitioned tables. 
- SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (p1)]; +## Syntax - Description: +```sql +SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ]; +``` - 1. Only one partition must be specified. For non-partitioned tables, the partition name is the same as the table name. - 2. The result will show row count and data volume of each bucket under the specified partition, and the proportion of the data volume of each bucket in the total data volume. +## Required Parameters -## Examples +**1. `FROM [.]`** + +> Specifies the name of the table to be viewed. The database name can be included. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names to be viewed. +> +> If this parameter is not specified, it will display the data distribution for all partitions in the table. +> +> For non-partitioned tables, the partition name is the same as the table name. - 1. View the data skew of the table +## Return Values - SHOW DATA SKEW FROM db1.test PARTITION(p1); +| Column Name | Description | +|------------------|--------------------------------------| +| PartitionName | Partition name | +| BucketIdx | Bucket index number | +| AvgRowCount | Average row count | +| AvgDataSize | Average data size (in bytes) | +| Graph | Visualization chart of data distribution | +| Percent | Percentage of this bucket's data volume relative to total data volume | -## Keywords +## Access Control Requirements - SHOW, DATA, SKEW +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data distribution is displayed along two dimensions: partition and bucket. +- The Graph column uses the character `>` to visually represent the data distribution ratio. 
+- Percentages are accurate to two decimal places. +- For non-partitioned tables, the partition name in the query result is the same as the table name. + +## Examples -## Best Practice +- Create a partitioned table and view its data distribution: + + ```sql + CREATE TABLE test_show_data_skew + ( + id int, + name string, + pdate date + ) + PARTITION BY RANGE(pdate) + ( + FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + +- View data distribution for the entire table: + + + ```sql + SHOW DATA SKEW FROM test_show_data_skew; + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230417 | 0 | 0 | 0 | | 00.00 % | + | p_20230417 | 1 | 0 | 0 | | 00.00 % | + | p_20230417 | 2 | 0 | 0 | | 00.00 % | + | p_20230417 | 3 | 0 | 0 | | 00.00 % | + | p_20230417 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + | 
p_20230419 | 0 | 0 | 0 | | 00.00 % | + | p_20230419 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.96 % | + | p_20230419 | 2 | 0 | 0 | | 00.00 % | + | p_20230419 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.04 % | + | p_20230419 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + + View data distribution for specified partitions: + + ```sql + SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418); + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + +- View data distribution for a non-partitioned table: + + ```sql + CREATE TABLE test_show_data_skew2 + ( + id int, + name string, + 
pdate date + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + + ```sql + SHOW DATA SKEW FROM test_show_data_skew2; + ``` + + ```text + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % | + | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % | + | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % | + | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % | + | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + ``` \ No newline at end of file diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md index 611f6a4667c0f..9c9614ec3ac93 100644 --- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md +++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md @@ -24,36 +24,92 @@ specific language governing permissions and limitations under the License. --> - ## Description -This statement is used to display the amount of data, the number of replicas, and the number of statistical rows. +The `SHOW DATA` statement is used to display information about data volume, replica count, and row statistics. This statement has the following functionalities: + +- It can display the data volume and replica count for all tables in the current database. 
+- It can show the data volume, replica count, and row statistics for a specified table's materialized views. +- It can display the quota usage of the database. +- It supports sorting by data volume, replica count, etc. + +## Syntax + +```sql +SHOW DATA [ FROM [.] ] [ ORDER BY ]; +``` -grammar: +Where: ```sql -SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +order_by_clause: + [ ASC | DESC ] [ , [ ASC | DESC ] ... ] ``` -illustrate: +## Optional Parameters + +**1. `FROM [.]`** + +> Specifies the name of the table to view. The database name can be included. +> +> If this parameter is not specified, it will display data information for all tables in the current database. + +**2. `ORDER BY `** + +> Specifies the sorting method for the result set. +> +> Any column can be sorted in ascending (ASC) or descending (DESC) order. +> +> Supports multi-column combination sorting. -1. If the FROM clause is not specified, the data volume and number of replicas subdivided into each table under the current db will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the table and all materialized views. +## Return Values -2. If the FROM clause is specified, the data volume, number of copies and number of statistical rows subdivided into each materialized view under the table will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the corresponding materialized view. The number of statistical rows is the number of statistical rows for all partitions of the corresponding materialized view. +Depending on different query scenarios, the following result sets are returned: -3. When counting the number of rows, the one with the largest number of rows among the multiple copies shall prevail. +- When the `FROM` clause is not specified (displaying database-level information): -4. 
The `Total` row in the result set represents the total row. The `Quota` line represents the quota set by the current database. The `Left` line indicates the remaining quota. +| Column Name | Description | +|------------------|--------------------------------------| +| DbId | Database ID | +| DbName | Database name | +| Size | Total data volume of the database | +| RemoteSize | Remote storage data volume | +| RecycleSize | Recycle bin data volume | +| RecycleRemoteSize| Recycle bin remote storage volume | -5. If you want to see the size of each Partition, see `help show partitions`. +- When the `FROM` clause is specified (displaying table-level information): -6. You can use ORDER BY to sort on any combination of columns. +| Column Name | Description | +|------------------|--------------------------------------| +| TableName | Table name | +| IndexName | Index (materialized view) name | +| Size | Data size | +| ReplicaCount | Replica count | +| RowCount | Row statistics (shown only when viewing a specific table) | + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data volume statistics include the total data volume of all replicas. +- The replica count includes all partitions and replicas of all materialized views for the table. +- When counting rows, it considers the maximum row count among multiple replicas. +- The `Total` row in the result set indicates aggregated data. +- The `Quota` row in the result set indicates the current quota set for the database. +- The `Left` row in the result set indicates remaining quota. +- If you need to view the size of each partition, use the `SHOW PARTITIONS` command. ## Examples -1. 
Display the data size and RecycleBin size of each database by default. +- Display data volume information for all databases: - ``` + ```sql SHOW DATA; ``` @@ -63,74 +119,62 @@ illustrate: +-------+-----------------------------------+--------+------------+-------------+-------------------+ | 21009 | db1 | 0 | 0 | 0 | 0 | | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 | - | 0 | information_schema | 0 | 0 | 0 | 0 | - | 22010 | regression_test | 0 | 0 | 0 | 0 | - | 1 | mysql | 0 | 0 | 0 | 0 | - | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 | - | 10002 | __internal_schema | 46182 | 0 | 0 | 0 | | Total | NULL | 118946 | 0 | 0 | 0 | +-------+-----------------------------------+--------+------------+-------------+-------------------+ ``` -2. Display the data volume, replica number, aggregate data volume and aggregate replica number of each table in a database. +- Display data volume information for all tables in the current database: - ```sql - USE db1; - SHOW DATA; - ``` + ```sql + USE db1; + SHOW DATA; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | tbl1 | 900.000 B | 6 | - | tbl2 | 500.000 B | 3 | - | Total | 1.400 KB | 9 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1021.921 GB | 1073741815 | - +-----------+-------------+--------------+ - ``` + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | tbl1 | 900.000 B | 6 | + | tbl2 | 500.000 B | 3 | + | Total | 1.400 KB | 9 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1021.921 GB | 1073741815 | + +-----------+-------------+--------------+ + ``` -3. 
Display the subdivided data volume, the number of replicas and the number of statistical rows of the specified table under the specified db +- Display detailed data volume information for a specified table: - ```sql - SHOW DATA FROM example_db.test; - ``` + ```sql + SHOW DATA FROM example_db.test; + ``` - ``` - +-----------+-----------+-----------+--------------+----------+ - | TableName | IndexName | Size | ReplicaCount | RowCount | - +-----------+-----------+-----------+--------------+----------+ - | test | r1 | 10.000MB | 30 | 10000 | - | | r2 | 20.000MB | 30 | 20000 | - | | test2 | 50.000MB | 30 | 50000 | - | | Total | 80.000 | 90 | | - +-----------+-----------+-----------+--------------+----------+ - ``` + ```text + +-----------+-----------+-----------+--------------+----------+ + | TableName | IndexName | Size | ReplicaCount | RowCount | + +-----------+-----------+-----------+--------------+----------+ + | test | r1 | 10.000MB | 30 | 10000 | + | | r2 | 20.000MB | 30 | 20000 | + | | test2 | 50.000MB | 30 | 50000 | + | | Total | 80.000MB | 90 | | + +-----------+-----------+-----------+--------------+----------+ + ``` -4. It can be combined and sorted according to the amount of data, the number of copies, the number of statistical rows, etc. 
+- Sort by replica count in descending order and by data volume in ascending order: - ```sql - SHOW DATA ORDER BY ReplicaCount desc,Size asc; - ``` + ```sql + SHOW DATA ORDER BY ReplicaCount DESC, Size ASC; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | table_c | 3.102 KB | 40 | - | table_d | .000 | 20 | - | table_b | 324.000 B | 20 | - | table_a | 1.266 KB | 10 | - | Total | 4.684 KB | 90 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1024.000 GB | 1073741734 | + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | table_c | 3.102 KB | 40 | + | table_d | .000 | 20 | + | table_b | 324.000 B | 20 | + | table_a | 1.266 KB | 10 | + | Total | 4.684 KB | 90 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1024.000 GB | 1073741734 | +-----------+-------------+--------------+ ``` - -## Keywords - - SHOW, DATA - -## Best Practice - diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md index 1305f8799f0c3..8fe862ed2d3e3 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REBALANCE-DISK.md @@ -26,31 +26,51 @@ under the License. ## Description -This statement is used to cancel rebalancing disks of specified backends with high priority +The `CANCEL REBALANCE DISK` statement is used to cancel the high-priority disk data balancing for Backend (BE) nodes. This statement has the following functionalities: -Grammar: +- It can cancel the high-priority disk balancing for specified BE nodes. 
+- It can cancel the high-priority disk balancing for all BE nodes in the entire cluster. +- After cancellation, the system will still balance the disk data of BE nodes using the default scheduling method. -ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; +## Syntax -Explain: +```sql +ADMIN CANCEL REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` -1. This statement only indicates that the system no longer rebalance disks of specified backends with high priority. The system will still rebalance disks by default scheduling. +## Optional Parameters -## Example +**1. `":"`** -1. Cancel High Priority Disk Rebalance of all of backends of the cluster +> Specifies the list of BE nodes for which the high-priority disk balancing needs to be canceled. +> +> Each node consists of a hostname (or IP address) and a heartbeat port. +> +> If this parameter is not specified, it will cancel the high-priority disk balancing for all BE nodes. -ADMIN CANCEL REBALANCE DISK; +## Access Control Requirements -2. Cancel High Priority Disk Rebalance of specified backends +Users executing this SQL command must have at least the following permissions: -ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | + +## Usage Notes -## Keywords +- This statement only indicates that the system will no longer prioritize balancing the disk data of specified BEs; however, the system will still balance BE's disk data using the default scheduling method. +- After executing this command, any previously set high-priority balancing strategy will become immediately invalid. 
-ADMIN,CANCEL,REBALANCE DISK +## Examples -## Best Practice +- Cancel high-priority disk balancing for all BEs in the cluster: + ```sql + ADMIN CANCEL REBALANCE DISK; + ``` +- Cancel high-priority disk balancing for specified BEs: +```sql +ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md index 005f3e50c949a..e21b80cb8ea90 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/CANCEL-REPAIR-TABLE.md @@ -1,6 +1,6 @@ --- { - "title": "CANCEL REPAIR", + "title": "CANCEL REPAIR TABLE", "language": "en" } --- @@ -27,29 +27,59 @@ under the License. ## Description -This statement is used to cancel the repair of the specified table or partition with high priority +The `CANCEL REPAIR TABLE` statement is used to cancel high-priority repairs for a specified table or partition. This statement has the following functionalities: -grammar: +- It can cancel high-priority repairs for an entire table. +- It can cancel high-priority repairs for specified partitions. +- It does not affect the system's default replica repair mechanism. + +## Syntax ```sql -ADMIN CANCEL REPAIR TABLE table_name[ PARTITION (p1,...)]; +ADMIN CANCEL REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -illustrate: +## Required Parameters + +**1. ``** + +> Specifies the name of the table for which the repair is to be canceled. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names for which the repair is to be canceled. 
+> +> If this parameter is not specified, it will cancel high-priority repairs for the entire table. + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: -1. This statement simply means that the system will no longer repair shard copies of the specified table or partition with high priority. Replicas are still repaired with the default schedule. +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -## Example +## Usage Notes - 1. Cancel high priority repair +- This statement only cancels high-priority repairs and does not stop the system's default replica repair mechanism. +- After cancellation, the system will still repair replicas using the default scheduling method. +- If there is a need to re-establish high-priority repairs, the `ADMIN REPAIR TABLE` command can be used. +- The effects of this command take place immediately after execution. - ```sql - ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1); - ``` +## Examples -## Keywords +- Cancel high-priority repairs for an entire table: - ADMIN, CANCEL, REPAIR + ```sql + ADMIN CANCEL REPAIR TABLE tbl; + ``` -## Best Practice +- Cancel high-priority repairs for specified partitions: + ```sql + ADMIN CANCEL REPAIR TABLE tbl PARTITION(p1, p2); + ``` diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md index 21dea061c04b1..bdf90dcad4d0c 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REBALANCE-DISK.md @@ -24,44 +24,54 @@ under the License. 
--> +## Description +The `REBALANCE DISK` statement is used to optimize the data distribution on Backend (BE) nodes. This statement has the following functionalities: -### Name +- It can perform data balancing for specified BE nodes. +- It can balance data across all BE nodes in the entire cluster. +- It prioritizes balancing the data of specified nodes, regardless of the overall balance state of the cluster. -ADMIN REBALANCE DISK +## Syntax -## Description +```sql +ADMIN REBALANCE DISK [ ON ( ":" [, ... ] ) ]; +``` -This statement is used to try to rebalance disks of the specified backends first, no matter if the cluster is balanced +## Optional Parameters -Grammar: +**1. `":"`** -``` -ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; -``` +> Specifies the list of BE nodes that need to be balanced. +> +> Each node consists of a hostname (or IP address) and a heartbeat port. +> +> If this parameter is not specified, it will balance all BE nodes. -Explain: +## Access Control Requirements -1. This statement only means that the system attempts to rebalance disks of specified backends with high priority, no matter if the cluster is balanced. -2. The default timeout is 24 hours. Timeout means that the system will no longer rebalance disks of specified backends with high priority. The command settings need to be reused. +Users executing this SQL command must have at least the following permissions: -## Example +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -1. Attempt to rebalance disks of all backends +## Usage Notes -``` +- The default timeout for this command is 24 hours. After this period, the system will no longer prioritize balancing the disk data of specified BEs. To continue balancing, the command needs to be executed again. 
+- Once the disk data balancing for a specified BE node is completed, the high-priority setting for that node will automatically become invalid. +- This command can be executed even when the cluster is in an unbalanced state. + +## Examples + +- Balance data across all BE nodes in the cluster: + +```sql ADMIN REBALANCE DISK; ``` -2. Attempt to rebalance disks oof the specified backends +- Balance data for two specified BE nodes: -``` +```sql ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); ``` - -## Keywords - -ADMIN,REBALANCE,DISK - -## Best Practice - diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md index 3e37dbd7b0b5a..3d6e3c08519cc 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/REPAIR-TABLE.md @@ -27,32 +27,68 @@ under the License. ## Description -statement used to attempt to preferentially repair the specified table or partition +The `REPAIR TABLE` statement is used to prioritize the repair of replicas for a specified table or partition. This statement has the following functionalities: -grammar: +- It can repair all replicas of an entire table. +- It can repair replicas of specified partitions. +- It performs replica repairs with high priority. +- It supports setting a repair timeout. + +## Syntax ```sql -ADMIN REPAIR TABLE table_name[ PARTITION (p1,...)] +ADMIN REPAIR TABLE [ PARTITION ( [, ...]) ]; ``` -illustrate: +## Required Parameters + +**1. ``** + +> Specifies the name of the table that needs to be repaired. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. 
`PARTITION ( [, ...])`** + +> Specifies a list of partition names that need to be repaired. +> +> If this parameter is not specified, it will repair all partitions of the entire table. + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: -1. This statement only means to let the system try to repair the shard copy of the specified table or partition with high priority, and does not guarantee that the repair can be successful. Users can view the repair status through the SHOW REPLICA STATUS command. -2. The default timeout is 14400 seconds (4 hours). A timeout means that the system will no longer repair shard copies of the specified table or partition with high priority. Need to re-use this command to set +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. | -## Example +## Usage Notes -1. Attempt to repair the specified table +- This statement indicates that the system will attempt to repair the specified replicas with high priority, but it does not guarantee successful repairs. +- The default timeout is set to 14,400 seconds (4 hours). +- After the timeout, the system will no longer prioritize the repair of specified replicas. +- If a repair times out, the command needs to be executed again to continue the repair process. +- The progress of repairs can be monitored using the `SHOW REPLICA STATUS` command. +- This command does not affect the normal replica repair mechanism of the system; it merely elevates the priority of repairs for the specified table or partition. - ADMIN REPAIR TABLE tbl1; +## Examples -2. 
Try to repair the specified partition +- Repair all replicas of an entire table: - ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ```sql + ADMIN REPAIR TABLE tbl1; + ``` -## Keywords +- Repair replicas of specified partitions: - ADMIN, REPAIR, TABLE + ```sql + ADMIN REPAIR TABLE tbl1 PARTITION (p1, p2); + ``` -## Best Practice +- Check the repair progress: + ```sql + SHOW REPLICA STATUS FROM tbl1; + ``` diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md index 028232af4e8a6..2847303572e3e 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SET-TABLE-STATUS.md @@ -24,55 +24,86 @@ specific language governing permissions and limitations under the License. --> +## Description +The `SET TABLE STATUS` statement is used to manually set the status of an OLAP table. This statement has the following functionalities: +- It only supports setting the status of OLAP tables. +- It can modify the table status to a specified target state. +- It is used to resolve task blocking caused by the table status. -## Description +**Supported States**: + +| State | Description | +|-------------------|--------------------------------------| +| NORMAL | Indicates that the table is in a normal state. | +| ROLLUP | Indicates that the table is undergoing a ROLLUP operation. | +| SCHEMA_CHANGE | Indicates that the table is undergoing a schema change. | +| BACKUP | Indicates that the table is undergoing a backup operation. | +| RESTORE | Indicates that the table is undergoing a restore operation. | +| WAITING_STABLE | Indicates that the table is waiting for a stable state. 
| -This statement is used to set the state of the specified table. Only supports OLAP tables. +## Syntax -This command is currently only used to manually set the OLAP table state to the specified state, allowing some jobs that are stuck by the table state to continue running. +```sql +ADMIN SET TABLE STATUS PROPERTIES ("" = "" [, ...]); +``` -grammar: +Where: ```sql -ADMIN SET TABLE table_name STATUS - PROPERTIES ("key" = "value", ...); + + : "state" + + + : "NORMAL" + | "ROLLUP" + | "SCHEMA_CHANGE" + | "BACKUP" + | "RESTORE" + | "WAITING_STABLE" ``` -The following properties are currently supported: +## Required Parameters -1. "state":Required. Specifying a target state then the state of the OLAP table will change to this state. +**1. ``** -> The current target states include: -> -> 1. NORMAL -> 2. ROLLUP -> 3. SCHEMA_CHANGE -> 4. BACKUP -> 5. RESTORE -> 6. WAITING_STABLE -> -> If the current state of the table is already the specified state, it will be ignored. +> Specifies the name of the table for which the status needs to be set. +> +> The table name must be unique within its database. -**Note: This command is generally only used for emergency fault repair, please proceed with caution.** +**2. `PROPERTIES ("state" = "")`** -## Example +> Specifies the target status of the table. +> +> The "state" property must be set, and its value must be one of the supported states. -1. Set the state of table tbl1 to NORMAL. +## Access Control Requirements -```sql -admin set table tbl1 status properties("state" = "NORMAL"); -``` +Users executing this SQL command must have at least the following permissions: -2. Set the state of table tbl2 to SCHEMA_CHANGE +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| ADMIN | System | The user must have ADMIN privileges to execute this command. 
| -```sql -admin set table test_set_table_status status properties("state" = "SCHEMA_CHANGE"); -``` +## Usage Notes + +- This command is intended for emergency fault recovery; please use it with caution. +- It only supports OLAP tables and does not support other types of tables. +- If the table is already in the target state, this command will be ignored. +- Improper state settings may lead to system anomalies; it is recommended to use this command under technical support guidance. +- After modifying the status, it is advisable to monitor the system's operational status promptly. + +## Examples + +- Set the table status to NORMAL: -## Keywords + ```sql + ADMIN SET TABLE tbl1 STATUS PROPERTIES("state" = "NORMAL"); + ``` - ADMIN, SET, TABLE, STATUS +- Set the table status to SCHEMA_CHANGE: -## Best Practice \ No newline at end of file + ```sql + ADMIN SET TABLE tbl2 STATUS PROPERTIES("state" = "SCHEMA_CHANGE"); + ``` diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md index d2d5c794c9d19..ee2e631eef106 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA-SKEW.md @@ -24,28 +24,170 @@ specific language governing permissions and limitations under the License. --> - - ## Description -This statement is used to view the data skew of a table or a partition. - -grammar: - -`SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (p1)];` - - - -1. Only one partition must be specified. For non-partitioned tables, the partition name is the same as the table name. -2. 
The result will show row count and data volume of each bucket under the specified partition, and the proportion of the data volume of each bucket in the total data volume. - -## Example - -1. View the data skew of the table - - ` SHOW DATA SKEW FROM db1.test PARTITION(p1);` - -## Keywords - -SHOW, DATA, SKEW - +The `SHOW DATA SKEW` statement is used to view the data skew of a table or partition. This statement has the following functionalities: + +- It can display the data distribution of the entire table. +- It can display the data distribution of specified partitions. +- It shows the row count, data volume, and percentage for each bucket. +- It supports both partitioned and non-partitioned tables. + +## Syntax + +```sql +SHOW DATA SKEW FROM [.] [ PARTITION ( [, ...]) ]; +``` + +## Required Parameters + +**1. `FROM [.]`** + +> Specifies the name of the table to be viewed. The database name can be included. +> +> The table name must be unique within its database. + +## Optional Parameters + +**1. `PARTITION ( [, ...])`** + +> Specifies a list of partition names to be viewed. +> +> If this parameter is not specified, it will display the data distribution for all partitions in the table. +> +> For non-partitioned tables, the partition name is the same as the table name. 
+ +## Return Values + +| Column Name | Description | +|------------------|--------------------------------------| +| PartitionName | Partition name | +| BucketIdx | Bucket index number | +| AvgRowCount | Average row count | +| AvgDataSize | Average data size (in bytes) | +| Graph | Visualization chart of data distribution | +| Percent | Percentage of this bucket's data volume relative to total data volume | + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data distribution is displayed along two dimensions: partition and bucket. +- The Graph column uses the character `>` to visually represent the data distribution ratio. +- Percentages are accurate to two decimal places. +- For non-partitioned tables, the partition name in the query result is the same as the table name. 
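The `Percent` values in the sample output on this page are consistent with each bucket's `AvgDataSize` divided by the sum of `AvgDataSize` over the buckets of the same partition. A minimal Python sketch of that arithmetic, assuming this is how the figure is derived (the function name `bucket_percentages` is hypothetical and not part of Doris):

```python
def bucket_percentages(avg_data_sizes):
    """Return each bucket's share of the partition's data volume, in percent.

    Hypothetical helper for illustration only; results are rounded to two
    decimals, the precision described in the usage notes.
    """
    total = sum(avg_data_sizes)
    if total == 0:
        # An empty partition has no data to attribute to any bucket.
        return [0.0 for _ in avg_data_sizes]
    return [round(size * 100 / total, 2) for size in avg_data_sizes]


# AvgDataSize values of the five buckets in the non-partitioned sample table.
print(bucket_percentages([648, 667, 0, 649, 656]))
```

With the sample sizes this gives 24.73, 25.46, 0.0, 24.77 and 25.04, the same figures shown in the `Percent` column for `test_show_data_skew2`.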
+ +## Examples + +- Create a partitioned table and view its data distribution: + + ```sql + CREATE TABLE test_show_data_skew + ( + id int, + name string, + pdate date + ) + PARTITION BY RANGE(pdate) + ( + FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + +- View data distribution for the entire table: + + + ```sql + SHOW DATA SKEW FROM test_show_data_skew; + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230417 | 0 | 0 | 0 | | 00.00 % | + | p_20230417 | 1 | 0 | 0 | | 00.00 % | + | p_20230417 | 2 | 0 | 0 | | 00.00 % | + | p_20230417 | 3 | 0 | 0 | | 00.00 % | + | p_20230417 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + | p_20230419 | 0 | 0 | 0 | | 00.00 % | + | p_20230419 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.96 % | + | p_20230419 | 2 | 0 | 0 | | 00.00 % | 
+ | p_20230419 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.04 % | + | p_20230419 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + + View data distribution for specified partitions: + + ```sql + SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418); + ``` + + ```text + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | + | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | + | p_20230416 | 2 | 0 | 0 | | 00.00 % | + | p_20230416 | 3 | 0 | 0 | | 00.00 % | + | p_20230416 | 4 | 0 | 0 | | 00.00 % | + | p_20230418 | 0 | 0 | 0 | | 00.00 % | + | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | + | p_20230418 | 2 | 0 | 0 | | 00.00 % | + | p_20230418 | 3 | 0 | 0 | | 00.00 % | + | p_20230418 | 4 | 0 | 0 | | 00.00 % | + +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ + ``` + +- View data distribution for a non-partitioned table: + + ```sql + CREATE TABLE test_show_data_skew2 + ( + id int, + name string, + pdate date + ) + DISTRIBUTED BY HASH(id) BUCKETS 5 + PROPERTIES ( + "replication_num" = "1" + ); + ``` + + ```sql + SHOW DATA SKEW FROM test_show_data_skew2; + ``` + + 
```text + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % | + | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % | + | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % | + | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % | + | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | + +----------------------+-----------+-------------+-------------+---------------------------+---------+ + ``` \ No newline at end of file diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md index 5efddf21d5e69..9c9614ec3ac93 100644 --- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md +++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/data-and-status-management/SHOW-DATA.md @@ -24,37 +24,92 @@ specific language governing permissions and limitations under the License. --> +## Description +The `SHOW DATA` statement is used to display information about data volume, replica count, and row statistics. This statement has the following functionalities: -## Description +- It can display the data volume and replica count for all tables in the current database. +- It can show the data volume, replica count, and row statistics for a specified table's materialized views. +- It can display the quota usage of the database. +- It supports sorting by data volume, replica count, etc. 
+ +## Syntax -This statement is used to display the amount of data, the number of replicas, and the number of statistical rows. +```sql +SHOW DATA [ FROM [.] ] [ ORDER BY ]; +``` -grammar: +Where: ```sql -SHOW DATA [FROM [db_name.]table_name] [ORDER BY ...]; +order_by_clause: + [ ASC | DESC ] [ , [ ASC | DESC ] ... ] ``` -illustrate: +## Optional Parameters -1. If the FROM clause is not specified, the data volume and number of replicas subdivided into each table under the current db will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the table and all materialized views. +**1. `FROM [.]`** -2. If the FROM clause is specified, the data volume, number of copies and number of statistical rows subdivided into each materialized view under the table will be displayed. The data volume is the total data volume of all replicas. The number of replicas is the number of replicas for all partitions of the corresponding materialized view. The number of statistical rows is the number of statistical rows for all partitions of the corresponding materialized view. +> Specifies the name of the table to view. The database name can be included. +> +> If this parameter is not specified, it will display data information for all tables in the current database. -3. When counting the number of rows, the one with the largest number of rows among the multiple copies shall prevail. +**2. `ORDER BY `** -4. The `Total` row in the result set represents the total row. The `Quota` line represents the quota set by the current database. The `Left` line indicates the remaining quota. +> Specifies the sorting method for the result set. +> +> Any column can be sorted in ascending (ASC) or descending (DESC) order. +> +> Supports multi-column combination sorting. -5. If you want to see the size of each Partition, see `help show partitions`. +## Return Values -6. 
You can use ORDER BY to sort on any combination of columns. +Depending on different query scenarios, the following result sets are returned: -## Example +- When the `FROM` clause is not specified (displaying database-level information): -1. Display the data size and RecycleBin size of each database by default. +| Column Name | Description | +|------------------|--------------------------------------| +| DbId | Database ID | +| DbName | Database name | +| Size | Total data volume of the database | +| RemoteSize | Remote storage data volume | +| RecycleSize | Recycle bin data volume | +| RecycleRemoteSize| Recycle bin remote storage volume | - ``` +- When the `FROM` clause is specified (displaying table-level information): + +| Column Name | Description | +|------------------|--------------------------------------| +| TableName | Table name | +| IndexName | Index (materialized view) name | +| Size | Data size | +| ReplicaCount | Replica count | +| RowCount | Row statistics (shown only when viewing a specific table) | + +## Access Control Requirements + +Users executing this SQL command must have at least the following permissions: + +| Privilege | Object | Notes | +| :-------------- | :---------- | :-------------------------------------------- | +| SELECT | Table | SELECT permission is required for viewing the table. | + +## Usage Notes + +- The data volume statistics include the total data volume of all replicas. +- The replica count includes all partitions and replicas of all materialized views for the table. +- When counting rows, it considers the maximum row count among multiple replicas. +- The `Total` row in the result set indicates aggregated data. +- The `Quota` row in the result set indicates the current quota set for the database. +- The `Left` row in the result set indicates remaining quota. +- If you need to view the size of each partition, use the `SHOW PARTITIONS` command. 
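The notes on row counting and on the `Left` row can be sketched in Python. This is only an illustration of the arithmetic described above, not Doris code: `effective_row_count` and `remaining_quota` are hypothetical names, the replica row counts are made-up values, and the quota figures come from the sample output on this page.

```python
def effective_row_count(replica_row_counts):
    """Row count to report: the largest count among the replicas."""
    return max(replica_row_counts)


def remaining_quota(quota, used):
    """Value of the `Left` row: the quota minus the `Total` already used."""
    return quota - used


# Illustrative replicas that disagree; the largest count is the one reported.
print(effective_row_count([9998, 10000, 9999]))

# Replica quota of 1073741824 with 9 replicas in use, as in the sample output.
print(remaining_quota(1073741824, 9))
```

The second call yields 1073741815, which matches the `ReplicaCount` cell of the `Left` row in the `USE db1; SHOW DATA;` example.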
+ +## Examples + +- Display data volume information for all databases: + + ```sql SHOW DATA; ``` @@ -64,74 +119,62 @@ illustrate: +-------+-----------------------------------+--------+------------+-------------+-------------------+ | 21009 | db1 | 0 | 0 | 0 | 0 | | 22011 | regression_test_inverted_index_p0 | 72764 | 0 | 0 | 0 | - | 0 | information_schema | 0 | 0 | 0 | 0 | - | 22010 | regression_test | 0 | 0 | 0 | 0 | - | 1 | mysql | 0 | 0 | 0 | 0 | - | 22017 | regression_test_show_p0 | 0 | 0 | 0 | 0 | - | 10002 | __internal_schema | 46182 | 0 | 0 | 0 | | Total | NULL | 118946 | 0 | 0 | 0 | +-------+-----------------------------------+--------+------------+-------------+-------------------+ ``` -2. Display the data volume, replica number, aggregate data volume and aggregate replica number of each table in a database. +- Display data volume information for all tables in the current database: - ```sql - USE db1; - SHOW DATA; - ``` + ```sql + USE db1; + SHOW DATA; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | tbl1 | 900.000 B | 6 | - | tbl2 | 500.000 B | 3 | - | Total | 1.400 KB | 9 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1021.921 GB | 1073741815 | - +-----------+-------------+--------------+ - ``` + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | tbl1 | 900.000 B | 6 | + | tbl2 | 500.000 B | 3 | + | Total | 1.400 KB | 9 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1021.921 GB | 1073741815 | + +-----------+-------------+--------------+ + ``` -3. 
Display the subdivided data volume, the number of replicas and the number of statistical rows of the specified table under the specified db +- Display detailed data volume information for a specified table: - ```sql - SHOW DATA FROM example_db.test; - ``` + ```sql + SHOW DATA FROM example_db.test; + ``` - ``` - +-----------+-----------+-----------+--------------+----------+ - | TableName | IndexName | Size | ReplicaCount | RowCount | - +-----------+-----------+-----------+--------------+----------+ - | test | r1 | 10.000MB | 30 | 10000 | - | | r2 | 20.000MB | 30 | 20000 | - | | test2 | 50.000MB | 30 | 50000 | - | | Total | 80.000 | 90 | | - +-----------+-----------+-----------+--------------+----------+ - ``` + ```text + +-----------+-----------+-----------+--------------+----------+ + | TableName | IndexName | Size | ReplicaCount | RowCount | + +-----------+-----------+-----------+--------------+----------+ + | test | r1 | 10.000MB | 30 | 10000 | + | | r2 | 20.000MB | 30 | 20000 | + | | test2 | 50.000MB | 30 | 50000 | + | | Total | 80.000MB | 90 | | + +-----------+-----------+-----------+--------------+----------+ + ``` -4. It can be combined and sorted according to the amount of data, the number of copies, the number of statistical rows, etc. 
+- Sort by replica count in descending order and by data volume in ascending order: - ```sql - SHOW DATA ORDER BY ReplicaCount desc,Size asc; - ``` + ```sql + SHOW DATA ORDER BY ReplicaCount DESC, Size ASC; + ``` - ``` - +-----------+-------------+--------------+ - | TableName | Size | ReplicaCount | - +-----------+-------------+--------------+ - | table_c | 3.102 KB | 40 | - | table_d | .000 | 20 | - | table_b | 324.000 B | 20 | - | table_a | 1.266 KB | 10 | - | Total | 4.684 KB | 90 | - | Quota | 1024.000 GB | 1073741824 | - | Left | 1024.000 GB | 1073741734 | + ```text + +-----------+-------------+--------------+ + | TableName | Size | ReplicaCount | + +-----------+-------------+--------------+ + | table_c | 3.102 KB | 40 | + | table_d | .000 | 20 | + | table_b | 324.000 B | 20 | + | table_a | 1.266 KB | 10 | + | Total | 4.684 KB | 90 | + | Quota | 1024.000 GB | 1073741824 | + | Left | 1024.000 GB | 1073741734 | +-----------+-------------+--------------+ ``` - -## Keywords - - SHOW, DATA - -## Best Practice -
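The ordering in the last example (`ORDER BY ReplicaCount DESC, Size ASC`) can be mimicked outside the database. A small Python sketch, assuming sizes are compared numerically in bytes; the byte values are approximate conversions of the displayed sizes, and `sort_show_data` is a hypothetical helper rather than anything provided by Doris:

```python
def sort_show_data(rows):
    """Order rows by replica count descending, then by size ascending."""
    return sorted(rows, key=lambda row: (-row["replica_count"], row["size_bytes"]))


rows = [
    {"table": "table_a", "size_bytes": 1296, "replica_count": 10},  # about 1.266 KB
    {"table": "table_b", "size_bytes": 324, "replica_count": 20},
    {"table": "table_c", "size_bytes": 3176, "replica_count": 40},  # about 3.102 KB
    {"table": "table_d", "size_bytes": 0, "replica_count": 20},
]

for row in sort_show_data(rows):
    print(row["table"])
```

The tie between `table_d` and `table_b` (both 20 replicas) is broken by size, so `table_d` comes first, as in the sample result set.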