Tvlist feat new #14616

Open
wants to merge 56 commits into
base: force_ci/split_chunk


@shizy818 shizy818 commented Jan 2, 2025

Description

Content1 ...

Content2 ...

Content3 ...


This PR has:

  • been self-reviewed.
    • concurrent read
    • concurrent write
    • concurrent read and write
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods.
  • added or updated version, license, or notice information
  • added comments explaining the "why" and the intent of the code wherever would not be obvious
    for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold
    for code coverage.
  • added integration tests.
  • been tested in a test IoTDB cluster.

Key changed/added classes (or packages if there are too many classes) in this PR

* out of mempage bounds check
* overlapped data error during query
* change some lists to arrays
* remember row count in tvlist iterator
@shizy818
Author

Write Performance Test

# iotdb settings
series_slot_num=100
data_region_per_data_node=5

# benchmark settings
DEVICE_NUMBER=100
SENSOR_NUMBER=20
LOOP=10000
DATA_CLIENT_NUMBER=10
OPERATION_PROPORTION=1:0:0:0:0:0:0:0:0:0:0:0

Unaligned Series:

| TVList Sort Threshold | 0 | 100 | 200 | 500 | 1000 | 2000 | 5000 | 10000 | master |
|---|---|---|---|---|---|---|---|---|---|
| points/s | 4569824.325 | 4374989.505 | 4342791.53 | 4481212.23 | 4335337.15 | 4217159.025 | 4333778.515 | 4191477.875 | 4355698.645 |
| Comparison | 1.049159893 | 1.004428878 | 0.997036729 | 1.028815948 | 0.995325321 | 0.968193479 | 0.994967482 | 0.96229749 | |

Aligned Series:

| TVList Sort Threshold | 0 | 100 | 200 | 500 | 1000 | 2000 | 5000 | 10000 | master |
|---|---|---|---|---|---|---|---|---|---|
| points/s | 6845973 | 6616038.8 | 7266610.87 | 7332207.695 | 6747288.985 | 6925542.685 | 7476763.485 | 7219287.525 | 7161518.61 |
| Comparison | 0.955938729 | 0.923831824 | 1.014674578 | 1.023834203 | 0.942158968 | 0.967049457 | 1.044019277 | 1.008066573 | |
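
For readers checking the numbers: the Comparison row in the write-test tables above appears to be each threshold's points/s divided by the master figure (for example, 4569824.325 / 4355698.645 ≈ 1.049 for the unaligned series at threshold 0). The sketch below recomputes that row under this assumption. The class, method, and constant names are hypothetical and are not part of this PR.

```java
/**
 * Hypothetical helper (not part of this PR): recomputes the "Comparison" row,
 * assuming it is throughput at each threshold divided by the master throughput.
 */
final class ThroughputRatio {

  // Unaligned-series points/s for thresholds 0, 100, 200, 500, 1000, 2000, 5000, 10000.
  static final double[] UNALIGNED_POINTS_PER_SECOND = {
    4569824.325, 4374989.505, 4342791.53, 4481212.23,
    4335337.15, 4217159.025, 4333778.515, 4191477.875
  };

  // Unaligned-series points/s measured on master.
  static final double UNALIGNED_MASTER = 4355698.645;

  /** Divides every throughput figure by the master figure. */
  static double[] ratiosToMaster(double[] pointsPerSecond, double master) {
    double[] ratios = new double[pointsPerSecond.length];
    for (int i = 0; i < pointsPerSecond.length; i++) {
      ratios[i] = pointsPerSecond[i] / master;
    }
    return ratios;
  }
}
```

A value above 1 means the branch wrote faster than master at that threshold, and a value below 1 means it wrote slower.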

@shizy818
Author

shizy818 commented Jan 17, 2025

Write/Query mixed mode

# iotdb settings
series_slot_num=100
data_region_per_data_node=5

# benchmark settings
DEVICE_NUMBER=100
SENSOR_NUMBER=20
LOOP=10000
DATA_CLIENT_NUMBER=10
DATA_CLIENT_NUMBER=8
IS_RECENT_QUERY=true
OPERATION_PROPORTION=1:0:1:1:1:1:1:1:0:1:1:1

Unaligned Series:

| TVList Sort Threshold | 0 | 100 | 200 | 500 | 1000 | 2000 | 5000 | 10000 | master |
|---|---|---|---|---|---|---|---|---|---|
| INGESTION | 1960317.5 | 1943008.65 | 2019711.86 | 2072910.49 | 2051436.53 | 2059098.48 | 1922076.65 | 1835176.4 | 1394176.65 |
| PRECISE_POINT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| TIME_RANGE | 1480.28 | 1373.54 | 1399.18 | 1417.37 | 1461.29 | 1449.54 | 1394.38 | 1374.51 | 919.82 |
| VALUE_RANGE | 1483.19 | 1412.85 | 1414.09 | 1438.75 | 1470.86 | 1437.06 | 1421.54 | 1390.75 | 894.92 |
| AGG_RANGE | 78.6 | 77.91 | 80.98 | 83.11 | 82.25 | 82.56 | 77.07 | 73.58 | 55.9 |
| AGG_VALUE | 77.45 | 76.77 | 79.8 | 81.9 | 81.05 | 81.35 | 75.94 | 72.51 | 55.08 |
| AGG_RANGE_VALUE | 76.88 | 76.21 | 79.21 | 81.3 | 80.46 | 80.76 | 75.38 | 71.98 | 54.68 |
| GROUP_BY | 1032.57 | 1023.45 | 1063.85 | 1091.87 | 1080.56 | 1084.6 | 1012.42 | 966.65 | 734.36 |
| LATEST_POINT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| RANGE_QUERY_DESC | 1471.21 | 1379.31 | 1387.84 | 1441.23 | 1454.83 | 1446.81 | 1396.66 | 1370.4 | 906.93 |
| VALUE_RANGE_QUERY_DESC | 1476.39 | 1392.44 | 1387.69 | 1425.76 | 1467.74 | 1438.82 | 1392.74 | 1378.52 | 906.6 |
| GROUP_BY_DESC | 1010.02 | 1001.1 | 1040.62 | 1068.03 | 1056.96 | 1060.91 | 990.31 | 945.54 | 718.32 |
| Comparison | 1.406086424 | 1.39366856 | 1.448685822 | 1.486844303 | 1.471433344 | 1.476932286 | 1.378647399 | 1.316321417 | |

@shizy818
Author

Aligned Series:

| TVList Sort Threshold | 0 | 100 | 200 | 500 | 1000 | 2000 | 5000 | 10000 | master |
|---|---|---|---|---|---|---|---|---|---|
| INGESTION | 1585694.11 | 1976424.22 | 2003598.88 | 2067901.94 | 2034466.94 | 2004544.07 | 1938070.54 | 1841351.19 | 1734845.89 |
| PRECISE_POINT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| TIME_RANGE | 1228.43 | 1480.37 | 1493.53 | 1523.19 | 1508.56 | 1505.3 | 1483.8 | 1425.03 | 1163.35 |
| VALUE_RANGE | 1233.58 | 1505.41 | 1516.68 | 1555.96 | 1536.99 | 1534.92 | 1503.98 | 1440.16 | 1168.66 |
| AGG_RANGE | 63.58 | 79.25 | 80.34 | 82.91 | 81.57 | 80.37 | 77.71 | 73.83 | 69.56 |
| AGG_VALUE | 62.65 | 78.09 | 79.16 | 81.7 | 80.38 | 79.2 | 76.57 | 72.75 | 68.54 |
| AGG_RANGE_VALUE | 62.19 | 77.52 | 78.58 | 81.1 | 79.79 | 78.62 | 76.01 | 72.22 | 68.04 |
| GROUP_BY | 835.24 | 1041.05 | 1055.36 | 1089.23 | 1071.62 | 1055.86 | 1020.85 | 969.9 | 913.8 |
| LATEST_POINT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| RANGE_QUERY_DESC | 1220.13 | 1484.94 | 1492.06 | 1521.5 | 1503.96 | 1517.33 | 1484.09 | 1416.81 | 1151.73 |
| VALUE_RANGE_QUERY_DESC | 1221.62 | 1481.65 | 1496.19 | 1542.3 | 1515.41 | 1516.17 | 1484.18 | 1420.98 | 1166.05 |
| GROUP_BY_DESC | 817 | 1018.31 | 1032.32 | 1065.45 | 1048.22 | 1032.8 | 998.55 | 948.72 | 893.85 |
| Comparison | 0.914023606 | 1.139240365 | 1.154914135 | 1.19197852 | 1.172702355 | 1.155451138 | 1.117133747 | 1.061386139 | |

@shizy818
Author

8f78b21

When I use a min-heap for the merge sort, it performs much better when there are many sorted TVLists (tvlist_sort_threshold = 100). However, it performs worse when there is only one TVList (tvlist_sort_threshold = 0).

I'm not sure whether I should revert this change.
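
For context, here is a minimal sketch of a min-heap k-way merge of the kind described above. It is not the code in 8f78b21: the class and method names are hypothetical, and plain `long[]` timestamp arrays stand in for sorted TVLists. It also shows one possible way to avoid the heap overhead in the single-list case, by returning that list directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch (not the PR's implementation): min-heap merge of sorted lists. */
final class SortedListMerger {

  /** Merges arrays that are each sorted ascending into one ascending list. */
  static List<Long> merge(List<long[]> sortedLists) {
    List<Long> merged = new ArrayList<>();

    // Fast path: a single list is already in order, so no heap is needed.
    if (sortedLists.size() == 1) {
      for (long t : sortedLists.get(0)) {
        merged.add(t);
      }
      return merged;
    }

    // Each heap entry is {listIndex, offset}; entries are ordered by the
    // timestamp they currently point at.
    PriorityQueue<int[]> heap =
        new PriorityQueue<>(
            (a, b) ->
                Long.compare(sortedLists.get(a[0])[a[1]], sortedLists.get(b[0])[b[1]]));

    // Seed the heap with the first element of every non-empty list.
    for (int i = 0; i < sortedLists.size(); i++) {
      if (sortedLists.get(i).length > 0) {
        heap.add(new int[] {i, 0});
      }
    }

    // Repeatedly take the smallest head and advance the list it came from.
    while (!heap.isEmpty()) {
      int[] smallest = heap.poll();
      long[] source = sortedLists.get(smallest[0]);
      merged.add(source[smallest[1]]);
      int next = smallest[1] + 1;
      if (next < source.length) {
        heap.add(new int[] {smallest[0], next});
      }
    }
    return merged;
  }
}
```

With k lists and n points in total, the heap path costs O(n log k) comparisons, which pays off when k is large. With k = 1 every point would still go through a heap insert and removal for no benefit, which is why the sketch special-cases it.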

# Datatype: int
tvlist_sort_threshold=0

# When the average point number of timeseries in memtable exceeds this, the memtable is flushed to disk. The default threshold is 100000.
Contributor

Remove this line?

Author

done

3 participants