Improve flat_object field parsing performance by reducing two passes to a single pass #16297

Merged
merged 11 commits into from
Jan 22, 2025

bugmakerrrrrr
Contributor

Description

As discussed in #16061, this PR optimizes the flat_object field type in a backward-compatible (BWC) way. I benchmarked the change with the noaa workload, with the station field mapped as flat_object. The results are as follows.

|                                                        Metric |                     Task |    Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------------------:|------------:|------------:|---------:|-------:|
|                                                Min Throughput |                    index |      134517 |      133508 | -1008.98 | docs/s |
|                                               Mean Throughput |                    index |      142195 |      144423 |  2227.85 | docs/s |
|                                             Median Throughput |                    index |      143126 |      146146 |  3020.51 | docs/s |
|                                                Max Throughput |                    index |      150619 |      151649 |   1029.8 | docs/s |
|                                       50th percentile latency |                    index |     206.277 |     210.876 |  4.59914 |     ms |
|                                       90th percentile latency |                    index |     544.938 |     505.005 | -39.9333 |     ms |
|                                       99th percentile latency |                    index |     1225.61 |     1162.57 | -63.0459 |     ms |
|                                     99.9th percentile latency |                    index |     1867.85 |     1568.05 | -299.801 |     ms |
|                                      100th percentile latency |                    index |     1892.49 |     1759.06 |  -133.43 |     ms |
|                                  50th percentile service time |                    index |     206.277 |     210.876 |  4.59914 |     ms |
|                                  90th percentile service time |                    index |     544.938 |     505.005 | -39.9333 |     ms |
|                                  99th percentile service time |                    index |     1225.61 |     1162.57 | -63.0459 |     ms |
|                                99.9th percentile service time |                    index |     1867.85 |     1568.05 | -299.801 |     ms |
|                                 100th percentile service time |                    index |     1892.49 |     1759.06 |  -133.43 |     ms |
|                                                    error rate |                    index |           0 |           0 |        0 |      % |
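
The PR title describes the change as reducing two passes over a flat_object value to a single pass; the description does not spell out what the two passes were. As a rough illustration of the single-pass idea only, the sketch below walks a nested value once and emits both the leaf values and the path-qualified values at each leaf. Every name in it (`SinglePassFlattenSketch`, `Flattened`, `flatten`, `walk`) is hypothetical, the `path=value` encoding is an assumption, and none of this is the code in FlatObjectFieldMapper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; every name here is hypothetical and none of it
// is taken from FlatObjectFieldMapper.
final class SinglePassFlattenSketch {

    /** Results gathered during the one traversal. */
    static final class Flattened {
        final List<String> values = new ArrayList<>();        // leaf values only
        final List<String> valueAndPaths = new ArrayList<>(); // "path=value" (assumed encoding)
    }

    static Flattened flatten(String rootField, Map<String, Object> source) {
        Flattened out = new Flattened();
        walk(rootField, source, out);
        return out;
    }

    // Visits each node exactly once and emits both representations at the leaf.
    private static void walk(String path, Object node, Flattened out) {
        if (node instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                walk(path + "." + entry.getKey(), entry.getValue(), out);
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                walk(path, item, out); // array elements share the parent path
            }
        } else if (node != null) {
            String value = String.valueOf(node);
            out.values.add(value);
            out.valueAndPaths.add(path + "=" + value);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("country", "US");
        Map<String, Object> station = new LinkedHashMap<>();
        station.put("name", "ABC");
        station.put("location", location);

        Flattened out = flatten("station", station);
        System.out.println(out.values);
        System.out.println(out.valueAndPaths);
    }
}
```

The point of the sketch is that nothing is materialized between reading the input and producing the indexable strings, which is one plausible reading of "single pass".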

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for 5d37267: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for 5dccc88: SUCCESS


codecov bot commented Oct 12, 2024

Codecov Report

Attention: Patch coverage is 78.01418% with 31 lines in your changes missing coverage. Please review.

Project coverage is 72.37%. Comparing base (6b1861a) to head (fa569aa).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|--------:|-------|
| ...opensearch/index/mapper/FlatObjectFieldMapper.java | 78.01% | 13 Missing and 18 partials ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16297      +/-   ##
============================================
+ Coverage     72.23%   72.37%   +0.13%     
- Complexity    65335    65414      +79     
============================================
  Files          5301     5301              
  Lines        303824   303769      -55     
  Branches      44033    44033              
============================================
+ Hits         219471   219856     +385     
+ Misses        66363    65874     -489     
- Partials      17990    18039      +49     


@bugmakerrrrrr
Contributor Author

@msfroh @kkewwei please take a look when you get a chance

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot added the stalled label (Issues that have stalled) on Nov 14, 2024
Signed-off-by: panguixin <[email protected]>
@bugmakerrrrrr
Contributor Author

❌ Gradle check result for 92f1f05: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

The current implementation of the range query is incorrect: when either the upper or the lower term is null, the query concatenates the prefix with the literal string "null" and searches for that.
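
The failure mode described in this comment can be reproduced with plain Java string concatenation, which converts a null reference to the text "null". The sketch below is a minimal illustration under stated assumptions: the class and method names (`RangeBoundSketch`, `naiveBound`, `guardedBound`) are hypothetical, the `path=value` term encoding is assumed, and the guarded variant is just one possible way to avoid the problem, not necessarily what this PR does.

```java
// Illustrative sketch only; names and the "path=value" encoding are assumptions,
// not the code in FlatObjectFieldMapper.
final class RangeBoundSketch {

    // The bug pattern: concatenation turns a missing (open) bound into the text "null",
    // so the query would search for a term such as "station.name=null".
    static String naiveBound(String path, Object bound) {
        return path + "=" + bound;
    }

    // One possible guard: leave an open bound as null instead of encoding it.
    // A real fix would still have to confine the open side of the range to
    // terms under the same path, which this sketch does not attempt.
    static String guardedBound(String path, Object bound) {
        return bound == null ? null : path + "=" + bound;
    }

    public static void main(String[] args) {
        System.out.println(naiveBound("station.name", null));
        System.out.println(guardedBound("station.name", null));
        System.out.println(guardedBound("station.name", "ABC"));
    }
}
```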

Contributor

github-actions bot commented Jan 8, 2025

❌ Gradle check result for dca3469: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bugmakerrrrrr
Contributor Author

@msfroh @kkewwei friendly ping :)

Collaborator

@msfroh msfroh left a comment

Overall, I really like the cleanup with this change!

I was never a big fan of the JsonToStringXContentParser, so I'm happy to see it go away.

Signed-off-by: panguixin <[email protected]>
@bugmakerrrrrr changed the title from "Optimize flat_object type in a BWC way with one phase processing" to "Improve flat_object field parsing performance by reducing two passes to a single pass" on Jan 13, 2025
Contributor

✅ Gradle check result for abb44a1: SUCCESS

Signed-off-by: panguixin <[email protected]>
Contributor

❌ Gradle check result for 6afc5b2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@msfroh
Collaborator

msfroh commented Jan 14, 2025

I'm happy with these changes. Thanks a lot, @bugmakerrrrrr!

Would you mind resolving the merge conflicts and we can get this in?

@kkewwei -- do you have any more comments/concerns?

@kkewwei
Contributor

kkewwei commented Jan 15, 2025

> I'm happy with these changes. Thanks a lot, @bugmakerrrrrr!
>
> Would you mind resolving the merge conflicts and we can get this in?
>
> @kkewwei -- do you have any more comments/concerns?

It also makes sense to me.

@bugmakerrrrrr
Contributor Author

> Would you mind resolving the merge conflicts and we can get this in?

@msfroh done

Contributor

✅ Gradle check result for 7d1aff1: SUCCESS

Contributor

✅ Gradle check result for 013c81b: SUCCESS

@bugmakerrrrrr
Contributor Author

Hi @msfroh, can we merge this? Once it is merged, I'll continue the work mentioned in #16061.

@msfroh requested a review from cwperks as a code owner on January 22, 2025 01:33
@msfroh added the backport 2.x (Backport to 2.x branch) and v2.19.0 (Issues and PRs related to version 2.19.0) labels on Jan 22, 2025
Contributor

✅ Gradle check result for fa569aa: SUCCESS

@msfroh merged commit 2794655 into opensearch-project:main on Jan 22, 2025
45 of 46 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16297-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 27946550b686df41fd0ff53cb340ce52c73a7b45
# Push it to GitHub
git push --set-upstream origin backport/backport-16297-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16297-to-2.x.

Labels
backport 2.x Backport to 2.x branch backport-failed v2.19.0 Issues and PRs related to version 2.19.0
3 participants