Add support for deriving source field from docValues in FieldMapper #17040

Open: wants to merge 3 commits into base: main

Contributor

@rayshrey rayshrey commented Jan 16, 2025

Description

Add support for deriving source field from docValues in FieldMapper
The following changes have been made:

  • Added interface method in FieldMapper for basic support
  • Implemented above method in DateFieldMapper
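
As a rough illustration of what deriving a date field's source from doc values involves: date values sit in numeric doc values as epoch timestamps, so the derivation amounts to reading those longs back and formatting them. The sketch below assumes millisecond resolution and ISO-8601 output. The class and method names are hypothetical, and a plain `long[]` stands in for Lucene's `SortedNumericDocValues`; the actual mapper would apply the field's configured date format.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for illustration only; not the code in this PR.
final class DateDocValuesSketch {

    // Formats each epoch-millisecond value as an ISO-8601 instant in UTC.
    // Assumption: millisecond resolution and ISO-8601 output.
    static String[] format(long[] epochMillis) {
        String[] formatted = new String[epochMillis.length];
        for (int i = 0; i < epochMillis.length; i++) {
            formatted[i] = DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(epochMillis[i]));
        }
        return formatted;
    }

    public static void main(String[] args) {
        // Two values for one document, as a multi-valued date field might hold.
        for (String value : format(new long[] { 0L, 1736985600000L })) {
            System.out.println(value);
        }
    }
}
```

Returning an array even for a single value is the simplest shape, but it leaves the caller to decide how to serialize it, which is the concern raised in the review below.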

Related Issues

#17073

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for a5057b4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


// generic implementation, override in subclasses for specific implementation
protected Object deriveSource(LeafReader leafReader, int docId) throws IOException {
    return null;
Member

Throw an exception here instead
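
One way to read this suggestion is sketched below: the base hook fails loudly, so a mapper that has not implemented derivation cannot silently contribute `null` to the source. This is a simplified model, not the PR's code. The class names are hypothetical, the `LeafReader` parameter is dropped so the example runs without Lucene, and the choice of `UnsupportedOperationException` is an assumption.

```java
// Hypothetical, simplified model of the base mapper hook; not OpenSearch code.
abstract class SourceDerivingMapperSketch {
    private final String name;

    SourceDerivingMapperSketch(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    // Default behaviour: reject the call instead of returning null.
    // Assumption: UnsupportedOperationException is an acceptable exception type.
    Object deriveSource(int docId) {
        throw new UnsupportedOperationException("Deriving source is not supported for field [" + name + "]");
    }
}

// A mapper that does not override the hook inherits the failing default.
final class PlainMapperSketch extends SourceDerivingMapperSketch {
    PlainMapperSketch(String name) {
        super(name);
    }
}

// A mapper that supports derivation overrides the hook.
final class EchoMapperSketch extends SourceDerivingMapperSketch {
    EchoMapperSketch(String name) {
        super(name);
    }

    @Override
    Object deriveSource(int docId) {
        return "doc-" + docId;
    }
}
```

With this shape, a caller that asks an unsupported mapper for derived source gets an immediate, field-specific error rather than a missing value in the response.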

@@ -800,4 +817,18 @@ public boolean getIgnoreMalformed() {
    public Long getNullValue() {
        return nullValue;
    }

    @Override
    protected String[] deriveSource(LeafReader leafReader, int docId) throws IOException {
Member

How would we ensure that the entity creating the source knows whether to serialize an array or a single string?

Member

I was wondering whether the mapper method itself could be responsible for creating the source.

Contributor Author

Modified the function signature to include an XContentBuilder. Each mapper is now responsible for adding its data to this builder, which lets the mapper itself handle both the single-value and the array case.
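
A minimal sketch of the idea, assuming the mapper writes directly into whatever builder it is handed. Here a `StringBuilder` stands in for `XContentBuilder` so the example runs on its own. The class name, the parameter list, and the rule "one value becomes a scalar, several become an array" are illustrative assumptions, not the PR's actual signature or behaviour, and no JSON escaping is done.

```java
import java.util.List;

// Hypothetical illustration; StringBuilder stands in for XContentBuilder.
final class BuilderDeriveSourceSketch {

    // Writes the field into the builder, choosing scalar or array form itself,
    // so the caller never needs to know how many values the document holds.
    static void deriveSource(String fieldName, List<String> values, StringBuilder builder) {
        if (values.isEmpty()) {
            return; // no value for this document: write nothing
        }
        builder.append('"').append(fieldName).append("\":");
        if (values.size() == 1) {
            builder.append('"').append(values.get(0)).append('"');
            return;
        }
        builder.append('[');
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                builder.append(',');
            }
            builder.append('"').append(values.get(i)).append('"');
        }
        builder.append(']');
    }

    public static void main(String[] args) {
        StringBuilder single = new StringBuilder();
        deriveSource("created", List.of("2025-01-16T00:00:00Z"), single);
        System.out.println(single);

        StringBuilder multi = new StringBuilder();
        deriveSource("created", List.of("2025-01-16T00:00:00Z", "2025-01-21T00:00:00Z"), multi);
        System.out.println(multi);
    }
}
```

The design point is that the decision moves out of the return type: the method returns nothing, so there is no `String` versus `String[]` ambiguity for the caller to resolve.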


@Override
protected String[] deriveSource(LeafReader leafReader, int docId) throws IOException {
    SortedNumericDocValues sortedNumericDocValues = leafReader.getSortedNumericDocValues(name());
Member

What if doc values are not available for the field? Can we also have generic validations/assertions around the expected data structures, so that whenever this method is called we know there is a way to create the source?

Member

If a field is stored, should we also allow using the stored value even when doc values are not present?

Contributor Author

Added generic validations in the latest revision.

> If a field is stored, should we also allow using the stored value even when doc values are not present?

Yes, we can do that, but I think the current PR should be limited to docValues only. I will add support for stored fields in subsequent PRs.
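
The kind of validation discussed in this thread could look roughly like the sketch below: one check against the field's mapping before derivation is attempted, and one guard at read time in case the segment has no doc values for the field (Lucene's `LeafReader.getSortedNumericDocValues` returns `null` in that case). All names are hypothetical, a `Map` stands in for a segment, and the exception types are assumptions. The validations added in the PR revision are not shown on this page and may differ.

```java
import java.util.Map;

// Hypothetical illustration of doc-values-only validation; not the PR's code.
final class DocValuesGuardSketch {

    // Mapping-level check: derivation is only allowed when doc values are enabled.
    // Stored fields are deliberately ignored here, matching the doc-values-only scope.
    static void validateCanDerive(String fieldName, boolean hasDocValues) {
        if (!hasDocValues) {
            throw new UnsupportedOperationException(
                "Cannot derive source for field [" + fieldName + "]: doc values are disabled");
        }
    }

    // Read-level guard: fail with a clear message instead of a NullPointerException.
    // The Map stands in for a segment's per-field doc values.
    static long[] readValues(Map<String, long[]> segmentDocValues, String fieldName) {
        long[] values = segmentDocValues.get(fieldName);
        if (values == null) {
            throw new IllegalStateException("No doc values found for field [" + fieldName + "]");
        }
        return values;
    }
}
```

Extending `validateCanDerive` with an `isStored` flag would be the natural place to add the stored-field fallback mentioned above, once that support exists.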

@rayshrey force-pushed the derived-source-interface branch from a5057b4 to e16b768 on January 21, 2025 11:56
@rayshrey requested a review from cwperks as a code owner on January 21, 2025 11:56
Contributor

❕ Gradle check result for e16b768: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov bot commented Jan 21, 2025

Codecov Report

Attention: Patch coverage is 23.33333% with 23 lines in your changes missing coverage. Please review.

Project coverage is 72.26%. Comparing base (13159c1) to head (e16b768).
Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...a/org/opensearch/index/mapper/DateFieldMapper.java | 19.04% | 17 Missing ⚠️ |
| .../java/org/opensearch/index/mapper/FieldMapper.java | 0.00% | 5 Missing ⚠️ |
| ...a/org/opensearch/index/mapper/MappedFieldType.java | 75.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17040      +/-   ##
============================================
- Coverage     72.31%   72.26%   -0.06%     
- Complexity    65346    65349       +3     
============================================
  Files          5301     5301              
  Lines        303805   303833      +28     
  Branches      44030    44034       +4     
============================================
- Hits         219702   219559     -143     
- Misses        66055    66257     +202     
+ Partials      18048    18017      -31     


@navneet1v
Contributor

@rayshrey is this feature planned for release in OpenSearch 2.19?
