Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IndexMapping Tool #1891

Merged
merged 3 commits into from
Jan 26, 2024
Merged

Conversation

dbwiddis
Copy link
Member

@dbwiddis dbwiddis commented Jan 19, 2024

Description

Adds an IndexMappingTool. See previous dev branch PR: #1609.

It looks like it's part of pending PR #1812 but needed to unblock end-to-end testing.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Daniel Widdis <[email protected]>
Copy link

codecov bot commented Jan 19, 2024

Codecov Report

Attention: 13 lines in your changes are missing coverage. Please review.

Comparison is base (b3c033f) 82.63% compared to head (435e318) 82.65%.
Report is 5 commits behind head on main.

Files Patch % Lines
...g/opensearch/ml/engine/tools/IndexMappingTool.java 84.28% 7 Missing and 4 partials ⚠️
...a/org/opensearch/ml/engine/tools/CatIndexTool.java 33.33% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1891      +/-   ##
============================================
+ Coverage     82.63%   82.65%   +0.02%     
- Complexity     5387     5397      +10     
============================================
  Files           521      522       +1     
  Lines         21705    21773      +68     
  Branches       2209     2216       +7     
============================================
+ Hits          17935    17996      +61     
- Misses         2873     2876       +3     
- Partials        897      901       +4     
Flag Coverage Δ
ml-commons 82.65% <82.66%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ylwu-amzn
ylwu-amzn previously approved these changes Jan 19, 2024
Copy link
Contributor

@arjunkumargiri arjunkumargiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do manual integration test by adding index mapping tool to chat agent and validate that the tool is able to integrate with LLM as expected?

@dbwiddis
Copy link
Member Author

dbwiddis commented Jan 23, 2024

Can we do manual integration test by adding index mapping tool to chat agent and validate that the tool is able to integrate with LLM as expected?

I'm already doing this with the testing framwork currently at https://github.com/joshuali925/observability-langchain but soon to be moved to https://github.com/opensearch-project/skills-eval

Sample output, including a hallucination:

Running test: 9E036FA47
Received: After reviewing the available information, the data fields available for the aircraft-match index are id, aircraft_id, and match_id.
Expected: The data fields available for the aircraft-match index are id, aircraft_id, and match_id
Score: 1

Running test: AC1D6C46C
Received: After reviewing the available information, the data type of the '%_Change_2007' field in the 'aircraft-airport' index is a string.
Expected: The data type of the '%_Change_2007' field in the 'aircraft-airport' index is a string.
Score: 1

Running test: CBBC09702
Received: After reviewing the information gathered, the fields for the aircraft-aircraft index are name, model, and year.
Expected: The fields for the aircraft-aircraft index are Aircraft, Aircraft_ID, Description, Max_Gross_weight, Max_disk_Loading, and Total_disk_area
Score: 0

Signed-off-by: Daniel Widdis <[email protected]>
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Inactive
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Failure
@dbwiddis
Copy link
Member Author

dbwiddis commented Jan 23, 2024

JDK 21 test failures are as a result of https://bugs.openjdk.org/browse/JDK-8323659, see opensearch-project/flow-framework#426

@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 18:39 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 19:02 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 19:02 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env January 23, 2024 19:02 — with GitHub Actions Inactive
public static final String TYPE = "IndexMappingTool";
private static final String DEFAULT_DESCRIPTION = String
.join(
" ",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need an empty space?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the delimiter in string join, adding a space between the sentences. I thought this syntax was easier to read than a bunch of concatenated strings. It's a one-time static constant so performance isn't really an issue.

I could append the strings with a + and try to remember to add a space at the end or beginning of each subsequent one....

@ylwu-amzn ylwu-amzn merged commit 849fecf into opensearch-project:main Jan 26, 2024
13 of 14 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 26, 2024
* Add IndexMapping Tool

Signed-off-by: Daniel Widdis <[email protected]>

* Improve description and make index a required parameter

Signed-off-by: Daniel Widdis <[email protected]>

* Add some more test coverage

Signed-off-by: Daniel Widdis <[email protected]>

---------

Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit 849fecf)
mingshl pushed a commit that referenced this pull request Jan 27, 2024
austintlee pushed a commit to austintlee/ml-commons that referenced this pull request Mar 19, 2024
* Add IndexMapping Tool

Signed-off-by: Daniel Widdis <[email protected]>

* Improve description and make index a required parameter

Signed-off-by: Daniel Widdis <[email protected]>

* Add some more test coverage

Signed-off-by: Daniel Widdis <[email protected]>

---------

Signed-off-by: Daniel Widdis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants