Supports additional query timing types for profiling plugin query components #17146

Open: shatejas wants to merge 4 commits into base 2.x

@shatejas shatejas commented Jan 27, 2025

Description

Adds k-NN-related enums so that ANN queries can be profiled. Currently it's difficult to debug k-NN latencies; this change increases visibility into k-NN query execution.

KNN PR: opensearch-project/k-NN#2450

Related Issues

Resolves opensearch-project/k-NN#2286

Sample response

},
	"profile": {
		"shards": [
			{
				"id": "[ZaUiItRkQIy9BnX_i0ccNg][target_index_faiss][0]",
				"inbound_network_time_in_millis": 0,
				"outbound_network_time_in_millis": 0,
				"searches": [
					{
						"query": [
							{
								"type": "BooleanQuery",
								"description": "IndexOrDocValuesQuery(indexQuery=rating:[8 TO 10], dvQuery=rating:[8 TO 10]) NativeEngineKnnVectorQuery[]...KNNQuery[]",
								"time_in_nanos": 41773456,
								"breakdown": {
									"advance": 0,
									"advance_count": 0,
									"build_scorer": 11648000,
									"build_scorer_count": 2,
									"compute_max_score": 0,
									"compute_max_score_count": 0,
									"create_weight": 30073458,
									"create_weight_count": 1,
									"match": 0,
									"match_count": 0,
									"next_doc": 35915,
									"next_doc_count": 13,
									"score": 16083,
									"score_count": 12,
									"set_min_competitive_score": 0,
									"set_min_competitive_score_count": 0,
									"shallow_advance": 0,
									"shallow_advance_count": 0
								},
								"children": [
									{
										"type": "IndexOrDocValuesQuery",
										"description": "IndexOrDocValuesQuery(indexQuery=rating:[8 TO 10], dvQuery=rating:[8 TO 10])",
										"time_in_nanos": 7893251,
										"breakdown": {
											"advance": 0,
											"advance_count": 0,
											"build_scorer": 6763916,
											"build_scorer_count": 3,
											"compute_max_score": 0,
											"compute_max_score_count": 0,
											"create_weight": 1113750,
											"create_weight_count": 1,
											"match": 0,
											"match_count": 0,
											"next_doc": 13418,
											"next_doc_count": 11,
											"score": 2167,
											"score_count": 10,
											"set_min_competitive_score": 0,
											"set_min_competitive_score_count": 0,
											"shallow_advance": 0,
											"shallow_advance_count": 0
										}
									},
									{
										"type": "NativeEngineKnnVectorQuery",
										"description": "NativeEngineKnnVectorQuery[]...KNNQuery[]",
										"time_in_nanos": 25468916,
										"breakdown": {
											"advance": 0,
											"advance_count": 0,
											"ann_search": 0,
											"ann_search_count": 0,
											"build_scorer": 287542,
											"build_scorer_count": 3,
											"compute_max_score": 0,
											"compute_max_score_count": 0,
											"create_weight": 25172250,
											"create_weight_count": 1,
											"exact_knn_search": 0,
											"exact_knn_search_count": 0,
											"match": 0,
											"match_count": 0,
											"next_doc": 7374,
											"next_doc_count": 4,
											"score": 1750,
											"score_count": 3,
											"set_min_competitive_score": 0,
											"set_min_competitive_score_count": 0,
											"shallow_advance": 0,
											"shallow_advance_count": 0
										},
										"children": [
											{
												"type": "KNNQuery",
												"description": "",
												"time_in_nanos": 2426625,
												"breakdown": {
													"advance": 0,
													"advance_count": 0,
													"ann_search": 2426625,
													"ann_search_count": 1,
													"build_scorer": 0,
													"build_scorer_count": 0,
													"compute_max_score": 0,
													"compute_max_score_count": 0,
													"create_weight": 0,
													"create_weight_count": 0,
													"exact_knn_search": 0,
													"exact_knn_search_count": 0,
													"match": 0,
													"match_count": 0,
													"next_doc": 0,
													"next_doc_count": 0,
													"score": 0,
													"score_count": 0,
													"set_min_competitive_score": 0,
													"set_min_competitive_score_count": 0,
													"shallow_advance": 0,
													"shallow_advance_count": 0
												}
											}
										]
									}
								]
							}
						],
						"rewrite_time": 429250,
						"collector": [
							{
								"name": "SimpleTopScoreDocCollector",
								"reason": "search_top_hits",
								"time_in_nanos": 375375
							}
						]
					}
				],
				"aggregations": []
			}
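
To show how a response like the one above could be inspected, here is a minimal Java sketch that sums a single breakdown entry (for example `ann_search`) over a query node and its children. It assumes the JSON has already been parsed into nested `Map`/`List` values; `ProfileTreeWalker` and `sumTiming` are hypothetical names, not part of OpenSearch.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not an OpenSearch class.
final class ProfileTreeWalker {

    /** Sums one breakdown entry over a query node and all of its descendants. */
    @SuppressWarnings("unchecked")
    static long sumTiming(Map<String, Object> node, String key) {
        // A node without a "breakdown" object contributes 0.
        Map<String, Object> breakdown =
            (Map<String, Object>) node.getOrDefault("breakdown", Map.of());
        long total = ((Number) breakdown.getOrDefault(key, 0L)).longValue();

        // "children" is optional; leaf queries such as KNNQuery have none.
        List<Map<String, Object>> children =
            (List<Map<String, Object>>) node.getOrDefault("children", List.of());
        for (Map<String, Object> child : children) {
            total += sumTiming(child, key);
        }
        return total;
    }
}
```

Applied to the `NativeEngineKnnVectorQuery` node in the sample, this kind of walk would pick up the `ann_search` time recorded on its `KNNQuery` child.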

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

ContextIndexSearcher

Signed-off-by: Tejas Shah <[email protected]>
@shatejas shatejas marked this pull request as ready for review January 27, 2025 21:01
@shatejas shatejas changed the title from "Adds KNN specific enums for profiling, exposes profiler in" to "Adds KNN specific enums for profiling, exposes profiler in ContextIndexSearcher" on Jan 27, 2025
Contributor

❌ Gradle check result for 51f1446: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@@ -48,7 +48,9 @@ public enum QueryTimingType {
     SCORE,
     SHALLOW_ADVANCE,
     COMPUTE_MAX_SCORE,
-    SET_MIN_COMPETITIVE_SCORE;
+    SET_MIN_COMPETITIVE_SCORE,
Collaborator

@shatejas the core does not know anything about the k-NN plugin (or any other plugin, for that matter), so this has to be part of the plugin-related instrumentation. We may need to think about how the profile phases could be extended or customized, though, if required.

Author

@shatejas shatejas Jan 27, 2025

@reta I understand. What's the recommendation in that case? This is one way I found to have additional components in the query profile.

Collaborator

@shatejas we may never run into a need to have such an extensibility feature, so we may have to design it in a plugin-neutral way.

Author

> we may never run into a need to have such an extensibility feature

Currently there is a need for time_in_nanos for k-NN. A KNN query is relatively complex: ANN search, exact search, and the filter query inside the KNN query are all major components, and there is no visibility into any of them, which makes performance issues extremely difficult to debug.

I wasn't able to find a hook that exposes k-NN components in the query breakdown without these changes.

> so we may have to design in a plugin neutral way.

Can you elaborate on what's involved here? If it's a major change in the k-NN plugin, it might have to be iterative, and this change might work until then.

Contributor

@navneet1v navneet1v Jan 28, 2025

@reta I agree on designing this in a plugin-neutral way. @shatejas let's have extension points in core that plugins can use to provide their QueryTimingTypes.

One idea I can think of: QueryTimingType is presumably used to put a string into the profile output. We could create another enum/class that collects the timing types from all plugins and then puts them in the right place during serialization.
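
A minimal Java sketch of the registry idea above, assuming plugins hand over their timing names as plain strings. `TimingTypeRegistry`, `CoreTimingType`, `register`, and `toBreakdown` are hypothetical names chosen for illustration and do not exist in OpenSearch core; `CoreTimingType` is only a reduced stand-in for the real `QueryTimingType` enum.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these types exist in OpenSearch core.
final class TimingTypeRegistry {

    // Stand-in for the core QueryTimingType enum (a subset, for brevity).
    enum CoreTimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC, SCORE }

    // Insertion-ordered so core names come first, then plugin names.
    private final Set<String> names = new LinkedHashSet<>();

    TimingTypeRegistry() {
        for (CoreTimingType type : CoreTimingType.values()) {
            names.add(type.name().toLowerCase(Locale.ROOT));
        }
    }

    /** Adds plugin-provided timing names; a name that is already registered is rejected. */
    void register(List<String> pluginTimingTypes) {
        for (String name : pluginTimingTypes) {
            String key = name.toLowerCase(Locale.ROOT);
            if (!names.add(key)) {
                throw new IllegalArgumentException("Timing type already registered: " + key);
            }
        }
    }

    /** Builds a breakdown containing every registered name; unrecorded timings become 0. */
    Map<String, Long> toBreakdown(Map<String, Long> recorded) {
        Map<String, Long> breakdown = new LinkedHashMap<>();
        for (String name : names) {
            breakdown.put(name, recorded.getOrDefault(name, 0L));
        }
        return breakdown;
    }
}
```

With something like this, the k-NN plugin could register `ann_search` and `exact_knn_search` once, and the serialization step would emit them next to the core entries without the core enum naming any plugin.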

Author

@reta Thanks, I am looking into it. I haven't found a solution yet; I'm doing a deep dive on possible options.

Author

@shatejas shatejas Jan 31, 2025

@reta I tried another approach, which holds additional types per query. Let me know what you think.

Other options:

  • Maintain a Query -> QueryBreakdown registry in the profile tree. But I am not sure there is a use case where a plugin wants to override the default types for a query.
  • Maintain a Query -> QueryProfiler registry and get profilers based on the Query type, falling back to the default. I haven't tried it, but it looks like each profile breakdown would be written as a separate JSON blob.
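
The second option could look roughly like the following Java sketch, which resolves a profiler by the query's class and falls back to a default. `ProfilerRegistry`, `register`, and `profilerFor` are hypothetical names, and the profiler type is left generic because the thread does not settle on one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: names are illustrative, not OpenSearch APIs.
final class ProfilerRegistry<P> {

    private final Map<Class<?>, Supplier<? extends P>> byQueryType = new HashMap<>();
    private final Supplier<? extends P> defaultProfiler;

    ProfilerRegistry(Supplier<? extends P> defaultProfiler) {
        this.defaultProfiler = defaultProfiler;
    }

    /** Associates a profiler factory with one concrete query class. */
    void register(Class<?> queryType, Supplier<? extends P> factory) {
        byQueryType.put(queryType, factory);
    }

    /** Returns a new profiler for the query's exact class, or one from the default factory. */
    P profilerFor(Object query) {
        Supplier<? extends P> factory =
            byQueryType.getOrDefault(query.getClass(), defaultProfiler);
        return factory.get();
    }
}
```

The lookup uses the exact class only, so subclasses of a registered query type would fall back to the default; whether that is desirable is a design choice this sketch leaves open.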

Collaborator

> @reta I tried another approach which holds additional types per query. Let me know what you think

Thanks @shatejas, I believe it is important to evolve the APIs in a consistent way. Here is a quick sketch that I would like to hear your opinion on:

  • plugins contribute queries using the SearchPlugin hooks
  • however, those hooks currently cover nothing related to query profiling

It would probably make sense to introduce a QueryProfilerSpec API that plugins can contribute to. A couple of options to consider:

  • the QuerySpec could (optionally) supply the corresponding QueryProfilerSpec, or
  • the SearchPlugin may provide a generic hook like
    default List<QueryProfilerSpec<?>> getQueryProfilers() {
        return emptyList();
    }

That would provide a basis to build on. Does it make sense?
Thank you.
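
For illustration, here is one way the `getQueryProfilers()` hook proposed above might be filled in by a plugin. This is a sketch under stated assumptions: `QueryProfilerSpec` was only proposed in this thread, so its fields here are a guess, and `SearchPluginSketch`, `KnnQuerySketch`, and `KnnPluginSketch` are hypothetical stand-ins for the real plugin interface and k-NN classes.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical: the shape of QueryProfilerSpec is assumed, not taken from OpenSearch.
final class QueryProfilerSpec<Q> {
    private final Class<Q> queryClass;
    private final Set<String> timingTypes;

    QueryProfilerSpec(Class<Q> queryClass, Set<String> timingTypes) {
        this.queryClass = queryClass;
        this.timingTypes = Set.copyOf(timingTypes);
    }

    Class<Q> queryClass() { return queryClass; }

    Set<String> timingTypes() { return timingTypes; }
}

// Stand-in for the real SearchPlugin interface, reduced to the proposed hook.
interface SearchPluginSketch {
    default List<QueryProfilerSpec<?>> getQueryProfilers() {
        return Collections.emptyList();
    }
}

// Stand-in for the k-NN plugin's query class.
final class KnnQuerySketch {}

// A plugin opts in by overriding the hook; other plugins keep the empty default.
final class KnnPluginSketch implements SearchPluginSketch {
    @Override
    public List<QueryProfilerSpec<?>> getQueryProfilers() {
        return List.of(new QueryProfilerSpec<KnnQuerySketch>(
            KnnQuerySketch.class, Set.of("ann_search", "exact_knn_search")));
    }
}
```

Because the hook has a default implementation, existing plugins would compile unchanged, which matches the goal of evolving the API consistently.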

Author

@reta I found one way to do it shatejas@04cff1c

Currently it uses QueryProfiler instead of AbstractQueryProfiler for simplicity. Overall I don't like the implementation: the context index searcher is now responsible for creating a new profiler instance. Moreover, it has to keep track of which profilers were used so that it can send them back to the Profilers class.

I also don't think plugins should be responsible for concurrentProfilers. For one, plugin queries simply leverage concurrency from opensearch-core. Apart from that, the concurrent profiler implementation seems pretty complex, so it would be a heavy lift for plugins (if they are not allowed to provide an instance of the existing one).

> Please note that we don't need to replace or piggybacking here

Just so I understand the concern: why should plugins not be allowed to piggyback on the existing response if it's not polluting the response? I understand that the APIs should be consistent, but the current implementation doesn't seem to allow it.

Collaborator

@reta reta Feb 5, 2025

> @reta I found one way to do it shatejas@04cff1c

Thanks @shatejas, I will try to look at it shortly.

> Just so I understand the concern - Why should plugins not be allowed to piggyback on existing response if its not polluting the response? I understand that the APIs should be consistent, but the current implementation doesn't seem to allow it.

To reiterate, I think the default query profiler should always be on, and additional profilers could be introduced alongside it. With such a design, the concept of "piggybacking" is not needed.

@shatejas shatejas changed the title from "Adds KNN specific enums for profiling, exposes profiler in ContextIndexSearcher" to "Supports additional query timing types for profiling plugin query components" on Jan 28, 2025
Contributor

❌ Gradle check result for 568cfe2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Collaborator

reta commented Jan 28, 2025

On an unrelated note, @shatejas please target the main branch.

Author

> On the unrelated note, @shatejas please target main branch

@reta there are some issues with the k-NN main branch that make it harder to test. I can open a PR against the main branch once the approach is finalized.

Contributor

❌ Gradle check result for fe1c855: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@@ -98,6 +100,10 @@ default QueryBuilder rewrite(QueryRewriteContext queryShardContext) throws IOExc
         return this;
     }
 
+    default Set<String> queryProfilerTimingTypes() {
Author

need to remove this

4 participants