Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-16503: Introduce new default Http2SolrClient in UpdateShardHandler #2351

Closed

Conversation

iamsanjay
Copy link
Contributor

@iamsanjay iamsanjay commented Mar 18, 2024

https://issues.apache.org/jira/browse/SOLR-16503

Currently default HTTP (Apache Http 1) client in UpdateShardHandler is being used at many places in the codebase and consequently removing it at once and replacing with Jetty HTTP2 would requires lots of efforts and testing. Instead, the possibility of introducing another default Jetty HTTP2 client in UpdateShardHandler is proposed to facilitate incremental changes.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@iamsanjay iamsanjay changed the title Introduce new default Http2SolrClient in UpdateShardHandler SOLR-16503: Introduce new default Http2SolrClient in UpdateShardHandler Apr 3, 2024
@iamsanjay iamsanjay marked this pull request as ready for review April 3, 2024 09:25
Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets get this PR up to date with master.

public HttpClient getDefaultHttpClient() {
return defaultClient;
}

// if you are looking for a client to use, it's probably this one.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really weird that UpdateShardHandler holds the "default" httpClient (i.e. clients that have nothing to do with updates). I don't want to scope creep much but I propose simply that CoreContainer have a getHttpSolrClient method that simply gets it from here, and nobody should call it here. It's still initialized here for now. I'm aware of SolrClientCache in CoreContainer but I could see transitioning away from that as well, especially with adding convenience base path / url request method to Http2SolrClient but even then it's not essential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's the name, UpdateShardHandler, which is not appropriate anymore. It started off with right intentions as there was only one client in the beginning, where now it encloses three HTTP clients.

I can create one method in CoreContainer#getDefaultHttpSolrClient and delegate to SolrClientCache for the Default client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solrClientCache object in CoreContainer.java still relies on the old HTTP client. Before we delegate to solrClientCache for HttpClient, we have to change the HttpClient for solrClientCache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point on UpdateShardHandler's name. I think it's a terrible name! In the project we colloquially refer to "handlers" as an abbreviation of RequestHandler which receives requests. But this "ShardHandler" doesn't even receive.... what, shards? To even be worthy of a Handler suffix in any shape or form, it must receive what it supposedly handles so it can then act on it. But this class just holds a collection of things mostly (but not totally) related to updates.

SolrClientCache is for when we need a separate SolrClient instance per base URL. Or similarly a CloudSolrClient for each cluster. Keep in mind it was written at a time when SolrClient classes were not immutable, so there was risk of sharing one. It's nifty but I don't think it should be the go-to place for an internal (within the cluster) clients -- basically any user of UpdateShardHnadler.getDefaultHttpClient today. Javadocs are needed!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Okay so no delegating to SolrClientCache.
  2. Creating CoreContainer#getDefaultHttpSolrClient and delegating it to UpdateShardHandler can be one option(I will add code for this!)
  3. If we are thinking of name change that would requires changes at around 40 places in the codebase as per quick search.
    Some new names suggestions as per AI:
    HttpOperationManager
    HttpExecutorCollection
    MultiHttpClientHandler
    HttpTaskCoordinator
    HttpOperationCoordinator
    HttpClientManager
    HttpTaskManager
  4. We can also create Default HTTP client inside the CoreContainer. On the other hand, CoreContainer already have too much, Not sure If want to give them more.
  5. Or may be it's time to divide UpdateShardHandler to few more classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should subclass SolrClientCache, named "ServerSolrClientCache" (differentiating from the base one that's mostly for streaming-expressions, from which it came) and provide these methods there. That way we don't add ever more responsibilities to CoreContainer. And it keeps these together with other SolrClients, so developers may be looking over in SolrClientCache. Might as well consolidate.

Perhaps defer UpdateShardHandler going away to another work item that is "main" only. This keeps compatibility with a number of plugins. Don't want to distract us from the overall goal of getting Apache HttpClient out of Solr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we defer "the refactoring" for now? Because regardless we have to create a public API in UpdateShardHandler for default Http2SolrClient. So let's keep the scope of ticket there. Later on, when we are going to change the client in SolrClientCache, we can discuss then?

Or, Do you not recommend creating "default" HTTP2 in UpdateShardHandler?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are introducing an Http2SolrClient to use instead of an (Apache) HttpClient, this isn't a refactoring. So now is an appropriate time to think about where best to place this new client so that we needn't go updating all the callers an additional time. We agree UpdateShardHandler doesn't make sense because it's not for updates.

@epugh
Copy link
Contributor

epugh commented Jun 19, 2024

@iamsanjay anythingbi can do to help get this in? Some great work here!!!

Copy link

This PR has had no activity for 60 days and is now labeled as stale. Any new activity or converting it to draft will remove the stale label. To attract more reviewers, please tag people who might be familiar with the code area and/or notify the [email protected] mailing list. Thank you for your contribution!

@github-actions github-actions bot added the stale PR not updated in 60 days label Aug 20, 2024
@iamsanjay iamsanjay closed this Sep 2, 2024
@iamsanjay iamsanjay force-pushed the SOLR-16503_default_Http2SolrClient branch from b91783e to 4736adb Compare September 2, 2024 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:index stale PR not updated in 60 days
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants