Researching the queueing/ticketing system in the GCP context and in Cloud Run Services #1541
Comments
|
Could you explain what benefits the second proposal would bring compared to our current setup? Wouldn't just one server also scale well with GCP? |
With our current setup we cannot unbind requests from verifications: a request is pending until the verification process is over. We need some way to separate verification from HTTP requests if we want to support receipts in API v2. We could potentially separate them in the same "sourcify-server" process, but then we would not be taking advantage of GCP scaling by optimizing on the number of requests:
|
I cannot think of any real use case of this, other than prioritizing some chains
In the diagram I wrote "Read Status" from
|
To be able to proceed here, some more feedback:

Second approach
If I get it right, GCP Cloud Run scales the number of instances based on pending HTTP requests or when the CPU utilization gets above some percentage. So the
Maybe I am wrong here with my assumptions, so happy to hear your opinion on this.

First approach
We should look into what options we have for such a

In general, I think we need to look a bit closer here at the two approaches and also define the internal structure of the components to decide which option is best. |
Summarizing the call and the next steps: We agreed on 3 viable options:

1. Queueing Service + Verification Service + HTTP Server
Similar to option 1 above, having a Queueing Service and a separate Verification Service. Keeping this short as we did not discuss the details. I guess in this case the scaling will be handled by the Queueing Service itself? Leaving it here to keep this option open.

2. Verification Service + HTTP Server
Similar to option 2 above, just having a separate Verification Service. In this case the rough flow is as follows:
Scaling: In this case, the Verification Services will be scaled by their CPU usage. Once a certain utilization is hit (60% in GCP Cloud Run), a new instance is spun up and new requests from the HTTP server are routed to the new instance. This should also be compatible with other scalable deployments, e.g. Kubernetes.

3. Only one HTTP Server
In the call, a third option was proposed that requires no separate service (just an HTTP server) but outsources the async task to Workers. In this case the rough flow is as follows:
Scaling: Here the server instances get scaled with the CPU use, similar to how it's done at the moment. Since the server instances are stateless, this is easily possible.

Next steps
We'd like to create simple sequence diagrams of the last 2 proposals to make them easily understandable. After that we'll contact Markus from the Devops team for his feedback. |
This in the second option also implies a "worker". A worker is just a term for any background task that is being processed. We could also just call it background task or something similar, but I imagine it to be a class that gets instantiated with the request and handles the verification in the background then. This class could be called I also updated your comment to make this clear. |
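To make the "worker" idea concrete, here is a minimal sketch of a class that is handed a request, returns a receipt ID immediately, and finishes the verification in the background. Every name in it (VerificationWorker, InMemoryJobStore, the job ID format, the status values) is hypothetical and not taken from the Sourcify codebase, and the in-memory store only stands in for the database.

```typescript
// Hypothetical sketch: none of these names come from the Sourcify codebase.
type JobStatus = "pending" | "verified" | "failed";

interface VerificationJob {
  id: string;
  status: JobStatus;
  error?: string;
}

// Stand-in for the database that would hold the verification status.
class InMemoryJobStore {
  private jobs = new Map<string, VerificationJob>();

  save(job: VerificationJob): void {
    this.jobs.set(job.id, { ...job });
  }

  get(id: string): VerificationJob | undefined {
    return this.jobs.get(id);
  }
}

type VerifyFn = (input: string) => Promise<boolean>;

class VerificationWorker {
  private store: InMemoryJobStore;
  private verify: VerifyFn;
  private nextId = 1;

  constructor(store: InMemoryJobStore, verify: VerifyFn) {
    this.store = store;
    this.verify = verify;
  }

  // Called from the HTTP handler: records the job as pending, starts the
  // verification without awaiting it, and returns the receipt ID right away.
  submit(input: string): string {
    const id = `job-${this.nextId++}`;
    this.store.save({ id, status: "pending" });
    void this.run(id, input);
    return id;
  }

  // Runs in the background and writes the final status to the store.
  private async run(id: string, input: string): Promise<void> {
    try {
      const ok = await this.verify(input);
      this.store.save({ id, status: ok ? "verified" : "failed" });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      this.store.save({ id, status: "failed", error: message });
    }
  }
}
```

With something like this, the HTTP handler would respond with the ID returned by submit, and a separate status endpoint would read the job from the store. Because the status lives in the store rather than in the process, any stateless server instance could answer the status request.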
Recap of my conversation with Markus. The monolithic option 3 is fine: the only downside is that we would scale parts of our service that don't need scaling:
Option 2 is ideal because it separates http and verification scaling concerns but it comes with additional effort:
|
I'm not sure if I get the second point:
I understand we'll have two components to build and deploy but is there something I'm missing that'll make it incredibly complex?
I don't get why it is favorable to have the HTTP server do the DB operations instead of the Verification Server. Overall, to me the downsides of number 3 are not a big concern compared to the development effort that'll be needed. I think we can just set the request count limit high enough that scaling is mostly driven by CPU instead. |
In response to
I'm citing Markus:
I honestly also didn't fully get this point. It's not a huge deal to keep everything synchronized. Probably this becomes a problem when you have to keep different versions online or deploy with 0 downtime or you have more than 2 services. |
I still think Markus makes some valid points here. For example, maintaining a database module for two services increases the maintenance burden. Overall, I think option 3 is very easy to implement for us and option 2 just means reduced costs compared to 3. As costs are not a priority at the moment, I would also go with 3 for now. It should also be possible to upgrade the architecture from 3 to 2 if we feel like there is the need later. |
I'm also in favor of option 3 for its ease |
Then we all agree! |
Remember to include server information in the new |
Context
/v2/verify response, this will allow us to separate the HTTP request from the verification status.

Solutions
We are exploring two solutions:

1. sourcify-http-server and sourcify-verification-service. sourcify-http-server will push pending contracts to the queue-service, and sourcify-verification-service will read pending contracts from queue-service, verifying them and marking them as completed. This solution involves setting up a queue service, which adds complexity to our architecture, but it gives us granular control over what's in the queue, enabling us to potentially implement priority systems.

2. sourcify-http-server and sourcify-verification-service will be deployed as Google Cloud Run Services. sourcify-http-server will receive /verify requests from the internet and call sourcify-verification-service directly without passing through a queue. The verification's status is going to be saved in sourcify-database.
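As an illustration of the "priority systems" mentioned for the queue-based solution, the sketch below shows a small in-memory queue that hands out higher-priority pending contracts first and keeps first-in, first-out order within the same priority. It is only a stand-in for whatever queue-service would actually be used; the PendingContract fields and the PendingContractQueue class are assumptions made for the example.

```typescript
// Hypothetical sketch: an in-memory stand-in for the queue-service.
interface PendingContract {
  chainId: number;
  address: string;
  priority: number; // higher value = verified sooner (assumed convention)
}

class PendingContractQueue {
  private items: PendingContract[] = [];

  // Used by the HTTP server side: insert so that the list stays ordered by
  // descending priority, with FIFO order among equal priorities.
  push(item: PendingContract): void {
    const index = this.items.findIndex(
      (existing) => existing.priority < item.priority
    );
    if (index === -1) {
      this.items.push(item);
    } else {
      this.items.splice(index, 0, item);
    }
  }

  // Used by the verification service side: take the next contract to verify.
  pop(): PendingContract | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }
}
```

Prioritizing some chains, the one use case raised in the comments, would then come down to assigning a higher priority value to contracts from those chains when they are pushed.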