Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NU-1960] API for messages preview in source #7510

Open
wants to merge 25 commits into
base: staging
Choose a base branch
from

Conversation

pielas
Copy link
Contributor

@pielas pielas commented Jan 29, 2025

Describe your changes

Checklist before merge

  • Related issue ID is placed at the beginning of PR title in [brackets] (can be GH issue or Nu Jira issue)
  • Code is cleaned from temporary changes and commented out lines
  • Parts of the code that are not easy to understand are documented in the code
  • Changes are covered by automated tests
  • Showcase in dev-application.conf added to demonstrate the feature
  • Documentation added or updated
  • Added entry in Changelog.md describing the change from the perspective of a public distribution user
  • Added MigrationGuide.md entry in the appropriate subcategory if introducing a breaking change
  • Verify that PR will be squashed during merge

@pielas pielas changed the title Feature/nu 1960 messages preview in source [NU-1960] API for messages preview in source Feb 3, 2025
@pielas pielas requested a review from arkadius February 3, 2025 16:00
Copy link
Member

@arkadius arkadius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, a few small comments added and one comment about endpoint naming.

)
.errorOut(
oneOf[NodesError](
fetchLatestRecordsErrorExample,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Example is only a part of this error definition. Maybe let's skip it or use some other suffix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what do you mean. Can you give an example of other suffix that fits? I think that word Example fits here because it indicates that it's just an example of FetchLatestRecordsError returned by the service. Additionally other example output instances are named using the same convention, with Example suffix, like noScenarioExample, malformedTypingResultExample, ProcessingTypeExample

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tapir call this entities "Outputs". The output contains a status code, information about what encoding will be used (for example plain text/json) and an example of a body. I think that probably someone firstly extracted only a body example and called it an example and then changed it to output and forgot to change the variable name. Take a look for example on scenarioNotFoundErrorOutput, migrateEndpointErrorOutput. I think that this naming schema is more closer to the model that Tapir provides.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understand now. It indeed doesn't make sense to call them examples. I renamed all ErrorOutputs according to suggested naming convention

@@ -1472,6 +1542,7 @@ object NodesApiEndpoints {
sealed trait NodesError

object NodesError {
final case class FetchLatestRecordsError(msg: String) extends NodesError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:/ It would be nice to have all business cases such as node compilation error, source doesn't support data preview ... etc. here. Can you provide it? At least let's add a TODO for that

nodesApiEndpoints.fetchLatestRecordsForNodeEndpoint
.serverSecurityLogic(authorizeKnownUser[NodesError])
.serverLogicEitherT { implicit loggedUser =>
{ case (scenarioName, numberOfRecords, fetchLatestRecordsDto) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing that we don't validate if numberOfRecords is too big because we assume that scenarioTestService already do this because there is a limit for data generation for scenario testing purpose?

It is not visible in the code. If we scenarioTestService returned a business errors, we would see it. As an alternative, we can at least add a comment about that assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added such comment to NodesApiHttpService

          // The service is expected to limit the number of returned records if the response is too large.
          // If it does not, we may need to implement client-side pagination or request size limits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug a little bit more into the code and found out where this records count is done for test data generation mechanism, take a look at this:

      _ <- Either.cond(
        testSampleSize <= testDataSettings.maxSamplesCount,
        (),
        s"Too many samples requested, limit is ${testDataSettings.maxSamplesCount}"
      )

in ScenarioTestService. At the beginning, I thought that it is somewhere deeper and your endpoint do this check implicitly, but it's not. We have to reuse this code, IMO it is not acceptable that the API accepts any limit. User might blow up our heap memory.

@pielas pielas requested a review from arkadius February 6, 2025 12:58

val noDataGeneratedExample: EndpointOutput.OneOfVariant[NoDataGenerated.type] =
oneOfVariantFromMatchType(
NotFound,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than one output variant with the same status code won't work correctly. There is a long discussion in Tapirs repo why it works that way. The unpleasant thing is that Tapir don't event report an Exception in that case. It justs ignore some of variants. To make it work with Tapir, you need to have one output variant with given status code with many examples. Take a look at for example DeploymentApiEndpoints.runDeploymentEndpoint

)
)

val noSourcesWithTestDataGenerationExample: EndpointOutput.OneOfVariant[NoSourcesWithTestDataGeneration.type] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one won't happen because it is a valid error for test generation based on scenario, not a node. Maybe we could another sealed trait level below ScenarioTestError - SourceTestError. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make class hierarchy where ScenarioError extends SourceError, but it didn't really work too well with Eithers. This is why I duplicated classes a little bit and I made it a flat structure for both cases. I don't have any better ideas for now. WDYT?

Copy link
Member

@arkadius arkadius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments resolved, a few added. Please take a look at failing tests as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants