NIFI-14109 Refactored remaining processors and control services to be uniform when creating properties and relationships. #9600

Open
wants to merge 3 commits into
base: main

dan-s1
Contributor

@dan-s1 dan-s1 commented Dec 27, 2024

Summary

NIFI-14109
This PR aims to have lists of PropertyDescriptor objects and sets of Relationship objects all defined with one PropertyDescriptor or Relationship per line for increased readability. The list variables are named PROPERTY_DESCRIPTORS and the set variables are named RELATIONSHIPS. Also, for common PropertyDescriptor objects defined in a parent class, a method getCommonPropertyDescriptors is defined to allow children to use them. This was done for the following parent and child classes (parents on the left; children that use the method on the right).

  1. AbstractAMQPProcessor - ConsumeAMQP, PublishAMQP
  2. AbstractAwsMachineLearningJobStarter - StartAwsTextractJob
  3. AbstractAwsMachineLearningJobStatusProcessor - GetAwsTextractJobStatus
  4. AbstractAzureCosmosDBProcessor - PutAzureCosmosDBRecord
  5. AbstractEmailProcessor - ConsumeIMAP, ConsumePOP3
  6. AbstractGridFSProcessor (applies this also to relationships) - FetchGridFS, PutGridFS
  7. AbstractHadoopProcessor - AbstractFetchHDFSRecord, AbstractPutHDFSRecord (PutParquet but doesn't use common method), CreateHadoopSequenceFile, DeleteHDFS, FetchHDFS, GetHDFS, GetHDFSEvents, GetHDFSFileInfo, ListHDFS, MoveHDFS, PutHDFS
  8. AbstractJoltTransform - JoltTransformJSON, JoltTransformRecord
  9. AbstractMongoProcessor - DeleteMongo, GetMongo, PutMongo, PutMongoBulkOperations, PutMongoRecord, RunMongoAggregation
  10. SplunkAPICall - PutSplunkHTTP, QuerySplunkIndexingStatus
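
As a rough illustration of the convention described above, here is a minimal sketch. The `PropertyDescriptor` and `Relationship` records are simplified stand-ins for the NiFi types, and `AbstractExampleProcessor`, `ExampleConsumer`, and all property names are hypothetical, not classes from this PR. It only shows the shape: one entry per line, the `PROPERTY_DESCRIPTORS` and `RELATIONSHIPS` names, and a parent `getCommonPropertyDescriptors` method that children call.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical stand-ins for NiFi's PropertyDescriptor and Relationship,
// reduced to a name so the sketch compiles without the NiFi API.
record PropertyDescriptor(String name) { }
record Relationship(String name) { }

// Hypothetical parent processor that owns the shared properties.
abstract class AbstractExampleProcessor {
    static final PropertyDescriptor HOST = new PropertyDescriptor("Host");
    static final PropertyDescriptor PORT = new PropertyDescriptor("Port");

    private static final List<PropertyDescriptor> COMMON_PROPERTY_DESCRIPTORS = List.of(
            HOST,
            PORT
    );

    // Children call this method rather than referencing the parent's list directly.
    protected static List<PropertyDescriptor> getCommonPropertyDescriptors() {
        return COMMON_PROPERTY_DESCRIPTORS;
    }
}

// Hypothetical child processor that adds its own properties to the common ones.
class ExampleConsumer extends AbstractExampleProcessor {
    static final PropertyDescriptor QUEUE = new PropertyDescriptor("Queue");
    static final PropertyDescriptor BATCH_SIZE = new PropertyDescriptor("Batch Size");

    static final Relationship REL_SUCCESS = new Relationship("success");
    static final Relationship REL_FAILURE = new Relationship("failure");

    // One descriptor per line; the result of toList() is unmodifiable.
    static final List<PropertyDescriptor> PROPERTY_DESCRIPTORS = Stream.concat(
            Stream.of(
                    QUEUE,
                    BATCH_SIZE
            ),
            getCommonPropertyDescriptors().stream()
    ).toList();

    // One relationship per line; Set.of returns an unmodifiable set.
    static final Set<Relationship> RELATIONSHIPS = Set.of(
            REL_SUCCESS,
            REL_FAILURE
    );
}
```

The order in which a child's own properties and the common ones are concatenated is a choice made here for illustration; the actual processors may order them differently.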

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

relationships.addAll(getAdditionalRelationships());
this.relationships = Collections.unmodifiableSet(relationships);
this.descriptors = Stream.concat(
PROPERTIES.stream(),
Contributor

Is there no simpler/cleaner option than Stream.concat(... list.stream() ...).toList()? Overall this is cleaner in some ways, but it does seem like there should be a simpler way of saying 'List from these lists'.

Contributor Author

I used this because I saw it used in other parts of the code, e.g. ElasticSearchLookupService.

Contributor

It would surprise me if this is the best way to express these declarations.

Contributor Author

@dan-s1 dan-s1 Dec 31, 2024

I thought it was the cleanest way to make an unmodifiable list. A quick Google search indicates that is the Java 9 way to combine lists and make them unmodifiable.
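
For comparison, here is a minimal sketch of two ways to build an unmodifiable list from two lists. The class name `CombineListsSketch`, the method names, and the sample values are hypothetical and not taken from the NiFi code base. `Stream.toList()` was added in Java 16 and `List.copyOf` in Java 10, so both are available on the JDK 21 build used here; which one reads better is a judgment call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper class, used only to compare the two forms.
class CombineListsSketch {

    // Option 1: the Stream.concat form discussed in this thread.
    static <T> List<T> viaStreamConcat(List<T> first, List<T> second) {
        return Stream.concat(first.stream(), second.stream()).toList();
    }

    // Option 2: copy into a mutable list, then take an unmodifiable copy.
    static <T> List<T> viaCopyOf(List<T> first, List<T> second) {
        List<T> combined = new ArrayList<>(first);
        combined.addAll(second);
        return List.copyOf(combined);
    }

    public static void main(String[] args) {
        List<String> common = List.of("Host", "Port");
        List<String> own = List.of("Queue");

        // Both calls produce the same elements in the same order.
        System.out.println(viaStreamConcat(common, own));
        System.out.println(viaCopyOf(common, own));
    }
}
```

Both results reject modification with UnsupportedOperationException. The second form needs a temporary mutable list, which is one argument for keeping the stream version inside a static field initializer.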

AMQP_VERSION,
SSL_CONTEXT_SERVICE,
USE_CERT_AUTHENTICATION
);
Contributor

Here the parent class 'getCommonPropertyDescriptors' method is removed and instead a subclass is supposed to know to use the static list of descriptors as they build their own lists. I think this weakens what the author intended in building that class.

Also there is at least one example much later on in this PR where you kept the parent method. I think keeping the parent method better conveys intent to any subclassers.

Contributor Author

I will revert this back.

Contributor Author

@dan-s1 dan-s1 Jan 6, 2025

I ended up adding 'getCommonPropertyDescriptors' for all the parent-child relationships I saw. Below are the ones I changed (parents on the left and children that use the method on the right).

  1. AbstractAMQPProcessor - ConsumeAMQP, PublishAMQP
  2. AbstractAwsMachineLearningJobStarter - StartAwsTextractJob
  3. AbstractAwsMachineLearningJobStatusProcessor - GetAwsTextractJobStatus
  4. AbstractAzureCosmosDBProcessor - PutAzureCosmosDBRecord
  5. AbstractEmailProcessor - ConsumeIMAP, ConsumePOP3
  6. AbstractGridFSProcessor (applies this also to relationships) - FetchGridFS, PutGridFS
  7. AbstractHadoopProcessor - AbstractFetchHDFSRecord, AbstractPutHDFSRecord (PutParquet but doesn't use common method), CreateHadoopSequenceFile, DeleteHDFS, FetchHDFS, GetHDFS, GetHDFSEvents, GetHDFSFileInfo, ListHDFS, MoveHDFS, PutHDFS
  8. AbstractJoltTransform - JoltTransformJSON, JoltTransformRecord
  9. AbstractMongoProcessor - DeleteMongo, GetMongo, PutMongo, PutMongoBulkOperations, PutMongoRecord, RunMongoAggregation
  10. SplunkAPICall - PutSplunkHTTP, QuerySplunkIndexingStatus

dan-s1 added 3 commits January 9, 2025 13:49
… uniform when creating properties and relationships.
…ROPERTY_DESCRIPTORS. For parent processor classes which defined common properties, made getCommonPropertyDescriptors() method for children to use to load them. Ensured lists of PropertyDescriptor objects has each PropertyDescriptor on its own line and sets of RelationShip objects has each Relationship on its own line to improve readability.
Contributor

@exceptionfactory exceptionfactory left a comment

Thanks for working on improving the consistency of approach @dan-s1.

I'm in favor of going with PROPERTY_DESCRIPTORS and RELATIONSHIPS as currently implemented. If you can rebase the pull request, this looks close to completion.
