Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add entity groups #19574

Draft
wants to merge 38 commits into
base: master
Choose a base branch
from
Draft

Add entity groups #19574

wants to merge 38 commits into from

Conversation

ryan-carroll-graylog
Copy link
Contributor

@ryan-carroll-graylog ryan-carroll-graylog commented Jun 6, 2024

Description

Adds base functionality for entity grouping/tagging/categorizing.

Closes:

Motivation and Context

See https://github.com/Graylog2/graylog-plugin-enterprise/issues/7385 for details and pitch.

How Has This Been Tested?

Locally using API browser.

Example request to create a new entity group:

{
  "entities": {"assets" : ["789", "777"], "sigma_rules" : ["456", "888"]},
  "name": "category 1"
}

For the first pass only two entity type values are recognized (chosen simply to be able to test this effectively): assets and sigma_rules.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactoring (non-breaking change)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryan-carroll-graylog This is looking good to me and also tests successfully! Just noticed one issue with the update endpoint.

if (EntityGroup.id() != null) {
return scopedEntityMongoUtils.update(EntityGroup);
}
String newId = scopedEntityMongoUtils.create(EntityGroup);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When trying to update an entity group (without supplying an ID in the request JSON), the update call ends up creating a new copy of the entity. I think some other recently developed API requests don't rely on the ID being in the JSON but instead use the ID from the URL to look up the existing entity (and error if it's not found). I think doing the same here would be good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah nice catch, just updated to use the url ID.

paginationHelper.filter(searchQuery.toBson()).sort(sort).perPage(perPage).page(page, filter);
}

public EntityGroup save(EntityGroup EntityGroup) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public EntityGroup save(EntityGroup EntityGroup) {
public EntityGroup save(EntityGroup entityGroup) {

Camel case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catches! Looks like some bad find-and-replaces from the original categories verbiage.

List<String> entityIds
) {
public static final String FIELD_ENTITY_TYPE = "entity_type";
public static final String FIELD_ENTITY_IDS = "entity_IDS";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static final String FIELD_ENTITY_IDS = "entity_IDS";
public static final String FIELD_ENTITY_IDS = "entity_ids";

Nit: lower case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just updated

Comment on lines 20 to 21
public static final String ENTITY_GROUP_READ = "category:read";
public static final String ENTITY_GROUP_EDIT = "category:edit";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want to use the entity_group name in the permissions and audit names for consistency, or were we planning to refer to them as categories in the frontend?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we need to go there now, but it might be helpful to have a type enum for entity groups, so that they could be used for specific use cases. For example with Categories, we could assign a CATEGORY type, as a sort of namespace for all the security-specific categories. Not sure it's worth the effort at this point, but might be useful to nicely solve the naming difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, this was another missed find-and-replace form the original categories code I had.

I definitely like the idea of a type enum for group types, and even toyed with the idea of making entity groups a base class the could be inherited by different type implementations, but ended up leaning toward keeping it as simple as possible until we have a second type use case. If you think it makes sense I'll leave it out for now and maybe we can revisit later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, definitely good to wait until we have a use case.

@kroepke kroepke self-requested a review June 14, 2024 09:09
Copy link
Member

@kroepke kroepke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments inline.
High-level comments:

  • we are still missing tests for the logic (no need to test infrastructure of course)
  • please have a look at using document update query commands from MongoDB instead of load-modify-store patterns, as it is inherently dangerous and prone to inconsistent results (and also a performance problem at scale)


public EntityGroup addEntityToGroup(String groupId, String type, String entityId) {
final EntityGroup group = requireEntityGroup(groupId);
return dbEntityGroupService.save(group.addEntity(type, entityId));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be using an atomic update instead of a load-modify-store pattern.
The issue here is that updating a group this way is not atomic and can easily cause updates that happen concurrently to be lost.

If the database query tries to update a document that doesn't exist (which should only happen if a group is concurrently deleted or someone sends arbitrary IDs in the request) the update query should simply fail. Thus we would save one query by not having to load the entire entity.

}

public EntityGroup update(String id, EntityGroup group) {
if (dbEntityGroupService.get(id).isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below, but even if not, this should use the same code as addEntityToGroup (but in reality it should simply perform an update on the existing document and not use load-modify-store at all).

exists(typeField(type)),
in(typeField(type), entityIds)
);
final List<EntityGroup> groups = MongoUtils.stream(collection.find(query)).toList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stream(...).toList() is an anti-pattern, especially on the database level. Please avoid doing this. I know that in other places, it ends up being a list quickly anyway, but in general, it's dangerous.

In particular, if the result set is large, this forces loading the entire result set into memory instead of using the underlying cursor to read documents incrementally.
We start looping over the list immediately, so there's no reason to materialize the entire result set.

Many other places employ this pattern as well, but we should be careful about these. As a general rule, if dealing with the database, keep in mind how many documents can come back and if you intend to loop over them, consider using the stream instead of calling toList

Comment on lines 121 to 132
final Multimap<String, EntityGroup> entityToGroupsMap = MultimapBuilder.hashKeys().hashSetValues().build();
for (EntityGroup group : groups) {
for (Map.Entry<String, Set<String>> StringEntry : group.entities().entrySet()) {
if (StringEntry.getKey().equals(type)) {
for (String entityId : StringEntry.getValue()) {
if (entityIds.contains(entityId)) {
entityToGroupsMap.put(entityId, group);
}
}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments:

  • group.entities() is nullable, so it should be checked, however in practice the query guarantees that the group will have entities. If that's always the case on the DB level, maybe reconsider the @Nullable annotation on entities().
  • Loop over entities().entrySet() and if (StringEntry.getKey().equals(type)) { should be a simple get(type)? The query already makes sure all groups contain something from that type, we just don't know which entityId it is.
  • The overall code would benefit from performing this as an aggregation query in MongoDB, instead of doing it in code.

@ApiResponse(code = 400, message = "An entity group already exists with id or name"),
@ApiResponse(code = 404, message = "No such entity group")
})
@RequiresPermissions(EntityGroupPermissions.ENTITY_GROUP_EDIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Permission checks on resources that deal with a specific instance should use per-instance checks, not only annotations. Compare eg.

@Path("/{group_id}/add_entity")
@ApiOperation("Add an entity to an entity group")
@ApiResponses(@ApiResponse(code = 404, message = "No such entity group"))
@RequiresPermissions(EntityGroupPermissions.ENTITY_GROUP_EDIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Permission checks on resources that deal with a specific instance should use per-instance checks, not only annotations. Compare eg.

@PUT
@Path("/get_for_entities")
@ApiOperation("Get a list of entity groups for a list of entities")
@RequiresPermissions(EntityGroupPermissions.ENTITY_GROUP_READ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Permission checks on resources that deal with a specific instance should use per-instance checks, not only annotations. Compare eg.

@Path("/{id}")
@Consumes(MediaType.APPLICATION_JSON)
@ApiOperation("Delete an entity group")
@RequiresPermissions(EntityGroupPermissions.ENTITY_GROUP_EDIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the others, but this also should use a separate "delete" permission.

Permission checks on resources that deal with a specific instance should use per-instance checks, not only annotations. Compare eg.

@ryan-carroll-graylog
Copy link
Contributor Author

Really appreciate the review and feedback @kroepke! I believe I've addressed everything you noted but definitely let me know if I missed anything.

Copy link
Member

@kroepke kroepke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the updates!

I've left a few more comments around permissions and one resource leak.
The stream issue we apparently have in many places around the code, we are going to do a more comprehensive check, but please already address it here, as it is relatively simple to do.

in(typeField(type), entityIds)
);
final Multimap<String, EntityGroup> entityToGroupsMap = MultimapBuilder.hashKeys().hashSetValues().build();
MongoUtils.stream(collection.find(query)).forEach(group -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stream uses must call close, because this is an IO stream (see the javadocs of MongoUtils#stream).
.forEach does not call close(), although one could assume it does.

The best way to do this is always to use try-with-resources on stream usages; worst case, it's a no-op for streams that don't require it:

try (final Stream<EntityGroup> stream = MongoUtils.stream(collection.find(query))) {
            stream.forEach(group -> { ... });
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow I was not aware of this, thanks for pointing that out! I've addressed it here and will work this pattern in elsewhere as I see it.

@Path("/{id}")
@ApiOperation(value = "Get a single entity group")
@ApiResponses(@ApiResponse(code = 404, message = "No such entity group"))
@RequiresPermissions(ENTITY_GROUP_READ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't use the annotation if you check the individual permission in the code as that might confuse future readers.
Another PR will add marker annotations to clearly spell out the intent, but for now, let's use either an annotation or the code check, but not both at the same time (also reduces the risk of modifying one but not the other).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok definitely makes sense, have removed the extraneous annotations in this class.

@GET
@Path("/get_for_entity")
@ApiOperation(value = "Get a list of entity groups for an entity")
@RequiresPermissions(ENTITY_GROUP_READ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra annotation

@ApiResponse(code = 400, message = "An entity group already exists with id or name"),
@ApiResponse(code = 404, message = "No such entity group")
})
@RequiresPermissions(ENTITY_GROUP_EDIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra annotation

@Path("/{group_id}/add_entity")
@ApiOperation("Add an entity to an entity group")
@ApiResponses(@ApiResponse(code = 404, message = "No such entity group"))
@RequiresPermissions(ENTITY_GROUP_EDIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra annotation

@Path("/{id}")
@Consumes(MediaType.APPLICATION_JSON)
@ApiOperation("Delete an entity group")
@RequiresPermissions(ENTITY_GROUP_DELETE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra annotation


@GET
@ApiOperation(value = "Get a list of entity groups")
@RequiresPermissions(ENTITY_GROUP_READ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In resources that return a list of entities to a user, the entities need to be permission-checked before being returned to the client.

While this creates an awkward problem with pagination and limits, it's currently the only way to ensure we don't leak information.

One way to do this here would be:

final PaginatedList<EntityGroup> paginated = entityGroupService.findPaginated(query, page, perPage, order, sort, null);
return paginated.withList(paginated.stream().filter(entityGroup -> isPermitted(ENTITY_GROUP_READ, entityGroup.id())).toList());

This approach isn't great, as it could have completely empty pages or pages with less than per_page entries, but at least it would be correct.
There's code in the search classes that extends pagination with a predicted, but for now there's no globally available method to do this properly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact there's

PaginatedList<T> page(int pageNumber, Predicate<T> selector);
which can help do this already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely makes sense, went ahead and refactored the paginated endpoints to use the MongoPaginationHelper filters downstream.

@PUT
@Path("/get_for_entities")
@ApiOperation("Get a list of entity groups for a list of entities")
@RequiresPermissions(ENTITY_GROUP_READ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as with findPaginated, this needs to check the returned entities individually

@ryan-carroll-graylog
Copy link
Contributor Author

Thanks for all the comments @kroepke! A lot of great info I believe I've addressed here, and will apply across the board elsewhere too.

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryan-carroll-graylog Thanks for all of the updates. I just noticed a couple of additional items from the latest.

Comment on lines 175 to 177
public static final String ENTITY_GROUP_CREATE = PREFIX + "category:create";
public static final String ENTITY_GROUP_UPDATE = PREFIX + "category:update";
public static final String ENTITY_GROUP_DELETE = PREFIX + "category:delete";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might have been missed in the recent changes, but I believe we were going to use the name entity_group everywhere.

Suggested change
public static final String ENTITY_GROUP_CREATE = PREFIX + "category:create";
public static final String ENTITY_GROUP_UPDATE = PREFIX + "category:update";
public static final String ENTITY_GROUP_DELETE = PREFIX + "category:delete";
public static final String ENTITY_GROUP_CREATE = PREFIX + "entity_group:create";
public static final String ENTITY_GROUP_UPDATE = PREFIX + "entity_group:update";
public static final String ENTITY_GROUP_DELETE = PREFIX + "entity_group:delete";

Comment on lines 326 to 328
.add(ENTITY_GROUP_CREATE)
.add(ENTITY_GROUP_UPDATE)
.add(ENTITY_GROUP_DELETE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like formatters for these audit entries are still needed, since the general message is still being logged.
image

@ApiParam(name = "query") @QueryParam("query") @DefaultValue("") String query,
@ApiParam(name = "sort",
value = "The field to sort the result on",
allowableValues = "name")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Similar to some other pagination endpoints, could we add a default value in the pagination endpoints to avoid an error when it's not provided?

Suggested change
allowableValues = "name")
allowableValues = "name")
@DefaultValue(EntityGroup.FIELD_NAME)

@ryan-carroll-graylog
Copy link
Contributor Author

Thanks for reviewing with the latest @danotorrey! Somehow I missed the categories audit logs last time around.
Pushed some updates here and opened an enterprise PR for the audit log formatting: https://github.com/Graylog2/graylog-plugin-enterprise/pull/7585

@ryan-carroll-graylog
Copy link
Contributor Author

@kroepke I believed I've addressed all the PR feedback here. Do you see any issues merging this once we get the last SecDev reviews so we can push forward with front end development?

We've got this behind a feature flag so we can also safely make changes in the future as needed without any major disruptions.

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the fixes @ryan-carroll-graylog! LGTM and tests successfully.

ryan-carroll-graylog and others added 2 commits July 8, 2024 11:02
* Create categories page

* Add entity group management functionality

* Remove changelog

* Remove empty hook file

* Tweak menu name
@danotorrey
Copy link
Contributor

We discussed this with the Architecture team, and identified that we are putting the general Entity Groups solution on hold. We may pursue it further later, as a more general way to link entities of different types into a group with potential additional properties such as permissions etc.

In the short-term, we will pursue https://github.com/Graylog2/graylog-plugin-enterprise/issues/7926 instead.

@bernd
Copy link
Member

bernd commented Jul 24, 2024

We discussed this with the Architecture team, and identified that we are putting the general Entity Groups solution on hold. We may pursue it further later, as a more general way to link entities of different types into a group with potential additional properties such as permissions etc.

In the short-term, we will pursue Graylog2/graylog-plugin-enterprise#7926 instead.

I am changing the PR to draft status.

@bernd bernd marked this pull request as draft July 24, 2024 07:23
@bernd bernd removed their request for review August 1, 2024 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants