Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DROOLS-7589] ansible-rulebook : Throw Exception when heap reaches to… #90

Merged
merged 5 commits into from
Nov 15, 2023

Conversation

tkobayas
Copy link
Collaborator

… threshold

Comment on lines 65 to 78
public CompletableFuture<Integer> executeFacts(String json) {
MemoryMonitorUtil.checkMemoryOccupation();
return rulesEvaluator.executeFacts(asFactMap(json));
}

public CompletableFuture<List<Match>> processFacts(String json) {
MemoryMonitorUtil.checkMemoryOccupation();
return rulesEvaluator.processFacts(asFactMap(json));
}

public CompletableFuture<List<Match>> processEvents(String json) {
MemoryMonitorUtil.checkMemoryOccupation();
return rulesEvaluator.processEvents(asFactMap(json));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the place where we can check the memory before sync/async execution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that checking this synchronously and at each and every rule evaluation is a good idea. I'm afraid the overhead could be relevant and not totally justified. One possibility is to run this asynchronously as a timed task. Another alternative is doing it here and synchronously but actually doing the check (and running the gc) only once in fixed number (100?) of evaluations. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I agree that we want to minimize the overhead. Probably the former (time-based check) may miss "quickly incoming many events" scenario. So I guess the latter (number-based check) is better?

Comment on lines 30 to 35
int memoryOccupationPercentage = getMemoryOccupationPercentage();
if (memoryOccupationPercentage > MEMORY_OCCUPATION_PERCENTAGE_THRESHOLD) {
System.gc(); // NOSONAR
// double check to avoid frequent GC
memoryOccupationPercentage = getMemoryOccupationPercentage();
if (memoryOccupationPercentage > MEMORY_OCCUPATION_PERCENTAGE_THRESHOLD) {
Copy link
Collaborator Author

@tkobayas tkobayas Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means I call System.gc() only when heap exceeds the threshold. However, I think it doesn't make much sense because GC is processed by a different thread (and not something we can fully control) so the second check wouldn't reflect the result of GC. Shall I simply remove it?

Also Shall I add another threshold (e.g. 60%) to call System.gc()?

As far as I observe the default G1GC, GC is triggered when the used heap is around 40% - 60% of the total heap.

Please let me know your thoughts, @mariofusco

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that what you did here is a good idea. Let's give the gc one last chance to run before actually reporting the problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I keep the logic.

@tkobayas tkobayas force-pushed the DROOLS-7589-ansible-heap branch from 9f761c9 to a0293f5 Compare November 14, 2023 07:30
Comment on lines +18 to +23
@Ignore("Disabled by default, because it takes around 40 seconds")
@Test
public void testManyLargeEvents() {
// match_multiple_rules: false means events are removed after match. So this test will pass without throwing MemoryThresholdReachedException
checkDuration("1m_event_with_20kb_payload_rules_ast.json", 120_000);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this test, we can confirm there is no memory leak.

Comment on lines +25 to +29
@Test
public void testManyLargeEventsMatchMultipleRules() {
// match_multiple_rules: true means events are retained until TTL expires
assertThrows(MemoryThresholdReachedException.class, () -> checkDuration("1m_event_with_20kb_payload_match_multiple_rules_ast.json", 120_000));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this test, we can confirm MemoryThresholdReachedException is thrown before hitting OutOfMemoryError.

Comment on lines +23 to +24
// set true when matchedEvents occupies too much memory
private boolean discardMatchedEvents = false;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to add this to avoid memory retention on Payload side in the test cases.

Comment on lines -46 to +47
if (log.isInfoEnabled()) {
log.info(matchToString(match));
if (log.isDebugEnabled()) {
log.debug(matchToString(match));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested by Madhu, changed to DEBUG level. When there are many large events, INFO would result in too much logging.


public static void checkMemoryOccupation() {
// check memory every N events to avoid performance overhead
if (eventCount.incrementAndGet() >= MEMORY_CHECK_EVENT_COUNT_THRESHOLD) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a similar idea but possibly a little bit more efficient, let me try to propose an alternative.

Comment on lines 45 to 51
public static void checkMemoryOccupation() {
// check memory every N events to avoid performance overhead
if (eventCount.incrementAndGet() >= MEMORY_CHECK_EVENT_COUNT_THRESHOLD) {
doCheckMemoryOccupation();
eventCount.set(0);
}
}
Copy link
Collaborator Author

@tkobayas tkobayas Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mariofusco Now check memory every N events to avoid the overhead. Default is 100.

Btw, you wrote (and running the gc) , but isn't System.gc() per 100 events too frequent? Rather, shall I call System.gc() only when memoryOccupationPercentage > 60%? (maybe another configurable property)

@mariofusco mariofusco merged commit e3a6c73 into kiegroup:main Nov 15, 2023
1 check passed
mariofusco pushed a commit that referenced this pull request Jan 15, 2024
#90)

* [DROOLS-7589] ansible-rulebook : Throw Exception when heap reaches to threshold

* - check memory every N events to avoid performance overhead

* More efficient counter to check memory occupation

* fix compilation

* wip

---------

Co-authored-by: Mario Fusco <[email protected]>
(cherry picked from commit e3a6c73)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants