Significant Performance Discrepancy Between OpenJ9 and Hotspot JVMs for Simple Java Loop #20874

Open
xwqmary opened this issue Dec 31, 2024 · 6 comments

xwqmary commented Dec 31, 2024

Problem Overview: When executing the following Java program, whose loop seemingly never meets its exit condition, the OpenJ9 JVM takes significantly longer than the Hotspot JVM. This behavior suggests that OpenJ9 optimizes this program sub-optimally.

Java Code Example:

class B59062_0 {
    static int a = 0;

    public static void main(String[] args) {
        while (a != 14) {
            a += 9;
        }
    }
}

Execution Times:
Hotspot JVM:
time /root/hotspot/jdk-21.0.5/bin/java B59062_0

 real    0m0.341s
 user    0m0.333s
 sys     0m0.021s

Hotspot JVM: The program terminates quickly in approximately 0.34 seconds, suggesting that Hotspot JVM may be optimizing or detecting the infinite loop and exiting early.

OpenJ9 JVM:
time /root/openj9/jdk-21.0.5+11/bin/java B59062_0

real    0m28.353s
user    0m28.483s
sys     0m0.121s

OpenJ9 JVM: The program runs for approximately 28.35 seconds, indicating a prolonged execution likely due to the infinite loop.

Additional Description: I have tried several optimization options (-Xjit:count=1, -Xaot, -Xjit:optlevel=scorching) to improve the performance of the OpenJ9 JVM on this code. However, the execution time remains excessively long.
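
A note on the loop in the report: because `a` is a Java `int`, `a += 9` wraps around on overflow, and since 9 is odd the value 14 is eventually reached. The loop is therefore long but finite, which is consistent with both JVMs terminating. The sketch below computes the iteration count without running the loop; the class name `LoopTripCount` and its helper are hypothetical and not part of the issue.

```java
import java.math.BigInteger;

// Hypothetical helper, not from the issue: computes how many times
// "a += step" must run before a == target under 32-bit wrap-around.
class LoopTripCount {

    // Smallest k >= 0 with start + k * step == target (mod 2^32).
    // Assumes step is odd, so that it has an inverse modulo 2^32.
    static long iterations(int start, int step, int target) {
        BigInteger modulus = BigInteger.ONE.shiftLeft(32);
        BigInteger inverse = BigInteger.valueOf(step).modInverse(modulus);
        BigInteger delta = BigInteger.valueOf((long) target - start);
        return delta.multiply(inverse).mod(modulus).longValueExact();
    }

    public static void main(String[] args) {
        long k = iterations(0, 9, 14);
        System.out.println(k);              // prints 477218590
        System.out.println((int) (k * 9));  // prints 14 (4294967310 wraps to 14)
    }
}
```

Roughly 477 million iterations complete quickly in compiled code but take much longer if the method stays in the interpreter.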

Member

pshipton commented Jan 2, 2025

@hzongaro fyi

Member

hzongaro commented Jan 6, 2025

It looks like this is tripping across how long-running methods are handled in OpenJ9. If a method is invoked just once and runs for a very long time, OpenJ9 will usually attempt to compile it while it's being run by the interpreter, and then transfer control into the compiled version of the method while it's still running. There's a brief outline of how that's managed in the Dynamic Loop Transfer section of the Compilation Control documentation.

I'm not familiar with the situations in which Dynamic Loop Transfer can be applied, so I'm not sure why it wasn't activated in this case. Perhaps @mpirvu could provide his insight.

If you tried running with the option -Xjit:{B59062_0.main*}(count=0), it would force the main method to be compiled before it's invoked for the first time. That would avoid the problem.

Contributor

mpirvu commented Jan 6, 2025

The DLT logic exits abruptly if the loop starts at the first bytecode of the method (see the FIXME comment in the code below):

   if (startPC ||
       walkState.method==0 ||
       (romMethod->modifiers & J9AccNative) ||
       ((intptr_t)(walkState.method->constantPool) & J9_STARTPC_JNI_NATIVE) ||
       !J9ROMMETHOD_HAS_BACKWARDS_BRANCHES(romMethod) ||
       TR::CompilationInfo::getJ9MethodVMExtra(walkState.method)==J9_JIT_NEVER_TRANSLATE ||
       (J9CLASS_FLAGS(J9_CLASS_FROM_METHOD(walkState.method)) & J9AccClassHotSwappedOut) ||
       walkState.bytecodePCOffset<=0)      // FIXME: Deal with loop back on entry later
      {
      dltBlock->methods[idx] = 0;
      return;

Given the program in comment #20874 (comment), I believe this is what we see here. @zl-wang wrote the initial implementation and may be more familiar as to why loop back on entry is treated differently.
Having said that, DLT situations mostly arise in microbenchmarks and are extremely rare in full applications.

Contributor

zl-wang commented Jan 6, 2025

> Given the program in comment #20874 (comment), I believe this is what we see here. @zl-wang wrote the initial implementation and may be more familiar as to why loop back on entry is treated differently.

Bytecode index 0 can also mean the full method. Loop-back to bci 0 is even rarer. At that time, I didn't try to distinguish between these two cases: a full method call (i.e. the sample happened to fall on the stackOverFlow snippet or just as interpretation was about to start) vs. a loop back to bci 0. Hence the FIXME comment.
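
To illustrate the bci 0 case: in the original reproducer the `while` condition is the first thing in `main`, so the loop's backward branch targets bytecode index 0. A minimal sketch of a variant whose loop header sits at a later index is shown below. The class `LoopNotAtEntry` and the local `step` are hypothetical, and whether this is enough for DLT to engage is an untested assumption, not something confirmed in this thread.

```java
// Hypothetical variant of the reproducer, not taken from the issue.
class LoopNotAtEntry {
    static int a = 0;

    // The local initialisation compiles to bytecodes that precede the
    // loop header, so the backward branch targets an index greater than 0.
    static void run(int target) {
        int step = 9;
        while (a != target) {
            a += step;
        }
    }

    public static void main(String[] args) {
        run(14);
        System.out.println(a); // prints 14 once the loop exits
    }
}
```

Comparing the timings of this variant and the original on OpenJ9 would be one way to check whether the `bytecodePCOffset<=0` test is what keeps the original in the interpreter.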

Author

xwqmary commented Jan 7, 2025

@zl-wang Thank you for the detailed analysis of the issue I submitted, helping me understand how OpenJ9 handles long-running methods.
I understand that the problem stems from the Dynamic Loop Transfer (DLT) logic not being properly activated when the loop starts at the first bytecode of the method. I appreciate your guidance. Are there any plans to fix this issue?

hzongaro added the perf label Jan 7, 2025

Member

hzongaro commented Jan 7, 2025

It looks like this problem was previously reported in issue #15281, which contained some discussion of whether it would be feasible to distinguish between the stackOverFlow snippet and loop back to bytecode index 0.
