Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented CPU plugin just-in-time emitter for SoftPlus operation. #28752

Open
wants to merge 31 commits into
base: master
Choose a base branch
from

Conversation

geeky33
Copy link
Contributor

@geeky33 geeky33 commented Jan 30, 2025

Details:

Add ARM64 JIT emitter implementation for SoftPlus operation (ln(1 + exp(x))) with fp32 support.

  • Changes include:
  • Implement SoftPlus JIT emitter using ARM NEON instructions
  • Add kernel support in ARM64 executor
  • Add test coverage for JIT SoftPlus operation
  • Included comprehensive unit tests and performance benchmarks. Verified against reference CPU implementation with proper handling of edge cases (NaN, infinity, denormals).

No breaking changes. Requires ARM64 platform with NEON SIMD support.

@eshoguli
@dmitry-gorokhov
@a-sidorova
@p-wysocki

Tickets:

@geeky33 geeky33 requested review from a team as code owners January 30, 2025 12:40
@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Jan 30, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Jan 30, 2025
@a-sidorova a-sidorova self-assigned this Jan 31, 2025
@a-sidorova a-sidorova added the platform: arm OpenVINO on ARM / ARM64 label Jan 31, 2025
@a-sidorova a-sidorova added this to the 2025.1 milestone Jan 31, 2025
Comment on lines 2556 to 2559
h->fexp(aux1.s, src.s);
h->ld1r(aux2.s, table_val2("one"));
h->fadd(aux1.s, aux1.s, aux2.s);
h->flog(dst.s, aux1.s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember, you told that you have problems with test launching using qemu (but as we agreed, I don't see comments with asking help with your problem in your GFI).
But anyway do you have a chance to validate this implementation? I didn't find the instructions in fexp or flog in AArch64 manual.

Exp can be calculated used jit_exp_emitter (please see the Sigmoid emitter implementation, this emitter needs to calculate exp too).

As for log, it can be calculated using approximation. Please take a look at oneDNN impl. But it's implemented using SVE. You can try to rewrite it using NEON with the same math underneath

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Ma'am
I realised that this was the PR which I created after talking with you realted to launching the tests issue. Hence I thought to refactor the code after getting your review and then ask in the comments for asking help on the tests.

@@ -57,6 +57,7 @@ bool JitEltwiseExecutor::isSupported(const Algorithm& algorithm,
Algorithm::EltwiseRoundHalfAwayFromZero,
Algorithm::EltwiseRoundHalfToEven,
Algorithm::EltwiseSelect,
Algorithm::EltwiseSoftPlus,
Copy link
Contributor

@a-sidorova a-sidorova Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like EltwiseSoftPlus is missed in Algorithm enum.

Please see the example how Negative was added to Eltwise op

@geeky33
Copy link
Contributor Author

geeky33 commented Feb 4, 2025

@a-sidorova
Greetings Ma'am

I have done the changes according to the comments made.

Please let me know what all more changes are needed to be done further.

Thanks and Regards.

@a-sidorova
Copy link
Contributor

@geeky33 if contributor asks to review, I assume that the contributor is sure that code is correct and works. And I may spend my time on review. Otherwise, contributor should ask help with the code. I already left the comment with helpful links and hints how to implement SoftPlus emitter with exp and log. Unfortunately, I don't see applied changes (removing fexp and replacing flog with flog_approx don't fix the implementation) but you asked me to review it again :)
If you have concrete questions about the implementation and my previous comment doesn't help, please let me know and feel free to ask!

As for testing and emulator problems, I asked you to leave comment in the issue with asking help with faced problem. I still don't see this comment with problem description and your steps for reproduction.

@geeky33
Copy link
Contributor Author

geeky33 commented Feb 4, 2025

@a-sidorova

Greetings once again.
I will be making the exact changes by next few hours ma'am
Extremely sorry for the issue being caused
I am on it.

Now, since launching the tests was an issue to me

Here are the steps I followed for NotEqual JIT Emitter

sudo apt-get update
sudo apt-get install qemu-user qemu-user-static gcc-aarch64-linux-gnu g++-aarch64-linux-gnu binutils-aarch64-linux-gnu
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)

set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)
set(CMAKE_STRIP aarch64-linux-gnu-strip)
set(PKG_CONFIG_EXECUTABLE aarch64-linux-gnu-pkg-config CACHE PATH "Path to ARM64 pkg-config")
sudo apt-get install -y \
    qemu-user \
    qemu-user-static \
    gcc-aarch64-linux-gnu \
    g++-aarch64-linux-gnu \
    binutils-aarch64-linux-gnu \
    build-essential \
    cmake
aarch64-linux-gnu-gcc --version
aarch64-linux-gnu-g++ --version

sudo apt update
sudo apt install gcc-aarch64-linux-gnu g++-aarch64-linux-gnu binutils-aarch64-linux-gnu libc6-dev-arm64-cross

Till here everythhing went superbly fine

I have ran the build cmd again after deleting the dir
and i m facing the same issue again

image

image

ran this cmd sudo apt install libc6:aarch64

and also followed your instructions
removed build and bin directories
updated aarch64 compiler
And tried to configure openvino and built it one more time

even after subsequent tries I can see that the build is completed with 100% even after that I am not able to launch the relevant testcases.

@geeky33
Copy link
Contributor Author

geeky33 commented Feb 4, 2025

Extremely sorry for the issue being caused:

Before making any changes,
could you please just check this part of the codeblock which I am attaching below:

  using TReg = typename dnnl::impl::cpu::aarch64::cpu_isa_traits<isa>::TReg;
  const TReg src(in_vec_idxs[0]);
  const TReg dst(out_vec_idxs[0]);
  const TReg aux1(aux_vec_idxs[0]);
  const TReg aux2(aux_vec_idxs[1]);
  const TReg aux3(aux_vec_idxs[2]);
  const TReg aux4(aux_vec_idxs[3]);
  const TReg aux5(aux_vec_idxs[4]);

  h->ld1r(aux1.s, table_val2("threshold"));
  h->ld1r(aux3.s, table_val2("one"));
  h->fcmgt(aux2.s, src.s, aux1.s);
  h->fmov(aux4.s, src.s);
  h->fadd(aux4.s, aux4.s, aux3.s);
  h->fmov(aux5.s, aux4.s);
  h->bsl(dst.s, aux2.s, src.s, aux5.s);

To validate my this implementation:
This implementation appears to be appropriate for performing a SIMD thresholding operation: it compares src with a threshold, increments values below the threshold by 1.0, and uses bsl to conditionally select between the original and incremented values. Ensuring that table_val2 correctly loads threshold and one, and validate edge cases (e.g., threshold equality, NaN).

@a-sidorova
Copy link
Contributor

@geeky33 First of all, I suggest to fix your problem with qemu. As I can see in your previous screenshot, you have the same problem when you try to execute even simple void main() {} using qemu. So I think there is some problem in your environment, not in OpenVINO. Launching of tests is the one of important parts in development :)

Then let's do so: when you are sure in your implementation, launch all needed tests for SoftPlus and they are green in your local environment, please attach screenshots that all tests are successfully passed on your machine and ask me to review your PullRequest. When I see that everything is ready and tests are successfully and locally passed on your machine (see your attached screenshot), I will take a look at your changes.

Also, I asked my colleagues if someone of them faced with the same problem in launching tests with qemu, but no one see it in his environment. So I think that firstly you need to spend some time on googling if you want to launch tests with emulator :)

@geeky33
Copy link
Contributor Author

geeky33 commented Feb 5, 2025

@a-sidorova Thank you for the feedback
Is it possible to get in touch with any of your colleague who looks after the setup areas of openvino for fixing this issue?
Since it is very overwhelming for me to fix this as I tried doing research and I could not find enough clues.
I believe that I have tried all the possible attempts to fix this issue from my side and also the suggestions given by you in the discord

Still whatever could be the done I will look into it as well! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin ExternalPR External contributor platform: arm OpenVINO on ARM / ARM64
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Good First Issue] [ARM]: Implement CPU plugin just-in-time emitter for SoftPlus operation
3 participants