Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nextWakupNanos lazySet. #149

Closed
wants to merge 1 commit into from

Conversation

pveentjer
Copy link
Contributor

@pveentjer pveentjer commented Mar 1, 2022

The memory ordering effects of the nextWakupNanos are irrelevant. We
just need to make sure that the compiler doesn't optimize out the load
or the store. A setOpaque should be sufficient. But since the code needs
to be Java 8 compatible, setRelease is the cheapest alternative.

This will restrict compiler optimizations but doesn't lead to a memory
fence on the X86 since every store is a release store (the semantics
provided by a setLazy).

In a subsequent PR I'll deal with the getAndSet in the finally clause.

Fixes one part of #148

The memory ordering effects of the nextWakupNanos are irrelevant. We
just need to make sure that the compiler doesn't optimize out the load
or the store. A setOpaque should be sufficient. But since the code needs
to be Java 8 compatible, setRelease is the cheapest alternative.

This will restrict compiler optimizations but doesn't lead to a memory
fence on the X86 since every store is a release store (the semantics
provided by a setLazy).

In a subsequent PR I'll deal with the getAndSet in the finally clause.
Copy link
Contributor

@franz1981 franz1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@normanmaurer
Copy link
Member

@pveentjer did you sign our ICLA?

https://netty.io/s/icla

@pveentjer
Copy link
Contributor Author

I just did.

@normanmaurer
Copy link
Member

@pveentjer Would you also be interested to do the same for our native epoll / kqueue transports in netty itself ?

@normanmaurer normanmaurer added this to the 0.0.13.Final milestone Mar 1, 2022
@pveentjer
Copy link
Contributor Author

@normanmaurer sure. But let me fix the io_uring one first. I'm still trying to get a proper understanding of the wakeup mechanism in combination with the atomic operations and the memory ordering. I need a fresh brain for that :)


// The memory ordering effects of the nextWakeupNanos are irrelevant. A setOpaque should be sufficient,
// but on Java 8 the lazySet is the cheapest alternative. It is free on X86 since every store is a release store.
nextWakeupNanos.lazySet(curDeadlineNanos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pveentjer we must ensure that this write happens before the subsequent hasTasks() read or else a (non-IO) task could be put in the queue by another thread without us being awoken.

Note that the cost here should not be significant anyhow. We are about to make a blocking syscall because there's no more work to do, so it is not on a critical path. A heavily loaded system should rarely reach here since we'll remain in the CQ processing loop below.

Copy link
Contributor Author

@pveentjer pveentjer Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you need to order an older store with a newer load to a different address. So you need a [StoreLoad] which is provided by a volatile write and not by something weaker. So my change would break the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pveentjer so I guess we should close this pr ?

@pveentjer pveentjer closed this Mar 3, 2022
@pveentjer
Copy link
Contributor Author

Yes. Best to close this one. I got a lot better understanding of the signaling mechanism and the queue. By having 2 separate mechanisms, the expensive [StoreLoad] is needed. If the flag to wakup and the queue would be integrated into the same structure, I don't think the expensive [StoreLoad] is needed on the I/O thread.

@njhill
Copy link
Member

njhill commented Mar 3, 2022

Thanks @pveentjer. You're right, and in fact a while back I prototyped such a custom queue implementation, I may try to resurrect that for interest. One issue though is that since then the queue impl was made pluggable and a fully integrated queue would probably need to break that extensibility.

@pveentjer
Copy link
Contributor Author

pveentjer commented Mar 4, 2022

For Hazelcast I did something similar.

Before beginning on such a journey, I would first make a JMH test and check if the current implementation is actually causing performance degradation or if there are other more urgent problems.

As any self-respecting engineer, I have made optimizations with unproven improvements :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants