winch: x64 atomic stores #9987
Subscribe to Label Action
This issue or pull request has been labeled: "winch"
One thing that we might want to do as part of this change, before proceeding further with the development of the rest of the instructions in the proposal, is to ensure that addresses are aligned for atomic loads/stores. See the alignment section of the proposal.
To check for alignment, I'd recommend creating a new method in the CodeGen module (e.g., emit_check_alignment) and either calling it before calling emit_compute_heap_addr or enhancing emit_compute_heap_addr to check alignment internally if the load/store is atomic.
For reference, this is how alignment is handled in Cranelift.
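For illustration, the check being asked for can be sketched at the value level as follows. This is only a sketch under stated assumptions: the `Trap` type and the `check_alignment` function are hypothetical names made up for this example, and the real change would emit machine code for the check rather than evaluate it directly.

```rust
/// Hypothetical trap type for this sketch; not Winch's actual trap code.
#[derive(Debug, PartialEq, Eq)]
pub enum Trap {
    HeapMisaligned,
}

/// Checks that `addr + offset` is a multiple of `access_size` bytes.
/// `access_size` is assumed to be a power of two (1, 2, 4, or 8 here).
pub fn check_alignment(addr: u64, offset: u64, access_size: u64) -> Result<(), Trap> {
    debug_assert!(access_size.is_power_of_two());
    // Single-byte accesses are always aligned, so there is nothing to check.
    if access_size == 1 {
        return Ok(());
    }
    // The effective address includes the static offset from the memarg.
    // Wrapping does not change the low bits that the check inspects.
    let effective = addr.wrapping_add(offset);
    // For a power-of-two size, an aligned address has all low bits clear.
    if effective & (access_size - 1) != 0 {
        return Err(Trap::HeapMisaligned);
    }
    Ok(())
}
```

Note that the offset takes part in the check: address 6 with offset 2 is aligned for a 4-byte access, while address 6 with offset 0 is not.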
winch/codegen/src/isa/x64/masm.rs
Outdated
// TODO: we don't support 128-bit atomic store yet.
bail!(CodeGenError::unexpected_operand_size());
}
// To stay consistent with cranelift, we emit a normal load followed by a mfence,
- // To stay consistent with cranelift, we emit a normal load followed by a mfence,
+ // To stay consistent with cranelift, we emit a normal store followed by a mfence,
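As a source-level analogy for the "normal store followed by a fence" pattern named in that comment, here is a minimal Rust sketch using the standard library's atomics. The function name `store_then_fence` is hypothetical, and this is not the code Winch emits; how the compiler lowers it to machine instructions is not guaranteed.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

/// Performs a plain (relaxed) atomic store, then issues a full fence,
/// mirroring the "store, then fence" shape described in the comment.
pub fn store_then_fence(cell: &AtomicU32, value: u32) {
    // The store itself carries no ordering guarantees beyond atomicity.
    cell.store(value, Ordering::Relaxed);
    // The fence orders the store before any later memory operations.
    fence(Ordering::SeqCst);
}
```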
0b53b93 to 4dfacab
@saulecabrera I have added the align check
02fa02a to e462bb3
winch/codegen/src/codegen/mod.rs
Outdated
) -> Result<Option<Reg>> {
    if check_align {
        self.check_align(memarg, access_size)?;
Generally in the codegen module, we try to stick with the emit_* prefix for consistency. Could you update the name of this method?
winch/codegen/src/codegen/mod.rs
Outdated
@@ -648,7 +648,12 @@ where
    &mut self,
    memarg: &MemArg,
    access_size: OperandSize,
    check_align: bool,
What do you think of either:
- Passing an enum here describing the type of heap address computation (HeapAddress::AlignChecked, HeapAddress::AlignUnchecked)
- Creating a wrapper method emit_compute_heap_address_align_checked, which internally calls emit_check_align and emit_compute_heap_address?
Heap address calculation is a very critical piece of the compiler, so I'd recommend against passing boolean params, to make it less error prone at call sites.
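The two options above can be sketched side by side. This is a toy model, not Winch code: `ToyCodeGen` and its `emitted` log are hypothetical, the enum only borrows the names proposed in the comment, and the methods record step names instead of emitting instructions.

```rust
/// Hypothetical stand-in for the enum proposed in the review comment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeapAddress {
    AlignChecked,
    AlignUnchecked,
}

/// Toy code generator that records the steps it would emit.
#[derive(Default)]
pub struct ToyCodeGen {
    pub emitted: Vec<&'static str>,
}

impl ToyCodeGen {
    fn emit_check_align(&mut self) {
        self.emitted.push("check_align");
    }

    /// Option 1: the call site states its intent with an enum, not a bool.
    pub fn emit_compute_heap_address(&mut self, kind: HeapAddress) {
        if kind == HeapAddress::AlignChecked {
            self.emit_check_align();
        }
        self.emitted.push("compute_heap_address");
    }

    /// Option 2: a wrapper that always checks alignment first.
    pub fn emit_compute_heap_address_align_checked(&mut self) {
        self.emit_check_align();
        self.emit_compute_heap_address(HeapAddress::AlignUnchecked);
    }
}
```

Either way, a call such as `emit_compute_heap_address(HeapAddress::AlignChecked)` documents itself at the call site, whereas a bare `true` or `false` argument does not.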
winch/codegen/src/codegen/mod.rs
Outdated
let addr = *self.context.stack.peek().unwrap();
let tmp = self.context.any_gpr(self.masm)?;
self.context.move_val_to_reg(&addr, tmp, self.masm)?;
I'd recommend using self.context.pop_to_reg, which already handles all the cases for moving a value to a register; e.g., in this case, if the value is already in a register, this code will emit a move, which could be avoided, and more importantly it'll reduce register pressure.
One thing to note with using self.context.pop_to_reg is that this method needs to ensure that the value is pushed back to the value stack after all the checks are emitted, to ensure that emit_compute_heap_address is able to pop the address.
I've been playing around with that, but it turns out that we need to move the address to a register anyway, because we potentially need to add the offset to the addr, and we cannot do that with the addr as a dst, as we need the address intact for computing the heap address later on. The reason why we can't add with tmp as a dst is that it will return an InvalidTwoArgumentForm.
In light of this, I think peeking and moving to tmp would save a mov in the case where we need to compute the offset. WDYT?
Oh that's a good point yeah. My main concern with peek is that nothing prevents any other method from accidentally popping the wrong value from the stack. However, since this pattern is local to this method, I don't think there's a huge risk, so yeah, let's try the peek approach. One comment on your original implementation: could you return an error instead of doing peek().unwrap()? We have CodeGenError::missing_values_in_stack for this kind of situation.
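The request to replace peek().unwrap() with an error can be sketched like this. The types here are toy stand-ins: `ToyStack`, `ToyError`, and `peek_address` are hypothetical and only mirror the shape of the suggestion, assuming a peek that returns an `Option`.

```rust
/// Hypothetical error type; stands in for the codegen error in the comment.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyError {
    MissingValuesInStack,
}

/// Toy value stack holding plain integers instead of compiler values.
pub struct ToyStack {
    values: Vec<u64>,
}

impl ToyStack {
    pub fn new(values: Vec<u64>) -> Self {
        Self { values }
    }

    /// Returns the top of the stack without popping it.
    pub fn peek(&self) -> Option<&u64> {
        self.values.last()
    }
}

/// Reads the address on top of the stack, reporting an error on an
/// empty stack instead of panicking as `peek().unwrap()` would.
pub fn peek_address(stack: &ToyStack) -> Result<u64, ToyError> {
    stack.peek().copied().ok_or(ToyError::MissingValuesInStack)
}
```

With this shape, callers can propagate the failure with `?` rather than aborting compilation on a panic.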
fn check_align(&mut self, memarg: &MemArg, size: OperandSize) -> Result<()> {
    if size.bytes() > 1 {
        let addr = *self.context.stack.peek().unwrap();
        let tmp = self.context.any_gpr(self.masm)?;
I believe you could use a scratch register here? (scratch!(M))
I'm a bit worried about the risk of clobbering the scratch register here. The register is used across multiple masm operations, and we do use scratch registers in masm often.
491c1a7 to 26d9841
Looks good, thanks!
After the trailing comment is deleted, we can land this one.
winch/codegen/src/codegen/mod.rs
Outdated
if size.bytes() > 1 {
    let addr = self.context.pop_to_reg(self.masm, None)?;
    let tmp = scratch!(M);
    // self.context.move_val_to_reg(&addr, tmp, self.masm)?;
// self.context.move_val_to_reg(&addr, tmp, self.masm)?;
26d9841 to 25a703c
@saulecabrera it's not working yet; I had to move to an x86 machine to debug. I'll ping you when I've fixed it.
@saulecabrera, I have made the changes you requested and responded to your comments. Let me know if you want me to make the changes I suggested in the comments instead.
- use scratch register for tmp in emit_align_check
- pop-push value from stack rather than peeking in emit_align_check
73bab48 to cd24682
@saulecabrera I made the final changes
This PR implements x64 atomic store operations:
i32.atomic.store8
i32.atomic.store16
i32.atomic.store
i64.atomic.store8
i64.atomic.store16
i64.atomic.store32
i64.atomic.store
#9734
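As a quick reference, the access width implied by each instruction name above, which is also the alignment the address must satisfy, can be sketched as a lookup. The function name `access_size_bytes` is hypothetical and exists only for this illustration.

```rust
/// Returns the number of bytes written by each atomic store listed above,
/// or `None` for any other instruction name.
pub fn access_size_bytes(op: &str) -> Option<u32> {
    match op {
        "i32.atomic.store8" | "i64.atomic.store8" => Some(1),
        "i32.atomic.store16" | "i64.atomic.store16" => Some(2),
        // A full-width i32 store and the narrow i64 store32 both write 4 bytes.
        "i32.atomic.store" | "i64.atomic.store32" => Some(4),
        "i64.atomic.store" => Some(8),
        _ => None,
    }
}
```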