
winch: x64 atomic stores #9987

Merged
merged 17 commits into from
Jan 15, 2025

Conversation

MarinPostma
Contributor

This PR implements x64 store operations:

  • i32.atomic.store8
  • i32.atomic.store16
  • i32.atomic.store
  • i64.atomic.store8
  • i64.atomic.store16
  • i64.atomic.store32
  • i64.atomic.store

#9734

@MarinPostma MarinPostma requested review from a team as code owners January 12, 2025 12:10
@MarinPostma MarinPostma requested review from fitzgen and removed request for a team January 12, 2025 12:10
@github-actions github-actions bot added the winch Winch issues or pull requests label Jan 12, 2025

Member

@saulecabrera saulecabrera left a comment


One thing that we might want to do as part of this change, before proceeding further with the development of the rest of the instructions in the proposal, is to ensure that addresses are aligned for atomic loads/stores. See the alignment section of the proposal.

To check for alignment, I'd recommend creating a new method in the CodeGen module (e.g., emit_check_alignment) and either calling it before calling emit_compute_heap_addr or enhancing emit_compute_heap_addr to check alignment internally if the load/store is atomic.

For reference, this is how alignment is handled in Cranelift.
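The check being requested is the one the threads proposal mandates: an atomic access traps if its effective address (base address plus static offset) is not a multiple of the access size. Below is a minimal host-side sketch of that predicate; the helper name `is_misaligned` is illustrative, and the actual Winch change emits this check as machine code rather than evaluating it at compile time.

```rust
/// Hypothetical model of the alignment predicate: an atomic access at
/// `addr + offset` of `size_bytes` is misaligned when the effective
/// address is not a multiple of the access size.
fn is_misaligned(addr: u64, offset: u64, size_bytes: u64) -> bool {
    debug_assert!(size_bytes.is_power_of_two());
    // For power-of-two sizes, `x % size == 0` is `x & (size - 1) == 0`.
    addr.wrapping_add(offset) & (size_bytes - 1) != 0
}

fn main() {
    assert!(!is_misaligned(0x1000, 0, 4)); // aligned i32.atomic.store
    assert!(is_misaligned(0x1001, 0, 4));  // misaligned base -> trap
    assert!(is_misaligned(0x1000, 2, 4));  // offset breaks alignment
    assert!(!is_misaligned(0x1003, 1, 4)); // offset can restore alignment
    println!("ok");
}
```

Note that the offset must be folded in before checking: a perfectly aligned base can still yield a misaligned effective address, which is why the discussion below centers on where the `base + offset` addition happens.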

// TODO: we don't support 128-bit atomic store yet.
bail!(CodeGenError::unexpected_operand_size());
}
// To stay consistent with cranelift, we emit a normal load followed by a mfence,
Member


Suggested change
// To stay consistent with cranelift, we emit a normal load followed by a mfence,
// To stay consistent with cranelift, we emit a normal store followed by a mfence,

@fitzgen fitzgen removed their request for review January 13, 2025 19:42
@MarinPostma MarinPostma force-pushed the winc-x64-atomic-store branch 2 times, most recently from 0b53b93 to 4dfacab Compare January 13, 2025 20:07
@MarinPostma
Contributor Author

@saulecabrera I have added the align check

@MarinPostma MarinPostma force-pushed the winc-x64-atomic-store branch 2 times, most recently from 02fa02a to e462bb3 Compare January 13, 2025 20:31
) -> Result<Option<Reg>> {
if check_align {
self.check_align(memarg, access_size)?;
Member


Generally in the codegen module we try to stick with the emit_* prefix; for consistency, could you update the name of this method?

@@ -648,7 +648,12 @@ where
&mut self,
memarg: &MemArg,
access_size: OperandSize,
check_align: bool,
Member


What do you think of either:

  • Passing an enum here describing the type of heap address computation: (HeapAddress::AlignChecked, HeapAddress::AlignUnchecked)
  • Creating a wrapper method emit_compute_heap_address_align_checked, which internally calls emit_check_align and emit_compute_heap_address?

Heap address calculation is a very critical piece of the compiler, so I'd recommend against passing boolean params, to make call sites less error-prone.
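The first alternative the reviewer suggests could look like the sketch below. `HeapAddress` and its variants are named in the comment; the function body is a stand-in that records which path was taken, since the real method emits code.

```rust
/// Sketch of replacing `check_align: bool` with a self-documenting enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum HeapAddress {
    AlignChecked,
    AlignUnchecked,
}

/// Stand-in for the real emission method: returns a description of the
/// steps it would emit, so the dispatch is observable.
fn emit_compute_heap_address(kind: HeapAddress) -> &'static str {
    match kind {
        HeapAddress::AlignChecked => "emit_check_align; compute_addr",
        HeapAddress::AlignUnchecked => "compute_addr",
    }
}

fn main() {
    // The call site now reads unambiguously, unlike a bare `true`/`false`:
    assert_eq!(
        emit_compute_heap_address(HeapAddress::AlignChecked),
        "emit_check_align; compute_addr"
    );
    assert_eq!(
        emit_compute_heap_address(HeapAddress::AlignUnchecked),
        "compute_addr"
    );
    println!("ok");
}
```

Either variant of the suggestion has the same effect: a reader at the call site sees `AlignChecked` (or a dedicated `_align_checked` wrapper) instead of an opaque boolean.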

Comment on lines 851 to 853
let addr = *self.context.stack.peek().unwrap();
let tmp = self.context.any_gpr(self.masm)?;
self.context.move_val_to_reg(&addr, tmp, self.masm)?;
Member


I'd recommend using self.context.pop_to_reg, which already handles all the cases for moving a value to a register; e.g., in this case, if the value is already in a register, this code will emit a move that could be avoided. More importantly, pop_to_reg will reduce register pressure.

Member


One thing to note with using self.context.pop_to_reg is that this method needs to ensure that the value is pushed back to the value stack after all the checks are emitted, to ensure that emit_compute_heap_address is able to pop the address.

Contributor Author


I've been playing around with that, but it turns out that we need to move the address to a register anyway, because we potentially need to add the offset to the address, and we cannot do that with the address register as the destination, since we need the address intact for computing the heap address later on. The reason we can't add with tmp as the destination is that it returns an InvalidTwoArgumentForm error.

In light of this, I think peeking and moving to tmp would save a mov in the case where we need to compute the offset. WDYT?

Member


Oh, that's a good point, yeah. My main concern with peek is that nothing prevents some other method from accidentally popping the wrong value from the stack. However, since this pattern is local to this method, I don't think there's a huge risk, so yeah, let's try the peek approach. One comment on your original implementation: could you return an error instead of doing peek().unwrap()? We have CodeGenError::missing_values_in_stack for this kind of situation.

fn check_align(&mut self, memarg: &MemArg, size: OperandSize) -> Result<()> {
if size.bytes() > 1 {
let addr = *self.context.stack.peek().unwrap();
let tmp = self.context.any_gpr(self.masm)?;
Member


I believe you could use a scratch register here? (scratch!(M))

Contributor Author


I'm a bit worried about the risk of clobbering a scratch register here. The register is used across multiple masm operations, and we use scratch registers in masm often.

@saulecabrera saulecabrera removed the request for review from a team January 14, 2025 14:16
@MarinPostma MarinPostma force-pushed the winc-x64-atomic-store branch from 491c1a7 to 26d9841 Compare January 14, 2025 14:58
Member

@saulecabrera saulecabrera left a comment


Looks good, thanks!

After the trailing comment is deleted, we can land this one.

if size.bytes() > 1 {
let addr = self.context.pop_to_reg(self.masm, None)?;
let tmp = scratch!(M);
// self.context.move_val_to_reg(&addr, tmp, self.masm)?;
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// self.context.move_val_to_reg(&addr, tmp, self.masm)?;

@MarinPostma MarinPostma force-pushed the winc-x64-atomic-store branch from 26d9841 to 25a703c Compare January 14, 2025 15:22
@MarinPostma
Contributor Author

@saulecabrera it's not working yet; I had to move to an x86 machine to debug. I'll ping you when I've fixed it.

@MarinPostma
Contributor Author

@saulecabrera, I have made the changes you requested and responded to your comments; let me know if you'd rather I make the changes I suggested in the comments instead.

This was referenced Jan 15, 2025
@MarinPostma MarinPostma force-pushed the winc-x64-atomic-store branch from 73bab48 to cd24682 Compare January 15, 2025 13:04
@MarinPostma
Contributor Author

@saulecabrera I made the final changes

@saulecabrera saulecabrera added this pull request to the merge queue Jan 15, 2025
Merged via the queue into bytecodealliance:main with commit 3a4cf0a Jan 15, 2025
39 checks passed