Floating-Point Rounding Modes #73

Closed
ProfPierce opened this issue May 18, 2020 · 22 comments

Comments

@ProfPierce

I've been trying to understand the implementation of the floating-point instructions in RARS. It seems that for the computational instructions, a rounding mode must be specified as the fourth operand in the instruction. The language specification indicates that this is an optional field and that, if it is not specified in the instruction, rounding defaults to the value contained in the frm field of the fcsr. It is recommended that this field be 000 by default, for Round to Nearest, ties to Even. If I don't include one of the valid mode options (dyn, rne, rtz, etc.), the assembler produces an error stating that pseudoinstructions aren't allowed. I have pseudoinstructions disabled because I want to use only basic instructions. Observing the fcsr register on the Control and Status tab, it appears the rounding mode is zero, as I would expect. What am I missing? Is it the intention in RARS to require that all instructions include explicit rounding modes as opposed to using the default in fcsr? Thank you for your time.

@TheThirdOne
Owner

TheThirdOne commented May 18, 2020

My understanding

For RARS assembly, if you want to use dynamic rounding in a floating-point operation without using pseudo-instructions, you need to explicitly include the rounding mode dyn. This fills the 3 rm bits in the instruction encoding with 0b111, which at execution time means the rounding mode is taken from the frm CSR.

I think you may be misunderstanding dyn as something that specifies a rounding mode on its own. Instead, you should think of it as the rounding mode that defers to the frm rounding mode.
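
To make the encoding concrete, here is a minimal Python sketch (not RARS code) that packs fmul.s into a 32-bit instruction word. The helper name encode_fmul_s is hypothetical; the field positions, the opcode, and the rm values are taken from the RISC-V F extension encoding.

```python
# Rounding-mode (rm) field values defined by the RISC-V F extension.
RM = {"rne": 0b000, "rtz": 0b001, "rdn": 0b010, "rup": 0b011, "rmm": 0b100, "dyn": 0b111}

OP_FP = 0b1010011          # major opcode shared by the FP computational instructions
FUNCT7_FMUL_S = 0b0001000  # selects fmul.s within OP-FP


def encode_fmul_s(rd: int, rs1: int, rs2: int, rm: str) -> int:
    """Hypothetical helper: pack `fmul.s rd, rs1, rs2, rm` into a 32-bit word."""
    return (
        (FUNCT7_FMUL_S << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (RM[rm] << 12)   # bits 14:12 hold the rounding mode
        | (rd << 7)
        | OP_FP
    )


word = encode_fmul_s(1, 2, 3, "dyn")   # fmul.s f1, f2, f3, dyn
print(hex(word))                       # 0x103170d3
print(bin((word >> 12) & 0b111))       # 0b111, i.e. "look at frm when executing"
```

With "rne" in place of "dyn" only bits 14:12 change, which is why the rounding mode always has to end up somewhere in the basic instruction.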

To give you a little context on what the pseudo-instructions do, the following is the definition of fmul.s without a rounding mode specified. Essentially, all it does is add ", dyn" to the end of the instruction.

fmul.s    f1, f2, f3     ; fmul.s    RG1, RG2, RG3, dyn     ;#Floating MULtiply: assigns f1 to f2 * f3
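
The effect of that definition can be sketched in a few lines of Python. This is only an illustration of the rewrite, not how RARS implements pseudo-instruction expansion; the function name expand_default_rounding is hypothetical, and it assumes a three-operand instruction such as fmul.s.

```python
def expand_default_rounding(line: str) -> str:
    """Hypothetical expander: append 'dyn' when the rounding mode is omitted."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    if len(operands) == 3:       # rd, rs1, rs2 only: rounding mode was left out
        operands.append("dyn")   # defer to the frm field at run time
    return f"{mnemonic} {', '.join(operands)}"


print(expand_default_rounding("fmul.s f1, f2, f3"))       # fmul.s f1, f2, f3, dyn
print(expand_default_rounding("fmul.s f1, f2, f3, rtz"))  # unchanged
```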

The reason I don't allow the rounding mode to be omitted without pseudo-instructions is partly a technical limitation that results from several design decisions (and one specification point):

  • Two basic instructions cannot overlap in their binary encoding.
  • Each basic instruction has only one assembly format that will produce it.
  • Pseudo-instructions use basic instructions through the same interface as normally written code.
  • It must be possible to specify the rounding mode.

The reason I have not tried to change any of those points is that pseudo-instructions only save you from explicitly writing out the rounding mode, and if you have pseudo-instructions disabled, it probably means you want things to be explicit rather than implicit.

Action Items

Just using the rounding mode dyn should let you write programs that work as you intend. If you think that omitting the rounding mode without pseudo-instructions should still be possible, I will need some convincing. Perhaps an alternative solution is that some simple pseudo-instructions could be allowed whereas more complex ones would not.

I would appreciate knowing your reasoning for wanting no pseudo-instructions. Most of them are quite simple and are mandated by the specification.

If you want a fuller explanation of how the technical points make an omitted rounding mode impossible without pseudo-instructions, feel free to ask.

@ProfPierce
Author

ProfPierce commented May 19, 2020 via email

@TheThirdOne
Owner

I am glad that it has been helpful.

> I use the Patterson/Hennessy text books and even though they present the 64-bit version of RISC-V, the 32-bit version is based on the same design.

I actually only learned that the Patterson/Hennessy book uses RV64 yesterday. I don't own a RISC-V copy and hadn't heard from anyone before about it.

Would RARS supporting RV64 be helpful? When initially porting RARS, I dismissed it because I thought it was not necessary, but recently I have been thinking about it more because professional tooling for RV32 is shaky and running programs compiled from C would be a nice feature.

I would certainly be open to adding an option to switch between RV32 and RV64 if there was interest.

@ProfPierce
Author

ProfPierce commented May 19, 2020 via email

@TheThirdOne
Owner

I just created another issue that will track progress on implementing RV64. Most of the work will come in the form of reading the specification closely to see how rv32 and rv64 differ. I might have it done before the fall, but don't count on it.

@ProfPierce
Author

ProfPierce commented May 19, 2020 via email

@BenjaminBeichler
Contributor

I would also appreciate RV64 support, but I have also been using RARS for 2 years without it ;-) Unfortunately, I have no capacity to support the development, except for testing (including by my students).

👍

@TheThirdOne
Owner

@ProfPierce and @BenjaminBeichler, I have finished the bulk of the work on the rv64 mode. A jar can be found at https://github.com/TheThirdOne/rars/releases/tag/pr78 . Let me know if something doesn't work correctly.

@BenjaminBeichler
Contributor

Nice! I will have a look, and report any problems I see.

@ProfPierce
Author

ProfPierce commented Jun 23, 2020 via email

@TheThirdOne
Owner

TheThirdOne commented Jun 23, 2020

@ProfPierce, your expectations are absolutely correct. You hit a bug in the .dword implementation: I didn't explicitly handle the case where the value fits in 32 bits and expected the other code to handle it correctly. I pushed a fix in af804a5. Thanks for pointing this out.
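
The thread does not show what the faulty output looked like, so the following Python sketch only illustrates the expected result, not the fix in af804a5: a .dword value should occupy eight bytes even when it would fit in 32 bits. The name emit_dword is hypothetical, and little-endian byte order is assumed.

```python
def emit_dword(value: int) -> bytes:
    """Hypothetical emitter: always produce 8 little-endian bytes for one .dword value."""
    # Masking to 64 bits lets negative values wrap to their two's-complement form.
    return (value & 0xFFFF_FFFF_FFFF_FFFF).to_bytes(8, "little")


small = emit_dword(1)               # fits in 32 bits, still padded to 8 bytes
large = emit_dword(0x1_0000_0000)   # needs more than 32 bits
print(len(small), len(large))       # 8 8
```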

BTW, I can't see the attachment through the GitHub interface.

@ProfPierce
Author

ProfPierce commented Jun 24, 2020 via email

@TheThirdOne
Owner

TheThirdOne commented Jul 8, 2020

I have updated the jar with the latest developments. It should be correct and complete aside from updates to system calls. Currently all of the system calls only consider the bottom 32 bits.

I am considering merging this in as is, without updating the system calls to go with it, because I am feeling particularly demotivated to work on them and I would like to get a new release out before most schools' semesters start.

@ProfPierce
Author

ProfPierce commented Jul 9, 2020 via email

@BenjaminBeichler-Bot

@TheThirdOne good job. I think it is entirely sufficient to note in the documentation that syscalls only consider the bottom 32 bits. It takes 3 lines to shift the upper part down if it is needed for printing; the "user" only needs to know.
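
The workaround amounts to splitting a 64-bit value into two 32-bit halves and handing each half to the syscall separately. A minimal Python sketch of that arithmetic, with the hypothetical helper name split_halves, could look like this:

```python
from typing import Tuple

MASK32 = 0xFFFF_FFFF


def split_halves(value: int) -> Tuple[int, int]:
    """Hypothetical helper: return (upper, lower) 32-bit halves of a 64-bit value."""
    upper = (value >> 32) & MASK32   # shift the top half down into the low 32 bits
    lower = value & MASK32           # what a 32-bit-only syscall would see
    return upper, lower


upper, lower = split_halves(0x1122334455667788)
print(hex(upper), hex(lower))        # 0x11223344 0x55667788
```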

@ProfPierce
Author

ProfPierce commented Jul 12, 2020 via email

@TheThirdOne
Owner

@ProfPierce, if you are downloading from https://github.com/TheThirdOne/rars/releases/tag/continuous, that is not the correct place to look. Until I make a release for v1.5, which should happen sometime in the next two weeks, https://github.com/TheThirdOne/rars/releases/tag/pr78 is the place to download from.

The continuous release is built automatically, and I made some mistakes in setting it up that make it more confusing than it should be. I committed a fix for #82, which triggered a rebuild of the main branch without support for rv64. The pr78 release is manually updated by me and is dedicated to providing a build until rv64 is merged into the master branch.

@ProfPierce
Author

ProfPierce commented Jul 13, 2020 via email

@TheThirdOne
Owner

I just published the release that adds support for rv64 (https://github.com/TheThirdOne/rars/releases/tag/v1.5).

@ProfPierce
Author

ProfPierce commented Aug 8, 2020 via email

@TheThirdOne
Owner

> I was trying to find this in the spec and cannot. All the references to the .dword directive indicate they should begin at memory addresses that are 8-byte boundaries. Maybe you can help me out to understand if this is indeed the way it is supposed to work and reference me to this in the spec.

I don't recall assembler directives being mentioned anywhere in the RISC-V specification. As a general rule, interactions with the CPU are well specified for RISC-V, but how tooling works is based more on the conventions of tools for other architectures and on what early RISC-V tools did.

I just tested and found that GNU AS also does not align .dword to double-word boundaries. It doesn't align the smaller directives to their natural alignment either, though. So RARS is the odd one out in that it does align the smaller directives.

> This isn't a show stopper by any means and I can go with the current implementation. However, I would like some explanation to offer to my students who will come across this when working on assembly language projects.

I decided to implement .dword (and ld) to not need alignment simply because it was easier. I am open to changing it, but I would be more in favor of matching AS by dropping natural alignment for the other directives than of adding alignment to .dword.
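
To show students what these choices mean for addresses, here is a small Python sketch that computes where each data directive would start under three policies. It models only the behavior described in this thread (no implicit alignment in GNU AS; .half and .word aligned but .dword not in RARS); the function layout and the example directive list are hypothetical.

```python
SIZES = {".byte": 1, ".half": 2, ".word": 4, ".dword": 8}


def layout(directives, aligned):
    """Return the starting offset of each directive.

    `aligned` is the set of directives that get padded to their natural alignment.
    """
    offsets, addr = [], 0
    for d in directives:
        size = SIZES[d]
        if d in aligned:
            addr = (addr + size - 1) // size * size   # round up to a multiple of size
        offsets.append(addr)
        addr += size
    return offsets


data = [".byte", ".word", ".byte", ".dword"]
packed = layout(data, aligned=set())               # no implicit alignment
mixed = layout(data, aligned={".half", ".word"})   # smaller directives aligned, .dword not
natural = layout(data, aligned=set(SIZES))         # everything naturally aligned
print(packed)    # [0, 1, 5, 6]
print(mixed)     # [0, 4, 8, 9]
print(natural)   # [0, 4, 8, 16]
```

In the mixed case the .dword lands at offset 9, so any 8-byte load from it is misaligned unless the program aligns the data manually.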

One other thing to note is that RARS currently traps on misaligned loads. The specification allows this, but only when an exception handler makes the loads still work, even if more slowly and non-atomically. Trapping on misaligned loads should become an optional feature.
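
As a rough model of the two behaviors, the Python sketch below either raises on a misaligned address or falls back to assembling the value one byte at a time, which is roughly what an emulating handler would do. The names MisalignedLoad, load, and trap_on_misaligned are hypothetical, and little-endian memory is assumed; this is not RARS code.

```python
class MisalignedLoad(Exception):
    """Stands in for the trap raised on a misaligned access."""


def load(memory: bytes, addr: int, size: int, trap_on_misaligned: bool = True) -> int:
    """Hypothetical model of a `size`-byte little-endian load."""
    if addr % size != 0 and trap_on_misaligned:
        raise MisalignedLoad(f"{size}-byte load at address {addr:#x}")
    # Slow path: build the value from individual byte loads, lowest byte first.
    value = 0
    for i in range(size):
        value |= memory[addr + i] << (8 * i)
    return value


memory = bytes(range(16))
print(hex(load(memory, 8, 8)))                            # aligned, always works
print(hex(load(memory, 1, 4, trap_on_misaligned=False)))  # misaligned, emulated bytewise
```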

The best explanation I can see for students of the behavior of GNU AS is that it is much harder to get around forced natural alignment than it is to align manually when you need to.

@ProfPierce
Author

ProfPierce commented Aug 9, 2020 via email

4 participants