Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undeprecate ld [$ff00+c] #1600

Open
Rangi42 opened this issue Jan 5, 2025 · 4 comments
Open

Undeprecate ld [$ff00+c] #1600

Rangi42 opened this issue Jan 5, 2025 · 4 comments
Assignees
Labels
rgbasm This affects RGBASM
Milestone

Comments

@Rangi42
Copy link
Contributor

Rangi42 commented Jan 5, 2025

In 0.9.0 we deprecated ld [c] in favor of ldh [c]. We also deprecated ld [$ff00+c], expecting ldh [c] to be the sole accepted syntax, but that was not documented in rgbasm-old(5).

ld [$ff00+c] turns out to be common enough that it could be worth supporting.

There is a problem with doing so. Old versions of RGBDS really did just support [$ff00+c] or single-spaced [$ff00 + c] (it was treated as a single token). However, as we refactored and improved the lexer+parser, this ended up becoming way too general: $ff00 was easier to lex as a number than as a literal (case-insensitive) token. But that means RGBASM can technically parse cursed code like [65280 + c], or [0o177400 + C], or [0xff00+LOW(bc)]. We never intended to allow that sort of flexibility -- and it's actually hard to avoid a parser ambiguity since + is also an operator in numeric expressions.

So. I kind of want to support the semi-popular [$ff00+c]/[$ff00 + c] syntax. But I do not want to give the impression that all those weird nontraditional variants are supported. And ideally those variants would be an outright error.

I'm open to community feedback on this question. (See the Discord poll!)

@Rangi42 Rangi42 added the rgbasm This affects RGBASM label Jan 5, 2025
@Rangi42 Rangi42 added this to the 0.9.1 milestone Jan 5, 2025
@vulcandth
Copy link

Personally, I don't think we should support anything other than ldh [c]. The instruction, is only 1 byte and does not take an address input (outside of register c). Supporting alternative syntaxes, such as [ff00+c], misleads that the instruction takes an address input, which it does not.

@pinobatch
Copy link
Member

I favor treating it like how we do rst nowadays. At the parser level, it'd be ld a, [expr + c], and it'd add an assertion that (expr) == 65280.

@Rangi42
Copy link
Contributor Author

Rangi42 commented Jan 5, 2025

I favor treating it like how we do rst nowadays. At the parser level, it'd be ld a, [expr + c], and it'd add an assertion that (expr) == 65280.

Yes, that's how it's been ever since the lexer rewrite in 0.4.2. (Prior to that, [$ff00+c] and [$ff00 + c] were their own whole tokens, so you couldn't even do two-space [$ff00 + c] or all-space [ $ff00 + c ] or anything else mildly unusual.)

Personally I'd like it if the parser allowed five-token [ number + c ] -- not the fully-general [ expression + c ]. But that causes a "shift/reduce conflict" as the bison-generated parser cannot tell whether number + starts an expression, which is why we went with fully-general expression support there in the first place (thereby allowing all these weird variants of $ff00 that people don't/shouldn't actually use).

@Rangi42
Copy link
Contributor Author

Rangi42 commented Jan 6, 2025

Thanks to the community feedback, here's what I'll probably go with. Still deprecate ld [c] but allow ld [$ff00+c] and ldh [c]. Technically allow (a) the redundant ldh [$ff00+c] and (b) the weird variants ld [65280+c], ld [$ff00+low(bc)], etc; but don't officially document them as allowed, and reserve the "right" to break them somehow later.

@Rangi42 Rangi42 changed the title Undeprecate ld [$ff00+c]? Undeprecate ld [$ff00+c] Jan 8, 2025
@Rangi42 Rangi42 self-assigned this Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rgbasm This affects RGBASM
Projects
None yet
Development

No branches or pull requests

3 participants