Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

seq:add floating point support #6959

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

alexs-sh
Copy link
Contributor

@alexs-sh alexs-sh commented Dec 14, 2024

About

Hello,

This is the initial implementation of hexadecimal floating-point arguments in seq. It started as an attempt to add parsing for hexadecimal floats (issue #6935), but quickly grew into bigger changes.

Before

cargo run -q seq 0x1p-1 2
seq: invalid hexadecimal argument: '0x1p-1'

After

cargo run -q seq 0x1p-1 2
0.5
1.5

Expected

seq 0x1p-1 2
0.5
1.5

Changes

  • Added thehexdecimalfloat module for parsing hex-floats and detecting the precision of numbers closer to GNU seq. I tried to keep the new functionality separate and minimize changes to the existing code to make changes more observable. I'm sure it’s possible unify number parsing to handle all types of values in one place. However, this seems like a big change and (IMHO) deserves a separate refactor-like PR.

  • Changed the method for detecting and using precision for numbers. The initial approach of using the number of fractional digits from a BigDecimal didn’t match the GNU behavior and led to different representations. For now the precision is detected separately and this is not a part of the PreciseNumber (to avoid changing too much of the existing code).

  • Added decimal/exponent notations for hexadecimal floats to match the %Lg format in GNU seq. The sprintf function from uucore is used for this.

  • Added tests. During development, I found some additional mismatches with GNU seq. However, they are not directly related to hexadecimal floating points and could be addressed in future PRs.

Thank you

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch 2 times, most recently from bb32693 to 6ef9921 Compare December 15, 2024 09:20
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from b25be31 to 4f55794 Compare December 15, 2024 11:34
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch 3 times, most recently from 0fa1c61 to 5284545 Compare December 16, 2024 08:51
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from 5284545 to dc44647 Compare December 16, 2024 11:56
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

tests/by-util/test_seq.rs Outdated Show resolved Hide resolved
@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from dc44647 to 4d9a86e Compare December 17, 2024 18:59
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

/// ```
pub fn parse_hexadecimal_float(s: &str) -> Result<PreciseNumber, ParseNumberError> {
let value = parse_float(s)?;
let number = BigDecimal::from_f64(value).ok_or(ParseNumberError::Float)?;
Copy link
Contributor

@gim913 gim913 Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case of floating points, seq already supports arbitrary precision floating points, i.e.:
uu_seq 1e9865 1e9864 2e9865
the above limits hexadecimal support to f64, so p1023 is maximal binary exponent supported.

(i.e. uu_seq 0x1p1023 0x1p1023 will work, but uu_seq 0x1p1024 0x1p1024 will not)

(not sure if that is an issue or not, but imo it would be nice to have arbitrary precision supported in both variants)

Copy link
Contributor Author

@alexs-sh alexs-sh Dec 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gim913 Thanks, this is a good point. I also thought about it. And:

  1. I decided to start with native & simple f64 just to have something to demonstrate and discuss. :)
  2. arbitrary precision is a great option, but we should also consider compatibility with the original seq. From what I can see, GNU seq operates on long double's with a 15-bit exponent. This should serve as a bound for arbitrary precision. For example,
seq 0x1.4p16383 0
seq 0x1.4p16384 0
seq: invalid floating point argument: ‘0x1.4p16384’
Try 'seq --help' for more information.

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from 4d9a86e to a1f4acf Compare December 26, 2024 14:03
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

2 similar comments
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from db975aa to 8b6317f Compare December 26, 2024 18:48
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch 2 times, most recently from 29601c8 to 3b2664c Compare December 28, 2024 14:07
Initial version of a hexadecimal float parser. This implementation is
intended for preview and further discussion.

Issue uutils#6935
Turn on the float parser. Now it's possible to use hexadecimal floats as
parameters. For example,

    cargo run -- 0x1p-1 3
    0.5
    1.5
    2.5

Issue uutils#6935
Fix the test that fails after the implementation of the floating-point
format. The expected output now matches the GNU seq output.

    seq 0xlmnop
    seq: invalid floating point argument: ‘0xlmnop’

Issue uutils#6935
Comment on lines 830 to 877
#[test]
fn test_parse_valid_hexadecimal_float() {
new_ucmd!()
.args(&["0x1p-1", "2"])
.succeeds()
.stdout_only("0.5\n1.5\n");

new_ucmd!()
.args(&["1", "0x1p-1", "2"])
.succeeds()
.stdout_only("1\n1.5\n2\n");

new_ucmd!()
.args(&["0x3.4p-1", "0x4p-1", "4"])
.succeeds()
.stdout_only("1.625\n3.625\n");

new_ucmd!()
.args(&["0x.8p16", "32768"])
.succeeds()
.stdout_only("32768\n");

// from issue #2660
new_ucmd!()
.args(&["-0x.ep-3", "-0x.1p-3", "-0x.fp-3"])
.succeeds()
.stdout_only("-0.109375\n-0.117188\n");

new_ucmd!()
.args(&["0xffff.4p-4", "4096"])
.succeeds()
.stdout_only("4095.95\n");

new_ucmd!()
.args(&["0xA.A9p-1", "6"])
.succeeds()
.stdout_only("5.33008\n");

new_ucmd!()
.args(&["0xa.a9p-1", "6"])
.succeeds()
.stdout_only("5.33008\n");

new_ucmd!()
.args(&["0xffffffffffp-30", "1024"]) // spell-checker:disable-line
.succeeds()
.stdout_only("1024\n");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[test]
fn test_parse_valid_hexadecimal_float() {
new_ucmd!()
.args(&["0x1p-1", "2"])
.succeeds()
.stdout_only("0.5\n1.5\n");
new_ucmd!()
.args(&["1", "0x1p-1", "2"])
.succeeds()
.stdout_only("1\n1.5\n2\n");
new_ucmd!()
.args(&["0x3.4p-1", "0x4p-1", "4"])
.succeeds()
.stdout_only("1.625\n3.625\n");
new_ucmd!()
.args(&["0x.8p16", "32768"])
.succeeds()
.stdout_only("32768\n");
// from issue #2660
new_ucmd!()
.args(&["-0x.ep-3", "-0x.1p-3", "-0x.fp-3"])
.succeeds()
.stdout_only("-0.109375\n-0.117188\n");
new_ucmd!()
.args(&["0xffff.4p-4", "4096"])
.succeeds()
.stdout_only("4095.95\n");
new_ucmd!()
.args(&["0xA.A9p-1", "6"])
.succeeds()
.stdout_only("5.33008\n");
new_ucmd!()
.args(&["0xa.a9p-1", "6"])
.succeeds()
.stdout_only("5.33008\n");
new_ucmd!()
.args(&["0xffffffffffp-30", "1024"]) // spell-checker:disable-line
.succeeds()
.stdout_only("1024\n");
}
#[test]
fn test_parse_valid_hexadecimal_float() {
let test_cases = [
(&["0x1p-1", "2"], "0.5\n1.5\n"),
(&["1", "0x1p-1", "2"], "1\n1.5\n2\n"),
(&["0x3.4p-1", "0x4p-1", "4"], "1.625\n3.625\n"),
(&["0x.8p16", "32768"], "32768\n"),
(&["-0x.ep-3", "-0x.1p-3", "-0x.fp-3"], "-0.109375\n-0.117188\n"),
(&["0xffff.4p-4", "4096"], "4095.95\n"),
(&["0xA.A9p-1", "6"], "5.33008\n"),
(&["0xa.a9p-1", "6"], "5.33008\n"),
(&["0xffffffffffp-30", "1024"], "1024\n"),
];
for (args, expected_output) in &test_cases {
new_ucmd!()
.args(args)
.succeeds()
.stdout_only(expected_output);
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be enough :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot. It looks much better now. Let me add & try changes locally...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only one issue I encountered. Compiler complains about &["0x1p-1", "2"] and &["1", "0x1p-1", "2"] which have different types: [&str; 2] vs [&str; 3]). However, vectors work well in this case.

Copy link
Contributor

@sylvestre sylvestre Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i didn't try my change.
you can keep it separately

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already pushed a version with vectors :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are now split for 2 and 3 arguments

alexs-sh and others added 19 commits January 9, 2025 19:39
Let's update the structure and add separate functions for parsing each
part of the number. It's simpler and allows us to maintain a consistent
level of detail
This commit adds special cases for displaying floating-point values,
making the behavior closer to the original GNU seq.
Add some tests from GNU coreutils that fail quite often. Feel free to
add more.
Let's drop some latest modifications and start once again
Let's try to implement GNU-like precision detection. These are early
steps to test the idea, and it will likely be modified and improved
Let's add and test a more GNU-like version of number precision.
@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from bb7b656 to f4b58f1 Compare January 9, 2025 18:39
alexs-sh and others added 2 commits January 9, 2025 20:04
Keep the test cases for 2 and 3 arguments separately.

Co-authored-by: Sylvestre Ledru <[email protected]>
@alexs-sh alexs-sh force-pushed the feature/seq-add-floating-point-support-6935 branch from 97e82d8 to 299814c Compare January 9, 2025 19:35
Copy link

github-actions bot commented Jan 9, 2025

GNU testsuite comparison:

GNU test failed: tests/misc/stdbuf. tests/misc/stdbuf is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants