Adds utilities for AMD fp8 dtype support, with a follow-up PR to add the option to the configs #235
Conversation
fp8_tensor = to_fp8_no_autograd(
    gradY, gradY_scale, torch.float8_e5m2, ctx.emulate
)
fp8_dtype = torch.float8_e5m2fnuz if IS_AMD else torch.float8_e5m2
should this be configurable in the forward with a reasonable default?
I wonder if it would be better to have a torch.backends.[cuda|hip|mps].supports_dtype(XYZ) API, because I assume XPUs would probably use the fnuz flavor, but say ARM would be using the e5m2 flavor.
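
For illustration, a minimal sketch of what such a capability query could look like, assuming a ROCm build is detected via torch.version.hip. The function name backend_supports_dtype is hypothetical, and the proposed torch.backends API does not exist; this only mirrors the idea in the comment above:

import torch

def backend_supports_dtype(dtype: torch.dtype) -> bool:
    # Hypothetical stand-in for a torch.backends.<device>.supports_dtype API.
    # Only the fp8 flavors discussed in this thread are distinguished.
    is_rocm = torch.version.hip is not None
    if dtype in (torch.float8_e4m3fnuz, torch.float8_e5m2fnuz):
        return is_rocm       # fnuz flavors: AMD (and possibly XPU)
    if dtype in (torch.float8_e4m3fn, torch.float8_e5m2):
        return not is_rocm   # OCP flavors: NVIDIA (and, per the thread, ARM)
    return True              # non-fp8 dtypes: assume supported here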
I think this should be configurable instead of depending on the environment; numerics should be as predictable as possible. It's also valuable to be able to emulate numerics without having the hardware, for debugging.
Okay, so it sounds like we want this to be defined at module construction. It is up to the constructor of the module to ensure that _scaled_mm will work with their module.
I do think that Nikita's backend dtype helper could be a useful PyTorch feature; not sure if that should live here, though.
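
A rough sketch of the "defined at module construction" direction. The class name and the fwd_fp8_dtype/bwd_fp8_dtype/emulate constructor arguments are hypothetical and do not reflect the actual Float8Linear interface in this repo:

import torch
import torch.nn as nn

class ConfigurableFloat8Linear(nn.Linear):
    # Illustrative only: the caller chooses the fp8 formats when building the
    # module, so numerics do not silently change with the environment, and it
    # is the caller's job to pick dtypes that _scaled_mm supports on their
    # hardware (or to set emulate=True when the hardware is not available).
    def __init__(
        self,
        in_features: int,
        out_features: int,
        bias: bool = True,
        fwd_fp8_dtype: torch.dtype = torch.float8_e4m3fn,
        bwd_fp8_dtype: torch.dtype = torch.float8_e5m2,
        emulate: bool = False,
    ):
        super().__init__(in_features, out_features, bias=bias)
        self.fwd_fp8_dtype = fwd_fp8_dtype
        self.bwd_fp8_dtype = bwd_fp8_dtype
        self.emulate = emulate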
The non-emulated version is failing for me on an MI300 machine using this version of PyTorch:

pytorch-triton-rocm==3.0.0+0a22a91d04
torch==2.3.0.dev20240308+rocm6.0

FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[True-LinearType.DYNAMIC-x_shape0-False] - AssertionError: -2.7592885494232178 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[True-LinearType.DYNAMIC-x_shape1-False] - AssertionError: -3.372152805328369 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[True-LinearType.DYNAMIC-x_shape2-False] - AssertionError: -2.8420748710632324 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DELAYED-x_shape0-False] - AssertionError: -2.7584447860717773 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DELAYED-x_shape1-False] - AssertionError: -2.946033239364624 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DELAYED-x_shape2-False] - AssertionError: -2.756319999694824 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DYNAMIC-x_shape0-False] - AssertionError: -3.377957820892334 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DYNAMIC-x_shape1-False] - AssertionError: -3.0644452571868896 is too low
FAILED test/test_base.py::TestFloat8Linear::test_linear_nobias[False-LinearType.DYNAMIC-x_shape2-False] - AssertionError: -3.091813564300537 is too low
@drisspg Is this mergeable? The errors above should have been fixed with this: pytorch/pytorch#125921
@alugorey ahh no worries, let me rebase and whip this PR back into shape
Force-pushed from b1163d1 to e29cc35
@drisspg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
float8_experimental/float8_utils.py (Outdated)
elif float8_dtype == torch.float8_e4m3fnuz:
    res = E4M3_FNUZ_MAX_POS / torch.clamp(amax, min=EPS)
elif float8_dtype == torch.float8_e5m2:
    res = E5M2_MAX_POS / torch.clamp(amax, min=EPS)
elif float8_dtype == torch.float8_e5m2fnuz:
    res = E5M2_FNUZ_MAX_POS / torch.clamp(amax, min=EPS)
Please avoid code duplication.
Suggested change:

- elif float8_dtype == torch.float8_e4m3fnuz:
-     res = E4M3_FNUZ_MAX_POS / torch.clamp(amax, min=EPS)
- elif float8_dtype == torch.float8_e5m2:
-     res = E5M2_MAX_POS / torch.clamp(amax, min=EPS)
- elif float8_dtype == torch.float8_e5m2fnuz:
-     res = E5M2_FNUZ_MAX_POS / torch.clamp(amax, min=EPS)
+ elif float8_dtype in [torch.float8_e4m3fnuz, torch.float8_e5m2, torch.float8_e5m2fnuz]:
+     res = torch.finfo(float8_dtype).max / torch.clamp(amax, min=EPS)
Actually, you don't even need the ifs there; just assert that float8_dtype is indeed an fp8 dtype:

assert float8_dtype.itemsize == 1 and float8_dtype.is_floating_point
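
Combining both suggestions, the scale computation could collapse to something like the sketch below. The function name amax_to_scale and the EPS constant are assumed from the surrounding file and may differ from the actual code:

import torch

EPS = 1e-12  # placeholder; the real module defines its own epsilon

def amax_to_scale(amax: torch.Tensor, float8_dtype: torch.dtype) -> torch.Tensor:
    # Any 1-byte floating-point dtype is an fp8 flavor, and torch.finfo exposes
    # its max representable value, so no per-dtype branches are needed.
    assert float8_dtype.itemsize == 1 and float8_dtype.is_floating_point
    return torch.finfo(float8_dtype).max / torch.clamp(amax, min=EPS)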
Force-pushed from 7feb581 to 5da5b5c
@drisspg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary
AMD GPUs support a different fp8 dtype than NVIDIA. These dtypes were added to PyTorch, and this PR updates Float8Tensor construction to use the format that matches the architecture.
For a detailed summary see: https://github.com/openxla/stablehlo/blob/main/rfcs/20230321-fp8_fnuz.md
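
For context, the arch-dependent selection boils down to something like the sketch below. Detecting a ROCm build via torch.version.hip is an assumption about how IS_AMD could be computed, not necessarily how this PR implements it:

import torch

# Assumption: a ROCm build (torch.version.hip is set) means AMD fp8 flavors.
IS_AMD = torch.version.hip is not None

# OCP formats on NVIDIA, fnuz formats on AMD (see the StableHLO RFC above).
e4m3_dtype = torch.float8_e4m3fnuz if IS_AMD else torch.float8_e4m3fn
e5m2_dtype = torch.float8_e5m2fnuz if IS_AMD else torch.float8_e5m2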