amd_detail: fix `atomicMax` and `atomicMin` for floating point types #40

psychocoderHPC · 2022-07-27T09:28:01Z

The return value of atomicMax and atomicMin should be the old value from the destination memory.
This was not the case before the fix.
In cases where no concurrent thread is manipulating the destination the returned values were always the value passed to the atomic function.

test case (for float/double):

#include <assert.h>
#include <stdio.h>

#include "hip/hip_runtime.h"


template<typename T>
__global__ void testAtomicMin(bool* success, T operandOrig)
{
    __shared__ T operand;
    {
        T const value = operandOrig / static_cast<T>(2);
        T const reference = (operandOrig < value) ? operandOrig : value;
        operand = operandOrig;
        T const ret = atomicMin(&operand, value);
        printf("1: %lf==%lf %lf==%lf\n", double(operandOrig), double(ret), double(value),double(reference));
        assert(operandOrig == ret);
        assert(value == reference);
    }
    {

        T const value = static_cast<T>(2)*operandOrig;
        T const reference = (operandOrig < value) ? operandOrig : value;
        operand = operandOrig;
        T const ret = atomicMin(&operand, value);
        printf("2: %lf==%lf %lf==%lf\n", double(operandOrig), double(ret), double(operand),double(reference));

        assert(operandOrig == ret);
        assert(operand == reference);
    }
}

#define HIP_ASSERT(x) (assert((x)==hipSuccess))

int main() {

        bool* deviceResult;

        HIP_ASSERT(hipMalloc((void**)&deviceResult,sizeof(bool)));

        testAtomicMin<<<1,1>>>(deviceResult,42.f);
        testAtomicMin<<<1,1>>>(deviceResult,42);

        HIP_ASSERT(hipDeviceSynchronize());
        HIP_ASSERT(hipFree(deviceResult));


        return 0;
}

output:

1: 42.000000==21.000000 21.000000==21.000000

The return value of `atomicMax` and `atomicMin` should be the old value from the destination memory. This was not the case before the fix. In cases where no concurrent thread is manipulating the destination the returned values were always the value passed to the atomic function.

afanfa · 2022-07-27T20:14:05Z

This issue has been already addressed in an internal version of hipamd. It will eventually make it on this repository as well.

psychocoderHPC · 2022-07-28T15:30:47Z

@afanfa I would say that atomicMin(unsigned long long, unsigned long long) is broken too

https://github.com/ROCm-Developer-Tools/hipamd/blob/3ec1ccdbbbee7090ba854eddd1dee281973a4498/include/hip/amd_detail/amd_hip_atomic.h#L938-L951

If the data in memory tmp==32 and value==4) the return value is 4 but should be 32.

psychocoderHPC · 2022-07-29T07:49:41Z

~~I found out that I checked the wrong branch, the main branch looks like and left over from older development, I opened an issue to remove this dead branch #41~~ [update: I updated the link to point to the develop branch, the bug is existing there too]

psychocoderHPC · 2022-07-29T08:20:13Z

unsigned long long atomicMin and atomicMax is only wrong for architectures without __hip_atomic_compare_exchange_strong

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

amd_detail: fix `atomicMax` and `atomicMin` for floating point types #40

amd_detail: fix `atomicMax` and `atomicMin` for floating point types #40

psychocoderHPC commented Jul 27, 2022

afanfa commented Jul 27, 2022

psychocoderHPC commented Jul 28, 2022 •

edited

Loading

psychocoderHPC commented Jul 29, 2022 •

edited

Loading

psychocoderHPC commented Jul 29, 2022

amd_detail: fix atomicMax and atomicMin for floating point types #40

Are you sure you want to change the base?

amd_detail: fix atomicMax and atomicMin for floating point types #40

Conversation

psychocoderHPC commented Jul 27, 2022

test case (for float/double):

afanfa commented Jul 27, 2022

psychocoderHPC commented Jul 28, 2022 • edited Loading

psychocoderHPC commented Jul 29, 2022 • edited Loading

psychocoderHPC commented Jul 29, 2022

amd_detail: fix `atomicMax` and `atomicMin` for floating point types #40

amd_detail: fix `atomicMax` and `atomicMin` for floating point types #40

psychocoderHPC commented Jul 28, 2022 •

edited

Loading

psychocoderHPC commented Jul 29, 2022 •

edited

Loading