Skip to content
This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

amd_detail: fix atomicMax and atomicMin for floating point types #40

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

psychocoderHPC
Copy link

The return value of atomicMax and atomicMin should be the old value from the destination memory.
This was not the case before the fix.
In cases where no concurrent thread is manipulating the destination the returned values were always the value passed to the atomic function.

test case (for float/double):

#include <assert.h>
#include <stdio.h>

#include "hip/hip_runtime.h"


template<typename T>
__global__ void testAtomicMin(bool* success, T operandOrig)
{
    __shared__ T operand;
    {
        T const value = operandOrig / static_cast<T>(2);
        T const reference = (operandOrig < value) ? operandOrig : value;
        operand = operandOrig;
        T const ret = atomicMin(&operand, value);
        printf("1: %lf==%lf %lf==%lf\n", double(operandOrig), double(ret), double(value),double(reference));
        assert(operandOrig == ret);
        assert(value == reference);
    }
    {

        T const value = static_cast<T>(2)*operandOrig;
        T const reference = (operandOrig < value) ? operandOrig : value;
        operand = operandOrig;
        T const ret = atomicMin(&operand, value);
        printf("2: %lf==%lf %lf==%lf\n", double(operandOrig), double(ret), double(operand),double(reference));

        assert(operandOrig == ret);
        assert(operand == reference);
    }
}

#define HIP_ASSERT(x) (assert((x)==hipSuccess))

int main() {

        bool* deviceResult;

        HIP_ASSERT(hipMalloc((void**)&deviceResult,sizeof(bool)));

        testAtomicMin<<<1,1>>>(deviceResult,42.f);
        testAtomicMin<<<1,1>>>(deviceResult,42);

        HIP_ASSERT(hipDeviceSynchronize());
        HIP_ASSERT(hipFree(deviceResult));


        return 0;
}

output:

1: 42.000000==21.000000 21.000000==21.000000

The return value of `atomicMax` and `atomicMin` should be the old value from the destination memory.
This was not the case before the fix.
In cases where no concurrent thread is manipulating the destination the returned values were always the value passed to the atomic function.
@afanfa
Copy link

afanfa commented Jul 27, 2022

This issue has been already addressed in an internal version of hipamd. It will eventually make it on this repository as well.

@psychocoderHPC
Copy link
Author

psychocoderHPC commented Jul 28, 2022

@afanfa I would say that atomicMin(unsigned long long, unsigned long long) is broken too

https://github.com/ROCm-Developer-Tools/hipamd/blob/3ec1ccdbbbee7090ba854eddd1dee281973a4498/include/hip/amd_detail/amd_hip_atomic.h#L938-L951

If the data in memory tmp==32 and value==4) the return value is 4 but should be 32.

@psychocoderHPC
Copy link
Author

psychocoderHPC commented Jul 29, 2022

I found out that I checked the wrong branch, the main branch looks like and left over from older development, I opened an issue to remove this dead branch #41 [update: I updated the link to point to the develop branch, the bug is existing there too]

@psychocoderHPC
Copy link
Author

unsigned long long atomicMin and atomicMax is only wrong for architectures without __hip_atomic_compare_exchange_strong

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants