Examples of shared memory bugs (with fixes)
This page lists some examples of shared-memory-related bugs that have been found and fixed, with explanations of their causes.
From `charge_exchange.jl` (comments removed for brevity):
```julia
@loop_s is begin
    if is ∈ composition.neutral_species_range
        isn = is
        @loop_z iz begin
            if moments.evolve_ppar
                wpa_shift_norm = moments.vth[iz,isn]
            else
                wpa_shift_norm = 1.0
            end
            for isi ∈ composition.ion_species_range
                wpa_shift = (fvec_in.upar[iz,isi] - fvec_in.upar[iz,isn])/wpa_shift_norm
                for ivpa ∈ 1:vpa.n
                    vpa.scratch[ivpa] = vpa.grid[ivpa] + wpa_shift
                end
                vpa.scratch .= interpolate_to_grid_vpa(vpa.scratch, view(fvec_in.pdf,:,iz,isn), vpa, spectral)
                for ivpa ∈ 1:vpa.n
                    f_out[ivpa,iz,isi] += dt*charge_exchange_frequency*fvec_in.density[iz,isn] * ## <== bug here
                        (vpa.scratch[ivpa] - fvec_in.pdf[ivpa,iz,isi])
                end
            end
        end
    elseif is ∈ composition.ion_species_range
        isi = is
        @loop_z iz begin
            if moments.evolve_ppar
                wpa_shift_norm = moments.vth[iz,isi]
            else
                wpa_shift_norm = 1.0
            end
            for isn ∈ composition.neutral_species_range
                wpa_shift = (fvec_in.upar[iz,isn] - fvec_in.upar[iz,isi])/wpa_shift_norm
                for ivpa ∈ 1:vpa.n
                    vpa.scratch[ivpa] = vpa.grid[ivpa] + wpa_shift
                end
                vpa.scratch .= interpolate_to_grid_vpa(vpa.scratch, view(fvec_in.pdf,:,iz,isi), vpa, spectral)
                for ivpa ∈ 1:vpa.n
                    f_out[ivpa,iz,isn] += dt*charge_exchange_frequency*fvec_in.density[iz,isi] * ## <== bug here
                        (vpa.scratch[ivpa] - fvec_in.pdf[ivpa,iz,isn])
                end
            end
        end
    end
end
```
The bug results in errors from `debug_test/sound_wave_tests.jl` like

```
Shared memory array written at CartesianIndex(1, 1, 1) from multiple ranks between calls to _block_synchronize(). Array was created at:
moment_kinetics.communication.DebugMPISharedArray(array::Array{Float64, 3}) at communication.jl:78
macro expansion at communication.jl:209 [inlined]
macro expansion at debugging.jl:47 [inlined]
allocate_shared(T::Type, dims::Tuple{Int64, Int64, Int64}) at communication.jl:207
allocate_shared_float(::Int64, ::Vararg{Int64}) at array_allocation.jl:38
setup_scratch_arrays(moments::moment_kinetics.velocity_moments.moments, pdf_in::moment_kinetics.communication.DebugMPISharedArray{Float64, 3}, n_rk_stages::Int64) at time_advance.jl:244
setup_time_advance!(pdf::moment_kinetics.initial_conditions.pdf_struct, vpa::moment_kinetics.coordinates.coordinate, z::moment_kinetics.coordinates.coordinate, composition::moment_kinetics.input_structs.species_composition, drive_input::moment_kinetics.input_structs.drive_input, moments::moment_kinetics.velocity_moments.moments, t_input::moment_kinetics.input_structs.time_input, collisions::moment_kinetics.input_structs.collisions_input, species::Vector{moment_kinetics.input_structs.species_parameters}) at time_advance.jl:112
... etc ...
```
This shows that the problem is multiple writes to the same shared-memory array element from different ranks (a race condition, because the order of those writes is undefined, and they might even happen at the same time, causing undefined behaviour), and that the affected array is the distribution function (which is created at `time_advance.jl:112` as part of the `scratch_pdf` structs).
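As a standalone illustration of why such unordered writes matter (a minimal sketch using Julia threads rather than MPI shared-memory arrays, so this is not moment_kinetics code), the `+=` update below is a separate read, add and write, so concurrent updates to the same element can interleave and lose increments:

```julia
using Base.Threads

# Hypothetical illustration: many workers increment the same element with no
# synchronization. Each `+=` is a separate read, add and write, so concurrent
# updates can interleave and overwrite each other.
a = zeros(1)
@threads for i in 1:100_000
    a[1] += 1.0
end
println(a[1])  # typically less than 100000.0 when Julia is started with several threads
```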
The array being written to is `f_out`. In this region of the code, the loops over the species index `is` and the spatial dimension `iz` are split among processes, and each process should write to only a subset of the points in each dimension. However, the 'split-up' index `is` is used to index the neutrals in the first branch (`isn = is`), while `f_out` is written at `isi`, which takes all values in `composition.ion_species_range`. The result is that the index `isi` can take values within the species dimension that do not 'belong to' the process, which allows multiple processes to write to the same array location - an error! The second (`elseif`) branch has the same issue, but with the roles of ions and neutrals swapped.
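To make the overlap concrete, the following standalone sketch (plain Julia with made-up species ranges and a made-up rank-to-index assignment, not the real `@loop_s` machinery) works out which species indices of `f_out` each rank ends up writing under the buggy pattern:

```julia
# Hypothetical set-up: 4 species (ions 1:2, neutrals 3:4) and 4 ranks, with the
# split species loop giving each rank exactly one value of `is`.
ion_species_range = 1:2
neutral_species_range = 3:4
owned_is(rank) = rank + 1   # rank 0 owns is=1, rank 1 owns is=2, etc.

# Species indices of f_out written by a rank under the BUGGY pattern: a rank
# that owns a neutral species writes at *every* ion index, and vice versa.
function buggy_writes(rank)
    is = owned_is(rank)
    if is in neutral_species_range
        return collect(ion_species_range)
    else
        return collect(neutral_species_range)
    end
end

for r1 in 0:3, r2 in (r1 + 1):3
    overlap = intersect(buggy_writes(r1), buggy_writes(r2))
    isempty(overlap) || println("ranks $r1 and $r2 both write species $overlap")
end
# prints:
#   ranks 0 and 1 both write species [3, 4]
#   ranks 2 and 3 both write species [1, 2]
```

Any non-empty overlap corresponds to elements that two ranks may write between synchronization points, which is exactly what the debug check above reports.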
The fix is to rearrange the species loops (swapping the order of the loops over ion and neutral species) so that the index set from `is` is always the one used to access the array being written to, `f_out`. This does require moving the initialization of `wpa_shift_norm` one level down in the nested loops, which is slightly sub-optimal (but only very slightly, since it is just a scalar look-up and the loop it moves inside is a species loop with few iterations). The fixed code is:
```julia
@loop_s is begin
    if is ∈ composition.ion_species_range ## <== fix here
        isi = is ## <== here
        @loop_z iz begin
            for isn ∈ composition.neutral_species_range ## <== and here
                if moments.evolve_ppar
                    wpa_shift_norm = moments.vth[iz,isn]
                else
                    wpa_shift_norm = 1.0
                end
                wpa_shift = (fvec_in.upar[iz,isi] - fvec_in.upar[iz,isn])/wpa_shift_norm
                for ivpa ∈ 1:vpa.n
                    vpa.scratch[ivpa] = vpa.grid[ivpa] + wpa_shift
                end
                vpa.scratch .= interpolate_to_grid_vpa(vpa.scratch, view(fvec_in.pdf,:,iz,isn), vpa, spectral)
                for ivpa ∈ 1:vpa.n
                    f_out[ivpa,iz,isi] += dt*charge_exchange_frequency*fvec_in.density[iz,isn] *
                        (vpa.scratch[ivpa] - fvec_in.pdf[ivpa,iz,isi])
                end
            end
        end
    elseif is ∈ composition.neutral_species_range ## <== fix here
        isn = is ## <== here
        @loop_z iz begin
            for isi ∈ composition.ion_species_range ## <== and here
                if moments.evolve_ppar
                    wpa_shift_norm = moments.vth[iz,isi]
                else
                    wpa_shift_norm = 1.0
                end
                wpa_shift = (fvec_in.upar[iz,isn] - fvec_in.upar[iz,isi])/wpa_shift_norm
                for ivpa ∈ 1:vpa.n
                    vpa.scratch[ivpa] = vpa.grid[ivpa] + wpa_shift
                end
                vpa.scratch .= interpolate_to_grid_vpa(vpa.scratch, view(fvec_in.pdf,:,iz,isi), vpa, spectral)
                for ivpa ∈ 1:vpa.n
                    f_out[ivpa,iz,isn] += dt*charge_exchange_frequency*fvec_in.density[iz,isi] *
                        (vpa.scratch[ivpa] - fvec_in.pdf[ivpa,iz,isn])
                end
            end
        end
    end
end
```
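As a small hypothetical check of the fixed pattern (using the same made-up rank-to-index assignment as the sketch above), each rank now writes `f_out` only at the species index it owns, so no two ranks' write sets overlap:

```julia
# Hypothetical check: 4 ranks, each owning one species index; with the fixed
# loop ordering a rank writes f_out only at its own species index.
owned_is(rank) = rank + 1
fixed_writes(rank) = [owned_is(rank)]
for r1 in 0:3, r2 in (r1 + 1):3
    @assert isempty(intersect(fixed_writes(r1), fixed_writes(r2)))
end
println("no overlapping writes between any pair of ranks")
```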