api: Introduce complex numbers support (np.complex64/128) #2375

Open
wants to merge 57 commits into
base: master
57 commits
2869b68
api: add support for complex dtype
mloubout Aug 2, 2023
c5cf3f0
api: fix printer for complex dtype
mloubout May 22, 2024
afdcaf6
compiler: fix alias dtype with complex numbers
mloubout May 22, 2024
71e43ee
api: move complex ctype to dtype lowering
mloubout May 22, 2024
be3728b
compiler: generate std:complex for cpp compilers
mloubout May 28, 2024
c15464a
compiler: add std::complex arithmetic defs for unsupported types
mloubout May 30, 2024
e25051b
compiler: fix alias dtype with complex numbers
mloubout May 30, 2024
109103d
compiler: fix internal language specific types and cast
mloubout May 31, 2024
8c9efb1
compiler: rework dtype lowering
mloubout Jun 20, 2024
705f58b
compiler: switch to c++14 for complex_literals
mloubout Jun 27, 2024
397f90a
compiler: subdtype numpy for dtype lowering
mloubout Jul 8, 2024
f835334
compiler: use structs to pass complex arguments
enwask Jul 9, 2024
85a27f0
compiler: add Dereference scalar case
enwask Jul 11, 2024
fe80b4d
compiler: implement float16 support
enwask Jul 11, 2024
ad26f62
symbolics: fix printer for half precision
enwask Jul 11, 2024
62953b2
misc: fix formatting
enwask Jul 11, 2024
c0a0a29
compiler: refactor float16 and lower_dtypes
enwask Jul 11, 2024
f157aa8
compiler: add dtype_alloc_ctype helper for allocation size
enwask Jul 11, 2024
136a93e
misc: more float16 refactoring/formatting fixes
enwask Jul 15, 2024
cff99c7
Remove dtypes lowering from IET layer
enwask Jul 16, 2024
dd445d5
compiler: reimplement float16/complex lowering
enwask Jul 26, 2024
48ce3ee
misc: cleanup, docs and typing for half support
enwask Jul 29, 2024
bb20c10
compiler: FindSymbols 'scalars' -> 'abstractsymbols'
enwask Jul 29, 2024
9be0ff2
test: include scalar parameters in complex tests
enwask Jul 30, 2024
22b00f6
test: add test_dtypes with initial tests for float16 + complex
enwask Jul 30, 2024
f75a185
misc: more lower_dtypes cleanup + type hints
enwask Jul 30, 2024
372032f
api: use grid dtype for extent and origin, add test_grid
enwask Jul 31, 2024
86cf7a5
test: clean up and add more half/complex tests
enwask Jul 31, 2024
3663b40
test: fix test_grid_objs, add test_grid_dtypes
enwask Jul 31, 2024
9901aba
api: allow side for cross derivatives, fixes #2442
mloubout Aug 13, 2024
04f4a0e
compiler: process dtypes through printer
mloubout Jan 15, 2025
552723c
symbolics: specialize sizeof
mloubout Jan 16, 2025
ccbcf34
compiler: move dtype pass to top level operator iet pass
mloubout Jan 16, 2025
2cab883
symbolics: fix SizeOf rebuild
mloubout Jan 16, 2025
406f759
symbolics: use std namespace for c++
mloubout Jan 16, 2025
575a2ec
compiler: fix std math func names
mloubout Jan 16, 2025
bee616d
symbolics: move printers together through registry
mloubout Jan 17, 2025
d82fce6
symbolics: rework Cast
mloubout Jan 17, 2025
f942426
compiler: fix complex headers
mloubout Jan 17, 2025
f6f6c6c
api: remove un-needed dtype reconstruction mode
mloubout Jan 17, 2025
ed29f5e
compiler: fix dtype for mpi routines
mloubout Jan 17, 2025
bf92648
compiler: fix missing algorithm include for min/max
mloubout Jan 18, 2025
fc86294
arch: switch sycl error to warning for no-compile codegen
mloubout Jan 18, 2025
e4d32e6
symbolics: rework cast/sizeof for pickling
mloubout Jan 22, 2025
80e5249
api: fix c_datatype hack
mloubout Jan 22, 2025
13cc31f
compiler: make visitor language parametric
mloubout Jan 23, 2025
2f3d735
compiler: make sure complex ctype is handled properly for typedata
mloubout Jan 23, 2025
14c0baa
symbolics: cleaner repr of Cast
mloubout Jan 23, 2025
704138b
test: improve dtype tests log
mloubout Jan 24, 2025
d48b95d
compiler: make sure cpp is used for c++ compilers
mloubout Jan 26, 2025
195a7c3
compiler: make printer part of the target and differentiate C and CXX
mloubout Jan 27, 2025
032b687
compiler: add all cxx target to operator registry
mloubout Jan 27, 2025
17c6067
compiler: cleanup operator class names
mloubout Jan 28, 2025
5972958
compiler: switch cxx backend to static_cast
mloubout Jan 28, 2025
dd4d9cc
compiler: add switch for static_cast vs reinterpret_cast
mloubout Jan 28, 2025
33d300d
compiler: handle plain text header
mloubout Jan 30, 2025
baebc8c
compiler: convert all in visitors to f-string
mloubout Jan 30, 2025
5 changes: 3 additions & 2 deletions devito/__init__.py
@@ -56,7 +56,8 @@ def reinit_compiler(val):
"""
Re-initialize the Compiler.
"""
configuration['compiler'].__init__(suffix=configuration['compiler'].suffix,
configuration['compiler'].__init__(name=configuration['compiler'].name,
suffix=configuration['compiler'].suffix,
mpi=configuration['mpi'])
return val

@@ -65,7 +66,7 @@ def reinit_compiler(val):
configuration.add('platform', 'cpu64', list(platform_registry),
callback=lambda i: platform_registry[i]())
configuration.add('compiler', 'custom', compiler_registry,
callback=lambda i: compiler_registry[i]())
callback=lambda i: compiler_registry[i](name=i))

# Setup language for shared-memory parallelism
preprocessor = lambda i: {0: 'C', 1: 'openmp'}.get(i, i) # Handles DEVITO_OPENMP deprec
69 changes: 43 additions & 26 deletions devito/arch/compiler.py
@@ -180,12 +180,20 @@ def __init__(self):
"""

fields = {'cc', 'ld'}
_cpp = False
default_cpp = False
_cxxstd = 'c++14'
_cstd = 'c99'

def __init__(self, **kwargs):
name = kwargs.pop('name', self.__class__.__name__)
if isinstance(name, Compiler):
name = name.name
self._name = name

super().__init__(**kwargs)

self.__lookup_cmds__()
self._cpp = kwargs.get('cpp', self.default_cpp)

self.suffix = kwargs.get('suffix')
if not kwargs.get('mpi'):
@@ -196,7 +204,7 @@ def __init__(self, **kwargs):
self.cc = self.MPICC if self._cpp is False else self.MPICXX
self.ld = self.cc # Wanted by the superclass

self.cflags = ['-O3', '-g', '-fPIC', '-Wall', '-std=c99']
self.cflags = ['-O3', '-g', '-fPIC', '-Wall', f'-std={self.std}']
self.ldflags = ['-shared']

self.include_dirs = []
@@ -226,13 +234,13 @@ def __new_with__(self, **kwargs):
Create a new Compiler from an existing one, inheriting from it
the flags that are not specified via ``kwargs``.
"""
return self.__class__(suffix=kwargs.pop('suffix', self.suffix),
return self.__class__(name=self.name, suffix=kwargs.pop('suffix', self.suffix),
mpi=kwargs.pop('mpi', configuration['mpi']),
**kwargs)

@property
def name(self):
return self.__class__.__name__
return self._name

@property
def version(self):
@@ -248,6 +256,10 @@ def version(self):

return version

@property
def std(self):
return self._cxxstd if self._cpp else self._cstd

def get_version(self):
result, stdout, stderr = call_capture_output((self.cc, "--version"))
if result != 0:
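Review note: the new `std` property ties the `-std=` flag to `_cpp`, which in turn comes from the `cpp` keyword or the class-level `default_cpp`. A minimal standalone sketch of that selection follows; `MiniCompiler` and `MiniCudaCompiler` are hypothetical stand-ins written for illustration, not Devito's classes, and they copy only the attributes visible in this diff.

```python
class MiniCompiler:
    """Hypothetical stand-in for the Compiler class touched by this diff."""

    default_cpp = False
    _cxxstd = 'c++14'
    _cstd = 'c99'

    def __init__(self, **kwargs):
        # An explicit cpp=... keyword overrides the class-level default
        self._cpp = kwargs.get('cpp', self.default_cpp)
        # The standard flag is derived from the property, not hard-coded
        self.cflags = ['-O3', '-g', '-fPIC', '-Wall', f'-std={self.std}']

    @property
    def std(self):
        # C++ standard when compiling as C++, C standard otherwise
        return self._cxxstd if self._cpp else self._cstd


class MiniCudaCompiler(MiniCompiler):
    # Backends that are always C++ only flip the default
    default_cpp = True


c_flags = MiniCompiler().cflags
cxx_flags = MiniCompiler(cpp=True).cflags
cuda_flags = MiniCudaCompiler().cflags
```

Under these assumptions a subclass changes the standard by overriding `_cxxstd` or `_cstd`, and every `cflags.remove(f'-std={self.std}')` call stays consistent with the flag that was added.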
@@ -488,15 +500,15 @@ def __init_finalize__(self, **kwargs):
platform = kwargs.pop('platform', configuration['platform'])

if isinstance(platform, NvidiaDevice):
self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
# Add flags for OpenMP offloading
if language in ['C', 'openmp']:
cc = get_nvidia_cc()
if cc:
self.cflags += ['-Xopenmp-target', '-march=sm_%s' % cc]
self.ldflags += ['-fopenmp', '-fopenmp-targets=nvptx64-nvidia-cuda']
elif platform is AMDGPUX:
self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
# Add flags for OpenMP offloading
if language in ['C', 'openmp']:
self.ldflags += ['-target', 'x86_64-pc-linux-gnu']
@@ -556,9 +568,9 @@ def __init_finalize__(self, **kwargs):
self.cflags.append('-ffast-math')

if isinstance(platform, NvidiaDevice):
self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
elif platform is AMDGPUX:
self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
# Add flags for OpenMP offloading
if language in ['C', 'openmp']:
self.ldflags += ['-target', 'x86_64-pc-linux-gnu']
@@ -594,15 +606,15 @@ def __lookup_cmds__(self):

class PGICompiler(Compiler):

_cpp = True
default_cpp = True

def __init_finalize__(self, **kwargs):

self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
self.cflags.remove('-O3')
self.cflags.remove('-Wall')

self.cflags.append('-std=c++11')
self.cflags.append(f'-std={self.std}')

language = kwargs.pop('language', configuration['language'])
platform = kwargs.pop('platform', configuration['platform'])
@@ -645,14 +657,14 @@ def __lookup_cmds__(self):

class CudaCompiler(Compiler):

_cpp = True
default_cpp = True

def __init_finalize__(self, **kwargs):

self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
self.cflags.remove('-Wall')
self.cflags.remove('-fPIC')
self.cflags.extend(['-std=c++14', '-Xcompiler', '-fPIC'])
self.cflags.extend([f'-std={self.std}', '-Xcompiler', '-fPIC'])

if configuration['mpi']:
# We rather use `nvcc` to compile MPI, but for this we have to
@@ -719,14 +731,14 @@ def __lookup_cmds__(self):

class HipCompiler(Compiler):

_cpp = True
default_cpp = True

def __init_finalize__(self, **kwargs):

self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
self.cflags.remove('-Wall')
self.cflags.remove('-fPIC')
self.cflags.extend(['-std=c++14', '-fPIC'])
self.cflags.extend([f'-std={self.std}', '-fPIC'])

if configuration['mpi']:
# We rather use `hipcc` to compile MPI, but for this we have to
@@ -833,7 +845,7 @@ def __init_finalize__(self, **kwargs):
language = kwargs.pop('language', configuration['language'])

if language == 'sycl':
raise ValueError("Use SyclCompiler to jit-compile sycl")
warning("Use SyclCompiler to jit-compile sycl")

elif language == 'openmp':
# Earlier versions to OneAPI 2023.2.0 (clang17 underneath), have an
@@ -880,7 +892,7 @@ def __lookup_cmds__(self):

class SyclCompiler(OneapiCompiler):

_cpp = True
default_cpp = True

def __init_finalize__(self, **kwargs):
IntelCompiler.__init_finalize__(self, **kwargs)
@@ -889,9 +901,9 @@ def __init_finalize__(self, **kwargs):
language = kwargs.pop('language', configuration['language'])

if language != 'sycl':
raise ValueError("Expected language sycl with SyclCompiler")
warning("Expected language sycl with SyclCompiler")

self.cflags.remove('-std=c99')
self.cflags.remove(f'-std={self.std}')
self.cflags.append('-fsycl')

self.cflags.remove('-g') # -g disables some optimizations in IGC
@@ -947,7 +959,7 @@ def __new__(cls, *args, **kwargs):
obj = super().__new__(cls)
# Keep base to initialize accordingly
obj._base = kwargs.pop('base', _base)
obj._cpp = obj._base._cpp
obj.default_cpp = obj._base.default_cpp

return obj

@@ -986,15 +998,19 @@ class CompilerRegistry(dict):
"""

def __getitem__(self, key):
if isinstance(key, Compiler):
key = key.name

if key.startswith('gcc-'):
i = key.split('-')[1]
return partial(GNUCompiler, suffix=i)

return super().__getitem__(key)

def __contains__(self, k):
if isinstance(k, Compiler):
k = k.name
return k in self.keys() or k.startswith('gcc-')
def __contains__(self, key):
if isinstance(key, Compiler):
key = key.name
return key in self.keys() or key.startswith('gcc-')


_compiler_registry = {
@@ -1013,6 +1029,7 @@ def __contains__(self, k):
'nvc++': NvidiaCompiler,
'nvidia': NvidiaCompiler,
'cuda': CudaCompiler,
'nvcc': CudaCompiler,
'osx': ClangCompiler,
'intel': OneapiCompiler,
'icx': OneapiCompiler,
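Review note: with this change the registry accepts either a string key or an already-built compiler, and the callback passes the key through as `name`, so aliases such as `'nvcc'` and `'cuda'` can be told apart even though they map to the same class. The sketch below reproduces that lookup logic in isolation; `MiniCompiler`, `MiniGNUCompiler` and `MiniRegistry` are hypothetical names used only for illustration and are not part of Devito.

```python
from functools import partial


class MiniCompiler:
    """Hypothetical stand-in; it only records a name and a suffix."""

    def __init__(self, name=None, suffix=None):
        # Fall back to the class name when no registry key is given
        self.name = name or self.__class__.__name__
        self.suffix = suffix


class MiniGNUCompiler(MiniCompiler):
    pass


class MiniRegistry(dict):

    def __getitem__(self, key):
        # Accept an existing compiler and look it up by its name
        if isinstance(key, MiniCompiler):
            key = key.name
        # 'gcc-<suffix>' is never stored; it is resolved on the fly
        if key.startswith('gcc-'):
            return partial(MiniGNUCompiler, suffix=key.split('-')[1])
        return super().__getitem__(key)

    def __contains__(self, key):
        if isinstance(key, MiniCompiler):
            key = key.name
        return key in self.keys() or key.startswith('gcc-')


registry = MiniRegistry(gcc=MiniGNUCompiler, gnu=MiniGNUCompiler)

# Mirrors the new callback shape: registry[i](name=i)
versioned = registry['gcc-12'](name='gcc-12')
existing = MiniGNUCompiler(name='gnu')
```

Passing `name` explicitly is what lets a rebuilt compiler find its way back to the same registry entry instead of relying on the class name.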
33 changes: 29 additions & 4 deletions devito/core/__init__.py
@@ -2,11 +2,19 @@
from devito.core.cpu import (Cpu64NoopCOperator, Cpu64NoopOmpOperator,
Cpu64AdvCOperator, Cpu64AdvOmpOperator,
Cpu64FsgCOperator, Cpu64FsgOmpOperator,
Cpu64CustomOperator)
Cpu64CustomOperator, Cpu64CustomCXXOperator,
Cpu64CXXNoopCOperator, Cpu64CXXNoopOmpOperator,
Cpu64AdvCXXOperator, Cpu64AdvCXXOmpOperator,
Cpu64FsgCXXOperator, Cpu64FsgCXXOmpOperator)

from devito.core.intel import (Intel64AdvCOperator, Intel64AdvOmpOperator,
Intel64FsgCOperator, Intel64FsgOmpOperator)
from devito.core.arm import ArmAdvCOperator, ArmAdvOmpOperator
from devito.core.power import PowerAdvCOperator, PowerAdvOmpOperator
Intel64FsgCOperator, Intel64FsgOmpOperator,
Intel64CXXAdvCOperator, Intel64AdvCXXOmpOperator,
Intel64FsgCXXOperator, Intel64FsgCXXOmpOperator)
from devito.core.arm import (ArmAdvCOperator, ArmAdvOmpOperator,
ArmAdvCXXOperator, ArmAdvCXXOmpOperator)
from devito.core.power import (PowerAdvCOperator, PowerAdvOmpOperator,
PowerCXXAdvCOperator, PowerAdvCXXOmpOperator)
from devito.core.gpu import (DeviceNoopOmpOperator, DeviceNoopAccOperator,
DeviceAdvOmpOperator, DeviceAdvAccOperator,
DeviceFsgOmpOperator, DeviceFsgAccOperator,
@@ -16,26 +24,43 @@
# Register CPU Operators
operator_registry.add(Cpu64CustomOperator, Cpu64, 'custom', 'C')
operator_registry.add(Cpu64CustomOperator, Cpu64, 'custom', 'openmp')
operator_registry.add(Cpu64CustomCXXOperator, Cpu64, 'custom', 'CXX')
operator_registry.add(Cpu64CustomCXXOperator, Cpu64, 'custom', 'CXXopenmp')

operator_registry.add(Cpu64NoopCOperator, Cpu64, 'noop', 'C')
operator_registry.add(Cpu64NoopOmpOperator, Cpu64, 'noop', 'openmp')
operator_registry.add(Cpu64CXXNoopCOperator, Cpu64, 'noop', 'CXX')
operator_registry.add(Cpu64CXXNoopOmpOperator, Cpu64, 'noop', 'CXXopenmp')

operator_registry.add(Cpu64AdvCOperator, Cpu64, 'advanced', 'C')
operator_registry.add(Cpu64AdvOmpOperator, Cpu64, 'advanced', 'openmp')
operator_registry.add(Cpu64AdvCXXOperator, Cpu64, 'advanced', 'CXX')
operator_registry.add(Cpu64AdvCXXOmpOperator, Cpu64, 'advanced', 'CXXopenmp')

operator_registry.add(Cpu64FsgCOperator, Cpu64, 'advanced-fsg', 'C')
operator_registry.add(Cpu64FsgOmpOperator, Cpu64, 'advanced-fsg', 'openmp')
operator_registry.add(Cpu64FsgCXXOperator, Cpu64, 'advanced-fsg', 'CXX')
operator_registry.add(Cpu64FsgCXXOmpOperator, Cpu64, 'advanced-fsg', 'CXXopenmp')

operator_registry.add(Intel64AdvCOperator, Intel64, 'advanced', 'C')
operator_registry.add(Intel64AdvOmpOperator, Intel64, 'advanced', 'openmp')
operator_registry.add(Intel64CXXAdvCOperator, Intel64, 'advanced', 'CXX')
operator_registry.add(Intel64AdvCXXOmpOperator, Intel64, 'advanced', 'CXXopenmp')

operator_registry.add(Intel64FsgCOperator, Intel64, 'advanced-fsg', 'C')
operator_registry.add(Intel64FsgOmpOperator, Intel64, 'advanced-fsg', 'openmp')
operator_registry.add(Intel64FsgCXXOperator, Intel64, 'advanced-fsg', 'CXX')
operator_registry.add(Intel64FsgCXXOmpOperator, Intel64, 'advanced-fsg', 'CXXopenmp')

operator_registry.add(ArmAdvCOperator, Arm, 'advanced', 'C')
operator_registry.add(ArmAdvOmpOperator, Arm, 'advanced', 'openmp')
operator_registry.add(ArmAdvCXXOperator, Arm, 'advanced', 'CXX')
operator_registry.add(ArmAdvCXXOmpOperator, Arm, 'advanced', 'CXXopenmp')

operator_registry.add(PowerAdvCOperator, Power, 'advanced', 'C')
operator_registry.add(PowerAdvOmpOperator, Power, 'advanced', 'openmp')
operator_registry.add(PowerCXXAdvCOperator, Power, 'advanced', 'CXX')
operator_registry.add(PowerAdvCXXOmpOperator, Power, 'advanced', 'CXXopenmp')

# Register Device Operators
operator_registry.add(DeviceCustomOmpOperator, Device, 'custom', 'C')
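Review note: the registrations above key each operator on a platform, a mode and a language string, with `'CXX'` and `'CXXopenmp'` added alongside `'C'` and `'openmp'`. The sketch below shows one way such a registry could resolve a lookup. It is an assumption-laden illustration: `MiniOperatorRegistry` and its `fetch` method (including the fallback along the platform's class hierarchy) are hypothetical, and the platform and operator classes are empty placeholders that only borrow names from the diff.

```python
class MiniOperatorRegistry(dict):
    """Hypothetical registry keyed by (platform, mode, language)."""

    def add(self, operator, platform, mode, language='C'):
        self[(platform, mode, language)] = operator

    def fetch(self, platform, mode, language):
        # Assumption: walk the platform's MRO so a specialised platform
        # falls back to an operator registered for one of its bases
        for cls in platform.__mro__:
            try:
                return self[(cls, mode, language)]
            except KeyError:
                continue
        raise KeyError(f"No operator for {platform.__name__}/{mode}/{language}")


# Placeholder platforms and operators, for illustration only
class Cpu64: pass
class Arm(Cpu64): pass
class Cpu64AdvCOperator: pass
class Cpu64AdvCXXOperator: pass
class ArmAdvCXXOmpOperator: pass


registry = MiniOperatorRegistry()
registry.add(Cpu64AdvCOperator, Cpu64, 'advanced', 'C')
registry.add(Cpu64AdvCXXOperator, Cpu64, 'advanced', 'CXX')
registry.add(ArmAdvCXXOmpOperator, Arm, 'advanced', 'CXXopenmp')

picked = registry.fetch(Arm, 'advanced', 'CXXopenmp')
fallback = registry.fetch(Arm, 'advanced', 'CXX')
```

In this sketch the language string is part of the key, so the C and C++ variants of the same optimisation mode never collide.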
24 changes: 14 additions & 10 deletions devito/core/arm.py
@@ -1,19 +1,23 @@
from devito.core.cpu import Cpu64AdvOperator
from devito.passes.iet import CTarget, OmpTarget
from devito.core.cpu import (Cpu64AdvOperator, Cpu64AdvCXXOperator,
Cpu64AdvCOperator)
from devito.passes.iet import OmpTarget, CXXOmpTarget

__all__ = ['ArmAdvCOperator', 'ArmAdvOmpOperator']
__all__ = ['ArmAdvCOperator', 'ArmAdvOmpOperator', 'ArmAdvCXXOperator',
'ArmAdvCXXOmpOperator']


class ArmAdvOperator(Cpu64AdvOperator):
pass
ArmAdvOperator = Cpu64AdvOperator
ArmAdvCOperator = Cpu64AdvCOperator
ArmAdvCXXOperator = Cpu64AdvCXXOperator


class ArmAdvCOperator(ArmAdvOperator):
_Target = CTarget


class ArmAdvOmpOperator(ArmAdvOperator):
class ArmAdvOmpOperator(ArmAdvCOperator):
_Target = OmpTarget

# Avoid nested parallelism on ThunderX2
PAR_NESTED = 4


class ArmAdvCXXOmpOperator(ArmAdvOmpOperator):
_Target = CXXOmpTarget
LINEARIZE = True
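Review note: as I read the new side of this diff, the Arm C operators become plain aliases of the Cpu64 ones, and the C++ OpenMP operator subclasses the C OpenMP one so that `PAR_NESTED` carries over while `_Target` and `LINEARIZE` are overridden. The sketch below checks that reading with empty placeholder classes; the target classes and the base-class attribute values (`PAR_NESTED = 2`, `LINEARIZE = False`) are assumptions made for illustration, not values taken from Devito.

```python
# Placeholder targets, standing in for the ones imported in the diff
class CTarget: pass
class OmpTarget: pass
class CXXOmpTarget: pass


class Cpu64AdvCOperator:
    _Target = CTarget
    PAR_NESTED = 2      # assumed default, for illustration only
    LINEARIZE = False   # assumed default, for illustration only


# An alias, not a subclass: both names refer to the same class object
ArmAdvCOperator = Cpu64AdvCOperator


class ArmAdvOmpOperator(ArmAdvCOperator):
    _Target = OmpTarget
    PAR_NESTED = 4


class ArmAdvCXXOmpOperator(ArmAdvOmpOperator):
    # Inherits PAR_NESTED = 4; swaps the target and enables linearization
    _Target = CXXOmpTarget
    LINEARIZE = True
```

One consequence of aliasing is that `ArmAdvCOperator.__name__` reports the Cpu64 class name, which may matter wherever the operator name is logged or used as a key.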