-
Notifications
You must be signed in to change notification settings - Fork 51
Working with arrays
Forpy offers interoperability of Fortran arrays and numpy arrays through
the type ndarray
. In the
following examples, you will see how to create a numpy array from a Fortran array.
The simplest way to create a numpy array is with ndarray_create
. This
function creates a numpy array with the same content as the Fortran array that is
passed to the function. For example:
program ndarray01
use forpy_mod
implicit none
integer, parameter :: NROWS = 2
integer, parameter :: NCOLS = 3
integer :: ierror, ii, jj
real :: matrix(NROWS, NCOLS)
type(ndarray) :: arr
ierror = forpy_initialize()
do jj = 1, NCOLS
do ii = 1, NROWS
matrix(ii, jj) = real(ii) * jj
enddo
enddo
! creates a numpy array with the same content as 'matrix'
ierror = ndarray_create(arr, matrix)
ierror = print_py(arr)
call arr%destroy
call forpy_finalize
end program
When arrays get very large, creating a copy might not be what you want. The next section describes how to wrap a Fortran array with forpy without making a copy.
When creating a numpy array with ndarray_create_nocopy
, no copy of the Fortran array is made and storage has to be managed by the Fortran programmer. Changes to the Fortran array affect the numpy array
and vice versa. This is more efficient than ndarray_create
, however there are some things to
consider:
Since the Fortran array can now be modified not
only directly but also indirectly by the ndarray
, it is necessary to
add the asynchronous
attribute to the Fortran array declaration, since
without it compiler optimization related bugs
can occur (depending on code, compiler and compiler options).
Alternatively you could also use the volatile
attribute.
program ndarray02
use forpy_mod
implicit none
integer, parameter :: NROWS = 2
integer, parameter :: NCOLS = 3
integer :: ierror, ii, jj
! add the asynchronous attribute to the Fortran array that is wrapped
! as ndarray to avoid bugs caused by compiler optimizations
real, asynchronous :: matrix(NROWS, NCOLS)
type(ndarray) :: arr
ierror = forpy_initialize()
do jj = 1, NCOLS
do ii = 1, NROWS
matrix(ii, jj) = real(ii) * jj
enddo
enddo
! creates a numpy array that refers to 'matrix'
ierror = ndarray_create_nocopy(arr, matrix)
ierror = print_py(arr)
matrix(1,1) = 1234.0 ! Change also affects 'arr'
ierror = print_py(arr)
call arr%destroy
call forpy_finalize
end program
Be aware that as long as references to the ndarray are in use, the Fortran data has to be valid.
(not deallocated or out of scope,...). Also make sure that not a compiler-created
temporary array is passed to ndarray_create_nocopy
.
Alternatively you can create ndarray
s with storage managed by Python as
explained in one of the following sections.
The Fortran array has to be contiguous in memory (this is not checked). Contiguous in memory means that successive array elements are adjacent in memory. For more information, see also Caveats.
We can also create a new ndarray
with the function ndarray_create_empty
,
specifying the shape of the array.
In this case storage is allocated and managed by Python. Memory is freed, when
there is no reference to the ndarray anymore (don't forget to call the destroy
method).
You can also create an array of zeros with ndarray_create_zeros
and an array
of ones with ndarray_create_ones
.
The following example shows how to access the data of a ndarray with
the method ndarray%get_data
. It also shows how to efficiently return a ndarray
from a subroutine - without making unnecessary copies.
To edit the values of the array, use the Fortran
pointer returned from ndarray%get_data
.
! Example of how to return a ndarray from a subroutine
program ndarray03
use forpy_mod
use iso_fortran_env, only: real64
implicit none
integer :: ierror
type(ndarray) :: arr
ierror = forpy_initialize()
call create_matrix(arr)
ierror = print_py(arr)
call arr%destroy
call forpy_finalize
CONTAINS
subroutine create_matrix(arr)
type(ndarray), intent(out) :: arr
integer :: ierror, ii, jj
integer, parameter :: NROWS = 2
integer, parameter :: NCOLS = 3
real(kind=real64), dimension(:,:), pointer :: matrix
ierror = ndarray_create_empty(arr, [NROWS, NCOLS], dtype="float64")
! Use ndarray%getdata to access the content of a numpy array
! from Fortran
! type of matrix must be compatible with dtype of ndarray
! (here: real(kind=real64) and dtype="float64")
ierror = arr%get_data(matrix)
do jj = 1, NCOLS
do ii = 1, NROWS
matrix(ii, jj) = real(ii, kind=real64) * jj
enddo
enddo
end subroutine
end program
Note that if you create an array with ndarray_create_empty
, its values
are undefined. Consider using ndarray_create_zeros
or ndarray_create_ones
instead.
These creation functions take an optional argument dtype
(default "float") to
specify the type of the array's value. You have to use one of numpy's dtype
s,
such as float64
, int32
, ...
By default ndarray
s are created with Fortran storage order. With the optional 4th
argument to one of the creation functions, you can create an array with C storage order:
ierror = ndarray_create_zeros(arr, [NROWS, NCOLS], dtype="float64", order="C")
ndarray%get_data
fails in each of these cases:
- dtype of ndarray and Fortran type of pointer do not match (
TypeError
raised) - dimension of ndarray and pointer do not match (
BufferError
raised) - storage order of ndarray does not match the order expected by
get_data
(by default Fortran order is expected) (BufferError
raised) - ndarray is not contiguous in memory (
BufferError
raised)
For multidimensional arrays one has to take into account
the storage order.
By default ndarray%get_data
expects a Fortran ordered (column-major) array.
Use the optional 2nd argument to change this behaviour:
ierror = arr%get_data(matrix, order='C')
is successful only if arr
has C storage order. Attention: Since Fortran pointers
always use Fortran storage order, the data it points to will be the transpose
of the array.
If you specify:
ierror = arr%get_data(matrix, order='A')
get_data will accept C or Fortran storage order. Again in case of C storage order,
the data pointed to will be the transpose. Most likely you will use order='A'
only if
you expect the array to be symmetric or if you are interested only in diagonal elements.
With ndarray%is_ordered
you can check if an array has a certain order (see below).
Note that contiguous 1D arrays are both Fortran and C contiguous and therefore you do not need to provide the order argument in that case.
ndarray%get_data
does not support non-contiguous arrays (yet?). In that
case, you can create a copy of the array (which will be contiguous) and use get_data
on
the copy:
ierror = arr%copy(copy_of_arr) ! copy_of_arr will have Fortran order by default
ierror = copy_of_arr%get_data(matrix)
You can use ndarray%is_ordered
to check if a ndarray is contiguous and in a certain
storage order:
flag = nd_arr%is_ordered('F') ! .true. if nd_arr Fortran-contiguous
flag = nd_arr%is_ordered('C') ! .true. if nd_arr C-contiguous
flag = nd_arr%is_ordered('A') ! .true. if nd_arr Fortran- or C-contiguous
You can check the type of an array with ndarray%get_dtype_name
. The
resulting type string are numpy dtype names, not Fortran type names:
integer ierror
type(ndarray) :: nd_arr
real(kind=real64) :: testarr(1) = [42.0_real64]
character(kind=C_CHAR, len=:), allocatable :: dname
ierror = ndarray_create(nd_arr, testarr)
ierror = nd_arr%get_dtype_name(dname)
write(*,*) dname
call nd_arr%destroy
You can get the dimensionality of a ndarray with ndarray%ndim
:
integer :: num_dims
ierror = nd_arr%ndim(num_ndims)
In Fortran it is possible to leave the size of the last dimension of an array unspecified.
This often occurs in classic Fortran codes. Use Fortran array slice notation to give
ndarray_create
/ndarray_create_nocopy
the necessary shape information:
subroutine sub(x, n)
integer n
real x(*)
integer :: ierror
type(ndarray) :: arr
ierror = ndarray_create(arr, x(1:n))
! more code ...
call arr%destroy
end subroutine
or (even better), change the subroutine arguments declaration into:
subroutine sub(x, n)
integer n
real x(n)
integer :: ierror
type(ndarray) :: arr
ierror = ndarray_create(arr, x)
! more code ...
call arr%destroy
end subroutine
This section discusses some points that you should consider when using
ndarray_create_nocopy
.
ndarray_create_nocopy
does not make a copy of the Fortran array's data, but
refers to the memory region that the Fortran array uses directly.
While this is performance- and memorywise ideal, there are some caveats:
Make sure that the array argument to ndarray_create_nocopy
is not a
compiler created temporary array, since the lifetime of the temporary is
system dependent (it might get deallocated after the call to ndarray_create_nocopy
):
! BAD: temporary array created by arithmetic expression
ierror = ndarray_create_nocopy(nd_arr, array*23 + 1)
do not do this either:
! do not use array literal, since lifetime depends on implementation and
! compiler options
ierror = ndarray_create_nocopy(nd_arr, [1, 2, 3, 4, 5])
Also make sure that the array input to ndarray_create_nocopy
is contiguous in
memory. This is not checked by ndarray_create
. (One could check this using
the Fortran 2008 intrinsic is_contiguous
, but not all compilers support it.)
program bad_dont_do_this
use forpy_mod
implicit none
integer :: ierror
integer, target :: array(10)
integer, dimension(:), pointer :: ptr
type(ndarray) :: nd_arr
ierror = forpy_initialize()
array = [1,2,3,4,5,6,7,8,9,10]
ptr => array(1:10:2)
! BAD: 'ptr' points to a non-contiguous slice of 'array'
ierror = ndarray_create_nocopy(nd_arr, ptr)
write(*,*) "ptr = ", ptr
write(*,*) "nd_arr = "
ierror = print_py(nd_arr)
call nd_arr%destroy
call forpy_finalize
end program
When using arrays created with ndarray_create_nocopy
keep in mind that
as long as references to the ndarray are in use, the Fortran data
has to be valid. (not deallocated or out of scope,...).
Especially when returning an ndarray from a subroutine, this can lead
to subtle errors. For example:
program bad_dont_do_this
use forpy_mod
implicit none
integer :: ierror
type(ndarray) :: nd_arr
integer, allocatable, dimension(:) :: some_array
ierror = forpy_initialize()
call return_bad_array(nd_arr)
allocate(some_array(4))
some_array = 99
! Depending on your system, this might even print [99, 99, 99, 99] !
ierror = print_py(nd_arr)
call nd_arr%destroy
call forpy_finalize
CONTAINS
subroutine return_bad_array(nd_arr)
type(ndarray) :: nd_arr
integer, allocatable, dimension(:) :: array
integer :: ierror
allocate(array(4))
array = [1, 2, 3, 4]
ierror = ndarray_create_nocopy(nd_arr, array)
ierror = print_py(nd_arr)
! BAD: array goes out of scope and is deallocated...
! nd_arr then refers to deallocated memory
end subroutine
end program
Solutions: 1) As shown above, use
ndarray_create_empty
, ndarray_create_ones
,... and get_data to fill the array.
- OR: create a copy of the ndarray or use
ndarray_create
Since the Fortran array that has been wrapped by ndarray_create_nocopy
can now be modified not only directly but also indirectly by the ndarray
,
you should add the asynchronous
attribute to the Fortran array declaration, since
without it compiler optimization related bugs can occur (depending on code,
compiler and compiler options).
Alternatively you could also use the volatile
attribute.
For an example that demonstrates the issue
see issue #3.
If you do not want to add the asynchronous
attribute to the array declaration,
you can pass the array to a subroutine with an asynchronous
dummy argument:
real, dimension(n) :: array ! declared without asynchronous
! ... some code
call sub(array)
subroutine sub(array)
real, dimension(:), asynchronous, intent(inout) :: array
! ...
ierror = ndarray_create_nocopy(nd_arr, array)
! ...
end subroutine
Note that the asynchronous
attribute should only be necessary when the
array could be modified.