Computations¶
Algorithms¶
General purpose algorithms.
Pure parallel computations¶

class
reikna.algorithms.
PureParallel
(parameters, code, guiding_array=None, render_kwds=None)¶ Bases:
Computation
A general class for pure parallel computations (i.e. with no interaction between threads).
Parameters:  parameters – a list of
Parameter
objects.  code – a source code for the computation.
Can be a
Snippet
object which will be passedIndices
object for theguiding_array
as the first positional argument, andKernelParameter
objects corresponding toparameters
as the rest of positional arguments. If it is a string, suchSnippet
will be created out of it, with the parameter namesidxs
for the first one and the names ofparameters
for the remaining ones.  guiding_array – an tuple with the array shape, or the name of one of
parameters
. By default, the first parameter is chosen.  render_kwds – a dictionary with render keywords for the
code
.

compiled_signature
(*args)¶ Parameters: args – corresponds to the given parameters
.

classmethod
from_trf
(trf, guiding_array=None)¶ Creates a
PureParallel
instance from aTransformation
object.guiding_array
can be a string with a name of an array parameter fromtrf
, or the correspondingTransformationParameter
object.
 parameters – a list of
Transposition (permutation)¶

class
reikna.algorithms.
Transpose
(arr_t, output_arr_t=None, axes=None, block_width_override=None)¶ Bases:
Computation
Changes the order of axes in a multidimensional array. Works analogous to
numpy.transpose
.Parameters:  arr_t – an arraylike defining the initial array.
 output_arr_t – an arraylike defining the output array.
If
None
, its shape will be derived based on the shape ofarr_t
, its dtype will be equal to that ofarr_t
, and any nondefault offset or strides ofarr_t
will be ignored.  axes – tuple with the new axes order.
If
None
, then axes will be reversed.

compiled_signature
(output:o, input:i)¶ Parameters:  output – an array with all the attributes of
arr_t
, with the shape permuted according toaxes
.  input – an array with all the attributes of
arr_t
.
 output – an array with all the attributes of
Reduction¶

class
reikna.algorithms.
Reduce
(arr_t, predicate, axes=None, output_arr_t=None)¶ Bases:
Computation
Reduces the array over given axis using given binary operation.
Parameters:  arr_t – an arraylike defining the initial array.
 predicate – a
Predicate
object.  axes – a list of nonrepeating axes to reduce over.
If
None
, the whole array will be reduced (in which case the shape of the output array is(1,)
).  output_arr_t – an output array metadata (the shape must still correspond to the result of reducing the original array over given axes, but offset and strides can be set to the desired ones).

compiled_signature
(output:o, input:i)¶ Parameters:  input – an array with the attributes of
arr_t
.  output – an array with the attributes of
arr_t
, with its shape missing axes fromaxes
.
 input – an array with the attributes of
Scan¶

class
reikna.algorithms.
Scan
(arr_t, predicate, axes=None, exclusive=False, max_work_group_size=None, seq_size=None)¶ Bases:
Computation
Scans the array over given axis using given binary operation. Namely, from an array
[a, b, c, d, ...]
and an operation.
, produces[a, a.b, a.b.c, a.b.c.d, ...]
ifexclusive
isFalse
and[0, a, a.b, a.b.c, ...]
ifexclusive
isTrue
(here0
is the operation’s identity element).Parameters:  arr_t – an arraylike defining the initial array.
 predicate – a
Predicate
object.  axes – a list of nonrepeating axes to scan over.
(Note that the result will depend on the order of the axes).
If
None
, the whole array will be scanned over.  exclusive – whether to perform an exclusive scan (see above).
 max_work_group_size – the maximum workgroup size to be used for the scan kernel.
 seq_size – the number of elements to be scanned sequentially. If not given, Reikna will attempt to choose the one resulting in the best performance, but sometimes a manual choice may be better.

compiled_signature
(output:o, input:i)¶ Parameters:  input – an array with the attributes of
arr_t
.  output – an array with the attributes of
arr_t
.
 input – an array with the attributes of
Predicates¶

class
reikna.algorithms.
Predicate
(operation, empty)¶ A predicate used in some of Reikna algorithms (e.g.
Reduce
orScan
).Parameters:  operation – a
Snippet
object with two parameters which will take the names of two arguments to join.  empty – a numpy scalar with the empty value of the argument (the one which, being joined by another argument, does not change it).
 operation – a
Linear algebra¶
Linear algebra algorithms.
Matrix multiplication (dot product)¶

class
reikna.linalg.
MatrixMul
(a_arr, b_arr, out_arr=None, block_width_override=None, transposed_a=False, transposed_b=False)¶ Bases:
Computation
Multiplies two matrices using last two dimensions and batching over remaining dimensions. For batching to work, the products of remaining dimensions should be equal (then the multiplication will be performed piecewise), or one of them should equal 1 (then the multiplication will be batched over the remaining dimensions of the other matrix).
Parameters:  a_arr – an arraylike defining the first argument.
 b_arr – an arraylike defining the second argument.
 out_arr – an arraylike definign the output; if not given, both shape and dtype
will be derived from
a_arr
andb_arr
.  block_width_override – if provided, it will used as a block size of the multiplication kernel.
 transposed_a – if
True
, the first matrix will be transposed before the multiplication.  transposed_b – if
True
, the second matrix will be transposed before the multiplication.

compiled_signature
(output:o, matrix_a:i, matrix_b:i)¶ Parameters:  output – the output of matrix multiplication.
 matrix_a – the first argument.
 matrix_b – the second argument.
Matrix norms¶

class
reikna.linalg.
EntrywiseNorm
(arr_t, order=2, axes=None)¶ Bases:
Computation
Calculates the entrywise matrix norm (same as
numpy.linalg.norm
) of an arbitrary order \(r\):\[A_r = \left( \sum_{i,j,\ldots} A_{i,j,\ldots}^r \right)^{1 / r}\]Parameters:  arr_t – an arraylike defining the initial array.
 order – the order \(r\) (any real number).
 axes – a list of nonrepeating axes to sum over.
If
None
, the norm of the whole array will be calculated.

compiled_signature
(output:o, input:i)¶ Parameters:  input – an array with the attributes of
arr_t
.  output – an array with the attributes of
arr_t
, with its shape missing axes fromaxes
.
 input – an array with the attributes of
Fast Fourier transform and related utilities¶
Fast Fourier Transform¶

class
reikna.fft.
FFT
(arr_t, axes=None)¶ Bases:
Computation
Performs the Fast Fourier Transform. The interface is similar to
numpy.fft.fftn
. The inverse transform is normalized so thatIFFT(FFT(X)) = X
.Parameters:  arr_t – an arraylike defining the problem array.
 axes – a tuple with axes over which to perform the transform. If not given, the transform is performed over all the axes.
Note
Current algorithm works most effectively with array dimensions being power of 2 This mostly applies to the axes over which the transform is performed, beacuse otherwise the computation falls back to the Bluestein’s algorithm, which effectively halves the performance.

compiled_signature
(output:o, input:i, inverse:s)¶ output
andinput
may be the same array.Parameters:  output – an array with the attributes of
arr_t
.  input – an array with the attributes of
arr_t
.  inverse – a scalar value castable to integer.
If
0
,output
contains the forward FFT ofinput
, if1
, the inverse one.
 output – an array with the attributes of
FFT frequency shift¶

class
reikna.fft.
FFTShift
(arr_t, axes=None)¶ Bases:
Computation
Shift the zerofrequency component to the center of the spectrum. The interface is similar to
numpy.fft.fftshift
, and the output is the same for the same array shape and axes.Parameters:  arr_t – an arraylike defining the problem array.
 axes – a tuple with axes over which to perform the shift. If not given, the shift is performed over all the axes.

compiled_signature
(output:o, input:i)¶ output
andinput
may be the same array.Parameters:  output – an array with the attributes of
arr_t
.  input – an array with the attributes of
arr_t
.
 output – an array with the attributes of
Discrete harmonic transform¶

reikna.dht.
get_spatial_grid
(modes, order, add_points=0)¶ Returns the spatial grid required to calculate the
order
power of a function defined in the harmonic mode space of the sizemodes
. Ifadd_points
is 0, the grid has the minimum size required for exact transformation back to the mode space.

reikna.dht.
harmonic
(mode)¶ Returns an eigenfunction of order \(n = \mathrm{mode}\) for the harmonic oscillator:
\[\phi_{n} = \frac{1}{\sqrt[4]{\pi} \sqrt{2^n n!}} H_n(x) \exp(x^2/2),\]where \(H_n\) is the \(n\)th order “physicists’” Hermite polynomial. The normalization is chosen so that \(\int \phi_n^2(x) dx = 1\).

class
reikna.dht.
DHT
(mode_arr, add_points=None, inverse=False, order=1, axes=None)¶ Bases:
Computation
Discrete transform to and from harmonic oscillator modes. With
inverse=True
transforms a function defined by its expansion \(C_m,\,m=0 \ldots M1\) in the mode space with mode functions fromharmonic()
, to the coordinate space (\(F(x)\) on the grid \(x\) fromget_spatial_grid()
). Withinverse=False
guarantees to recover first \(M\) modes of \(F^k(x)\), where \(k\) is theorder
parameter.For multiple dimensions the operation is the same, and the mode functions are products of 1D mode functions, i.e. \(\phi_{l,m,n}^{3D}(x,y,z) = \phi_l(x) \phi_m(y) \phi_n(z)\).
For the detailed description of the algorithm, see Dion & Cances, PRE 67(4) 046706 (2003)
Parameters:  mode_arr – an arraylike object defining the shape of mode space.
If
inverse=False
, its shape is used to define the mode space size.  inverse –
False
for forward (coordinate space > mode space) transform,True
for inverse (mode space > coordinate space) transform.  axes – a tuple with axes over which to perform the transform. If not given, the transform is performed over all the axes.
 order – if
F
is a function in mode space, the number of spatial points is chosen so that the transformationDHT[(DHT^{1}[F])^order]
could be performed.  add_points – a list of the same length as
mode_arr
shape, specifying the number of points in xspace to use in addition to minimally required (0
by default).

compiled_signature_forward
(modes:o, coords:i)¶

compiled_signature_inverse
(coords:o, modes:i)¶ Depending on
inverse
value, either of these two will be created.Parameters:  modes – an array with the attributes of
mode_arr
.  coords – an array with the shape depending on
mode_arr
,axes
,order
andadd_points
, and the dtype ofmode_arr
.
 modes – an array with the attributes of
 mode_arr – an arraylike object defining the shape of mode space.
If
Counterbased random number generators¶
This module is based on the paper by Salmon et al., P. Int. C. High. Perform. 16 (2011). and the source code of Random123 library.
A counterbased randomnumber generator (CBRNG) is a parametrized function \(f_k(c)\), where \(k\) is the key, \(c\) is the counter, and the function \(f_k\) defines a bijection in the set of integer numbers. Being applied to successive counters, the function produces a sequence of pseudorandom numbers. The key is an analogue of the seed of stateful RNGs; if the CBRNG is used to generate random num bers in parallel threads, the key is a combination of a seed and a unique thread number.
There are two types of generators available, threefry
(uses large number of simple functions),
and philox
(uses smaller number of more complicated functions).
The latter one is generally faster on GPUs; see the paper above for detailed comparisons.
These generators can be further specialized to use words=2
or words=4
bitness=32
bit or bitness=64
bit counters.
Obviously, the period of the generator equals to the cardinality of the set of possible counters.
For example, if the counter consits of 4 64bit numbers,
then the period of the generator is \(2^{256}\).
As for the key size, in case of threefry
the key has the same size as the counter,
and for philox
the key is half its size.
The CBRNG
class sets one of the words of the key
(except for philox2x64
, where 32 bit of the only word in the key are used),
the rest are the same for all threads and are derived from the provided seed
.
This limits the maximum number of numbergenerating threads (size
).
philox2x32
has a 32bit key and therefore cannot be used in CBRNG
(although it can be used separately with the help of the kernel API).
The CBRNG
class itself is stateless, same as other computations in Reikna,
so you have to manage the generator state yourself.
The state is created by the create_counters()
method
and contains a size
counters.
This state is then passed to, and updated by a CBRNG
object.

class
reikna.cbrng.
CBRNG
(randoms_arr, generators_dim, sampler, seed=None)¶ Bases:
Computation
Counterbased pseudorandom number generator class.
Parameters:  randoms_arr – an array intended for storing generated random numbers.
 generators_dim – the number of dimensions (counting from the end)
which will use independent generators.
For example, if
randoms_arr
has the shape(100, 200, 300)
andgenerators_dim
is2
, then in every subarray(j, :, :)
,j = 0 .. 99
, every element will use an independent generator.  sampler – a
Sampler
object.  seed –
None
for random seed, or an integer.

classmethod
sampler_name
(randoms_arr, generators_dim, sampler_kwds=None, seed=None)¶ A convenience constructor for the sampler
sampler_name
fromsamplers
. The contents of the dictionarysampler_kwds
will be passed to the sampler constructor function (withbijection
being created automatically, anddtype
taken fromrandoms_arr
).

compiled_signature
(counters:io, randoms:o)¶ Parameters:  counters – the RNG “state”.
All attributes are equal to the ones of the result of
create_counters()
.  randoms – generated random numbers.
All attributes are equal to the ones of
randoms_arr
from the constructor.
 counters – the RNG “state”.
All attributes are equal to the ones of the result of
Kernel API¶

class
reikna.cbrng.bijections.
Bijection
(module, word_dtype, key_dtype, counter_dtype)¶ Contains a CBRNG bijection module and accompanying metadata. Supports
__process_modules__
protocol.
word_dtype
¶ The data type of the integer word used by the generator.

key_words
¶ The number of words used by the key.

counter_words
¶ The number of words used by the counter.

key_dtype
¶ The
numpy.dtype
object representing a bijection key. Contains a single array fieldv
withkey_words
ofword_dtype
elements.

counter_dtype
¶ The
numpy.dtype
object representing a bijection counter. Contains a single array fieldv
withkey_words
ofword_dtype
elements.

raw_functions
¶ A dictionary
dtype:function_name
of available functionsfunction_name
inmodule
that produce a random fullrange integerdtype
from aState
, advancing it. Available functions:get_raw_uint32()
,get_raw_uint64()
.

module
¶ The module containing the CBRNG function. It provides the C functions below.

COUNTER_WORDS
¶ Contains the value of
counter_words
.

Word
¶ Contains the type corresponding to
word_dtype
.

Key
¶ Describes the bijection key. Alias for the structure generated from
key_dtype
.
Word v[KEY_WORDS]


Counter
¶ Describes the bijection counter, or its output. Alias for the structure generated from
counter_dtype
.
Word v[COUNTER_WORDS]


Counter
get_next_unused_counter
(State state)¶ Extracts a counter which has not been used in random sampling.

uint32
¶ A type of unsigned 32bit word, corresponds to
numpy.uint32
.

uint64
¶ A type of unsigned 64bit word, corresponds to
numpy.uint64
.


reikna.cbrng.bijections.
philox
(bitness, counter_words, rounds=10)¶ A CBRNG based on a low number of slow rounds (multiplications).
Parameters:  bitness –
32
or64
, corresponds to the size of generated random integers.  counter_words –
2
or4
, number of integers generated in one go.  rounds –
1
to12
, the more rounds, the better randomness is achieved. Default values are big enough to qualify as PRNG.
Returns: a
Bijection
object. bitness –

reikna.cbrng.bijections.
threefry
(bitness, counter_words, rounds=20)¶ A CBRNG based on a big number of fast rounds (bit rotations).
Parameters:  bitness –
32
or64
, corresponds to the size of generated random integers.  counter_words –
2
or4
, number of integers generated in one go.  rounds –
1
to72
, the more rounds, the better randomness is achieved. Default values are big enough to qualify as PRNG.
Returns: a
Bijection
object. bitness –

class
reikna.cbrng.samplers.
Sampler
(bijection, module, dtype, randoms_per_call=1, deterministic=False)¶ Contains a random distribution sampler module and accompanying metadata. Supports
__process_modules__
protocol.
deterministic
¶ If
True
, every sampled random number consumes the same amount of counters.

randoms_per_call
¶ How many random numbers one call to
sample
creates.

dtype
¶ The data type of one random value produced by the sampler.

module
¶ The module containing the distribution sampling function. It provides the C functions below.

RANDOMS_PER_CALL
¶ Contains the value of
randoms_per_call
.

Result
¶ Describes the sampling result.

value v[RANDOMS_PER_CALL]



reikna.cbrng.samplers.
gamma
(bijection, dtype, shape=1, scale=1)¶ Generates random numbers from the gamma distribution
\[P(x) = x^{k1} \frac{e^{x/\theta}}{\theta^k \Gamma(k)},\]where \(k\) is
shape
, and \(\theta\) isscale
. Supported dtypes:float(32/64)
. Returns aSampler
object.

reikna.cbrng.samplers.
normal_bm
(bijection, dtype, mean=0, std=1)¶ Generates normally distributed random numbers with the mean
mean
and the standard deviationstd
using BoxMuller transform. Supported dtypes:float(32/64)
,complex(64/128)
. Produces two random numbers per call for real types and one number for complex types. Returns aSampler
object.Note
In case of a complex
dtype
,std
refers to the standard deviation of the complex numbers (same asnumpy.std()
returns), not real and imaginary components (which will be normally distributed with the standard deviationstd / sqrt(2)
). Consequently, whilemean
is of typedtype
,std
must be real.

reikna.cbrng.samplers.
uniform_float
(bijection, dtype, low=0, high=1)¶ Generates uniformly distributed floatingpoints numbers in the interval
[low, high)
. Supported dtypes:float(32/64)
. A fixed number of counters is used in each thread. Returns aSampler
object.

reikna.cbrng.samplers.
uniform_integer
(bijection, dtype, low, high=None)¶ Generates uniformly distributed integer numbers in the interval
[low, high)
. Ifhigh
isNone
, the interval is[0, low)
. Supported dtypes: any numpy integers. If the size of the interval is a power of 2, a fixed number of counters is used in each thread. Returns aSampler
object.

reikna.cbrng.samplers.
vonmises
(bijection, dtype, mu=0, kappa=1)¶ Generates random numbers from the von Mises distribution
\[P(x) = \frac{\exp(\kappa \cos(x  \mu))}{2 \pi I_0(\kappa)},\]where \(\mu\) is the mode, \(\kappa\) is the dispersion, and \(I_0\) is the modified Bessel function of the first kind. Supported dtypes:
float(32/64)
. Returns aSampler
object.

class
reikna.cbrng.tools.
KeyGenerator
(module, base_key)¶ Contains a key generator module and accompanying metadata. Supports
__process_modules__
protocol.
module
¶ A module with the key generator function:

Key
key_from_int
(int idx)¶ Generates and returns a key, suitable for the bijection which was given to the constructor.

classmethod
create
(bijection, seed=None, reserve_id_space=True)¶ Creates a generator.
Parameters:  bijection – a
Bijection
object.  seed – an integer, or numpy array of 32bit unsigned integers.
 reserve_id_space – if
True
, the last 32 bit of the key will be reserved for the thread identifier. As a result, the total size of the key should be 64 bit or more. IfFalse
, the thread identifier will be just added to the key, which will still result in different keys for different threads, with the danger that different seeds produce the same sequences.
 bijection – a

reference
(idx)¶ Reference function that returns the key given the thread identifier. Uses the same algorithm as the module.
