API
Top level user functions:
|
Test whether all array elements along a given axis evaluate to True. |
|
Returns True if two arrays are element-wise equal within a tolerance. |
|
Return the angle of the complex argument. |
|
Test whether any array element along a given axis evaluates to True. |
|
Apply a function to 1-D slices along the given axis. |
|
Apply a function repeatedly over multiple axes. |
|
Return evenly spaced values from start to stop with step size step. |
|
This docstring was copied from numpy.arccos. |
|
This docstring was copied from numpy.arccosh. |
|
This docstring was copied from numpy.arcsin. |
|
This docstring was copied from numpy.arcsinh. |
|
This docstring was copied from numpy.arctan. |
|
This docstring was copied from numpy.arctan2. |
|
This docstring was copied from numpy.arctanh. |
|
Return the maximum of an array or maximum along an axis. |
|
Return the minimum of an array or minimum along an axis. |
|
Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest. |
|
Find the indices of array elements that are non-zero, grouped by element. |
|
Evenly round to the given number of decimals. |
|
This docstring was copied from numpy.array. |
|
Convert the input to a dask array. |
|
Convert the input to a dask array. |
|
Convert inputs to arrays with at least one dimension. |
|
View inputs as arrays with at least two dimensions. |
|
View inputs as arrays with at least three dimensions. |
|
Compute the weighted average along the specified axis. |
|
This docstring was copied from numpy.bincount. |
|
This docstring was copied from numpy.bitwise_and. |
|
This docstring was copied from numpy.invert. |
|
This docstring was copied from numpy.bitwise_or. |
|
This docstring was copied from numpy.bitwise_xor. |
|
Assemble an nd-array from nested lists of blocks. |
|
Tensor operation: Generalized inner and outer products |
|
Broadcast any number of arrays against each other. |
|
Broadcast an array to a new shape. |
|
Coarsen array by applying reduction to fixed size neighborhoods |
|
This docstring was copied from numpy.ceil. |
|
Construct an array from an index array and a set of arrays to choose from. |
|
Clip (limit) the values in an array. |
|
Return selected slices of an array along given axis. |
|
Concatenate arrays along an existing axis |
|
This docstring was copied from numpy.conjugate. |
|
This docstring was copied from numpy.copysign. |
|
Return Pearson product-moment correlation coefficients. |
|
This docstring was copied from numpy.cos. |
|
This docstring was copied from numpy.cosh. |
|
Counts the number of non-zero values in the array |
|
Estimate a covariance matrix, given data and weights. |
|
Return the cumulative product of elements along a given axis. |
|
Return the cumulative sum of the elements along a given axis. |
|
This docstring was copied from numpy.deg2rad. |
|
This docstring was copied from numpy.degrees. |
|
Extract a diagonal or construct a diagonal array. |
|
Return specified diagonals. |
|
Calculate the n-th discrete difference along the given axis. |
|
This docstring was copied from numpy.divmod. |
|
Return the indices of the bins to which each value in input array belongs. |
|
This docstring was copied from numpy.dot. |
|
Stack arrays in sequence depth wise (along third axis). |
|
The differences between consecutive elements of an array. |
|
This docstring was copied from numpy.einsum. |
|
Blocked variant of empty |
|
Return a new array with the same shape and type as a given array. |
|
This docstring was copied from numpy.exp. |
|
This docstring was copied from numpy.expm1. |
|
Return a 2-D Array with ones on the diagonal and zeros elsewhere. |
|
This docstring was copied from numpy.fabs. |
|
Round to nearest integer towards zero. |
|
Return indices that are non-zero in the flattened version of a. |
|
Reverse element order along axis. |
|
Flip array in the up/down direction. |
|
Flip array in the left/right direction. |
|
This docstring was copied from numpy.floor. |
|
This docstring was copied from numpy.fmax. |
|
This docstring was copied from numpy.fmin. |
|
This docstring was copied from numpy.fmod. |
|
This docstring was copied from numpy.frexp. |
|
Construct an array by executing a function over each coordinate. |
|
This docstring was copied from numpy.frompyfunc. |
|
Blocked variant of full |
|
Return a full array with the same shape and type as a given array. |
|
Return the gradient of an N-dimensional array. |
|
Blocked variant of |
|
Stack arrays in sequence horizontally (column wise). |
|
This docstring was copied from numpy.hypot. |
|
Return the imaginary part of the complex argument. |
|
Implements NumPy’s |
|
Insert values along the given axis before the given indices. |
|
This docstring was copied from numpy.invert. |
|
Returns a boolean array where two arrays are element-wise equal within a tolerance. |
|
Returns a bool array, where True if input element is complex. |
|
This docstring was copied from numpy.isfinite. |
|
Calculates element in test_elements, broadcasting over element only. |
|
This docstring was copied from numpy.isinf. |
|
This docstring was copied from numpy.equal. |
|
This docstring was copied from numpy.isnan. |
|
pandas.isnull for dask arrays |
|
This docstring was copied from numpy.equal. |
|
Returns a bool array, where True if input element is real. |
|
This docstring was copied from numpy.ldexp. |
|
Return num evenly spaced values over the closed interval [start, stop]. |
|
This docstring was copied from numpy.log. |
|
This docstring was copied from numpy.log10. |
|
This docstring was copied from numpy.log1p. |
|
This docstring was copied from numpy.log2. |
|
This docstring was copied from numpy.logaddexp. |
|
This docstring was copied from numpy.logaddexp2. |
|
This docstring was copied from numpy.logical_and. |
|
This docstring was copied from numpy.logical_not. |
|
This docstring was copied from numpy.logical_or. |
|
This docstring was copied from numpy.logical_xor. |
|
Map a function over blocks of arrays with some overlap |
|
Map a function across all blocks of a dask array. |
|
This docstring was copied from numpy.matmul. |
|
Return the maximum of an array or maximum along an axis. |
|
This docstring was copied from numpy.maximum. |
|
Compute the arithmetic mean along the specified axis. |
|
Compute the median along the specified axis. |
|
Return coordinate matrices from coordinate vectors. |
|
Return the minimum of an array or minimum along an axis. |
|
This docstring was copied from numpy.minimum. |
|
This docstring was copied from numpy.modf. |
|
|
|
Move axes of an array to new positions. |
|
Return the maximum of an array or maximum along an axis, ignoring any NaNs. |
|
Return minimum of an array or minimum along an axis, ignoring any NaNs. |
|
Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. |
|
Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
|
Return the maximum of an array or maximum along an axis, ignoring any NaNs. |
|
Compute the arithmetic mean along the specified axis, ignoring NaNs. |
|
Compute the median along the specified axis, while ignoring NaNs. |
|
Return minimum of an array or minimum along an axis, ignoring any NaNs. |
|
Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones. |
|
Compute the standard deviation along the specified axis, while ignoring NaNs. |
|
Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
|
Compute the variance along the specified axis, while ignoring NaNs. |
|
Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords. |
|
This docstring was copied from numpy.nextafter. |
|
Return the indices of the elements that are non-zero. |
|
pandas.notnull for dask arrays |
|
Blocked variant of ones |
|
Return an array of ones with the same shape and type as a given array. |
|
Compute the outer product of two vectors. |
|
Pad an array. |
|
Approximate percentile of 1-D array |
|
A warning given when bad chunking may cause poor performance |
|
Evaluate a piecewise-defined function. |
|
Return the product of array elements over a given axis. |
|
Range of values (maximum - minimum) along an axis. |
|
This docstring was copied from numpy.rad2deg. |
|
This docstring was copied from numpy.radians. |
|
Return a contiguous flattened array. |
|
Return the real part of the complex argument. |
|
Convert blocks in dask array x for new chunks. |
|
General version of reductions |
|
Repeat elements of an array. |
|
Reshape array to new shape |
|
This docstring was copied from numpy.result_type. |
|
This docstring was copied from numpy.rint. |
|
Roll array elements along a given axis. |
|
|
|
Round an array to the given number of decimals. |
|
This docstring was copied from numpy.sign. |
|
This docstring was copied from numpy.signbit. |
|
This docstring was copied from numpy.sin. |
|
This docstring was copied from numpy.sinh. |
|
This docstring was copied from numpy.sqrt. |
|
This docstring was copied from numpy.square. |
|
Remove single-dimensional entries from the shape of an array. |
|
Stack arrays along a new axis |
|
Compute the standard deviation along the specified axis. |
|
Sum of array elements over a given axis. |
|
Take elements from an array along an axis. |
|
This docstring was copied from numpy.tan. |
|
This docstring was copied from numpy.tanh. |
|
Compute tensor dot product along specified axes. |
|
Construct an array by repeating A the number of times given by reps. |
|
Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest. |
|
Return the sum along diagonals of the array. |
|
Permute the dimensions of an array. |
|
Lower triangle of an array with elements above the k-th diagonal zeroed. |
|
Upper triangle of an array with elements below the k-th diagonal zeroed. |
|
This docstring was copied from numpy.trunc. |
|
Unify chunks across a sequence of arrays |
|
Find the unique elements of an array. |
|
This docstring was copied from numpy.unravel_index. |
|
Compute the variance along the specified axis. |
|
This docstring was copied from numpy.vdot. |
|
Stack arrays in sequence vertically (row wise). |
|
This docstring was copied from numpy.where. |
|
Blocked variant of zeros |
|
Return an array of zeros with the same shape and type as a given array. |
Fast Fourier Transforms
|
Wrap 1D, 2D, and ND real and complex FFT functions |
|
Wrapping of numpy.fft.fft |
|
Wrapping of numpy.fft.fft2 |
|
Wrapping of numpy.fft.fftn |
|
Wrapping of numpy.fft.ifft |
|
Wrapping of numpy.fft.ifft2 |
|
Wrapping of numpy.fft.ifftn |
|
Wrapping of numpy.fft.rfft |
|
Wrapping of numpy.fft.rfft2 |
|
Wrapping of numpy.fft.rfftn |
|
Wrapping of numpy.fft.irfft |
|
Wrapping of numpy.fft.irfft2 |
|
Wrapping of numpy.fft.irfftn |
|
Wrapping of numpy.fft.hfft |
|
Wrapping of numpy.fft.ihfft |
|
Return the Discrete Fourier Transform sample frequencies. |
|
Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft). |
|
Shift the zero-frequency component to the center of the spectrum. |
|
The inverse of fftshift. |
Linear Algebra
|
Returns the Cholesky decomposition, A=LL∗ or A=U∗U of a Hermitian positive-definite matrix A. |
|
Compute the inverse of a matrix with LU decomposition and forward / backward substitutions. |
|
Return the least-squares solution to a linear matrix equation using QR decomposition. |
|
Compute the lu decomposition of a matrix. |
|
Matrix or vector norm. |
|
Compute the qr factorization of a matrix. |
|
Solve the equation |
|
Solve the equation a x = b for x, assuming a is a triangular matrix. |
|
Compute the singular value decomposition of a matrix. |
|
Randomly compressed rank-k thin Singular Value Decomposition. |
|
Direct Short-and-Fat QR |
|
Direct Tall-and-Skinny QR algorithm |
Masked Arrays
|
Return the weighted average of array over the given axis. |
|
Return input as an array with masked data replaced by a fill value. |
|
Return input with invalid data masked and replaced by a fill value. |
|
Return the data of a masked array as an ndarray. |
Return the mask of a masked array, or full boolean array of False. |
|
|
An array class with possibly masked values. |
|
Mask an array where equal to a given value. |
|
Mask an array where greater than a given value. |
|
Mask an array where greater than or equal to a given value. |
|
Mask an array inside a given interval. |
Mask an array where invalid values occur (NaNs or infs). |
|
|
Mask an array where less than a given value. |
|
Mask an array where less than or equal to a given value. |
|
Mask an array where not equal to a given value. |
|
Mask an array outside a given interval. |
|
Mask using floating point equality. |
|
Mask an array where a condition is met. |
|
Set the filling value of a, if a is a masked array. |
Random
|
Draw samples from a Beta distribution. |
|
Draw samples from a binomial distribution. |
|
Draw samples from a chi-square distribution. |
|
Generates a random sample from a given 1-D array |
|
Draw samples from an exponential distribution. |
|
Draw samples from an F distribution. |
|
Draw samples from a Gamma distribution. |
|
Draw samples from the geometric distribution. |
|
Draw samples from a Gumbel distribution. |
|
Draw samples from a Hypergeometric distribution. |
|
Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay). |
|
Draw samples from a logistic distribution. |
|
Draw samples from a log-normal distribution. |
|
Draw samples from a logarithmic series distribution. |
|
Draw samples from a negative binomial distribution. |
|
Draw samples from a noncentral chi-square distribution. |
|
Draw samples from the noncentral F distribution. |
|
Draw random samples from a normal (Gaussian) distribution. |
|
Draw samples from a Pareto II or Lomax distribution with specified shape. |
|
Randomly permute a sequence, or return a permuted range. |
|
Draw samples from a Poisson distribution. |
|
Draws samples in [0, 1] from a power distribution with positive exponent a - 1. |
|
Return random integers from low (inclusive) to high (exclusive). |
|
Return random floats in the half-open interval [0.0, 1.0). |
|
Return random floats in the half-open interval [0.0, 1.0). |
|
Draw samples from a Rayleigh distribution. |
|
Draw samples from a standard Cauchy distribution with mode = 0. |
|
Draw samples from the standard exponential distribution. |
|
Draw samples from a standard Gamma distribution. |
|
Draw samples from a standard Normal distribution (mean=0, stdev=1). |
|
Draw samples from a standard Student’s t distribution with df degrees of freedom. |
|
Draw samples from the triangular distribution over the interval |
|
Draw samples from a uniform distribution. |
|
Draw samples from a von Mises distribution. |
|
Draw samples from a Wald, or inverse Gaussian, distribution. |
|
Draw samples from a Weibull distribution. |
|
Standard distributions |
Stats
|
Calculate the T-test for the means of two independent samples of scores. |
|
Calculate the T-test for the mean of ONE group of scores. |
|
Calculate the t-test on TWO RELATED samples of scores, a and b. |
|
Calculate a one-way chi-square test. |
|
Cressie-Read power divergence statistic and goodness of fit test. |
|
Compute the sample skewness of a data set. |
|
Test whether the skew is different from the normal distribution. |
|
Compute the kurtosis (Fisher or Pearson) of a dataset. |
|
Test whether a dataset has normal kurtosis. |
|
Test whether a sample differs from a normal distribution. |
|
Perform one-way ANOVA. |
|
Calculate the nth moment about the mean for a sample. |
Image Support
|
Read a stack of images into a dask array |
Slightly Overlapping Computations
|
Share boundaries between neighboring blocks |
|
Map a function over blocks of arrays with some overlap |
|
Trim sides from each block |
|
Trim sides from each block. |
Create and Store Arrays
|
Create dask array from something that looks like an array |
|
Create a dask array from a dask delayed value |
|
Load dask array from stack of npy files |
|
Load array from the zarr storage format |
|
Load array from the TileDB storage format |
|
Store dask arrays in array-like objects, overwrite data in target |
|
Store arrays in HDF5 file |
|
Save array to the zarr storage format |
|
Write dask array to a stack of .npy files |
|
Save array to the TileDB storage format |
Generalized Ufuncs
|
Apply a generalized ufunc or similar python function to arrays. |
|
Decorator for |
|
Binds pyfunc into |
Internal functions
|
Tensor operation: Generalized inner and outer products |
|
Normalize chunks to tuple of tuples |
Other functions
dask.array.from_array(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None)
Create dask array from something that looks like an array
Input must have a .shape, .ndim, .dtype and support numpy-style slicing.
- Parameters
- x: array_like
- chunks: int, tuple
How to chunk the array. Must be one of the following forms:
A blocksize like 1000.
A blockshape like (1000, 1000).
Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).
A size in bytes, like “100 MiB” which will choose a uniform block-like shape
The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size
-1 or None as a blocksize indicate the size of the corresponding dimension.
- name: str, optional
The key name to use for the array. Defaults to a hash of x. By default, hash uses python’s standard sha1. This behaviour can be changed by installing cityhash, xxhash or murmurhash. If installed, a large-factor speedup can be obtained in the tokenisation step. Use name=False to generate a random name instead of hashing (fast).
Note
Because this name is used as the key in task graphs, you should ensure that it uniquely identifies the data contained within. If you’d like to provide a descriptive name that is still unique, combine the descriptive name with dask.base.tokenize() of the array_like. See Task Graphs for more.
- lock: bool or Lock, optional
If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.
- asarray: bool, optional
If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.
- fancy: bool, optional
If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.
- meta: Array-like, optional
The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.
Examples
>>> import h5py
>>> import dask.array as da
>>> x = h5py.File('...')['/data/path']
>>> a = da.from_array(x, chunks=(1000, 1000))
If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.
>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)
If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.
>>> a = da.from_array(x, chunks='auto')
>>> a = da.from_array(x, chunks='100 MiB')
>>> a = da.from_array(x)
If providing a name, ensure that it is unique
>>> import dask.base
>>> token = dask.base.tokenize(x)
>>> a = da.from_array(x, name='myarray-' + token)
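As a self-contained illustration of the chunks forms listed under Parameters, the following sketch wraps a small in-memory NumPy array instead of an on-disk dataset. The array x and the 'myarray-' prefix are illustrative choices, not part of the original examples.

```python
import numpy as np
import dask.array as da
from dask.base import tokenize

# Small in-memory stand-in for an on-disk dataset (illustrative only).
x = np.arange(24).reshape(4, 6)

a = da.from_array(x, chunks=2)                 # one block size for every dimension
b = da.from_array(x, chunks=(2, 3))            # a block shape
c = da.from_array(x, chunks=((1, 3), (4, 2)))  # explicit sizes of all blocks
d = da.from_array(x, chunks=(-1, 3))           # -1 spans the whole dimension

for arr in (a, b, c, d):
    print(arr.chunks)

# A descriptive name made unique by appending a token derived from the data.
named = da.from_array(x, chunks=(2, 3), name="myarray-" + tokenize(x))
print(named.name)
```

Whatever form is passed, the resulting .chunks attribute is always normalized to the explicit tuple-of-tuples form.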
dask.array.from_delayed(value, shape, dtype=None, meta=None, name=None)
Create a dask array from a dask delayed value
This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.
The dask array will consist of a single chunk.
Examples
>>> import dask
>>> import dask.array as da
>>> import numpy as np
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
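The combination with concatenate mentioned above might look like the following sketch. The load_block function is a hypothetical stand-in for whatever produces each piece of data (for example, a file reader); it is not part of dask.

```python
import numpy as np
import dask
import dask.array as da


def load_block(i):
    """Hypothetical loader: returns a (2, 3) block filled with the value i."""
    return np.full((2, 3), float(i))


# Each delayed call becomes a single-chunk dask array; shape and dtype must be
# stated up front because the delayed value has not been computed yet.
blocks = [
    da.from_delayed(dask.delayed(load_block)(i), shape=(2, 3), dtype=float)
    for i in range(4)
]

# Joining the single-chunk arrays yields one array with four chunks.
joined = da.concatenate(blocks, axis=0)
print(joined.shape)
print(joined.chunks)
```

No loader runs until joined.compute() (or another computation on joined) is called.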
dask.array.store(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)
Store dask arrays in array-like objects, overwrite data in target
This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
- Parameters
- sources: Array or iterable of Arrays
- targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
- lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
- regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
- compute: boolean, optional
If true compute immediately, return dask.delayed.Delayed otherwise
- return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ...
>>> import h5py
>>> f = h5py.File('myfile.hdf5', mode='a')
>>> dset = f.create_dataset('/data', shape=x.shape,
...                         chunks=x.chunks,
...                         dtype='f8')
>>> store(x, dset)
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3])
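The regions and compute parameters can be tried without an HDF5 file. The sketch below uses plain NumPy arrays as targets, which is an assumption made for illustration; any object supporting setitem should behave the same way.

```python
import numpy as np
import dask.array as da

x = da.ones((4, 6), chunks=(2, 3)) * 2

# A plain NumPy array stands in for an HDF5 dataset here.
target = np.zeros((4, 6))
da.store(x, target)

# regions: write x into rows 4..7 of a larger target.
big = np.zeros((8, 6))
da.store(x, big, regions=(slice(4, 8), slice(0, 6)))

# compute=False defers the write until .compute() is called.
lazy_target = np.zeros((4, 6))
pending = da.store(x, lazy_target, compute=False)
written_before = bool(lazy_target.any())
pending.compute()
```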
dask.array.coarsen(reduction, x, axes, trim_excess=False, **kwargs)
Coarsen array by applying reduction to fixed size neighborhoods
- Parameters
- reduction: function
Function like np.sum, np.mean, etc…
- x: np.ndarray
Array to be coarsened
- axes: dict
Mapping of axis to coarsening factor
Examples
>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])
Provide dictionary of scale per dimension
>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
       [12, 15]])
You must avoid excess elements explicitly
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
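The examples above operate on NumPy arrays. A sketch of the same two-dimensional reduction on a chunked dask array follows; the chunk sizes are chosen here (as an assumption for the example) to be multiples of the coarsening factors, and the NumPy reshape at the end is only a cross-check, not how dask implements the operation.

```python
import numpy as np
import dask.array as da

data = np.arange(24).reshape((4, 6))
# Chunk sizes (2, 3) are multiples of the coarsening factors used below.
x = da.from_array(data, chunks=(2, 3))

coarse = da.coarsen(np.min, x, {0: 2, 1: 3})
result = coarse.compute()
print(result)

# The same reduction written with plain NumPy: split each axis into
# (number of neighborhoods, neighborhood size) and reduce over the sizes.
expected = data.reshape(2, 2, 2, 3).min(axis=(1, 3))
```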
dask.array.stack(seq, axis=0, allow_unknown_chunksizes=False)
Stack arrays along a new axis
Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)
- Parameters
- seq: list of dask.arrays
- axis: int
Dimension along which to align all of the arrays
- allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
See also
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)
Result is a new dask Array
dask.array.concatenate(seq, axis=0, allow_unknown_chunksizes=False)
Concatenate arrays along an existing axis
Given a sequence of dask Arrays, form a new dask Array by joining them along an existing dimension (axis=0 by default)
- Parameters
- seq: list of dask.arrays
- axis: int
Dimension along which to align all of the arrays
- allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
See also
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)
Result is a new dask Array
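To see how stack and concatenate differ in the chunk structure of their results, the following sketch builds three inputs with distinct fill values (an illustrative choice) and inspects both outputs.

```python
import numpy as np
import dask.array as da

data = [da.from_array(np.full((4, 4), i), chunks=(2, 2)) for i in range(3)]

stacked = da.stack(data, axis=0)       # adds a new leading axis
joined = da.concatenate(data, axis=0)  # extends the existing axis 0

print(stacked.shape, stacked.chunks)
print(joined.shape, joined.chunks)
```

With stack, each input becomes one chunk of size 1 along the new axis; with concatenate, the inputs' own chunks are laid end to end along the existing axis.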
dask.array.all(a, axis=None, keepdims=False, split_every=None, out=None)
Test whether all array elements along a given axis evaluate to True.
This docstring was copied from numpy.all.
Some inconsistencies with the Dask version may exist.
- Parameters
- a: array_like
Input array or object that can be converted to an array.
- axis: None or int or tuple of ints, optional
Axis or axes along which a logical AND reduction is performed. The default (axis=None) is to perform a logical AND over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.7.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
- out: ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). See ufuncs-output-type for more details.
- keepdims: bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the all method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- Returns
- all: ndarray, bool
A new boolean or array is returned unless out is specified, in which case a reference to out is returned.
See also
ndarray.all
equivalent method
any
Test whether any element along a given axis evaluates to True.
Notes
Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.
Examples
>>> np.all([[True,False],[True,True]])
False
>>> np.all([[True,False],[True,True]], axis=0)
array([ True, False])
>>> np.all([-1, 4, 5])
True
>>> np.all([1.0, np.nan])
True
>>> o=np.array(False)
>>> z=np.all([-1, 4, 5], out=o)
>>> id(z), id(o), z
(28293632, 28293632, array(True)) # may vary
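The examples above call the NumPy function. A sketch of the corresponding dask calls is given below; the main visible difference is that the result is lazy until compute() is called. The input values mirror the NumPy examples, and the chunking is an arbitrary choice for illustration.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[True, False], [True, True]]), chunks=(1, 2))

lazy = da.all(x, axis=0)      # a dask array; no work has happened yet
per_column = lazy.compute()
overall = bool(da.all(x).compute())

# keepdims=True keeps the reduced axis with size one.
kept = da.all(x, axis=1, keepdims=True)

# NaN is non-zero, so it counts as True.
with_nan = bool(da.all(da.from_array(np.array([1.0, np.nan]), chunks=1)).compute())

print(per_column, overall, kept.shape, with_nan)
```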
dask.array.allclose(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)
Returns True if two arrays are element-wise equal within a tolerance.
This docstring was copied from numpy.allclose.
Some inconsistencies with the Dask version may exist.
The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
NaNs are treated as equal if they are in the same place and if equal_nan=True. Infs are treated as equal if they are in the same place and of the same sign in both arrays.
- Parameters
- a, b: array_like
Input arrays to compare.
- rtol: float
The relative tolerance parameter (see Notes).
- atol: float
The absolute tolerance parameter (see Notes).
- equal_nan: bool
Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.
New in version 1.10.0.
- Returns
- allclose: bool
Returns True if the two arrays are equal within the given tolerance; False otherwise.
Notes
If the following equation is element-wise True, then allclose returns True.
absolute(a - b) <= (atol + rtol * absolute(b))
The above equation is not symmetric in a and b, so that allclose(a, b) might be different from allclose(b, a) in some rare cases.
The comparison of a and b uses standard broadcasting, which means that a and b need not have the same shape in order for allclose(a, b) to evaluate to True. The same is true for equal but not array_equal.
Examples
>>> np.allclose([1e10,1e-7], [1.00001e10,1e-8])
False
>>> np.allclose([1e10,1e-8], [1.00001e10,1e-9])
True
>>> np.allclose([1e10,1e-8], [1.0001e10,1e-9])
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan])
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
True
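The asymmetry described in the Notes is easiest to see with a large relative tolerance. The values below are chosen purely for illustration, and the sketch uses the NumPy function that this docstring was copied from.

```python
import numpy as np

a, b = 100.0, 110.0
rtol, atol = 0.095, 0.0

# |a - b| = 10 is compared against atol + rtol * |second argument|.
forward = np.allclose(a, b, rtol=rtol, atol=atol)   # threshold 0.095 * 110 = 10.45
backward = np.allclose(b, a, rtol=rtol, atol=atol)  # threshold 0.095 * 100 = 9.5

# The same test written out by hand from the formula in the Notes.
manual = bool(np.absolute(a - b) <= atol + rtol * np.absolute(b))
print(forward, backward, manual)
```

Because the tolerance scales with the second argument only, swapping the arguments changes the threshold and, in this case, the result.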
dask.array.angle(x, deg=0)
Return the angle of the complex argument.
This docstring was copied from numpy.angle.
Some inconsistencies with the Dask version may exist.
- Parameters
- z: array_like (Not supported in Dask)
A complex number or sequence of complex numbers.
- deg: bool, optional
Return angle in degrees if True, radians if False (default).
- Returns
- angle: ndarray or scalar
The counterclockwise angle from the positive real axis on the complex plane in the range (-pi, pi], with dtype as numpy.float64.
Changed in version 1.16.0: This function works on subclasses of ndarray like ma.array.
See also
arctan2
absolute
Examples
>>> np.angle([1.0, 1.0j, 1+1j])               # in radians
array([ 0.        ,  1.57079633,  0.78539816]) # may vary
>>> np.angle(1+1j, deg=True)                  # in degrees
45.0
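The relationship to arctan2 listed under See also can be checked directly. This is a small NumPy sketch with illustrative input values covering all four axis directions.

```python
import numpy as np

z = np.array([1.0, 1.0j, 1 + 1j, -1.0, -1.0j])

angles = np.angle(z)
# arctan2 of the imaginary and real parts gives the same angles.
via_arctan2 = np.arctan2(z.imag, z.real)

degrees = np.angle(z, deg=True)
print(angles)
print(degrees)
```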
dask.array.any(a, axis=None, keepdims=False, split_every=None, out=None)
Test whether any array element along a given axis evaluates to True.
This docstring was copied from numpy.any.
Some inconsistencies with the Dask version may exist.
Returns single boolean unless axis is not None
- Parameters
- a: array_like
Input array or object that can be converted to an array.
- axis: None or int or tuple of ints, optional
Axis or axes along which a logical OR reduction is performed. The default (axis=None) is to perform a logical OR over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.7.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
- out: ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if it is of type float, then it will remain so, returning 1.0 for True and 0.0 for False, regardless of the type of a). See ufuncs-output-type for more details.
- keepdims: bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the any method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- Returns
- any: bool or ndarray
A new boolean or ndarray is returned unless out is specified, in which case a reference to out is returned.
See also
ndarray.any
equivalent method
all
Test whether all elements along a given axis evaluate to True.
Notes
Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.
Examples
>>> np.any([[True, False], [True, True]])
True
>>> np.any([[True, False], [False, False]], axis=0)
array([ True, False])
>>> np.any([-1, 0, 5])
True
>>> np.any(np.nan)
True
>>> o=np.array(False)
>>> z=np.any([-1, 4, 5], out=o)
>>> z, o
(array(True), array(True))
>>> # Check now that z is a reference to o
>>> z is o
True
>>> id(z), id(o) # identity of z and o
(191614240, 191614240)
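The keepdims behaviour described under Parameters is useful when the reduced result has to broadcast against the input. The sketch below, with made-up data, flags rows of a dask array that contain a NaN and zeroes them out.

```python
import numpy as np
import dask.array as da

x = da.from_array(
    np.array([[1.0, np.nan], [2.0, 3.0], [np.nan, np.nan]]), chunks=(2, 2)
)

# One flag per row, kept as a column of shape (3, 1) so it broadcasts against x.
row_has_nan = da.any(da.isnan(x), axis=1, keepdims=True)

# Zero out every row that contains at least one NaN.
cleaned = da.where(row_has_nan, 0.0, x).compute()
print(cleaned)
```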
dask.array.apply_along_axis(func1d, axis, arr, *args, dtype=None, shape=None, **kwargs)
Apply a function to 1-D slices along the given axis.
This docstring was copied from numpy.apply_along_axis.
Some inconsistencies with the Dask version may exist.
Apply a function to 1-D slices along the given axis. This is a blocked variant of numpy.apply_along_axis() implemented via dask.array.map_blocks().
- Parameters
- func1dcallable
Function to apply to 1-D slices of the array along the given axis
- axisint
Axis along which func1d will be applied
- arrdask array
Dask array to which func1d will be applied.
- argsany
Additional arguments to func1d.
- dtypestr or dtype, optional
The dtype of the output of func1d.
- shapetuple, optional
The shape of the output of func1d.
- kwargsany
Additional keyword arguments for func1d.
- Returns
- outndarray (Ni…, Nj…, Nk…)
The output array. The shape of out is identical to the shape of arr, except along the axis dimension. This axis is removed, and replaced with new dimensions equal to the shape of the return value of func1d. So if func1d returns a scalar out will have one fewer dimensions than arr.
See also
apply_over_axes
Apply a function repeatedly over multiple axes.
Notes
If either of dtype or shape are not provided, Dask attempts to determine them by calling func1d on a dummy array. This may produce incorrect values for dtype or shape, so we recommend providing them.
Execute func1d(a, *args) where func1d operates on 1-D arrays and a is a 1-D slice of arr along axis.
This is equivalent to (but faster than) the following use of ndindex and s_, which sets each of ii, jj, and kk to a tuple of indices:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        f = func1d(arr[ii + s_[:,] + kk])
        Nj = f.shape
        for jj in ndindex(Nj):
            out[ii + jj + kk] = f[jj]
Equivalently, eliminating the inner loop, this can be expressed as:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        out[ii + s_[...,] + kk] = func1d(arr[ii + s_[:,] + kk])
Examples
>>> def my_func(a):
...     """Average first and last element of a 1-D array"""
...     return (a[0] + a[-1]) * 0.5
>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]])
>>> np.apply_along_axis(my_func, 0, b)
array([4., 5., 6.])
>>> np.apply_along_axis(my_func, 1, b)
array([2., 5., 8.])
For a function that returns a 1D array, the number of dimensions in outarr is the same as arr.
>>> b = np.array([[8,1,7], [4,3,9], [5,2,6]])
>>> np.apply_along_axis(sorted, 1, b)
array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])
For a function that returns a higher dimensional array, those dimensions are inserted in place of the axis dimension.
>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]])
>>> np.apply_along_axis(np.diag, -1, b)
array([[[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]],
       [[4, 0, 0],
        [0, 5, 0],
        [0, 0, 6]],
       [[7, 0, 0],
        [0, 8, 0],
        [0, 0, 9]]])
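The Notes recommend passing dtype and shape explicitly rather than letting Dask infer them from a dummy array. A minimal sketch of that usage on a chunked array is shown here; the helper name `first_last_mean` and the chunk sizes are assumptions made for illustration.

```python
import numpy as np
import dask.array as da


def first_last_mean(a):
    """Average the first and last element of a 1-D array (returns a scalar)."""
    return (a[0] + a[-1]) * 0.5


b = da.from_array(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), chunks=(2, 2))

# func1d returns a scalar, so its output shape is the empty tuple.
lazy = da.apply_along_axis(first_last_mean, 0, b, dtype="float64", shape=())
out = lazy.compute()
print(out)
```

Because the function returns a scalar, the result has one fewer dimension than `b`, matching the Returns description above.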
-
dask.array.
apply_over_axes
(func, a, axes)¶ Apply a function repeatedly over multiple axes.
This docstring was copied from numpy.apply_over_axes.
Some inconsistencies with the Dask version may exist.
func is called as res = func(a, axis), where axis is the first element of axes. The result res of the function call must have either the same dimensions as a or one less dimension. If res has one less dimension than a, a dimension is inserted before axis. The call to func is then repeated for each axis in axes, with res as the first argument.
- Parameters
- funcfunction
This function must take two arguments, func(a, axis).
- aarray_like
Input array.
- axesarray_like
Axes over which func is applied; the elements must be integers.
- Returns
- apply_over_axisndarray
The output array. The number of dimensions is the same as a, but the shape can be different. This depends on whether func changes the shape of its output with respect to its input.
See also
apply_along_axis
Apply a function to 1-D slices of an array along the given axis.
Notes
This function is equivalent to tuple axis arguments to reorderable ufuncs with keepdims=True. Tuple axis arguments to ufuncs have been available since version 1.7.0.
Examples
>>> a = np.arange(24).reshape(2,3,4)
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
Sum over axes 0 and 2. The result has same number of dimensions as the original array:
>>> np.apply_over_axes(np.sum, a, [0,2])
array([[[ 60],
        [ 92],
        [124]]])
Tuple axis arguments to ufuncs are equivalent:
>>> np.sum(a, axis=(0,2), keepdims=True)
array([[[ 60],
        [ 92],
        [124]]])
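The tuple-axis equivalence mentioned in the Notes also holds for Dask reductions. The following sketch, with arbitrarily chosen chunk sizes, compares a chunked `da.sum` against NumPy's `apply_over_axes` on the same data.

```python
import numpy as np
import dask.array as da

a_np = np.arange(24).reshape(2, 3, 4)
a = da.from_array(a_np, chunks=(1, 3, 2))  # chunking is an illustrative choice

# Reduce over axes 0 and 2 in one call, keeping them as size-one dimensions.
reduced = da.sum(a, axis=(0, 2), keepdims=True)
result = reduced.compute()

# Reference value computed with NumPy's apply_over_axes.
expected = np.apply_over_axes(np.sum, a_np, [0, 2])
print(result.shape, np.array_equal(result, expected))
```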
-
dask.array.
arange
(*args, **kwargs)¶ Return evenly spaced values from start to stop with step size step.
The values are half-open [start, stop), so including start and excluding stop. This is basically the same as python’s range function but for dask arrays.
When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.
- Parameters
- startint, optional
The starting value of the sequence. The default is 0.
- stopint
The end of the interval, this value is excluded from the interval.
- stepint, optional
The spacing between the values. The default is 1 when not specified.
- chunksint
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
- dtypenumpy.dtype
Output dtype. Omit to infer it from start, stop, step.
- Returns
- samplesdask array
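As a brief illustration of the chunks parameter and of the advice to prefer linspace for non-integer steps, consider the sketch below. The specific start, stop, step, and chunk values are arbitrary.

```python
import numpy as np
import dask.array as da

# Five values (0, 2, 4, 6, 8) in blocks of three: the last block is shorter.
x = da.arange(0, 10, 2, chunks=3)
values = x.compute()
print(x.chunks, values)

# For a fractional spacing, linspace fixes the number of points instead of the step.
grid = da.linspace(0, 1, 11, chunks=4)
print(grid.shape)
```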
See also
-
dask.array.
arccos
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arccos.
Some inconsistencies with the Dask version may exist.
Trigonometric inverse cosine, element-wise.
The inverse of cos so that, if y = cos(x), then x = arccos(y).
- Parameters
- xarray_like
x-coordinate on the unit circle. For real arguments, the domain is [-1, 1].
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- anglendarray
The angle of the ray intersecting the unit circle at the given x-coordinate in radians [0, pi]. This is a scalar if x is a scalar.
Notes
arccos is a multivalued function: for each x there are infinitely many numbers z such that cos(z) = x. The convention is to return the angle z whose real part lies in [0, pi].
For real-valued input data types, arccos always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arccos is a complex analytic function that has branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse cos is also known as acos or cos^-1.
References
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/
Examples
We expect the arccos of 1 to be 0, and of -1 to be pi:
>>> np.arccos([1, -1])
array([ 0.        ,  3.14159265])
Plot arccos:
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-1, 1, num=100)
>>> plt.plot(x, np.arccos(x))
>>> plt.axis('tight')
>>> plt.show()
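The Notes distinguish real-valued from complex-valued input. A small NumPy sketch of that difference follows; the input value 2.0 is chosen only because it lies outside the real domain [-1, 1].

```python
import numpy as np

# Real input outside [-1, 1] yields nan; silence the resulting RuntimeWarning.
with np.errstate(invalid="ignore"):
    real_result = np.arccos(np.array([1.0, -1.0, 2.0]))

# The same value given as a complex number has a well-defined complex arccos.
complex_result = np.arccos(np.array([2.0 + 0.0j]))

print(real_result)
print(np.cos(complex_result))  # applying cos recovers the input, up to rounding
```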
-
dask.array.
arccosh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arccosh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic cosine, element-wise.
- Parameters
- xarray_like
Input array.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- arccoshndarray
Array of the same shape as x. This is a scalar if x is a scalar.
Notes
arccosh is a multivalued function: for each x there are infinitely many numbers z such that cosh(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi] and the real part in [0, inf].
For real-valued input data types, arccosh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arccosh is a complex analytical function that has a branch cut [-inf, 1] and is continuous from above on it.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arccosh
Examples
>>> np.arccosh([np.e, 10.0])
array([ 1.65745445,  2.99322285])
>>> np.arccosh(1)
0.0
-
dask.array.
arcsin
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arcsin.
Some inconsistencies with the Dask version may exist.
Inverse sine, element-wise.
- Parameters
- xarray_like
y-coordinate on the unit circle.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- anglendarray
The inverse sine of each element in x, in radians and in the closed interval [-pi/2, pi/2]. This is a scalar if x is a scalar.
Notes
arcsin is a multivalued function: for each x there are infinitely many numbers z such that sin(z)=x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].
For real-valued input data types, arcsin always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arcsin is a complex analytic function that has, by convention, the branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse sine is also known as asin or sin^{-1}.
References
Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79ff. http://www.math.sfu.ca/~cbm/aands/
Examples
>>> np.arcsin(1) # pi/2
1.5707963267948966
>>> np.arcsin(-1) # -pi/2
-1.5707963267948966
>>> np.arcsin(0)
0.0
-
dask.array.
arcsinh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arcsinh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic sine element-wise.
- Parameters
- xarray_like
Input array.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Array of the same shape as x. This is a scalar if x is a scalar.
Notes
arcsinh is a multivalued function: for each x there are infinitely many numbers z such that sinh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].
For real-valued input data types, arcsinh always returns real output. For each value that cannot be expressed as a real number or infinity, it returns nan and sets the invalid floating point error flag.
For complex-valued input, arcsinh is a complex analytical function that has branch cuts [1j, infj] and [-1j, -infj] and is continuous from the right on the former and from the left on the latter.
The inverse hyperbolic sine is also known as asinh or sinh^-1.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arcsinh
Examples
>>> np.arcsinh(np.array([np.e, 10.0]))
array([ 1.72538256,  2.99822295])
-
dask.array.
arctan
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctan.
Some inconsistencies with the Dask version may exist.
Trigonometric inverse tangent, element-wise.
The inverse of tan, so that if y = tan(x) then x = arctan(y).
- Parameters
- xarray_like
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Out has the same shape as x. Its real part is in [-pi/2, pi/2] (arctan(+/-inf) returns +/-pi/2). This is a scalar if x is a scalar.
See also
Notes
arctan is a multi-valued function: for each x there are infinitely many numbers z such that tan(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].
For real-valued input data types, arctan always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arctan is a complex analytic function that has [1j, infj] and [-1j, -infj] as branch cuts, and is continuous from the left on the former and from the right on the latter.
The inverse tangent is also known as atan or tan^{-1}.
References
Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/
Examples
We expect the arctan of 0 to be 0, and of 1 to be pi/4:
>>> np.arctan([0, 1])
array([ 0.        ,  0.78539816])
>>> np.pi/4
0.78539816339744828
Plot arctan:
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-10, 10)
>>> plt.plot(x, np.arctan(x))
>>> plt.axis('tight')
>>> plt.show()
-
dask.array.
arctan2
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctan2.
Some inconsistencies with the Dask version may exist.
Element-wise arc tangent of x1/x2 choosing the quadrant correctly.
The quadrant (i.e., branch) is chosen so that arctan2(x1, x2) is the signed angle in radians between the ray ending at the origin and passing through the point (1,0), and the ray ending at the origin and passing through the point (x2, x1). (Note the role reversal: the “y-coordinate” is the first function parameter, the “x-coordinate” is the second.) By IEEE convention, this function is defined for x2 = +/-0 and for either or both of x1 and x2 = +/-inf (see Notes for specific values).
This function is not defined for complex-valued arguments; for the so-called argument of complex values, use angle.
- Parameters
- x1array_like, real-valued
y-coordinates.
- x2array_like, real-valued
x-coordinates. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- anglendarray
Array of angles in radians, in the range [-pi, pi]. This is a scalar if both x1 and x2 are scalars.
Notes
arctan2 is identical to the atan2 function of the underlying C library. The following special values are defined in the C standard: [1]
======== ======= ===============
x1       x2      arctan2(x1,x2)
======== ======= ===============
+/- 0    +0      +/- 0
+/- 0    -0      +/- pi
> 0      +/-inf  +0 / +pi
< 0      +/-inf  -0 / -pi
+/-inf   +inf    +/- (pi/4)
+/-inf   -inf    +/- (3*pi/4)
======== ======= ===============
Note that +0 and -0 are distinct floating point numbers, as are +inf and -inf.
References
- 1
ISO/IEC standard 9899:1999, “Programming language C.”
Examples
Consider four points in different quadrants:
>>> x = np.array([-1, +1, +1, -1])
>>> y = np.array([-1, -1, +1, +1])
>>> np.arctan2(y, x) * 180 / np.pi
array([-135.,  -45.,   45.,  135.])
Note the order of the parameters. arctan2 is defined also when x2 = 0 and at several other special points, obtaining values in the range [-pi, pi]:
>>> np.arctan2([1., -1.], [0., 0.])
array([ 1.57079633, -1.57079633])
>>> np.arctan2([0., 0., np.inf], [+0., -0., np.inf])
array([ 0.        ,  3.14159265,  0.78539816])
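To make "choosing the quadrant correctly" concrete, the sketch below contrasts arctan2 with a plain arctan of the quotient for the same four points used in the example above. It is an illustration with NumPy only.

```python
import numpy as np

x = np.array([-1.0, 1.0, 1.0, -1.0])
y = np.array([-1.0, -1.0, 1.0, 1.0])

# arctan only sees the ratio y/x, so opposite quadrants collapse together.
naive = np.degrees(np.arctan(y / x))

# arctan2 receives y and x separately and can recover the full angle.
signed = np.degrees(np.arctan2(y, x))

print(naive)
print(signed)
```

The quotient y/x is identical for the points (-1, -1) and (1, 1), which is why only the two-argument form can tell them apart.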
-
dask.array.
arctanh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctanh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic tangent element-wise.
- Parameters
- xarray_like
Input array.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Array of the same shape as x. This is a scalar if x is a scalar.
See also
emath.arctanh
Notes
arctanh is a multivalued function: for each x there are infinitely many numbers z such that tanh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].
For real-valued input data types, arctanh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arctanh is a complex analytical function that has branch cuts [-1, -inf] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse hyperbolic tangent is also known as atanh or tanh^-1.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arctanh
Examples
>>> np.arctanh([0, -0.5])
array([ 0.        , -0.54930614])
-
dask.array.
argmax
(x, axis=None, split_every=None, out=None)¶ Return the maximum of an array or maximum along an axis.
This docstring was copied from numpy.amax.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like (Not supported in Dask)
Input data.
- axisNone or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.
- outndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.
- keepdimsbool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amax method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- initialscalar, optional (Not supported in Dask)
The minimum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- wherearray_like of bool, optional (Not supported in Dask)
Elements to compare for the maximum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- amaxndarray or scalar
Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amin
The minimum value of an array along a given axis, propagating any NaNs.
nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
argmax
Return the indices of the maximum values.
nanmin, minimum, fmin
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.
Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2))
>>> a
array([[0, 1],
       [2, 3]])
>>> np.amax(a) # Maximum of the flattened array
3
>>> np.amax(a, axis=0) # Maxima along the first axis
array([2, 3])
>>> np.amax(a, axis=1) # Maxima along the second axis
array([1, 3])
>>> np.amax(a, where=[False, True], initial=-1, axis=0)
array([-1,  3])
>>> b = np.arange(5, dtype=float)
>>> b[2] = np.nan
>>> np.amax(b)
nan
>>> np.amax(b, where=~np.isnan(b), initial=-1)
4.0
>>> np.nanmax(b)
4.0
You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:
>>> np.max([[-50], [10]], axis=-1, initial=0)
array([ 0, 10])
Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.
>>> np.max([5], initial=6)
6
>>> max([5], default=6)
5
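The text for this entry is copied from numpy.amax, which returns maximum values, whereas argmax (listed under See also) returns the indices of the maximum values. The sketch below shows the index-returning behaviour on a small chunked array alongside `da.max` for comparison; the array and chunk sizes are illustrative assumptions.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[0, 1], [2, 3]]), chunks=(1, 2))

idx_cols = da.argmax(x, axis=0).compute()  # row index of the maximum in each column
idx_rows = da.argmax(x, axis=1).compute()  # column index of the maximum in each row
idx_flat = da.argmax(x).compute()          # index into the flattened array

max_cols = da.max(x, axis=0).compute()     # the maximum values themselves

print(idx_cols, idx_rows, idx_flat, max_cols)
```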
-
dask.array.
argmin
(x, axis=None, split_every=None, out=None)¶ Return the minimum of an array or minimum along an axis.
This docstring was copied from numpy.amin.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like (Not supported in Dask)
Input data.
- axisNone or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.
- outndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.
- keepdimsbool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amin method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- initialscalar, optional (Not supported in Dask)
The maximum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- wherearray_like of bool, optional (Not supported in Dask)
Elements to compare for the minimum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- aminndarray or scalar
Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amax
The maximum value of an array along a given axis, propagating any NaNs.
nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
argmin
Return the indices of the minimum values.
nanmax, maximum, fmax
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.
Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2))
>>> a
array([[0, 1],
       [2, 3]])
>>> np.amin(a) # Minimum of the flattened array
0
>>> np.amin(a, axis=0) # Minima along the first axis
array([0, 1])
>>> np.amin(a, axis=1) # Minima along the second axis
array([0, 2])
>>> np.amin(a, where=[False, True], initial=10, axis=0)
array([10,  1])
>>> b = np.arange(5, dtype=float)
>>> b[2] = np.nan
>>> np.amin(b)
nan
>>> np.amin(b, where=~np.isnan(b), initial=10)
0.0
>>> np.nanmin(b)
0.0
>>> np.min([[-50], [10]], axis=-1, initial=0)
array([-50,   0])
Notice that the initial value is used as one of the elements for which the minimum is determined, unlike the default argument of Python’s min function, which is only used for empty iterables.
>>> np.min([6], initial=5)
5
>>> min([6], default=5)
6
-
dask.array.
argtopk
(a, k, axis=-1, split_every=None)¶ Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the indices of the -k smallest elements instead, and return them sorted from smallest to largest.
This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.
- Parameters
- x: Array
Data being sorted
- k: int
- axis: int, optional
- split_every: int >=2, optional
See topk(). The performance considerations for topk also apply here.
- Returns
- Selection of np.intp indices of x with size abs(k) along the given axis.
Examples
>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.argtopk(2).compute()
array([3, 0])
>>> d.argtopk(-2).compute()
array([1, 2])
-
dask.array.
argwhere
(a)¶ Find the indices of array elements that are non-zero, grouped by element.
This docstring was copied from numpy.argwhere.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like
Input data.
- Returns
- index_array(N, a.ndim) ndarray
Indices of elements that are non-zero. Indices are grouped by element. This array will have shape (N, a.ndim) where N is the number of non-zero items.
Notes
np.argwhere(a) is almost the same as np.transpose(np.nonzero(a)), but produces a result of the correct shape for a 0D array.
The output of argwhere is not suitable for indexing arrays. For this purpose use nonzero(a) instead.
Examples
>>> x = np.arange(6).reshape(2,3)
>>> x
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argwhere(x>1)
array([[0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])
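The Notes state that argwhere output is grouped by element and is not meant for indexing, while nonzero is. A short NumPy sketch of the two layouts, using the same array as the example above:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
mask = x > 1

coords = np.argwhere(mask)            # one (row, col) pair per matching element
per_axis = np.nonzero(mask)           # one index array per dimension
selected = x[per_axis]                # the nonzero form can index x directly

# argwhere is the transpose of the nonzero result for arrays with ndim >= 1.
same_coords = np.transpose(per_axis)

print(coords.shape, selected)
```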
-
dask.array.
around
(x, decimals=0)¶ Evenly round to the given number of decimals.
This docstring was copied from numpy.around.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like (Not supported in Dask)
Input data.
- decimalsint, optional
Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
- outndarray, optional (Not supported in Dask)
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. See ufuncs-output-type for more details.
- Returns
- rounded_arrayndarray
An array of the same type as a, containing the rounded values. Unless out was specified, a new array is created. A reference to the result is returned.
The real and imaginary parts of complex numbers are rounded separately. The result of rounding a float is a float.
Notes
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.
np.around uses a fast but sometimes inexact algorithm to round floating-point datatypes. For positive decimals it is equivalent to np.true_divide(np.rint(a * 10**decimals), 10**decimals), which has error due to the inexact representation of decimal fractions in the IEEE floating point standard [1] and errors introduced when scaling by powers of ten. For instance, note the extra “1” in the following:
>>> np.round(56294995342131.5, 3)
56294995342131.51
If your goal is to print such values with a fixed number of decimals, it is preferable to use numpy’s float printing routines to limit the number of printed decimals:
>>> np.format_float_positional(56294995342131.5, precision=3)
'56294995342131.5'
The float printing routines use an accurate but much more computationally demanding algorithm to compute the number of digits after the decimal point.
Alternatively, Python’s builtin round function uses a more accurate but slower algorithm for 64-bit floating point values:
>>> round(56294995342131.5, 3)
56294995342131.5
>>> np.round(16.055, 2), round(16.055, 2) # equals 16.0549999999999997
(16.06, 16.05)
References
- 1
“Lecture Notes on the Status of IEEE 754”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
- 2
“How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf
Examples
>>> np.around([0.37, 1.64])
array([0., 2.])
>>> np.around([0.37, 1.64], decimals=1)
array([0.4, 1.6])
>>> np.around([.5, 1.5, 2.5, 3.5, 4.5])  # rounds to nearest even value
array([0., 2., 2., 4., 4.])
>>> np.around([1,2,3,11], decimals=1)  # ndarray of ints is returned
array([ 1,  2,  3, 11])
>>> np.around([1,2,3,11], decimals=-1)
array([ 0,  0,  0, 10])
- dask.array.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)
This docstring was copied from numpy.array.
Some inconsistencies with the Dask version may exist.
Create an array.
- Parameters
- objectarray_like
An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
- dtypedata-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.
- copybool, optional
If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).
- order{‘K’, ‘A’, ‘C’, ‘F’}, optional
Specify the memory layout of the array. If object is not an array, the newly created array will be in C order (row major) unless ‘F’ is specified, in which case it will be in Fortran order (column major). If object is an array the following holds.
order | no copy | copy=True
‘K’ | unchanged | F & C order preserved, otherwise most similar order
‘A’ | unchanged | F order if input is F and not C, otherwise C order
‘C’ | C order | C order
‘F’ | F order | F order
When copy=False and a copy is made for other reasons, the result is the same as if copy=True, with some exceptions for A, see the Notes section. The default order is ‘K’.
- subokbool, optional
If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).
- ndminint, optional
Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.
- Returns
- outndarray
An array object satisfying the specified requirements.
See also
empty_like
Return an empty array with shape and type of input.
ones_like
Return an array of ones with shape and type of input.
zeros_like
Return an array of zeros with shape and type of input.
full_like
Return a new array with shape of input filled with value.
empty
Return a new uninitialized array.
ones
Return a new array setting values to one.
zeros
Return a new array setting values to zero.
full
Return a new array of given shape filled with value.
Notes
When order is ‘A’ and object is an array in neither ‘C’ nor ‘F’ order, and a copy is forced by a change in dtype, then the order of the result is not necessarily ‘C’ as expected. This is likely a bug.
Examples
>>> np.array([1, 2, 3])
array([1, 2, 3])
Upcasting:
>>> np.array([1, 2, 3.0])
array([ 1.,  2.,  3.])
More than one dimension:
>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])
Minimum dimensions 2:
>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])
Type provided:
>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j,  2.+0.j,  3.+0.j])
Data-type consisting of more than one element:
>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])
>>> x['a']
array([1, 3])
Creating an array from sub-classes:
>>> np.array(np.mat('1 2; 3 4'))
array([[1, 2],
       [3, 4]])
>>> np.array(np.mat('1 2; 3 4'), subok=True)
matrix([[1, 2],
        [3, 4]])
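The order table in the Parameters section can be checked directly through the contiguity flags. This is a small sketch using plain NumPy arrays only; the variable names are illustrative, and whether the dask version honours order in the same way is not covered here.

```python
import numpy as np

c_arr = np.arange(6).reshape(2, 3)       # C (row-major) layout
f_arr = np.array(c_arr, order='F')       # copy in Fortran (column-major) layout

kept = np.array(f_arr, order='K')        # 'K' preserves the F layout on copy
any_order = np.array(f_arr, order='A')   # 'A' picks F: input is F and not C
forced_c = np.array(f_arr, order='C')    # 'C' always yields C layout

for name, arr in [('c_arr', c_arr), ('f_arr', f_arr), ('kept', kept),
                  ('any_order', any_order), ('forced_c', forced_c)]:
    print(name, arr.flags['C_CONTIGUOUS'], arr.flags['F_CONTIGUOUS'])
```

The values are identical in every case; only the memory layout reported by the flags changes.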
- dask.array.asanyarray(a)
Convert the input to a dask array.
Subclasses of np.ndarray will be passed through as chunks unchanged.
- Parameters
- aarray-like
Input data, in any form that can be converted to a dask array.
- Returns
- outdask array
Dask array interpretation of a.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asanyarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asanyarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
- dask.array.asarray(a, **kwargs)
Convert the input to a dask array.
- Parameters
- aarray-like
Input data, in any form that can be converted to a dask array.
- Returns
- outdask array
Dask array interpretation of a.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
- dask.array.atleast_1d(*arys)
Convert inputs to arrays with at least one dimension.
This docstring was copied from numpy.atleast_1d.
Some inconsistencies with the Dask version may exist.
Scalar inputs are converted to 1-dimensional arrays, whilst higher-dimensional inputs are preserved.
- Parameters
- arys1, arys2, …array_like
One or more input arrays.
- Returns
- retndarray
An array, or list of arrays, each with a.ndim >= 1. Copies are made only if necessary.
Examples
>>> np.atleast_1d(1.0)
array([1.])
>>> x = np.arange(9.0).reshape(3,3)
>>> np.atleast_1d(x)
array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])
>>> np.atleast_1d(x) is x
True
>>> np.atleast_1d(1, [3, 4])
[array([1]), array([3, 4])]
- dask.array.atleast_2d(*arys)
View inputs as arrays with at least two dimensions.
This docstring was copied from numpy.atleast_2d.
Some inconsistencies with the Dask version may exist.
- Parameters
- arys1, arys2, …array_like
One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have two or more dimensions are preserved.
- Returns
- res, res2, …ndarray
An array, or list of arrays, each with a.ndim >= 2. Copies are avoided where possible, and views with two or more dimensions are returned.
Examples
>>> np.atleast_2d(3.0)
array([[3.]])
>>> x = np.arange(3.0)
>>> np.atleast_2d(x)
array([[0., 1., 2.]])
>>> np.atleast_2d(x).base is x
True
>>> np.atleast_2d(1, [1, 2], [[1, 2]])
[array([[1]]), array([[1, 2]]), array([[1, 2]])]
- dask.array.atleast_3d(*arys)
View inputs as arrays with at least three dimensions.
This docstring was copied from numpy.atleast_3d.
Some inconsistencies with the Dask version may exist.
- Parameters
- arys1, arys2, …array_like
One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have three or more dimensions are preserved.
- Returns
- res1, res2, …ndarray
An array, or list of arrays, each with a.ndim >= 3. Copies are avoided where possible, and views with three or more dimensions are returned. For example, a 1-D array of shape (N,) becomes a view of shape (1, N, 1), and a 2-D array of shape (M, N) becomes a view of shape (M, N, 1).
Examples
>>> np.atleast_3d(3.0)
array([[[3.]]])
>>> x = np.arange(3.0)
>>> np.atleast_3d(x).shape
(1, 3, 1)
>>> x = np.arange(12.0).reshape(4,3)
>>> np.atleast_3d(x).shape
(4, 3, 1)
>>> np.atleast_3d(x).base is x.base  # x is a reshape, so not base itself
True
>>> for arr in np.atleast_3d([1, 2], [[1, 2]], [[[1, 2]]]):
...     print(arr, arr.shape)
...
[[[1]
  [2]]] (1, 2, 1)
[[[1]
  [2]]] (1, 2, 1)
[[[1 2]]] (1, 1, 2)
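The three atleast_* functions differ only in where the extra axes of size 1 are placed. As a quick comparison, the sketch below (plain NumPy, with illustrative labels) collects the resulting shapes for a scalar, a 1-D input and a 2-D input.

```python
import numpy as np

inputs = {
    'scalar': np.float64(3.0),
    '1-D (N=4)': np.arange(4.0),
    '2-D (M=2, N=3)': np.zeros((2, 3)),
}

shapes = {}
for label, value in inputs.items():
    # Record the shape produced by each of the three functions.
    shapes[label] = (
        np.atleast_1d(value).shape,
        np.atleast_2d(value).shape,
        np.atleast_3d(value).shape,
    )
    print(label, shapes[label])
```

Note that atleast_2d prepends the new axis, while atleast_3d places new axes on both sides of a 1-D input and appends one to a 2-D input.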
- dask.array.average(a, axis=None, weights=None, returned=False)
Compute the weighted average along the specified axis.
This docstring was copied from numpy.average.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like
Array containing data to be averaged. If a is not an array, a conversion is attempted.
- axisNone or int or tuple of ints, optional
Axis or axes along which to average a. The default, axis=None, will average over all of the elements of the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, averaging is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
- weightsarray_like, optional
An array of weights associated with the values in a. Each value in a contributes to the average according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one. The 1-D calculation is:
avg = sum(a * weights) / sum(weights)
The only constraint on weights is that sum(weights) must not be 0.
- returnedbool, optional
Default is False. If True, the tuple (average, sum_of_weights) is returned, otherwise only the average is returned. If weights=None, sum_of_weights is equivalent to the number of elements over which the average is taken.
- Returns
- retval, [sum_of_weights]array_type or double
Return the average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. sum_of_weights is of the same type as retval. The result dtype follows a general pattern. If weights is None, the result dtype will be that of a, or float64 if a is integral. Otherwise, if weights is not None and a is non-integral, the result type will be the type of lowest precision capable of representing values of both a and weights. If a happens to be integral, the previous rules still apply but the result dtype will be at least float64.
- Raises
- ZeroDivisionError
When all weights along axis are zero. See numpy.ma.average for a version robust to this type of error.
- TypeError
When the length of 1D weights is not the same as the shape of a along axis.
See also
mean
ma.average
average for masked arrays – useful if your data contains “missing” values
numpy.result_type
Returns the type that results from applying the numpy type promotion rules to the arguments.
Examples
>>> data = np.arange(1, 5)
>>> data
array([1, 2, 3, 4])
>>> np.average(data)
2.5
>>> np.average(np.arange(1, 11), weights=np.arange(10, 0, -1))
4.0
>>> data = np.arange(6).reshape((3,2))
>>> data
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.average(data, axis=1, weights=[1./4, 3./4])
array([0.75, 2.75, 4.75])
>>> np.average(data, weights=[1./4, 3./4])
Traceback (most recent call last):
    ...
TypeError: Axis must be specified when shapes of a and weights differ.
>>> a = np.ones(5, dtype=np.float128)
>>> w = np.ones(5, dtype=np.complex64)
>>> avg = np.average(a, weights=w)
>>> print(avg.dtype)
complex256
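The 1-D formula avg = sum(a * weights) / sum(weights) and the returned=True form can be compared side by side. This is a short sketch in plain NumPy; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(1, 11)            # 1, 2, ..., 10
w = np.arange(10, 0, -1)        # 10, 9, ..., 1

# The formula from the weights parameter description, written out.
manual = (a * w).sum() / w.sum()

# returned=True also hands back the sum of the weights.
avg, weight_sum = np.average(a, weights=w, returned=True)

# 1-D weights applied along one axis of a 2-D array.
data = np.arange(6).reshape(3, 2)
row_avg = np.average(data, axis=1, weights=[0.25, 0.75])

print(manual, avg, weight_sum, row_avg)
```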
- dask.array.bincount(x, weights=None, minlength=0)
This docstring was copied from numpy.bincount.
Some inconsistencies with the Dask version may exist.
Count number of occurrences of each value in array of non-negative ints.
The number of bins (of size 1) is one larger than the largest value in x. If minlength is specified, there will be at least this number of bins in the output array (though it will be longer if necessary, depending on the contents of x). Each bin gives the number of occurrences of its index value in x. If weights is specified the input array is weighted by it, i.e. if a value n is found at position i, out[n] += weight[i] instead of out[n] += 1.
- Parameters
- xarray_like, 1 dimension, nonnegative ints
Input array.
- weightsarray_like, optional
Weights, array of the same shape as x.
- minlengthint, optional
A minimum number of bins for the output array.
New in version 1.6.0.
- Returns
- outndarray of ints
The result of binning the input array. The length of out is equal to np.amax(x)+1.
- Raises
- ValueError
If the input is not 1-dimensional, or contains elements with negative values, or if minlength is negative.
- TypeError
If the type of the input is float or complex.
Examples
>>> np.bincount(np.arange(5))
array([1, 1, 1, 1, 1])
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))
array([1, 3, 1, 1, 0, 0, 0, 1])
>>> x = np.array([0, 1, 1, 3, 2, 1, 7, 23])
>>> np.bincount(x).size == np.amax(x)+1
True
The input array needs to be of integer dtype, otherwise a TypeError is raised:
>>> np.bincount(np.arange(5, dtype=float))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: array cannot be safely cast to required type
A possible use of bincount is to perform sums over variable-size chunks of an array, using the weights keyword.
>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6])  # weights
>>> x = np.array([0, 1, 1, 2, 2, 2])
>>> np.bincount(x, weights=w)
array([ 0.3,  0.7,  1.1])
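The accumulation rule (out[n] += weight[i] instead of out[n] += 1) is easy to spell out as a loop. The function below is a hypothetical pure-Python reference written for illustration, not the NumPy or dask implementation; it is compared against np.bincount on the weighted example above.

```python
import numpy as np

def bincount_reference(x, weights=None, minlength=0):
    """Hypothetical reference loop, for illustration only."""
    n_bins = max(max(x) + 1 if len(x) else 0, minlength)
    out = [0.0 if weights is not None else 0] * n_bins
    for i, n in enumerate(x):
        # Each value n adds either 1 or its weight to bin n.
        out[n] += weights[i] if weights is not None else 1
    return out

x = [0, 1, 1, 2, 2, 2]
w = [0.3, 0.5, 0.2, 0.7, 1.0, -0.6]

counts = bincount_reference(x, minlength=5)   # padded with empty bins
sums = bincount_reference(x, weights=w)       # per-value sums of w

print(counts, sums)
print(np.bincount(x, minlength=5), np.bincount(x, weights=w))
```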
- dask.array.bitwise_and(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_and.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise AND of two arrays element-wise.
Computes the bit-wise AND of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator &.
- Parameters
- x1, x2array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_and
bitwise_or
bitwise_xor
binary_repr
Return the binary representation of the input number as a string.
Examples
The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise AND of 13 and 17 is therefore 00000001, or 1:
>>> np.bitwise_and(13, 17)
1
>>> np.bitwise_and(14, 13)
12
>>> np.binary_repr(12)
'1100'
>>> np.bitwise_and([14,3], 13)
array([12,  1])
>>> np.bitwise_and([11,7], [4,25])
array([0, 1])
>>> np.bitwise_and(np.array([2,5,255]), np.array([3,14,16]))
array([ 2,  4, 16])
>>> np.bitwise_and([True, True], [False, True])
array([False,  True])
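A common use of bit-wise AND is masking, that is, keeping only selected bits of each value. The sketch below uses plain NumPy; the mask and variable names are illustrative choices rather than part of the API.

```python
import numpy as np

values = np.array([13, 17, 255, 6], dtype=np.uint8)
low_nibble_mask = np.uint8(0b00001111)

low_bits = np.bitwise_and(values, low_nibble_mask)   # keep the lowest 4 bits
is_odd = np.bitwise_and(values, 1).astype(bool)      # test the lowest bit

for v, lo in zip(values, low_bits):
    print(np.binary_repr(v, width=8), '->', np.binary_repr(lo, width=8))
print(is_odd)
```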
- dask.array.bitwise_not(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.invert.
Some inconsistencies with the Dask version may exist.
Compute bit-wise inversion, or bit-wise NOT, element-wise.
Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.
For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [1]. An N-bit two’s-complement system can represent every integer in the range −2^(N−1) to +2^(N−1)−1.
- Parameters
- xarray_like
Only integer and boolean types are handled.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Result. This is a scalar if x is a scalar.
See also
bitwise_and, bitwise_or, bitwise_xor
logical_not
binary_repr
Return the binary representation of the input number as a string.
Notes
bitwise_not is an alias for invert:
>>> np.bitwise_not is np.invert True
References
- 1
Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two’s_complement
Examples
We’ve seen that 13 is represented by 00001101. The invert or bit-wise NOT of 13 is then:
>>> x = np.invert(np.array(13, dtype=np.uint8))
>>> x
242
>>> np.binary_repr(x, width=8)
'11110010'
The result depends on the bit-width:
>>> x = np.invert(np.array(13, dtype=np.uint16))
>>> x
65522
>>> np.binary_repr(x, width=16)
'1111111111110010'
When using signed integer types the result is the two’s complement of the result for the unsigned type:
>>> np.invert(np.array([13], dtype=np.int8))
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8)
'11110010'
Booleans are accepted as well:
>>> np.invert(np.array([True, False]))
array([False,  True])
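The examples above follow two arithmetic identities: for signed two's-complement integers, ~x equals -x - 1, and for N-bit unsigned integers, ~x equals 2**N - 1 - x. A brief sketch in plain NumPy, with illustrative variable names, checks both.

```python
import numpy as np

signed = np.array([13, 0, -1, 127], dtype=np.int8)
unsigned = np.array([13, 0, 255], dtype=np.uint8)

inv_signed = np.invert(signed)       # two's complement: ~x == -x - 1
inv_unsigned = np.invert(unsigned)   # 8-bit unsigned:   ~x == 255 - x

print(inv_signed, -signed - 1)
print(inv_unsigned, 255 - unsigned)
```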
- dask.array.bitwise_or(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_or.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise OR of two arrays element-wise.
Computes the bit-wise OR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator |.
- Parameters
- x1, x2array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_or
bitwise_and
bitwise_xor
binary_repr
Return the binary representation of the input number as a string.
Examples
The number 13 has the binary representation 00001101. Likewise, 16 is represented by 00010000. The bit-wise OR of 13 and 16 is then 00011101, or 29:
>>> np.bitwise_or(13, 16)
29
>>> np.binary_repr(29)
'11101'
>>> np.bitwise_or(32, 2)
34
>>> np.bitwise_or([33, 4], 1)
array([33,  5])
>>> np.bitwise_or([33, 4], [1, 2])
array([33,  6])
>>> np.bitwise_or(np.array([2, 5, 255]), np.array([4, 4, 4]))
array([  6,   5, 255])
>>> np.array([2, 5, 255]) | np.array([4, 4, 4])
array([  6,   5, 255])
>>> np.bitwise_or(np.array([2, 5, 255, 2147483647], dtype=np.int32),
...               np.array([4, 4, 4, 2147483647], dtype=np.int32))
array([         6,          5,        255, 2147483647])
>>> np.bitwise_or([True, True], [False, True])
array([ True,  True])
- dask.array.bitwise_xor(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_xor.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise XOR of two arrays element-wise.
Computes the bit-wise XOR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ^.
- Parameters
- x1, x2array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_xor
bitwise_and
bitwise_or
binary_repr
Return the binary representation of the input number as a string.
Examples
The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise XOR of 13 and 17 is therefore 00011100, or 28:
>>> np.bitwise_xor(13, 17)
28
>>> np.binary_repr(28)
'11100'
>>> np.bitwise_xor(31, 5)
26
>>> np.bitwise_xor([31,3], 5)
array([26,  6])
>>> np.bitwise_xor([31,3], [5,6])
array([26,  5])
>>> np.bitwise_xor([True, True], [False, True])
array([ True, False])
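XOR is its own inverse: applying the same mask twice restores the input, and the 1-bits of x ^ y mark the positions where x and y differ. The following sketch uses plain NumPy, with an arbitrary illustrative key.

```python
import numpy as np

data = np.array([13, 17, 31, 200], dtype=np.uint8)
key = np.uint8(0b01010101)

toggled = np.bitwise_xor(data, key)       # flip the bits selected by key
restored = np.bitwise_xor(toggled, key)   # flipping again undoes the change

differing_bits = np.bitwise_xor(13, 17)   # 1-bits mark positions that differ
print(toggled, restored, np.binary_repr(differing_bits, width=8))
```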
- dask.array.block(arrays, allow_unknown_chunksizes=False)
Assemble an nd-array from nested lists of blocks.
Blocks in the innermost lists are concatenated along the last dimension (-1), then these are concatenated along the second-last dimension (-2), and so on until the outermost list is reached.
Blocks can be of any dimension, but will not be broadcasted using the normal rules. Instead, leading axes of size 1 are inserted, to make block.ndim the same for all blocks. This is primarily useful for working with scalars, and means that code like block([v, 1]) is valid, where v.ndim == 1.
When the nested list is two levels deep, this allows block matrices to be constructed from their components.
- Parameters
- arraysnested list of array_like or scalars (but not tuples)
If passed a single ndarray or scalar (a nested list of depth 0), this is returned unmodified (and not copied).
Elements shapes must match along the appropriate axes (without broadcasting), but leading 1s will be prepended to the shape as necessary to make the dimensions match.
- allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
- Returns
- block_arrayndarray
The array assembled from the given blocks.
The dimensionality of the output is equal to the greatest of:
* the dimensionality of all the inputs
* the depth to which the input list is nested
- Raises
- ValueError
If list depths are mismatched - for instance, [[a, b], c] is illegal, and should be spelt [[a, b], [c]]
If lists are empty - for instance, [[a, b], []]
See also
concatenate
Join a sequence of arrays together.
stack
Stack arrays in sequence along a new dimension.
hstack
Stack arrays in sequence horizontally (column wise).
vstack
Stack arrays in sequence vertically (row wise).
dstack
Stack arrays in sequence depth wise (along third dimension).
vsplit
Split array into a list of multiple sub-arrays vertically.
Notes
When called with only scalars, block is equivalent to an ndarray call. So block([[1, 2], [3, 4]]) is equivalent to array([[1, 2], [3, 4]]).
This function does not enforce that the blocks lie on a fixed grid. block([[a, b], [c, d]]) is not restricted to arrays of the form:
AAAbb
AAAbb
cccDD
But is also allowed to produce, for some a, b, c, d:
AAAbb
AAAbb
cDDDD
Since concatenation happens along the last axis first, block is _not_ capable of producing the following directly:
AAAbb
cccbb
cccDD
Matlab’s “square bracket stacking”, [A, B, ...; p, q, ...], is equivalent to block([[A, B, ...], [p, q, ...]]).
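The layouts described in the Notes can be built with small constant-filled blocks. This sketch uses numpy.block rather than the dask version, on the assumption that the two assemble blocks the same way for in-memory inputs; all names are illustrative.

```python
import numpy as np

A = np.full((2, 3), 1)
b = np.full((2, 2), 2)
c = np.full((1, 3), 3)
D = np.full((1, 2), 4)

# The AAAbb / AAAbb / cccDD layout.
grid = np.block([[A, b], [c, D]])

# Rows only need matching total width, so a ragged split is also accepted:
# AAAbb / AAAbb / cDDDD.
c_narrow = np.full((1, 1), 3)
D_wide = np.full((1, 4), 4)
ragged = np.block([[A, b], [c_narrow, D_wide]])

# A scalar gains a leading axis of size 1, so it can sit next to a 1-D array.
v = np.array([1, 2, 3])
appended = np.block([v, 1])

print(grid)
print(ragged)
print(appended)
```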
- dask.array.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)
Tensor operation: Generalized inner and outer products
A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.
- Parameters
- funccallable
Function to apply to individual tuples of blocks
- out_inditerable
Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
- *argssequence of Array, index pairs
Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)
- **kwargsdict
Extra keyword arguments to pass to function
- dtypenp.dtype
Datatype of resulting array.
- concatenatebool, keyword only
If true concatenate arrays along dummy indices, else provide lists
- adjust_chunksdict
Dictionary mapping index to function to be applied to chunk sizes
- new_axesdict, keyword only
New indexes and their dimension lengths
Examples
2D embarrassingly parallel operation from two arrays, x, and y.
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8') # z = x + y
Outer product multiplying x by y, two 1-d vectors
>>> z = blockwise(operator.mul, 'ij', x, 'i', y, 'j', dtype='f8')
z = x.T
>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)
The transpose case above is illustrative because it does the same transposition both on each in-memory block by calling np.transpose and on the order of the blocks themselves, by switching the order of the index ij -> ji.
We can compose these same patterns with more variables and more complex in-memory functions
z = X + Y.T
>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')
Any index, like i, missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction). In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead, pass concatenate=True.
Inner product multiplying x by y, two 1-d vectors
>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8')
Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.
>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype)
New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype)
If the applied function changes the size of each chunk you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.
>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)
Include literals by indexing with None
>>> y = blockwise(add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)
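To make the index notation concrete, the sketch below imitates two of the patterns above on plain lists of NumPy blocks, without dask. It is an illustration of the block-level bookkeeping only: `outer_blockwise_sketch` is a hypothetical stand-in for blockwise(func, 'ij', x, 'i', y, 'j'), the per-block function is assumed to be np.outer, and no task graph is built.

```python
import numpy as np

def outer_blockwise_sketch(func, x_blocks, y_blocks):
    """Hypothetical stand-in for blockwise(func, 'ij', x, 'i', y, 'j')."""
    # One output block per (i, j) pair of input blocks.
    grid = [[func(xb, yb) for yb in y_blocks] for xb in x_blocks]
    return np.block(grid)

def sequence_dot(x_blocks, y_blocks):
    # 'i' is missing from the output index, so the function receives
    # every block along 'i' and must reduce over them itself.
    result = 0
    for xb, yb in zip(x_blocks, y_blocks):
        result += xb.dot(yb)
    return result

x = np.arange(6.0)
y = np.arange(6.0) + 1
x_blocks = np.split(x, 3)     # three blocks of length 2
y_blocks = np.split(y, 2)     # two blocks of length 3

outer = outer_blockwise_sketch(np.outer, x_blocks, y_blocks)
inner = sequence_dot(x_blocks, np.split(y, 3))

print(outer.shape, inner)
```

The outer case keeps both indices, so the output has one block per pair of input blocks; the inner case drops i, so all blocks along i are handed to a single call.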
- dask.array.broadcast_arrays(*args, **kwargs)
Broadcast any number of arrays against each other.
This docstring was copied from numpy.broadcast_arrays.
Some inconsistencies with the Dask version may exist.
- Parameters
- `*args`array_likes
The arrays to broadcast.
- subokbool, optional
If True, then sub-classes will be passed-through, otherwise the returned arrays will be forced to be a base-class array (default).
- Returns
- broadcastedlist of arrays
These arrays are views on the original arrays. They are typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. If you need to write to the arrays, make copies first. While you can set the writable flag True, writing to a single output value may end up changing more than one location in the output array.
Deprecated since version 1.17: The output is currently marked so that if written to, a deprecation warning will be emitted. A future version will set the writable flag False so writing to it will raise an error.
Examples
>>> x = np.array([[1,2,3]])
>>> y = np.array([[4],[5]])
>>> np.broadcast_arrays(x, y)
[array([[1, 2, 3],
       [1, 2, 3]]), array([[4, 4, 4],
       [5, 5, 5]])]
Here is a useful idiom for getting contiguous copies instead of non-contiguous views.
>>> [np.array(a) for a in np.broadcast_arrays(x, y)]
[array([[1, 2, 3],
       [1, 2, 3]]), array([[4, 4, 4],
       [5, 5, 5]])]
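The statement that the results are views sharing memory with the inputs can be observed through strides. This sketch uses plain NumPy arrays, not dask arrays, and the variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3]])      # shape (1, 3)
y = np.array([[4], [5]])       # shape (2, 1)

bx, by = np.broadcast_arrays(x, y)

shares = np.shares_memory(bx, x)   # bx is a view of x
row_stride = bx.strides[0]         # 0: both rows reuse the same memory
copy = np.array(bx)                # independent copy, safe to write to

print(bx.shape, by.shape, shares, row_stride)
```

A stride of zero along the broadcast axis is why writing to one element of the view would appear to change several elements at once.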
- dask.array.broadcast_to(x, shape, chunks=None)
Broadcast an array to a new shape.
- Parameters
- xarray_like
The array to broadcast.
- shapetuple
The shape of the desired array.
- chunkstuple, optional
If provided, then the result will use these chunks instead of the same chunks as the source array. Setting chunks explicitly as part of broadcast_to is more efficient than rechunking afterwards. Chunks are only allowed to differ from the original shape along dimensions that are new on the result or have size 1 in the input array.
- Returns
- broadcastdask array
- dask.array.coarsen(reduction, x, axes, trim_excess=False, **kwargs)
Coarsen array by applying reduction to fixed size neighborhoods
- Parameters
- reduction: function
Function like np.sum, np.mean, etc…
- x: np.ndarray
Array to be coarsened
- axes: dict
Mapping of axis to coarsening factor
Examples
>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])
Provide dictionary of scale per dimension
>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
       [12, 15]])
You must avoid excess elements explicitly
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
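One way to picture the operation is a reshape that splits every axis into (number of neighborhoods, neighborhood size), followed by a reduction over the size axes. The function below is a hypothetical sketch of that idea in plain NumPy, not the dask implementation; it reproduces the example results above.

```python
import numpy as np

def coarsen_sketch(reduction, x, axes, trim_excess=False):
    """Hypothetical reshape-and-reduce sketch, for illustration only."""
    factors = [axes.get(i, 1) for i in range(x.ndim)]
    if trim_excess:
        # Drop trailing elements that do not fill a whole neighborhood.
        x = x[tuple(slice(0, (n // f) * f) for n, f in zip(x.shape, factors))]
    new_shape = []
    for n, f in zip(x.shape, factors):
        new_shape.extend([n // f, f])
    # Every second axis (1, 3, 5, ...) holds one neighborhood.
    reduce_axes = tuple(range(1, 2 * x.ndim, 2))
    return reduction(x.reshape(new_shape), axis=reduce_axes)

sums = coarsen_sketch(np.sum, np.array([1, 2, 3, 4, 5, 6]), {0: 2})
mins = coarsen_sketch(np.min, np.arange(24).reshape((4, 6)), {0: 2, 1: 3})
trimmed = coarsen_sketch(np.min, np.arange(1, 9), {0: 3}, trim_excess=True)

print(sums, mins, trimmed)
```

Without trim_excess=True, the reshape in this sketch raises a ValueError when a dimension is not divisible by its factor, which mirrors the need to handle excess elements explicitly.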
- dask.array.ceil(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.ceil.
Some inconsistencies with the Dask version may exist.
Return the ceiling of the input, element-wise.
The ceil of the scalar x is the smallest integer i, such that i >= x. It is often denoted as ⌈x⌉.
- Parameters
- xarray_like
Input data.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The ceiling of each element in x, with float dtype. This is a scalar if x is a scalar.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.ceil(a)
array([-1., -1., -0.,  1.,  2.,  2.,  2.])
dask.array.choose(a, choices)
Construct an array from an index array and a set of arrays to choose from.
This docstring was copied from numpy.choose.
Some inconsistencies with the Dask version may exist.
First of all, if confused or uncertain, definitely look at the Examples - in its full generality, this function is less simple than it might seem from the following code description (below ndi = numpy.lib.index_tricks):
np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)])
But this omits some subtleties. Here is a fully general summary:
Given an “index” array (a) of integers and a sequence of n arrays (choices), a and each choice array are first broadcast, as necessary, to arrays of a common shape; calling these Ba and Bchoices[i], i = 0,…,n-1 we have that, necessarily, Ba.shape == Bchoices[i].shape for each i. Then, a new array with shape Ba.shape is created as follows:
- if mode=raise (the default), then, first of all, each element of a (and thus Ba) must be in the range [0, n-1]; now, suppose that i (in that range) is the value at the (j0, j1, …, jm) position in Ba - then the value at the same position in the new array is the value in Bchoices[i] at that same position;
- if mode=wrap, values in a (and thus Ba) may be any (signed) integer; modular arithmetic is used to map integers outside the range [0, n-1] back into that range; and then the new array is constructed as above;
- if mode=clip, values in a (and thus Ba) may be any (signed) integer; negative integers are mapped to 0; values greater than n-1 are mapped to n-1; and then the new array is constructed as above.
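The element-wise rule described above can be checked with a small NumPy sketch. It covers only the simple case where a and every choice already share one shape (no broadcasting, default mode); the arrays are made up for illustration.

```python
import numpy as np

a = np.array([[1, 0, 1],
              [0, 1, 0]])
choices = [np.arange(6).reshape(2, 3),         # choice 0: values 0..5
           np.arange(6).reshape(2, 3) + 100]   # choice 1: values 100..105

picked = np.choose(a, choices)

# Rebuild the result position by position: at index idx, take the value
# found at idx in the choice array selected by a[idx].
manual = np.empty_like(picked)
for idx in np.ndindex(a.shape):
    manual[idx] = choices[a[idx]][idx]

print(picked)
```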
- Parameters
- aint array
This array must contain integers in [0, n-1], where n is the number of choices, unless mode=wrap or mode=clip, in which cases any integers are permissible.
- choices: sequence of arrays
Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.
- out: array, optional (Not supported in Dask)
If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode=’raise’; use other modes for better performance.
- mode{‘raise’ (default), ‘wrap’, ‘clip’}, optional (Not supported in Dask)
Specifies how indices outside [0, n-1] will be treated:
‘raise’ : an exception is raised
‘wrap’ : value becomes value mod n
‘clip’ : values < 0 are mapped to 0, values > n-1 are mapped to n-1
- Returns
- merged_arrayarray
The merged result.
- Raises
- ValueError: shape mismatch
If a and each choice array are not all broadcastable to the same shape.
See also
ndarray.choose
equivalent method
numpy.take_along_axis
Preferable if choices is an array
Notes
To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.
Examples
>>> choices = [[0, 1, 2, 3], [10, 11, 12, 13],
...   [20, 21, 22, 23], [30, 31, 32, 33]]
>>> np.choose([2, 3, 1, 0], choices
... # the first element of the result will be the first element of the
... # third (2+1) "array" in choices, namely, 20; the second element
... # will be the second element of the fourth (3+1) choice array, i.e.,
... # 31, etc.
... )
array([20, 31, 12,  3])
>>> np.choose([2, 4, 1, 0], choices, mode='clip') # 4 goes to 3 (4-1)
array([20, 31, 12,  3])
>>> # because there are 4 choice arrays
>>> np.choose([2, 4, 1, 0], choices, mode='wrap') # 4 goes to (4 mod 4)
array([20,  1, 12,  3])
>>> # i.e., 0
A couple examples illustrating how choose broadcasts:
>>> a = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
>>> choices = [-10, 10]
>>> np.choose(a, choices)
array([[ 10, -10,  10],
       [-10,  10, -10],
       [ 10, -10,  10]])
>>> # With thanks to Anne Archibald
>>> a = np.array([0, 1]).reshape((2,1,1))
>>> c1 = np.array([1, 2, 3]).reshape((1,3,1))
>>> c2 = np.array([-1, -2, -3, -4, -5]).reshape((1,1,5))
>>> np.choose(a, (c1, c2)) # result is 2x3x5, res[0,:,:]=c1, res[1,:,:]=c2
array([[[ 1,  1,  1,  1,  1],
        [ 2,  2,  2,  2,  2],
        [ 3,  3,  3,  3,  3]],
       [[-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5]]])
dask.array.clip(*args, **kwargs)
Clip (limit) the values in an array.
This docstring was copied from numpy.clip.
Some inconsistencies with the Dask version may exist.
Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.
Equivalent to but faster than np.maximum(a_min, np.minimum(a, a_max)). No check is performed to ensure a_min < a_max.
- Parameters
- aarray_like (Not supported in Dask)
Array containing elements to clip.
- a_minscalar or array_like or None (Not supported in Dask)
Minimum value. If None, clipping is not performed on lower interval edge. Not more than one of a_min and a_max may be None.
- a_maxscalar or array_like or None (Not supported in Dask)
Maximum value. If None, clipping is not performed on upper interval edge. Not more than one of a_min and a_max may be None. If a_min or a_max are array_like, then the three arrays will be broadcasted to match their shapes.
- outndarray, optional (Not supported in Dask)
The results will be placed in this array. It may be the input array for in-place clipping. out must be of the right shape to hold the output. Its type is preserved.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
New in version 1.17.0.
- Returns
- clipped_arrayndarray
An array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.
See also
ufuncs-output-type
Examples
>>> a = np.arange(10)
>>> np.clip(a, 1, 8)
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a)
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3, 4, 1, 1, 1, 4, 4, 4, 4, 4], 8)
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])
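The stated equivalence with np.maximum(a_min, np.minimum(a, a_max)) can be sketched on a dask array as follows. The bounds 1 and 8 and the chunk size are arbitrary, and the arguments are passed positionally; this is an illustration, not a statement about how dask implements clip internally.

```python
import numpy as np
import dask.array as da

x = da.arange(10, chunks=5)

clipped = da.clip(x, 1, 8)                  # positional: array, a_min, a_max
composed = da.maximum(1, da.minimum(x, 8))  # the equivalent formulation

clipped_values = clipped.compute()
composed_values = composed.compute()
print(clipped_values)
```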
dask.array.compress(condition, a, axis=None)
Return selected slices of an array along given axis.
This docstring was copied from numpy.compress.
Some inconsistencies with the Dask version may exist.
When working along a given axis, a slice along that axis is returned in output for each index where condition evaluates to True. When working on a 1-D array, compress is equivalent to extract.
- Parameters
- condition1-D array of bools
Array that selects which entries to return. If len(condition) is less than the size of a along the given axis, then output is truncated to the length of the condition array.
- aarray_like
Array from which to extract a part.
- axisint, optional
Axis along which to take slices. If None (default), work on the flattened array.
- outndarray, optional (Not supported in Dask)
Output array. Its type is preserved and it must be of the right shape to hold the output.
- Returns
- compressed_arrayndarray
A copy of a without the slices along axis for which condition is false.
Examples
>>> a = np.array([[1, 2], [3, 4], [5, 6]])
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.compress([0, 1], a, axis=0)
array([[3, 4]])
>>> np.compress([False, True, True], a, axis=0)
array([[3, 4],
       [5, 6]])
>>> np.compress([False, True], a, axis=1)
array([[2],
       [4],
       [6]])
Working on the flattened array does not return slices along an axis but selects elements.
>>> np.compress([False, True], a)
array([2])
dask.array.concatenate(seq, axis=0, allow_unknown_chunksizes=False)
Concatenate arrays along an existing axis
Given a sequence of dask Arrays, form a new dask Array by stacking them along an existing dimension (axis=0 by default).
- Parameters
- seq: list of dask.arrays
- axis: int
Dimension along which to align all of the arrays
- allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)
Result is a new dask Array
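As a small sketch of what the result looks like, the snippet below concatenates three chunked arrays and inspects the chunk layout of the new array. The fill values 0, 1, 2 are an assumption made only so the pieces are distinguishable.

```python
import numpy as np
import dask.array as da

# Three 4x4 arrays, each split into 2x2 chunks and filled with 0, 1, 2.
blocks = [da.from_array(np.full((4, 4), i), chunks=(2, 2)) for i in range(3)]

joined = da.concatenate(blocks, axis=0)
print(joined.shape)   # shape along axis 0 is the sum of the inputs
print(joined.chunks)  # chunks of the inputs are kept, placed end to end

values = joined.compute()
```

The existing chunks are reused rather than merged, so concatenation itself does not rechunk the data.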
dask.array.conj(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.conjugate.
Some inconsistencies with the Dask version may exist.
Return the complex conjugate, element-wise.
The complex conjugate of a complex number is obtained by changing the sign of its imaginary part.
- Parameters
- xarray_like
Input value.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The complex conjugate of x, with same dtype as y. This is a scalar if x is a scalar.
Notes
conj is an alias for conjugate:
>>> np.conj is np.conjugate
True
Examples
>>> np.conjugate(1+2j)
(1-2j)
>>> x = np.eye(2) + 1j * np.eye(2)
>>> np.conjugate(x)
array([[ 1.-1.j,  0.-0.j],
       [ 0.-0.j,  1.-1.j]])
dask.array.copysign(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.copysign.
Some inconsistencies with the Dask version may exist.
Change the sign of x1 to that of x2, element-wise.
If x2 is a scalar, its sign will be copied to all elements of x1.
- Parameters
- x1array_like
Values to change the sign of.
- x2array_like
The sign of x2 is copied to x1. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
The values of x1 with the sign of x2. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.copysign(1.3, -1)
-1.3
>>> 1/np.copysign(0, 1)
inf
>>> 1/np.copysign(0, -1)
-inf
>>> np.copysign([-1, 0, 1], -1.1)
array([-1., -0., -1.])
>>> np.copysign([-1, 0, 1], np.arange(3)-1)
array([-1.,  0.,  1.])
dask.array.corrcoef(x, y=None, rowvar=1)
Return Pearson product-moment correlation coefficients.
This docstring was copied from numpy.corrcoef.
Some inconsistencies with the Dask version may exist.
Please refer to the documentation for cov for more detail. The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is
$$R_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} C_{jj}}}$$
The values of R are between -1 and 1, inclusive.
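The relationship between R and C stated above can be verified numerically. The following NumPy sketch uses made-up sample data with two variables in rows; it divides each covariance entry by the square root of the product of the matching diagonal entries.

```python
import numpy as np

# Two variables (rows), four observations each; values are arbitrary.
x = np.array([[0.0, 1.0, 2.0, 4.0],
              [1.0, 3.0, 2.0, 0.5]])

C = np.cov(x)                  # covariance matrix
d = np.sqrt(np.diag(C))        # sqrt(C_ii) for each variable
R_manual = C / np.outer(d, d)  # R_ij = C_ij / sqrt(C_ii * C_jj)

R = np.corrcoef(x)
print(np.allclose(R, R_manual))
```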
- Parameters
- xarray_like
A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below.
- yarray_like, optional
An additional set of variables and observations. y has the same shape as x.
- rowvarbool, optional
If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
- bias_NoValue, optional (Not supported in Dask)
Has no effect, do not use.
Deprecated since version 1.10.0.
- ddof_NoValue, optional (Not supported in Dask)
Has no effect, do not use.
Deprecated since version 1.10.0.
- Returns
- Rndarray
The correlation coefficient matrix of the variables.
See also
cov
Covariance matrix
Notes
Due to floating point rounding the resulting array may not be Hermitian, the diagonal elements may not be 1, and the elements may not satisfy the inequality abs(a) <= 1. The real and imaginary parts are clipped to the interval [-1, 1] in an attempt to improve on that situation, but this is not much help in the complex case.
This function accepts but discards arguments bias and ddof. This is for backwards compatibility with previous versions of this function. These arguments had no effect on the return values of the function and can be safely ignored in this and previous versions of numpy.
dask.array.cos(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.cos.
Some inconsistencies with the Dask version may exist.
Cosine element-wise.
- Parameters
- xarray_like
Input array in radians.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The corresponding cosine values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.
Examples
>>> np.cos(np.array([0, np.pi/2, np.pi]))
array([ 1.00000000e+00,  6.12303177e-17, -1.00000000e+00])
>>>
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d')
>>> out2 = np.cos([0.1], out1)
>>> out2 is out1
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.cos(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
dask.array.cosh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.cosh.
Some inconsistencies with the Dask version may exist.
Hyperbolic cosine, element-wise.
Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).
- Parameters
- xarray_like
Input array.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Output array of same shape as x. This is a scalar if x is a scalar.
Examples
>>> np.cosh(0)
1.0
The hyperbolic cosine describes the shape of a hanging cable:
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-4, 4, 1000)
>>> plt.plot(x, np.cosh(x))
>>> plt.show()
dask.array.count_nonzero(a, axis=None)
Counts the number of non-zero values in the array a.
This docstring was copied from numpy.count_nonzero.
Some inconsistencies with the Dask version may exist.
The word “non-zero” is in reference to the Python 2.x built-in method __nonzero__() (renamed __bool__() in Python 3.x) of Python objects that tests an object’s “truthfulness”. For example, any number is considered truthful if it is nonzero, whereas any string is considered truthful if it is not the empty string. Thus, this function (recursively) counts how many elements in a (and in sub-arrays thereof) have their __nonzero__() or __bool__() method evaluated to True.
- Parameters
- aarray_like
The array for which to count non-zeros.
- axisint or tuple, optional
Axis or tuple of axes along which to count non-zeros. Default is None, meaning that non-zeros will be counted along a flattened version of a.
New in version 1.12.0.
- Returns
- countint or array of int
Number of non-zero values in the array along a given axis. Otherwise, the total number of non-zero values in the array is returned.
See also
nonzero
Return the coordinates of all the non-zero values.
Examples
>>> np.count_nonzero(np.eye(4))
4
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]])
5
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=0)
array([1, 1, 1, 1, 1])
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=1)
array([2, 3])
dask.array.cov(m, y=None, rowvar=1, bias=0, ddof=None)
Estimate a covariance matrix, given data and weights.
This docstring was copied from numpy.cov.
Some inconsistencies with the Dask version may exist.
Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, $X = [x_1, x_2, ... x_N]^T$, then the covariance matrix element $C_{ij}$ is the covariance of $x_i$ and $x_j$. The element $C_{ii}$ is the variance of $x_i$.
See the notes for an outline of the algorithm.
- Parameters
- marray_like
A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.
- yarray_like, optional
An additional set of variables and observations. y has the same form as that of m.
- rowvarbool, optional
If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
- biasbool, optional
Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.
- ddof: int, optional
If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.
New in version 1.5.
- fweightsarray_like, int, optional (Not supported in Dask)
1-D array of integer frequency weights; the number of times each observation vector should be repeated.
New in version 1.10.
- aweightsarray_like, optional (Not supported in Dask)
1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.
New in version 1.10.
- Returns
- outndarray
The covariance matrix of the variables.
See also
corrcoef
Normalized covariance matrix
Notes
Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:
>>> m = np.arange(10, dtype=np.float64)
>>> f = np.arange(10) * 2
>>> a = np.arange(10) ** 2.
>>> ddof = 1
>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)
Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.
Examples
Consider two variables, x0 and x1, which correlate perfectly, but in opposite directions:
>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
       [2, 1, 0]])
Note how x0 increases while x1 decreases. The covariance matrix shows this clearly:
>>> np.cov(x)
array([[ 1., -1.],
       [-1.,  1.]])
Note that element C0,1, which shows the correlation between x0 and x1, is negative.
Further, note how x and y are combined:
>>> x = [-2.1, -1, 4.3]
>>> y = [3, 1.1, 0.12]
>>> X = np.stack((x, y), axis=0)
>>> np.cov(X)
array([[11.71    , -4.286   ], # may vary
       [-4.286   ,  2.144133]])
>>> np.cov(x, y)
array([[11.71    , -4.286   ], # may vary
       [-4.286   ,  2.144133]])
>>> np.cov(x)
array(11.71)
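The effect of bias and ddof on the normalization can be illustrated with the perfectly anti-correlated data from the first example. This is a NumPy sketch; it only demonstrates the N versus N - 1 divisor and does not exercise the weights arguments.

```python
import numpy as np

x = np.array([[0, 1, 2],
              [2, 1, 0]])   # two variables, N = 3 observations

unbiased = np.cov(x)               # default: divide by N - 1 = 2
biased = np.cov(x, bias=True)      # divide by N = 3
explicit = np.cov(x, ddof=0)       # ddof overrides bias; same divisor N

print(unbiased)
print(biased)
```

With three observations the two estimates differ by the factor (N - 1) / N = 2/3.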
dask.array.cumprod(x, axis=None, dtype=None, out=None)
Return the cumulative product of elements along a given axis.
This docstring was copied from numpy.cumprod.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like (Not supported in Dask)
Input array.
- axisint, optional
Axis along which the cumulative product is computed. By default the input is flattened.
- dtypedtype, optional
Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.
- outndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.
- Returns
- cumprodndarray
A new array holding the result is returned unless out is specified, in which case a reference to out is returned.
See also
ufuncs-output-type
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
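A brief NumPy sketch of the overflow behaviour: with an 8-bit accumulator forced through dtype, the running product wraps around silently. The input values are chosen only to trigger the wrap.

```python
import numpy as np

small = np.array([16, 16, 16], dtype=np.int8)

# Forcing an int8 accumulator: 16 * 16 = 256 wraps to 0 modulo 2**8.
wrapped = np.cumprod(small, dtype=np.int8)

# Without dtype, the default platform integer is used as the accumulator,
# which is wide enough for these values.
promoted = np.cumprod(small)

print(wrapped, promoted)
```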
Examples
>>> a = np.array([1,2,3])
>>> np.cumprod(a) # intermediate results 1, 1*2
...               # total product 1*2*3 = 6
array([1, 2, 6])
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> np.cumprod(a, dtype=float) # specify type of output
array([   1.,    2.,    6.,   24.,  120.,  720.])
The cumulative product for each column (i.e., over the rows) of a:
>>> np.cumprod(a, axis=0)
array([[ 1,  2,  3],
       [ 4, 10, 18]])
The cumulative product for each row (i.e. over the columns) of a:
>>> np.cumprod(a,axis=1)
array([[  1,   2,   6],
       [  4,  20, 120]])
dask.array.cumsum(x, axis=None, dtype=None, out=None)
Return the cumulative sum of the elements along a given axis.
This docstring was copied from numpy.cumsum.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like (Not supported in Dask)
Input array.
- axisint, optional
Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.
- dtypedtype, optional
Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.
- outndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See ufuncs-output-type for more details.
- Returns
- cumsum_along_axisndarray.
A new array holding the result is returned unless out is specified, in which case a reference to out is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
Examples
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.cumsum(a)
array([ 1,  3,  6, 10, 15, 21])
>>> np.cumsum(a, dtype=float)     # specifies type of output value(s)
array([  1.,   3.,   6.,  10.,  15.,  21.])
>>> np.cumsum(a,axis=0)      # sum over rows for each of the 3 columns
array([[1, 2, 3],
       [5, 7, 9]])
>>> np.cumsum(a,axis=1)      # sum over columns for each of the 2 rows
array([[ 1,  3,  6],
       [ 4,  9, 15]])
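For comparison, a hedged sketch of the dask version with an explicit axis. The array is split so that each row lives in its own chunk (an arbitrary choice), which means the running total has to be carried across chunk boundaries.

```python
import numpy as np
import dask.array as da

data = np.array([[1, 2, 3],
                 [4, 5, 6]])
x = da.from_array(data, chunks=(1, 3))  # one row per chunk

running = da.cumsum(x, axis=0)          # lazy cumulative sum down the rows
result = running.compute()
print(result)
```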
dask.array.deg2rad(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.deg2rad.
Some inconsistencies with the Dask version may exist.
Convert angles from degrees to radians.
- Parameters
- xarray_like
Angles in degrees.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The corresponding angle in radians. This is a scalar if x is a scalar.
See also
rad2deg
Convert angles from radians to degrees.
unwrap
Remove large jumps in angle by wrapping.
Notes
New in version 1.3.0.
deg2rad(x) is x * pi / 180.
Examples
>>> np.deg2rad(180)
3.1415926535897931
dask.array.degrees(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.degrees.
Some inconsistencies with the Dask version may exist.
Convert angles from radians to degrees.
- Parameters
- xarray_like
Input array in radians.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray of floats
The corresponding degree values; if out was supplied this is a reference to it. This is a scalar if x is a scalar.
See also
rad2deg
equivalent function
Examples
Convert a radian array to degrees
>>> rad = np.arange(12.)*np.pi/6
>>> np.degrees(rad)
array([   0.,   30.,   60.,   90.,  120.,  150.,  180.,  210.,  240.,
        270.,  300.,  330.])
>>> out = np.zeros((rad.shape))
>>> r = np.degrees(rad, out)
>>> np.all(r == out)
True
dask.array.diag(v)
Extract a diagonal or construct a diagonal array.
This docstring was copied from numpy.diag.
Some inconsistencies with the Dask version may exist.
See the more detailed documentation for numpy.diagonal if you use this function to extract a diagonal and wish to write to the resulting array; whether it returns a copy or a view depends on what version of numpy you are using.
- Parameters
- varray_like
If v is a 2-D array, return a copy of its k-th diagonal. If v is a 1-D array, return a 2-D array with v on the k-th diagonal.
- kint, optional (Not supported in Dask)
Diagonal in question. The default is 0. Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal.
- Returns
- outndarray
The extracted diagonal or constructed diagonal array.
Examples
>>> x = np.arange(9).reshape((3,3))
>>> x
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.diag(x)
array([0, 4, 8])
>>> np.diag(x, k=1)
array([1, 5])
>>> np.diag(x, k=-1)
array([3, 7])
>>> np.diag(np.diag(x))
array([[0, 0, 0],
       [0, 4, 0],
       [0, 0, 8]])
dask.array.diagonal(a, offset=0, axis1=0, axis2=1)
Return specified diagonals.
This docstring was copied from numpy.diagonal.
Some inconsistencies with the Dask version may exist.
If a is 2-D, returns the diagonal of a with the given offset, i.e., the collection of elements of the form a[i, i+offset]. If a has more than two dimensions, then the axes specified by axis1 and axis2 are used to determine the 2-D sub-array whose diagonal is returned. The shape of the resulting array can be determined by removing axis1 and axis2 and appending an index to the right equal to the size of the resulting diagonals.
In versions of NumPy prior to 1.7, this function always returned a new, independent array containing a copy of the values in the diagonal.
In NumPy 1.7 and 1.8, it continues to return a copy of the diagonal, but depending on this fact is deprecated. Writing to the resulting array continues to work as it used to, but a FutureWarning is issued.
Starting in NumPy 1.9 it returns a read-only view on the original array. Attempting to write to the resulting array will produce an error.
In some future release, it will return a read/write view and writing to the returned array will alter your original array. The returned array will have the same type as the input array.
If you don’t write to the array returned by this function, then you can just ignore all of the above.
If you depend on the current behavior, then we suggest copying the returned array explicitly, i.e., use np.diagonal(a).copy() instead of just np.diagonal(a). This will work with both past and future versions of NumPy.
- Parameters
- aarray_like
Array from which the diagonals are taken.
- offsetint, optional
Offset of the diagonal from the main diagonal. Can be positive or negative. Defaults to main diagonal (0).
- axis1int, optional
Axis to be used as the first axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to first axis (0).
- axis2int, optional
Axis to be used as the second axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to second axis (1).
- Returns
- array_of_diagonalsndarray
If a is 2-D, then a 1-D array containing the diagonal and of the same type as a is returned unless a is a matrix, in which case a 1-D array rather than a (2-D) matrix is returned in order to maintain backward compatibility.
If a.ndim > 2, then the dimensions specified by axis1 and axis2 are removed, and a new axis inserted at the end corresponding to the diagonal.
- Raises
- ValueError
If the dimension of a is less than 2.
See also
diag
MATLAB work-a-like for 1-D and 2-D arrays.
diagflat
Create diagonal arrays.
trace
Sum along diagonals.
Examples
>>> a = np.arange(4).reshape(2,2) >>> a array([[0, 1], [2, 3]]) >>> a.diagonal() array([0, 3]) >>> a.diagonal(1) array([1])
A 3-D example:
>>> a = np.arange(8).reshape(2,2,2); a
array([[[0, 1],
        [2, 3]],
       [[4, 5],
        [6, 7]]])
>>> a.diagonal(0,  # Main diagonals of two arrays created by skipping
...            0,  # across the outer(left)-most axis last and
...            1)  # the "middle" (row) axis first.
array([[0, 6],
       [1, 7]])
The sub-arrays whose main diagonals we just obtained; note that each corresponds to fixing the right-most (column) axis, and that the diagonals are “packed” in rows.
>>> a[:,:,0]  # main diagonal is [0 6]
array([[0, 2],
       [4, 6]])
>>> a[:,:,1]  # main diagonal is [1 7]
array([[1, 3],
       [5, 7]])
The anti-diagonal can be obtained by reversing the order of elements using either numpy.flipud or numpy.fliplr.
>>> a = np.arange(9).reshape(3, 3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.fliplr(a).diagonal()  # Horizontal flip
array([2, 4, 6])
>>> np.flipud(a).diagonal()  # Vertical flip
array([6, 4, 2])
Note that the order in which the diagonal is retrieved varies depending on the flip function.
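The copy recommendation above can be sketched as follows. This is a minimal illustration using plain NumPy arrays; the variable names are illustrative only. Copying the diagonal before writing keeps the original array untouched whether diagonal returns a copy or a view.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Take an explicit copy so that writes never reach the original array.
d = np.diagonal(a).copy()
d[0] = 100

print(d)        # the modified copy
print(a[0, 0])  # the original element is unchanged
```

Code written this way does not depend on which of the historical behaviors described above is in effect.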
-
dask.array.
diff
(a, n=1, axis=-1)¶ Calculate the n-th discrete difference along the given axis.
This docstring was copied from numpy.diff.
Some inconsistencies with the Dask version may exist.
The first difference is given by out[i] = a[i+1] - a[i] along the given axis; higher differences are calculated by using diff recursively.
- Parameters
- aarray_like
Input array
- nint, optional
The number of times values are differenced. If zero, the input is returned as-is.
- axisint, optional
The axis along which the difference is taken, default is the last axis.
- prepend, appendarray_like, optional
Values to prepend or append to a along axis prior to performing the difference. Scalar values are expanded to arrays with length 1 in the direction of axis and the shape of the input array along all other axes. Otherwise the dimension and shape must match a except along axis.
New in version 1.16.0.
- Returns
- diffndarray
The n-th differences. The shape of the output is the same as a except along axis where the dimension is smaller by n. The type of the output is the same as the type of the difference between any two elements of a. This is the same as the type of a in most cases. A notable exception is datetime64, which results in a timedelta64 output array.
Notes
Type is preserved for boolean arrays, so the result will contain False when consecutive elements are the same and True when they differ.
For unsigned integer arrays, the results will also be unsigned. This should not be surprising, as the result is consistent with calculating the difference directly:
>>> u8_arr = np.array([1, 0], dtype=np.uint8)
>>> np.diff(u8_arr)
array([255], dtype=uint8)
>>> u8_arr[1,...] - u8_arr[0,...]
255
If this is not desirable, then the array should be cast to a larger integer type first:
>>> i16_arr = u8_arr.astype(np.int16)
>>> np.diff(i16_arr)
array([-1], dtype=int16)
Examples
>>> x = np.array([1, 2, 4, 7, 0])
>>> np.diff(x)
array([ 1,  2,  3, -7])
>>> np.diff(x, n=2)
array([  1,   1, -10])
>>> x = np.array([[1, 3, 6, 10], [0, 5, 6, 8]])
>>> np.diff(x)
array([[2, 3, 4],
       [5, 1, 2]])
>>> np.diff(x, axis=0)
array([[-1,  2,  0, -2]])
>>> x = np.arange('1066-10-13', '1066-10-16', dtype=np.datetime64)
>>> np.diff(x)
array([1, 1], dtype='timedelta64[D]')
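As a small sketch of the recursive definition and of the prepend parameter described above, the following uses numpy.diff directly (the Dask signature shown above may not accept every NumPy keyword, so treat this as a NumPy-only illustration). The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 4, 7, 0])

# n=2 is the same as applying the first difference twice.
twice = np.diff(np.diff(x))
direct = np.diff(x, n=2)

# Prepending the first element keeps the output the same length as the input;
# the scalar is expanded to a length-1 array along the axis.
same_length = np.diff(x, prepend=x[0])

print(twice, direct, same_length)
```

Prepending the first element is a common way to keep differences aligned with the original samples.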
-
dask.array.
digitize
(a, bins, right=False)¶ Return the indices of the bins to which each value in input array belongs.
This docstring was copied from numpy.digitize.
Some inconsistencies with the Dask version may exist.
right    order of bins    returned index i satisfies
False    increasing       bins[i-1] <= x < bins[i]
True     increasing       bins[i-1] < x <= bins[i]
False    decreasing       bins[i-1] > x >= bins[i]
True     decreasing       bins[i-1] >= x > bins[i]
If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate.
- Parameters
- xarray_like (Not supported in Dask)
Input array to be binned. Prior to NumPy 1.10.0, this array had to be 1-dimensional, but can now have any shape.
- binsarray_like
Array of bins. It has to be 1-dimensional and monotonic.
- rightbool, optional
Indicates whether the intervals include the right or the left bin edge. The default (right=False) means that the interval does not include the right edge: the left bin edge is closed and the right bin edge is open, i.e., bins[i-1] <= x < bins[i] is the default behavior for monotonically increasing bins.
- Returns
- indicesndarray of ints
Output array of indices, of same shape as x.
- Raises
- ValueError
If bins is not monotonic.
- TypeError
If the type of the input is complex.
Notes
If values in x are such that they fall outside the bin range, attempting to index bins with the indices that digitize returns will result in an IndexError.
New in version 1.10.0.
np.digitize is implemented in terms of np.searchsorted. This means that a binary search is used to bin the values, which scales much better for a larger number of bins than the previous linear search. It also removes the requirement for the input array to be 1-dimensional.
For monotonically increasing bins, the following are equivalent:
np.digitize(x, bins, right=True)
np.searchsorted(bins, x, side='left')
Note that as the order of the arguments is reversed, the side must be too. The searchsorted call is marginally faster, as it does not do any monotonicity checks. Perhaps more importantly, it supports all dtypes.
Examples
>>> x = np.array([0.2, 6.4, 3.0, 1.6])
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
>>> inds = np.digitize(x, bins)
>>> inds
array([1, 4, 3, 2])
>>> for n in range(x.size):
...     print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
...
0.0 <= 0.2 < 1.0
4.0 <= 6.4 < 10.0
2.5 <= 3.0 < 4.0
1.0 <= 1.6 < 2.5
>>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.])
>>> bins = np.array([0, 5, 10, 15, 20])
>>> np.digitize(x,bins,right=True)
array([1, 2, 3, 4, 4])
>>> np.digitize(x,bins,right=False)
array([1, 3, 3, 4, 5])
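The equivalence with searchsorted and the out-of-range behavior described in the notes can be checked with a short sketch. It uses plain NumPy and illustrative variable names; the masking step is one possible way to avoid indexing bins with an out-of-range result, not something prescribed by the docstring.

```python
import numpy as np

x = np.array([-1.0, 0.2, 2.5, 6.4, 12.0])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])  # monotonically increasing

closed_right = np.digitize(x, bins, right=True)
closed_left = np.digitize(x, bins, right=False)

# For increasing bins the side is the opposite of the `right` flag.
via_left = np.searchsorted(bins, x, side='left')
via_right = np.searchsorted(bins, x, side='right')

# Values outside the bin range map to 0 or len(bins); mask them out
# before using the result to look up bin edges.
in_range = (closed_left > 0) & (closed_left < len(bins))
lower_edges = bins[closed_left[in_range] - 1]

print(closed_right, closed_left, lower_edges)
```

Note how the value 2.5, which sits exactly on a bin edge, lands in a different bin depending on right.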
-
dask.array.
dot
(a, b, out=None)¶ This docstring was copied from numpy.dot.
Some inconsistencies with the Dask version may exist.
Dot product of two arrays. Specifically,
- If both a and b are 1-D arrays, it is the inner product of vectors (without complex conjugation).
- If both a and b are 2-D arrays, it is matrix multiplication, but using matmul() or a @ b is preferred.
- If either a or b is 0-D (scalar), it is equivalent to multiply() and using numpy.multiply(a, b) or a * b is preferred.
- If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
- If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
- Parameters
- aarray_like
First argument.
- barray_like
Second argument.
- outndarray, optional
Output argument. This must have the exact kind that would be returned if it was not used. In particular, it must have the right type, must be C-contiguous, and its dtype must be the dtype that would be returned for dot(a,b). This is a performance feature. Therefore, if these conditions are not met, an exception is raised, instead of attempting to be flexible.
- Returns
- outputndarray
Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned.
- Raises
- ValueError
If the last dimension of a is not the same size as the second-to-last dimension of b.
Examples
>>> np.dot(3, 4)
12
Neither argument is complex-conjugated:
>>> np.dot([2j, 3j], [2j, 3j])
(-13+0j)
For 2-D arrays it is the matrix product:
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
       [2, 2]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))
>>> np.dot(a, b)[2,3,2,1,2,2]
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])
499128
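The N-D rule dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m]) can be verified on a small case. This is a NumPy-only sketch with illustrative shapes; the einsum spelling is included only as a cross-check of the same contraction.

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.arange(5 * 4 * 6).reshape(5, 4, 6)

# The last axis of a (length 4) is contracted with the
# second-to-last axis of b (length 4).
full = np.dot(a, b)

i, j, k, m = 1, 2, 3, 4
manual = np.sum(a[i, j, :] * b[k, :, m])

# The same contraction written with explicit subscripts.
same = np.einsum('ijx,kxm->ijkm', a, b)

print(full.shape, full[i, j, k, m], manual)
```

The result keeps all remaining axes of a followed by all remaining axes of b, which is why the output here has four dimensions.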
-
dask.array.
dstack
(tup, allow_unknown_chunksizes=False)¶ Stack arrays in sequence depth wise (along third axis).
This docstring was copied from numpy.dstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1). Rebuilds arrays divided by dsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
- Parameters
- tupsequence of arrays
The arrays must have the same shape along all but the third axis. 1-D or 2-D arrays must have the same shape.
- Returns
- stackedndarray
The array formed by stacking the given arrays, will be at least 3-D.
See also
stack
Join a sequence of arrays along a new axis.
vstack
Stack along first axis.
hstack
Stack along second axis.
concatenate
Join a sequence of arrays along an existing axis.
dsplit
Split array along third axis.
Examples
>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.dstack((a,b))
array([[[1, 2],
        [2, 3],
        [3, 4]]])
>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.dstack((a,b))
array([[[1, 2]],
       [[2, 3]],
       [[3, 4]]])
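The description of dstack as a reshape followed by concatenation along the third axis can be spelled out by hand. The sketch below uses plain NumPy and illustrative names, and reshapes the 1-D inputs to (1, N, 1) exactly as the text describes.

```python
import numpy as np

a = np.array((1, 2, 3))
b = np.array((2, 3, 4))

stacked = np.dstack((a, b))

# 1-D arrays of shape (N,) become (1, N, 1) before concatenation on axis 2.
manual = np.concatenate([a.reshape(1, -1, 1), b.reshape(1, -1, 1)], axis=2)

print(stacked.shape)
```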
-
dask.array.
ediff1d
(ary, to_end=None, to_begin=None)¶ The differences between consecutive elements of an array.
This docstring was copied from numpy.ediff1d.
Some inconsistencies with the Dask version may exist.
- Parameters
- aryarray_like
If necessary, will be flattened before the differences are taken.
- to_endarray_like, optional
Number(s) to append at the end of the returned differences.
- to_beginarray_like, optional
Number(s) to prepend at the beginning of the returned differences.
- Returns
- ediff1dndarray
The differences. Loosely, this is ary.flat[1:] - ary.flat[:-1].
Notes
When applied to masked arrays, this function drops the mask information if the to_begin and/or to_end parameters are used.
Examples
>>> x = np.array([1, 2, 4, 7, 0])
>>> np.ediff1d(x)
array([ 1,  2,  3, -7])
>>> np.ediff1d(x, to_begin=-99, to_end=np.array([88, 99]))
array([-99,   1,   2, ...,  -7,  88,  99])
The returned array is always 1D.
>>> y = [[1, 2, 4], [1, 6, 24]]
>>> np.ediff1d(y)
array([ 1,  2, -3,  5, 18])
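The "loosely equivalent" slice expression from the Returns section can be compared with ediff1d directly. This is a NumPy-only sketch with illustrative names; it also shows where to_begin and to_end land in the output.

```python
import numpy as np

y = np.array([[1, 2, 4], [1, 6, 24]])

# Flatten first, then subtract each element from its successor.
flat = y.ravel()
manual = flat[1:] - flat[:-1]

plain = np.ediff1d(y)
padded = np.ediff1d(y, to_begin=-99, to_end=[88, 99])

print(plain, padded)
```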
-
dask.array.
empty
(*args, **kwargs)¶ Blocked variant of empty
Follows the signature of empty exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.
Original signature follows below.
empty(shape, dtype=float, order='C')
Return a new array of given shape and type, without initializing entries.
- Parameters
- shapeint or tuple of int
Shape of the empty array, e.g., (2, 3) or 2.
- dtypedata-type, optional
Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.
- order{‘C’, ‘F’}, optional, default: ‘C’
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
- Returns
- outndarray
Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.
See also
empty_like
Return an empty array with shape and type of input.
ones
Return a new array setting values to one.
zeros
Return a new array setting values to zero.
full
Return a new array of given shape filled with value.
Notes
empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.
Examples
>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
       [  2.13182611e-314,   3.06959433e-309]])         #uninitialized
>>> np.empty([2, 2], dtype=int)
array([[-1073741821, -1067949133],
       [  496041986,    19249760]])                     #uninitialized
-
dask.array.
empty_like
(a, dtype=None, order='C', chunks=None, name=None, shape=None)¶ Return a new array with the same shape and type as a given array.
- Parameters
- aarray_like
The shape and data-type of a define these same attributes of the returned array.
- dtypedata-type, optional
Overrides the data type of the result.
- order{‘C’, ‘F’}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
- chunkssequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
- namestr, optional
An optional keyname for the array. Defaults to hashing the input keyword arguments.
- shapeint or sequence of ints, optional.
Overrides the shape of the result.
- Returns
- outndarray
Array of uninitialized (arbitrary) data with the same shape and type as a.
See also
ones_like
Return an array of ones with shape and type of input.
zeros_like
Return an array of zeros with shape and type of input.
empty
Return a new uninitialized array.
ones
Return a new array setting values to one.
zeros
Return a new array setting values to zero.
Notes
This function does not initialize the returned array; to do that use zeros_like or ones_like instead. It may be marginally faster than the functions that do set the array values.
-
dask.array.
einsum
(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)¶ This docstring was copied from numpy.einsum.
Some inconsistencies with the Dask version may exist.
Evaluates the Einstein summation convention on the operands.
Using the Einstein summation convention, many common multi-dimensional, linear algebraic array operations can be represented in a simple fashion. In implicit mode einsum computes these values.
In explicit mode, einsum provides further flexibility to compute other array operations that might not be considered classical Einstein summation operations, by disabling, or forcing summation over specified subscript labels.
See the notes and examples for clarification.
- Parameters
- subscriptsstr
Specifies the subscripts for summation as comma separated list of subscript labels. An implicit (classical Einstein summation) calculation is performed unless the explicit indicator ‘->’ is included as well as subscript labels of the precise output form.
- operandslist of array_like
These are the arrays for the operation.
- outndarray, optional
If provided, the calculation is done into this array.
- dtype{data-type, None}, optional
If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal casting parameter to allow the conversions. Default is None.
- order{‘C’, ‘F’, ‘A’, ‘K’}, optional
Controls the memory layout of the output. ‘C’ means it should be C contiguous. ‘F’ means it should be Fortran contiguous, ‘A’ means it should be ‘F’ if the inputs are all ‘F’, ‘C’ otherwise. ‘K’ means it should be as close to the layout as the inputs as is possible, including arbitrarily permuted axes. Default is ‘K’.
- casting{‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional
Controls what kind of data casting may occur. Setting this to ‘unsafe’ is not recommended, as it can adversely affect accumulations.
‘no’ means the data types should not be cast at all.
‘equiv’ means only byte-order changes are allowed.
‘safe’ means only casts which can preserve values are allowed.
‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
‘unsafe’ means any data conversions may be done.
Default is ‘safe’.
- optimize{False, True, ‘greedy’, ‘optimal’}, optional
Controls if intermediate optimization should occur. No optimization will occur if False and True will default to the ‘greedy’ algorithm. Also accepts an explicit contraction list from the np.einsum_path function. See np.einsum_path for more details. Defaults to False.
- Returns
- outputndarray
The calculation based on the Einstein summation convention.
Notes
New in version 1.6.0.
The Einstein summation convention can be used to compute many multi-dimensional, linear algebraic array operations. einsum provides a succinct way of representing these.
A non-exhaustive list of these operations, which can be computed by einsum, is shown below along with examples:
- Trace of an array, numpy.trace().
- Return a diagonal, numpy.diag().
- Array axis summations, numpy.sum().
- Transpositions and permutations, numpy.transpose().
- Matrix multiplication and dot product, numpy.matmul(), numpy.dot().
- Vector inner and outer products, numpy.inner(), numpy.outer().
- Broadcasting, element-wise and scalar multiplication, numpy.multiply().
- Tensor contractions, numpy.tensordot().
- Chained array operations, in efficient calculation order, numpy.einsum_path().
The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Whenever a label is repeated it is summed, so np.einsum('i,i', a, b) is equivalent to np.inner(a,b). If a label appears only once, it is not summed, so np.einsum('i', a) produces a view of a with no changes. A further example np.einsum('ij,jk', a, b) describes traditional matrix multiplication and is equivalent to np.matmul(a,b). Repeated subscript labels in one operand take the diagonal. For example, np.einsum('ii', a) is equivalent to np.trace(a).
In implicit mode, the chosen subscripts are important since the axes of the output are reordered alphabetically. This means that np.einsum('ij', a) doesn’t affect a 2D array, while np.einsum('ji', a) takes its transpose. Additionally, np.einsum('ij,jk', a, b) returns a matrix multiplication, while np.einsum('ij,jh', a, b) returns the transpose of the multiplication since subscript ‘h’ precedes subscript ‘i’.
In explicit mode the output can be directly controlled by specifying output subscript labels. This requires the identifier ‘->’ as well as the list of output subscript labels. This feature increases the flexibility of the function since summing can be disabled or forced when required. The call np.einsum('i->', a) is like np.sum(a, axis=-1), and np.einsum('ii->i', a) is like np.diag(a). The difference is that einsum does not allow broadcasting by default. Additionally np.einsum('ij,jh->ih', a, b) directly specifies the order of the output subscript labels and therefore returns matrix multiplication, unlike the example above in implicit mode.
To enable and control broadcasting, use an ellipsis. Default NumPy-style broadcasting is done by adding an ellipsis to the left of each term, like np.einsum('...ii->...i', a). To take the trace along the first and last axes, you can do np.einsum('i...i', a), or to do a matrix-matrix product with the left-most indices instead of rightmost, one can do np.einsum('ij...,jk...->ik...', a, b).
When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as np.einsum('ii->i', a) produces a view (changed in version 1.10.0).
einsum also provides an alternative way to provide the subscripts and operands as einsum(op0, sublist0, op1, sublist1, ..., [sublistout]). If the output shape is not provided in this format einsum will be calculated in implicit mode, otherwise it will be performed explicitly. The examples below have corresponding einsum calls with the two parameter methods.
New in version 1.10.0.
Views returned from einsum are now writeable whenever the input array is writeable. For example, np.einsum('ijk...->kji...', a) will now have the same effect as np.swapaxes(a, 0, 2) and np.einsum('ii->i', a) will return a writeable view of the diagonal of a 2D array.
New in version 1.12.0.
Added the optimize argument which will optimize the contraction order of an einsum expression. For a contraction with three or more operands this can greatly increase the computational efficiency at the cost of a larger memory footprint during computation.
Typically a ‘greedy’ algorithm is applied which empirical tests have shown returns the optimal path in the majority of cases. In some cases ‘optimal’ will return the superlative path through a more expensive, exhaustive search. For iterative calculations it may be advisable to calculate the optimal path once and reuse that path by supplying it as an argument. An example is given below.
See numpy.einsum_path() for more details.
Examples
>>> a = np.arange(25).reshape(5,5)
>>> b = np.arange(5)
>>> c = np.arange(6).reshape(2,3)
Trace of a matrix:
>>> np.einsum('ii', a)
60
>>> np.einsum(a, [0,0])
60
>>> np.trace(a)
60
Extract the diagonal (requires explicit form):
>>> np.einsum('ii->i', a)
array([ 0,  6, 12, 18, 24])
>>> np.einsum(a, [0,0], [0])
array([ 0,  6, 12, 18, 24])
>>> np.diag(a)
array([ 0,  6, 12, 18, 24])
Sum over an axis (requires explicit form):
>>> np.einsum('ij->i', a)
array([ 10,  35,  60,  85, 110])
>>> np.einsum(a, [0,1], [0])
array([ 10,  35,  60,  85, 110])
>>> np.sum(a, axis=1)
array([ 10,  35,  60,  85, 110])
For higher dimensional arrays summing a single axis can be done with ellipsis:
>>> np.einsum('...j->...', a)
array([ 10,  35,  60,  85, 110])
>>> np.einsum(a, [Ellipsis,1], [Ellipsis])
array([ 10,  35,  60,  85, 110])
Compute a matrix transpose, or reorder any number of axes:
>>> np.einsum('ji', c)
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum('ij->ji', c)
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum(c, [1,0])
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.transpose(c)
array([[0, 3],
       [1, 4],
       [2, 5]])
Vector inner products:
>>> np.einsum('i,i', b, b)
30
>>> np.einsum(b, [0], b, [0])
30
>>> np.inner(b,b)
30
Matrix vector multiplication:
>>> np.einsum('ij,j', a, b)
array([ 30,  80, 130, 180, 230])
>>> np.einsum(a, [0,1], b, [1])
array([ 30,  80, 130, 180, 230])
>>> np.dot(a, b)
array([ 30,  80, 130, 180, 230])
>>> np.einsum('...j,j', a, b)
array([ 30,  80, 130, 180, 230])
Broadcasting and scalar multiplication:
>>> np.einsum('..., ...', 3, c)
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(',ij', 3, c)
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(3, [Ellipsis], c, [Ellipsis])
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.multiply(3, c)
array([[ 0,  3,  6],
       [ 9, 12, 15]])
Vector outer product:
>>> np.einsum('i,j', np.arange(2)+1, b)
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.einsum(np.arange(2)+1, [0], b, [1])
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.outer(np.arange(2)+1, b)
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
Tensor contraction:
>>> a = np.arange(60.).reshape(3,4,5)
>>> b = np.arange(24.).reshape(4,3,2)
>>> np.einsum('ijk,jil->kl', a, b)
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.einsum(a, [0,1,2], b, [1,0,3], [2,3])
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.tensordot(a,b, axes=([1,0],[0,1]))
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
Writeable returned arrays (since version 1.10.0):
>>> a = np.zeros((3, 3))
>>> np.einsum('ii->i', a)[:] = 1
>>> a
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Example of ellipsis use:
>>> a = np.arange(6).reshape((3,2))
>>> b = np.arange(12).reshape((4,3))
>>> np.einsum('ki,jk->ij', a, b)
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('ki,...k->i...', a, b)
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('k...,jk', a, b)
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
Chained array operations. For more complicated contractions, speed ups might be achieved by repeatedly computing a ‘greedy’ path or pre-computing the ‘optimal’ path and repeatedly applying it, using an einsum_path insertion (since version 1.12.0). Performance improvements can be particularly significant with larger arrays:
>>> a = np.ones(64).reshape(2,4,8)
Basic einsum: ~1520ms (benchmarked on 3.1GHz Intel i5.)
>>> for iteration in range(500):
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a)
Sub-optimal einsum (due to repeated path calculation time): ~330ms
>>> for iteration in range(500):
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')
Greedy einsum (faster optimal path approximation): ~160ms
>>> for iteration in range(500):
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='greedy')
Optimal einsum (best usage pattern in some use cases): ~110ms
>>> path = np.einsum_path('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')[0]
>>> for iteration in range(500):
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize=path)
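The difference between implicit and explicit mode, and the reuse of a precomputed contraction path, can be condensed into one small sketch. It uses plain NumPy with illustrative shapes and names, and makes no claim about timings.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)
c = np.arange(8).reshape(4, 2)

# Implicit mode: output labels are sorted alphabetically.
implicit = np.einsum('ij,jk', a, b)      # output 'ik', the matrix product
transposed = np.einsum('ij,jh', a, b)    # output 'hi', its transpose

# Explicit mode: the output order is stated after '->'.
explicit = np.einsum('ij,jh->ih', a, b)

# Compute a contraction path once, then pass it back via optimize.
path = np.einsum_path('ij,jk,kl->il', a, b, c, optimize='optimal')[0]
chained = np.einsum('ij,jk,kl->il', a, b, c, optimize=path)

print(implicit.shape, transposed.shape, chained.shape)
```

Writing the output labels explicitly avoids surprises when a label such as 'h' happens to sort before the other subscripts.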
-
dask.array.
exp
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.exp.
Some inconsistencies with the Dask version may exist.
Calculate the exponential of all elements in the input array.
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Output array, element-wise exponential of x. This is a scalar if x is a scalar.
See also
expm1
Calculate exp(x) - 1 for all elements in the array.
exp2
Calculate 2**x for all elements in the array.
Notes
The irrational number e is also known as Euler’s number. It is approximately 2.718281, and is the base of the natural logarithm, ln (this means that, if $x = \ln y = \log_e y$, then $e^x = y$). For real input, exp(x) is always positive.
For complex arguments, x = a + ib, we can write $e^x = e^a e^{ib}$. The first term, $e^a$, is already known (it is the real argument, described above). The second term, $e^{ib}$, is $\cos b + i \sin b$, a function with magnitude 1 and a periodic phase.
References
- 1
Wikipedia, “Exponential function”, https://en.wikipedia.org/wiki/Exponential_function
- 2
M. Abramowitz and I. A. Stegun, “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,” Dover, 1964, p. 69, http://www.math.sfu.ca/~cbm/aands/page_69.htm
Examples
Plot the magnitude and phase of exp(x) in the complex plane:
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-2*np.pi, 2*np.pi, 100)
>>> xx = x + 1j * x[:, np.newaxis] # a + ib over complex plane
>>> out = np.exp(xx)
>>> plt.subplot(121)
>>> plt.imshow(np.abs(out),
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='gray')
>>> plt.title('Magnitude of exp(x)')
>>> plt.subplot(122)
>>> plt.imshow(np.angle(out),
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='hsv')
>>> plt.title('Phase (angle) of exp(x)')
>>> plt.show()
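The decomposition of exp for a complex argument x = a + ib into a magnitude and a phase, as described in the notes, can be checked numerically without plotting. The values of a and b below are arbitrary illustrative choices, with b kept inside (-pi, pi] so that the returned angle equals b.

```python
import numpy as np

a, b = 0.5, 2.0
z = np.exp(a + 1j * b)

# e**(a + ib) = e**a * (cos b + i sin b)
expected = np.exp(a) * (np.cos(b) + 1j * np.sin(b))

magnitude = np.abs(z)   # should match e**a
phase = np.angle(z)     # should match b

print(z, expected, magnitude, phase)
```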
-
dask.array.
expm1
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.expm1.
Some inconsistencies with the Dask version may exist.
Calculate exp(x) - 1 for all elements in the array.
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Element-wise exponential minus one: out = exp(x) - 1. This is a scalar if x is a scalar.
See also
log1p
log(1 + x), the inverse of expm1.
Notes
This function provides greater precision than exp(x) - 1 for small values of x.
Examples
The true value of exp(1e-10) - 1 is 1.00000000005e-10 to about 32 significant digits. This example shows the superiority of expm1 in this case.
>>> np.expm1(1e-10)
1.00000000005e-10
>>> np.exp(1e-10) - 1
1.000000082740371e-10
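The precision gap can be quantified with a short sketch. It compares both approaches against the two-term Taylor expansion x + x**2/2, which is used here only as a convenient reference value for small x (an assumption of this sketch, not part of the docstring).

```python
import numpy as np

x = 1e-10

accurate = np.expm1(x)
naive = np.exp(x) - 1

# For tiny x, exp(x) - 1 is x + x**2/2 up to terms of order x**3.
reference = x + x**2 / 2

rel_err_accurate = abs(accurate - reference) / reference
rel_err_naive = abs(naive - reference) / reference

# log1p undoes expm1 without losing the small value.
round_trip = np.log1p(accurate)

print(rel_err_accurate, rel_err_naive, round_trip)
```

The naive form loses digits because exp(x) is rounded to a value very close to 1 before the subtraction happens.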
-
dask.array.
eye
(N, chunks='auto', M=None, k=0, dtype=<class 'float'>)¶ Return a 2-D Array with ones on the diagonal and zeros elsewhere.
- Parameters
- Nint
Number of rows in the output.
- chunksint, str
How to chunk the array. Must be one of the following forms:
- A blocksize like 1000.
- A size in bytes, like “100 MiB” which will choose a uniform block-like shape
- The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size
- Mint, optional
Number of columns in the output. If None, defaults to N.
- kint, optional
Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
- dtypedata-type, optional
Data-type of the returned array.
- Returns
- IArray of shape (N,M)
An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
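The roles of N, M and k can be illustrated with numpy.eye, which shares these three parameters. This is a NumPy-only sketch; per the signature above, dask.array.eye additionally takes chunks, which is not exercised here.

```python
import numpy as np

# 3 rows, 4 columns, ones on the first upper diagonal.
upper = np.eye(3, M=4, k=1)

# Square 3x3 matrix, ones on the first lower diagonal.
lower = np.eye(3, k=-1)

print(upper)
print(lower)
```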
-
dask.array.
fabs
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.fabs.
Some inconsistencies with the Dask version may exist.
Compute the absolute values element-wise.
This function returns the absolute values (positive magnitude) of the data in x. Complex values are not handled; use absolute to find the absolute values of complex data.
- Parameters
- xarray_like
The array of numbers for which the absolute values are required. If x is a scalar, the result y will also be a scalar.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The absolute values of x; the returned values are always floats. This is a scalar if x is a scalar.
See also
absolute
Absolute values including complex types.
Examples
>>> np.fabs(-1)
1.0
>>> np.fabs([-1.2, 1.2])
array([ 1.2,  1.2])
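The two differences from absolute mentioned above (float output and no complex support) can be seen side by side in a NumPy-only sketch. The variable names are illustrative.

```python
import numpy as np

ints = np.array([-1, 2])

fabs_result = np.fabs(ints)      # always floating point
abs_result = np.absolute(ints)   # keeps the integer dtype

# absolute handles complex input and returns the magnitude.
complex_magnitude = np.absolute(3 + 4j)

# fabs has no complex loop, so complex input is rejected.
try:
    np.fabs(3 + 4j)
    fabs_accepts_complex = True
except TypeError:
    fabs_accepts_complex = False

print(fabs_result.dtype, abs_result.dtype, complex_magnitude)
```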
-
dask.array.
fix
(*args, **kwargs)¶ Round to nearest integer towards zero.
This docstring was copied from numpy.fix.
Some inconsistencies with the Dask version may exist.
Round an array of floats element-wise to nearest integer towards zero. The rounded values are returned as floats.
- Parameters
- xarray_like (Not supported in Dask)
An array of floats to be rounded
- yndarray, optional
Output array
- Returns
- outndarray of floats (Not supported in Dask)
The array of rounded numbers
Examples
>>> np.fix(3.14)
3.0
>>> np.fix(3)
3.0
>>> np.fix([2.1, 2.9, -2.1, -2.9])
array([ 2.,  2., -2., -2.])
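Rounding towards zero differs from floor only for negative inputs. The following NumPy-only sketch compares fix with floor and trunc on the same values used in the example above.

```python
import numpy as np

values = np.array([2.1, 2.9, -2.1, -2.9])

towards_zero = np.fix(values)
towards_minus_inf = np.floor(values)
truncated = np.trunc(values)

print(towards_zero, towards_minus_inf, truncated)
```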
-
dask.array.
flatnonzero
(a)¶ Return indices that are non-zero in the flattened version of a.
This docstring was copied from numpy.flatnonzero.
Some inconsistencies with the Dask version may exist.
This is equivalent to np.nonzero(np.ravel(a))[0].
- Parameters
- aarray_like
Input data.
- Returns
- resndarray
Output array, containing the indices of the elements of a.ravel() that are non-zero.
Examples
>>> x = np.arange(-2, 3) >>> x array([-2, -1, 0, 1, 2]) >>> np.flatnonzero(x) array([0, 1, 3, 4])
Use the indices of the non-zero elements as an index array to extract these elements:
>>> x.ravel()[np.flatnonzero(x)]
array([-2, -1, 1, 2])
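For a multi-dimensional input the returned indices refer to the flattened array. A small NumPy sketch of the stated equivalence, with an illustrative 2-D array `a`, might look like this; `np.unravel_index` is used here only to map the flat positions back to coordinates.

```python
import numpy as np

a = np.array([[0, 3, 0],
              [4, 0, 5]])

# Positions of the non-zero entries in the flattened (row-major) array.
flat_idx = np.flatnonzero(a)

# The equivalent spelling given in the description above.
same_idx = np.nonzero(np.ravel(a))[0]

# Map flat positions back to (row, column) coordinates.
rows, cols = np.unravel_index(flat_idx, a.shape)

print(flat_idx, rows, cols)
```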
-
dask.array.flip(m, axis)
Reverse element order along axis.
- Parameters
- axisint
Axis to reverse element order of.
- Returns
- reversed arrayndarray
-
dask.array.flipud(m)
Flip array in the up/down direction.
This docstring was copied from numpy.flipud.
Some inconsistencies with the Dask version may exist.
Flip the entries in each column in the up/down direction. Rows are preserved, but appear in a different order than before.
- Parameters
- marray_like
Input array.
- Returns
- outarray_like
A view of m with the rows reversed. Since a view is returned, this operation is O(1).
See also
fliplr
Flip array in the left/right direction.
rot90
Rotate array counterclockwise.
Notes
Equivalent to m[::-1,...]. Does not require the array to be two-dimensional.
Examples
>>> A = np.diag([1.0, 2, 3])
>>> A
array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 3.]])
>>> np.flipud(A)
array([[0., 0., 3.],
       [0., 2., 0.],
       [1., 0., 0.]])
>>> A = np.random.randn(2,3,5)
>>> np.all(np.flipud(A) == A[::-1,...])
True
>>> np.flipud([1,2])
array([2, 1])
-
dask.array.fliplr(m)
Flip array in the left/right direction.
This docstring was copied from numpy.fliplr.
Some inconsistencies with the Dask version may exist.
Flip the entries in each row in the left/right direction. Columns are preserved, but appear in a different order than before.
- Parameters
- marray_like
Input array, must be at least 2-D.
- Returns
- fndarray
A view of m with the columns reversed. Since a view is returned, this operation is O(1).
See also
flipud
Flip array in the up/down direction.
rot90
Rotate array counterclockwise.
Notes
Equivalent to m[:,::-1]. Requires the array to be at least 2-D.
Examples
>>> A = np.diag([1.,2.,3.])
>>> A
array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 3.]])
>>> np.fliplr(A)
array([[0., 0., 1.],
       [0., 2., 0.],
       [3., 0., 0.]])
>>> A = np.random.randn(2,3,5)
>>> np.all(np.fliplr(A) == A[:,::-1,...])
True
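The slicing equivalences quoted in the Notes for flipud and fliplr can be checked side by side. This is a sketch in plain NumPy with an illustrative array `A`; the general `flip` with an explicit axis is assumed to match the corresponding special case.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

up_down = np.flipud(A)      # reverses the rows, same as A[::-1, ...]
left_right = np.fliplr(A)   # reverses the columns, same as A[:, ::-1]

# flip with an explicit axis covers both cases.
via_flip_axis0 = np.flip(A, 0)
via_flip_axis1 = np.flip(A, 1)

print(up_down)
print(left_right)
```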
-
dask.array.floor(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.floor.
Some inconsistencies with the Dask version may exist.
Return the floor of the input, element-wise.
The floor of the scalar x is the largest integer i, such that i <= x. It is often denoted as ⌊x⌋.
- Parameters
- xarray_like
Input data.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The floor of each element in x. This is a scalar if x is a scalar.
Notes
Some spreadsheet programs calculate the “floor-towards-zero”, in other words floor(-2.5) == -2. NumPy instead uses the definition of floor where floor(-2.5) == -3.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.floor(a)
array([-2., -2., -1., 0., 1., 1., 2.])
-
dask.array.fmax(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.fmax.
Some inconsistencies with the Dask version may exist.
Element-wise maximum of array elements.
Compares two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.
- Parameters
- x1, x2array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
See also
Notes
New in version 1.3.0.
The fmax is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.
Examples
>>> np.fmax([2, 3, 4], [1, 5, 2])
array([2, 5, 4])
>>> np.fmax(np.eye(2), [0.5, 2])
array([[ 1. , 2. ],
       [ 0.5, 2. ]])
>>> np.fmax([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([ 0., 0., nan])
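The NaN handling described above is what separates fmax from maximum. A short NumPy sketch with illustrative inputs `a` and `b`; `np.maximum` is used here only as the NaN-propagating counterpart for comparison.

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan])
b = np.array([0.0, np.nan, np.nan])

# fmax returns the non-NaN operand whenever one exists.
ignoring_nan = np.fmax(a, b)

# maximum propagates a NaN if either operand is NaN.
propagating_nan = np.maximum(a, b)

print(ignoring_nan)
print(propagating_nan)
```

Only the last position, where both operands are NaN, yields NaN from fmax.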
-
dask.array.fmin(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.fmin.
Some inconsistencies with the Dask version may exist.
Element-wise minimum of array elements.
Compares two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.
- Parameters
- x1, x2array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
See also
Notes
New in version 1.3.0.
The fmin is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.
Examples
>>> np.fmin([2, 3, 4], [1, 5, 2])
array([1, 3, 2])
>>> np.fmin(np.eye(2), [0.5, 2])
array([[ 0.5, 0. ],
       [ 0. , 1. ]])
>>> np.fmin([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([ 0., 0., nan])
-
dask.array.fmod(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.fmod.
Some inconsistencies with the Dask version may exist.
Return the element-wise remainder of division.
This is the NumPy implementation of the C library function fmod; the remainder has the same sign as the dividend x1. It is equivalent to the Matlab(TM) rem function and should not be confused with the Python modulus operator x1 % x2.
- Parameters
- x1array_like
Dividend.
- x2array_like
Divisor. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yarray_like
The remainder of the division of x1 by x2. This is a scalar if both x1 and x2 are scalars.
See also
remainder
Equivalent to the Python % operator.
divide
Notes
The result of the modulo operation for negative dividend and divisors is bound by conventions. For fmod, the sign of result is the sign of the dividend, while for remainder the sign of the result is the sign of the divisor. The fmod function is equivalent to the Matlab(TM) rem function.
Examples
>>> np.fmod([-3, -2, -1, 1, 2, 3], 2)
array([-1, 0, -1, 1, 0, 1])
>>> np.remainder([-3, -2, -1, 1, 2, 3], 2)
array([1, 0, 1, 1, 0, 1])
>>> np.fmod([5, 3], [2, 2.])
array([ 1., 1.])
>>> a = np.arange(-3, 3).reshape(3, 2)
>>> a
array([[-3, -2],
       [-1, 0],
       [ 1, 2]])
>>> np.fmod(a, [2,2])
array([[-1, 0],
       [-1, 0],
       [ 1, 0]])
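The two sign conventions correspond to two ways of rounding the quotient: truncation towards zero for fmod and flooring for remainder. The sketch below writes both out by hand in NumPy for finite inputs and a non-zero divisor; the names `fmod_by_hand` and `remainder_by_hand` are illustrative.

```python
import numpy as np

x1 = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
x2 = 2.0

# fmod: quotient truncated towards zero, so the result takes the sign of x1.
fmod_by_hand = x1 - np.trunc(x1 / x2) * x2

# remainder (Python's %): quotient floored, so the result takes the sign of x2.
remainder_by_hand = x1 - np.floor(x1 / x2) * x2

print(fmod_by_hand)
print(remainder_by_hand)
```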
-
dask.array.frexp(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.frexp.
Some inconsistencies with the Dask version may exist.
Decompose the elements of x into mantissa and twos exponent.
Returns (mantissa, exponent), where x = mantissa * 2**exponent. The mantissa lies in the open interval (-1, 1), while the twos exponent is a signed integer.
- Parameters
- xarray_like
Array of numbers to be decomposed.
- out1ndarray, optional
Output array for the mantissa. Must have the same shape as x.
- out2ndarray, optional
Output array for the exponent. Must have the same shape as x.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- mantissandarray
Floating values between -1 and 1. This is a scalar if x is a scalar.
- exponentndarray
Integer exponents of 2. This is a scalar if x is a scalar.
See also
ldexp
Compute y = x1 * 2**x2, the inverse of frexp.
Notes
Complex dtypes are not supported; they will raise a TypeError.
Examples
>>> x = np.arange(9)
>>> y1, y2 = np.frexp(x)
>>> y1
array([ 0. , 0.5 , 0.5 , 0.75 , 0.5 , 0.625, 0.75 , 0.875, 0.5 ])
>>> y2
array([0, 1, 2, 2, 3, 3, 3, 3, 4])
>>> y1 * 2**y2
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8.])
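Since ldexp is listed as the inverse of frexp, a round trip should reproduce the input exactly. A minimal NumPy sketch with illustrative values, including a negative and a fractional one:

```python
import numpy as np

x = np.array([0.0, 1.0, 10.0, -0.75])

# Split each value into a mantissa in (-1, 1) and an integer power of two.
mantissa, exponent = np.frexp(x)

# ldexp recombines them as mantissa * 2**exponent.
rebuilt = np.ldexp(mantissa, exponent)

print(mantissa, exponent, rebuilt)
```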
-
dask.array.fromfunction(func, chunks='auto', shape=None, dtype=None, **kwargs)
Construct an array by executing a function over each coordinate.
This docstring was copied from numpy.fromfunction.
Some inconsistencies with the Dask version may exist.
The resulting array therefore has a value fn(x, y, z) at coordinate (x, y, z).
- Parameters
- functioncallable (Not supported in Dask)
The function is called with N parameters, where N is the rank of shape. Each parameter represents the coordinates of the array varying along a specific axis. For example, if shape were (2, 2), then the parameters would be array([[0, 0], [1, 1]]) and array([[0, 1], [0, 1]])
- shape(N,) tuple of ints
Shape of the output array, which also determines the shape of the coordinate arrays passed to function.
- dtypedata-type, optional
Data-type of the coordinate arrays passed to function. By default, dtype is float.
- Returns
- fromfunctionany
The result of the call to function is passed back directly. Therefore the shape of fromfunction is completely determined by function. If function returns a scalar value, the shape of fromfunction would not match the shape parameter.
Notes
Keywords other than dtype are passed to function.
Examples
>>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)
array([[ True, False, False],
       [False, True, False],
       [False, False, True]])
>>> np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int)
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
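As the parameter description says, the function receives whole coordinate arrays rather than being called once per element. The NumPy sketch below makes that explicit by building the same coordinate arrays with `np.indices`; the helper `add_coords` is a hypothetical example function.

```python
import numpy as np

shape = (2, 3)

def add_coords(i, j):
    # i and j are full coordinate arrays of the given shape, not scalars.
    return i + j

via_fromfunction = np.fromfunction(add_coords, shape, dtype=int)

# Build the coordinate arrays by hand and call the function once.
i, j = np.indices(shape, dtype=int)
via_indices = add_coords(i, j)

print(via_fromfunction)
```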
-
dask.array.frompyfunc(func, nin, nout)
This docstring was copied from numpy.frompyfunc.
Some inconsistencies with the Dask version may exist.
Takes an arbitrary Python function and returns a NumPy ufunc.
Can be used, for example, to add broadcasting to a built-in Python function (see Examples section).
- Parameters
- funcPython function object
An arbitrary Python function.
- ninint
The number of input arguments.
- noutint
The number of objects returned by func.
- Returns
- outufunc
Returns a NumPy universal function (ufunc) object.
See also
vectorize
Evaluates pyfunc over input arrays using broadcasting rules of numpy.
Notes
The returned ufunc always returns PyObject arrays.
Examples
Use frompyfunc to add broadcasting to the Python function oct:
>>> oct_array = np.frompyfunc(oct, 1, 1)
>>> oct_array(np.array((10, 30, 100)))
array(['0o12', '0o36', '0o144'], dtype=object)
>>> np.array((oct(10), oct(30), oct(100))) # for comparison
array(['0o12', '0o36', '0o144'], dtype='<U5')
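Because the returned ufunc always produces object arrays, numeric results usually need an explicit cast afterwards. A sketch in plain NumPy; the function `clamp_to_byte` is a hypothetical example, not part of any library.

```python
import numpy as np

def clamp_to_byte(value):
    # Ordinary scalar Python function: clip a number to the range 0..255.
    return max(0, min(255, value))

clamp = np.frompyfunc(clamp_to_byte, 1, 1)

raw = clamp(np.array([-5, 100, 300]))   # object-dtype result
as_int = raw.astype(np.int64)           # cast back to a numeric dtype

print(raw.dtype, as_int)
```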
-
dask.array.full(shape, fill_value, *args, **kwargs)
Blocked variant of full
Follows the signature of full exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.
Original signature follows below.
Return a new array of given shape and type, filled with fill_value.
- Parameters
- shapeint or sequence of ints
Shape of the new array, e.g., (2, 3) or 2.
- fill_valuescalar
Fill value.
- dtypedata-type, optional
The desired data-type for the array. The default, None, means np.array(fill_value).dtype.
- order{‘C’, ‘F’}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
- Returns
- outndarray
Array of fill_value with the given shape, dtype, and order.
See also
Examples
>>> np.full((2, 2), np.inf)
array([[inf, inf],
       [inf, inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])
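The examples above are the NumPy originals. A sketch of the blocked variant with the extra chunks keyword might look as follows; the shape and chunk sizes are illustrative, and the code assumes dask is installed.

```python
import dask.array as da

# A 4x6 array of sevens, split into four blocks of shape (2, 3).
x = da.full((4, 6), 7, chunks=(2, 3))

print(x.chunks)   # per-axis block sizes

# Nothing is materialised until compute() is called.
result = x.compute()
print(result)
```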
-
dask.array.full_like(a, fill_value, order='C', dtype=None, chunks=None, name=None, shape=None)
Return a full array with the same shape and type as a given array.
- Parameters
- aarray_like
The shape and data-type of a define these same attributes of the returned array.
- fill_valuescalar
Fill value.
- dtypedata-type, optional
Overrides the data type of the result.
- order{‘C’, ‘F’}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
- chunkssequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
- namestr, optional
An optional keyname for the array. Defaults to hashing the input keyword arguments.
- shapeint or sequence of ints, optional.
Overrides the shape of the result.
- Returns
- outndarray
Array of fill_value with the same shape and type as a.
See also
zeros_like
Return an array of zeros with shape and type of input.
ones_like
Return an array of ones with shape and type of input.
empty_like
Return an empty array with shape and type of input.
zeros
Return a new array setting values to zero.
ones
Return a new array setting values to one.
empty
Return a new uninitialized array.
full
Fill a new array.
-
dask.array.gradient(f, *varargs, **kwargs)
Return the gradient of an N-dimensional array.
This docstring was copied from numpy.gradient.
Some inconsistencies with the Dask version may exist.
The gradient is computed using second order accurate central differences in the interior points and either first or second order accurate one-sided (forward or backward) differences at the boundaries. The returned gradient hence has the same shape as the input array.
- Parameters
- farray_like
An N-dimensional array containing samples of a scalar function.
- varargslist of scalar or array, optional
Spacing between f values. Default unitary spacing for all dimensions. Spacing can be specified using:
1. single scalar to specify a sample distance for all dimensions.
2. N scalars to specify a constant sample distance for each dimension. i.e. dx, dy, dz, …
3. N arrays to specify the coordinates of the values along each dimension of F. The length of the array must match the size of the corresponding dimension
4. Any combination of N scalars/arrays with the meaning of 2. and 3.
If axis is given, the number of varargs must equal the number of axes. Default: 1.
- edge_order{1, 2}, optional
Gradient is calculated using N-th order accurate differences at the boundaries. Default: 1.
New in version 1.9.1.
- axisNone or int or tuple of ints, optional
Gradient is calculated only along the given axis or axes. The default (axis = None) is to calculate the gradient for all the axes of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.11.0.
- Returns
- gradientndarray or list of ndarray
A set of ndarrays (or a single ndarray if there is only one dimension) corresponding to the derivatives of f with respect to each dimension. Each derivative has the same shape as f.
Notes
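Assuming that $f \in C^{3}$ (i.e., $f$ has at least 3 continuous derivatives) and let $h_{*}$ be a non-homogeneous stepsize, we minimize the “consistency error” $\eta_{i}$ between the true gradient and its estimate from a linear combination of the neighboring grid-points:
$$\eta_{i} = f_{i}^{(1)} - \left[ \alpha f(x_{i}) + \beta f(x_{i} + h_{d}) + \gamma f(x_{i} - h_{s}) \right]$$
By substituting $f(x_{i} + h_{d})$ and $f(x_{i} - h_{s})$ with their Taylor series expansion, this translates into solving the following linear system:
$$\left\{ \begin{array}{r} \alpha + \beta + \gamma = 0 \\ \beta h_{d} - \gamma h_{s} = 1 \\ \beta h_{d}^{2} + \gamma h_{s}^{2} = 0 \end{array} \right.$$
The resulting approximation of $f_{i}^{(1)}$ is the following:
$$\hat{f}_{i}^{(1)} = \frac{h_{s}^{2} f(x_{i} + h_{d}) + (h_{d}^{2} - h_{s}^{2}) f(x_{i}) - h_{d}^{2} f(x_{i} - h_{s})}{h_{s} h_{d} (h_{d} + h_{s})} + \mathcal{O}\left( \frac{h_{d} h_{s}^{2} + h_{s} h_{d}^{2}}{h_{d} + h_{s}} \right)$$
It is worth noting that if $h_{s} = h_{d}$ (i.e., data are evenly spaced) we find the standard second order approximation:
$$\hat{f}_{i}^{(1)} = \frac{f(x_{i+1}) - f(x_{i-1})}{2h} + \mathcal{O}(h^{2})$$
With a similar procedure the forward/backward approximations used for boundaries can be derived.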
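The interior-point approximation above can be evaluated directly and compared with np.gradient on unevenly spaced data. This is a sketch in plain NumPy; the names `h_s` and `h_d` are taken to be the spacings to the previous and next grid point, and the sample data are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 1.5, 3.5, 4.0, 6.0])
f = np.array([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])

# Spacing to the previous (h_s) and next (h_d) neighbour of each interior point.
h_s = x[1:-1] - x[:-2]
h_d = x[2:] - x[1:-1]

# Second-order estimate for interior points, term by term as in the formula.
interior = (
    h_s**2 * f[2:] + (h_d**2 - h_s**2) * f[1:-1] - h_d**2 * f[:-2]
) / (h_s * h_d * (h_d + h_s))

# np.gradient handles the boundaries separately, so compare interior points only.
reference = np.gradient(f, x)[1:-1]

print(interior)
print(reference)
```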
References
- 1
Quarteroni A., Sacco R., Saleri F. (2007) Numerical Mathematics (Texts in Applied Mathematics). New York: Springer.
- 2
Durran D. R. (1999) Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. New York: Springer.
- 3
Fornberg B. (1988) Generation of Finite Difference Formulas on Arbitrarily Spaced Grids, Mathematics of Computation 51, no. 184 : 699-706.
Examples
>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float)
>>> np.gradient(f)
array([1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2)
array([0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ])
Spacing can also be specified with an array that represents the coordinates of the values F along the dimensions. For instance, a uniform spacing:
>>> x = np.arange(f.size)
>>> np.gradient(f, x)
array([1. , 1.5, 2.5, 3.5, 4.5, 5. ])
Or a non-uniform one:
>>> x = np.array([0., 1., 1.5, 3.5, 4., 6.], dtype=float)
>>> np.gradient(f, x)
array([1. , 3. , 3.5, 6.7, 6.9, 2.5])
For two-dimensional arrays, the return will be two arrays ordered by axis. In this example the first array stands for the gradient in rows and the second one in columns direction:
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float))
[array([[ 2., 2., -1.],
       [ 2., 2., -1.]]), array([[1. , 2.5, 4. ],
       [1. , 1. , 1. ]])]
In this example the spacing is also specified: uniform for axis=0 and non-uniform for axis=1
>>> dx = 2.
>>> y = [1., 1.5, 3.5]
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), dx, y)
[array([[ 1. , 1. , -0.5],
       [ 1. , 1. , -0.5]]), array([[2. , 2. , 2. ],
       [2. , 1.7, 0.5]])]
It is possible to specify how boundaries are treated using edge_order
>>> x = np.array([0, 1, 2, 3, 4])
>>> f = x**2
>>> np.gradient(f, edge_order=1)
array([1., 2., 4., 6., 7.])
>>> np.gradient(f, edge_order=2)
array([0., 2., 4., 6., 8.])
The axis keyword can be used to specify a subset of axes of which the gradient is calculated
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), axis=0)
array([[ 2., 2., -1.],
       [ 2., 2., -1.]])
-
dask.array.histogram(a, bins=None, range=None, normed=False, weights=None, density=None)
Blocked variant of numpy.histogram().
- Parameters
- aarray_like
Input data. The histogram is computed over the flattened array.
- binsint or sequence of scalars, optional
Either an iterable specifying the bins or the number of bins and a range argument is required as computing min and max over blocked arrays is an expensive operation that must be performed explicitly. If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths.
- range(float, float), optional
The lower and upper range of the bins. If not provided, range is simply (a.min(), a.max()). Values outside the range are ignored. The first element of the range must be less than or equal to the second. range affects the automatic bin computation as well. While bin width is computed to be optimal based on the actual data within range, the bin count will fill the entire range including portions containing no data.
- normedbool, optional
This is equivalent to the density argument, but produces incorrect results for unequal bin widths. It should not be used.
- weightsarray_like, optional
A dask.array.Array of weights, of the same block structure as a. Each value in a only contributes its associated weight towards the bin count (instead of 1). If density is True, the weights are normalized, so that the integral of the density over the range remains 1.
- densitybool, optional
If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function. Overrides the normed keyword if given. If density is True, bins cannot be a single-number delayed value. It must be a concrete number, or a (possibly-delayed) array/sequence of the bin edges.
- Returns
- histdask Array
The values of the histogram. See density and weights for a description of the possible semantics.
- bin_edgesdask Array of dtype float
Return the bin edges (length(hist)+1).
Examples
Using number of bins and range:
>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array(np.arange(10000), chunks=10)
>>> h, bins = da.histogram(x, bins=10, range=[0, 10000])
>>> bins
array([ 0., 1000., 2000., 3000., 4000., 5000., 6000., 7000., 8000., 9000., 10000.])
>>> h.compute()
array([1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000])
Explicitly specifying the bins:
>>> h, bins = da.histogram(x, bins=np.array([0, 5000, 10000]))
>>> bins
array([ 0, 5000, 10000])
>>> h.compute()
array([5000, 5000])
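When the data range is not known in advance, the minimum and maximum have to be computed explicitly before calling histogram with an integer bin count. One possible pattern is sketched below, assuming dask is installed; the data and chunk size are illustrative.

```python
import dask
import dask.array as da
import numpy as np

x = da.from_array(np.arange(1000), chunks=100)

# One explicit pass over the data to obtain both extremes together.
lo, hi = dask.compute(x.min(), x.max())

# With a concrete range, an integer number of bins is accepted.
hist, edges = da.histogram(x, bins=10, range=(float(lo), float(hi)))

counts = hist.compute()
edges = np.asarray(edges)   # works whether edges is a NumPy or a Dask array

print(counts, edges)
```

Computing both extremes in a single `dask.compute` call lets the two reductions share one traversal of the task graph.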
-
dask.array.hstack(tup, allow_unknown_chunksizes=False)
Stack arrays in sequence horizontally (column wise).
This docstring was copied from numpy.hstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
- Parameters
- tupsequence of ndarrays
The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length.
- Returns
- stackedndarray
The array formed by stacking the given arrays.
See also
stack
Join a sequence of arrays along a new axis.
vstack
Stack arrays in sequence vertically (row wise).
dstack
Stack arrays in sequence depth wise (along third axis).
concatenate
Join a sequence of arrays along an existing axis.
hsplit
Split array along second axis.
block
Assemble arrays from blocks.
Examples
>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.hstack((a,b))
array([1, 2, 3, 2, 3, 4])
>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.hstack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])
-
dask.array.hypot(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.hypot.
Some inconsistencies with the Dask version may exist.
Given the “legs” of a right triangle, return its hypotenuse.
Equivalent to sqrt(x1**2 + x2**2), element-wise. If x1 or x2 is scalar_like (i.e., unambiguously cast-able to a scalar type), it is broadcast for use with each element of the other argument. (See Examples)
- Parameters
- x1, x2array_like
Leg of the triangle(s). If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- zndarray
The hypotenuse of the triangle(s). This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.hypot(3*np.ones((3, 3)), 4*np.ones((3, 3)))
array([[ 5., 5., 5.],
       [ 5., 5., 5.],
       [ 5., 5., 5.]])
Example showing broadcast of scalar_like argument:
>>> np.hypot(3*np.ones((3, 3)), [4])
array([[ 5., 5., 5.],
       [ 5., 5., 5.],
       [ 5., 5., 5.]])
-
dask.array.imag(*args, **kwargs)
Return the imaginary part of the complex argument.
This docstring was copied from numpy.imag.
Some inconsistencies with the Dask version may exist.
- Parameters
- valarray_like (Not supported in Dask)
Input array.
- Returns
- outndarray or scalar
The imaginary component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.
Examples
>>> a = np.array([1+2j, 3+4j, 5+6j])
>>> a.imag
array([2., 4., 6.])
>>> a.imag = np.array([8, 10, 12])
>>> a
array([1. +8.j, 3.+10.j, 5.+12.j])
>>> np.imag(1 + 1j)
1.0
-
dask.array.indices(dimensions, dtype=<class 'int'>, chunks='auto')
Implements NumPy’s indices for Dask Arrays.
Generates a grid of indices covering the dimensions provided.
The final array has the shape (len(dimensions), *dimensions). The chunks are used to specify the chunking for axis 1 up to len(dimensions). The 0th axis always has chunks of length 1.
- Parameters
- dimensionssequence of ints
The shape of the index grid.
- dtypedtype, optional
Type to use for the array. Default is int.
- chunkssequence of ints, str
The size of each block. Must be one of the following forms:
  - A blocksize like (500, 1000)
  - A size in bytes, like “100 MiB”, which will choose a uniform block-like shape
  - The word “auto”, which acts like the above, but uses a configuration value array.chunk-size for the chunk size
Note that the last block will have fewer samples if len(array) % chunks != 0.
- Returns
- griddask array
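The shape and chunking rules described above can be seen on a small grid. This is a sketch assuming dask is installed; the dimensions and chunk sizes are illustrative, and the comparison against `np.indices` is there only to confirm the values.

```python
import dask.array as da
import numpy as np

dimensions = (2, 3)

# chunks applies to axes 1..len(dimensions); axis 0 is chunked by 1.
grid = da.indices(dimensions, chunks=(1, 3))

print(grid.shape)    # (len(dimensions), *dimensions)
print(grid.chunks)

values = grid.compute()
matches_numpy = np.array_equal(values, np.indices(dimensions))
print(matches_numpy)
```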
-
dask.array.insert(arr, obj, values, axis)
Insert values along the given axis before the given indices.
This docstring was copied from numpy.insert.
Some inconsistencies with the Dask version may exist.
- Parameters
- arrarray_like
Input array.
- objint, slice or sequence of ints
Object that defines the index or indices before which values is inserted.
New in version 1.8.0.
Support for multiple insertions when obj is a single scalar or a sequence with one element (similar to calling insert multiple times).
- valuesarray_like
Values to insert into arr. If the type of values is different from that of arr, values is converted to the type of arr. values should be shaped so that arr[...,obj,...] = values is legal.
- axisint, optional
Axis along which to insert values. If axis is None then arr is flattened first.
- Returns
- outndarray
A copy of arr with values inserted. Note that insert does not occur in-place: a new array is returned. If axis is None, out is a flattened array.
See also
append
Append elements at the end of an array.
concatenate
Join a sequence of arrays along an existing axis.
delete
Delete elements from an array.
Notes
Note that for higher dimensional inserts obj=0 behaves very differently from obj=[0], just like arr[:,0,:] = values is different from arr[:,[0],:] = values.
Examples
>>> a = np.array([[1, 1], [2, 2], [3, 3]])
>>> a
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5)
array([1, 5, 1, ..., 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1)
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])
Difference between sequence and scalars:
>>> np.insert(a, [1], [[1],[2],[3]], axis=1)
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True
>>> b = a.flatten()
>>> b
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, slice(2, 4), [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, [2, 2], [7.13, False]) # type casting
array([1, 1, 7, ..., 2, 3, 3])
>>> x = np.arange(8).reshape(2, 4)
>>> idx = (1, 3)
>>> np.insert(x, idx, 999, axis=1)
array([[ 0, 999, 1, 2, 999, 3],
       [ 4, 999, 5, 6, 999, 7]])
-
dask.array.invert(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.invert.
Some inconsistencies with the Dask version may exist.
Compute bit-wise inversion, or bit-wise NOT, element-wise.
Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.
For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [1]. An N-bit two’s-complement system can represent every integer in the range $-2^{N-1}$ to $+2^{N-1}-1$.
- Parameters
- xarray_like
Only integer and boolean types are handled.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Result. This is a scalar if x is a scalar.
See also
bitwise_and, bitwise_or, bitwise_xor
logical_not
binary_repr
Return the binary representation of the input number as a string.
Notes
bitwise_not is an alias for invert:
>>> np.bitwise_not is np.invert
True
References
- 1
Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two's_complement
Examples
The number 13 is represented by 00001101 in 8-bit binary. The invert or bit-wise NOT of 13 is then:
>>> x = np.invert(np.array(13, dtype=np.uint8))
>>> x
242
>>> np.binary_repr(x, width=8)
'11110010'
The result depends on the bit-width:
>>> x = np.invert(np.array(13, dtype=np.uint16))
>>> x
65522
>>> np.binary_repr(x, width=16)
'1111111111110010'
When using signed integer types the result is the two’s complement of the result for the unsigned type:
>>> np.invert(np.array([13], dtype=np.int8))
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8)
'11110010'
Booleans are accepted as well:
>>> np.invert(np.array([True, False]))
array([False,  True])
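The same operation can be applied lazily to a chunked Dask array. The following is a minimal sketch, assuming a small uint8 input and an arbitrary chunk size of 2; the variable names are illustrative and not part of the original docstring.

```python
import numpy as np
import dask.array as da

# Illustrative input: four unsigned 8-bit integers split into chunks of 2.
x = da.from_array(np.array([13, 0, 255, 1], dtype=np.uint8), chunks=2)

# da.invert only builds a task graph; compute() returns a NumPy array.
inverted = da.invert(x).compute()

# For an unsigned 8-bit value v, bit-wise NOT equals 255 - v.
print(inverted.tolist())  # [242, 255, 0, 254]
```

Because the dtype is unsigned, no negative values appear; with a signed dtype such as int8 the same bits would be read as two's-complement numbers.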
-
dask.array.
isclose
(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)¶ Returns a boolean array where two arrays are element-wise equal within a tolerance.
This docstring was copied from numpy.isclose.
Some inconsistencies with the Dask version may exist.
The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
Warning
The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes).
- Parameters
- a, barray_like
Input arrays to compare.
- rtolfloat
The relative tolerance parameter (see Notes).
- atolfloat
The absolute tolerance parameter (see Notes).
- equal_nanbool
Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.
- Returns
- yarray_like
Returns a boolean array of where a and b are equal within the given tolerance. If both a and b are scalars, returns a single boolean value.
Notes
New in version 1.7.0.
For finite values, isclose uses the following equation to test whether two floating point values are equivalent.
absolute(a - b) <= (atol + rtol * absolute(b))
Unlike the built-in math.isclose, the above equation is not symmetric in a and b – it assumes b is the reference value – so that isclose(a, b) might be different from isclose(b, a). Furthermore, the default value of atol is not zero, and is used to determine what small values should be considered close to zero. The default value is appropriate for expected values of order unity: if the expected values are significantly smaller than one, it can result in false positives. atol should be carefully selected for the use case at hand. A zero value for atol will result in False if either a or b is zero.
Examples
>>> np.isclose([1e10,1e-7], [1.00001e10,1e-8])
array([ True, False])
>>> np.isclose([1e10,1e-8], [1.00001e10,1e-9])
array([ True,  True])
>>> np.isclose([1e10,1e-8], [1.0001e10,1e-9])
array([False,  True])
>>> np.isclose([1.0, np.nan], [1.0, np.nan])
array([ True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
array([ True,  True])
>>> np.isclose([1e-8, 1e-7], [0.0, 0.0])
array([ True, False])
>>> np.isclose([1e-100, 1e-7], [0.0, 0.0], atol=0.0)
array([False, False])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.0])
array([ True,  True])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.999999e-10], atol=0.0)
array([False,  True])
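The asymmetry and the effect of the default atol described in the Notes can be checked directly on Dask arrays. This is a small sketch with values chosen for illustration (rtol=0.095 is not a recommended setting); it assumes the Dask function applies the same formula as numpy.isclose.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([100.0]), chunks=1)
b = da.from_array(np.array([110.0]), chunks=1)

# |a - b| = 10. With b as the reference: 10 <= 0.095 * 110 = 10.45 holds.
forward = da.isclose(a, b, rtol=0.095, atol=0.0).compute()
# With a as the reference: 10 <= 0.095 * 100 = 9.5 does not hold.
backward = da.isclose(b, a, rtol=0.095, atol=0.0).compute()
print(forward.tolist(), backward.tolist())  # [True] [False]

# The default atol (1e-08) treats a tiny value as close to zero.
tiny = da.from_array(np.array([1e-9]), chunks=1)
zero = da.zeros(1, chunks=1)
default_atol = da.isclose(tiny, zero).compute()
zero_atol = da.isclose(tiny, zero, atol=0.0).compute()
print(default_atol.tolist(), zero_atol.tolist())  # [True] [False]
```

When the expected values are far smaller than one, passing an explicit atol suited to the data avoids the false positive shown in the second half.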
-
dask.array.
iscomplex
(*args, **kwargs)¶ Returns a bool array, where True if input element is complex.
This docstring was copied from numpy.iscomplex.
Some inconsistencies with the Dask version may exist.
What is tested is whether the input has a non-zero imaginary part, not if the input type is complex.
- Parameters
- xarray_like (Not supported in Dask)
Input array.
- Returns
- outndarray of bools
Output array.
See also
isreal
iscomplexobj
Return True if x is a complex type or an array of complex numbers.
Examples
>>> np.iscomplex([1+1j, 1+0j, 4.5, 3, 2, 2j])
array([ True, False, False, False, False,  True])
-
dask.array.
isfinite
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isfinite.
Some inconsistencies with the Dask version may exist.
Test element-wise for finiteness (neither infinity nor Not a Number).
The result is returned as a boolean array.
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray, bool
True where x is not positive infinity, negative infinity, or NaN; false otherwise. This is a scalar if x is a scalar.
Notes
Not a Number, positive infinity and negative infinity are considered to be non-finite.
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity, and that positive infinity is not equivalent to negative infinity, although infinity is equivalent to positive infinity. Errors result if the second argument is also supplied when x is a scalar input, or if the first and second arguments have different shapes.
Examples
>>> np.isfinite(1)
True
>>> np.isfinite(0)
True
>>> np.isfinite(np.nan)
False
>>> np.isfinite(np.inf)
False
>>> np.isfinite(np.NINF)
False
>>> np.isfinite([np.log(-1.),1.,np.log(0)])
array([False,  True, False])

>>> x = np.array([-np.inf, 0., np.inf])
>>> y = np.array([2, 2, 2])
>>> np.isfinite(x, y)
array([0, 1, 0])
>>> y
array([0, 1, 0])
-
dask.array.
isin
(element, test_elements, assume_unique=False, invert=False)¶ Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.
- Parameters
- elementarray_like
Input array.
- test_elementsarray_like
The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.
- assume_uniquebool, optional
If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.
- invertbool, optional
If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).
- Returns
- isinndarray, bool
Has the same shape as element. The values element[isin] are in test_elements.
See also
in1d
Flattened version of this function.
numpy.lib.arraysetops
Module with a number of other functions for performing set operations on arrays.
Notes
isin is an element-wise function version of the python keyword in.
isin(a, b) is roughly equivalent to np.array([item in b for item in a]) if a and b are 1-D sequences.
element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor’s way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.
New in version 1.13.0.
Examples
>>> element = 2*np.arange(4).reshape((2, 2))
>>> element
array([[0, 2],
       [4, 6]])
>>> test_elements = [1, 2, 4, 8]
>>> mask = np.isin(element, test_elements)
>>> mask
array([[False,  True],
       [ True, False]])
>>> element[mask]
array([2, 4])
The indices of the matched values can be obtained with nonzero:
>>> np.nonzero(mask)
(array([0, 1]), array([1, 0]))
The test can also be inverted:
>>> mask = np.isin(element, test_elements, invert=True)
>>> mask
array([[ True, False],
       [False,  True]])
>>> element[mask]
array([0, 6])
Because of how array handles sets, the following does not work as expected:
>>> test_set = {1, 2, 4, 8}
>>> np.isin(element, test_set)
array([[False, False],
       [False, False]])
Casting the set to a list gives the expected result:
>>> np.isin(element, list(test_set))
array([[False,  True],
       [ True, False]])
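A comparable check on a chunked Dask array might look like the sketch below. The chunk size and variable names are illustrative, and the set is converted to a list before the call, following the advice in the Notes.

```python
import numpy as np
import dask.array as da

# 2x2 input split into four single-element chunks (an arbitrary choice).
element = da.from_array(2 * np.arange(4).reshape((2, 2)), chunks=1)

# Convert the set to a list first so that each value is tested individually.
test_set = {1, 2, 4, 8}
test_elements = sorted(test_set)

mask = da.isin(element, test_elements).compute()
inverted = da.isin(element, test_elements, invert=True).compute()

print(mask.tolist())      # [[False, True], [True, False]]
print(inverted.tolist())  # [[True, False], [False, True]]
```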
-
dask.array.
isinf
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isinf.
Some inconsistencies with the Dask version may exist.
Test element-wise for positive or negative infinity.
Returns a boolean array of the same shape as x, True where x == +/-inf, otherwise False.
- Parameters
- xarray_like
Input values
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- ybool (scalar) or boolean ndarray
True where x is positive or negative infinity, false otherwise. This is a scalar if x is a scalar.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).
Errors result if the second argument is supplied when the first argument is a scalar, or if the first and second arguments have different shapes.
Examples
>>> np.isinf(np.inf)
True
>>> np.isinf(np.nan)
False
>>> np.isinf(np.NINF)
True
>>> np.isinf([np.inf, -np.inf, 1.0, np.nan])
array([ True,  True, False, False])

>>> x = np.array([-np.inf, 0., np.inf])
>>> y = np.array([2, 2, 2])
>>> np.isinf(x, y)
array([1, 0, 1])
>>> y
array([1, 0, 1])
-
dask.array.
isneginf
(*args, **kwargs)¶ This docstring was copied from numpy.equal.
Some inconsistencies with the Dask version may exist.
Return (x1 == x2) element-wise.
- Parameters
- x1, x2array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.
See also
not_equal, greater_equal, less_equal, greater, less
Examples
>>> np.equal([0, 1, 3], np.arange(3))
array([ True,  True, False])
What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:
>>> np.equal(1, np.ones(1))
array([ True])
-
dask.array.
isnan
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isnan.
Some inconsistencies with the Dask version may exist.
Test element-wise for NaN and return result as a boolean array.
- Parameters
- xarray_like
Input array.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or bool
True where x is NaN, false otherwise. This is a scalar if x is a scalar.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.
Examples
>>> np.isnan(np.nan)
True
>>> np.isnan(np.inf)
False
>>> np.isnan([np.log(-1.),1.,np.log(0)])
array([ True, False, False])
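The three element-wise tests isfinite, isinf and isnan classify every floating-point value into exactly one category. The sketch below illustrates this on a small Dask array; the input values and the chunk size are arbitrary choices made for the example.

```python
import numpy as np
import dask.array as da

values = np.array([-np.inf, 0.0, np.nan, np.inf, 1.5])
x = da.from_array(values, chunks=2)

finite = da.isfinite(x).compute()
infinite = da.isinf(x).compute()
not_a_number = da.isnan(x).compute()

print(finite.tolist())        # [False, True, False, False, True]
print(infinite.tolist())      # [True, False, False, True, False]
print(not_a_number.tolist())  # [False, False, True, False, False]

# Each element satisfies exactly one of the three tests.
category_count = finite.astype(int) + infinite.astype(int) + not_a_number.astype(int)
print(category_count.tolist())  # [1, 1, 1, 1, 1]
```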
-
dask.array.
isnull
(values)¶ pandas.isnull for dask arrays
-
dask.array.
isposinf
(*args, **kwargs)¶ This docstring was copied from numpy.equal.
Some inconsistencies with the Dask version may exist.
Return (x1 == x2) element-wise.
- Parameters
- x1, x2array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.
See also
not_equal, greater_equal, less_equal, greater, less
Examples
>>> np.equal([0, 1, 3], np.arange(3))
array([ True,  True, False])
What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:
>>> np.equal(1, np.ones(1))
array([ True])
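As a rough illustration of isneginf and isposinf on a Dask array, the sketch below assumes they follow the semantics of numpy.isneginf and numpy.isposinf, that is, an element-wise equality test against negative or positive infinity. The input values and chunk size are made up for the example.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([-np.inf, -1.0, 0.0, np.inf, np.nan]), chunks=3)

negative_inf = da.isneginf(x).compute()
positive_inf = da.isposinf(x).compute()

print(negative_inf.tolist())  # [True, False, False, False, False]
print(positive_inf.tolist())  # [False, False, False, True, False]
```

NaN compares unequal to everything, so it is reported as False by both tests.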
-
dask.array.
isreal
(*args, **kwargs)¶ Returns a bool array, where True if input element is real.
This docstring was copied from numpy.isreal.
Some inconsistencies with the Dask version may exist.
If an element has complex type with zero imaginary part, the return value for that element is True.
- Parameters
- xarray_like (Not supported in Dask)
Input array.
- Returns
- outndarray, bool
Boolean array of same shape as x.
See also
iscomplex
isrealobj
Return True if x is not a complex type.
Examples
>>> np.isreal([1+1j, 1+0j, 4.5, 3, 2, 2j])
array([False,  True,  True,  True,  True, False])
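Both iscomplex and isreal look at the value of the imaginary part rather than at the dtype. A minimal sketch on a Dask array with a complex dtype (values and chunk size chosen for illustration):

```python
import numpy as np
import dask.array as da

# All three elements are stored as complex128, but only two have a
# non-zero imaginary part.
z = da.from_array(np.array([1 + 1j, 1 + 0j, 2j]), chunks=2)

complex_mask = da.iscomplex(z).compute()
real_mask = da.isreal(z).compute()

print(complex_mask.tolist())  # [True, False, True]
print(real_mask.tolist())     # [False, True, False]
```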
-
dask.array.
ldexp
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.ldexp.
Some inconsistencies with the Dask version may exist.
Returns x1 * 2**x2, element-wise.
The mantissas x1 and twos exponents x2 are used to construct floating point numbers x1 * 2**x2.
- Parameters
- x1array_like
Array of multipliers.
- x2array_like, int
Array of twos exponents. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The result of x1 * 2**x2. This is a scalar if both x1 and x2 are scalars.
See also
frexp
Return (y1, y2) from x = y1 * 2**y2, inverse to ldexp.
Notes
Complex dtypes are not supported; they will raise a TypeError.
ldexp is useful as the inverse of frexp. If used by itself, it is clearer to simply use the expression x1 * 2**x2.
Examples
>>> np.ldexp(5, np.arange(4))
array([ 5., 10., 20., 40.], dtype=float16)
>>> x = np.arange(6)
>>> np.ldexp(*np.frexp(x))
array([ 0., 1., 2., 3., 4., 5.])
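The relationship between ldexp and the plain expression x1 * 2**x2 can be checked on Dask arrays as sketched below. The mantissas, exponents and chunk size are illustrative, and the exponents are given an explicit int32 dtype as an assumption made here for portability across platforms.

```python
import numpy as np
import dask.array as da

mantissa = da.from_array(np.array([0.5, 0.75, 1.0]), chunks=2)
# Explicit int32 exponents (an assumption for this example).
exponent = da.from_array(np.array([1, 2, 3], dtype=np.int32), chunks=2)

scaled = da.ldexp(mantissa, exponent).compute()
direct = (mantissa * 2.0 ** exponent).compute()

print(scaled.tolist())  # [1.0, 3.0, 8.0]
print(direct.tolist())  # [1.0, 3.0, 8.0]
```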
-
dask.array.
linspace
(start, stop, num=50, endpoint=True, retstep=False, chunks='auto', dtype=None)¶ Return num evenly spaced values over the closed interval [start, stop].
- Parameters
- startscalar
The starting value of the sequence.
- stopscalar
The last value of the sequence.
- numint, optional
Number of samples to include in the returned dask array, including the endpoints. Default is 50.
- endpointbool, optional
If True, stop is the last sample. Otherwise, it is not included. Default is True.
- retstepbool, optional
If True, return (samples, step), where step is the spacing between samples. Default is False.
- chunksint
The number of samples on each block. Note that the last block will have fewer samples if num % blocksize != 0
- dtypedtype, optional
The type of the output array.
- Returns
- samplesdask array
- stepfloat, optional
Only returned if retstep is True. Size of spacing between samples.
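The effect of the chunks and retstep parameters can be seen in a short sketch. The numbers used here (10 samples in blocks of 4, and 5 samples in one block) are arbitrary choices for illustration.

```python
import numpy as np
import dask.array as da

# Ten samples in blocks of four: the last block holds the remaining two.
x = da.linspace(0, 1, num=10, chunks=4)
print(x.chunks)  # ((4, 4, 2),)

# With retstep=True the spacing between samples is returned as well.
samples, step = da.linspace(0, 1, num=5, retstep=True, chunks=5)
print(step)  # 0.25

# The computed values agree with numpy.linspace up to rounding.
matches_numpy = bool(np.allclose(x.compute(), np.linspace(0, 1, 10)))
```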
-
dask.array.
log
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.log.
Some inconsistencies with the Dask version may exist.
Natural logarithm, element-wise.
The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is the logarithm in base e.
- Parameters
- xarray_like
Input value.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The natural logarithm of x, element-wise. This is a scalar if x is a scalar.
Notes
Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log([1, np.e, np.e**2, 0])
array([ 0., 1., 2., -Inf])
-
dask.array.
log10
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.log10.
Some inconsistencies with the Dask version may exist.
Return the base 10 logarithm of the input array, element-wise.
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The logarithm to the base 10 of x, element-wise. NaNs are returned where x is negative. This is a scalar if x is a scalar.
See also
emath.log10
Notes
Logarithm is a multivalued function: for each x there is an infinite number of z such that 10**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log10 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log10 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log10 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log10([1e-15, -3.])
array([-15., nan])
-
dask.array.
log1p
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.log1p.
Some inconsistencies with the Dask version may exist.
Return the natural logarithm of one plus the input array, element-wise.
Calculates log(1 + x).
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
Natural logarithm of 1 + x, element-wise. This is a scalar if x is a scalar.
See also
expm1
exp(x) - 1, the inverse of log1p.
Notes
For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.
Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = 1 + x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log1p always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log1p is a complex analytical function that has a branch cut [-inf, -1] and is continuous from above on it. log1p handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
- 1
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
- 2
Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log1p(1e-99)
1e-99
>>> np.log(1 + 1e-99)
0.0
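The accuracy advantage described in the Notes also holds for chunked arrays. The sketch below compares log1p with the naive log(1 + x) on two very small inputs; the values and chunk size are picked for illustration only.

```python
import numpy as np
import dask.array as da

small = np.array([1e-99, 1e-20])
x = da.from_array(small, chunks=1)

accurate = da.log1p(x).compute()
# 1 + x rounds to exactly 1.0 in double precision, so the logarithm is 0.
naive = da.log(1 + x).compute()

print(naive.tolist())  # [0.0, 0.0]
# log1p keeps the leading term: log(1 + x) is approximately x for tiny x.
close_to_input = bool(np.allclose(accurate, small, rtol=1e-12, atol=0.0))
```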
-
dask.array.
log2
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.log2.
Some inconsistencies with the Dask version may exist.
Base-2 logarithm of x.
- Parameters
- xarray_like
Input values.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
Base-2 logarithm of x. This is a scalar if x is a scalar.
Notes
New in version 1.3.0.
Logarithm is a multivalued function: for each x there is an infinite number of z such that 2**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log2 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log2 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log2 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
Examples
>>> x = np.array([0, 1, 2, 2**4])
>>> np.log2(x)
array([-Inf, 0., 1., 4.])

>>> xi = np.array([0+1.j, 1, 2+0.j, 4.j])
>>> np.log2(xi)
array([ 0.+2.26618007j, 0.+0.j , 1.+0.j , 2.+2.26618007j])
-
dask.array.
logaddexp
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logaddexp.
Some inconsistencies with the Dask version may exist.
Logarithm of the sum of exponentiations of the inputs.
Calculates log(exp(x1) + exp(x2)). This function is useful in statistics where the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the logarithm of the calculated probability is stored. This function allows adding probabilities stored in such a fashion.
- Parameters
- x1, x2array_like
Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- resultndarray
Logarithm of exp(x1) + exp(x2). This is a scalar if both x1 and x2 are scalars.
See also
logaddexp2
Logarithm of the sum of exponentiations of inputs in base 2.
Notes
New in version 1.3.0.
Examples
>>> prob1 = np.log(1e-50)
>>> prob2 = np.log(2.5e-50)
>>> prob12 = np.logaddexp(prob1, prob2)
>>> prob12
-113.87649168120691
>>> np.exp(prob12)
3.5000000000000057e-50
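The benefit becomes clearer with log-probabilities whose exponentials underflow to zero. The following sketch uses made-up values of -1000 and -1001 and compares logaddexp on Dask arrays with the naive formula evaluated in plain NumPy.

```python
import numpy as np
import dask.array as da

log_p1 = da.from_array(np.array([-1000.0]), chunks=1)
log_p2 = da.from_array(np.array([-1001.0]), chunks=1)

# Stays in log space, so no intermediate value underflows.
combined = da.logaddexp(log_p1, log_p2).compute()

# Naive evaluation: exp(-1000) underflows to 0.0, so the log is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(-1000.0) + np.exp(-1001.0))

# Analytically, the result is -1000 + log(1 + exp(-1)).
expected = -1000.0 + np.log1p(np.exp(-1.0))
print(naive)  # -inf
```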
-
dask.array.
logaddexp2
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logaddexp2.
Some inconsistencies with the Dask version may exist.
Logarithm of the sum of exponentiations of the inputs in base-2.
Calculates log2(2**x1 + 2**x2). This function is useful in machine learning when the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the base-2 logarithm of the calculated probability can be used instead. This function allows adding probabilities stored in such a fashion.
- Parameters
- x1, x2array_like
Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- resultndarray
Base-2 logarithm of 2**x1 + 2**x2. This is a scalar if both x1 and x2 are scalars.
See also
logaddexp
Logarithm of the sum of exponentiations of the inputs.
Notes
New in version 1.3.0.
Examples
>>> prob1 = np.log2(1e-50)
>>> prob2 = np.log2(2.5e-50)
>>> prob12 = np.logaddexp2(prob1, prob2)
>>> prob1, prob2, prob12
(-166.09640474436813, -164.77447664948076, -164.28904982231052)
>>> 2**prob12
3.4999999999999914e-50
-
dask.array.
logical_and
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logical_and.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 AND x2 element-wise.
- Parameters
- x1, x2array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or bool
Boolean result of the logical AND operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_and(True, False) False >>> np.logical_and([True, False], [False, False]) array([False, False])
>>> x = np.arange(5) >>> np.logical_and(x>1, x<4) array([False, False, True, True, False])
-
dask.array.
logical_not
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logical_not.
Some inconsistencies with the Dask version may exist.
Compute the truth value of NOT x element-wise.
- Parameters
- xarray_like
Logical NOT is applied to the elements of x.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- ybool or ndarray of bool
Boolean result with the same shape as x of the NOT operation on elements of x. This is a scalar if x is a scalar.
Examples
>>> np.logical_not(3) False >>> np.logical_not([True, False, 0, 1]) array([False, True, True, False])
>>> x = np.arange(5) >>> np.logical_not(x<3) array([False, False, False, True, True])
-
dask.array.
logical_or
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logical_or.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 OR x2 element-wise.
- Parameters
- x1, x2array_like
Logical OR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or bool
Boolean result of the logical OR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_or(True, False) True >>> np.logical_or([True, False], [False, False]) array([ True, False])
>>> x = np.arange(5) >>> np.logical_or(x < 1, x > 3) array([ True, False, False, False, True])
-
dask.array.
logical_xor
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.logical_xor.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 XOR x2, element-wise.
- Parameters
- x1, x2array_like
Logical XOR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- ybool or ndarray of bool
Boolean result of the logical XOR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_xor(True, False) True >>> np.logical_xor([True, True, False, False], [True, False, True, False]) array([False, True, True, False])
>>> x = np.arange(5) >>> np.logical_xor(x < 1, x > 3) array([ True, False, False, False, True])
Simple example showing support of broadcasting
>>> np.logical_xor(0, np.eye(2)) array([[ True, False], [False, True]])
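As a small sanity check on how the four logical functions relate, the sketch below (plain NumPy, with illustrative inputs that are not part of the original docstrings) builds XOR out of AND, OR and NOT, and shows that non-boolean inputs are interpreted by truthiness.

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# XOR is "either, but not both".
xor = np.logical_xor(a, b)
composed = np.logical_and(
    np.logical_or(a, b),
    np.logical_not(np.logical_and(a, b)),
)

# Non-boolean inputs: zero counts as False, any other number as True.
mixed = np.logical_and([0, 1, 2], [3, 0, 4])
print(xor, composed, mixed)
```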
-
dask.array.
map_blocks
(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)¶ Map a function across all blocks of a dask array.
- Parameters
- funccallable
Function to apply to every block in the array.
- argsdask arrays or other objects
- dtypenp.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
- chunkstuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
- drop_axisnumber or iterable, optional
Dimensions lost by the function.
- new_axisnumber or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
- tokenstring, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
- namestring, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
- **kwargs :
Other keyword arguments to pass to function. Values must be constants (not dask.arrays)
See also
dask.array.blockwise
Generalized operation with control over block alignment.
Examples
>>> import dask.array as da >>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute() array([ 0, 2, 4, 6, 8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0, 2, 6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,)) >>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
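To make the effect of the chunks argument concrete, here is a runnable sketch of the same call that also inspects the resulting chunk structure and values. The variable names are illustrative, and the block assumes dask and NumPy are installed.

```python
import dask.array as da

# Three input blocks of length 6: [0..5], [6..11], [12..17].
a = da.arange(18, chunks=(6,))

# Each block is cut down to its first three elements, so every output
# block has length 3. A single tuple is enough because all sizes match.
b = a.map_blocks(lambda blk: blk[:3], chunks=(3,))

result = b.compute()
print(b.chunks, result)
```

Without the chunks hint, dask would assume the output blocks have the same shape as the input blocks, and the array metadata would disagree with the computed data.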
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1), ... new_axis=[0, 2])
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,)) >>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks (10,) >>> y.numblocks (10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks
>>> def func(a, b): ... return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8') dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute() array([ 99, 9, 199, 19, 299, 29, 399, 39, 499, 49, 599, 59, 699, 69, 799, 79, 899, 89, 999, 99])
Your block function gets information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info {0: {'shape': (1000,), 'num-chunks': (10,), 'chunk-location': (4,), 'array-location': [(400, 500)]}, None: {'shape': (1000,), 'num-chunks': (10,), 'chunk-location': (4,), 'array-location': [(400, 500)], 'chunk-shape': (100,), 'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (keyed in block_info by its position), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, index 4 along the first dimension), and the array location (for example the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None): ... loc = block_info[None]['array-location'][0] ... return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float_) dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute() array([0, 1, 2, 3, 4, 5, 6, 7])
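The block_info entry for an input array can be used in the same way. The following is a minimal sketch, assuming dask is installed; the function name add_offset and the input array are hypothetical and chosen only for illustration.

```python
import dask.array as da

def add_offset(block, block_info=None):
    # block_info[0] describes the first array argument. Its
    # 'array-location' entry holds one (start, stop) pair per dimension.
    start, stop = block_info[0]["array-location"][0]
    return block + start

# Three blocks of two zeros each, covering 0:2, 2:4 and 4:6.
x = da.zeros(6, chunks=2, dtype="i8")
y = x.map_blocks(add_offset, dtype="i8")

result = y.compute()
print(result)
```

Each block is shifted by the position at which it starts in the full array, which shows that the function sees a different block_info for every block.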
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument, or set the full key name with name, as in this example.
>>> x.map_blocks(lambda x: x + 1, name='increment')
dask.array<increment, shape=(100,), dtype=int64, chunksize=(10,), chunktype=numpy.ndarray>
-
dask.array.
matmul
(x1, x2, /, out=None, *, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.matmul.
Some inconsistencies with the Dask version may exist.
Matrix product of two arrays.
- Parameters
- x1, x2array_like
Input arrays, scalars not allowed.
- outndarray, optional
A location into which the result is stored. If provided, it must have a shape that matches the signature (n,k),(k,m)->(n,m). If not provided or None, a freshly-allocated array is returned.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
New in version 1.16: Now handles ufunc kwargs
- Returns
- yndarray
The matrix product of the inputs. This is a scalar only when both x1, x2 are 1-d vectors.
- Raises
- ValueError
If the last dimension of x1 is not the same size as the second-to-last dimension of x2.
If a scalar value is passed in.
Notes
The behavior depends on the arguments in the following way.
If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.
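The 1-D promotion rules above can be checked against an explicit reshape. This is a sketch in plain NumPy with illustrative inputs; it is not taken from the original docstring.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
v = np.array([1, 2])             # 1-D first argument
w = np.array([1, 0, 2])          # 1-D second argument

# v is treated as shape (1, 2); the prepended axis is dropped afterwards.
left = np.matmul(v, M)
left_explicit = np.matmul(v[np.newaxis, :], M)[0]

# w is treated as shape (3, 1); the appended axis is dropped afterwards.
right = np.matmul(M, w)
right_explicit = np.matmul(M, w[:, np.newaxis])[:, 0]

print(left, right)
```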
matmul differs from dot in two important ways:
- Multiplication by scalars is not allowed, use * instead.
- Stacks of matrices are broadcast together as if the matrices were elements, respecting the signature (n,k),(k,m)->(n,m):
>>> a = np.ones([9, 5, 7, 4])
>>> c = np.ones([9, 5, 4, 3])
>>> np.dot(a, c).shape
(9, 5, 7, 9, 5, 3)
>>> np.matmul(a, c).shape
(9, 5, 7, 3)
>>> # n is 7, k is 4, m is 3
The matmul function implements the semantics of the @ operator introduced in Python 3.5 following PEP465.
Examples
For 2-D arrays it is the matrix product:
>>> a = np.array([[1, 0], ... [0, 1]]) >>> b = np.array([[4, 1], ... [2, 2]]) >>> np.matmul(a, b) array([[4, 1], [2, 2]])
For 2-D mixed with 1-D, the result is the usual.
>>> a = np.array([[1, 0], ... [0, 1]]) >>> b = np.array([1, 2]) >>> np.matmul(a, b) array([1, 2]) >>> np.matmul(b, a) array([1, 2])
Broadcasting is conventional for stacks of arrays
>>> a = np.arange(2 * 2 * 4).reshape((2, 2, 4)) >>> b = np.arange(2 * 2 * 4).reshape((2, 4, 2)) >>> np.matmul(a,b).shape (2, 2, 2) >>> np.matmul(a, b)[0, 1, 1] 98 >>> sum(a[0, 1, :] * b[0 , :, 1]) 98
Vector, vector returns the scalar inner product, but neither argument is complex-conjugated:
>>> np.matmul([2j, 3j], [2j, 3j]) (-13+0j)
Scalar multiplication raises an error.
>>> np.matmul([1,2], 3) Traceback (most recent call last): ... ValueError: matmul: Input operand 1 does not have enough dimensions ...
New in version 1.10.0.
-
dask.array.
max
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Return the maximum of an array or maximum along an axis.
This docstring was copied from numpy.max.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like
Input data.
- axisNone or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.
- outndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.
- keepdimsbool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amax method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- initialscalar, optional (Not supported in Dask)
The minimum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- wherearray_like of bool, optional (Not supported in Dask)
Elements to compare for the maximum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- amaxndarray or scalar
Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amin
The minimum value of an array along a given axis, propagating any NaNs.
nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
argmax
Return the indices of the maximum values.
nanmin, minimum, fmin
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.
Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) >>> a array([[0, 1], [2, 3]]) >>> np.amax(a) # Maximum of the flattened array 3 >>> np.amax(a, axis=0) # Maxima along the first axis array([2, 3]) >>> np.amax(a, axis=1) # Maxima along the second axis array([1, 3]) >>> np.amax(a, where=[False, True], initial=-1, axis=0) array([-1, 3]) >>> b = np.arange(5, dtype=float) >>> b[2] = np.NaN >>> np.amax(b) nan >>> np.amax(b, where=~np.isnan(b), initial=-1) 4.0 >>> np.nanmax(b) 4.0
You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:
>>> np.max([[-50], [10]], axis=-1, initial=0) array([ 0, 10])
Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.
>>> np.max([5], initial=6) 6 >>> max([5], default=6) 5
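The note about comparing two arrays element-wise can be illustrated with a short NumPy sketch. It only checks that the two approaches agree; it makes no timing measurement, and the input values are illustrative.

```python
import numpy as np

a = np.array([[1.0, 7.0, 3.0],
              [4.0, 2.0, 9.0]])

# Reduction over the first axis of a two-row array ...
reduced = np.max(a, axis=0)

# ... gives the same values as the element-wise binary function.
pairwise = np.maximum(a[0], a[1])

print(reduced, pairwise)
```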
-
dask.array.
maximum
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.maximum.
Some inconsistencies with the Dask version may exist.
Element-wise maximum of array elements.
Compare two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.
- Parameters
- x1, x2array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray or scalar
The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
Notes
The maximum is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.maximum([2, 3, 4], [1, 5, 2]) array([2, 5, 4])
>>> np.maximum(np.eye(2), [0.5, 2]) # broadcasting array([[ 1. , 2. ], [ 0.5, 2. ]])
>>> np.maximum([np.nan, 0, np.nan], [0, np.nan, np.nan]) array([nan, nan, nan]) >>> np.maximum(np.Inf, 1) inf
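The equivalence with np.where stated in the Notes, and the way it breaks down once a NaN is involved, can be sketched as follows. The inputs are illustrative and use plain NumPy.

```python
import numpy as np

x1 = np.array([[1, 5],
               [3, 2]])
x2 = np.array([2, 4])            # broadcast against each row of x1

via_maximum = np.maximum(x1, x2)
via_where = np.where(x1 >= x2, x1, x2)

# With a NaN the two differ: maximum propagates it, while the comparison
# nan >= 1.0 is False, so np.where picks the second value.
nan_maximum = np.maximum(np.nan, 1.0)
nan_where = np.where(np.nan >= 1.0, np.nan, 1.0)

print(via_maximum, nan_maximum, nan_where)
```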
-
dask.array.
mean
(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)¶ Compute the arithmetic mean along the specified axis.
This docstring was copied from numpy.mean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.
- Parameters
- aarray_like
Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.
- axisNone or int or tuple of ints, optional
Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.
- dtypedata-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.
- outndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
- keepdimsbool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- Returns
- mndarray, see dtype parameter above
If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.
Notes
The arithmetic mean is the sum of the elements along the axis divided by the number of elements.
Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.
By default, float16 results are computed using float32 intermediates for extra precision.
Examples
>>> a = np.array([[1, 2], [3, 4]]) >>> np.mean(a) 2.5 >>> np.mean(a, axis=0) array([2., 3.]) >>> np.mean(a, axis=1) array([1.5, 3.5])
In single precision, mean can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32) >>> a[0, :] = 1.0 >>> a[1, :] = 0.1 >>> np.mean(a) 0.54999924
Computing the mean in float64 is more accurate:
>>> np.mean(a, dtype=np.float64) 0.55000000074505806 # may vary
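The keepdims behaviour described under Parameters is easiest to see when the result is broadcast back against the input. The sketch below uses plain NumPy with illustrative values to centre each row of an array.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])

# keepdims=True leaves the reduced axis in place with size one,
# so the result has shape (2, 1) instead of (2,).
row_means = a.mean(axis=1, keepdims=True)

# The (2, 1) result broadcasts against the (2, 3) input.
centered = a - row_means

print(row_means.shape, centered)
```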
-
dask.array.
median
(a, axis=None, keepdims=False, out=None)¶ Compute the median along the specified axis.
This docstring was copied from numpy.median.
Some inconsistencies with the Dask version may exist.
This works by automatically chunking the reduced axes to a single chunk and then calling the numpy.median function across the remaining dimensions.
Returns the median of the array elements.
- Parameters
- aarray_like
Input array or object that can be converted to an array.
- axis{int, sequence of int, None}, optional
Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.
- outndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.
- overwrite_inputbool, optional (Not supported in Dask)
If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.
- keepdimsbool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
New in version 1.9.0.
- Returns
- medianndarray
A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.
Notes
Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted, i.e., V_sorted[(N-1)/2], when N is odd, and the average of the two middle values of V_sorted when N is even.
Examples
>>> a = np.array([[10, 7, 4], [3, 2, 1]]) >>> a array([[10, 7, 4], [ 3, 2, 1]]) >>> np.median(a) 3.5 >>> np.median(a, axis=0) array([6.5, 4.5, 2.5]) >>> np.median(a, axis=1) array([7., 2.]) >>> m = np.median(a, axis=0) >>> out = np.zeros_like(m) >>> np.median(a, axis=0, out=m) array([6.5, 4.5, 2.5]) >>> m array([6.5, 4.5, 2.5]) >>> b = a.copy() >>> np.median(b, axis=1, overwrite_input=True) array([7., 2.]) >>> assert not np.all(a==b) >>> b = a.copy() >>> np.median(b, axis=None, overwrite_input=True) 3.5 >>> assert not np.all(a==b)
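The examples above are NumPy calls. A minimal sketch of the Dask counterpart follows, assuming the installed dask version provides da.median as listed here; the chunking is an illustrative choice that splits the reduced axis across two blocks so that the single-chunk step described earlier has something to do.

```python
import numpy as np
import dask.array as da

data = np.array([[10, 7, 4],
                 [3, 2, 1]])

# One row per chunk: the axis being reduced starts out split in two.
x = da.from_array(data, chunks=(1, 3))

# Reduce along axis 0; the result is lazy until compute() is called.
lazy = da.median(x, axis=0)
result = lazy.compute()

print(result)
```

The integer input produces a float64 result, matching the Returns description.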
-
dask.array.
meshgrid
(*xi, **kwargs)¶ Return coordinate matrices from coordinate vectors.
This docstring was copied from numpy.meshgrid.
Some inconsistencies with the Dask version may exist.
Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.
Changed in version 1.9: 1-D and 0-D cases are allowed.
- Parameters
- x1, x2,…, xnarray_like
1-D arrays representing the coordinates of a grid.
- indexing{‘xy’, ‘ij’}, optional
Cartesian (‘xy’, default) or matrix (‘ij’) indexing of output. See Notes for more details.
New in version 1.7.0.
- sparsebool, optional
If True a sparse grid is returned in order to conserve memory. Default is False.
New in version 1.7.0.
- copybool, optional
If False, a view into the original arrays is returned in order to conserve memory. Default is True. Please note that sparse=False, copy=False will likely return non-contiguous arrays. Furthermore, more than one element of a broadcast array may refer to a single memory location. If you need to write to the arrays, make copies first.
New in version 1.7.0.
- Returns
- X1, X2,…, XNndarray
For vectors x1, x2,…, xn with lengths Ni=len(xi), return (N1, N2, N3,...Nn) shaped arrays if indexing='ij' or (N2, N1, N3,...Nn) shaped arrays if indexing='xy' with the elements of xi repeated to fill the matrix along the first dimension for x1, the second for x2 and so on.
See also
index_tricks.mgrid
Construct a multi-dimensional “meshgrid” using indexing notation.
index_tricks.ogrid
Construct an open multi-dimensional “meshgrid” using indexing notation.
Notes
This function supports both indexing conventions through the indexing keyword argument. Giving the string ‘ij’ returns a meshgrid with matrix indexing, while ‘xy’ returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for ‘xy’ indexing and (M, N, P) for ‘ij’ indexing. The difference is illustrated by the following code snippet:
xv, yv = np.meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]

xv, yv = np.meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]
In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
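The shape difference between the two indexing conventions can be verified directly. This is a sketch in plain NumPy with illustrative input lengths (M = 3, N = 2).

```python
import numpy as np

x = np.linspace(0, 1, 3)   # length M = 3
y = np.linspace(0, 1, 2)   # length N = 2

# Cartesian indexing: outputs have shape (N, M).
xv_xy, yv_xy = np.meshgrid(x, y, indexing="xy")

# Matrix indexing: outputs have shape (M, N).
xv_ij, yv_ij = np.meshgrid(x, y, indexing="ij")

print(xv_xy.shape, xv_ij.shape)
```

In the 2-D case the two conventions are transposes of each other.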
Examples
>>> nx, ny = (3, 2) >>> x = np.linspace(0, 1, nx) >>> y = np.linspace(0, 1, ny) >>> xv, yv = np.meshgrid(x, y) >>> xv array([[0. , 0.5, 1. ], [0. , 0.5, 1. ]]) >>> yv array([[0., 0., 0.], [1., 1., 1.]]) >>> xv, yv = np.meshgrid(x, y, sparse=True) # make sparse output arrays >>> xv array([[0. , 0.5, 1. ]]) >>> yv array([[0.], [1.]])
meshgrid is very useful to evaluate functions on a grid.
>>> import matplotlib.pyplot as plt >>> x = np.arange(-5, 5, 0.1) >>> y = np.arange(-5, 5, 0.1) >>> xx, yy = np.meshgrid(x, y, sparse=True) >>> z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2) >>> h = plt.contourf(x,y,z) >>> plt.show()
-
dask.array.
min
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Return the minimum of an array or minimum along an axis.
This docstring was copied from numpy.min.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like
Input data.
- axisNone or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.
- outndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.
- keepdimsbool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amin method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- initialscalar, optional (Not supported in Dask)
The maximum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- wherearray_like of bool, optional (Not supported in Dask)
Elements to compare for the minimum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- aminndarray or scalar
Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amax
The maximum value of an array along a given axis, propagating any NaNs.
nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
argmin
Return the indices of the minimum values.
nanmax, maximum, fmax
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.
Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) >>> a array([[0, 1], [2, 3]]) >>> np.amin(a) # Minimum of the flattened array 0 >>> np.amin(a, axis=0) # Minima along the first axis array([0, 1]) >>> np.amin(a, axis=1) # Minima along the second axis array([0, 2]) >>> np.amin(a, where=[False, True], initial=10, axis=0) array([10, 1])
>>> b = np.arange(5, dtype=float) >>> b[2] = np.NaN >>> np.amin(b) nan >>> np.amin(b, where=~np.isnan(b), initial=10) 0.0 >>> np.nanmin(b) 0.0
>>> np.min([[-50], [10]], axis=-1, initial=0) array([-50, 0])
Notice that the initial value is used as one of the elements for which the minimum is determined, unlike the default argument of Python’s min function, which is only used for empty iterables.
>>> np.min([6], initial=5)
5
>>> min([6], default=5)
6
-
dask.array.
minimum
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.minimum.
Some inconsistencies with the Dask version may exist.
Element-wise minimum of array elements.
Compare two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.
- Parameters
- x1, x2 : array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y : ndarray or scalar
The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
See also
Notes
The minimum is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.minimum([2, 3, 4], [1, 5, 2])
array([1, 3, 2])
>>> np.minimum(np.eye(2), [0.5, 2]) # broadcasting
array([[ 0.5,  0. ],
       [ 0. ,  1. ]])
>>> np.minimum([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([nan, nan, nan])
>>> np.minimum(-np.inf, 1)
-inf
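The examples above use NumPy directly. A minimal sketch of the same call on Dask arrays follows, assuming small in-memory inputs and arbitrary chunk sizes chosen only for illustration; fmin is included to contrast NaN propagation with NaN skipping.

```python
import numpy as np
import dask.array as da

x1 = da.from_array(np.array([[2.0, np.nan], [4.0, 1.0]]), chunks=(1, 2))
x2 = da.from_array(np.array([3.0, 0.0]), chunks=2)  # broadcast against each row

lazy = da.minimum(x1, x2)        # builds a task graph; nothing runs yet
result = lazy.compute()          # NaN in x1 propagates to the output

ignoring = da.fmin(x1, x2).compute()  # fmin skips the NaN instead
```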
-
dask.array.modf(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.modf.
Some inconsistencies with the Dask version may exist.
Return the fractional and integral parts of an array, element-wise.
The fractional and integral parts are negative if the given number is negative.
- Parameters
- x : array_like
Input array.
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y1 : ndarray
Fractional part of x. This is a scalar if x is a scalar.
- y2 : ndarray
Integral part of x. This is a scalar if x is a scalar.
See also
divmod
divmod(x, 1) is equivalent to modf with the return values switched, except it always has a positive remainder.
Notes
For integer input the return values are floats.
Examples
>>> np.modf([0, 3.5])
(array([ 0. ,  0.5]), array([ 0.,  3.]))
>>> np.modf(-0.5)
(-0.5, -0)
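As a rough illustration of the two return values and of the divmod remark, the sketch below runs modf on a small Dask array (values and chunking are illustrative assumptions) and compares it with NumPy's divmod for a negative input.

```python
import numpy as np
import dask.array as da

values = np.array([0.0, 3.5, -2.25])
x = da.from_array(values, chunks=2)

frac, whole = da.modf(x)               # two lazy arrays
frac, whole = da.compute(frac, whole)  # evaluate both in one pass

# divmod returns (quotient, remainder); the remainder is never negative,
# so the result differs from modf for negative inputs.
quotient, remainder = np.divmod(values, 1)
```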
-
dask.array.moment(a, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
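No description accompanies this signature here. The sketch below assumes that moment computes the central moment of the given order (so order 2 with ddof=0 matches the population variance), and checks that assumption against values computed by hand with NumPy.

```python
import numpy as np
import dask.array as da

data = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
x = da.from_array(data, chunks=2)

second = da.moment(x, 2).compute()  # assumed: central moment of order 2
third = da.moment(x, 3).compute()   # assumed: central moment of order 3

# Reference values computed directly from the definition.
mean = data.mean()
expected_second = ((data - mean) ** 2).mean()
expected_third = ((data - mean) ** 3).mean()
```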
-
dask.array.moveaxis(a, source, destination)
Move axes of an array to new positions.
This docstring was copied from numpy.moveaxis.
Some inconsistencies with the Dask version may exist.
Other axes remain in their original order.
New in version 1.11.0.
- Parameters
- a : np.ndarray
The array whose axes should be reordered.
- source : int or sequence of int
Original positions of the axes to move. These must be unique.
- destination : int or sequence of int
Destination positions for each of the original axes. These must also be unique.
- Returns
- result : np.ndarray
Array with moved axes. This array is a view of the input array.
See also
transpose
Permute the dimensions of an array.
swapaxes
Interchange two axes of an array.
Examples
>>> x = np.zeros((3, 4, 5))
>>> np.moveaxis(x, 0, -1).shape
(4, 5, 3)
>>> np.moveaxis(x, -1, 0).shape
(5, 3, 4)
These all achieve the same result:
>>> np.transpose(x).shape
(5, 4, 3)
>>> np.swapaxes(x, 0, -1).shape
(5, 4, 3)
>>> np.moveaxis(x, [0, 1], [-1, -2]).shape
(5, 4, 3)
>>> np.moveaxis(x, [0, 1, 2], [-1, -2, -3]).shape
(5, 4, 3)
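For a Dask array the chunk layout moves together with the axes. A minimal sketch, with a chunk layout chosen only for illustration:

```python
import dask.array as da

x = da.zeros((3, 4, 5), chunks=(1, 2, 5))

moved_last = da.moveaxis(x, 0, -1)                # axis 0 goes to the end
moved_first = da.moveaxis(x, -1, 0)               # last axis goes to the front
reversed_axes = da.moveaxis(x, [0, 1], [-1, -2])  # same as a full transpose here
```

Each result is still lazy; only the shape and chunk metadata are rearranged until compute() is called.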
-
dask.array.nanargmax(x, axis=None, split_every=None, out=None)
Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmax.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like (Not supported in Dask)
Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
New in version 1.8.0.
- keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
New in version 1.8.0.
- Returns
- nanmax : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
amax
The maximum value of an array along a given axis, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.
amin, fmin, minimum
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.max.
Examples
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmax(a)
3.0
>>> np.nanmax(a, axis=0)
array([3., 2.])
>>> np.nanmax(a, axis=1)
array([2., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmax([1, 2, np.nan, -np.inf])
2.0
>>> np.nanmax([1, 2, np.nan, np.inf])
inf
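The description and examples in this entry come from numpy.nanmax and show values, whereas nanargmax reports positions. A short sketch of the Dask call, using the same illustrative array with an arbitrary chunking:

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1.0, 2.0], [3.0, np.nan]]), chunks=(1, 2))

flat_index = da.nanargmax(a).compute()          # position in the flattened array
per_column = da.nanargmax(a, axis=0).compute()  # row index of each column's maximum
per_row = da.nanargmax(a, axis=1).compute()     # column index of each row's maximum
```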
-
dask.array.nanargmin(x, axis=None, split_every=None, out=None)
Return minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmin.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like (Not supported in Dask)
Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
New in version 1.8.0.
- keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
New in version 1.8.0.
- Returns
- nanmin : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
amin
The minimum value of an array along a given axis, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.
amax, fmax, maximum
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.min.
Examples
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmin(a)
1.0
>>> np.nanmin(a, axis=0)
array([1., 2.])
>>> np.nanmin(a, axis=1)
array([1., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmin([1, 2, np.nan, np.inf])
1.0
>>> np.nanmin([1, 2, np.nan, -np.inf])
-inf
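Since nanargmin returns a position, a common pattern is to use it to look up the matching element. The sketch below assumes a hypothetical 1-D series of readings with gaps; the plain NumPy argmin is shown for contrast because it lands on the first NaN.

```python
import numpy as np
import dask.array as da

readings = np.array([np.nan, 7.5, 2.0, np.nan, 3.5, 9.0])  # illustrative data
x = da.from_array(readings, chunks=3)

idx = int(da.nanargmin(x).compute())  # position of the smallest non-NaN value
lowest = float(x[idx].compute())      # look the value up by position

plain = int(np.argmin(readings))      # NaN-propagating variant picks the NaN
```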
-
dask.array.nancumprod(x, axis, dtype=None, out=None)
Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. The cumulative product does not change when NaNs are encountered and leading NaNs are replaced by ones.
This docstring was copied from numpy.nancumprod.
Some inconsistencies with the Dask version may exist.
Ones are returned for slices that are all-NaN or empty.
New in version 1.12.0.
- Parameters
- a : array_like (Not supported in Dask)
Input array.
- axis : int, optional
Axis along which the cumulative product is computed. By default the input is flattened.
- dtype : dtype, optional
Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.
- out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.
- Returns
- nancumprod : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned.
See also
numpy.cumprod
Cumulative product across array propagating NaNs.
isnan
Show which elements are NaN.
Examples
>>> np.nancumprod(1)
array([1])
>>> np.nancumprod([1])
array([1])
>>> np.nancumprod([1, np.nan])
array([1., 1.])
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nancumprod(a)
array([1., 2., 6., 6.])
>>> np.nancumprod(a, axis=0)
array([[1., 2.],
       [3., 2.]])
>>> np.nancumprod(a, axis=1)
array([[1., 2.],
       [3., 3.]])
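The Dask signature above lists axis without a default, so the sketch below always passes it explicitly. The array and chunking are illustrative.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1.0, 2.0], [3.0, np.nan]]), chunks=(1, 2))

# NaN behaves like a factor of one, so the running product carries on.
down_columns = da.nancumprod(a, axis=0).compute()
along_rows = da.nancumprod(a, axis=1).compute()
```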
-
dask.array.nancumsum(x, axis, dtype=None, out=None)
Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. The cumulative sum does not change when NaNs are encountered and leading NaNs are replaced by zeros.
This docstring was copied from numpy.nancumsum.
Some inconsistencies with the Dask version may exist.
Zeros are returned for slices that are all-NaN or empty.
New in version 1.12.0.
- Parameters
- a : array_like (Not supported in Dask)
Input array.
- axis : int, optional
Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.
- dtype : dtype, optional
Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.
- out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See ufuncs-output-type for more details.
- Returns
- nancumsum : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
See also
numpy.cumsum
Cumulative sum across array propagating NaNs.
isnan
Show which elements are NaN.
Examples
>>> np.nancumsum(1)
array([1])
>>> np.nancumsum([1])
array([1])
>>> np.nancumsum([1, np.nan])
array([1., 1.])
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nancumsum(a)
array([1., 3., 6., 6.])
>>> np.nancumsum(a, axis=0)
array([[1., 2.],
       [4., 2.]])
>>> np.nancumsum(a, axis=1)
array([[1., 3.],
       [3., 3.]])
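To see why the NaN-aware variant matters across chunk boundaries, the sketch below compares it with the ordinary cumulative sum on a hypothetical series with gaps. The data and chunk size are assumptions for illustration.

```python
import numpy as np
import dask.array as da

daily = np.array([5.0, np.nan, 2.0, 3.0, np.nan, 1.0])  # illustrative data
x = da.from_array(daily, chunks=2)

running = da.nancumsum(x, axis=0).compute()  # gaps count as zero
plain = da.cumsum(x, axis=0).compute()       # first NaN poisons everything after it
```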
-
dask.array.nanmax(a, axis=None, keepdims=False, split_every=None, out=None)
Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmax.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like
Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
New in version 1.8.0.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
New in version 1.8.0.
- Returns
- nanmax : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
amax
The maximum value of an array along a given axis, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.
amin, fmin, minimum
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.max.
Examples
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmax(a)
3.0
>>> np.nanmax(a, axis=0)
array([3., 2.])
>>> np.nanmax(a, axis=1)
array([2., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmax([1, 2, np.nan, -np.inf])
2.0
>>> np.nanmax([1, 2, np.nan, np.inf])
inf
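The keepdims option is easiest to appreciate when the reduced result is broadcast back against the input. A minimal sketch on a Dask array, with illustrative values and chunking:

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1.0, 2.0], [3.0, np.nan]]), chunks=(1, 2))

overall = da.nanmax(a).compute()

# keepdims=True leaves a length-one axis, so the row maxima
# line up with the rows of the original array.
per_row = da.nanmax(a, axis=1, keepdims=True)
scaled = (a / per_row).compute()
```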
-
dask.array.nanmean(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Compute the arithmetic mean along the specified axis, ignoring NaNs.
This docstring was copied from numpy.nanmean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.
For all-NaN slices, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
- Parameters
- a : array_like
Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
- dtype : data-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for inexact inputs, it is the same as the input dtype.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
- Returns
- m : ndarray, see dtype parameter above
If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned. NaN is returned for slices that contain only NaNs.
Notes
The arithmetic mean is the sum of the non-NaN elements along the axis divided by the number of non-NaN elements.
Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.nanmean(a)
2.6666666666666665
>>> np.nanmean(a, axis=0)
array([2., 4.])
>>> np.nanmean(a, axis=1)
array([1., 3.5]) # may vary
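The definition given in the Notes (sum of the non-NaN elements divided by their count) can be reproduced by hand. The sketch below does so on a Dask array; the chunking is an illustrative assumption.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1.0, np.nan], [3.0, 4.0]]), chunks=(1, 2))

mean_all = da.nanmean(a).compute()
per_column = da.nanmean(a, axis=0).compute()

# Same quantity from its definition: NaN-skipping sum over the non-NaN count.
count = (~da.isnan(a)).sum()
by_hand = (da.nansum(a) / count).compute()
```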
-
dask.array.nanmedian(a, axis=None, keepdims=False, out=None)
Compute the median along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanmedian.
Some inconsistencies with the Dask version may exist.
This works by automatically chunking the reduced axes to a single chunk and then calling the numpy.nanmedian function across the remaining dimensions.
Returns the median of the array elements.
New in version 1.9.0.
- Parameters
- a : array_like
Input array or object that can be converted to an array.
- axis : {int, sequence of int, None}, optional
Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.
- out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.
- overwrite_input : bool, optional (Not supported in Dask)
If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If this is anything but the default value it will be passed through (in the special case of an empty array) to the mean function of the underlying array. If the array is a sub-class and mean does not have the kwarg keepdims this will raise a RuntimeError.
- Returns
- median : ndarray
A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.
See also
Notes
Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted - i.e., V_sorted[(N-1)/2], when N is odd and the average of the two middle values of V_sorted when N is even.
Examples
>>> a = np.array([[10.0, 7, 4], [3, 2, 1]])
>>> a[0, 1] = np.nan
>>> a
array([[10., nan,  4.],
       [ 3.,  2.,  1.]])
>>> np.median(a)
nan
>>> np.nanmedian(a)
3.0
>>> np.nanmedian(a, axis=0)
array([6.5, 2. , 2.5])
>>> np.median(a, axis=1)
array([nan,  2.])
>>> b = a.copy()
>>> np.nanmedian(b, axis=1, overwrite_input=True)
array([7.,  2.])
>>> assert not np.all(a==b)
>>> b = a.copy()
>>> np.nanmedian(b, axis=None, overwrite_input=True)
3.0
>>> assert not np.all(a==b)
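The NumPy examples above include calls that do not carry over to Dask (overwrite_input is marked as unsupported). A minimal Dask sketch follows; it passes axis explicitly, on the assumption that some Dask versions do not accept axis=None for medians, and it splits the reduced axis across chunks to show that the single-chunk rechunking described earlier happens automatically.

```python
import numpy as np
import dask.array as da

data = np.array([[10.0, np.nan, 4.0],
                 [3.0, 2.0, 1.0]])
a = da.from_array(data, chunks=(1, 3))  # axis 0 is split into two chunks

col_median = da.nanmedian(a, axis=0)    # lazy; axis 0 is rechunked internally
result = col_median.compute()
```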
-
dask.array.nanmin(a, axis=None, keepdims=False, split_every=None, out=None)
Return minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmin.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like
Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.
New in version 1.8.0.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
New in version 1.8.0.
- Returns
- nanmin : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
amin
The minimum value of an array along a given axis, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.
amax, fmax, maximum
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.min.
Examples
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmin(a)
1.0
>>> np.nanmin(a, axis=0)
array([1., 2.])
>>> np.nanmin(a, axis=1)
array([1., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmin([1, 2, np.nan, np.inf])
1.0
>>> np.nanmin([1, 2, np.nan, -np.inf])
-inf
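Two points from the Notes can be checked on Dask arrays: integer input behaves like the ordinary minimum, and negative infinity counts as a regular (very small) value rather than being skipped. The arrays below are illustrative.

```python
import numpy as np
import dask.array as da

ints = da.from_array(np.array([[4, 2], [7, 1]]), chunks=(1, 2))
int_nanmin = int(da.nanmin(ints).compute())
int_min = int(da.min(ints).compute())   # integers cannot hold NaN, so both agree

floats = da.from_array(np.array([1.0, np.nan, -np.inf]), chunks=2)
lowest = float(da.nanmin(floats).compute())  # -inf is kept, NaN is ignored
```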
-
dask.array.nanprod(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones.
This docstring was copied from numpy.nanprod.
Some inconsistencies with the Dask version may exist.
One is returned for slices that are all-NaN or empty.
New in version 1.10.0.
- Parameters
- a : array_like
Array containing numbers whose product is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the product is computed. The default is to compute the product of the flattened array.
- dtype : data-type, optional
The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details. The casting of NaN to integer can yield unexpected results.
- keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
- Returns
- nanprod : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned.
See also
numpy.prod
Product across array propagating NaNs.
isnan
Show which elements are NaN.
Examples
>>> np.nanprod(1)
1
>>> np.nanprod([1])
1
>>> np.nanprod([1, np.nan])
1.0
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanprod(a)
6.0
>>> np.nanprod(a, axis=0)
array([3., 2.])
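The statement that one is returned for all-NaN slices can be seen in a short Dask sketch. The second array is a made-up case with one row consisting only of NaNs.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1.0, 2.0], [3.0, np.nan]]), chunks=(1, 2))
total = da.nanprod(a).compute()

b = da.from_array(np.array([[np.nan, np.nan], [2.0, 5.0]]), chunks=(1, 2))
per_row = da.nanprod(b, axis=1).compute()  # the all-NaN row yields 1
```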
-
dask.array.nanstd(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
Compute the standard deviation along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanstd.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation, a measure of the spread of a distribution, of the non-NaN array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.
For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
- Parameters
- a : array_like
Calculate the standard deviation of the non-NaN values.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.
- dtype : dtype, optional
Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.
- out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.
- ddof : int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If this value is anything but the default it is passed through as-is to the relevant functions of the sub-classes. If these functions do not have a keepdims kwarg, a RuntimeError will be raised.
- Returns
- standard_deviation : ndarray, see dtype parameter above.
If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.
Notes
The standard deviation is the square root of the average of the squared deviations from the mean: std = sqrt(mean(abs(x - x.mean())**2)).
The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.
For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.nanstd(a)
1.247219128924647
>>> np.nanstd(a, axis=0)
array([1., 0.])
>>> np.nanstd(a, axis=1)
array([0.,  0.5]) # may vary
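The effect of ddof described in the Notes can be sketched by computing the divisor by hand. Here N is the number of non-NaN elements; the array and chunking are illustrative.

```python
import numpy as np
import dask.array as da

data = np.array([[1.0, np.nan], [3.0, 4.0]])
a = da.from_array(data, chunks=(1, 2))

population = da.nanstd(a).compute()      # divisor N
sample = da.nanstd(a, ddof=1).compute()  # divisor N - 1

# Manual check using only the non-NaN values.
valid = data[~np.isnan(data)]
squared_dev = (valid - valid.mean()) ** 2
by_hand_population = np.sqrt(squared_dev.sum() / valid.size)
by_hand_sample = np.sqrt(squared_dev.sum() / (valid.size - 1))
```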
-
dask.array.nansum(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.
This docstring was copied from numpy.nansum.
Some inconsistencies with the Dask version may exist.
In NumPy versions <= 1.9.0 NaN is returned for slices that are all-NaN or empty. In later versions zero is returned.
- Parameters
- a : array_like
Array containing numbers whose sum is desired. If a is not an array, a conversion is attempted.
- axis : {int, tuple of int, None}, optional
Axis or axes along which the sum is computed. The default is to compute the sum of the flattened array.
- dtype : data-type, optional
The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.
New in version 1.8.0.
- out : ndarray, optional
Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details. The casting of NaN to integer can yield unexpected results.
New in version 1.8.0.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If a sub-class's method does not implement keepdims, any exceptions will be raised.
New in version 1.8.0.
- Returns
- nansum : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
See also
Notes
If both positive and negative infinity are present, the sum will be Not A Number (NaN).
Examples
>>> np.nansum(1)
1
>>> np.nansum([1])
1
>>> np.nansum([1, np.nan])
1.0
>>> a = np.array([[1, 1], [1, np.nan]])
>>> np.nansum(a)
3.0
>>> np.nansum(a, axis=0)
array([2., 1.])
>>> np.nansum([1, np.nan, np.inf])
inf
>>> np.nansum([1, np.nan, -np.inf])
-inf
>>> from numpy.testing import suppress_warnings
>>> with suppress_warnings() as sup:
...     sup.filter(RuntimeWarning)
...     np.nansum([1, np.nan, np.inf, -np.inf])  # both +/- infinity present
nan
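As a rough illustration of "treating NaNs as zero", the result can be reproduced by masking the NaN entries before an ordinary sum. This is a sketch using NumPy calls only; it is not a description of how Dask implements the reduction.

```python
import numpy as np

a = np.array([[1.0, 1.0], [1.0, np.nan]])

# Replace NaN entries with zero, then sum as usual.
masked_total = np.where(np.isnan(a), 0.0, a).sum()
nansum_total = np.nansum(a)

# Column-wise sums follow the same rule.
masked_cols = np.where(np.isnan(a), 0.0, a).sum(axis=0)
nansum_cols = np.nansum(a, axis=0)
```

On a Dask array the equivalent call is lazy, so a `.compute()` is needed to obtain concrete values.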
-
dask.array.nanvar(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
Compute the variance along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanvar.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.
For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
- Parameters
- aarray_like
Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.
- axis{int, tuple of int, None}, optional
Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.
- dtypedata-type, optional
Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
- outndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.
- ddof : int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
- Returns
- variancendarray, see dtype parameter above
If out is None, return a new array containing the variance, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.
See also
Notes
The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.
For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
For this function to work on sub-classes of ndarray, they must define sum with the kwarg keepdims.
Examples
>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.nanvar(a)
1.5555555555555554
>>> np.nanvar(a, axis=0)
array([1., 0.])
>>> np.nanvar(a, axis=1)
array([0., 0.25])  # may vary
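The formula in the Notes can be checked by hand. The sketch below drops the NaN entries, applies var = mean(abs(x - x.mean())**2) with the divisor N - ddof, and compares the result against np.nanvar. The helper name `nanvar_by_hand` is hypothetical and only serves this illustration.

```python
import numpy as np

a = np.array([[1.0, np.nan], [3.0, 4.0]])

def nanvar_by_hand(x, ddof=0):
    """Variance of the non-NaN entries of x with divisor N - ddof (illustrative helper)."""
    valid = x[~np.isnan(x)]                   # N is the number of non-NaN elements
    n = valid.size
    deviations = np.abs(valid - valid.mean()) ** 2
    return deviations.sum() / (n - ddof)

population = nanvar_by_hand(a)                # ddof=0, divisor N
sample = nanvar_by_hand(a, ddof=1)            # ddof=1, divisor N - 1
```

With the three valid values 1, 3 and 4, the squared deviations sum to 42/9, so the two divisors give 14/9 and 7/3 respectively.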
-
dask.array.nan_to_num(*args, **kwargs)
Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords.
This docstring was copied from numpy.nan_to_num.
Some inconsistencies with the Dask version may exist.
If x is inexact, NaN is replaced by zero or by the user-defined value in the nan keyword, infinity is replaced by the largest finite floating point value representable by x.dtype or by the user-defined value in the posinf keyword, and -infinity is replaced by the most negative finite floating point value representable by x.dtype or by the user-defined value in the neginf keyword.
For complex dtypes, the above is applied to each of the real and imaginary components of x separately.
If x is not inexact, then no replacements are made.
- Parameters
- xscalar or array_like (Not supported in Dask)
Input data.
- copybool, optional (Not supported in Dask)
Whether to create a copy of x (True) or to replace values in-place (False). The in-place operation only occurs if casting to an array does not require a copy. Default is True.
New in version 1.13.
- nanint, float, optional (Not supported in Dask)
Value to be used to fill NaN values. If no value is passed then NaN values will be replaced with 0.0.
New in version 1.17.
- posinfint, float, optional (Not supported in Dask)
Value to be used to fill positive infinity values. If no value is passed then positive infinity values will be replaced with a very large number.
New in version 1.17.
- neginfint, float, optional (Not supported in Dask)
Value to be used to fill negative infinity values. If no value is passed then negative infinity values will be replaced with a very small (or negative) number.
New in version 1.17.
- Returns
- outndarray
x, with the non-finite values replaced. If copy is False, this may be x itself.
See also
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.
Examples
>>> np.nan_to_num(np.inf)
1.7976931348623157e+308
>>> np.nan_to_num(-np.inf)
-1.7976931348623157e+308
>>> np.nan_to_num(np.nan)
0.0
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([ 1.79769313e+308, -1.79769313e+308,  0.00000000e+000, # may vary
       -1.28000000e+002,  1.28000000e+002])
>>> np.nan_to_num(x, nan=-9999, posinf=33333333, neginf=33333333)
array([ 3.3333333e+07,  3.3333333e+07, -9.9990000e+03,
       -1.2800000e+02,  1.2800000e+02])
>>> y = np.array([complex(np.inf, np.nan), np.nan, complex(np.nan, np.inf)])
>>> np.nan_to_num(y)
array([  1.79769313e+308 +0.00000000e+000j, # may vary
         0.00000000e+000 +0.00000000e+000j,
         0.00000000e+000 +1.79769313e+308j])
>>> np.nan_to_num(y, nan=111111, posinf=222222)
array([222222.+111111.j, 111111.     +0.j, 111111.+222222.j])
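The default replacement values depend on the dtype of the input. The following sketch, using NumPy only, shows that for float32 the infinities map to the float32 limits reported by np.finfo, and that integer input is left untouched.

```python
import numpy as np

x32 = np.array([np.inf, -np.inf, np.nan], dtype=np.float32)
cleaned = np.nan_to_num(x32)
limits = np.finfo(np.float32)      # largest / most negative finite float32 values

ints = np.array([1, 2, 3])
unchanged = np.nan_to_num(ints)    # not inexact, so no replacements are made
```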
-
dask.array.nextafter(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.nextafter.
Some inconsistencies with the Dask version may exist.
Return the next floating-point value after x1 towards x2, element-wise.
- Parameters
- x1array_like
Values to find the next representable value of.
- x2array_like
The direction where to look for the next representable value of x1. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- outndarray or scalar
The next representable values of x1 in the direction of x2. This is a scalar if both x1 and x2 are scalars.
Examples
>>> eps = np.finfo(np.float64).eps
>>> np.nextafter(1, 2) == eps + 1
True
>>> np.nextafter([1, 2], [2, 1]) == [eps + 1, 2 - eps]
array([ True,  True])
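The direction argument matters because floating-point spacing is not symmetric around powers of two. A small NumPy sketch: stepping up from 1.0 moves by the machine epsilon, while stepping down moves by half of it.

```python
import numpy as np

eps = np.finfo(np.float64).eps

up = np.nextafter(1.0, 2.0)      # smallest float64 greater than 1.0
down = np.nextafter(1.0, 0.0)    # largest float64 less than 1.0
same = np.nextafter(1.0, 1.0)    # target equals the value, so nothing changes

gap_above = up - 1.0
gap_below = 1.0 - down
```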
-
dask.array.nonzero(a)
Return the indices of the elements that are non-zero.
This docstring was copied from numpy.nonzero.
Some inconsistencies with the Dask version may exist.
Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The values in a are always tested and returned in row-major, C-style order.
To group the indices by element, rather than dimension, use argwhere, which returns a row for each non-zero element.
Note
When called on a zero-d array or scalar, nonzero(a) is treated as nonzero(atleast_1d(a)).
Deprecated since version 1.17.0: Use atleast_1d explicitly if this behavior is deliberate.
- Parameters
- aarray_like
Input array.
- Returns
- tuple_of_arraystuple
Indices of elements that are non-zero.
See also
flatnonzero
Return indices that are non-zero in the flattened version of the input array.
ndarray.nonzero
Equivalent ndarray method.
count_nonzero
Counts the number of non-zero elements in the input array.
Notes
While the nonzero values can be obtained with a[nonzero(a)], it is recommended to use x[x.astype(bool)] or x[x != 0] instead, which will correctly handle 0-d arrays.
Examples
>>> x = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])
>>> x
array([[3, 0, 0],
       [0, 4, 0],
       [5, 6, 0]])
>>> np.nonzero(x)
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
>>> x[np.nonzero(x)]
array([3, 4, 5, 6])
>>> np.transpose(np.nonzero(x))
array([[0, 0],
       [1, 1],
       [2, 0],
       [2, 1]])
A common use for nonzero is to find the indices of an array where a condition is True. Given an array a, the condition a > 3 is a boolean array, and since False is interpreted as 0, np.nonzero(a > 3) yields the indices of a where the condition is true.
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a > 3
array([[False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
>>> np.nonzero(a > 3)
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
Using this result to index a is equivalent to using the mask directly:
>>> a[np.nonzero(a > 3)]
array([4, 5, 6, 7, 8, 9])
>>> a[a > 3]  # prefer this spelling
array([4, 5, 6, 7, 8, 9])
nonzero can also be called as a method of the array.
>>> (a > 3).nonzero()
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
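The difference between grouping indices by dimension (nonzero) and by element (argwhere) can be seen side by side. This is a NumPy-only sketch; the variable names are chosen for the illustration.

```python
import numpy as np

x = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])

by_dimension = np.nonzero(x)     # tuple: (row indices, column indices)
by_element = np.argwhere(x)      # one [row, column] pair per non-zero element
```

Transposing the nonzero result gives the same rows as argwhere.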
-
dask.array.notnull(values)
pandas.notnull for dask arrays
-
dask.array.ones(*args, **kwargs)
Blocked variant of ones
Follows the signature of ones exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.
Original signature follows below.
Return a new array of given shape and type, filled with ones.
- Parameters
- shapeint or sequence of ints
Shape of the new array, e.g., (2, 3) or 2.
- dtype : data-type, optional
The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
- order{‘C’, ‘F’}, optional, default: C
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
- Returns
- outndarray
Array of ones with the given shape, dtype, and order.
See also
Examples
>>> np.ones(5)
array([1., 1., 1., 1., 1.])
>>> np.ones((5,), dtype=int)
array([1, 1, 1, 1, 1])
>>> np.ones((2, 1))
array([[1.],
       [1.]])
>>> s = (2,2)
>>> np.ones(s)
array([[1., 1.],
       [1., 1.]])
-
dask.array.ones_like(a, dtype=None, order='C', chunks=None, name=None, shape=None)
Return an array of ones with the same shape and type as a given array.
- Parameters
- aarray_like
The shape and data-type of a define these same attributes of the returned array.
- dtypedata-type, optional
Overrides the data type of the result.
- order{‘C’, ‘F’}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
- chunkssequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
- name : str, optional
An optional keyname for the array. Defaults to hashing the input keyword arguments.
- shapeint or sequence of ints, optional.
Overrides the shape of the result.
- Returns
- outndarray
Array of ones with the same shape and type as a.
See also
zeros_like
Return an array of zeros with shape and type of input.
empty_like
Return an empty array with shape and type of input.
zeros
Return a new array setting values to zero.
ones
Return a new array setting values to one.
empty
Return a new uninitialized array.
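The remark about the last block in the chunks parameter above can be made concrete with plain arithmetic. The helper `block_sizes` below is hypothetical and written only for this illustration; it is not part of Dask and makes no claim about how Dask normalizes chunks internally.

```python
def block_sizes(length, chunk):
    """Sizes of consecutive blocks when `length` samples are cut every `chunk` samples.

    Hypothetical helper for illustration only; it is not part of Dask.
    """
    full, remainder = divmod(length, chunk)
    sizes = [chunk] * full
    if remainder:
        sizes.append(remainder)   # the last block holds the leftover samples
    return tuple(sizes)

even = block_sizes(12, 4)      # 12 % 4 == 0: all blocks are full
uneven = block_sizes(10, 4)    # 10 % 4 != 0: the last block is shorter
```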
-
dask.array.outer(a, b)
Compute the outer product of two vectors.
This docstring was copied from numpy.outer.
Some inconsistencies with the Dask version may exist.
Given two vectors, a = [a0, a1, ..., aM] and b = [b0, b1, ..., bN], the outer product [1] is:
[[a0*b0  a0*b1 ... a0*bN ]
 [a1*b0    .
 [ ...          .
 [aM*b0            aM*bN ]]
- Parameters
- a(M,) array_like
First input vector. Input is flattened if not already 1-dimensional.
- b(N,) array_like
Second input vector. Input is flattened if not already 1-dimensional.
- out(M, N) ndarray, optional
A location where the result is stored
New in version 1.9.0.
- Returns
- out(M, N) ndarray
out[i, j] = a[i] * b[j]
See also
inner
einsum
einsum('i,j->ij', a.ravel(), b.ravel()) is the equivalent.
ufunc.outer
A generalization to N dimensions and other operations. np.multiply.outer(a.ravel(), b.ravel()) is the equivalent.
References
[1] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD, Johns Hopkins University Press, 1996, pg. 8.
Examples
Make a (very coarse) grid for computing a Mandelbrot set:
>>> rl = np.outer(np.ones((5,)), np.linspace(-2, 2, 5))
>>> rl
array([[-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.]])
>>> im = np.outer(1j*np.linspace(2, -2, 5), np.ones((5,)))
>>> im
array([[0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j],
       [0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j],
       [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
       [0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j],
       [0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j]])
>>> grid = rl + im
>>> grid
array([[-2.+2.j, -1.+2.j,  0.+2.j,  1.+2.j,  2.+2.j],
       [-2.+1.j, -1.+1.j,  0.+1.j,  1.+1.j,  2.+1.j],
       [-2.+0.j, -1.+0.j,  0.+0.j,  1.+0.j,  2.+0.j],
       [-2.-1.j, -1.-1.j,  0.-1.j,  1.-1.j,  2.-1.j],
       [-2.-2.j, -1.-2.j,  0.-2.j,  1.-2.j,  2.-2.j]])
An example using a “vector” of letters:
>>> x = np.array(['a', 'b', 'c'], dtype=object)
>>> np.outer(x, [1, 2, 3])
array([['a', 'aa', 'aaa'],
       ['b', 'bb', 'bbb'],
       ['c', 'cc', 'ccc']], dtype=object)
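The equivalences listed under "See also" can be checked directly. The sketch below uses NumPy only and also spells out out[i, j] = a[i] * b[j] with broadcasting; the input vectors are arbitrary.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20])

product = np.outer(a, b)
via_einsum = np.einsum('i,j->ij', a.ravel(), b.ravel())
via_ufunc = np.multiply.outer(a.ravel(), b.ravel())
via_broadcast = a[:, None] * b[None, :]   # out[i, j] = a[i] * b[j]
```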
-
dask.array.pad(array, pad_width, mode='constant', **kwargs)
Pad an array.
This docstring was copied from numpy.pad.
Some inconsistencies with the Dask version may exist.
- Parameters
- arrayarray_like of rank N
The array to pad.
- pad_width{sequence, array_like, int}
Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
- modestr or function, optional
One of the following string values or a user supplied function.
- ‘constant’ (default)
Pads with a constant value.
- ‘edge’
Pads with the edge values of array.
- ‘linear_ramp’
Pads with the linear ramp between end_value and the array edge value.
- ‘maximum’
Pads with the maximum value of all or part of the vector along each axis.
- ‘mean’
Pads with the mean value of all or part of the vector along each axis.
- ‘median’
Pads with the median value of all or part of the vector along each axis.
- ‘minimum’
Pads with the minimum value of all or part of the vector along each axis.
- ‘reflect’
Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
- ‘symmetric’
Pads with the reflection of the vector mirrored along the edge of the array.
- ‘wrap’
Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
- ‘empty’
Pads with undefined values.
New in version 1.17.
- <function>
Padding function, see Notes.
- stat_lengthsequence or int, optional
Used in ‘maximum’, ‘mean’, ‘median’, and ‘minimum’. Number of values at edge of each axis used to calculate the statistic value.
((before_1, after_1), … (before_N, after_N)) unique statistic lengths for each axis.
((before, after),) yields same before and after statistic lengths for each axis.
(stat_length,) or int is a shortcut for before = after = statistic length for all axes.
Default is None, to use the entire axis.
- constant_values : sequence or scalar, optional
Used in ‘constant’. The values to set the padded values for each axis.
((before_1, after_1), ... (before_N, after_N)) unique pad constants for each axis.
((before, after),) yields same before and after constants for each axis.
(constant,) or constant is a shortcut for before = after = constant for all axes.
Default is 0.
- end_valuessequence or scalar, optional
Used in ‘linear_ramp’. The values used for the ending value of the linear_ramp and that will form the edge of the padded array.
((before_1, after_1), ... (before_N, after_N)) unique end values for each axis.
((before, after),) yields same before and after end values for each axis.
(constant,) or constant is a shortcut for before = after = constant for all axes.
Default is 0.
- reflect_type{‘even’, ‘odd’}, optional
Used in ‘reflect’, and ‘symmetric’. The ‘even’ style is the default with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.
- Returns
- padndarray
Padded array of rank equal to array with shape increased according to pad_width.
Notes
New in version 1.7.0.
For an array with rank greater than 1, some of the padding of later axes is calculated from padding of previous axes. This is easiest to think about with a rank 2 array where the corners of the padded array are calculated by using padded values from the first axis.
The padding function, if used, should modify a rank 1 array in-place. It has the following signature:
padding_func(vector, iaxis_pad_width, iaxis, kwargs)
where
- vectorndarray
A rank 1 array already padded with zeros. Padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
- iaxis_pad_widthtuple
A 2-tuple of ints, iaxis_pad_width[0] represents the number of values padded at the beginning of vector where iaxis_pad_width[1] represents the number of values padded at the end of vector.
- iaxisint
The axis currently being calculated.
- kwargsdict
Any keyword arguments the function requires.
Examples
>>> a = [1, 2, 3, 4, 5] >>> np.pad(a, (2, 3), 'constant', constant_values=(4, 6)) array([4, 4, 1, ..., 6, 6, 6])
>>> np.pad(a, (2, 3), 'edge') array([1, 1, 1, ..., 5, 5, 5])
>>> np.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4)) array([ 5, 3, 1, 2, 3, 4, 5, 2, -1, -4])
>>> np.pad(a, (2,), 'maximum') array([5, 5, 1, 2, 3, 4, 5, 5, 5])
>>> np.pad(a, (2,), 'mean') array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> np.pad(a, (2,), 'median') array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> a = [[1, 2], [3, 4]] >>> np.pad(a, ((3, 2), (2, 3)), 'minimum') array([[1, 1, 1, 2, 1, 1, 1], [1, 1, 1, 2, 1, 1, 1], [1, 1, 1, 2, 1, 1, 1], [1, 1, 1, 2, 1, 1, 1], [3, 3, 3, 4, 3, 3, 3], [1, 1, 1, 2, 1, 1, 1], [1, 1, 1, 2, 1, 1, 1]])
>>> a = [1, 2, 3, 4, 5] >>> np.pad(a, (2, 3), 'reflect') array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])
>>> np.pad(a, (2, 3), 'reflect', reflect_type='odd') array([-1, 0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> np.pad(a, (2, 3), 'symmetric') array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])
>>> np.pad(a, (2, 3), 'symmetric', reflect_type='odd') array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])
>>> np.pad(a, (2, 3), 'wrap') array([4, 5, 1, 2, 3, 4, 5, 1, 2, 3])
>>> def pad_with(vector, pad_width, iaxis, kwargs):
...     pad_value = kwargs.get('padder', 10)
...     vector[:pad_width[0]] = pad_value
...     vector[-pad_width[1]:] = pad_value
>>> a = np.arange(6)
>>> a = a.reshape((2, 3))
>>> np.pad(a, 2, pad_with)
array([[10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10,  0,  1,  2, 10, 10],
       [10, 10,  3,  4,  5, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10]])
>>> np.pad(a, 2, pad_with, padder=100)
array([[100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100,   0,   1,   2, 100, 100],
       [100, 100,   3,   4,   5, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100]])
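The description of reflect_type='odd' (subtracting the reflected values from two times the edge value) can be verified against the 'even' result. This is a NumPy-only sketch for a 1-D input; the names `left_by_hand` and `right_by_hand` are introduced here for the illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
before, after = 2, 3

even = np.pad(a, (before, after), 'reflect')
odd = np.pad(a, (before, after), 'reflect', reflect_type='odd')

# 'odd' padding = 2 * edge value - the 'even' reflected values
left_by_hand = 2 * a[0] - even[:before]
right_by_hand = 2 * a[-1] - even[-after:]
```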
-
dask.array.percentile(a, q, interpolation='linear', method='default')
Approximate percentile of 1-D array
- Parameters
- aArray
- qarray_like of float
Percentile or sequence of percentiles to compute, which must be between 0 and 100 inclusive.
- interpolation : {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, optional
The interpolation method to use when the desired percentile lies between two data points i < j. Only valid for method='dask'.
- ‘linear’: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
- ‘lower’: i.
- ‘higher’: j.
- ‘nearest’: i or j, whichever is nearest.
- ‘midpoint’: (i + j) / 2.
- method : {‘default’, ‘dask’, ‘tdigest’}, optional
What method to use. By default will use dask’s internal custom algorithm ('dask'). If set to 'tdigest' will use tdigest for floats and ints and fallback to the 'dask' otherwise.
See also
numpy.percentile
Numpy’s equivalent Percentile function
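The interpolation rules listed in the parameters above reduce to simple arithmetic on the two neighbouring data points. The function `interpolate` below is a hypothetical helper written for this illustration; it is not Dask's implementation, and its tie-breaking for 'nearest' is an arbitrary assumption.

```python
def interpolate(i, j, fraction, interpolation='linear'):
    """Combine neighbouring data points i < j (hypothetical helper, not Dask's code)."""
    if interpolation == 'linear':
        return i + (j - i) * fraction
    if interpolation == 'lower':
        return i
    if interpolation == 'higher':
        return j
    if interpolation == 'midpoint':
        return (i + j) / 2
    if interpolation == 'nearest':
        # Ties (fraction == 0.5) go to i here; this choice is an assumption.
        return i if fraction <= 0.5 else j
    raise ValueError(f"unknown interpolation: {interpolation!r}")

# Desired percentile falls a quarter of the way between 10.0 and 20.0.
results = {how: interpolate(10.0, 20.0, 0.25, how)
           for how in ('linear', 'lower', 'higher', 'midpoint', 'nearest')}
```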
-
dask.array.piecewise(x, condlist, funclist, *args, **kw)
Evaluate a piecewise-defined function.
This docstring was copied from numpy.piecewise.
Some inconsistencies with the Dask version may exist.
Given a set of conditions and corresponding functions, evaluate each function on the input data wherever its condition is true.
- Parameters
- xndarray or scalar
The input domain.
- condlistlist of bool arrays or bool scalars
Each boolean array corresponds to a function in funclist. Wherever condlist[i] is True, funclist[i](x) is used as the output value.
Each boolean array in condlist selects a piece of x, and should therefore be of the same shape as x.
The length of condlist must correspond to that of funclist. If one extra function is given, i.e. if len(funclist) == len(condlist) + 1, then that extra function is the default value, used wherever all conditions are false.
- funclist : list of callables, f(x,*args,**kw), or scalars
Each function is evaluated over x wherever its corresponding condition is True. It should take a 1d array as input and give a 1d array or a scalar value as output. If, instead of a callable, a scalar is provided then a constant function (lambda x: scalar) is assumed.
- args : tuple, optional
Any further arguments given to piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., 1, 'a'), then each function is called as f(x, 1, 'a').
- kw : dict, optional
Keyword arguments used in calling piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., alpha=1), then each function is called as f(x, alpha=1).
- Returns
- outndarray
The output is the same shape and type as x and is found by calling the functions in funclist on the appropriate portions of x, as defined by the boolean arrays in condlist. Portions not covered by any condition have a default value of 0.
Notes
This is similar to choose or select, except that functions are evaluated on elements of x that satisfy the corresponding condition from condlist.
The result is:
      |--
      |funclist[0](x[condlist[0]])
out = |funclist[1](x[condlist[1]])
      |...
      |funclist[n2](x[condlist[n2]])
      |--
Examples
Define the sigma function, which is -1 for x < 0 and +1 for x >= 0.
>>> x = np.linspace(-2.5, 2.5, 6)
>>> np.piecewise(x, [x < 0, x >= 0], [-1, 1])
array([-1., -1., -1.,  1.,  1.,  1.])
Define the absolute value, which is -x for x < 0 and x for x >= 0.
>>> np.piecewise(x, [x < 0, x >= 0], [lambda x: -x, lambda x: x])
array([2.5,  1.5,  0.5,  0.5,  1.5,  2.5])
Apply the same function to a scalar value.
>>> y = -2
>>> np.piecewise(y, [y < 0, y >= 0], [lambda x: -x, lambda x: x])
array(2)
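The rule in the Notes, including the optional extra default function, can be written out as a short loop. `piecewise_by_hand` is a hypothetical helper for this illustration: it handles callables only and is not NumPy's or Dask's implementation.

```python
import numpy as np

def piecewise_by_hand(x, condlist, funclist):
    """Illustrative re-implementation for callables only (not NumPy's or Dask's code)."""
    out = np.zeros_like(x)                 # portions not covered by any condition stay 0
    if len(funclist) == len(condlist) + 1:
        # The extra function applies wherever all conditions are false.
        condlist = list(condlist) + [~np.any(condlist, axis=0)]
    for cond, func in zip(condlist, funclist):
        out[cond] = func(x[cond])          # evaluate only on the selected elements
    return out

x = np.linspace(-2.5, 2.5, 6)
conds = [x < -1, x > 1]
funcs = [lambda v: -v, lambda v: v ** 2, lambda v: v * 0 + 7]   # last one is the default

expected = np.piecewise(x, conds, funcs)
by_hand = piecewise_by_hand(x, conds, funcs)
```

The two middle elements satisfy neither condition, so they receive the value from the default function.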
-
dask.array.prod(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Return the product of array elements over a given axis.
This docstring was copied from numpy.prod.
Some inconsistencies with the Dask version may exist.
- Parameters
- aarray_like
Input data.
- axisNone or int or tuple of ints, optional
Axis or axes along which a product is performed. The default, axis=None, will calculate the product of all the elements in the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, a product is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
- dtypedtype, optional
The type of the returned array, as well as of the accumulator in which the elements are multiplied. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.
- outndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.
- keepdimsbool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the prod method of sub-classes of ndarray; however, any non-default value will be. If the sub-class's method does not implement keepdims, an exception will be raised.
- initialscalar, optional (Not supported in Dask)
The starting value for this product. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- wherearray_like of bool, optional (Not supported in Dask)
Elements to include in the product. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- product_along_axisndarray, see dtype parameter above.
An array shaped as a but with the specified axis removed. Returns a reference to out if specified.
See also
ndarray.prod
equivalent method
ufuncs-output-type
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow. That means that, on a 32-bit platform:
>>> x = np.array([536870910, 536870910, 536870910, 536870910])
>>> np.prod(x)
16 # may vary
The product of an empty array is the neutral element 1:
>>> np.prod([]) 1.0
Examples
By default, calculate the product of all elements:
>>> np.prod([1.,2.]) 2.0
Even when the input array is two-dimensional:
>>> np.prod([[1.,2.],[3.,4.]]) 24.0
But we can also specify the axis over which to multiply:
>>> np.prod([[1.,2.],[3.,4.]], axis=1) array([ 2., 12.])
Or select specific elements to include:
>>> np.prod([1., np.nan, 3.], where=[True, False, True]) 3.0
If the type of x is unsigned, then the output type is the unsigned platform integer:
>>> x = np.array([1, 2, 3], dtype=np.uint8) >>> np.prod(x).dtype == np.uint True
If x is of a signed integer type, then the output type is the default platform integer:
>>> x = np.array([1, 2, 3], dtype=np.int8) >>> np.prod(x).dtype == int True
You can also start the product with a value other than one:
>>> np.prod([1, 2], initial=5) 10
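The modular arithmetic mentioned in the Notes can be reproduced on any platform by forcing a 32-bit accumulator. This is a NumPy-only sketch; the exact product is computed with Python integers, which do not overflow.

```python
import numpy as np

values = [536870910] * 4
x = np.array(values, dtype=np.int32)

wrapped = np.prod(x, dtype=np.int32)   # 32-bit accumulator: the result wraps around

exact = 1
for v in values:                       # Python integers give the exact product
    exact *= v
```

Since 536870910 = 2**29 - 2, the exact product is congruent to 16 modulo 2**32, which is the value the 32-bit reduction returns.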
-
dask.array.ptp(a, axis=None)
Range of values (maximum - minimum) along an axis.
This docstring was copied from numpy.ptp.
Some inconsistencies with the Dask version may exist.
The name of the function comes from the acronym for ‘peak to peak’.
- Parameters
- aarray_like
Input values.
- axisNone or int or tuple of ints, optional
Axis along which to find the peaks. By default, flatten the array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.15.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
- outarray_like (Not supported in Dask)
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type of the output values will be cast if necessary.
- keepdimsbool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the ptp method of sub-classes of ndarray; however, any non-default value will be. If the sub-class's method does not implement keepdims, an exception will be raised.
- Returns
- ptpndarray
A new array holding the result, unless out was specified, in which case a reference to out is returned.
Examples
>>> x = np.arange(4).reshape((2,2))
>>> x
array([[0, 1],
       [2, 3]])
>>> np.ptp(x, axis=0) array([2, 2])
>>> np.ptp(x, axis=1) array([1, 1])
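Since ptp is defined as maximum minus minimum, it can be cross-checked against the two reductions it combines. A NumPy-only sketch:

```python
import numpy as np

x = np.arange(4).reshape((2, 2))

spread_cols = np.ptp(x, axis=0)
by_hand_cols = x.max(axis=0) - x.min(axis=0)
spread_all = np.ptp(x)                 # axis=None: range of the flattened array
```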
-
dask.array.rad2deg(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.rad2deg.
Some inconsistencies with the Dask version may exist.
Convert angles from radians to degrees.
- Parameters
- xarray_like
Angle in radians.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- yndarray
The corresponding angle in degrees. This is a scalar if x is a scalar.
See also
deg2rad
Convert angles from degrees to radians.
unwrap
Remove large jumps in angle by wrapping.
Notes
New in version 1.3.0.
rad2deg(x) is 180 * x / pi.
Examples
>>> np.rad2deg(np.pi/2)
90.0
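The identity from the Notes, rad2deg(x) = 180 * x / pi, and the round trip through deg2rad can be confirmed numerically. A NumPy-only sketch with arbitrary sample angles:

```python
import numpy as np

x = np.array([0.0, np.pi / 6, np.pi / 2, np.pi])

degrees = np.rad2deg(x)
by_formula = 180 * x / np.pi
round_trip = np.deg2rad(degrees)       # converting back recovers the radians
```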
-
dask.array.radians(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.radians.
Some inconsistencies with the Dask version may exist.
Convert angles from degrees to radians.
- Parameters
- xarray_like
Input array in degrees.
- outndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- wherearray_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y: ndarray
The corresponding radian values. This is a scalar if x is a scalar.
See also
deg2rad
equivalent function
Examples
Convert a degree array to radians
>>> deg = np.arange(12.) * 30.
>>> np.radians(deg)
array([ 0.        ,  0.52359878,  1.04719755,  1.57079633,  2.0943951 ,
        2.61799388,  3.14159265,  3.66519143,  4.1887902 ,  4.71238898,
        5.23598776,  5.75958653])
>>> out = np.zeros((deg.shape))
>>> ret = np.radians(deg, out)
>>> ret is out
True
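The examples above are NumPy's. As a minimal sketch of the Dask counterpart (the chunk size of 4 is an arbitrary choice for illustration), the same conversion can be applied lazily to a chunked array and checked against NumPy:

```python
import numpy as np
import dask.array as da

# Degrees 0, 30, ..., 330 split into three chunks of four elements each.
deg = da.arange(12, chunks=4) * 30.0

rad = da.radians(deg)      # lazy: only builds a task graph
result = rad.compute()     # evaluates the chunks and returns a NumPy array

expected = np.radians(np.arange(12) * 30.0)
print(np.allclose(result, expected))
```

Nothing is computed until compute() is called, and the chunk structure of the input carries over to the result.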
- dask.array.ravel(array)
Return a contiguous flattened array.
This docstring was copied from numpy.ravel.
Some inconsistencies with the Dask version may exist.
A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.
As of NumPy 1.10, the returned array will have the same type as the input array (for example, a masked array will be returned for a masked array input).
- Parameters
- a: array_like (Not supported in Dask)
Input array. The elements in a are read in the order specified by order, and packed as a 1-D array.
- order: {‘C’, ‘F’, ‘A’, ‘K’}, optional (Not supported in Dask)
The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used.
- Returns
- y: array_like
y is an array of the same subtype as a, with shape (a.size,). Note that matrices are special cased for backward compatibility: if a is a matrix, then y is a 1-D ndarray.
See also
ndarray.flat
1-D iterator over an array.
ndarray.flatten
1-D array copy of the elements of an array in row-major order.
ndarray.reshape
Change the shape of an array without changing its data.
Notes
In row-major, C-style order, in two dimensions, the row index varies the slowest, and the column index the quickest. This can be generalized to multiple dimensions, where row-major order implies that the index along the first axis varies slowest, and the index along the last quickest. The opposite holds for column-major, Fortran-style index ordering.
When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.
Examples
It is equivalent to reshape(-1, order=order).
>>> x = np.array([[1, 2, 3], [4, 5, 6]])
>>> np.ravel(x)
array([1, 2, 3, 4, 5, 6])
>>> x.reshape(-1)
array([1, 2, 3, 4, 5, 6])
>>> np.ravel(x, order='F')
array([1, 4, 2, 5, 3, 6])
When order is ‘A’, it will preserve the array’s ‘C’ or ‘F’ ordering:
>>> np.ravel(x.T)
array([1, 4, 2, 5, 3, 6])
>>> np.ravel(x.T, order='A')
array([1, 2, 3, 4, 5, 6])
When order is ‘K’, it will preserve orderings that are neither ‘C’ nor ‘F’, but won’t reverse axes:
>>> a = np.arange(3)[::-1]; a
array([2, 1, 0])
>>> a.ravel(order='C')
array([2, 1, 0])
>>> a.ravel(order='K')
array([2, 1, 0])
>>> a = np.arange(12).reshape(2,3,2).swapaxes(1,2); a
array([[[ 0,  2,  4],
        [ 1,  3,  5]],
       [[ 6,  8, 10],
        [ 7,  9, 11]]])
>>> a.ravel(order='C')
array([ 0,  2,  4,  1,  3,  5,  6,  8, 10,  7,  9, 11])
>>> a.ravel(order='K')
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
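Since the order parameter is marked as not supported in Dask, a Dask-side sketch only covers the default row-major flattening. The chunking below is an arbitrary choice for illustration:

```python
import numpy as np
import dask.array as da

# A 2x3 array stored as two row chunks.
x = da.from_array(np.array([[1, 2, 3], [4, 5, 6]]), chunks=(1, 3))

flat = da.ravel(x)             # lazy, row-major flattening
flat_values = flat.compute()   # NumPy array with the flattened values

print(flat.shape)
print(flat_values)
```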
- dask.array.real(*args, **kwargs)
Return the real part of the complex argument.
This docstring was copied from numpy.real.
Some inconsistencies with the Dask version may exist.
- Parameters
- val: array_like (Not supported in Dask)
Input array.
- Returns
- out: ndarray or scalar
The real component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.
Examples
>>> a = np.array([1+2j, 3+4j, 5+6j])
>>> a.real
array([1., 3., 5.])
>>> a.real = 9
>>> a
array([9.+2.j, 9.+4.j, 9.+6.j])
>>> a.real = np.array([9, 8, 7])
>>> a
array([9.+2.j, 8.+4.j, 7.+6.j])
>>> np.real(1 + 1j)
1.0
- dask.array.rechunk(x, chunks='auto', threshold=None, block_size_limit=None)
Convert blocks in dask array x for new chunks.
- Parameters
- x: dask array
Array to be rechunked.
- chunks: int, tuple, dict or str, optional
The new block dimensions to create. -1 indicates the full size of the corresponding dimension. Default is “auto” which automatically determines chunk sizes.
- threshold: int, optional
The graph growth factor under which we don’t bother introducing an intermediate step.
- block_size_limit: int, optional
The maximum block size (in bytes) we want to produce. Defaults to the configuration value array.chunk-size.
Examples
>>> import dask.array as da
>>> x = da.ones((1000, 1000), chunks=(100, 100))
Specify uniform chunk sizes with a tuple
>>> y = x.rechunk((1000, 10))
Or chunk only specific dimensions with a dictionary
>>> y = x.rechunk({0: 1000})
Use the value -1 to specify that you want a single chunk along a dimension or the value "auto" to specify that dask can freely rechunk a dimension to attain blocks of a uniform block size
>>> y = x.rechunk({0: -1, 1: 'auto'}, block_size_limit=1e8)
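To see what a rechunk actually changes, it can help to inspect the block layout before and after. The following is a small sketch using the same array shape as the examples above; the variable names are illustrative only:

```python
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))

# Tuple form: every dimension gets a new uniform chunk size.
y = x.rechunk((1000, 10))

# Dict form: only axis 0 is changed (-1 means one chunk spanning the axis);
# axis 1 keeps its existing chunks of 100.
z = x.rechunk({0: -1})

print(x.numblocks)   # blocks per dimension before rechunking
print(y.numblocks)
print(z.numblocks)
```

The total shape never changes; only the way the array is partitioned into blocks does.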
- dask.array.reduction(x, chunk, aggregate, axis=None, keepdims=False, dtype=None, split_every=None, combine=None, name=None, out=None, concatenate=True, output_size=1, meta=None)
General version of reductions
- Parameters
- x: Array
Data being reduced along one or more axes
- chunk: callable(x_chunk, axis, keepdims)
First function to be executed when resolving the dask graph. This function is applied in parallel to all original chunks of x. See below for function parameters.
- combine: callable(x_chunk, axis, keepdims), optional
Function used for intermediate recursive aggregation (see split_every below). If omitted, it defaults to aggregate. If the reduction can be performed in fewer than 3 steps, it will not be invoked at all.
- aggregate: callable(x_chunk, axis, keepdims)
Last function to be executed when resolving the dask graph, producing the final output. It is always invoked, even when the reduced Array counts a single chunk along the reduced axes.
- axis: int or sequence of ints, optional
Axis or axes to aggregate upon. If omitted, aggregate along all axes.
- keepdims: boolean, optional
Whether the reduction function should preserve the reduced axes, leaving them at size output_size, or remove them.
- dtype: np.dtype, optional
Force output dtype. Defaults to x.dtype if omitted.
- split_every: int >= 2 or dict(axis: int), optional
Determines the depth of the recursive aggregation. If set to a value greater than or equal to the number of input chunks, the aggregation will be performed in two steps: one chunk function per input chunk and a single aggregate function at the end. If set to less than that, an intermediate combine function will be used, so that any one combine or aggregate function has no more than split_every inputs. The depth of the aggregation graph will be $\log_{\text{split\_every}}(\text{input chunks along reduced axes})$. Setting to a low value can reduce cache size and network transfers, at the cost of more CPU and a larger dask graph.
Omit to let dask heuristically decide a good default. A default can also be set globally with the split_every key in dask.config.
- name: str, optional
Prefix of the keys of the intermediate and output nodes. If omitted it defaults to the function names.
- out: Array, optional
Another dask array whose contents will be replaced. Omit to create a new one. Note that, unlike in numpy, this setting gives no performance benefits whatsoever, but can still be useful if one needs to preserve the references to a previously existing Array.
- concatenate: bool, optional
If True (the default), the outputs of the chunk/combine functions are concatenated into a single np.array before being passed to the combine/aggregate functions. If False, the input of combine and aggregate will be either a list of the raw outputs of the previous step or a single output, and the function will have to concatenate it itself. It can be useful to set this to False if the chunk and/or combine steps do not produce np.arrays.
- output_size: int >= 1, optional
Size of the output of the aggregate function along the reduced axes. Ignored if keepdims is False.
- Returns
- dask array
- Function Parameters
- x_chunk: numpy.ndarray
Individual input chunk. For chunk functions, it is one of the original chunks of x. For combine and aggregate functions, it’s the concatenation of the outputs produced by the previous chunk or combine functions. If concatenate=False, it’s a list of the raw outputs from the previous functions.
- axis: tuple
Normalized list of axes to reduce upon, e.g. (0, ). Scalar, negative, and None axes have been normalized away. Note that some numpy reduction functions cannot reduce along multiple axes at once and strictly require an int in input. Such functions have to be wrapped to cope.
- keepdims: bool
Whether the reduction function should preserve the reduced axes or remove them.
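As an illustration of how the chunk and aggregate callables fit together, here is a minimal sketch of a sum-of-squares reduction. The helper names chunk_sumsq and agg_sum are hypothetical, and the chunk size is arbitrary; this is not how Dask's built-in reductions are necessarily implemented:

```python
import numpy as np
import dask.array as da

def chunk_sumsq(x_chunk, axis, keepdims):
    # Runs once per original chunk: square the values, then sum them.
    return np.sum(x_chunk ** 2, axis=axis, keepdims=keepdims)

def agg_sum(x_chunk, axis, keepdims):
    # Receives the concatenated partial results (concatenate=True by default)
    # and adds them up. Also used as combine, since combine is omitted.
    return np.sum(x_chunk, axis=axis, keepdims=keepdims)

x = da.arange(10, chunks=3)   # four chunks: 3 + 3 + 3 + 1 elements

total = da.reduction(x, chunk=chunk_sumsq, aggregate=agg_sum,
                     axis=0, dtype=x.dtype)
value = total.compute()

# With split_every=2 the four partial results are merged pairwise first,
# so an intermediate combine level is inserted before the final aggregate.
tree = da.reduction(x, chunk=chunk_sumsq, aggregate=agg_sum,
                    axis=0, dtype=x.dtype, split_every=2)
tree_value = tree.compute()

print(value, tree_value)
```

Reusing the aggregate function as combine only works here because summation is associative; a reduction such as a mean would need partial results that carry both a sum and a count.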
- dask.array.repeat(a, repeats, axis=None)
Repeat elements of an array.
This docstring was copied from numpy.repeat.
Some inconsistencies with the Dask version may exist.
- Parameters
- a: array_like
Input array.
- repeats: int or array of ints
The number of repetitions for each element. repeats is broadcasted to fit the shape of the given axis.
- axis: int, optional
The axis along which to repeat values. By default, use the flattened input array, and return a flat output array.
- Returns
- repeated_array: ndarray
Output array which has the same shape as a, except along the given axis.
See also
tile
Tile an array.
Examples
>>> np.repeat(3, 4)
array([3, 3, 3, 3])
>>> x = np.array([[1,2],[3,4]])
>>> np.repeat(x, 2)
array([1, 1, 2, 2, 3, 3, 4, 4])
>>> np.repeat(x, 3, axis=1)
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])
>>> np.repeat(x, [1, 2], axis=0)
array([[1, 2],
       [3, 4],
       [3, 4]])
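The docstring above is NumPy's, and the Dask implementation may be more restrictive (as an assumption, it may require an integer repeats and an explicit axis for multi-dimensional input). The sketch below therefore sticks to an integer count and an explicit axis:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[1, 2], [3, 4]]), chunks=(1, 2))

# Repeat every element three times along axis 1.
rep = da.repeat(x, 3, axis=1)
rep_values = rep.compute()

print(rep.shape)
print(rep_values)
```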
- dask.array.reshape(x, shape)
Reshape array to new shape
This is a parallelized version of the np.reshape function with the following limitations:
1. It assumes that the array is stored in row-major order
2. It only allows for reshapings that collapse or merge dimensions like (1, 2, 3, 4) -> (1, 6, 4) or (64,) -> (4, 4, 4)
When communication is necessary this algorithm depends on the logic within rechunk. It endeavors to keep chunk sizes roughly the same when possible.
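The two kinds of reshape named above, splitting one dimension and merging adjacent ones, can be sketched as follows. The sizes mirror the (64,) -> (4, 4, 4) example, and the chunk size of 16 is an arbitrary choice:

```python
import numpy as np
import dask.array as da

x = da.arange(64, chunks=16)

# Split a single dimension into three: (64,) -> (4, 4, 4).
y = x.reshape((4, 4, 4))

# Merge the first two dimensions back together: (4, 4, 4) -> (16, 4).
z = y.reshape((16, 4))

same = np.array_equal(z.compute(), np.arange(64).reshape(16, 4))
print(y.shape, z.shape, same)
```

Both results agree with NumPy's row-major reshape; Dask may rechunk internally to make the new block boundaries line up.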
- dask.array.result_type(*arrays_and_dtypes)
This docstring was copied from numpy.result_type.
Some inconsistencies with the Dask version may exist.
Returns the type that results from applying the NumPy type promotion rules to the arguments.
Type promotion in NumPy works similarly to the rules in languages like C++, with some slight differences. When both scalars and arrays are used, the array’s type takes precedence and the actual value of the scalar is taken into account.
For example, calculating 3*a, where a is an array of 32-bit floats, intuitively should result in a 32-bit float output. If the 3 is a 32-bit integer, the NumPy rules indicate it can’t convert losslessly into a 32-bit float, so a 64-bit float should be the result type. By examining the value of the constant, ‘3’, we see that it fits in an 8-bit integer, which can be cast losslessly into the 32-bit float.
- Parameters
- arrays_and_dtypes: list of arrays and dtypes
The operands of some operation whose result type is needed.
- Returns
- out: dtype
The result type.
See also
dtype, promote_types, min_scalar_type, can_cast
Notes
New in version 1.6.0.
The specific algorithm used is as follows.
Categories are determined by first checking which of boolean, integer (int/uint), or floating point (float/complex) the maximum kind of all the arrays and the scalars are.
If there are only scalars or the maximum category of the scalars is higher than the maximum category of the arrays, the data types are combined with promote_types() to produce the return value.
Otherwise, min_scalar_type is called on each array, and the resulting data types are all combined with promote_types() to produce the return value.
The set of int values is not a subset of the uint values for types with the same number of bits, something not reflected in min_scalar_type(), but handled as a special case in result_type.
Examples
>>> np.result_type(3, np.arange(7, dtype='i1'))
dtype('int8')
>>> np.result_type('i4', 'c8')
dtype('complex128')
>>> np.result_type(3.0, -2)
dtype('float64')
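A short sketch of the Dask version with two arrays of different dtypes follows. It only inspects metadata, so nothing is computed; the array sizes are arbitrary:

```python
import numpy as np
import dask.array as da

f4 = da.ones(3, dtype="f4", chunks=3)   # 32-bit floats
i8 = da.ones(3, dtype="i8", chunks=3)   # 64-bit integers

# int64 cannot be represented losslessly in float32, so both promote to float64.
promoted = da.result_type(f4, i8)

# The same promotion is applied when the arrays are combined.
actual = (f4 + i8).dtype

print(promoted, actual)
```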
- dask.array.rint(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.rint.
Some inconsistencies with the Dask version may exist.
Round elements of the array to the nearest integer.
- Parameters
- x: array_like
Input array.
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- out: ndarray or scalar
Output array is same shape and type as x. This is a scalar if x is a scalar.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.rint(a)
array([-2., -2., -0.,  0.,  2.,  2.,  2.])
- dask.array.roll(array, shift, axis=None)
Roll array elements along a given axis.
This docstring was copied from numpy.roll.
Some inconsistencies with the Dask version may exist.
Elements that roll beyond the last position are re-introduced at the first.
- Parameters
- a: array_like (Not supported in Dask)
Input array.
- shift: int or tuple of ints
The number of places by which elements are shifted. If a tuple, then axis must be a tuple of the same size, and each of the given axes is shifted by the corresponding number. If an int while axis is a tuple of ints, then the same value is used for all given axes.
- axis: int or tuple of ints, optional
Axis or axes along which elements are shifted. By default, the array is flattened before shifting, after which the original shape is restored.
- Returns
- res: ndarray
Output array, with the same shape as a.
See also
rollaxis
Roll the specified axis backwards, until it lies in a given position.
Notes
New in version 1.12.0.
Supports rolling over multiple dimensions simultaneously.
Examples
>>> x = np.arange(10)
>>> np.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> np.roll(x, -2)
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
>>> x2 = np.reshape(x, (2,5))
>>> x2
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> np.roll(x2, 1)
array([[9, 0, 1, 2, 3],
       [4, 5, 6, 7, 8]])
>>> np.roll(x2, -1)
array([[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 0]])
>>> np.roll(x2, 1, axis=0)
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, -1, axis=0)
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, 1, axis=1)
array([[4, 0, 1, 2, 3],
       [9, 5, 6, 7, 8]])
>>> np.roll(x2, -1, axis=1)
array([[1, 2, 3, 4, 0],
       [6, 7, 8, 9, 5]])
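The same rolls can be sketched on a chunked Dask array. Elements that cross a chunk boundary are handled by Dask, so the results should match the NumPy examples above; the chunk size of 5 is arbitrary:

```python
import dask.array as da

x = da.arange(10, chunks=5)

# Flat roll: the last two elements wrap around to the front.
rolled = da.roll(x, 2).compute()

# Roll each row of a 2x5 view by one position along axis 1.
x2 = x.reshape((2, 5))
rolled_rows = da.roll(x2, 1, axis=1).compute()

print(rolled)
print(rolled_rows)
```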
- dask.array.rollaxis(a, axis, start=0)
- dask.array.round(a, decimals=0)
Round an array to the given number of decimals.
This docstring was copied from numpy.round.
Some inconsistencies with the Dask version may exist.
See also
around
equivalent function; see around for details.
- dask.array.sign(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.sign.
Some inconsistencies with the Dask version may exist.
Returns an element-wise indication of the sign of a number.
The sign function returns -1 if x < 0, 0 if x==0, 1 if x > 0. nan is returned for nan inputs.
For complex inputs, the sign function returns sign(x.real) + 0j if x.real != 0 else sign(x.imag) + 0j.
complex(nan, 0) is returned for complex nan inputs.
- Parameters
- x: array_like
Input values.
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y: ndarray
The sign of x. This is a scalar if x is a scalar.
Notes
There is more than one definition of sign in common use for complex numbers. The definition used here is equivalent to $x/\sqrt{x*x}$, which is different from a common alternative, $x/|x|$.
Examples
>>> np.sign([-5., 4.5])
array([-1.,  1.])
>>> np.sign(0)
0
>>> np.sign(5-2j)
(1+0j)
- dask.array.signbit(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.signbit.
Some inconsistencies with the Dask version may exist.
Returns element-wise True where signbit is set (less than zero).
- Parameters
- x: array_like
The input value(s).
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- result: ndarray of bool
Output array, or reference to out if that was supplied. This is a scalar if x is a scalar.
Examples
>>> np.signbit(-1.2)
True
>>> np.signbit(np.array([1, -2.3, 2.1]))
array([False,  True, False])
- dask.array.sin(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.sin.
Some inconsistencies with the Dask version may exist.
Trigonometric sine, element-wise.
- Parameters
- x: array_like
Angle, in radians (2π rad equals 360 degrees).
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y: array_like
The sine of each element of x. This is a scalar if x is a scalar.
Notes
The sine is one of the fundamental functions of trigonometry (the mathematical study of triangles). Consider a circle of radius 1 centered on the origin. A ray comes in from the +x axis, makes an angle at the origin (measured counter-clockwise from that axis), and departs from the origin. The y coordinate of the outgoing ray’s intersection with the unit circle is the sine of that angle. It ranges from -1 for x=3π/2 to +1 for π/2. The function has zeroes where the angle is a multiple of π. Sines of angles between π and 2π are negative. The numerous properties of the sine and related functions are included in any standard trigonometry text.
Examples
Print sine of one angle:
>>> np.sin(np.pi/2.)
1.0
Print sines of an array of angles given in degrees:
>>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180. )
array([ 0.        ,  0.5       ,  0.70710678,  0.8660254 ,  1.        ])
Plot the sine function:
>>> import matplotlib.pylab as plt
>>> x = np.linspace(-np.pi, np.pi, 201)
>>> plt.plot(x, np.sin(x))
>>> plt.xlabel('Angle [rad]')
>>> plt.ylabel('sin(x)')
>>> plt.axis('tight')
>>> plt.show()
- dask.array.sinh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.sinh.
Some inconsistencies with the Dask version may exist.
Hyperbolic sine, element-wise.
Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).
- Parameters
- x: array_like
Input array.
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y: ndarray
The corresponding hyperbolic sine values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83.
Examples
>>> np.sinh(0)
0.0
>>> np.sinh(np.pi*1j/2)
1j
>>> np.sinh(np.pi*1j) # (exact value is 0)
1.2246063538223773e-016j
>>> # Discrepancy due to vagaries of floating point arithmetic.
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d')
>>> out2 = np.sinh([0.1], out1)
>>> out2 is out1
True
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.sinh(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
- dask.array.sqrt(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.sqrt.
Some inconsistencies with the Dask version may exist.
Return the non-negative square-root of an array, element-wise.
- Parameters
- x: array_like
The values whose square-roots are required.
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y: ndarray
An array of the same shape as x, containing the positive square-root of each element in x. If any element in x is complex, a complex array is returned (and the square-roots of negative reals are calculated). If all of the elements in x are real, so is y, with negative elements returning nan. If out was provided, y is a reference to it. This is a scalar if x is a scalar.
See also
lib.scimath.sqrt
A version which returns complex numbers when given negative reals.
Notes
sqrt has, consistent with common convention, as its branch cut the real “interval” [-inf, 0), and is continuous from above on it. A branch cut is a curve in the complex plane across which a given complex function fails to be continuous.
Examples
>>> np.sqrt([1,4,9])
array([ 1.,  2.,  3.])
>>> np.sqrt([4, -1, -3+4J])
array([ 2.+0.j,  0.+1.j,  1.+2.j])
>>> np.sqrt([4, -1, np.inf])
array([ 2., nan, inf])
- dask.array.square(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.square.
Some inconsistencies with the Dask version may exist.
Return the element-wise square of the input.
- Parameters
- x: array_like
Input data.
- out: ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where: array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- out: ndarray or scalar
Element-wise x*x, of the same shape and dtype as x. This is a scalar if x is a scalar.
Examples
>>> np.square([-1j, 1])
array([-1.-0.j,  1.+0.j])
- dask.array.squeeze(a, axis=None)
Remove single-dimensional entries from the shape of an array.
This docstring was copied from numpy.squeeze.
Some inconsistencies with the Dask version may exist.
- Parameters
- a: array_like
Input data.
- axis: None or int or tuple of ints, optional
New in version 1.7.0.
Selects a subset of the single-dimensional entries in the shape. If an axis is selected with shape entry greater than one, an error is raised.
- Returns
- squeezed: ndarray
The input array, but with all or a subset of the dimensions of length 1 removed. This is always a itself or a view into a.
- Raises
- ValueError
If axis is not None, and an axis being squeezed is not of length 1
See also
expand_dims
The inverse operation, adding singleton dimensions
reshape
Insert, remove, and combine dimensions, and resize existing ones
Examples
>>> x = np.array([[[0], [1], [2]]])
>>> x.shape
(1, 3, 1)
>>> np.squeeze(x).shape
(3,)
>>> np.squeeze(x, axis=0).shape
(3, 1)
>>> np.squeeze(x, axis=1).shape
Traceback (most recent call last):
...
ValueError: cannot select an axis to squeeze out which has size not equal to one
>>> np.squeeze(x, axis=2).shape
(1, 3)
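For a Dask array the shapes behave the same way, and they can be checked without computing anything because shape is metadata. A minimal sketch, with a single chunk chosen for simplicity:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[[0], [1], [2]]]), chunks=(1, 3, 1))

all_squeezed = da.squeeze(x)          # drops both length-1 axes
first_only = da.squeeze(x, axis=0)    # drops only the leading axis
last_only = da.squeeze(x, axis=2)     # drops only the trailing axis

print(x.shape, all_squeezed.shape, first_only.shape, last_only.shape)
```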
- dask.array.stack(seq, axis=0, allow_unknown_chunksizes=False)
Stack arrays along a new axis
Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)
- Parameters
- seq: list of dask.arrays
- axis: int
Dimension along which to align all of the arrays
- allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)
Result is a new dask Array
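Beyond the shape, it can be useful to see how stacking affects chunking. In the sketch below each input is filled with a distinct value (an illustrative choice) so that the slices of the result are easy to tell apart:

```python
import numpy as np
import dask.array as da

# Three 4x4 arrays filled with 0, 1 and 2, each split into 2x2 chunks.
data = [da.from_array(np.full((4, 4), i), chunks=(2, 2)) for i in range(3)]

x = da.stack(data, axis=0)

print(x.shape)    # the new axis has one entry per input array
print(x.chunks)   # each input becomes its own chunk along the new axis

middle = x[1].compute()   # the slice that came from the second input
```

Because every input keeps its own chunks, the inputs need matching shapes, and the new axis gets one chunk of length 1 per input.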
- dask.array.std(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
Compute the standard deviation along the specified axis.
This docstring was copied from numpy.std.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation, a measure of the spread of a distribution, of the array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.
- Parameters
- a: array_like
Calculate the standard deviation of these values.
- axis: None or int or tuple of ints, optional
Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a standard deviation is performed over multiple axes, instead of a single axis or all the axes as before.
- dtype: dtype, optional
Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.
- out: ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.
- ddof: int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
- keepdims: bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the std method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- Returns
- standard_deviation: ndarray, see dtype parameter above.
If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array.
Notes
The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(abs(x - x.mean())**2)).
The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.
For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, 2], [3, 4]]) >>> np.std(a) 1.1180339887498949 # may vary >>> np.std(a, axis=0) array([1., 1.]) >>> np.std(a, axis=1) array([0.5, 0.5])
In single precision, std() can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32) >>> a[0, :] = 1.0 >>> a[1, :] = 0.1 >>> np.std(a) 0.45000005
Computing the standard deviation in float64 is more accurate:
>>> np.std(a, dtype=np.float64)
0.44999999925494177 # may vary
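As a minimal sketch of the ddof rule described in the Notes, the snippet below recomputes the standard deviation by hand with NumPy. The array `x` and the variable names are illustrative assumptions, not part of the original docstring.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = x.size

# Sum of squared absolute deviations from the mean.
squared_dev = np.sum(np.abs(x - x.mean()) ** 2)

# ddof=0 (the default) divides by N.
pop_std = np.sqrt(squared_dev / n)
# ddof=1 divides by N - 1.
sample_std = np.sqrt(squared_dev / (n - 1))

print(pop_std, np.std(x))
print(sample_std, np.std(x, ddof=1))
```

Each printed pair should agree up to floating-point rounding.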
dask.array.sum(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Sum of array elements over a given axis.
This docstring was copied from numpy.sum.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like
Elements to sum.
- axis : None or int or tuple of ints, optional
Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
- dtype : dtype, optional
The type of the returned array and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.
- out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the sum method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- initial : scalar, optional (Not supported in Dask)
Starting value for the sum. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
- where : array_like of bool, optional (Not supported in Dask)
Elements to include in the sum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
- Returns
- sum_along_axis : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, a scalar is returned. If an output array is specified, a reference to out is returned.
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
The sum of an empty array is the neutral element 0:
>>> np.sum([])
0.0
For floating point numbers the numerical precision of sum (and np.add.reduce) is in general limited by directly adding each number individually to the result, causing rounding errors in every step. However, often numpy will use a numerically better approach (partial pairwise summation) leading to improved precision in many use-cases. This improved precision is always provided when no axis is given. When axis is given, it will depend on which axis is summed. Technically, to provide the best speed possible, the improved precision is only used when the summation is along the fast axis in memory. Note that the exact precision may vary depending on other parameters. In contrast to NumPy, Python’s math.fsum function uses a slower but more precise approach to summation. Especially when summing a large number of lower precision floating point numbers, such as float32, numerical errors can become significant. In such cases it can be advisable to use dtype="float64" to use a higher precision for the output.
Examples
>>> np.sum([0.5, 1.5])
2.0
>>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32)
1
>>> np.sum([[0, 1], [0, 5]])
6
>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])
>>> np.sum([[0, 1], [np.nan, 5]], where=[False, True], axis=1)
array([1., 5.])
If the accumulator is too small, overflow occurs:
>>> np.ones(128, dtype=np.int8).sum(dtype=np.int8)
-128
You can also start the sum with a value other than zero:
>>> np.sum([10], initial=5)
15
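The examples above use NumPy. As a hedged sketch of the Dask-side call, the snippet below applies `da.sum` to a small chunked array, using only the keywords that appear in the signature above. The array contents and chunk sizes are illustrative assumptions.

```python
import numpy as np
import dask.array as da

x = np.arange(12, dtype=np.int64).reshape(3, 4)
d = da.from_array(x, chunks=(2, 2))

# Sum over all axes; nothing is computed until .compute() is called.
total = da.sum(d).compute()
# Reduce along axis 0 only.
per_column = da.sum(d, axis=0).compute()
# keepdims leaves the reduced axis in place with size one.
kept = da.sum(d, axis=1, keepdims=True)
# split_every is the Dask-specific keyword from the signature above.
tree = da.sum(d, axis=0, split_every=2).compute()

print(total, per_column, kept.shape, tree)
```

Changing `split_every` should not change the numerical result here, only how the intermediate chunk results are combined.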
dask.array.take(a, indices, axis=0)
Take elements from an array along an axis.
This docstring was copied from numpy.take.
Some inconsistencies with the Dask version may exist.
When axis is not None, this function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis. A call such as np.take(arr, indices, axis=3) is equivalent to arr[:,:,:,indices,...].
Explained without fancy indexing, this is equivalent to the following use of ndindex, which sets each of ii, jj, and kk to a tuple of indices:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
Nj = indices.shape
for ii in ndindex(Ni):
    for jj in ndindex(Nj):
        for kk in ndindex(Nk):
            out[ii + jj + kk] = a[ii + (indices[jj],) + kk]
- Parameters
- a : array_like (Ni…, M, Nk…)
The source array.
- indices : array_like (Nj…)
The indices of the values to extract.
New in version 1.8.0.
Also allow scalars for indices.
- axis : int, optional
The axis over which to select values. In NumPy, the flattened input array is used by default; the Dask signature shown above defaults to axis=0.
- out : ndarray, optional (Ni…, Nj…, Nk…)
If provided, the result will be placed in this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode='raise'; use other modes for better performance.
- mode : {‘raise’, ‘wrap’, ‘clip’}, optional (Not supported in Dask)
Specifies how out-of-bounds indices will behave.
‘raise’ – raise an error (default)
‘wrap’ – wrap around
‘clip’ – clip to the range
‘clip’ mode means that all indices that are too large are replaced by the index that addresses the last element along that axis. Note that this disables indexing with negative numbers.
- Returns
- out : ndarray (Ni…, Nj…, Nk…)
The returned array has the same type as a.
See also
compress
Take elements using a boolean mask
ndarray.take
equivalent method
take_along_axis
Take elements by matching the array and the index arrays
Notes
By eliminating the inner loop in the description above, and using s_ to build simple slice objects, take can be expressed in terms of applying fancy indexing to each 1-d slice:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        out[ii + s_[...,] + kk] = a[ii + s_[:,] + kk][indices]
For this reason, it is equivalent to (but faster than) the following use of apply_along_axis:
out = np.apply_along_axis(lambda a_1d: a_1d[indices], axis, a)
Examples
>>> a = [4, 3, 5, 7, 6, 8]
>>> indices = [0, 1, 4]
>>> np.take(a, indices)
array([4, 3, 6])
In this example if a is an ndarray, “fancy” indexing can be used.
>>> a = np.array(a)
>>> a[indices]
array([4, 3, 6])
If indices is not one dimensional, the output also has these dimensions.
>>> np.take(a, [[0, 1], [2, 3]])
array([[4, 3],
       [5, 7]])
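The equivalence between take along an axis, fancy indexing, and the ndindex loop can be checked directly. The following is a small NumPy sketch under assumed inputs (the 2x3x4 array and the index list are made up for illustration).

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
indices = np.asarray([2, 0])
axis = 1

# take along axis 1 versus the equivalent fancy-indexing expression.
taken = np.take(a, indices, axis=axis)
fancy = a[:, indices, :]

# The explicit ndindex formulation from the description.
Ni, Nk = a.shape[:axis], a.shape[axis + 1:]
Nj = indices.shape
out = np.empty(Ni + Nj + Nk, dtype=a.dtype)
for ii in np.ndindex(Ni):
    for jj in np.ndindex(Nj):
        for kk in np.ndindex(Nk):
            out[ii + jj + kk] = a[ii + (indices[jj],) + kk]

print(taken.shape, np.array_equal(taken, fancy), np.array_equal(taken, out))
```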
dask.array.tan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.tan.
Some inconsistencies with the Dask version may exist.
Compute tangent element-wise.
Equivalent to np.sin(x)/np.cos(x) element-wise.
- Parameters
- x : array_like
Input array.
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y : ndarray
The corresponding tangent values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.
Examples
>>> from math import pi
>>> np.tan(np.array([-pi,pi/2,pi]))
array([ 1.22460635e-16, 1.63317787e+16, -1.22460635e-16])
>>>
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d')
>>> out2 = np.tan([0.1], out1)
>>> out2 is out1
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tan(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
dask.array.tanh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.tanh.
Some inconsistencies with the Dask version may exist.
Compute hyperbolic tangent element-wise.
Equivalent to np.sinh(x)/np.cosh(x) or -1j * np.tan(1j*x).
- Parameters
- x : array_like
Input array.
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y : ndarray
The corresponding hyperbolic tangent values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83. http://www.math.sfu.ca/~cbm/aands/
[2] Wikipedia, “Hyperbolic function”, https://en.wikipedia.org/wiki/Hyperbolic_function
Examples
>>> np.tanh((0, np.pi*1j, np.pi*1j/2))
array([ 0. +0.00000000e+00j, 0. -1.22460635e-16j, 0. +1.63317787e+16j])
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d')
>>> out2 = np.tanh([0.1], out1)
>>> out2 is out1
True
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tanh(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
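The two equivalent forms quoted in the description, np.sinh(x)/np.cosh(x) and -1j * np.tan(1j*x), can be compared numerically. This is a small NumPy sketch; the sample points in `x` are an arbitrary assumption.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 9)

direct = np.tanh(x)
# Ratio of hyperbolic sine and cosine.
ratio = np.sinh(x) / np.cosh(x)
# Complex-argument identity; the result is real up to rounding.
via_tan = (-1j * np.tan(1j * x)).real

print(np.allclose(direct, ratio), np.allclose(direct, via_tan))
```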
dask.array.tensordot(lhs, rhs, axes=2)
Compute tensor dot product along specified axes.
This docstring was copied from numpy.tensordot.
Some inconsistencies with the Dask version may exist.
Given two tensors, a and b, and an array_like object containing two array_like objects, (a_axes, b_axes), sum the products of a’s and b’s elements (components) over the axes specified by a_axes and b_axes. The third argument can be a single non-negative integer_like scalar, N; if it is such, then the last N dimensions of a and the first N dimensions of b are summed over.
- Parameters
- a, b : array_like (named lhs and rhs in the Dask signature above)
Tensors to “dot”.
- axes : int or (2,) array_like
integer_like: If an int N, sum over the last N axes of a and the first N axes of b in order. The sizes of the corresponding axes must match.
(2,) array_like: Or, a list of axes to be summed over, the first sequence applying to a, the second to b. Both array_like elements must be of the same length.
- Returns
- output : ndarray
The tensor dot product of the input.
Notes
Three common use cases are:
axes = 0 : tensor product a⊗b
axes = 1 : tensor dot product a⋅b
axes = 2 : (default) tensor double contraction a:b
When axes is integer_like, the sequence for evaluation will be: first the -Nth axis in a and 0th axis in b, and the -1th axis in a and Nth axis in b last.
When there is more than one axis to sum over - and they are not the last (first) axes of a (b) - the argument axes should consist of two sequences of the same length, with the first axis to sum over given first in both sequences, the second axis second, and so forth.
The shape of the result consists of the non-contracted axes of the first tensor, followed by the non-contracted axes of the second.
Examples
A “traditional” example:
>>> a = np.arange(60.).reshape(3,4,5)
>>> b = np.arange(24.).reshape(4,3,2)
>>> c = np.tensordot(a,b, axes=([1,0],[0,1]))
>>> c.shape
(5, 2)
>>> c
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> # A slower but equivalent way of computing the same...
>>> d = np.zeros((5,2))
>>> for i in range(5):
...   for j in range(2):
...     for k in range(3):
...       for n in range(4):
...         d[i,j] += a[k,n,i] * b[n,k,j]
>>> c == d
array([[ True, True],
       [ True, True],
       [ True, True],
       [ True, True],
       [ True, True]])
An extended example taking advantage of the overloading of + and *:
>>> a = np.array(range(1, 9))
>>> a.shape = (2, 2, 2)
>>> A = np.array(('a', 'b', 'c', 'd'), dtype=object)
>>> A.shape = (2, 2)
>>> a; A
array([[[1, 2],
        [3, 4]],
       [[5, 6],
        [7, 8]]])
array([['a', 'b'],
       ['c', 'd']], dtype=object)
>>> np.tensordot(a, A) # third argument default is 2 for double-contraction
array(['abbcccdddd', 'aaaaabbbbbbcccccccdddddddd'], dtype=object)
>>> np.tensordot(a, A, 1)
array([[['acc', 'bdd'],
        ['aaacccc', 'bbbdddd']],
       [['aaaaacccccc', 'bbbbbdddddd'],
        ['aaaaaaacccccccc', 'bbbbbbbdddddddd']]], dtype=object)
>>> np.tensordot(a, A, 0) # tensor product (result too long to incl.)
array([[[[['a', 'b'], ['c', 'd']], ...
>>> np.tensordot(a, A, (0, 1))
array([[['abbbbb', 'cddddd'],
        ['aabbbbbb', 'ccdddddd']],
       [['aaabbbbbbb', 'cccddddddd'],
        ['aaaabbbbbbbb', 'ccccdddddddd']]], dtype=object)
>>> np.tensordot(a, A, (2, 1))
array([[['abb', 'cdd'],
        ['aaabbbb', 'cccdddd']],
       [['aaaaabbbbbb', 'cccccdddddd'],
        ['aaaaaaabbbbbbbb', 'cccccccdddddddd']]], dtype=object)
>>> np.tensordot(a, A, ((0, 1), (0, 1)))
array(['abbbcccccddddddd', 'aabbbbccccccdddddddd'], dtype=object)
>>> np.tensordot(a, A, ((2, 1), (1, 0)))
array(['acccbbdddd', 'aaaaacccccccbbbbbbdddddddd'], dtype=object)
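To make the three integer cases of axes and the result-shape rule concrete, here is a short NumPy sketch. The shapes are chosen only for illustration, and the einsum comparison is an added cross-check rather than part of the original docstring.

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(12.0).reshape(3, 4)
c = np.arange(20.0).reshape(4, 5)

# axes=0: no contraction, result shape is a.shape + b.shape.
outer = np.tensordot(a, b, axes=0)
# axes=1: last axis of a (size 4) contracted with first axis of c (size 4).
single = np.tensordot(a, c, axes=1)
# axes=2: last two axes of a contracted with the first two axes of b.
double = np.tensordot(a, b, axes=2)
# The same double contraction written with explicit axis pairs.
explicit = np.tensordot(a, b, axes=([1, 2], [0, 1]))
# Cross-check with an equivalent einsum expression.
reference = np.einsum("ijk,jk->i", a, b)

print(outer.shape, single.shape, double.shape)
```

In each case the result shape is the non-contracted axes of the first operand followed by those of the second.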
dask.array.tile(A, reps)
Construct an array by repeating A the number of times given by reps.
This docstring was copied from numpy.tile.
Some inconsistencies with the Dask version may exist.
If reps has length d, the result will have dimension of max(d, A.ndim).
If A.ndim < d, A is promoted to be d-dimensional by prepending new axes. So a shape (3,) array is promoted to (1, 3) for 2-D replication, or shape (1, 1, 3) for 3-D replication. If this is not the desired behavior, promote A to d-dimensions manually before calling this function.
If A.ndim > d, reps is promoted to A.ndim by pre-pending 1’s to it. Thus for an A of shape (2, 3, 4, 5), a reps of (2, 2) is treated as (1, 1, 2, 2).
Note: Although tile may be used for broadcasting, it is strongly recommended to use numpy’s broadcasting operations and functions.
- Parameters
- A : array_like
The input array.
- reps : array_like
The number of repetitions of A along each axis.
- Returns
- c : ndarray
The tiled output array.
See also
repeat
Repeat elements of an array.
broadcast_to
Broadcast an array to a new shape
Examples
>>> a = np.array([0, 1, 2])
>>> np.tile(a, 2)
array([0, 1, 2, 0, 1, 2])
>>> np.tile(a, (2, 2))
array([[0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2]])
>>> np.tile(a, (2, 1, 2))
array([[[0, 1, 2, 0, 1, 2]],
       [[0, 1, 2, 0, 1, 2]]])
>>> b = np.array([[1, 2], [3, 4]])
>>> np.tile(b, 2)
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
>>> np.tile(b, (2, 1))
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
>>> c = np.array([1,2,3,4])
>>> np.tile(c,(4,1))
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
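The two promotion rules (A promoted when A.ndim < d, reps promoted when A.ndim > d) are easiest to see from result shapes. The sketch below uses NumPy with made-up arrays; the comparison with broadcast_to illustrates the note about preferring broadcasting.

```python
import numpy as np

a = np.array([0, 1, 2])       # shape (3,)
b = np.ones((2, 3, 4, 5))     # shape (2, 3, 4, 5)

# A.ndim < len(reps): a is treated as shape (1, 3) before tiling.
shape_promoted_a = np.tile(a, (2, 2)).shape
# A.ndim > len(reps): reps (2, 2) is treated as (1, 1, 2, 2).
shape_promoted_reps = np.tile(b, (2, 2)).shape

# Repeating rows with tile gives the same values as a broadcast view.
tiled_rows = np.tile(a, (2, 1))
broadcast_rows = np.broadcast_to(a, (2, 3))

print(shape_promoted_a, shape_promoted_reps)
```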
dask.array.topk(a, k, axis=-1, split_every=None)
Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.
This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.
- Parameters
- a: Array
Data being sorted
- k: int
- axis: int, optional
- split_every: int >=2, optional
See reduce(). This parameter becomes very important when k is of the same order of magnitude as the chunk size or larger, because it prevents the whole input array, or a significant portion of it, from being held in memory all at once, which would also hurt network transfer when running on distributed.
- Returns
- Selection of a with size abs(k) along the given axis.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.topk(2).compute()
array([6, 5])
>>> d.topk(-2).compute()
array([1, 3])
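The example above is one-dimensional. As a hedged sketch of how the axis argument and a negative k interact on a 2-D input, the snippet below calls `da.topk` with the signature shown above; the data and chunking are illustrative assumptions.

```python
import numpy as np
import dask.array as da

x = np.array([[5, 1, 3, 6],
              [2, 8, 7, 4]])
d = da.from_array(x, chunks=(1, 2))

# Two largest values per row, sorted from largest to smallest.
largest = da.topk(d, 2, axis=-1).compute()
# Negative k: two smallest values per row, sorted from smallest to largest.
smallest = da.topk(d, -2, axis=-1).compute()
# Largest value per column (the result keeps a length-1 axis 0).
per_column = da.topk(d, 1, axis=0).compute()

print(largest, smallest, per_column, sep="\n")
```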
dask.array.transpose(a, axes=None)
Permute the dimensions of an array.
This docstring was copied from numpy.transpose.
Some inconsistencies with the Dask version may exist.
- Parameters
- a : array_like
Input array.
- axes : list of ints, optional
By default, reverse the dimensions, otherwise permute the axes according to the values given.
- Returns
- p : ndarray
a with its axes permuted. A view is returned whenever possible.
See also
moveaxis
argsort
Notes
Use transpose(a, argsort(axes)) to invert the transposition of tensors when using the axes keyword argument.
Transposing a 1-D array returns an unchanged view of the original array.
Examples
>>> x = np.arange(4).reshape((2,2))
>>> x
array([[0, 1],
       [2, 3]])
>>> np.transpose(x)
array([[0, 2],
       [1, 3]])
>>> x = np.ones((1, 2, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(2, 1, 3)
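The note about inverting a transposition with argsort can be illustrated with a short NumPy sketch. The array and the permutation `axes` are arbitrary choices for the illustration.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
axes = (2, 0, 1)

# Permute: result axis i is input axis axes[i].
y = np.transpose(x, axes)
# argsort(axes) is the inverse permutation, so this undoes the transpose.
restored = np.transpose(y, np.argsort(axes))

print(y.shape, restored.shape, np.array_equal(restored, x))
```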
dask.array.tril(m, k=0)
Lower triangle of an array with elements above the k-th diagonal zeroed.
- Parameters
- m : array_like, shape (M, M)
Input array.
- k : int, optional
Diagonal above which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.
- Returns
- tril : ndarray, shape (M, M)
Lower triangle of m, of same shape and data-type as m.
See also
triu
upper triangle of an array
dask.array.triu(m, k=0)
Upper triangle of an array with elements below the k-th diagonal zeroed.
- Parameters
- m : array_like, shape (M, N)
Input array.
- k : int, optional
Diagonal below which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.
- Returns
- triu : ndarray, shape (M, N)
Upper triangle of m, of same shape and data-type as m.
See also
tril
lower triangle of an array
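Neither tril nor triu comes with an example here, so the following is a minimal sketch of how the k argument splits a matrix. It assumes a square input with square chunks (the 4x4 array and `chunks=(2, 2)` are illustrative) and compares against NumPy's functions of the same name.

```python
import numpy as np
import dask.array as da

m = np.arange(16).reshape(4, 4)
d = da.from_array(m, chunks=(2, 2))

# k=0: keep the main diagonal and everything below it.
lower = da.tril(d).compute()
# k=1: keep only what lies strictly above the main diagonal.
strict_upper = da.triu(d, k=1).compute()

print(lower)
print(strict_upper)
```

Because the two selections do not overlap, their sum reproduces the original matrix.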
dask.array.trunc(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.trunc.
Some inconsistencies with the Dask version may exist.
Return the truncated value of the input, element-wise.
The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
- Parameters
- x : array_like
Input data.
- out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
- where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
- **kwargs
For other keyword-only arguments, see the ufunc docs.
- Returns
- y : ndarray or scalar
The truncated value of each element in x. This is a scalar if x is a scalar.
Notes
New in version 1.3.0.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.trunc(a)
array([-1., -1., -0., 0., 1., 1., 2.])
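Since truncation discards the fractional part, it behaves like ceil for negative inputs and like floor for non-negative ones. A small NumPy sketch of that relationship, with sample values chosen for illustration:

```python
import numpy as np

a = np.array([-1.7, -0.2, 0.2, 1.7])

truncated = np.trunc(a)
floored = np.floor(a)
# Round toward zero: ceil below zero, floor otherwise.
piecewise = np.where(a < 0, np.ceil(a), np.floor(a))

print(truncated, floored, piecewise)
```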
dask.array.unique(ar, return_index=False, return_inverse=False, return_counts=False)
Find the unique elements of an array.
This docstring was copied from numpy.unique.
Some inconsistencies with the Dask version may exist.
Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:
- the indices of the input array that give the unique values
- the indices of the unique array that reconstruct the input array
- the number of times each unique value comes up in the input array
- Parameters
- ar : array_like
Input array. Unless axis is specified, this will be flattened if it is not already 1-D.
- return_index : bool, optional
If True, also return the indices of ar (along the specified axis, if provided, or in the flattened array) that result in the unique array.
- return_inverse : bool, optional
If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.
- return_counts : bool, optional
If True, also return the number of times each unique item appears in ar.
New in version 1.9.0.
- axis : int or None, optional (Not supported in Dask)
The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis, see the notes for more details. Object arrays or structured arrays that contain objects are not supported if the axis kwarg is used. The default is None.
New in version 1.13.0.
- Returns
- unique : ndarray
The sorted unique values.
- unique_indices : ndarray, optional
The indices of the first occurrences of the unique values in the original array. Only provided if return_index is True.
- unique_inverse : ndarray, optional
The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.
- unique_counts : ndarray, optional
The number of times each of the unique values comes up in the original array. Only provided if return_counts is True.
New in version 1.9.0.
See also
numpy.lib.arraysetops
Module with a number of other functions for performing set operations on arrays.
Notes
When an axis is specified the subarrays indexed by the axis are sorted. This is done by making the specified axis the first dimension of the array (move the axis to the first dimension to keep the order of the other axes) and then flattening the subarrays in C order. The flattened subarrays are then viewed as a structured type with each element given a label, with the effect that we end up with a 1-D array of structured types that can be treated in the same way as any other 1-D array. The result is that the flattened subarrays are sorted in lexicographic order starting with the first element.
Examples
>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])
Return the unique rows of a 2D array
>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> np.unique(a, axis=0)
array([[1, 0, 0],
       [2, 3, 4]])
Return the indices of the original array that give the unique values:
>>> a = np.array(['a', 'b', 'b', 'c', 'a'])
>>> u, indices = np.unique(a, return_index=True)
>>> u
array(['a', 'b', 'c'], dtype='<U1')
>>> indices
array([0, 1, 3])
>>> a[indices]
array(['a', 'b', 'c'], dtype='<U1')
Reconstruct the input array from the unique values:
>>> a = np.array([1, 2, 6, 4, 2, 3, 2])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u
array([1, 2, 3, 4, 6])
>>> indices
array([0, 1, 4, ..., 1, 2, 1])
>>> u[indices]
array([1, 2, 6, ..., 2, 3, 2])
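The examples above cover return_index and return_inverse but not return_counts. The following NumPy sketch, reusing the same sample values, shows the third optional output; the `np.repeat` round trip is an added illustration.

```python
import numpy as np

a = np.array([1, 2, 6, 4, 2, 3, 2])

# Sorted unique values together with how often each one occurs.
values, counts = np.unique(a, return_counts=True)
# Repeating each value by its count rebuilds the sorted input.
rebuilt = np.repeat(values, counts)

print(values, counts)
```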
dask.array.unravel_index(indices, shape, order='C')
This docstring was copied from numpy.unravel_index.
Some inconsistencies with the Dask version may exist.
Converts a flat index or array of flat indices into a tuple of coordinate arrays.
- Parameters
- indices : array_like
An integer array whose elements are indices into the flattened version of an array of dimensions shape. Before version 1.6.0, this function accepted just one index value.
- shape : tuple of ints
The shape of the array to use for unraveling indices.
Changed in version 1.16.0: Renamed from dims to shape.
- order : {‘C’, ‘F’}, optional
Determines whether the indices should be viewed as indexing in row-major (C-style) or column-major (Fortran-style) order.
New in version 1.6.0.
- Returns
- unraveled_coords : tuple of ndarray
Each array in the tuple has the same shape as the indices array.
See also
ravel_multi_index
Examples
>>> np.unravel_index([22, 41, 37], (7,6))
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index([31, 41, 13], (7,6), order='F')
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index(1621, (6,7,8,9))
(3, 1, 4, 1)
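The arithmetic behind the conversion is repeated integer division. The pure-Python sketch below reproduces single results from the examples above; `unravel_c` and `unravel_f` are hypothetical helper names introduced only for this illustration.

```python
def unravel_c(flat, shape):
    """Row-major (C order): the last axis varies fastest."""
    coords = []
    for dim in reversed(shape):
        flat, remainder = divmod(flat, dim)
        coords.append(remainder)
    return tuple(reversed(coords))


def unravel_f(flat, shape):
    """Column-major (Fortran order): the first axis varies fastest."""
    coords = []
    for dim in shape:
        flat, remainder = divmod(flat, dim)
        coords.append(remainder)
    return tuple(coords)


print(unravel_c(1621, (6, 7, 8, 9)))
print(unravel_c(22, (7, 6)), unravel_f(31, (7, 6)))
```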
dask.array.var(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
Compute the variance along the specified axis.
This docstring was copied from numpy.var.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.
- Parameters
- a : array_like
Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.
- axis : None or int or tuple of ints, optional
Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a variance is performed over multiple axes, instead of a single axis or all the axes as before.
- dtype : data-type, optional
Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
- out : ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.
- ddof : int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the var method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
- Returns
- variance : ndarray, see dtype parameter above
If out=None, returns a new array containing the variance; otherwise, a reference to the output array is returned.
Notes
The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.
For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, 2], [3, 4]])
>>> np.var(a)
1.25
>>> np.var(a, axis=0)
array([1., 1.])
>>> np.var(a, axis=1)
array([0.25, 0.25])
In single precision, var() can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.var(a)
0.20250003
Computing the variance in float64 is more accurate:
>>> np.var(a, dtype=np.float64)
0.20249999932944759 # may vary
>>> ((1-0.55)**2 + (0.1-0.55)**2)/2
0.2025
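The tuple form of axis and the keepdims option are described above but not exercised in the examples. The sketch below, using NumPy and an arbitrary 2x3x4 array, reduces over two axes at once and checks the result against the defining formula.

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Variance over axes 0 and 2 together; keepdims leaves them as size-1 axes.
v = np.var(a, axis=(0, 2), keepdims=True)

# The same quantity from the definition mean(abs(x - x.mean())**2).
centered = a - a.mean(axis=(0, 2), keepdims=True)
manual = (np.abs(centered) ** 2).mean(axis=(0, 2), keepdims=True)

# Because the reduced axes are kept, v broadcasts against a.
scaled = centered / np.sqrt(v)

print(v.shape, scaled.shape, np.allclose(v, manual))
```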
dask.array.vdot(a, b)
This docstring was copied from numpy.vdot.
Some inconsistencies with the Dask version may exist.
Return the dot product of two vectors.
The vdot(a, b) function handles complex numbers differently than dot(a, b). If the first argument is complex the complex conjugate of the first argument is used for the calculation of the dot product.
Note that vdot handles multidimensional arrays differently than dot: it does not perform a matrix product, but flattens input arguments to 1-D vectors first. Consequently, it should only be used for vectors.
- Parameters
- a : array_like
If a is complex the complex conjugate is taken before calculation of the dot product.
- b : array_like
Second argument to the dot product.
- Returns
- output : ndarray
Dot product of a and b. Can be an int, float, or complex depending on the types of a and b.
See also
dot
Return the dot product without using the complex conjugate of the first argument.
Examples
>>> a = np.array([1+2j,3+4j])
>>> b = np.array([5+6j,7+8j])
>>> np.vdot(a, b)
(70-8j)
>>> np.vdot(b, a)
(70+8j)
Note that higher-dimensional arrays are flattened!
>>> a = np.array([[1, 4], [5, 6]])
>>> b = np.array([[4, 1], [2, 2]])
>>> np.vdot(a, b)
30
>>> np.vdot(b, a)
30
>>> 1*4 + 4*1 + 5*2 + 6*2
30
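To show the difference from dot explicitly, the NumPy sketch below reuses the vectors from the first example and compares vdot with a dot product taken after conjugating the first argument, and with dot on flattened 2-D inputs.

```python
import numpy as np

a = np.array([1 + 2j, 3 + 4j])
b = np.array([5 + 6j, 7 + 8j])

with_conj = np.vdot(a, b)           # conjugates a first
explicit = np.dot(a.conj(), b)      # the same thing written out
plain = np.dot(a, b)                # no conjugation

# For multidimensional input, vdot flattens both arguments.
m = np.array([[1, 4], [5, 6]])
n = np.array([[4, 1], [2, 2]])
flattened = np.dot(m.ravel(), n.ravel())

print(with_conj, explicit, plain, flattened)
```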
dask.array.vstack(tup, allow_unknown_chunksizes=False)
Stack arrays in sequence vertically (row wise).
This docstring was copied from numpy.vstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
- Parameters
- tup : sequence of ndarrays
The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.
- Returns
- stacked : ndarray
The array formed by stacking the given arrays, will be at least 2-D.
See also
stack
Join a sequence of arrays along a new axis.
hstack
Stack arrays in sequence horizontally (column wise).
dstack
Stack arrays in sequence depth wise (along third dimension).
concatenate
Join a sequence of arrays along an existing axis.
vsplit
Split array into a list of multiple sub-arrays vertically.
block
Assemble arrays from blocks.
Examples
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
       [2, 3, 4]])
>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a,b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
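The statement that vstack is concatenation along the first axis after 1-D arrays are reshaped to (1, N) can be verified with a few lines of NumPy. The arrays are the ones from the first example; using `atleast_2d` for the reshape is an illustrative choice.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

stacked = np.vstack((a, b))
# Promote each 1-D array to shape (1, N), then join along axis 0.
manual = np.concatenate([np.atleast_2d(a), np.atleast_2d(b)], axis=0)

print(stacked.shape, np.array_equal(stacked, manual))
```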
dask.array.where(condition[, x, y])
This docstring was copied from numpy.where.
Some inconsistencies with the Dask version may exist.
Return elements chosen from x or y depending on condition.
Note
When only condition is provided, this function is a shorthand for np.asarray(condition).nonzero(). Using nonzero directly should be preferred, as it behaves correctly for subclasses. The rest of this documentation covers only the case where all three arguments are provided.
- Parameters
- condition : array_like, bool
Where True, yield x, otherwise yield y.
- x, y : array_like
Values from which to choose. x, y and condition need to be broadcastable to some shape.
- Returns
- out : ndarray
An array with elements from x where condition is True, and elements from y elsewhere.
Notes
If all the arrays are 1-D, where is equivalent to:
[xv if c else yv for c, xv, yv in zip(condition, x, y)]
Examples
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.where(a < 5, a, 10*a)
array([ 0, 1, 2, 3, 4, 50, 60, 70, 80, 90])
This can be used on multidimensional arrays too:
>>> np.where([[True, False], [True, True]], ... [[1, 2], [3, 4]], ... [[9, 8], [7, 6]]) array([[1, 8], [3, 4]])
The shapes of x, y, and the condition are broadcast together:
>>> x, y = np.ogrid[:3, :4] >>> np.where(x < y, x, 10 + y) # both x and 10+y are broadcast array([[10, 0, 0, 0], [10, 11, 1, 1], [10, 11, 12, 2]])
>>> a = np.array([[0, 1, 2], ... [0, 2, 4], ... [0, 3, 6]]) >>> np.where(a < 4, a, -1) # -1 is broadcast array([[ 0, 1, 2], [ 0, 2, -1], [ 0, 3, -1]])
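The examples in this entry come from NumPy. As a minimal sketch, assuming dask is installed, the same three-argument call on a chunked Dask array might look like the following. The chunk size of 5 is an arbitrary choice for illustration, and the result stays lazy until compute() is called.

```python
import dask.array as da

# A 1-D Dask array split into two chunks of five elements (chunk size is arbitrary).
a = da.arange(10, chunks=5)

# Builds a lazy task graph; no values are computed at this point.
result = da.where(a < 5, a, 10 * a)

# compute() evaluates the graph and returns a NumPy array.
computed = result.compute()
print(computed)
```

Because the operation is element-wise, the output keeps the chunking of its input.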
- dask.array.zeros(*args, **kwargs)
Blocked variant of zeros
Follows the signature of zeros exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.
Original signature follows below.
zeros(shape, dtype=float, order='C')
Return a new array of given shape and type, filled with zeros.
- Parameters
- shape: int or tuple of ints
Shape of the new array, e.g., (2, 3) or 2.
- dtype: data-type, optional
The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
- order: {'C', 'F'}, optional, default: 'C'
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
- Returns
- out: ndarray
Array of zeros with the given shape, dtype, and order.
See also
zeros_like
Return an array of zeros with shape and type of input.
empty
Return a new uninitialized array.
ones
Return a new array setting values to one.
full
Return a new array of given shape filled with value.
Examples
>>> np.zeros(5)
array([ 0.,  0.,  0.,  0.,  0.])

>>> np.zeros((5,), dtype=int)
array([0, 0, 0, 0, 0])

>>> np.zeros((2, 1))
array([[ 0.],
       [ 0.]])

>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
       [ 0.,  0.]])

>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')])  # custom dtype
array([(0, 0), (0, 0)],
      dtype=[('x', '<i4'), ('y', '<i4')])
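The NumPy examples above do not exercise the extra chunks keyword that the blocked variant adds. A short sketch, assuming dask is installed and using an arbitrary 6 x 4 shape split into 3 x 2 blocks, could look like this:

```python
import dask.array as da

# Request a 6x4 array of zeros stored as four blocks of shape (3, 2).
z = da.zeros((6, 4), chunks=(3, 2))

# chunks reports the block sizes along each axis.
print(z.chunks)
print(z.dtype)

# Reductions are lazy; compute() returns the concrete value.
total = z.sum().compute()
```

Nothing is allocated until a computation is requested, so the chunk shape mainly controls how much memory each task touches.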
- dask.array.zeros_like(a, dtype=None, order='C', chunks=None, name=None, shape=None)
Return an array of zeros with the same shape and type as a given array.
- Parameters
- a: array_like
The shape and data-type of a define these same attributes of the returned array.
- dtype: data-type, optional
Overrides the data type of the result.
- order: {'C', 'F'}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
- chunks: sequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
- name: str, optional
An optional keyname for the array. Defaults to hashing the input keyword arguments.
- shape: int or sequence of ints, optional
Overrides the shape of the result.
- Returns
- out: ndarray
Array of zeros with the same shape and type as a.
See also
ones_like
Return an array of ones with shape and type of input.
empty_like
Return an empty array with shape and type of input.
zeros
Return a new array setting values to zero.
ones
Return a new array setting values to one.
empty
Return a new uninitialized array.
- dask.array.linalg.cholesky(a, lower=False)
Returns the Cholesky decomposition, $A = L L^*$ or $A = U^* U$, of a Hermitian positive-definite matrix A.
- Parameters
- a: (M, M) array_like
Matrix to be decomposed.
- lower: bool, optional
Whether to compute the upper or lower triangular Cholesky factorization. Default is upper-triangular.
- Returns
- c: (M, M) Array
Upper- or lower-triangular Cholesky factor of a.
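The relationship between the two factor conventions can be checked numerically. The sketch below uses plain NumPy rather than Dask (an assumption made to keep the example dependency-free); numpy.linalg.cholesky returns the lower factor L, and the upper factor is its conjugate transpose. The test matrix is made up for illustration.

```python
import numpy as np

# Build a small symmetric positive-definite matrix (values are arbitrary).
rng = np.random.default_rng(0)
m = rng.standard_normal((4, 4))
a = m @ m.T + 4 * np.eye(4)

# NumPy returns the lower-triangular factor L with A = L L*.
lower = np.linalg.cholesky(a)

# The upper-triangular factor U = L* satisfies A = U* U.
upper = lower.conj().T

print(np.allclose(lower @ lower.conj().T, a))
print(np.allclose(upper.conj().T @ upper, a))
```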
- dask.array.linalg.inv(a)
Compute the inverse of a matrix with LU decomposition and forward / backward substitutions.
- Parameters
- a: array_like
Square matrix to be inverted.
- Returns
- ainv: Array
Inverse of the matrix a.
- dask.array.linalg.lstsq(a, b)
Return the least-squares solution to a linear matrix equation using QR decomposition.
Solves the equation a x = b by computing a vector x that minimizes the squared Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over-determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.
- Parameters
- a: (M, N) array_like
“Coefficient” matrix.
- b: (M,) array_like
Ordinate or “dependent variable” values.
- Returns
- x: (N,) Array
Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.
- residuals: (1,) Array
Sums of residuals; squared Euclidean 2-norm for each column in b - a*x.
- rank: Array
Rank of matrix a.
- s: (min(M, N),) Array
Singular values of a.
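To make the four return values concrete, here is a small line-fitting sketch. It uses numpy.linalg.lstsq rather than the Dask function, on the assumption that the Dask version returns the same quantities as lazy arrays, as the Returns list above suggests. The data points are invented and lie exactly on y = 2 t + 1.

```python
import numpy as np

# Four sample points on the line y = 2*t + 1 (made-up data).
t = np.array([0.0, 1.0, 2.0, 3.0])
b = 2.0 * t + 1.0

# Coefficient matrix with one column for the slope and one for the intercept.
a = np.column_stack([t, np.ones_like(t)])

# x holds [slope, intercept]; residuals is the squared 2-norm of b - a @ x.
x, residuals, rank, s = np.linalg.lstsq(a, b, rcond=None)

print(x, rank, s.shape)
```

The system is over-determined (four equations, two unknowns), and because the points are collinear the residual is zero up to round-off.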
- dask.array.linalg.lu(a)
Compute the LU decomposition of a matrix.
- Returns
- p: Array, permutation matrix
- l: Array, lower triangular matrix with unit diagonal.
- u: Array, upper triangular matrix
Examples
>>> p, l, u = da.linalg.lu(x)
- dask.array.linalg.norm(x, ord=None, axis=None, keepdims=False)
Matrix or vector norm.
This docstring was copied from numpy.linalg.norm.
Some inconsistencies with the Dask version may exist.
This function is able to return one of eight different matrix norms, or one of an infinite number of vector norms (described below), depending on the value of the ord parameter.
- Parameters
- x: array_like
Input array. If axis is None, x must be 1-D or 2-D, unless ord is None. If both axis and ord are None, the 2-norm of x.ravel will be returned.
- ord: {non-zero int, inf, -inf, 'fro', 'nuc'}, optional
Order of the norm (see table under Notes). inf means numpy’s inf object. The default is None.
- axis: {None, int, 2-tuple of ints}, optional
If axis is an integer, it specifies the axis of x along which to compute the vector norms. If axis is a 2-tuple, it specifies the axes that hold 2-D matrices, and the matrix norms of these matrices are computed. If axis is None then either a vector norm (when x is 1-D) or a matrix norm (when x is 2-D) is returned. The default is None.
New in version 1.8.0.
- keepdims: bool, optional
If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.
New in version 1.10.0.
- Returns
- n: float or ndarray
Norm of the matrix or vector(s).
Notes
For values of ord <= 0, the result is, strictly speaking, not a mathematical ‘norm’, but it may still be useful for various numerical purposes.
The following norms can be calculated:
=====  ============================  ==========================
ord    norm for matrices             norm for vectors
=====  ============================  ==========================
None   Frobenius norm                2-norm
'fro'  Frobenius norm                –
'nuc'  nuclear norm                  –
inf    max(sum(abs(x), axis=1))      max(abs(x))
-inf   min(sum(abs(x), axis=1))      min(abs(x))
0      –                             sum(x != 0)
1      max(sum(abs(x), axis=0))      as below
-1     min(sum(abs(x), axis=0))      as below
2      2-norm (largest sing. value)  as below
-2     smallest singular value       as below
other  –                             sum(abs(x)**ord)**(1./ord)
=====  ============================  ==========================
The Frobenius norm is given by [1]:
$$\|A\|_F = \left[\sum_{i,j} \left|a_{i,j}\right|^2\right]^{1/2}$$
The nuclear norm is the sum of the singular values.
References
[1] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD, Johns Hopkins University Press, 1985, pg. 15
Examples
>>> from numpy import linalg as LA
>>> a = np.arange(9) - 4
>>> a
array([-4, -3, -2, ...,  2,  3,  4])
>>> b = a.reshape((3, 3))
>>> b
array([[-4, -3, -2],
       [-1,  0,  1],
       [ 2,  3,  4]])

>>> LA.norm(a)
7.745966692414834
>>> LA.norm(b)
7.745966692414834
>>> LA.norm(b, 'fro')
7.745966692414834
>>> LA.norm(a, np.inf)
4.0
>>> LA.norm(b, np.inf)
9.0
>>> LA.norm(a, -np.inf)
0.0
>>> LA.norm(b, -np.inf)
2.0

>>> LA.norm(a, 1)
20.0
>>> LA.norm(b, 1)
7.0
>>> LA.norm(a, -1)
-4.6566128774142013e-010
>>> LA.norm(b, -1)
6.0
>>> LA.norm(a, 2)
7.745966692414834
>>> LA.norm(b, 2)
7.3484692283495345

>>> LA.norm(a, -2)
0.0
>>> LA.norm(b, -2)
1.8570331885190563e-016 # may vary
>>> LA.norm(a, 3)
5.8480354764257312 # may vary
>>> LA.norm(a, -3)
0.0

Using the axis argument to compute vector norms:

>>> c = np.array([[ 1, 2, 3],
...               [-1, 1, 4]])
>>> LA.norm(c, axis=0)
array([ 1.41421356,  2.23606798,  5.        ])
>>> LA.norm(c, axis=1)
array([ 3.74165739,  4.24264069])
>>> LA.norm(c, ord=1, axis=1)
array([ 6.,  6.])

Using the axis argument to compute matrix norms:

>>> m = np.arange(8).reshape(2,2,2)
>>> LA.norm(m, axis=(1,2))
array([  3.74165739,  11.22497216])
>>> LA.norm(m[0, :, :]), LA.norm(m[1, :, :])
(3.7416573867739413, 11.224972160321824)
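The Frobenius and nuclear norm definitions given in the Notes of this entry can be verified directly. The following sketch uses NumPy on the same 3 x 3 matrix b as the examples; the variable names ending in _manual are introduced here purely for illustration.

```python
import numpy as np

b = (np.arange(9) - 4).reshape(3, 3)

# Frobenius norm: square root of the sum of squared absolute entries.
fro_manual = np.sqrt(np.sum(np.abs(b) ** 2))
fro = np.linalg.norm(b, 'fro')

# Nuclear norm: sum of the singular values.
singular_values = np.linalg.svd(b, compute_uv=False)
nuc_manual = singular_values.sum()
nuc = np.linalg.norm(b, 'nuc')

# Matrix inf-norm: largest absolute row sum, as listed in the table.
inf_manual = np.max(np.sum(np.abs(b), axis=1))

print(fro_manual, fro)
print(nuc_manual, nuc)
print(inf_manual)
```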
- dask.array.linalg.qr(a)
Compute the QR factorization of a matrix.
- Parameters
- a: Array
- Returns
- q: Array, orthonormal
- r: Array, upper-triangular
See also
numpy.linalg.qr
Equivalent NumPy Operation
dask.array.linalg.tsqr
Implementation for tall-and-skinny arrays
dask.array.linalg.sfqr
Implementation for short-and-fat arrays
Examples
>>> q, r = da.linalg.qr(x)
- dask.array.linalg.solve(a, b, sym_pos=False)
Solve the equation a x = b for x. By default, use LU decomposition and forward / backward substitutions. When sym_pos is True, use Cholesky decomposition.
- Parameters
- a: (M, M) array_like
A square matrix.
- b: (M,) or (M, N) array_like
Right-hand side matrix in a x = b.
- sym_pos: bool
Assume a is symmetric and positive definite. If True, use Cholesky decomposition.
- Returns
- x: (M,) or (M, N) Array
Solution to the system a x = b. Shape of the return matches the shape of b.
- dask.array.linalg.solve_triangular(a, b, lower=False)
Solve the equation a x = b for x, assuming a is a triangular matrix.
- Parameters
- a: (M, M) array_like
A triangular matrix.
- b: (M,) or (M, N) array_like
Right-hand side matrix in a x = b.
- lower: bool, optional
Use only data contained in the lower triangle of a. Default is to use upper triangle.
- Returns
- x: (M,) or (M, N) array
Solution to the system a x = b. Shape of return matches b.
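For the default upper-triangular case, the solve amounts to backward substitution. The sketch below is a plain NumPy illustration of that idea for a single right-hand side, not the Dask implementation; back_substitute is a hypothetical helper name and the matrix values are invented.

```python
import numpy as np

def back_substitute(u, b):
    """Solve u @ x = b for upper-triangular u by backward substitution."""
    n = u.shape[0]
    x = np.zeros(n, dtype=float)
    # Work from the last row upward; each row has one new unknown.
    for i in range(n - 1, -1, -1):
        known = u[i, i + 1:] @ x[i + 1:]  # contribution of already-solved unknowns
        x[i] = (b[i] - known) / u[i, i]
    return x

u = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
b = np.array([1.0, 8.0, 8.0])

x = back_substitute(u, b)
print(x)
```

With lower=True the same loop would run from the first row downward (forward substitution).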
- dask.array.linalg.svd(a)
Compute the singular value decomposition of a matrix.
- Returns
- u: Array, unitary / orthogonal
- s: Array, singular values in decreasing order (largest first)
- v: Array, unitary / orthogonal
See also
np.linalg.svd
Equivalent NumPy Operation
dask.array.linalg.tsqr
Implementation for tall-and-skinny arrays
Examples
>>> u, s, v = da.linalg.svd(x)
- dask.array.linalg.svd_compressed(a, k, n_power_iter=0, seed=None, compute=False)
Randomly compressed rank-k thin Singular Value Decomposition.
This computes the approximate singular value decomposition of a large array. This algorithm is generally faster than the normal algorithm but does not provide exact results. One can balance between performance and accuracy with input parameters (see below).
- Parameters
- a: Array
Input array
- k: int
Rank of the desired thin SVD decomposition.
- n_power_iter: int
Number of power iterations, useful when the singular values decay slowly. Error decreases exponentially as n_power_iter increases. In practice, set n_power_iter <= 4.
- compute: bool
Whether or not to compute data at each use. Recomputing the input while performing several passes reduces memory pressure, but means that we have to compute the input multiple times. This is a good choice if the data is larger than memory and cheap to recreate.
- Returns
- u: Array, unitary / orthogonal
- s: Array, singular values in decreasing order (largest first)
- v: Array, unitary / orthogonal
References
N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011 https://arxiv.org/abs/0909.4061
Examples
>>> u, s, vt = svd_compressed(x, 20)
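To give a feel for the approach, here is a NumPy sketch of a randomized range finder with optional power iterations, in the spirit of the Halko et al. reference. It is not Dask's implementation: the function name randomized_svd, the oversampling of 10 extra columns, and the test matrix are all assumptions made for illustration.

```python
import numpy as np

def randomized_svd(a, k, n_power_iter=0, seed=None):
    """Approximate rank-k thin SVD via a random projection (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Project onto k + 10 random directions (oversampling amount is an assumption).
    omega = rng.standard_normal((a.shape[1], k + 10))
    q, _ = np.linalg.qr(a @ omega)
    # Power iterations sharpen the captured subspace when singular values decay slowly.
    for _ in range(n_power_iter):
        q, _ = np.linalg.qr(a.T @ q)
        q, _ = np.linalg.qr(a @ q)
    # Solve the small problem exactly, then map back to the original space.
    small = q.T @ a
    u_small, s, vt = np.linalg.svd(small, full_matrices=False)
    u = q @ u_small
    return u[:, :k], s[:k], vt[:k]

# An exactly rank-5 matrix, so a rank-5 approximation should reproduce it.
rng = np.random.default_rng(0)
a = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 40))

u, s, vt = randomized_svd(a, k=5, n_power_iter=2, seed=0)
print(np.allclose((u * s) @ vt, a))
```

For matrices that are not exactly low rank the reconstruction is only approximate, which is the accuracy and performance trade-off described above.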
- dask.array.linalg.sfqr(data, name=None)
Direct Short-and-Fat QR
Currently, this is a quick hack for non-tall-and-skinny matrices which are one chunk tall and (unless they are one chunk wide) have chunks that are wider than they are tall. Given
Q [R_1 R_2 …] = [A_1 A_2 …]
it computes the factorization Q R_1 = A_1, then computes the other R_k’s in parallel.
- Parameters
- data: Array
See also
dask.array.linalg.qr
Main user API that uses this function
dask.array.linalg.tsqr
Variant for tall-and-skinny case
- dask.array.linalg.tsqr(data, compute_svd=False, _max_vchunk_size=None)
Direct Tall-and-Skinny QR algorithm
As presented in:
A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. https://arxiv.org/abs/1301.1071
This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fits in memory.
- Parameters
- data: Array
- compute_svd: bool
Whether to compute the SVD rather than the QR decomposition
- _max_vchunk_size: Integer
Used internally in recursion to set the maximum row dimension of chunks in subsequent recursive calls.
See also
dask.array.linalg.qr
Powered by this algorithm
dask.array.linalg.svd
Powered by this algorithm
dask.array.linalg.sfqr
Variant for short-and-fat arrays
Notes
With k blocks of size (m, n), this algorithm has memory use that scales as k * n * n.
The implementation here is the recursive variant due to the ultimate need for one “single core” QR decomposition. In the non-recursive version of the algorithm, given k blocks, after k QR decompositions of (m, n) blocks, there will be a “single core” QR decomposition that will have to work with a (k * n, n) matrix.
Here, recursion is applied as necessary to ensure that k * n is not larger than m (if m / n >= 2). In particular, this is done to ensure that single core computations do not have to work on blocks larger than (m, n).
Where blocks are irregular, the above logic is applied with the “height” of the “tallest” block used in place of m.
Consider use of the rechunk method to control this behavior. Taller blocks will reduce overall memory use (assuming that many of them still fit in memory at once).
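The non-recursive variant described in these notes can be sketched in a few lines of NumPy. This is an illustration of the idea only, not the Dask code: the block count k, block shape (m, n), and random data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
k, m, n = 4, 50, 3

# A tall-and-skinny matrix stored as k row blocks of shape (m, n).
blocks = [rng.standard_normal((m, n)) for _ in range(k)]
a = np.vstack(blocks)

# Step 1: an independent QR per block (these could run in parallel).
qs, rs = zip(*(np.linalg.qr(block) for block in blocks))

# Step 2: the "single core" QR on the stacked R factors, a (k * n, n) matrix.
stacked_r = np.vstack(rs)
q2, r = np.linalg.qr(stacked_r)

# Step 3: combine each block's Q with its (n, n) slice of the second-stage Q.
q = np.vstack([qs[i] @ q2[i * n:(i + 1) * n] for i in range(k)])

print(stacked_r.shape)
print(np.allclose(q @ r, a))
```

The stacked matrix in step 2 is what grows with k; the recursive variant applies the same procedure again to that matrix when k * n exceeds m.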
- dask.array.ma.average(a, axis=None, weights=None, returned=False)
Return the weighted average of array over the given axis.
This docstring was copied from numpy.ma.average.
Some inconsistencies with the Dask version may exist.
- Parameters
- a: array_like
Data to be averaged. Masked entries are not taken into account in the computation.
- axis: int, optional
Axis along which to average a. If None, averaging is done over the flattened array.
- weights: array_like, optional
The importance that each element has in the computation of the average. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one. The 1-D calculation is:
avg = sum(a * weights) / sum(weights)
The only constraint on weights is that sum(weights) must not be 0.
- returned: bool, optional
Flag indicating whether a tuple (result, sum of weights) should be returned as output (True), or just the result (False). Default is False.
- Returns
- average, [sum_of_weights]: (tuple of) scalar or MaskedArray
The average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. The return type is np.float64 if a is of integer type and floats smaller than float64, or the input data-type, otherwise. If returned, sum_of_weights is always float64.
Examples
>>> a = np.ma.array([1., 2., 3., 4.], mask=[False, False, True, True])
>>> np.ma.average(a, weights=[3, 1, 0, 0])
1.25

>>> x = np.ma.arange(6.).reshape(3, 2)
>>> x
masked_array(
  data=[[0., 1.],
        [2., 3.],
        [4., 5.]],
  mask=False,
  fill_value=1e+20)
>>> avg, sumweights = np.ma.average(x, axis=0, weights=[1, 2, 3],
...                                 returned=True)
>>> avg
masked_array(data=[2.6666666666666665, 3.6666666666666665],
             mask=[False, False],
       fill_value=1e+20)
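The 1-D formula avg = sum(a * weights) / sum(weights) can be checked by hand on the first example, skipping the masked entries. This sketch uses NumPy's masked arrays; the names valid, manual, and library are introduced here for illustration.

```python
import numpy as np

values = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, False, True, True])
weights = np.array([3.0, 1.0, 0.0, 0.0])

# Only unmasked entries take part in the average.
valid = ~np.ma.getmaskarray(values)

# avg = sum(a * weights) / sum(weights), restricted to the valid entries.
manual = np.sum(values.data[valid] * weights[valid]) / np.sum(weights[valid])

library = np.ma.average(values, weights=weights)
print(manual, library)
```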
- dask.array.ma.filled(a, fill_value=None)
Return input as an array with masked data replaced by a fill value.
This docstring was copied from numpy.ma.filled.
Some inconsistencies with the Dask version may exist.
If a is not a MaskedArray, a itself is returned. If a is a MaskedArray and fill_value is None, fill_value is set to a.fill_value.
- Parameters
- a: MaskedArray or array_like
An input object.
- fill_value: array_like, optional
Can be scalar or non-scalar. If non-scalar, the resulting filled array should be broadcastable over input array. Default is None.
- Returns
- a: ndarray
The filled array.
See also
compressed
Examples
>>> x = np.ma.array(np.arange(9).reshape(3, 3), mask=[[1, 0, 0],
...                                                   [1, 0, 0],
...                                                   [0, 0, 0]])
>>> x.filled()
array([[999999,      1,      2],
       [999999,      4,      5],
       [     6,      7,      8]])
>>> x.filled(fill_value=333)
array([[333,   1,   2],
       [333,   4,   5],
       [  6,   7,   8]])
>>> x.filled(fill_value=np.arange(3))
array([[0, 1, 2],
       [0, 4, 5],
       [6, 7, 8]])
- dask.array.ma.fix_invalid(a, fill_value=None)
Return input with invalid data masked and replaced by a fill value.
This docstring was copied from numpy.ma.fix_invalid.
Some inconsistencies with the Dask version may exist.
Invalid data means values of nan, inf, etc.
- Parameters
- a: array_like
Input array, a (subclass of) ndarray.
- mask: sequence, optional (Not supported in Dask)
Mask. Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.
- copy: bool, optional (Not supported in Dask)
Whether to use a copy of a (True) or to fix a in place (False). Default is True.
- fill_value: scalar, optional
Value used for fixing invalid data. Default is None, in which case the a.fill_value is used.
- Returns
- b: MaskedArray
The input array with invalid entries fixed.
Notes
A copy is performed by default.
Examples
>>> x = np.ma.array([1., -1, np.nan, np.inf], mask=[1] + [0]*3)
>>> x
masked_array(data=[--, -1.0, nan, inf],
             mask=[ True, False, False, False],
       fill_value=1e+20)
>>> np.ma.fix_invalid(x)
masked_array(data=[--, -1.0, --, --],
             mask=[ True, False,  True,  True],
       fill_value=1e+20)

>>> fixed = np.ma.fix_invalid(x)
>>> fixed.data
array([ 1.e+00, -1.e+00,  1.e+20,  1.e+20])
>>> x.data
array([ 1., -1., nan, inf])
- dask.array.ma.getdata(a)
Return the data of a masked array as an ndarray.
This docstring was copied from numpy.ma.getdata.
Some inconsistencies with the Dask version may exist.
Return the data of a (if any) as an ndarray if a is a MaskedArray, else return a as a ndarray or subclass (depending on subok) if not.
- Parameters
- a: array_like
Input MaskedArray, alternatively a ndarray or a subclass thereof.
- subok: bool (Not supported in Dask)
Whether to force the output to be a pure ndarray (False) or to return a subclass of ndarray if appropriate (True, default).
See also
getmask
Return the mask of a masked array, or nomask.
getmaskarray
Return the mask of a masked array, or full array of False.
Examples
>>> import numpy.ma as ma
>>> a = ma.masked_equal([[1,2],[3,4]], 2)
>>> a
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getdata(a)
array([[1, 2],
       [3, 4]])

Equivalently use the MaskedArray data attribute.

>>> a.data
array([[1, 2],
       [3, 4]])
- dask.array.ma.getmaskarray(a)
Return the mask of a masked array, or full boolean array of False.
This docstring was copied from numpy.ma.getmaskarray.
Some inconsistencies with the Dask version may exist.
Return the mask of arr as an ndarray if arr is a MaskedArray and the mask is not nomask, else return a full boolean array of False of the same shape as arr.
- Parameters
- arr: array_like (Not supported in Dask)
Input MaskedArray for which the mask is required.
See also
getmask
Return the mask of a masked array, or nomask.
getdata
Return the data of a masked array as an ndarray.
Examples
>>> import numpy.ma as ma
>>> a = ma.masked_equal([[1,2],[3,4]], 2)
>>> a
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getmaskarray(a)
array([[False,  True],
       [False, False]])

Result when mask == nomask

>>> b = ma.masked_array([[1,2],[3,4]])
>>> b
masked_array(
  data=[[1, 2],
        [3, 4]],
  mask=False,
  fill_value=999999)
>>> ma.getmaskarray(b)
array([[False, False],
       [False, False]])
- dask.array.ma.masked_array(data, mask=False, fill_value=None, **kwargs)
An array class with possibly masked values.
This docstring was copied from numpy.ma.masked_array.
Some inconsistencies with the Dask version may exist.
Masked values of True exclude the corresponding element from any computation.
Construction:
x = MaskedArray(data, mask=nomask, dtype=None, copy=False, subok=True, ndmin=0, fill_value=None, keep_mask=True, hard_mask=None, shrink=True, order=None)
- Parameters
- data: array_like
Input data.
- mask: sequence, optional
Mask. Must be convertible to an array of booleans with the same shape as data. True indicates masked (i.e. invalid) data.
- dtype: dtype, optional (Not supported in Dask)
Data type of the output. If dtype is None, the type of the data argument (data.dtype) is used. If dtype is not None and different from data.dtype, a copy is performed.
- copy: bool, optional (Not supported in Dask)
Whether to copy the input data (True), or to use a reference instead. Default is False.
- subok: bool, optional (Not supported in Dask)
Whether to return a subclass of MaskedArray if possible (True) or a plain MaskedArray. Default is True.
- ndmin: int, optional (Not supported in Dask)
Minimum number of dimensions. Default is 0.
- fill_value: scalar, optional
Value used to fill in the masked values when necessary. If None, a default based on the data-type is used.
- keep_mask: bool, optional (Not supported in Dask)
Whether to combine mask with the mask of the input data, if any (True), or to use only mask for the output (False). Default is True.
- hard_mask: bool, optional (Not supported in Dask)
Whether to use a hard mask or not. With a hard mask, masked values cannot be unmasked. Default is False.
- shrink: bool, optional (Not supported in Dask)
Whether to force compression of an empty mask. Default is True.
- order: {'C', 'F', 'A'}, optional (Not supported in Dask)
Specify the order of the array. If order is ‘C’, then the array will be in C-contiguous order (last-index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is ‘A’ (default), then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous), unless a copy is required, in which case it will be C-contiguous.
- dask.array.ma.masked_equal(a, value)
Mask an array where equal to a given value.
This docstring was copied from numpy.ma.masked_equal.
Some inconsistencies with the Dask version may exist.
This function is a shortcut to masked_where, with condition = (x == value). For floating point arrays, consider using masked_values(x, value).
See also
masked_where
Mask where a condition is met.
masked_values
Mask using floating point equality.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_equal(a, 2)
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=2)
- dask.array.ma.masked_greater(x, value, copy=True)
Mask an array where greater than a given value.
This function is a shortcut to masked_where, with condition = (x > value).
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater(a, 2)
masked_array(data=[0, 1, 2, --],
             mask=[False, False, False,  True],
       fill_value=999999)
- dask.array.ma.masked_greater_equal(x, value, copy=True)
Mask an array where greater than or equal to a given value.
This function is a shortcut to masked_where, with condition = (x >= value).
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater_equal(a, 2)
masked_array(data=[0, 1, --, --],
             mask=[False, False,  True,  True],
       fill_value=999999)
- dask.array.ma.masked_inside(x, v1, v2)
Mask an array inside a given interval.
This docstring was copied from numpy.ma.masked_inside.
Some inconsistencies with the Dask version may exist.
Shortcut to masked_where, where condition is True for x inside the interval [v1,v2] (v1 <= x <= v2). The boundaries v1 and v2 can be given in either order.
See also
masked_where
Mask where a condition is met.
Notes
The array x is prefilled with its filling value.
Examples
>>> import numpy.ma as ma
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]
>>> ma.masked_inside(x, -0.3, 0.3)
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)

The order of v1 and v2 doesn’t matter.

>>> ma.masked_inside(x, 0.3, -0.3)
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)
- dask.array.ma.masked_invalid(a)
Mask an array where invalid values occur (NaNs or infs).
This docstring was copied from numpy.ma.masked_invalid.
Some inconsistencies with the Dask version may exist.
This function is a shortcut to masked_where, with condition = ~(np.isfinite(a)). Any pre-existing mask is conserved. Only applies to arrays with a dtype where NaNs or infs make sense (i.e. floating point types), but accepts any array_like object.
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(5, dtype=float)
>>> a[2] = np.nan
>>> a[3] = np.inf
>>> a
array([ 0.,  1., nan, inf,  4.])
>>> ma.masked_invalid(a)
masked_array(data=[0.0, 1.0, --, --, 4.0],
             mask=[False, False,  True,  True, False],
       fill_value=1e+20)
- dask.array.ma.masked_less(x, value, copy=True)
Mask an array where less than a given value.
This function is a shortcut to masked_where, with condition = (x < value).
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less(a, 2)
masked_array(data=[--, --, 2, 3],
             mask=[ True,  True, False, False],
       fill_value=999999)
- dask.array.ma.masked_less_equal(x, value, copy=True)
Mask an array where less than or equal to a given value.
This function is a shortcut to masked_where, with condition = (x <= value).
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less_equal(a, 2)
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
- dask.array.ma.masked_not_equal(x, value, copy=True)
Mask an array where not equal to a given value.
This function is a shortcut to masked_where, with condition = (x != value).
See also
masked_where
Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_not_equal(a, 2)
masked_array(data=[--, --, 2, --],
             mask=[ True,  True, False,  True],
       fill_value=999999)
- dask.array.ma.masked_outside(x, v1, v2)
Mask an array outside a given interval.
This docstring was copied from numpy.ma.masked_outside.
Some inconsistencies with the Dask version may exist.
Shortcut to masked_where, where condition is True for x outside the interval [v1,v2] (x < v1)|(x > v2). The boundaries v1 and v2 can be given in either order.
See also
masked_where
Mask where a condition is met.
Notes
The array x is prefilled with its filling value.
Examples
>>> import numpy.ma as ma
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]
>>> ma.masked_outside(x, -0.3, 0.3)
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)

The order of v1 and v2 doesn’t matter.

>>> ma.masked_outside(x, 0.3, -0.3)
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)
- dask.array.ma.masked_values(x, value, rtol=1e-05, atol=1e-08, shrink=True)
Mask using floating point equality.
This docstring was copied from numpy.ma.masked_values.
Some inconsistencies with the Dask version may exist.
Return a MaskedArray, masked where the data in array x are approximately equal to value, determined using isclose. The default tolerances for masked_values are the same as those for isclose.
For integer types, exact equality is used, in the same way as masked_equal.
The fill_value is set to value and the mask is set to nomask if possible.
- Parameters
- x: array_like
Array to mask.
- value: float
Masking value.
- rtol, atol: float, optional
Tolerance parameters passed on to isclose.
- copy: bool, optional (Not supported in Dask)
Whether to return a copy of x.
- shrink: bool, optional
Whether to collapse a mask full of False to nomask.
- Returns
- result: MaskedArray
The result of masking x where approximately equal to value.
See also
masked_where
Mask where a condition is met.
masked_equal
Mask where equal to a given value (integers).
Examples
>>> import numpy.ma as ma
>>> x = np.array([1, 1.1, 2, 1.1, 3])
>>> ma.masked_values(x, 1.1)
masked_array(data=[1.0, --, 2.0, --, 3.0],
             mask=[False,  True, False,  True, False],
       fill_value=1.1)

Note that mask is set to nomask if possible.

>>> ma.masked_values(x, 1.5)
masked_array(data=[1. , 1.1, 2. , 1.1, 3. ],
             mask=False,
       fill_value=1.5)

For integers, the fill value will be different in general to the result of masked_equal.

>>> x = np.arange(5)
>>> x
array([0, 1, 2, 3, 4])
>>> ma.masked_values(x, 2)
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
>>> ma.masked_equal(x, 2)
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
- dask.array.ma.masked_where(condition, a)
Mask an array where a condition is met.
This docstring was copied from numpy.ma.masked_where.
Some inconsistencies with the Dask version may exist.
Return a as an array masked where condition is True. Any masked values of a or condition are also masked in the output.
- Parameters
- condition: array_like
Masking condition. When condition tests floating point values for equality, consider using masked_values instead.
- a: array_like
Array to mask.
- copy: bool (Not supported in Dask)
If True (default) make a copy of a in the result. If False modify a in place and return a view.
- Returns
- result: MaskedArray
The result of masking a where condition is True.
See also
masked_values
Mask using floating point equality.
masked_equal
Mask where equal to a given value.
masked_not_equal
Mask where not equal to a given value.
masked_less_equal
Mask where less than or equal to a given value.
masked_greater_equal
Mask where greater than or equal to a given value.
masked_less
Mask where less than a given value.
masked_greater
Mask where greater than a given value.
masked_inside
Mask inside a given interval.
masked_outside
Mask outside a given interval.
masked_invalid
Mask invalid values (NaNs or infs).
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_where(a <= 2, a)
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)

Mask array b conditional on a.

>>> b = ['a', 'b', 'c', 'd']
>>> ma.masked_where(a == 2, b)
masked_array(data=['a', 'b', --, 'd'],
             mask=[False, False,  True, False],
       fill_value='N/A',
            dtype='<U1')

Effect of the copy argument.

>>> c = ma.masked_where(a <= 2, a)
>>> c
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
>>> c[0] = 99
>>> c
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a
array([0, 1, 2, 3])
>>> c = ma.masked_where(a <= 2, a, copy=False)
>>> c[0] = 99
>>> c
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a
array([99,  1,  2,  3])

When condition or a contain masked values.

>>> a = np.arange(4)
>>> a = ma.masked_where(a == 2, a)
>>> a
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=999999)
>>> b = np.arange(4)
>>> b = ma.masked_where(b == 0, b)
>>> b
masked_array(data=[--, 1, 2, 3],
             mask=[ True, False, False, False],
       fill_value=999999)
>>> ma.masked_where(a == 3, b)
masked_array(data=[--, 1, --, --],
             mask=[ True, False,  True,  True],
       fill_value=999999)
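The examples in this entry use numpy.ma. A minimal sketch of the Dask counterpart, assuming dask is installed, might combine masked_where with the filled and getmaskarray functions documented in this section. The chunk size of 2 and the fill value of -1 are arbitrary choices, and the copy argument is omitted because the parameter list marks it as not supported in Dask.

```python
import dask.array as da

# A small chunked array; the chunk size of 2 is arbitrary.
a = da.arange(4, chunks=2)

# Lazily mask every element that satisfies the condition.
masked = da.ma.masked_where(a <= 2, a)

# Replace masked entries with -1 and materialize the result.
filled = da.ma.filled(masked, fill_value=-1).compute()

# Retrieve the boolean mask as a concrete array.
mask = da.ma.getmaskarray(masked).compute()

print(filled)
print(mask)
```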
-
dask.array.ma.
set_fill_value
(a, fill_value)¶ Set the filling value of a, if a is a masked array.
This docstring was copied from numpy.ma.set_fill_value.
Some inconsistencies with the Dask version may exist.
This function changes the fill value of the masked array a in place. If a is not a masked array, the function returns silently, without doing anything.
- Parameters
- a: array_like
Input array.
- fill_value: dtype
Filling value. A consistency test is performed to make sure the value is compatible with the dtype of a.
- Returns
- None
Nothing returned by this function.
See also
maximum_fill_value
Return the default fill value for a dtype.
MaskedArray.fill_value
Return current fill value.
MaskedArray.set_fill_value
Equivalent method.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a = ma.masked_where(a < 3, a)
>>> a
masked_array(data=[--, --, --, 3, 4], mask=[ True, True, True, False, False], fill_value=999999)
>>> ma.set_fill_value(a, -999)
>>> a
masked_array(data=[--, --, --, 3, 4], mask=[ True, True, True, False, False], fill_value=-999)
Nothing happens if a is not a masked array.
>>> a = list(range(5))
>>> a
[0, 1, 2, 3, 4]
>>> ma.set_fill_value(a, 100)
>>> a
[0, 1, 2, 3, 4]
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> ma.set_fill_value(a, 100)
>>> a
array([0, 1, 2, 3, 4])
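The examples above are copied from NumPy and operate on in-memory arrays. A minimal sketch of the same idea on a chunked Dask array follows, assuming that `da.ma.masked_where`, `da.ma.set_fill_value` and `da.ma.filled` behave like their NumPy counterparts; the array contents and chunk size are illustrative only.

```python
import numpy as np
import dask.array as da

# Build a small chunked array and mask every element below 3.
x = da.from_array(np.arange(5), chunks=2)
masked = da.ma.masked_where(x < 3, x)

# Change the fill value of the masked Dask array (applied lazily per block).
da.ma.set_fill_value(masked, -999)

# filled() replaces masked entries with the current fill value.
result = da.ma.filled(masked).compute()
print(result)
```

Nothing is computed until `.compute()` is called, so the fill value change is recorded in the task graph rather than applied to data in memory.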
-
dask.array.overlap.
overlap
(x, depth, boundary)¶ Share boundaries between neighboring blocks
- Parameters
- x: da.Array
A dask array
- depth: dict
The size of the shared boundary per axis
- boundary: dict
The boundary condition on each axis. Options are ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or an array value. Such a value will fill the boundary with that value.
The depth input specifies how many cells to overlap between neighboring blocks: ``{0: 2, 2: 5}`` means share two cells along axis 0 and five cells along axis 2. Axes missing from this input will not be overlapped.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.arange(64).reshape((8, 8))
>>> d = da.from_array(x, chunks=(4, 4))
>>> d.chunks
((4, 4), (4, 4))
>>> g = da.overlap.overlap(d, depth={0: 2, 1: 1},
...                        boundary={0: 100, 1: 'reflect'})
>>> g.chunks
((8, 8), (6, 6))
>>> np.array(g)
array([[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [  0,   0,   1,   2,   3,   4,   3,   4,   5,   6,   7,   7],
       [  8,   8,   9,  10,  11,  12,  11,  12,  13,  14,  15,  15],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 48,  48,  49,  50,  51,  52,  51,  52,  53,  54,  55,  55],
       [ 56,  56,  57,  58,  59,  60,  59,  60,  61,  62,  63,  63],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])
-
dask.array.overlap.
map_overlap
(func, *args, depth=None, boundary=None, trim=True, align_arrays=True, **kwargs)¶ Map a function over blocks of arrays with some overlap
We share neighboring zones between blocks of the array, map a function, and then trim away the neighboring strips.
- Parameters
- func: function
The function to apply to each extended block. If multiple arrays are provided, then the function should expect to receive chunks of each array in the same order.
- args: dask arrays
- depth: int, tuple, dict or list
The number of elements that each block should share with its neighbors. If a tuple or dict then this can be different per axis. If a list then each element of that list must be an int, tuple or dict defining depth for the corresponding array in args. Asymmetric depths may be specified using a dict value of (-/+) tuples. Note that asymmetric depths are currently only supported when boundary is ‘none’. The default value is 0.
- boundary: str, tuple, dict or list
How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan. If a list then each element must be a str, tuple or dict defining the boundary for the corresponding array in args. The default value is ‘reflect’.
- trim: bool
Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you.
- align_arrays: bool
Whether or not to align chunks along equally sized dimensions when multiple arrays are provided. This allows for larger chunks in some arrays to be broken into smaller ones that match chunk sizes in other arrays such that they are compatible for block function mapping. If this is False, then an error will be raised if arrays do not already have the same number of blocks in each dimension.
- **kwargs:
Other keyword arguments valid in map_blocks.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])
The da.map_overlap function can also accept multiple arrays.
>>> func = lambda x, y: x + y
>>> x = da.arange(8).reshape(2, 4).rechunk((1, 2))
>>> y = da.arange(4).rechunk(2)
>>> da.map_overlap(func, x, y, depth=1).compute()
array([[ 0,  2,  4,  6],
       [ 4,  6,  8, 10]])
When multiple arrays are given, they do not need to have the same number of dimensions but they must broadcast together. Arrays are aligned block by block (just as in da.map_blocks) so the blocks must have a common chunk size. This common chunking is determined automatically as long as align_arrays is True.
>>> x = da.arange(8, chunks=4)
>>> y = da.arange(8, chunks=2)
>>> r = da.map_overlap(func, x, y, depth=1, align_arrays=True)
>>> len(r.to_delayed())
4
>>> da.map_overlap(func, x, y, depth=1, align_arrays=False).compute()
Traceback (most recent call last):
...
ValueError: Shapes do not align {'.0': {2, 4}}
Note also that this function is equivalent to map_blocks by default. A non-zero depth must be defined for any overlap to appear in the arrays provided to func.
>>> func = lambda x: x.sum()
>>> x = da.ones(10, dtype='int')
>>> block_args = dict(chunks=(), drop_axis=0)
>>> da.map_blocks(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args, depth=1).compute()
12
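To illustrate why the overlap and the trim step matter, here is a minimal sketch of a three-point moving average over a periodic signal. The helper name `three_point_mean`, the input data and the chunk size are assumptions made for illustration; the result is compared against the same computation on the whole NumPy array.

```python
import numpy as np
import dask.array as da


def three_point_mean(block):
    # np.roll wraps around inside each extended block; the wrapped values
    # only affect the outermost cell on each side, which map_overlap trims.
    return (np.roll(block, 1) + block + np.roll(block, -1)) / 3.0


data = np.arange(12, dtype=float)
x = da.from_array(data, chunks=4)

# depth=1 shares one cell with each neighbor; 'periodic' wraps the array ends.
smoothed = x.map_overlap(
    three_point_mean, depth=1, boundary="periodic", dtype=float
).compute()

# Reference result computed on the full in-memory array.
expected = (np.roll(data, 1) + data + np.roll(data, -1)) / 3.0
print(np.allclose(smoothed, expected))
```

With `depth=0` the same function would wrap values inside each block and give wrong results at every chunk edge, which is the case the overlap is designed to avoid.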
-
dask.array.overlap.
trim_internal
(x, axes, boundary=None)¶ Trim sides from each block
This couples well with the overlap operation, which may leave excess data on each block
See also
dask.array.chunk.trim
dask.array.map_blocks
-
dask.array.overlap.
trim_overlap
(x, depth, boundary=None)¶ Trim sides from each block.
This couples well with the map_overlap operation, which may leave excess data on each block.
-
dask.array.
from_array
(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None) Create dask array from something that looks like an array
Input must have a .shape, .ndim, .dtype and support numpy-style slicing.
- Parameters
- x: array_like
- chunks: int, tuple
How to chunk the array. Must be one of the following forms:
A blocksize like 1000.
A blockshape like (1000, 1000).
Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).
A size in bytes, like “100 MiB”, which will choose a uniform block-like shape.
The word “auto”, which acts like the above but uses the configuration value array.chunk-size for the chunk size.
-1 or None as a blocksize indicates the size of the corresponding dimension.
- name: str, optional
The key name to use for the array. Defaults to a hash of x. By default, hash uses python’s standard sha1. This behaviour can be changed by installing cityhash, xxhash or murmurhash. If installed, a large-factor speedup can be obtained in the tokenisation step. Use name=False to generate a random name instead of hashing (fast).
Note
Because this name is used as the key in task graphs, you should ensure that it uniquely identifies the data contained within. If you’d like to provide a descriptive name that is still unique, combine the descriptive name with dask.base.tokenize() of the array_like. See Task Graphs for more.
- lock: bool or Lock, optional
If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.
- asarray: bool, optional
If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.
- fancy: bool, optional
If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.
- meta: Array-like, optional
The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.
Examples
>>> x = h5py.File('...')['/data/path']
>>> a = da.from_array(x, chunks=(1000, 1000))
If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.
>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)
If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.
>>> a = da.from_array(x, chunks='auto')
>>> a = da.from_array(x, chunks='100 MiB')
>>> a = da.from_array(x)
If providing a name, ensure that it is unique.
>>> import dask.base
>>> token = dask.base.tokenize(x)
>>> a = da.from_array(x, name='myarray-' + token)
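The different forms accepted by chunks can be compared side by side by inspecting the resulting `.chunks` attribute. The sketch below uses a small in-memory NumPy array as a stand-in for a real datastore; the shapes and chunk sizes are chosen only for illustration.

```python
import numpy as np
import dask.array as da

x = np.arange(24).reshape(4, 6)

# A single blocksize applied to every dimension.
by_blocksize = da.from_array(x, chunks=3)

# A blockshape with one size per dimension.
by_blockshape = da.from_array(x, chunks=(2, 3))

# Explicit sizes of all blocks along all dimensions.
explicit = da.from_array(x, chunks=((1, 3), (6,)))

# -1 means "the full size of the corresponding dimension".
whole = da.from_array(x, chunks=-1)

for arr in (by_blocksize, by_blockshape, explicit, whole):
    print(arr.chunks)
```

Whatever form is passed in, Dask normalizes it to the explicit tuple-of-tuples form, which is what `.chunks` reports.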
-
dask.array.
from_delayed
(value, shape, dtype=None, meta=None, name=None) Create a dask array from a dask delayed value
This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.
The dask array will consist of a single chunk.
Examples
>>> import numpy as np
>>> import dask
>>> import dask.array as da
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
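Since each from_delayed call yields a single-chunk array, a multi-chunk array is typically assembled by combining several of them. A minimal sketch with concatenate follows; `load_piece` is a hypothetical loader standing in for an expensive read, not part of Dask.

```python
import numpy as np
import dask
import dask.array as da


def load_piece(i):
    # Hypothetical loader: returns three copies of the piece index.
    return np.full((3,), float(i))


# One single-chunk Dask array per delayed call; shape and dtype must be
# declared up front because the value is not computed yet.
pieces = [
    da.from_delayed(dask.delayed(load_piece)(i), shape=(3,), dtype=float)
    for i in range(3)
]

# Concatenating the pieces gives one array with one chunk per piece.
combined = da.concatenate(pieces)
print(combined.chunks)
print(combined.compute())
```

The declared shape and dtype are trusted rather than checked at graph construction time, so they need to match what the delayed function actually returns.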
-
dask.array.
from_npy_stack
(dirname, mmap_mode='r')¶ Load dask array from stack of npy files
See da.to_npy_stack for docstring.
- Parameters
- dirname: string
Directory of .npy files
- mmap_mode: (None or ‘r’)
Read data in memory map mode
-
dask.array.
from_zarr
(url, component=None, storage_options=None, chunks=None, name=None, **kwargs)¶ Load array from the zarr storage format
See https://zarr.readthedocs.io for details about the format.
- Parameters
- url: Zarr Array or str or MutableMapping
Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.
- component: str or None
If the location is a zarr group rather than an array, this is the subcomponent that should be loaded, something like 'foo/bar'.
- storage_options: dict
Any additional parameters for the storage backend (ignored for local paths)
- chunks: tuple of ints or tuples of ints
Passed to da.from_array, allows setting the chunks on initialisation, if the chunking scheme in the on-disc dataset is not optimal for the calculations to follow.
- name: str, optional
An optional keyname for the array. Defaults to hashing the input
- kwargs: passed to ``zarr.Array``.
-
dask.array.
from_tiledb
(uri, attribute=None, chunks=None, storage_options=None, **kwargs)¶ Load array from the TileDB storage format
See https://docs.tiledb.io for more information about TileDB.
- Parameters
- uri: TileDB array or str
Location of the data to load
- attribute: str or None
Attribute selection (single-attribute view on multi-attribute array)
- Returns
- A Dask Array
Examples
>>> # create a tiledb array
>>> import tiledb, numpy as np, tempfile
>>> uri = tempfile.NamedTemporaryFile().name
>>> tiledb.from_numpy(uri, np.arange(0,9).reshape(3,3))
<tiledb.libtiledb.DenseArray object at 0x...>
>>> # read back the array
>>> import dask.array as da
>>> tdb_ar = da.from_tiledb(uri)
>>> tdb_ar.shape
(3, 3)
>>> tdb_ar.mean().compute()
4.0
-
dask.array.
store
(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs) Store dask arrays in array-like objects, overwrite data in target
This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
- Parameters
- sources: Array or iterable of Arrays
- targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
- lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
- regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape == source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
- compute: boolean, optional
If True, compute immediately; otherwise return a dask.delayed.Delayed.
- return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ...
>>> import h5py
>>> f = h5py.File('myfile.hdf5', mode='a')
>>> dset = f.create_dataset('/data', shape=x.shape,
...                         chunks=x.chunks,
...                         dtype='f8')
>>> store(x, dset)
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3])
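Any object that supports numpy-style setitem can act as a target, so the behaviour can be tried without HDF5. The sketch below stores into plain NumPy arrays, assuming the default local threaded scheduler; the second call uses regions to write into a slice of a larger target. The array names and shapes are illustrative only.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(12).reshape(3, 4), chunks=(1, 4))

# Store the whole array into a target of the same shape.
target = np.zeros((3, 4), dtype=x.dtype)
da.store(x, target)

# Store into the lower half of a larger target using a tuple of slices.
big = np.zeros((6, 4), dtype=x.dtype)
da.store(x, big, regions=(slice(3, 6), slice(0, 4)))

print(target)
print(big)
```

Writing into an in-memory NumPy array only works when the computation runs in the same process; for distributed workers a shared on-disk target is the usual choice.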
-
dask.array.
to_hdf5
(filename, *args, **kwargs)¶ Store arrays in HDF5 file
This saves several dask arrays into several datapaths in an HDF5 file. It creates the necessary datasets and handles clean file opening/closing.
>>> da.to_hdf5('myfile.hdf5', '/x', x)
or
>>> da.to_hdf5('myfile.hdf5', {'/x': x, '/y': y})
Optionally provide arguments as though to h5py.File.create_dataset
>>> da.to_hdf5('myfile.hdf5', '/x', x, compression='lzf', shuffle=True)
This can also be used as a method on a single Array
>>> x.to_hdf5('myfile.hdf5', '/x')
See also
da.store
h5py.File.create_dataset
-
dask.array.
to_zarr
(arr, url, component=None, storage_options=None, overwrite=False, compute=True, return_stored=False, **kwargs)¶ Save array to the zarr storage format
See https://zarr.readthedocs.io for details about the format.
- Parameters
- arr: dask.array
Data to store
- url: Zarr Array or str or MutableMapping
Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.
- component: str or None
If the location is a zarr group rather than an array, this is the subcomponent that should be created/over-written.
- storage_options: dict
Any additional parameters for the storage backend (ignored for local paths)
- overwrite: bool
If given array already exists, overwrite=False will cause an error, where overwrite=True will replace the existing data. Note that this check is done at computation time, not during graph creation.
- compute, return_stored: see ``store()``
- kwargs: passed to the ``zarr.create()`` function, e.g., compression options
- Raises
- ValueError
If arr has unknown chunk sizes, which is not supported by Zarr.
-
dask.array.
to_npy_stack
(dirname, x, axis=0)¶ Write dask array to a stack of .npy files
This partitions the dask.array along one axis and stores each block along that axis as a single .npy file in the specified directory.
Examples
>>> x = da.ones((5, 10, 10), chunks=(2, 4, 4))
>>> da.to_npy_stack('data/', x, axis=0)
The .npy files store numpy arrays for x[0:2], x[2:4], and x[4:5] respectively, as is specified by the chunk size along the zeroth axis:
$ tree data/
data/
|-- 0.npy
|-- 1.npy
|-- 2.npy
|-- info
The info file stores the dtype, chunks, and axis information of the array. You can load these stacks with the da.from_npy_stack function.
>>> y = da.from_npy_stack('data/')
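The round trip described above can be checked end to end in a temporary directory. This is a minimal sketch; the directory is created with `tempfile.mkdtemp` purely for illustration and is not cleaned up here.

```python
import os
import tempfile

import numpy as np
import dask.array as da

dirname = tempfile.mkdtemp()

x = da.ones((5, 10, 10), chunks=(2, 4, 4))
da.to_npy_stack(dirname, x, axis=0)

# One .npy file per block along axis 0, plus the metadata file.
files = sorted(os.listdir(dirname))
print(files)

# Load the stack back; only the stacking axis remains chunked.
y = da.from_npy_stack(dirname)
print(y.chunks)
print(np.array_equal(y.compute(), x.compute()))
```

Note that the chunking along the other axes is not preserved: each file holds a full slab, so the reloaded array has a single chunk along axes 1 and 2.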
-
dask.array.
to_tiledb
(darray, uri, compute=True, return_stored=False, storage_options=None, **kwargs)¶ Save array to the TileDB storage format
Save ‘array’ using the TileDB storage manager, to any TileDB-supported URI, including local disk, S3, or HDFS.
See https://docs.tiledb.io for more information about TileDB.
- Parameters
- darray: dask.array
A dask array to write.
- uri:
Any supported TileDB storage location.
- storage_options: dict
Dict containing any configuration options for the TileDB backend. see https://docs.tiledb.io/en/stable/tutorials/config.html
- compute, return_stored: see ``store()``
- Returns
- None
Unless return_stored is set to True (False by default)
Notes
TileDB only supports regularly-chunked arrays. TileDB tile extents correspond to form 2 of the dask chunk specification, and the conversion is done automatically for supported arrays.
Examples
>>> import dask.array as da, tempfile
>>> uri = tempfile.NamedTemporaryFile().name
>>> data = da.random.random((5, 5))
>>> da.to_tiledb(data, uri)
>>> import tiledb
>>> tdb_ar = tiledb.open(uri)
>>> all(tdb_ar == data)
True
-
dask.array.fft.
fft_wrap
(fft_func, kind=None, dtype=None)¶ Wrap 1D, 2D, and ND real and complex FFT functions
Takes a function that behaves like the numpy.fft functions and a kind, named after the corresponding function in the numpy.fft API, to match it to.
Supported kinds include:
fft
fft2
fftn
ifft
ifft2
ifftn
rfft
rfft2
rfftn
irfft
irfft2
irfftn
hfft
ihfft
Examples
>>> parallel_fft = fft_wrap(np.fft.fft)
>>> parallel_ifft = fft_wrap(np.fft.ifft)
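A wrapped function is called like the NumPy original but takes and returns Dask arrays. The sketch below wraps np.fft.fft and checks the result against NumPy; the random input and the chunking, which keeps the transformed (last) axis in one chunk, are assumptions made for illustration.

```python
import numpy as np
import dask.array as da
from dask.array.fft import fft_wrap

# The kind is inferred from the wrapped function's name ("fft").
parallel_fft = fft_wrap(np.fft.fft)

data = np.random.default_rng(0).random((4, 8))

# Chunk along axis 0 only, so the transformed axis stays in a single chunk.
x = da.from_array(data, chunks=(2, 8))

result = parallel_fft(x).compute()
print(np.allclose(result, np.fft.fft(data)))
```

Each block of rows is transformed independently, which is why only the non-transformed axes may be split into several chunks.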
-
dask.array.fft.
fft
(a, n=None, axis=None)¶ Wrapping of numpy.fft.fft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.fft docstring follows below:
Compute the one-dimensional discrete Fourier Transform.
This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT].
- Parameters
- a: array_like
Input array, can be complex.
- n: int, optional
Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
- axis: int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
- norm: {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out: complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.
- Raises
- IndexError
If axis is larger than the last axis of a.
Notes
FFT (Fast Fourier Transform) refers to a way the discrete Fourier Transform (DFT) can be calculated efficiently, by using symmetries in the calculated terms. The symmetry is highest when n is a power of 2, and the transform is therefore most efficient for these sizes.
The DFT is defined, with the conventions used in this implementation, in the documentation for the numpy.fft module.
References
- CT
Cooley, James W., and John W. Tukey, 1965, “An algorithm for the machine calculation of complex Fourier series,” Math. Comput. 19: 297-301.
Examples
>>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
array([-2.33486982e-16+1.14423775e-17j,  8.00000000e+00-1.25557246e-15j,
        2.33486982e-16+2.33486982e-16j,  0.00000000e+00+1.22464680e-16j,
       -1.14423775e-17+2.33486982e-16j,  0.00000000e+00+5.20784380e-16j,
        1.14423775e-17+1.14423775e-17j,  0.00000000e+00+1.22464680e-16j])
In this example, real input has an FFT which is Hermitian, i.e., symmetric in the real part and anti-symmetric in the imaginary part, as described in the numpy.fft documentation:
>>> import matplotlib.pyplot as plt
>>> t = np.arange(256)
>>> sp = np.fft.fft(np.sin(t))
>>> freq = np.fft.fftfreq(t.shape[-1])
>>> plt.plot(freq, sp.real, freq, sp.imag)
[<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
>>> plt.show()
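The examples above use NumPy directly. For the Dask wrapper, the single-chunk requirement on the transformed axis is the main difference in practice. A minimal sketch, assuming the wrapper raises ValueError when the axis is split across chunks, shows the failure and the rechunk that resolves it:

```python
import numpy as np
import dask.array as da

signal = np.exp(2j * np.pi * np.arange(8) / 8)

# Two chunks along the only axis: the FFT cannot be applied block by block.
x = da.from_array(signal, chunks=4)
try:
    da.fft.fft(x)
    raised = False
except ValueError:
    raised = True
print(raised)

# Rechunk so that the transformed axis lives in a single chunk.
single = x.rechunk(-1)
spectrum = da.fft.fft(single).compute()
print(np.allclose(spectrum, np.fft.fft(signal)))
```

Rechunking to one chunk means the whole axis must fit in memory on one worker, so for multi-dimensional data it is usually better to keep the non-transformed axes chunked instead.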
-
dask.array.fft.
fft2
(a, s=None, axes=None)¶ Wrapping of numpy.fft.fft2
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.fft2 docstring follows below:
Compute the 2-dimensional discrete Fourier Transform
This function computes the n-dimensional discrete Fourier Transform over any axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). By default, the transform is computed over the last two axes of the input array, i.e., a 2-dimensional FFT.
- Parameters
- a: array_like
Input array, can be complex
- s: sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
- axes: sequence of ints, optional
Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.
- norm: {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out: complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.
- Raises
- ValueError
If s and axes have different length, or axes not given and len(s) != 2.
- IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
Overall view of discrete Fourier transforms, with definitions and conventions used.
ifft2
The inverse two-dimensional FFT.
fft
The one-dimensional FFT.
fftn
The n-dimensional FFT.
fftshift
Shifts zero-frequency terms to the center of the array. For two-dimensional input, swaps first and third quadrants, and second and fourth quadrants.
Notes
fft2 is just fftn with a different default for axes.
The output, analogously to fft, contains the term for zero frequency in the low-order corner of the transformed axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of the axes, in order of decreasingly negative frequency.
See fftn for details and a plotting example, and numpy.fft for definitions and conventions used.
Examples
>>> a = np.mgrid[:5, :5][0]
>>> np.fft.fft2(a)
array([[ 50. +0.j        ,   0. +0.j        ,   0. +0.j        , # may vary
          0. +0.j        ,   0. +0.j        ],
       [-12.5+17.20477401j,   0. +0.j        ,   0. +0.j        ,
          0. +0.j        ,   0. +0.j        ],
       [-12.5 +4.0614962j ,   0. +0.j        ,   0. +0.j        ,
          0. +0.j        ,   0. +0.j        ],
       [-12.5 -4.0614962j ,   0. +0.j        ,   0. +0.j        ,
          0. +0.j        ,   0. +0.j        ],
       [-12.5-17.20477401j,   0. +0.j        ,   0. +0.j        ,
          0. +0.j        ,   0. +0.j        ]])
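The note that fft2 is fftn with a different default for axes can be checked on a Dask array. In this sketch a stack of two 5x5 images is chunked along the leading axis only, so both transformed axes stay in a single chunk; the data and chunking are assumptions for illustration.

```python
import numpy as np
import dask.array as da

data = np.random.default_rng(0).random((2, 5, 5))

# One image per chunk; the last two (transformed) axes are not split.
x = da.from_array(data, chunks=(1, 5, 5))

via_fft2 = da.fft.fft2(x).compute()
via_fftn = da.fft.fftn(x, axes=(-2, -1)).compute()

print(np.allclose(via_fft2, via_fftn))
print(np.allclose(via_fft2, np.fft.fft2(data)))
```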
-
dask.array.fft.
fftn
(a, s=None, axes=None)¶ Wrapping of numpy.fft.fftn
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.fftn docstring follows below:
Compute the N-dimensional discrete Fourier Transform.
This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT).
- Parameters
- a: array_like
Input array, can be complex.
- s: sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
- axes: sequence of ints, optional
Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the transform over that axis is performed multiple times.
- norm: {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out: complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above.
- Raises
- ValueError
If s and axes have different length.
- IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
Overall view of discrete Fourier transforms, with definitions and conventions used.
ifftn
The inverse of fftn, the inverse n-dimensional FFT.
fft
The one-dimensional FFT, with definitions and conventions used.
rfftn
The n-dimensional FFT of real input.
fft2
The two-dimensional FFT.
fftshift
Shifts zero-frequency terms to centre of array
Notes
The output, analogously to fft, contains the term for zero frequency in the low-order corner of all axes, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.
See numpy.fft for details, definitions and conventions used.
Examples
>>> a = np.mgrid[:3, :3, :3][0]
>>> np.fft.fftn(a, axes=(1, 2))
array([[[ 0.+0.j,  0.+0.j,  0.+0.j], # may vary
        [ 0.+0.j,  0.+0.j,  0.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j]],
       [[ 9.+0.j,  0.+0.j,  0.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j]],
       [[18.+0.j,  0.+0.j,  0.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j]]])
>>> np.fft.fftn(a, (2, 2), axes=(0, 1))
array([[[ 2.+0.j,  2.+0.j,  2.+0.j], # may vary
        [ 0.+0.j,  0.+0.j,  0.+0.j]],
       [[-2.+0.j, -2.+0.j, -2.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j]]])
>>> import matplotlib.pyplot as plt
>>> [X, Y] = np.meshgrid(2 * np.pi * np.arange(200) / 12,
...                      2 * np.pi * np.arange(200) / 34)
>>> S = np.sin(X) + np.cos(Y) + np.random.uniform(0, 1, X.shape)
>>> FS = np.fft.fftn(S)
>>> plt.imshow(np.log(np.abs(np.fft.fftshift(FS))**2))
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
-
dask.array.fft.
ifft
(a, n=None, axis=None)¶ Wrapping of numpy.fft.ifft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.ifft docstring follows below:
Compute the one-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the one-dimensional n-point discrete Fourier transform computed by fft. In other words, ifft(fft(a)) == a to within numerical accuracy. For a general description of the algorithm and definitions, see numpy.fft.
The input should be ordered in the same way as is returned by fft, i.e.,
a[0] should contain the zero frequency term,
a[1:n//2] should contain the positive-frequency terms,
a[n//2 + 1:] should contain the negative-frequency terms, in increasing order starting from the most negative frequency.
For an even number of input points, A[n//2] represents the sum of the values at the positive and negative Nyquist frequencies, as the two are aliased together. See numpy.fft for details.
- Parameters
- a: array_like
Input array, can be complex.
- n: int, optional
Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. See notes about padding issues.
- axis: int, optional
Axis over which to compute the inverse DFT. If not given, the last axis is used.
- norm: {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out: complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.
- Raises
- IndexError
If axis is larger than the last axis of a.
Notes
If the input parameter n is larger than the size of the input, the input is padded by appending zeros at the end. Even though this is the common approach, it might lead to surprising results. If a different padding is desired, it must be performed before calling ifft.
Examples
>>> np.fft.ifft([0, 4, 0, 0])
array([ 1.+0.j,  0.+1.j, -1.+0.j,  0.-1.j]) # may vary
Create and plot a band-limited signal with random phases:
>>> import matplotlib.pyplot as plt
>>> t = np.arange(400)
>>> n = np.zeros((400,), dtype=complex)
>>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
>>> s = np.fft.ifft(n)
>>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
[<matplotlib.lines.Line2D object at ...>, <matplotlib.lines.Line2D object at ...>]
>>> plt.legend(('real', 'imaginary'))
<matplotlib.legend.Legend object at ...>
>>> plt.show()
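The identity ifft(fft(a)) == a, to within numerical accuracy, also holds for the Dask wrappers as long as the transformed axis is a single chunk. The following sketch checks the round trip on random complex data and repeats the small NumPy example above through Dask; the input values are illustrative only.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
data = rng.random(16) + 1j * rng.random(16)

# Keep the whole axis in one chunk, as the FFT wrappers require.
x = da.from_array(data, chunks=16)
roundtrip = da.fft.ifft(da.fft.fft(x)).compute()
print(np.allclose(roundtrip, data))

# The small example from the NumPy docstring, evaluated through Dask.
small = da.from_array(np.array([0, 4, 0, 0]), chunks=4)
inverse = da.fft.ifft(small).compute()
print(np.allclose(inverse, [1, 1j, -1, -1j]))
```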
-
dask.array.fft.
ifft2
(a, s=None, axes=None)¶ Wrapping of numpy.fft.ifft2
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.ifft2 docstring follows below:
Compute the 2-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the 2-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, ifft2(fft2(a)) == a to within numerical accuracy. By default, the inverse transform is computed over the last two axes of the input array.
The input, analogously to ifft, should be ordered in the same way as is returned by fft2, i.e. it should have the term for zero frequency in the low-order corner of the two axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of both axes, in order of decreasingly negative frequency.
- Parameters
- aarray_like
Input array, can be complex.
- ssequence of ints, optional
Shape (length of each axis) of the output (
s[0]
refers to axis 0,s[1]
to axis 1, etc.). This corresponds to n forifft(x, n)
. Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. if s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.- axessequence of ints, optional
Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.
- norm{None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- outcomplex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.
- Raises
- ValueError
If s and axes have different length, or axes not given and
len(s) != 2
.- IndexError
If an element of axes is larger than than the number of axes of a.
Notes
ifft2 is just ifftn with a different default for axes.
See ifftn for details and a plotting example, and numpy.fft for definition and conventions used.
Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifft2 is called.
Examples
>>> a = 4 * np.eye(4)
>>> np.fft.ifft2(a)
array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
       [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j],
       [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
       [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]])
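The round-trip identity and the relation to ifftn stated above can be checked numerically. This is a minimal sketch using plain NumPy (the docstring's own namespace); the random test array and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))

# ifft2 inverts fft2 up to floating-point round-off.
roundtrip_ok = np.allclose(np.fft.ifft2(np.fft.fft2(a)), a)

# ifft2 behaves like ifftn restricted to the last two axes.
same_as_ifftn = np.allclose(np.fft.ifft2(a), np.fft.ifftn(a, axes=(-2, -1)))
```

Comparing with np.allclose rather than == reflects the "to within numerical accuracy" caveat.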
dask.array.fft.ifftn(a, s=None, axes=None)
Wrapping of numpy.fft.ifftn
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.ifftn docstring follows below:
Compute the N-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words,
ifftn(fftn(a)) == a
to within numerical accuracy. For a description of the definitions and conventions used, see numpy.fft.
The input, analogously to ifft, should be ordered in the same way as is returned by fftn, i.e. it should have the term for zero frequency in all axes in the low-order corner, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.
- Parameters
- a : array_like
Input array, can be complex.
- s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for ifft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.
- axes : sequence of ints, optional
Axes over which to compute the IFFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above.
- Raises
- ValueError
If s and axes have different length.
- IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
Overall view of discrete Fourier transforms, with definitions and conventions used.
fftn
The forward n-dimensional FFT, of which ifftn is the inverse.
ifft
The one-dimensional inverse FFT.
ifft2
The two-dimensional inverse FFT.
ifftshift
Undoes fftshift, shifts zero-frequency terms to beginning of array.
Notes
See numpy.fft for definitions and conventions used.
Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifftn is called.
Examples
>>> a = np.eye(4)
>>> np.fft.ifftn(np.fft.fftn(a, axes=(0,)), axes=(1,))
array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
       [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
       [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
       [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]])
Create and plot an image with band-limited frequency content:
>>> import matplotlib.pyplot as plt
>>> n = np.zeros((200,200), dtype=complex)
>>> n[60:80, 20:40] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20, 20)))
>>> im = np.fft.ifftn(n).real
>>> plt.imshow(im)
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
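The single-chunk requirement mentioned at the top of this entry applies to every transformed axis. The following is a minimal sketch of how a chunked Dask array might be prepared before calling the wrappers; the array contents, chunk sizes, and variable names are illustrative assumptions.

```python
import numpy as np
import dask.array as da

x = np.arange(64, dtype=float).reshape(8, 8)

# Split into 4x4 blocks: both axes have two chunks, which the
# FFT wrappers do not accept for a transformed axis.
d = da.from_array(x, chunks=(4, 4))

# Rechunk so that each transformed axis is a single chunk.
d_single = d.rechunk((8, 8))

spectrum = da.fft.fftn(d_single)
restored = da.fft.ifftn(spectrum).compute()
roundtrip_ok = np.allclose(restored, x)
```

Only the axes being transformed need a single chunk; other axes can stay chunked, which is where the parallelism comes from.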
dask.array.fft.rfft(a, n=None, axis=None)
Wrapping of numpy.fft.rfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.rfft docstring follows below:
Compute the one-dimensional discrete Fourier Transform for real input.
This function computes the one-dimensional n-point discrete Fourier Transform (DFT) of a real-valued array by means of an efficient algorithm called the Fast Fourier Transform (FFT).
- Parameters
- a : array_like
Input array
- n : int, optional
Number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
- axis : int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. If n is even, the length of the transformed axis is (n/2)+1. If n is odd, the length is (n+1)/2.
- Raises
- IndexError
If axis is larger than the last axis of a.
Notes
When the DFT is computed for purely real input, the output is Hermitian-symmetric, i.e. the negative frequency terms are just the complex conjugates of the corresponding positive-frequency terms, and the negative-frequency terms are therefore redundant. This function does not compute the negative frequency terms, and the length of the transformed axis of the output is therefore n//2 + 1.
When A = rfft(a) and fs is the sampling frequency, A[0] contains the zero-frequency term 0*fs, which is real due to Hermitian symmetry.
If n is even, A[-1] contains the term representing both positive and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely real. If n is odd, there is no term at fs/2; A[-1] contains the largest positive frequency (fs/2*(n-1)/n), and is complex in the general case.
If the input a contains an imaginary part, it is silently discarded.
Examples
>>> np.fft.fft([0, 1, 0, 0])
array([ 1.+0.j, 0.-1.j, -1.+0.j, 0.+1.j]) # may vary
>>> np.fft.rfft([0, 1, 0, 0])
array([ 1.+0.j, 0.-1.j, -1.+0.j]) # may vary
Notice how the final element of the fft output is the complex conjugate of the second element, for real input. For rfft, this symmetry is exploited to compute only the non-negative frequency terms.
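The output-length rule n//2 + 1 and the real Nyquist term for even n can be verified directly. This is a small sketch with plain NumPy; the input ramp and the names lengths and nyquist_even are illustrative assumptions.

```python
import numpy as np

lengths = {}
for n in (8, 9):
    a = np.arange(n, dtype=float)
    half = np.fft.rfft(a)
    full = np.fft.fft(a)
    # rfft keeps only the non-negative frequency terms of the full transform.
    assert np.allclose(half, full[: n // 2 + 1])
    lengths[n] = half.shape[0]

# For even n the last term sits at the Nyquist frequency and is purely real.
nyquist_even = np.fft.rfft(np.arange(8, dtype=float))[-1]
```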
dask.array.fft.rfft2(a, s=None, axes=None)
Wrapping of numpy.fft.rfft2
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.rfft2 docstring follows below:
Compute the 2-dimensional FFT of a real array.
- Parameters
- a : array
Input array, taken to be real.
- s : sequence of ints, optional
Shape of the FFT.
- axes : sequence of ints, optional
Axes over which to compute the FFT.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : ndarray
The result of the real 2-D FFT.
See also
rfftn
Compute the N-dimensional discrete Fourier Transform for real input.
Notes
This is really just rfftn with different default behavior. For more details see rfftn.
dask.array.fft.rfftn(a, s=None, axes=None)
Wrapping of numpy.fft.rfftn
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.rfftn docstring follows below:
Compute the N-dimensional discrete Fourier Transform for real input.
This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional real array by means of the Fast Fourier Transform (FFT). By default, all axes are transformed, with the real transform performed over the last axis, while the remaining transforms are complex.
- Parameters
- a : array_like
Input array, taken to be real.
- s : sequence of ints, optional
Shape (length along each transformed axis) to use from the input (s[0] refers to axis 0, s[1] to axis 1, etc.). The final element of s corresponds to n for rfft(x, n), while for the remaining axes, it corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
- axes : sequence of ints, optional
Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above. The length of the last axis transformed will be s[-1]//2+1, while the remaining transformed axes will have lengths according to s, or unchanged from the input.
- Raises
- ValueError
If s and axes have different length.
- IndexError
If an element of axes is larger than the number of axes of a.
Notes
The transform for real input is performed over the last transformation axis, as by rfft, then the transform over the remaining axes is performed as by fftn. The order of the output is as for rfft for the final transformation axis, and as for fftn for the remaining transformation axes.
See fft for details, definitions and conventions used.
Examples
>>> a = np.ones((2, 2, 2))
>>> np.fft.rfftn(a)
array([[[8.+0.j, 0.+0.j], # may vary
        [0.+0.j, 0.+0.j]],
       [[0.+0.j, 0.+0.j],
        [0.+0.j, 0.+0.j]]])
>>> np.fft.rfftn(a, axes=(2, 0))
array([[[4.+0.j, 0.+0.j], # may vary
        [4.+0.j, 0.+0.j]],
       [[0.+0.j, 0.+0.j],
        [0.+0.j, 0.+0.j]]])
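How s and axes determine the output shape is easiest to see with unequal axis lengths. The sketch below uses plain NumPy; the array shape and the values passed for s and axes are illustrative assumptions.

```python
import numpy as np

a = np.ones((4, 5, 6))

# Default: all axes are transformed, with the real transform on the last
# axis, so only that axis shrinks to 6 // 2 + 1 = 4.
default_shape = np.fft.rfftn(a).shape

# Explicit s and axes: axis 1 is zero-padded to 8 (complex transform),
# axis 2 is zero-padded to 7 (real transform, giving 7 // 2 + 1 = 4).
padded_shape = np.fft.rfftn(a, s=(8, 7), axes=(1, 2)).shape
```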
dask.array.fft.irfft(a, n=None, axis=None)
Wrapping of numpy.fft.irfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfft docstring follows below:
Compute the inverse of the n-point DFT for real input.
This function computes the inverse of the one-dimensional n-point discrete Fourier Transform of real input computed by rfft. In other words,
irfft(rfft(a), len(a)) == a
to within numerical accuracy. (See Notes below for why len(a) is necessary here.)
The input is expected to be in the form returned by rfft, i.e. the real zero-frequency term followed by the complex positive frequency terms in order of increasing frequency. Since the discrete Fourier Transform of real input is Hermitian-symmetric, the negative frequency terms are taken to be the complex conjugates of the corresponding positive frequency terms.
- Parameters
- a : array_like
The input array.
- n : int, optional
Length of the transformed axis of the output. For n output points, n//2+1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.
- axis : int, optional
Axis over which to compute the inverse FFT. If not given, the last axis is used.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*(m-1) where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified.
- Raises
- IndexError
If axis is larger than the last axis of a.
Notes
Returns the real valued n-point inverse discrete Fourier transform of a, where a contains the non-negative frequency terms of a Hermitian-symmetric sequence. n is the length of the result, not the input.
If you specify an n such that a must be zero-padded or truncated, the extra/removed values will be added/removed at high frequencies. One can thus resample a series to m points via Fourier interpolation by: a_resamp = irfft(rfft(a), m).
The correct interpretation of the Hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, irfft assumes an even output length, which puts the last entry at the Nyquist frequency, aliasing with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the correct length of the real input must be given.
Examples
>>> np.fft.ifft([1, -1j, -1, 1j])
array([0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]) # may vary
>>> np.fft.irfft([1, -1j, -1])
array([0., 1., 0., 0.])
Notice how the last term in the input to the ordinary ifft is the complex conjugate of the second term, and the output has zero imaginary part everywhere. When calling irfft, the negative frequencies are not specified, and the output array is purely real.
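The need to pass n for odd-length signals can be seen in a short round trip. This is a minimal NumPy sketch; the five-point input and the variable names are illustrative assumptions.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # odd length: 5 samples
A = np.fft.rfft(a)                         # 5 // 2 + 1 = 3 complex terms

# Without n, irfft assumes an even output length 2 * (3 - 1) = 4.
default = np.fft.irfft(A)

# Passing the original length recovers the signal.
exact = np.fft.irfft(A, len(a))
```

Here default has four points and does not reproduce a, while exact matches a to within round-off.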
dask.array.fft.irfft2(a, s=None, axes=None)
Wrapping of numpy.fft.irfft2
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfft2 docstring follows below:
Compute the 2-dimensional inverse FFT of a real array.
- Parameters
- a : array_like
The input array
- s : sequence of ints, optional
Shape of the real output to the inverse FFT.
- axes : sequence of ints, optional
The axes over which to compute the inverse fft. Default is the last two axes.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : ndarray
The result of the inverse real 2-D FFT.
See also
irfftn
Compute the inverse of the N-dimensional FFT of real input.
Notes
This is really irfftn with different defaults. For more details see irfftn.
dask.array.fft.irfftn(a, s=None, axes=None)
Wrapping of numpy.fft.irfftn
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfftn docstring follows below:
Compute the inverse of the N-dimensional FFT of real input.
This function computes the inverse of the N-dimensional discrete Fourier Transform for real input over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words,
irfftn(rfftn(a), a.shape) == a
to within numerical accuracy. (The a.shape is necessary like len(a) is for irfft, and for the same reason.)
The input should be ordered in the same way as is returned by rfftn, i.e. as for irfft for the final transformation axis, and as for ifftn along all the other axes.
- Parameters
- a : array_like
Input array.
- s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). s is also the number of input points used along this axis, except for the last axis, where s[-1]//2+1 points of the input are used. Along any axis, if the shape indicated by s is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used, except for the last axis, which is taken to be 2*(m-1), where m is the length of the input along that axis.
- axes : sequence of ints, optional
Axes over which to compute the inverse FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.
- norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
- Returns
- out : ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above. The length of each transformed axis is as given by the corresponding element of s, or the length of the input in every axis except for the last one if s is not given. In the final transformed axis the length of the output when s is not given is 2*(m-1) where m is the length of the final transformed axis of the input. To get an odd number of output points in the final axis, s must be specified.
- Raises
- ValueError
If s and axes have different length.
- IndexError
If an element of axes is larger than the number of axes of a.
Notes
See fft for definitions and conventions used.
See rfft for definitions and conventions used for real input.
The correct interpretation of the hermitian input depends on the shape of the original data, as given by s. This is because each input shape could correspond to either an odd or even length signal. By default, irfftn assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. When performing the final complex to real transform, the last value is thus treated as purely real. To avoid losing information, the correct shape of the real input must be given.
Examples
>>> a = np.zeros((3, 2, 2))
>>> a[0, 0, 0] = 3 * 2 * 2
>>> np.fft.irfftn(a)
array([[[1., 1.],
        [1., 1.]],
       [[1., 1.],
        [1., 1.]],
       [[1., 1.],
        [1., 1.]]])
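The identity irfftn(rfftn(a), a.shape) == a only holds in general when the shape is supplied. A minimal NumPy sketch with an odd-length last axis follows; the random input and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4, 5))   # last axis has odd length
A = np.fft.rfftn(a)                  # last axis becomes 5 // 2 + 1 = 3

# Without s, the last output axis defaults to 2 * (3 - 1) = 4.
default_shape = np.fft.irfftn(A).shape

# Supplying the original shape restores the input.
restored = np.fft.irfftn(A, a.shape)
```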
dask.array.fft.hfft(a, n=None, axis=None)
Wrapping of numpy.fft.hfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.hfft docstring follows below:
Compute the FFT of a signal that has Hermitian symmetry, i.e., a real spectrum.
- Parameters
- a : array_like
The input array.
- n : int, optional
Length of the transformed axis of the output. For n output points, n//2 + 1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.
- axis : int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
- norm : {None, “ortho”}, optional
Normalization mode (see numpy.fft). Default is None.
New in version 1.10.0.
- Returns
- out : ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*m - 2 where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified, for instance as 2*m - 1 in the typical case.
- Raises
- IndexError
If axis is larger than the last axis of a.
Notes
hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd.
even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,
odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.
The correct interpretation of the hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, hfft assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the shape of the full signal must be given.
Examples
>>> signal = np.array([1, 2, 3, 4, 3, 2])
>>> np.fft.fft(signal)
array([15.+0.j, -4.+0.j, 0.+0.j, -1.-0.j, 0.+0.j, -4.+0.j]) # may vary
>>> np.fft.hfft(signal[:4]) # Input first half of signal
array([15., -4., 0., -1., 0., -4.])
>>> np.fft.hfft(signal, 6) # Input entire signal and truncate
array([15., -4., 0., -1., 0., -4.])
>>> signal = np.array([[1, 1.j], [-1.j, 2]])
>>> np.conj(signal.T) - signal # check Hermitian symmetry
array([[ 0.-0.j, -0.+0.j], # may vary
       [ 0.+0.j, 0.-0.j]])
>>> freq_spectrum = np.fft.hfft(signal)
>>> freq_spectrum
array([[ 1., 1.],
       [ 2., -2.]])
dask.array.fft.ihfft(a, n=None, axis=None)
Wrapping of numpy.fft.ihfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.ihfft docstring follows below:
Compute the inverse FFT of a signal that has Hermitian symmetry.
- Parameters
- a : array_like
Input array.
- n : int, optional
Length of the inverse FFT, the number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
- axis : int, optional
Axis over which to compute the inverse FFT. If not given, the last axis is used.
- norm : {None, “ortho”}, optional
Normalization mode (see numpy.fft). Default is None.
New in version 1.10.0.
- Returns
- out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n//2 + 1.
Notes
hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd:
even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,
odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.
Examples
>>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
>>> np.fft.ifft(spectrum)
array([1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 3.+0.j, 2.+0.j]) # may vary
>>> np.fft.ihfft(spectrum)
array([ 1.-0.j, 2.-0.j, 3.-0.j, 4.-0.j]) # may vary
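The even and odd round-trip identities from the Notes can be checked with a short sketch. It uses plain NumPy; the input values are illustrative assumptions, chosen so that the first and last entries are real, as a half-spectrum of a Hermitian-symmetric signal requires for the even case.

```python
import numpy as np

# Non-negative-index half of a Hermitian-symmetric signal.
a = np.array([1.0, 2.0 + 1.0j, 3.0 - 2.0j, 4.0])
m = len(a)

# Even output length 2*m - 2 = 6.
even = np.fft.ihfft(np.fft.hfft(a, 2 * m - 2))

# Odd output length 2*m - 1 = 7.
odd = np.fft.ihfft(np.fft.hfft(a, 2 * m - 1))
```

Both results have length m and agree with a to within round-off.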
dask.array.fft.fftfreq(n, d=1.0, chunks='auto')
Return the Discrete Fourier Transform sample frequencies.
This docstring was copied from numpy.fft.fftfreq.
Some inconsistencies with the Dask version may exist.
The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.
Given a window length n and a sample spacing d:
f = [0, 1, ..., n/2-1, -n/2, ..., -1] / (d*n) if n is even
f = [0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1] / (d*n) if n is odd
- Parameters
- n : int
Window length.
- d : scalar, optional
Sample spacing (inverse of the sampling rate). Defaults to 1.
- Returns
- f : ndarray
Array of length n containing the sample frequencies.
Examples
>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5], dtype=float)
>>> fourier = np.fft.fft(signal)
>>> n = signal.size
>>> timestep = 0.1
>>> freq = np.fft.fftfreq(n, d=timestep)
>>> freq
array([ 0. , 1.25, 2.5 , ..., -3.75, -2.5 , -1.25])
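A common use of these bin centers is to label the output of fft. The following is a minimal NumPy sketch that locates the dominant frequency of a sine wave; the 5 Hz tone, the 100 Hz sampling rate, and the variable names are illustrative assumptions.

```python
import numpy as np

sample_rate = 100.0                        # samples per second (assumed)
n = 200
t = np.arange(n) / sample_rate
signal = np.sin(2 * np.pi * 5.0 * t)       # 5 Hz tone

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n, d=1.0 / sample_rate)

# The largest magnitude sits at +5 Hz or its mirror image at -5 Hz.
peak = abs(freqs[np.argmax(np.abs(spectrum))])
```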
dask.array.fft.rfftfreq(n, d=1.0, chunks='auto')
Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft).
This docstring was copied from numpy.fft.rfftfreq.
Some inconsistencies with the Dask version may exist.
The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.
Given a window length n and a sample spacing d:
f = [0, 1, ..., n/2-1, n/2] / (d*n) if n is even
f = [0, 1, ..., (n-1)/2-1, (n-1)/2] / (d*n) if n is odd
Unlike fftfreq (but like scipy.fftpack.rfftfreq) the Nyquist frequency component is considered to be positive.
- Parameters
- n : int
Window length.
- d : scalar, optional
Sample spacing (inverse of the sampling rate). Defaults to 1.
- Returns
- f : ndarray
Array of length n//2 + 1 containing the sample frequencies.
Examples
>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5, -3, 4], dtype=float)
>>> fourier = np.fft.rfft(signal)
>>> n = signal.size
>>> sample_rate = 100
>>> freq = np.fft.fftfreq(n, d=1./sample_rate)
>>> freq
array([ 0., 10., 20., ..., -30., -20., -10.])
>>> freq = np.fft.rfftfreq(n, d=1./sample_rate)
>>> freq
array([ 0., 10., 20., 30., 40., 50.])
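The sign convention for the Nyquist component is the one place where rfftfreq and fftfreq disagree for even n. A small NumPy sketch makes the comparison explicit; the values of n and d are illustrative assumptions.

```python
import numpy as np

n, d = 10, 0.01
full = np.fft.fftfreq(n, d)     # [0, 10, ..., 40, -50, -40, ..., -10]
half = np.fft.rfftfreq(n, d)    # [0, 10, ..., 40, 50]

# The first n // 2 bins agree; the Nyquist bin differs only in sign.
low_bins_match = np.allclose(half[: n // 2], full[: n // 2])
nyquist_sign_flipped = np.isclose(half[-1], -full[n // 2])
```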
dask.array.fft.fftshift(x, axes=None)
Shift the zero-frequency component to the center of the spectrum.
This docstring was copied from numpy.fft.fftshift.
Some inconsistencies with the Dask version may exist.
This function swaps half-spaces for all axes listed (defaults to all). Note that y[0] is the Nyquist component only if len(x) is even.
- Parameters
- x : array_like
Input array.
- axes : int or shape tuple, optional
Axes over which to shift. Default is None, which shifts all axes.
- Returns
- y : ndarray
The shifted array.
See also
ifftshift
The inverse of fftshift.
Examples
>>> freqs = np.fft.fftfreq(10, 0.1)
>>> freqs
array([ 0., 1., 2., ..., -3., -2., -1.])
>>> np.fft.fftshift(freqs)
array([-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.])
Shift the zero-frequency component only along the second axis:
>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)
>>> freqs
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
>>> np.fft.fftshift(freqs, axes=(1,))
array([[ 2., 0., 1.],
       [-4., 3., 4.],
       [-1., -3., -2.]])
dask.array.fft.ifftshift(x, axes=None)
The inverse of fftshift. Although identical for even-length x, the functions differ by one sample for odd-length x.
This docstring was copied from numpy.fft.ifftshift.
Some inconsistencies with the Dask version may exist.
- Parameters
- x : array_like
Input array.
- axes : int or shape tuple, optional
Axes over which to calculate. Defaults to None, which shifts all axes.
- Returns
- y : ndarray
The shifted array.
See also
fftshift
Shift zero-frequency component to the center of the spectrum.
Examples
>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)
>>> freqs
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
>>> np.fft.ifftshift(np.fft.fftshift(freqs))
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
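The one-sample difference between fftshift and ifftshift for odd lengths is easy to see on a short index array. This is a minimal NumPy sketch; the array lengths are illustrative assumptions.

```python
import numpy as np

odd = np.arange(5)
even = np.arange(4)

shifted_odd = np.fft.fftshift(odd)        # [3, 4, 0, 1, 2]
unshifted_odd = np.fft.ifftshift(odd)     # [2, 3, 4, 0, 1]

# For even lengths the two functions coincide.
same_for_even = np.array_equal(np.fft.fftshift(even), np.fft.ifftshift(even))

# ifftshift undoes fftshift regardless of parity.
roundtrip_odd = np.fft.ifftshift(np.fft.fftshift(odd))
```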
dask.array.random.beta(a, b, size=None, chunks='auto', **kwargs)
Draw samples from a Beta distribution.
This docstring was copied from numpy.random.mtrand.RandomState.beta.
Some inconsistencies with the Dask version may exist.
The Beta distribution is a special case of the Dirichlet distribution, and is related to the Gamma distribution. It has the probability distribution function
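$$f(x; a, b) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1},$$
where the normalization, B, is the beta function,
$$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} \, dt.$$
It is often seen in Bayesian inference and order statistics.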
Note
New code should use the beta method of a default_rng() instance instead; see random-quick-start.
- Parameters
- a : float or array_like of floats
Alpha, positive (>0).
- b : float or array_like of floats
Beta, positive (>0).
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a and b are both scalars. Otherwise, np.broadcast(a, b).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized beta distribution.
See also
Generator.beta
which should be used for new code.
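On the Dask side, the extra chunks argument controls how the samples are split into blocks. The sketch below draws chunked samples through a seeded dask.array.random.RandomState and compares the sample mean with the known Beta mean a / (a + b); the parameter values, sample size, chunk size, and seed are illustrative assumptions.

```python
import dask.array as da

a, b = 2.0, 5.0

# A seeded random state makes the lazy array reproducible.
rs = da.random.RandomState(0)
samples = rs.beta(a, b, size=(100_000,), chunks=25_000)

# Nothing is drawn until compute() is called.
sample_mean = float(samples.mean().compute())
expected_mean = a / (a + b)
```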
dask.array.random.binomial(n, p, size=None, chunks='auto', **kwargs)
Draw samples from a binomial distribution.
This docstring was copied from numpy.random.mtrand.RandomState.binomial.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a binomial distribution with specified parameters, n trials and p probability of success, where n is an integer >= 0 and p is in the interval [0, 1]. (n may be input as a float, but it is truncated to an integer in use.)
Note
New code should use the binomial method of a default_rng() instance instead; see random-quick-start.
- Parameters
- n : int or array_like of ints
Parameter of the distribution, >= 0. Floats are also accepted, but they will be truncated to integers.
- p : float or array_like of floats
Parameter of the distribution, >= 0 and <= 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized binomial distribution, where each sample is equal to the number of successes over the n trials.
See also
scipy.stats.binom
probability density function, distribution or cumulative density function, etc.
Generator.binomial
which should be used for new code.
Notes
The probability density for the binomial distribution is
$$P(N) = \binom{n}{N} p^N (1 - p)^{n - N},$$
where n is the number of trials, p is the probability of success, and N is the number of successes.
When estimating the standard error of a proportion in a population by using a random sample, the normal distribution works well unless the product p*n <=5, where p = population proportion estimate, and n = number of samples, in which case the binomial distribution is used instead. For example, a sample of 15 people shows 4 who are left handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4, so the binomial distribution should be used in this case.
References
- 1
Dalgaard, Peter, “Introductory Statistics with R”, Springer-Verlag, 2002.
- 2
Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
- 3
Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
- 4
Weisstein, Eric W. “Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/BinomialDistribution.html
- 5
Wikipedia, “Binomial distribution”, https://en.wikipedia.org/wiki/Binomial_distribution
Examples
Draw samples from the distribution:
>>> n, p = 10, .5 # number of trials, probability of each trial
>>> s = np.random.binomial(n, p, 1000)
# result of flipping a coin 10 times, tested 1000 times.
A real world example. A company drills 9 wild-cat oil exploration wells, each with an estimated probability of success of 0.1. All nine wells fail. What is the probability of that happening?
Let’s do 20,000 trials of the model, and count the number that generate zero positive results.
>>> sum(np.random.binomial(9, 0.1, 20000) == 0)/20000.
# answer = 0.38885, or 38%.
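The simulated answer can be compared with the closed-form probability from the Notes, P(N) with N = 0. This is a minimal sketch using NumPy's default_rng, as the Note recommends; the seed and variable names are illustrative assumptions.

```python
from math import comb

import numpy as np

n, p = 9, 0.1

# Exact probability of zero successes from the binomial formula.
exact = comb(n, 0) * p**0 * (1 - p) ** (n - 0)     # equals 0.9 ** 9

# Monte Carlo estimate over 20,000 simulated drilling campaigns.
rng = np.random.default_rng(0)
estimate = np.mean(rng.binomial(n, p, 20_000) == 0)
```

The exact value is about 0.387, which the simulated 38% approximates.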
dask.array.random.chisquare(df, size=None, chunks='auto', **kwargs)
Draw samples from a chi-square distribution.
This docstring was copied from numpy.random.mtrand.RandomState.chisquare.
Some inconsistencies with the Dask version may exist.
When df independent random variables, each with standard normal distributions (mean 0, variance 1), are squared and summed, the resulting distribution is chi-square (see Notes). This distribution is often used in hypothesis testing.
Note
New code should use the chisquare method of a default_rng() instance instead; see random-quick-start.
- Parameters
- df : float or array_like of floats
Number of degrees of freedom, must be > 0.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized chi-square distribution.
- Raises
- ValueError
When df <= 0 or when an inappropriate size (e.g. size=-1) is given.
See also
Generator.chisquare
which should be used for new code.
Notes
The variable obtained by summing the squares of df independent, standard normally distributed random variables:
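$$Q = \sum_{i=1}^{\mathrm{df}} X_i^2$$
is chi-square distributed, denoted (with k = df degrees of freedom)
$$Q \sim \chi^2_k.$$
The probability density function of the chi-squared distribution is
$$p(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2},$$
where \Gamma is the gamma function,
$$\Gamma(x) = \int_0^{\infty} t^{x - 1} e^{-t} \, dt.$$
References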
- 1
NIST “Engineering Statistics Handbook” https://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm
Examples
>>> np.random.chisquare(2,4)
array([ 1.89920014, 9.00867716, 3.13710533, 5.62318272]) # random
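The construction in the Notes, a sum of df squared standard normals, can be compared against direct sampling. This is a minimal sketch using NumPy's default_rng; the seed, sample count, and df value are illustrative assumptions, and the check relies on the standard fact that a chi-square variable has mean df.

```python
import numpy as np

rng = np.random.default_rng(0)
df = 3
num_samples = 100_000

# Sum of squares of df independent standard normal variables.
normals = rng.standard_normal((num_samples, df))
q = (normals ** 2).sum(axis=1)

# Direct draws from the chi-square distribution.
direct = rng.chisquare(df, num_samples)

mean_q = q.mean()
mean_direct = direct.mean()
```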
-
dask.array.random.choice(a, size=None, replace=True, p=None, chunks='auto')
Generates a random sample from a given 1-D array.
This docstring was copied from numpy.random.mtrand.RandomState.choice.
Some inconsistencies with the Dask version may exist.
New in version 1.7.0.
Note
New code should use the choice method of a default_rng() instance instead; see random-quick-start.
- Parameters
- a : 1-D array-like or int
If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a).
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- replace : boolean, optional
Whether the sample is with or without replacement.
- p : 1-D array-like, optional
The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.
- Returns
- samples : single item or ndarray
The generated random samples.
- Raises
- ValueError
If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size
See also
randint, shuffle, permutation
Generator.choice
which should be used in new code
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
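The Note above recommends the choice method of a default_rng() instance for new code. A minimal sketch of the same sampling patterns with that API follows; it uses NumPy directly (not the Dask wrapper), and the seed and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed

# Uniform sample of size 3 from np.arange(5), with replacement.
with_replacement = rng.choice(5, size=3)

# Weighted sample without replacement; entries with p == 0 are never drawn,
# so the result is some ordering of the three entries with nonzero weight.
weights = [0.1, 0.0, 0.3, 0.6, 0.0]
without_replacement = rng.choice(5, size=3, replace=False, p=weights)

print(with_replacement)
print(without_replacement)
```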
-
dask.array.random.exponential(scale=1.0, size=None, chunks='auto', **kwargs)
Draw samples from an exponential distribution.
This docstring was copied from numpy.random.mtrand.RandomState.exponential.
Some inconsistencies with the Dask version may exist.
Its probability density function is
$$f\left(x; \frac{1}{\beta}\right) = \frac{1}{\beta} \exp\left(-\frac{x}{\beta}\right),$$
for x > 0 and 0 elsewhere. β is the scale parameter, which is the inverse of the rate parameter λ = 1/β. The rate parameter is an alternative, widely used parameterization of the exponential distribution [3].
The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [1], or the time between page requests to Wikipedia [2].
Note
New code should use the exponential method of a default_rng() instance instead; see random-quick-start.
- Parameters
- scale : float or array_like of floats
The scale parameter, β = 1/λ. Must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized exponential distribution.
See also
Generator.exponential
which should be used for new code.
References
- 1
Peyton Z. Peebles Jr., “Probability, Random Variables and Random Signal Principles”, 4th ed, 2001, p. 57.
- 2
Wikipedia, “Poisson process”, https://en.wikipedia.org/wiki/Poisson_process
- 3
Wikipedia, “Exponential distribution”, https://en.wikipedia.org/wiki/Exponential_distribution
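Because the function takes the scale β rather than the rate λ, a rate has to be inverted before it is passed in. The sketch below illustrates this with the NumPy Generator API; the rate value, seed, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed

rate = 4.0          # hypothetical rate parameter (lambda)
scale = 1.0 / rate  # the scale parameter is beta = 1 / lambda

samples = rng.exponential(scale=scale, size=200_000)

# For an exponential distribution both the mean and the
# standard deviation equal the scale parameter.
print(samples.mean(), samples.std(), scale)
```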
-
dask.array.random.f(dfnum, dfden, size=None, chunks='auto', **kwargs)
Draw samples from an F distribution.
This docstring was copied from numpy.random.mtrand.RandomState.f.
Some inconsistencies with the Dask version may exist.
Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters must be greater than zero.
The random variate of the F distribution (also known as the Fisher distribution) is a continuous probability distribution that arises in ANOVA tests, and is the ratio of two chi-square variates.
Note
New code should use the f method of a default_rng() instance instead; see random-quick-start.
- Parameters
- dfnum : float or array_like of floats
Degrees of freedom in numerator, must be > 0.
- dfden : float or array_like of floats
Degrees of freedom in denominator, must be > 0.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum and dfden are both scalars. Otherwise, np.broadcast(dfnum, dfden).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Fisher distribution.
See also
scipy.stats.f
probability density function, distribution or cumulative density function, etc.
Generator.f
which should be used for new code.
Notes
The F statistic is used to compare in-group variances to between-group variances. Calculating the distribution depends on the sampling, and so it is a function of the respective degrees of freedom in the problem. The variable dfnum is the number of groups minus one, the between-groups degrees of freedom, while dfden is the within-groups degrees of freedom, the total number of observations across all groups minus the number of groups.
References
- 1
Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
- 2
Wikipedia, “F-distribution”, https://en.wikipedia.org/wiki/F-distribution
Examples
An example from Glantz[1], pp 47-40:
Two groups, children of diabetics (25 people) and children from people without diabetes (25 controls). Fasting blood glucose was measured; the case group had a mean value of 86.1, and the controls had a mean value of 82.2. Standard deviations were 2.09 and 2.49 respectively. Are these data consistent with the null hypothesis that the parents' diabetic status does not affect their children's blood glucose levels? Calculating the F statistic from the data gives a value of 36.01.
Draw samples from the distribution:
>>> dfnum = 1. # between group degrees of freedom
>>> dfden = 48. # within groups degrees of freedom
>>> s = np.random.f(dfnum, dfden, 1000)
The lower bound for the top 1% of the samples is:
>>> np.sort(s)[-10]
7.61988120985 # random
So there is about a 1% chance that the F statistic will exceed 7.62. The measured value is 36, so the null hypothesis is rejected at the 1% level.
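The F statistic and degrees of freedom used in this example can be reproduced approximately from the quoted summary statistics. The sketch below assumes the standard one-way ANOVA formulas for equal group sizes; it is not taken from the cited text, and the variable names are illustrative. It gives a value of about 36, close to the quoted 36.01 (the small gap is presumably due to rounding in the summary statistics).

```python
# Summary statistics quoted in the example above.
n_per_group = 25
means = [86.1, 82.2]
stds = [2.09, 2.49]

n_groups = len(means)
grand_mean = sum(means) / n_groups  # valid because the groups are equal in size

# Between-groups mean square: spread of the group means around the grand mean.
ms_between = n_per_group * sum((m - grand_mean) ** 2 for m in means) / (n_groups - 1)

# Within-groups mean square: average of the group variances (equal group sizes).
ms_within = sum(s ** 2 for s in stds) / n_groups

f_statistic = ms_between / ms_within
dfnum = n_groups - 1                  # between-groups degrees of freedom
dfden = n_groups * (n_per_group - 1)  # within-groups degrees of freedom

print(f_statistic, dfnum, dfden)
```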
-
dask.array.random.gamma(shape, scale=1.0, size=None, chunks='auto', **kwargs)
Draw samples from a Gamma distribution.
This docstring was copied from numpy.random.mtrand.RandomState.gamma.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale (sometimes designated “theta”), where both parameters are > 0.
Note
New code should use the gamma method of a default_rng() instance instead; see random-quick-start.
- Parameters
- shape : float or array_like of floats
The shape of the gamma distribution. Must be non-negative.
- scale : float or array_like of floats, optional
The scale of the gamma distribution. Must be non-negative. Default is equal to 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape and scale are both scalars. Otherwise, np.broadcast(shape, scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized gamma distribution.
See also
scipy.stats.gamma
probability density function, distribution or cumulative density function, etc.
Generator.gamma
which should be used for new code.
Notes
The probability density for the Gamma distribution is
$$p(x) = x^{k-1} \frac{e^{-x/\theta}}{\theta^{k} \Gamma(k)},$$
where k is the shape and θ the scale, and Γ is the Gamma function.
The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.
References
- 1
Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
- 2
Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution
Examples
Draw samples from the distribution:
>>> shape, scale = 2., 2. # mean=4, std=2*sqrt(2)
>>> s = np.random.gamma(shape, scale, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
>>> count, bins, ignored = plt.hist(s, 50, density=True)
>>> y = bins**(shape-1)*(np.exp(-bins/scale) /
...                      (sps.gamma(shape)*scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')
>>> plt.show()
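The comment in the example (mean=4, std=2*sqrt(2)) follows from the Gamma moments: the mean is k·θ and the standard deviation is sqrt(k)·θ. A small sketch checking this numerically, assuming the NumPy Generator API and an arbitrary seed:

```python
import numpy as np

shape, scale = 2.0, 2.0
rng = np.random.default_rng(seed=3)  # arbitrary seed
s = rng.gamma(shape, scale, size=200_000)

expected_mean = shape * scale          # k * theta = 4
expected_std = np.sqrt(shape) * scale  # sqrt(k) * theta = 2 * sqrt(2)

print(s.mean(), expected_mean)
print(s.std(), expected_std)
```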
-
dask.array.random.geometric(p, size=None, chunks='auto', **kwargs)
Draw samples from the geometric distribution.
This docstring was copied from numpy.random.mtrand.RandomState.geometric.
Some inconsistencies with the Dask version may exist.
Bernoulli trials are experiments with one of two outcomes: success or failure (an example of such an experiment is flipping a coin). The geometric distribution models the number of trials that must be run in order to achieve success. It is therefore supported on the positive integers, k = 1, 2, ....
The probability mass function of the geometric distribution is
$$f(k) = (1 - p)^{k - 1} p,$$
where p is the probability of success of an individual trial.
Note
New code should use the geometric method of a default_rng() instance instead; see random-quick-start.
- Parameters
- p : float or array_like of floats
The probability of success of an individual trial.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized geometric distribution.
See also
Generator.geometric
which should be used for new code.
Examples
Draw ten thousand values from the geometric distribution, with the probability of an individual success equal to 0.35:
>>> z = np.random.geometric(p=0.35, size=10000)
How many trials succeeded after a single run?
>>> (z == 1).sum() / 10000.
0.34889999999999999 #random
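The fraction computed above estimates f(1) = p. The same comparison can be extended to the first few values of k using the probability mass function given earlier. This is a sketch assuming the NumPy Generator API; the seed and variable names are illustrative.

```python
import numpy as np

p = 0.35
rng = np.random.default_rng(seed=2)  # arbitrary seed
z = rng.geometric(p=p, size=100_000)

ks = np.arange(1, 6)

# Empirical frequency of needing exactly k trials.
empirical = np.array([(z == k).mean() for k in ks])

# Probability mass function f(k) = (1 - p)**(k - 1) * p.
theoretical = (1 - p) ** (ks - 1) * p

for k, e, t in zip(ks, empirical, theoretical):
    print(k, round(float(e), 4), round(float(t), 4))
```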
-
dask.array.random.gumbel(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)
Draw samples from a Gumbel distribution.
This docstring was copied from numpy.random.mtrand.RandomState.gumbel.
Some inconsistencies with the Dask version may exist.
Draw samples from a Gumbel distribution with specified location and scale. For more information on the Gumbel distribution, see Notes and References below.
Note
New code should use the gumbel method of a default_rng() instance instead; see random-quick-start.
- Parameters
- loc : float or array_like of floats, optional
The location of the mode of the distribution. Default is 0.
- scale : float or array_like of floats, optional
The scale parameter of the distribution. Default is 1. Must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Gumbel distribution.
See also
scipy.stats.gumbel_l
scipy.stats.gumbel_r
scipy.stats.genextreme
weibull
Generator.gumbel
which should be used for new code.
Notes
The Gumbel (or Smallest Extreme Value (SEV) or the Smallest Extreme Value Type I) distribution is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. The Gumbel is a special case of the Extreme Value Type I distribution for maximums from distributions with “exponential-like” tails.
The probability density for the Gumbel distribution is
$$p(x) = \frac{e^{-(x - \mu)/\beta}}{\beta} e^{-e^{-(x - \mu)/\beta}},$$
where μ is the mode, a location parameter, and β is the scale parameter.
The Gumbel (named for German mathematician Emil Julius Gumbel) was used very early in the hydrology literature, for modeling the occurrence of flood events. It is also used for modeling maximum wind speed and rainfall rates. It is a “fat-tailed” distribution - the probability of an event in the tail of the distribution is larger than if one used a Gaussian, hence the surprisingly frequent occurrence of 100-year floods. Floods were initially modeled as a Gaussian process, which underestimated the frequency of extreme events.
It is one of a class of extreme value distributions, the Generalized Extreme Value (GEV) distributions, which also includes the Weibull and Frechet.
The function has a mean of μ + 0.57721β and a variance of (π²/6)β².
References
- 1
Gumbel, E. J., “Statistics of Extremes,” New York: Columbia University Press, 1958.
- 2
Reiss, R.-D. and Thomas, M., “Statistical Analysis of Extreme Values from Insurance, Finance, Hydrology and Other Fields,” Basel: Birkhauser Verlag, 2001.
Examples
Draw samples from the distribution:
>>> mu, beta = 0, 0.1 # location and scale
>>> s = np.random.gumbel(mu, beta, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)
...          * np.exp( -np.exp( -(bins - mu) /beta) ),
...          linewidth=2, color='r')
>>> plt.show()
Show how an extreme value distribution can arise from a Gaussian process and compare to a Gaussian:
>>> means = []
>>> maxima = []
>>> for i in range(0,1000) :
...    a = np.random.normal(mu, beta, 1000)
...    means.append(a.mean())
...    maxima.append(a.max())
>>> count, bins, ignored = plt.hist(maxima, 30, density=True)
>>> beta = np.std(maxima) * np.sqrt(6) / np.pi
>>> mu = np.mean(maxima) - 0.57721*beta
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)
...          * np.exp(-np.exp(-(bins - mu)/beta)),
...          linewidth=2, color='r')
>>> plt.plot(bins, 1/(beta * np.sqrt(2 * np.pi))
...          * np.exp(-(bins - mu)**2 / (2 * beta**2)),
...          linewidth=2, color='g')
>>> plt.show()
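The moment formulas quoted in the Notes are what the second example relies on when it fits beta and mu from the sample maxima. A short sketch checking them against direct draws, assuming the NumPy Generator API (np.euler_gamma is the constant 0.57721...); the seed is arbitrary:

```python
import numpy as np

mu, beta = 0.0, 0.1  # location and scale, as in the example above
rng = np.random.default_rng(seed=7)  # arbitrary seed
s = rng.gumbel(mu, beta, size=200_000)

expected_mean = mu + np.euler_gamma * beta   # mu + 0.57721 * beta
expected_var = (np.pi ** 2 / 6) * beta ** 2  # (pi^2 / 6) * beta^2

print(s.mean(), expected_mean)
print(s.var(), expected_var)
```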
-
dask.array.random.hypergeometric(ngood, nbad, nsample, size=None, chunks='auto', **kwargs)
Draw samples from a Hypergeometric distribution.
This docstring was copied from numpy.random.mtrand.RandomState.hypergeometric.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a hypergeometric distribution with specified parameters, ngood (ways to make a good selection), nbad (ways to make a bad selection), and nsample (number of items sampled, which is less than or equal to the sum ngood + nbad).
Note
New code should use the hypergeometric method of a default_rng() instance instead; see random-quick-start.
- Parameters
- ngood : int or array_like of ints
Number of ways to make a good selection. Must be nonnegative.
- nbad : int or array_like of ints
Number of ways to make a bad selection. Must be nonnegative.
- nsample : int or array_like of ints
Number of items sampled. Must be at least 1 and at most ngood + nbad.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if ngood, nbad, and nsample are all scalars. Otherwise, np.broadcast(ngood, nbad, nsample).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized hypergeometric distribution. Each sample is the number of good items within a randomly selected subset of size nsample taken from a set of ngood good items and nbad bad items.
See also
scipy.stats.hypergeom
probability density function, distribution or cumulative density function, etc.
Generator.hypergeometric
which should be used for new code.
Notes
The probability density for the Hypergeometric distribution is
$$P(x) = \frac{\binom{g}{x} \binom{b}{n-x}}{\binom{g+b}{n}},$$
where 0 ≤ x ≤ n and n − b ≤ x ≤ g, for P(x) the probability of x good results in the drawn sample, g = ngood, b = nbad, and n = nsample.
Consider an urn with black and white marbles in it, ngood of them black and nbad of them white. If you draw nsample marbles without replacement, then the hypergeometric distribution describes the distribution of black marbles in the drawn sample.
Note that this distribution is very similar to the binomial distribution, except that in this case, samples are drawn without replacement, whereas in the Binomial case samples are drawn with replacement (or the sample space is infinite). As the sample space becomes large, this distribution approaches the binomial.
References
- 1
Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
- 2
Weisstein, Eric W. “Hypergeometric Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricDistribution.html
- 3
Wikipedia, “Hypergeometric distribution”, https://en.wikipedia.org/wiki/Hypergeometric_distribution
Examples
Draw samples from the distribution:
>>> ngood, nbad, nsamp = 100, 2, 10
# number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)
>>> from matplotlib.pyplot import hist
>>> hist(s)
# note that it is very unlikely to grab both bad items
Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?
>>> s = np.random.hypergeometric(15, 15, 15, 100000)
>>> sum(s>=12)/100000. + sum(s<=3)/100000.
# answer = 0.003 ... pretty unlikely!
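The simulated answer for the urn question can be checked exactly with the formula from the Notes. The sketch below uses only the standard library; hypergeom_pmf is a hypothetical helper name introduced here for illustration.

```python
from math import comb

ngood, nbad, nsample = 15, 15, 15

def hypergeom_pmf(x, g, b, n):
    """P(x good items in a sample of n drawn from g good and b bad items)."""
    return comb(g, x) * comb(b, n - x) / comb(g + b, n)

# 12 or more marbles of one color means x >= 12 (good) or x <= 3 (so >= 12 bad).
tail = sum(
    hypergeom_pmf(x, ngood, nbad, nsample)
    for x in range(nsample + 1)
    if x >= 12 or x <= 3
)

print(f"{tail:.5f}")
```

The exact value is about 0.0028, consistent with the simulated 0.003.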
-
dask.array.random.laplace(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)
Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).
This docstring was copied from numpy.random.mtrand.RandomState.laplace.
Some inconsistencies with the Dask version may exist.
The Laplace distribution is similar to the Gaussian/normal distribution, but is sharper at the peak and has fatter tails. It represents the difference between two independent, identically distributed exponential random variables.
Note
New code should use the laplace method of a default_rng() instance instead; see random-quick-start.
- Parameters
- loc : float or array_like of floats, optional
The position, μ, of the distribution peak. Default is 0.
- scale : float or array_like of floats, optional
λ, the exponential decay. Default is 1. Must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Laplace distribution.
See also
Generator.laplace
which should be used for new code.
Notes
It has the probability density function
$$f(x; \mu, \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x - \mu|}{\lambda}\right).$$
The first law of Laplace, from 1774, states that the frequency of an error can be expressed as an exponential function of the absolute magnitude of the error, which leads to the Laplace distribution. For many problems in economics and health sciences, this distribution seems to model the data better than the standard Gaussian distribution.
References
- 1
Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
- 2
Kotz, Samuel, et al. “The Laplace Distribution and Generalizations,” Birkhauser, 2001.
- 3
Weisstein, Eric W. “Laplace Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LaplaceDistribution.html
- 4
Wikipedia, “Laplace distribution”, https://en.wikipedia.org/wiki/Laplace_distribution
Examples
Draw samples from the distribution:
>>> loc, scale = 0., 1.
>>> s = np.random.laplace(loc, scale, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> x = np.arange(-8., 8., .01)
>>> pdf = np.exp(-abs(x-loc)/scale)/(2.*scale)
>>> plt.plot(x, pdf)
Plot Gaussian for comparison:
>>> g = (1/(scale * np.sqrt(2 * np.pi)) *
...      np.exp(-(x - loc)**2 / (2 * scale**2)))
>>> plt.plot(x,g)
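The description above notes that the Laplace distribution represents the difference between two independent, identically distributed exponential variables. A sketch of that construction, compared with direct draws, assuming the NumPy Generator API and an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(seed=4)  # arbitrary seed
scale = 1.0
n = 200_000

# Difference of two independent exponential variables with the same scale.
diff = rng.exponential(scale, n) - rng.exponential(scale, n)

# Direct draws from the Laplace distribution centered at 0.
direct = rng.laplace(0.0, scale, n)

# For Laplace(0, scale): variance is 2 * scale**2 and E|X| is scale.
print(diff.var(), direct.var())
print(np.abs(diff).mean(), np.abs(direct).mean())
```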
-
dask.array.random.logistic(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)
Draw samples from a logistic distribution.
This docstring was copied from numpy.random.mtrand.RandomState.logistic.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a logistic distribution with specified parameters, loc (location or mean, also median), and scale (>0).
Note
New code should use the logistic method of a default_rng() instance instead; see random-quick-start.
- Parameters
- loc : float or array_like of floats, optional
Parameter of the distribution. Default is 0.
- scale : float or array_like of floats, optional
Parameter of the distribution. Must be non-negative. Default is 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized logistic distribution.
See also
scipy.stats.logistic
probability density function, distribution or cumulative density function, etc.
Generator.logistic
which should be used for new code.
Notes
The probability density for the Logistic distribution is
$$P(x) = \frac{e^{-(x - \mu)/s}}{s \left(1 + e^{-(x - \mu)/s}\right)^{2}},$$
where μ = location and s = scale.
The Logistic distribution is used in Extreme Value problems where it can act as a mixture of Gumbel distributions, in Epidemiology, and by the World Chess Federation (FIDE) where it is used in the Elo ranking system, assuming the performance of each player is a logistically distributed random variable.
References
- 1
Reiss, R.-D. and Thomas M. (2001), “Statistical Analysis of Extreme Values, from Insurance, Finance, Hydrology and Other Fields,” Birkhauser Verlag, Basel, pp 132-133.
- 2
Weisstein, Eric W. “Logistic Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LogisticDistribution.html
- 3
Wikipedia, “Logistic distribution”, https://en.wikipedia.org/wiki/Logistic_distribution
Examples
Draw samples from the distribution:
>>> loc, scale = 10, 1
>>> s = np.random.logistic(loc, scale, 10000)
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, bins=50)
# plot against distribution
>>> def logist(x, loc, scale):
...     return np.exp((loc-x)/scale)/(scale*(1+np.exp((loc-x)/scale))**2)
>>> lgst_val = logist(bins, loc, scale)
>>> plt.plot(bins, lgst_val * count.max() / lgst_val.max())
>>> plt.show()
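The description states that loc is both the mean and the median of the distribution. A brief numerical check, sketched with the NumPy Generator API and an arbitrary seed:

```python
import numpy as np

loc, scale = 10.0, 1.0
rng = np.random.default_rng(seed=8)  # arbitrary seed
s = rng.logistic(loc, scale, size=100_000)

# The logistic distribution is symmetric about loc,
# so the sample mean and median should both be close to it.
sample_mean = s.mean()
sample_median = np.median(s)

print(sample_mean, sample_median)
```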
-
dask.array.random.lognormal(mean=0.0, sigma=1.0, size=None, chunks='auto', **kwargs)
Draw samples from a log-normal distribution.
This docstring was copied from numpy.random.mtrand.RandomState.lognormal.
Some inconsistencies with the Dask version may exist.
Draw samples from a log-normal distribution with specified mean, standard deviation, and array shape. Note that the mean and standard deviation are not the values for the distribution itself, but of the underlying normal distribution it is derived from.
Note
New code should use the lognormal method of a default_rng() instance instead; see random-quick-start.
- Parameters
- mean : float or array_like of floats, optional
Mean value of the underlying normal distribution. Default is 0.
- sigma : float or array_like of floats, optional
Standard deviation of the underlying normal distribution. Must be non-negative. Default is 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and sigma are both scalars. Otherwise, np.broadcast(mean, sigma).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized log-normal distribution.
See also
scipy.stats.lognorm
probability density function, distribution, cumulative density function, etc.
Generator.lognormal
which should be used for new code.
Notes
A variable x has a log-normal distribution if log(x) is normally distributed. The probability density function for the log-normal distribution is:
$$p(x) = \frac{1}{\sigma x \sqrt{2\pi}} e^{-\frac{(\ln(x) - \mu)^{2}}{2\sigma^{2}}},$$
where μ is the mean and σ is the standard deviation of the normally distributed logarithm of the variable.
A log-normal distribution results if a random variable is the product of a large number of independent, identically-distributed variables in the same way that a normal distribution results if the variable is the sum of a large number of independent, identically-distributed variables.
References
- 1
Limpert, E., Stahel, W. A., and Abbt, M., “Log-normal Distributions across the Sciences: Keys and Clues,” BioScience, Vol. 51, No. 5, May, 2001. https://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
- 2
Reiss, R.D. and Thomas, M., “Statistical Analysis of Extreme Values,” Basel: Birkhauser Verlag, 2001, pp. 31-32.
Examples
Draw samples from the distribution:
>>> mu, sigma = 3., 1. # mean and standard deviation
>>> s = np.random.lognormal(mu, sigma, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 100, density=True, align='mid')
>>> x = np.linspace(min(bins), max(bins), 10000)
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, linewidth=2, color='r')
>>> plt.axis('tight')
>>> plt.show()
Demonstrate that taking the products of random samples from a normal distribution can be fit well by a log-normal probability density function.
>>> # Generate a thousand samples: each is the product of 100 random
>>> # values, drawn from a normal distribution.
>>> b = []
>>> for i in range(1000):
...    a = 10. + np.random.standard_normal(100)
...    b.append(np.prod(a))
>>> b = np.array(b) / np.min(b) # scale values to be positive
>>> count, bins, ignored = plt.hist(b, 100, density=True, align='mid')
>>> sigma = np.std(np.log(b))
>>> mu = np.mean(np.log(b))
>>> x = np.linspace(min(bins), max(bins), 10000)
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, color='r', linewidth=2)
>>> plt.show()
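As the description stresses, mean and sigma describe the underlying normal distribution, not the log-normal samples themselves. The sketch below makes the distinction concrete, assuming the NumPy Generator API and the standard result that a log-normal variable has mean exp(μ + σ²/2); the seed is arbitrary.

```python
import numpy as np

mu, sigma = 3.0, 1.0
rng = np.random.default_rng(seed=9)  # arbitrary seed
s = rng.lognormal(mu, sigma, size=200_000)

# The logarithm of the samples recovers the parameters that were passed in.
log_mean = np.log(s).mean()
log_std = np.log(s).std()

# The samples themselves have a much larger mean: exp(mu + sigma**2 / 2).
expected_sample_mean = np.exp(mu + sigma ** 2 / 2)

print(log_mean, log_std)
print(s.mean(), expected_sample_mean)
```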
-
dask.array.random.logseries(p, size=None, chunks='auto', **kwargs)
Draw samples from a logarithmic series distribution.
This docstring was copied from numpy.random.mtrand.RandomState.logseries.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a log series distribution with specified shape parameter, 0 < p < 1.
Note
New code should use the logseries method of a default_rng() instance instead; see random-quick-start.
- Parameters
- p : float or array_like of floats
Shape parameter for the distribution. Must be in the range (0, 1).
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized logarithmic series distribution.
See also
scipy.stats.logser
probability density function, distribution or cumulative density function, etc.
Generator.logseries
which should be used for new code.
Notes
The probability density for the Log Series distribution is
$$P(k) = \frac{-p^{k}}{k \ln(1 - p)},$$
where p = probability.
The log series distribution is frequently used to represent species richness and occurrence, first proposed by Fisher, Corbet, and Williams in 1943 [2]. It may also be used to model the numbers of occupants seen in cars [3].
References
- 1
Buzas, Martin A.; Culver, Stephen J., Understanding regional species diversity through the log series distribution of occurrences: BIODIVERSITY RESEARCH Diversity & Distributions, Volume 5, Number 5, September 1999, pp. 187-195(9).
- 2
Fisher, R.A., A.S. Corbet, and C.B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12:42-58.
- 3
D. J. Hand, F. Daly, D. Lunn, E. Ostrowski, A Handbook of Small Data Sets, CRC Press, 1994.
- 4
Wikipedia, “Logarithmic distribution”, https://en.wikipedia.org/wiki/Logarithmic_distribution
Examples
Draw samples from the distribution:
>>> a = .6
>>> s = np.random.logseries(a, 10000)
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s)
# plot against distribution
>>> def logseries(k, p):
...     return -p**k/(k*np.log(1-p))
>>> plt.plot(bins, logseries(bins, a)*count.max()/
...          logseries(bins, a).max(), 'r')
>>> plt.show()
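The formula in the Notes can also be checked without plotting: its values should sum to one over k = 1, 2, ..., and match the observed frequencies of small k. A sketch assuming the NumPy Generator API; logseries_pmf is a hypothetical helper name and the seed is arbitrary.

```python
import numpy as np

p = 0.6

def logseries_pmf(k, p):
    """Log series probability P(k) = -p**k / (k * ln(1 - p))."""
    return -p ** k / (k * np.log(1 - p))

# The terms decay geometrically, so 200 terms are ample for p = 0.6.
total = logseries_pmf(np.arange(1, 201), p).sum()

rng = np.random.default_rng(seed=6)  # arbitrary seed
s = rng.logseries(p, size=100_000)

ks = np.arange(1, 6)
empirical = np.array([(s == k).mean() for k in ks])
theoretical = logseries_pmf(ks, p)

print(total)
print(empirical)
print(theoretical)
```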
-
dask.array.random.negative_binomial(n, p, size=None, chunks='auto', **kwargs)
Draw samples from a negative binomial distribution.
This docstring was copied from numpy.random.mtrand.RandomState.negative_binomial.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a negative binomial distribution with specified parameters, n successes and p probability of success where n is > 0 and p is in the interval [0, 1].
Note
New code should use the negative_binomial method of a default_rng() instance instead; see random-quick-start.
- Parameters
- n : float or array_like of floats
Parameter of the distribution, > 0.
- p : float or array_like of floats
Parameter of the distribution, >= 0 and <= 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized negative binomial distribution, where each sample is equal to N, the number of failures that occurred before a total of n successes was reached.
See also
Generator.negative_binomial
which should be used for new code.
Notes
The probability mass function of the negative binomial distribution is
$$P(N; n, p) = \frac{\Gamma(N + n)}{N! \, \Gamma(n)} p^{n} (1 - p)^{N},$$
where n is the number of successes, p is the probability of success, N + n is the number of trials, and Γ is the gamma function. When n is an integer,
$$\frac{\Gamma(N + n)}{N! \, \Gamma(n)} = \binom{N + n - 1}{N},$$
which is the more common form of this term in the pmf. The negative binomial distribution gives the probability of N failures given n successes, with a success on the last trial.
If one throws a die repeatedly until the third time a “1” appears, then the probability distribution of the number of non-“1”s that appear before the third “1” is a negative binomial distribution.
References
- 1
Weisstein, Eric W. “Negative Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NegativeBinomialDistribution.html
- 2
Wikipedia, “Negative binomial distribution”, https://en.wikipedia.org/wiki/Negative_binomial_distribution
Examples
Draw samples from the distribution:
A real world example. A company drills wild-cat oil exploration wells, each with an estimated probability of success of 0.1. What is the probability of having one success for each successive well, that is, what is the probability of a single success after drilling 5 wells, after 6 wells, etc.?
>>> s = np.random.negative_binomial(1, 0.1, 100000)
>>> for i in range(1, 11):
...    probability = sum(s<i) / 100000.
...    print(i, "wells drilled, probability of one success =", probability)
-
dask.array.random.
noncentral_chisquare
(df, nonc, size=None, chunks='auto', **kwargs)¶ Draw samples from a noncentral chi-square distribution.
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_chisquare.
Some inconsistencies with the Dask version may exist.
The noncentral χ2 distribution is a generalization of the χ2 distribution.
Note
New code should use the noncentral_chisquare method of a default_rng() instance instead; see random-quick-start.
- Parameters
- df : float or array_like of floats
Degrees of freedom, must be > 0.
Changed in version 1.10.0: Earlier NumPy versions required df > 1.
- nonc : float or array_like of floats
Non-centrality, must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df and nonc are both scalars. Otherwise, np.broadcast(df, nonc).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized noncentral chi-square distribution.
See also
Generator.noncentral_chisquare
which should be used for new code.
Notes
The probability density function for the noncentral Chi-square distribution is
$$P(x; df, nonc) = \sum_{i=0}^{\infty} \frac{e^{-nonc/2} (nonc/2)^{i}}{i!} P_{Y_{df+2i}}(x),$$
where $Y_{q}$ is the Chi-square with $q$ degrees of freedom.
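The sum above is a Poisson-weighted mixture of central chi-square densities. As a rough sketch (standard library only; `poisson_mixture_weights` is a hypothetical helper and the 200-term cutoff is an arbitrary assumption), the weights can be tabulated and used to recover the mean of the distribution, df + nonc:

```python
import math


def poisson_mixture_weights(nonc, terms=200):
    """Weights e^{-nonc/2} (nonc/2)^i / i! for i = 0 .. terms - 1 (nonc > 0)."""
    half = nonc / 2.0
    return [math.exp(-half + i * math.log(half) - math.lgamma(i + 1))
            for i in range(terms)]


df, nonc = 3, 20
weights = poisson_mixture_weights(nonc)

# A central chi-square with q degrees of freedom has mean q, so the mixture
# mean is sum_i w_i * (df + 2 i), which simplifies to df + nonc.
mixture_mean = sum(w * (df + 2 * i) for i, w in enumerate(weights))
print(sum(weights), mixture_mean)
```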
References
- 1
Wikipedia, “Noncentral chi-squared distribution” https://en.wikipedia.org/wiki/Noncentral_chi-squared_distribution
Examples
Draw values from the distribution and plot the histogram
>>> import matplotlib.pyplot as plt
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),
...                   bins=200, density=True)
>>> plt.show()
Draw values from a noncentral chisquare with very small noncentrality, and compare to a chisquare.
>>> plt.figure()
>>> values = plt.hist(np.random.noncentral_chisquare(3, .0000001, 100000),
...                   bins=np.arange(0., 25, .1), density=True)
>>> values2 = plt.hist(np.random.chisquare(3, 100000),
...                    bins=np.arange(0., 25, .1), density=True)
>>> plt.plot(values[1][0:-1], values[0]-values2[0], 'ob')
>>> plt.show()
Demonstrate how large values of non-centrality lead to a more symmetric distribution.
>>> plt.figure()
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),
...                   bins=200, density=True)
>>> plt.show()
-
dask.array.random.
noncentral_f
(dfnum, dfden, nonc, size=None, chunks='auto', **kwargs)¶ Draw samples from the noncentral F distribution.
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_f.
Some inconsistencies with the Dask version may exist.
Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters must be > 0. nonc is the non-centrality parameter.
Note
New code should use the noncentral_f method of a default_rng() instance instead; see random-quick-start.
- Parameters
- dfnum : float or array_like of floats
Numerator degrees of freedom, must be > 0.
Changed in version 1.14.0: Earlier NumPy versions required dfnum > 1.
- dfden : float or array_like of floats
Denominator degrees of freedom, must be > 0.
- nonc : float or array_like of floats
Non-centrality parameter, the sum of the squares of the numerator means, must be >= 0.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum, dfden, and nonc are all scalars. Otherwise, np.broadcast(dfnum, dfden, nonc).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized noncentral Fisher distribution.
See also
Generator.noncentral_f
which should be used for new code.
Notes
When calculating the power of an experiment (power = probability of rejecting the null hypothesis when a specific alternative is true) the non-central F statistic becomes important. When the null hypothesis is true, the F statistic follows a central F distribution. When the null hypothesis is not true, it follows a non-central F distribution.
References
- 1
Weisstein, Eric W. “Noncentral F-Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NoncentralF-Distribution.html
- 2
Wikipedia, “Noncentral F-distribution”, https://en.wikipedia.org/wiki/Noncentral_F-distribution
Examples
In a study, testing for a specific alternative to the null hypothesis requires use of the Noncentral F distribution. We need to calculate the area in the tail of the distribution that exceeds the value of the F distribution for the null hypothesis. We’ll plot the two probability distributions for comparison.
>>> dfnum = 3 # between group deg of freedom
>>> dfden = 20 # within groups degrees of freedom
>>> nonc = 3.0
>>> nc_vals = np.random.noncentral_f(dfnum, dfden, nonc, 1000000)
>>> NF = np.histogram(nc_vals, bins=50, density=True)
>>> c_vals = np.random.f(dfnum, dfden, 1000000)
>>> F = np.histogram(c_vals, bins=50, density=True)
>>> import matplotlib.pyplot as plt
>>> plt.plot(F[1][1:], F[0])
>>> plt.plot(NF[1][1:], NF[0])
>>> plt.show()
-
dask.array.random.
normal
(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)¶ Draw random samples from a normal (Gaussian) distribution.
This docstring was copied from numpy.random.mtrand.RandomState.normal.
Some inconsistencies with the Dask version may exist.
The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [2], is often called the bell curve because of its characteristic shape (see the example below).
The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [2].
Note
New code should use the normal method of a default_rng() instance instead; see random-quick-start.
- Parameters
- loc : float or array_like of floats
Mean (“centre”) of the distribution.
- scale : float or array_like of floats
Standard deviation (spread or “width”) of the distribution. Must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized normal distribution.
See also
scipy.stats.norm
probability density function, distribution or cumulative density function, etc.
Generator.normal
which should be used for new code.
Notes
The probability density for the Gaussian distribution is
$$p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}},$$
where $\mu$ is the mean and $\sigma$ the standard deviation. The square of the standard deviation, $\sigma^2$, is called the variance.
The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at $\mu + \sigma$ and $\mu - \sigma$ [2]). This implies that normal is more likely to return samples lying close to the mean, rather than those far away.
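The 0.607 figure follows from the density itself: one standard deviation away from the mean, the exponent equals -1/2. A small standard-library sketch (the function name `normal_pdf` is illustrative) evaluates the density on both sides of the mean:

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


mu, sigma = 0.0, 0.1
peak = normal_pdf(mu, mu, sigma)
ratio_plus = normal_pdf(mu + sigma, mu, sigma) / peak
ratio_minus = normal_pdf(mu - sigma, mu, sigma) / peak
print(round(ratio_plus, 3), round(ratio_minus, 3))  # both round to 0.607
```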
References
- 1
Wikipedia, “Normal distribution”, https://en.wikipedia.org/wiki/Normal_distribution
- 2(1,2,3)
P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.
Examples
Draw samples from the distribution:
>>> mu, sigma = 0, 0.1 # mean and standard deviation
>>> s = np.random.normal(mu, sigma, 1000)
Verify the mean and the variance:
>>> abs(mu - np.mean(s))
0.0  # may vary
>>> abs(sigma - np.std(s, ddof=1))
0.1  # may vary
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
...                np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
...          linewidth=2, color='r')
>>> plt.show()
Two-by-four array of samples from N(3, 6.25):
>>> np.random.normal(3, 2.5, size=(2, 4))
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
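The examples above are copied from NumPy and call np.random directly. As a hedged sketch of the corresponding Dask call (assuming dask is installed; the size and chunks values are arbitrary illustrations, not recommendations), the same distribution can be described lazily in blocks through the chunks argument from the signature:

```python
import dask
import dask.array as da

# Lazily describe a 2000 x 2000 array of N(3, 2.5**2) samples in 500 x 500 blocks.
x = da.random.normal(3, 2.5, size=(2000, 2000), chunks=(500, 500))
print(x.chunks)  # block sizes along each axis; no samples drawn yet

# Evaluate both summary statistics in a single pass over the blocks.
sample_mean, sample_std = dask.compute(x.mean(), x.std())
print(sample_mean, sample_std)  # random, near 3 and 2.5
```

Nothing is sampled until compute is called, so the chunk shape mainly controls how much memory each task needs.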
-
dask.array.random.
pareto
(a, size=None, chunks='auto', **kwargs)¶ Draw samples from a Pareto II or Lomax distribution with specified shape.
This docstring was copied from numpy.random.mtrand.RandomState.pareto.
Some inconsistencies with the Dask version may exist.
The Lomax or Pareto II distribution is a shifted Pareto distribution. The classical Pareto distribution can be obtained from the Lomax distribution by adding 1 and multiplying by the scale parameter m (see Notes). The smallest value of the Lomax distribution is zero while for the classical Pareto distribution it is mu, where the standard Pareto distribution has location mu = 1. Lomax can also be considered as a simplified version of the Generalized Pareto distribution (available in SciPy), with the scale set to one and the location set to zero.
The Pareto distribution must be greater than zero, and is unbounded above. It is also known as the “80-20 rule”. In this distribution, 80 percent of the weights are in the lowest 20 percent of the range, while the other 20 percent fill the remaining 80 percent of the range.
Note
New code should use the pareto method of a default_rng() instance instead; see random-quick-start.
- Parameters
- a : float or array_like of floats
Shape of the distribution. Must be positive.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Pareto distribution.
See also
scipy.stats.lomax
probability density function, distribution or cumulative density function, etc.
scipy.stats.genpareto
probability density function, distribution or cumulative density function, etc.
Generator.pareto
which should be used for new code.
Notes
The probability density for the Pareto distribution is
$$p(x) = \frac{a m^{a}}{x^{a+1}},$$
where $a$ is the shape and $m$ the scale.
The Pareto distribution, named after the Italian economist Vilfredo Pareto, is a power law probability distribution useful in many real world problems. Outside the field of economics it is generally referred to as the Bradford distribution. Pareto developed the distribution to describe the distribution of wealth in an economy. It has also found use in insurance, web page access statistics, oil field sizes, and many other problems, including the download frequency for projects in Sourceforge [1]. It is one of the so-called “fat-tailed” distributions.
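To make the relationship between the Lomax and classical Pareto forms concrete, the sketch below draws Lomax samples by inverting the CDF F(x) = 1 - (1 + x)^(-a), then adds 1 and multiplies by m. It uses only the standard library; `lomax_sample` is a hypothetical helper, and inverse-CDF sampling is a stand-in chosen for illustration, not a claim about how NumPy or Dask generate samples.

```python
import random
import statistics


def lomax_sample(a, rng):
    """One Lomax (Pareto II) draw with shape a and unit scale, via inverse CDF."""
    u = rng.random()                      # uniform on [0, 1)
    return (1.0 - u) ** (-1.0 / a) - 1.0  # always >= 0


a, m = 3.0, 2.0                # shape and scale, matching the Examples section
rng = random.Random(12345)     # fixed seed so the run is repeatable
classical = [(lomax_sample(a, rng) + 1.0) * m for _ in range(100_000)]

# The classical Pareto has minimum m and median m * 2**(1/a).
print(min(classical), statistics.median(classical), m * 2 ** (1 / a))
```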
References
- 1
Francis Hunt and Paul Johnson, On the Pareto Distribution of Sourceforge projects.
- 2
Pareto, V. (1896). Course of Political Economy. Lausanne.
- 3
Reiss, R.D., Thomas, M.(2001), Statistical Analysis of Extreme Values, Birkhauser Verlag, Basel, pp 23-30.
- 4
Wikipedia, “Pareto distribution”, https://en.wikipedia.org/wiki/Pareto_distribution
Examples
Draw samples from the distribution:
>>> a, m = 3., 2.  # shape and mode
>>> s = (np.random.pareto(a, 1000) + 1) * m
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, _ = plt.hist(s, 100, density=True)
>>> fit = a*m**a / bins**(a+1)
>>> plt.plot(bins, max(count)*fit/max(fit), linewidth=2, color='r')
>>> plt.show()
-
dask.array.random.
poisson
(lam=1.0, size=None, chunks='auto', **kwargs)¶ Draw samples from a Poisson distribution.
This docstring was copied from numpy.random.mtrand.RandomState.poisson.
Some inconsistencies with the Dask version may exist.
The Poisson distribution is the limit of the binomial distribution for large N.
Note
New code should use the poisson method of a default_rng() instance instead; see random-quick-start.
- Parameters
- lam : float or array_like of floats
Expectation of interval, must be >= 0. A sequence of expectation intervals must be broadcastable over the requested size.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if lam is a scalar. Otherwise, np.array(lam).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Poisson distribution.
See also
Generator.poisson
which should be used for new code.
Notes
The Poisson distribution
$$f(k; \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
For events with an expected separation $\lambda$ the Poisson distribution $f(k; \lambda)$ describes the probability of $k$ events occurring within the observed interval $\lambda$.
Because the output is limited to the range of the C int64 type, a ValueError is raised when lam is within 10 sigma of the maximum representable value.
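The statement that the Poisson distribution is the large-N limit of the binomial can be illustrated numerically. The sketch below (standard library only; both helper names are illustrative) holds the mean N·p fixed at lam and compares the two pmfs at one value of k as N grows:

```python
import math


def poisson_pmf(k, lam):
    """f(k; lam) = lam**k * exp(-lam) / k!"""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


lam, k = 5.0, 3
# Hold the mean n * p = lam fixed while the number of trials grows.
gaps = [abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for n in (10, 100, 10_000)]
print(gaps)  # each gap is smaller than the one before
```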
References
- 1
Weisstein, Eric W. “Poisson Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html
- 2
Wikipedia, “Poisson distribution”, https://en.wikipedia.org/wiki/Poisson_distribution
Examples
Draw samples from the distribution:
>>> import numpy as np
>>> s = np.random.poisson(5, 10000)
Display histogram of the sample:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 14, density=True)
>>> plt.show()
Draw 100 values each for lambda 100 and lambda 500:
>>> s = np.random.poisson(lam=(100., 500.), size=(100, 2))
-
dask.array.random.
power
(a, size=None, chunks='auto', **kwargs)¶ Draws samples in [0, 1] from a power distribution with positive exponent a - 1.
This docstring was copied from numpy.random.mtrand.RandomState.power.
Some inconsistencies with the Dask version may exist.
Also known as the power function distribution.
Note
New code should use the power method of a default_rng() instance instead; see random-quick-start.
- Parameters
- a : float or array_like of floats
Parameter of the distribution. Must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized power distribution.
- Raises
- ValueError
If a <= 0.
See also
Generator.power
which should be used for new code.
Notes
The probability density function is
$$P(x; a) = a x^{a-1}, \quad 0 \le x \le 1, \; a > 0.$$
The power function distribution is just the inverse of the Pareto distribution. It may also be seen as a special case of the Beta distribution.
It is used, for example, in modeling the over-reporting of insurance claims.
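As a quick sanity check of the density and of the “inverse of the Pareto” remark, the sketch below integrates the pdf with a midpoint rule and then maps a single Lomax value through 1 / (1 + L). It is a standard-library illustration only: `power_pdf` is a hypothetical helper, and the Lomax value is built by inverse CDF as an assumption of this sketch.

```python
def power_pdf(x, a):
    """P(x; a) = a * x**(a - 1) on 0 <= x <= 1."""
    return a * x ** (a - 1)


a = 5.0
steps = 100_000
midpoints = [(i + 0.5) / steps for i in range(steps)]

# Midpoint-rule integrals: total probability and mean, expected 1 and a / (a + 1).
area = sum(power_pdf(x, a) for x in midpoints) / steps
mean = sum(x * power_pdf(x, a) for x in midpoints) / steps

# Inverse relation: if L is a Lomax value built from uniform u by inverse CDF,
# then 1 / (1 + L) equals (1 - u)**(1 / a), whose CDF is x**a.
u = 0.3
lomax = (1.0 - u) ** (-1.0 / a) - 1.0
inverse = 1.0 / (1.0 + lomax)
print(area, mean, inverse ** a, 1.0 - u)
```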
References
- 1
Christian Kleiber, Samuel Kotz, “Statistical size distributions in economics and actuarial sciences”, Wiley, 2003.
- 2
Heckert, N. A. and Filliben, James J. “NIST Handbook 148: Dataplot Reference Manual, Volume 2: Let Subcommands and Library Functions”, National Institute of Standards and Technology Handbook Series, June 2003. https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/powpdf.pdf
Examples
Draw samples from the distribution:
>>> a = 5. # shape
>>> samples = 1000
>>> s = np.random.power(a, samples)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, bins=30)
>>> x = np.linspace(0, 1, 100)
>>> y = a*x**(a-1.)
>>> normed_y = samples*np.diff(bins)[0]*y
>>> plt.plot(x, normed_y)
>>> plt.show()
Compare the power function distribution to the inverse of the Pareto.
>>> from scipy import stats
>>> rvs = np.random.power(5, 1000000)
>>> rvsp = np.random.pareto(5, 1000000)
>>> xx = np.linspace(0,1,100)
>>> powpdf = stats.powerlaw.pdf(xx,5)
>>> plt.figure()
>>> plt.hist(rvs, bins=50, density=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('np.random.power(5)')
>>> plt.figure()
>>> plt.hist(1./(1.+rvsp), bins=50, density=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('inverse of 1 + np.random.pareto(5)')
>>> plt.figure()
>>> plt.hist(1./(1.+rvsp), bins=50, density=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('inverse of stats.pareto(5)')
-
dask.array.random.
randint
(low, high=None, size=None, chunks='auto', dtype='l', **kwargs)¶ Return random integers from low (inclusive) to high (exclusive).
This docstring was copied from numpy.random.mtrand.RandomState.randint.
Some inconsistencies with the Dask version may exist.
Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Note
New code should use the integers method of a default_rng() instance instead; see random-quick-start.
- Parameters
- low : int or array-like of ints
Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
- high : int or array-like of ints, optional
If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- dtype : dtype, optional
Desired dtype of the result. All dtypes are determined by their name, i.e., ‘int64’, ‘int’, etc, so byteorder is not available and a specific precision may have different C types depending on the platform. The default value is np.int_.
New in version 1.11.0.
- Returns
- out : int or ndarray of ints
size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
See also
random_integers
similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.
Generator.integers
which should be used for new code.
Examples
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])
Generate a 1 x 3 array with 3 different upper bounds
>>> np.random.randint(1, [3, 5, 10])
array([2, 2, 9]) # random
Generate a 1 by 3 array with 3 different lower bounds
>>> np.random.randint([1, 5, 7], 10)
array([9, 8, 7]) # random
Generate a 2 by 4 array using broadcasting with dtype of uint8
>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)
array([[ 8,  6,  9,  7], # random
       [ 1, 16,  9, 12]], dtype=uint8)
-
dask.array.random.
random
(size=None, chunks='auto', **kwargs)¶ Return random floats in the half-open interval [0.0, 1.0).
This docstring was copied from numpy.random.mtrand.RandomState.random_sample.
Some inconsistencies with the Dask version may exist.
Results are from the “continuous uniform” distribution over the stated interval. To sample Unif[a, b), b > a, multiply the output of random_sample by (b - a) and add a:
(b - a) * random_sample() + a
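A minimal sketch of this rescaling is shown here. It uses Python's standard-library random.random as a stand-in for random_sample, since both return floats in [0.0, 1.0); the helper name `uniform_half_open` and the seed are assumptions made for illustration.

```python
import random


def uniform_half_open(a, b, rng):
    """Scale a draw from [0.0, 1.0) onto [a, b), assuming b > a."""
    return (b - a) * rng.random() + a


rng = random.Random(0)  # seeded stand-in for random_sample()
draws = [uniform_half_open(-5.0, 0.0, rng) for _ in range(10_000)]
print(min(draws), max(draws))  # random, but inside [-5, 0)
```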
Note
New code should use the random method of a default_rng() instance instead; see random-quick-start.
- Parameters
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- Returns
- out : float or ndarray of floats
Array of random floats of shape size (unless size=None, in which case a single float is returned).
See also
Generator.random
which should be used for new code.
Examples
>>> np.random.random_sample()
0.47108547995356098 # random
>>> type(np.random.random_sample())
<class 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428]) # random
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
-
dask.array.random.
random_sample
(size=None, chunks='auto', **kwargs)¶ Return random floats in the half-open interval [0.0, 1.0).
This docstring was copied from numpy.random.mtrand.RandomState.random_sample.
Some inconsistencies with the Dask version may exist.
Results are from the “continuous uniform” distribution over the stated interval. To sample Unif[a, b), b > a, multiply the output of random_sample by (b - a) and add a:
(b - a) * random_sample() + a
Note
New code should use the random method of a default_rng() instance instead; see random-quick-start.
- Parameters
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- Returns
- out : float or ndarray of floats
Array of random floats of shape size (unless size=None, in which case a single float is returned).
See also
Generator.random
which should be used for new code.
Examples
>>> np.random.random_sample()
0.47108547995356098 # random
>>> type(np.random.random_sample())
<class 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428]) # random
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
-
dask.array.random.
rayleigh
(scale=1.0, size=None, chunks='auto', **kwargs)¶ Draw samples from a Rayleigh distribution.
This docstring was copied from numpy.random.mtrand.RandomState.rayleigh.
Some inconsistencies with the Dask version may exist.
The χ and Weibull distributions are generalizations of the Rayleigh.
Note
New code should use the rayleigh method of a default_rng() instance instead; see random-quick-start.
- Parameters
- scale : float or array_like of floats, optional
Scale, also equals the mode. Must be non-negative. Default is 1.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized Rayleigh distribution.
See also
Generator.rayleigh
which should be used for new code.
Notes
The probability density function for the Rayleigh distribution is
$$P(x; scale) = \frac{x}{scale^{2}} e^{\frac{-x^{2}}{2 \cdot scale^{2}}}$$
The Rayleigh distribution would arise, for example, if the East and North components of the wind velocity had identical zero-mean Gaussian distributions. Then the wind speed would have a Rayleigh distribution.
References
- 1
Brighton Webs Ltd., “Rayleigh Distribution,” https://web.archive.org/web/20090514091424/http://brighton-webs.co.uk:80/distributions/rayleigh.asp
- 2
Wikipedia, “Rayleigh distribution” https://en.wikipedia.org/wiki/Rayleigh_distribution
Examples
Draw values from the distribution and plot the histogram
>>> from matplotlib.pyplot import hist
>>> values = hist(np.random.rayleigh(3, 100000), bins=200, density=True)
Wave heights tend to follow a Rayleigh distribution. If the mean wave height is 1 meter, what fraction of waves are likely to be larger than 3 meters?
>>> meanvalue = 1
>>> modevalue = np.sqrt(2 / np.pi) * meanvalue
>>> s = np.random.rayleigh(modevalue, 1000000)
The percentage of waves larger than 3 meters is:
>>> 100.*sum(s>3)/1000000.
0.087300000000000003 # random
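The simulated percentage can be cross-checked in closed form. Integrating the density from the Notes gives the survival function P(X > x) = exp(-x² / (2·scale²)), and the Rayleigh mean is scale·√(π/2), which is where the √(2/π) factor in the example comes from. A minimal standard-library sketch (the function name is illustrative):

```python
import math


def rayleigh_survival(x, scale):
    """P(X > x) for a Rayleigh distribution, from integrating its pdf."""
    return math.exp(-x ** 2 / (2 * scale ** 2))


mean_height = 1.0
scale = math.sqrt(2 / math.pi) * mean_height   # mode that gives the requested mean
percent_over_3m = 100 * rayleigh_survival(3.0, scale)
print(f"{percent_over_3m:.4f}% of waves exceed 3 m")
```

The exact value is 100·exp(-9π/4), which is consistent with the simulated figure of roughly 0.09%.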
-
dask.array.random.
standard_cauchy
(size=None, chunks='auto', **kwargs)¶ Draw samples from a standard Cauchy distribution with mode = 0.
This docstring was copied from numpy.random.mtrand.RandomState.standard_cauchy.
Some inconsistencies with the Dask version may exist.
Also known as the Lorentz distribution.
Note
New code should use the standard_cauchy method of a default_rng() instance instead; see random-quick-start.
- Parameters
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- Returns
- samples : ndarray or scalar
The drawn samples.
See also
Generator.standard_cauchy
which should be used for new code.
Notes
The probability density function for the full Cauchy distribution is
$$P(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{x - x_0}{\gamma} \right)^{2} \right]}$$
and the Standard Cauchy distribution just sets $x_0 = 0$ and $\gamma = 1$.
The Cauchy distribution arises in the solution to the driven harmonic oscillator problem, and also describes spectral line broadening. It also describes the distribution of values at which a line tilted at a random angle will cut the x axis.
When studying hypothesis tests that assume normality, seeing how the tests perform on data from a Cauchy distribution is a good indicator of their sensitivity to a heavy-tailed distribution, since the Cauchy looks very much like a Gaussian distribution, but with heavier tails.
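The “line tilted at a random angle” description can be turned into a small simulation. The sketch below assumes the line passes through the point (0, 1) and that its angle from the vertical is uniform on (-π/2, π/2); both are assumptions of this illustration, as is the helper name. The fraction of crossings inside (-1, 1) is compared with the value implied by the standard Cauchy CDF.

```python
import math
import random


def standard_cauchy_cdf(x):
    """CDF obtained by integrating the density with x0 = 0 and gamma = 1."""
    return 0.5 + math.atan(x) / math.pi


rng = random.Random(2024)
# A line through (0, 1) at a uniform random angle from the vertical
# crosses the x axis at tan(angle).
crossings = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(100_000)]

inside = sum(-1.0 < x < 1.0 for x in crossings) / len(crossings)
print(inside, standard_cauchy_cdf(1.0) - standard_cauchy_cdf(-1.0))
```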
References
- 1
NIST/SEMATECH e-Handbook of Statistical Methods, “Cauchy Distribution”, https://www.itl.nist.gov/div898/handbook/eda/section3/eda3663.htm
- 2
Weisstein, Eric W. “Cauchy Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/CauchyDistribution.html
- 3
Wikipedia, “Cauchy distribution” https://en.wikipedia.org/wiki/Cauchy_distribution
Examples
Draw samples and plot the distribution:
>>> import matplotlib.pyplot as plt
>>> s = np.random.standard_cauchy(1000000)
>>> s = s[(s>-25) & (s<25)]  # truncate distribution so it plots well
>>> plt.hist(s, bins=100)
>>> plt.show()
-
dask.array.random.
standard_exponential
(size=None, chunks='auto', **kwargs)¶ Draw samples from the standard exponential distribution.
This docstring was copied from numpy.random.mtrand.RandomState.standard_exponential.
Some inconsistencies with the Dask version may exist.
standard_exponential is identical to the exponential distribution with a scale parameter of 1.
Note
New code should use the standard_exponential method of a default_rng() instance instead; see random-quick-start.
- Parameters
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- Returns
- out : float or ndarray
Drawn samples.
See also
Generator.standard_exponential
which should be used for new code.
Examples
Output a 3x8000 array:
>>> n = np.random.standard_exponential((3, 8000))
-
dask.array.random.
standard_gamma
(shape, size=None, chunks='auto', **kwargs)¶ Draw samples from a standard Gamma distribution.
This docstring was copied from numpy.random.mtrand.RandomState.standard_gamma.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale=1.
Note
New code should use the standard_gamma method of a default_rng() instance instead; see random-quick-start.
- Parameters
- shape : float or array_like of floats
Parameter, must be non-negative.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape is a scalar. Otherwise, np.array(shape).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized standard gamma distribution.
See also
scipy.stats.gamma
probability density function, distribution or cumulative density function, etc.
Generator.standard_gamma
which should be used for new code.
Notes
The probability density for the Gamma distribution is
$$p(x) = x^{k-1} \frac{e^{-x/\theta}}{\theta^{k} \Gamma(k)},$$
where $k$ is the shape and $\theta$ the scale, and $\Gamma$ is the Gamma function.
The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.
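The link to Poisson waiting times can be sketched with the standard library alone. For an integer shape k and unit scale, the time until the k-th event of a unit-rate Poisson process is a sum of k exponential waits, which should behave like a Gamma(k, 1) draw. The seed and sample count below are arbitrary choices for this illustration.

```python
import random
import statistics

rng = random.Random(7)
k = 2            # integer shape, so the sum-of-waits picture applies directly
draws = 100_000

# Waiting time until the k-th event of a unit-rate Poisson process.
waits = [sum(rng.expovariate(1.0) for _ in range(k)) for _ in range(draws)]
# Direct draws from Gamma(shape=k, scale=1) for comparison.
direct = [rng.gammavariate(k, 1.0) for _ in range(draws)]

print(statistics.mean(waits), statistics.mean(direct))  # random, both near k
```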
References
- 1
Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
- 2
Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution
Examples
Draw samples from the distribution:
>>> shape, scale = 2., 1.  # shape and scale
>>> s = np.random.standard_gamma(shape, 1000000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
>>> count, bins, ignored = plt.hist(s, 50, density=True)
>>> y = bins**(shape-1) * ((np.exp(-bins/scale))/
...                       (sps.gamma(shape) * scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')
>>> plt.show()
-
dask.array.random.
standard_normal
(size=None, chunks='auto', **kwargs)¶ Draw samples from a standard Normal distribution (mean=0, stdev=1).
This docstring was copied from numpy.random.mtrand.RandomState.standard_normal.
Some inconsistencies with the Dask version may exist.
Note
New code should use the standard_normal method of a default_rng() instance instead; see random-quick-start.
- Parameters
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
- Returns
- out : float or ndarray
A floating-point array of shape size of drawn samples, or a single sample if size was not specified.
See also
normal
Equivalent function with additional loc and scale arguments for setting the mean and standard deviation.
Generator.standard_normal
which should be used for new code.
Notes
For random samples from $N(\mu, \sigma^2)$, use one of:
mu + sigma * np.random.standard_normal(size=...)
np.random.normal(mu, sigma, size=...)
Examples
>>> np.random.standard_normal()
2.1923875335537315 #random
>>> s = np.random.standard_normal(8000)
>>> s
array([ 0.6888893 ,  0.78096262, -0.89086505, ...,  0.49876311,  # random
       -0.38672696, -0.4685006 ])                                # random
>>> s.shape
(8000,)
>>> s = np.random.standard_normal(size=(3, 4, 2))
>>> s.shape
(3, 4, 2)
Two-by-four array of samples from N(3, 6.25):
>>> 3 + 2.5 * np.random.standard_normal(size=(2, 4))
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
-
dask.array.random.
standard_t
(df, size=None, chunks='auto', **kwargs)¶ Draw samples from a standard Student’s t distribution with df degrees of freedom.
This docstring was copied from numpy.random.mtrand.RandomState.standard_t.
Some inconsistencies with the Dask version may exist.
A special case of the hyperbolic distribution. As df gets large, the result resembles that of the standard normal distribution (standard_normal).
Note
New code should use the standard_t method of a default_rng() instance instead; see random-quick-start.
- Parameters
- df : float or array_like of floats
Degrees of freedom, must be > 0.
- size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.
- Returns
- out : ndarray or scalar
Drawn samples from the parameterized standard Student’s t distribution.
See also
Generator.standard_t
which should be used for new code.
Notes
The probability density function for the t distribution is
$$P(x, df) = \frac{\Gamma\left(\frac{df+1}{2}\right)}{\sqrt{\pi\, df}\; \Gamma\left(\frac{df}{2}\right)} \left( 1 + \frac{x^{2}}{df} \right)^{-(df+1)/2}$$
The t test is based on an assumption that the data come from a Normal distribution. The t test provides a way to test whether the sample mean (that is the mean calculated from the data) is a good estimate of the true mean.
The derivation of the t-distribution was first published in 1908 by William Gosset while working for the Guinness Brewery in Dublin. Due to proprietary issues, he had to publish under a pseudonym, and so he used the name Student.
References
- 1
Dalgaard, Peter, “Introductory Statistics With R”, Springer, 2002.
- 2
Wikipedia, “Student’s t-distribution” https://en.wikipedia.org/wiki/Student’s_t-distribution
Examples
From Dalgaard page 83 [1], suppose the daily energy intake for 11 women in kilojoules (kJ) is:
>>> intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515, \ ... 7515, 8230, 8770])
Does their energy intake deviate systematically from the recommended value of 7725 kJ?
We have 10 degrees of freedom, so is the sample mean within 95% of the recommended value?
>>> s = np.random.standard_t(10, size=100000) >>> np.mean(intake) 6753.636363636364 >>> intake.std(ddof=1) 1142.1232221373727
Calculate the t statistic, setting the ddof parameter to the unbiased value so the divisor in the standard deviation will be degrees of freedom, N-1.
>>> t = (np.mean(intake)-7725)/(intake.std(ddof=1)/np.sqrt(len(intake))) >>> import matplotlib.pyplot as plt >>> h = plt.hist(s, bins=100, density=True)
For a one-sided t-test, how far out in the distribution does the t statistic appear?
>>> np.sum(s<t) / float(len(s)) 0.0090699999999999999 #random
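So the p-value is about 0.009: if the null hypothesis were true, a t statistic at least this far below zero would be observed only about 0.9% of the time. This is strong evidence against the null hypothesis; it is not the probability that the null hypothesis is true.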
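The calculation in this example can be sketched end to end with the Generator API that the note above recommends. This is a minimal sketch, not the Dask implementation; the seed and the variable names `recommended`, `t_stat`, and `p_one_sided` are illustrative assumptions.

```python
import numpy as np

# Daily energy intake (kJ) for 11 women, as in the example above.
intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515,
                   7515, 8230, 8770])
recommended = 7725.0  # recommended value in kJ

# One-sample t statistic using the unbiased (ddof=1) standard deviation.
n = len(intake)
t_stat = (intake.mean() - recommended) / (intake.std(ddof=1) / np.sqrt(n))

# Monte Carlo estimate of the one-sided p-value with n - 1 degrees of freedom.
rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
samples = rng.standard_t(n - 1, size=100_000)
p_one_sided = np.mean(samples < t_stat)

print(t_stat, p_one_sided)
```

The estimated tail probability fluctuates slightly with the seed and the number of samples drawn.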
dask.array.random.triangular(left, mode, right, size=None, chunks='auto', **kwargs)
Draw samples from the triangular distribution over the interval [left, right].
This docstring was copied from numpy.random.mtrand.RandomState.triangular.
Some inconsistencies with the Dask version may exist.
The triangular distribution is a continuous probability distribution with lower limit left, peak at mode, and upper limit right. Unlike the other distributions, these parameters directly define the shape of the pdf.
Note
New code should use the triangular method of a default_rng() instance instead; see random-quick-start.
- Parameters
- leftfloat or array_like of floats
Lower limit.
- modefloat or array_like of floats
The value where the peak of the distribution occurs. The value must fulfill the condition left <= mode <= right.
- right : float or array_like of floats
Upper limit, must be larger than left.
- sizeint or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if left, mode, and right are all scalars. Otherwise, np.broadcast(left, mode, right).size samples are drawn.
- Returns
- outndarray or scalar
Drawn samples from the parameterized triangular distribution.
See also
Generator.triangular
which should be used for new code.
Notes
The probability density function for the triangular distribution is
$$P(x; l, m, r) = \begin{cases} \frac{2(x-l)}{(r-l)(m-l)} & \text{for } l \leq x \leq m, \\ \frac{2(r-x)}{(r-l)(r-m)} & \text{for } m \leq x \leq r, \\ 0 & \text{otherwise.} \end{cases}$$
The triangular distribution is often used in ill-defined problems where the underlying distribution is not known, but some knowledge of the limits and mode exists. Often it is used in simulations.
References
- 1
Wikipedia, “Triangular distribution” https://en.wikipedia.org/wiki/Triangular_distribution
Examples
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt >>> h = plt.hist(np.random.triangular(-3, 0, 8, 100000), bins=200, ... density=True) >>> plt.show()
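The piecewise density in the Notes can be checked numerically without plotting. The sketch below assumes left < mode < right; `triangular_pdf` is a hypothetical helper, not a NumPy or Dask function. It also compares the sample mean with (left + mode + right) / 3, a standard result that is not stated in this docstring.

```python
import numpy as np

def triangular_pdf(x, left, mode, right):
    """Piecewise triangular density (assumes left < mode < right)."""
    x = np.asarray(x, dtype=float)
    rising = 2 * (x - left) / ((right - left) * (mode - left))
    falling = 2 * (right - x) / ((right - left) * (right - mode))
    inside = (x >= left) & (x <= right)
    return np.where(inside, np.where(x <= mode, rising, falling), 0.0)

left, mode, right = -3.0, 0.0, 8.0

# Trapezoidal integration of the density over a grid covering the support.
x = np.linspace(left - 1, right + 1, 20001)
y = triangular_pdf(x, left, mode, right)
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# Draw samples with the Generator API and compare with the expected mean.
rng = np.random.default_rng(0)  # arbitrary seed
s = rng.triangular(left, mode, right, size=100_000)
expected_mean = (left + mode + right) / 3

print(area, s.mean(), expected_mean)
```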
dask.array.random.uniform(low=0.0, high=1.0, size=None, chunks='auto', **kwargs)
Draw samples from a uniform distribution.
This docstring was copied from numpy.random.mtrand.RandomState.uniform.
Some inconsistencies with the Dask version may exist.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.
Note
New code should use the uniform method of a default_rng() instance instead; see random-quick-start.
- Parameters
- lowfloat or array_like of floats, optional
Lower boundary of the output interval. All values generated will be greater than or equal to low. The default value is 0.
- highfloat or array_like of floats
Upper boundary of the output interval. All values generated will be less than high. The default value is 1.0.
- sizeint or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if low and high are both scalars. Otherwise, np.broadcast(low, high).size samples are drawn.
- Returns
- outndarray or scalar
Drawn samples from the parameterized uniform distribution.
See also
randint
Discrete uniform distribution, yielding integers.
random_integers
Discrete uniform distribution over the closed interval [low, high].
random_sample
Floats uniformly distributed over [0, 1).
random
Alias for random_sample.
rand
Convenience function that accepts dimensions as input, e.g., rand(2,2) would generate a 2-by-2 array of floats, uniformly distributed over [0, 1).
Generator.uniform
which should be used for new code.
Notes
The probability density function of the uniform distribution is
$$p(x) = \frac{1}{b - a}$$
anywhere within the interval [a, b), and zero elsewhere.
When high == low, values of low will be returned. If high < low, the results are officially undefined and may eventually raise an error, i.e. do not rely on this function to behave when passed arguments satisfying that inequality condition.
Examples
Draw samples from the distribution:
>>> s = np.random.uniform(-1,0,1000)
All values are within the given interval:
>>> np.all(s >= -1) True >>> np.all(s < 0) True
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt >>> count, bins, ignored = plt.hist(s, 15, density=True) >>> plt.plot(bins, np.ones_like(bins), linewidth=2, color='r') >>> plt.show()
dask.array.random.vonmises(mu, kappa, size=None, chunks='auto', **kwargs)
Draw samples from a von Mises distribution.
This docstring was copied from numpy.random.mtrand.RandomState.vonmises.
Some inconsistencies with the Dask version may exist.
Samples are drawn from a von Mises distribution with specified mode (mu) and dispersion (kappa), on the interval [-pi, pi].
The von Mises distribution (also known as the circular normal distribution) is a continuous probability distribution on the unit circle. It may be thought of as the circular analogue of the normal distribution.
Note
New code should use the vonmises method of a default_rng() instance instead; see random-quick-start.
- Parameters
- mufloat or array_like of floats
Mode (“center”) of the distribution.
- kappafloat or array_like of floats
Dispersion of the distribution, has to be >=0.
- sizeint or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mu and kappa are both scalars. Otherwise, np.broadcast(mu, kappa).size samples are drawn.
- Returns
- outndarray or scalar
Drawn samples from the parameterized von Mises distribution.
See also
scipy.stats.vonmises
probability density function, distribution, or cumulative density function, etc.
Generator.vonmises
which should be used for new code.
Notes
The probability density for the von Mises distribution is
$$p(x) = \frac{e^{\kappa \cos(x - \mu)}}{2\pi I_0(\kappa)},$$
where $\mu$ is the mode and $\kappa$ the dispersion, and $I_0(\kappa)$ is the modified Bessel function of order 0.
The von Mises distribution is named for Richard Edler von Mises, who was born in Austria-Hungary, in what is now Ukraine. He fled to the United States in 1939 and became a professor at Harvard. He worked in probability theory, aerodynamics, fluid mechanics, and philosophy of science.
References
- 1
Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
- 2
von Mises, R., “Mathematical Theory of Probability and Statistics”, New York: Academic Press, 1964.
Examples
Draw samples from the distribution:
>>> mu, kappa = 0.0, 4.0 # mean and dispersion >>> s = np.random.vonmises(mu, kappa, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt >>> from scipy.special import i0 >>> plt.hist(s, 50, density=True) >>> x = np.linspace(-np.pi, np.pi, num=51) >>> y = np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa)) >>> plt.plot(x, y, linewidth=2, color='r') >>> plt.show()
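The density above can also be evaluated with NumPy alone, since `np.i0` provides the modified Bessel function of order 0. The following sketch checks that the density integrates to roughly one over [-pi, pi] and that the samples stay in that interval. `vonmises_pdf` is a hypothetical helper name and the seed is arbitrary.

```python
import numpy as np

def vonmises_pdf(x, mu, kappa):
    """von Mises density on [-pi, pi], using np.i0 for the Bessel term."""
    return np.exp(kappa * np.cos(x - mu)) / (2 * np.pi * np.i0(kappa))

mu, kappa = 0.0, 4.0

# Trapezoidal integration over one full period.
x = np.linspace(-np.pi, np.pi, 2001)
y = vonmises_pdf(x, mu, kappa)
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# Samples drawn with the Generator API.
rng = np.random.default_rng(0)  # arbitrary seed
s = rng.vonmises(mu, kappa, size=10_000)

print(area, s.min(), s.max())
```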
dask.array.random.wald(mean, scale, size=None, chunks='auto', **kwargs)
Draw samples from a Wald, or inverse Gaussian, distribution.
This docstring was copied from numpy.random.mtrand.RandomState.wald.
Some inconsistencies with the Dask version may exist.
As the scale approaches infinity, the distribution becomes more like a Gaussian. Some references claim that the Wald is an inverse Gaussian with mean equal to 1, but this is by no means universal.
The inverse Gaussian distribution was first studied in relationship to Brownian motion. In 1956 M.C.K. Tweedie used the name inverse Gaussian because there is an inverse relationship between the time to cover a unit distance and distance covered in unit time.
Note
New code should use the wald method of a default_rng() instance instead; see random-quick-start.
- Parameters
- meanfloat or array_like of floats
Distribution mean, must be > 0.
- scalefloat or array_like of floats
Scale parameter, must be > 0.
- sizeint or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and scale are both scalars. Otherwise, np.broadcast(mean, scale).size samples are drawn.
- Returns
- outndarray or scalar
Drawn samples from the parameterized Wald distribution.
See also
Generator.wald
which should be used for new code.
Notes
The probability density function for the Wald distribution is
$$P(x; mean, scale) = \sqrt{\frac{scale}{2\pi x^3}}\; e^{\frac{-scale\,(x - mean)^2}{2 \cdot mean^2\, x}}$$
As noted above, the inverse Gaussian distribution first arose from attempts to model Brownian motion. It is also a competitor to the Weibull for use in reliability modeling and modeling stock returns and interest rate processes.
References
- 1
Brighton Webs Ltd., Wald Distribution, https://web.archive.org/web/20090423014010/http://www.brighton-webs.co.uk:80/distributions/wald.asp
- 2
Chhikara, Raj S., and Folks, J. Leroy, “The Inverse Gaussian Distribution: Theory, Methodology, and Applications”, CRC Press, 1988.
- 3
Wikipedia, “Inverse Gaussian distribution” https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution
Examples
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt >>> h = plt.hist(np.random.wald(3, 2, 100000), bins=200, density=True) >>> plt.show()
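As a rough numerical check of the density given in the Notes, the sketch below integrates it on a grid and compares the sample mean with the `mean` parameter. `wald_pdf` is a hypothetical helper; the grid limits and the seed are assumptions chosen for this particular parameter pair.

```python
import numpy as np

def wald_pdf(x, mean, scale):
    """Wald (inverse Gaussian) density for x > 0."""
    return np.sqrt(scale / (2 * np.pi * x**3)) * np.exp(
        -scale * (x - mean) ** 2 / (2 * mean**2 * x)
    )

mean, scale = 3.0, 2.0

# Trapezoidal integration; the tail beyond x = 200 is negligible here.
x = np.linspace(1e-6, 200.0, 400_001)
y = wald_pdf(x, mean, scale)
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# Samples from the Generator API; their average should be close to `mean`.
rng = np.random.default_rng(0)  # arbitrary seed
s = rng.wald(mean, scale, size=100_000)

print(area, s.mean())
```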
dask.array.random.weibull(a, size=None, chunks='auto', **kwargs)
Draw samples from a Weibull distribution.
This docstring was copied from numpy.random.mtrand.RandomState.weibull.
Some inconsistencies with the Dask version may exist.
Draw samples from a 1-parameter Weibull distribution with the given shape parameter a.
$$X = (-\ln(U))^{1/a}$$
Here, U is drawn from the uniform distribution over (0,1].
The more common 2-parameter Weibull, including a scale parameter $\lambda$, is just $X = \lambda(-\ln(U))^{1/a}$.
Note
New code should use the weibull method of a default_rng() instance instead; see random-quick-start.
- Parameters
- afloat or array_like of floats
Shape parameter of the distribution. Must be nonnegative.
- sizeint or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
- Returns
- outndarray or scalar
Drawn samples from the parameterized Weibull distribution.
See also
scipy.stats.weibull_max
scipy.stats.weibull_min
scipy.stats.genextreme
gumbel
Generator.weibull
which should be used for new code.
Notes
The Weibull (or Type III asymptotic extreme value distribution for smallest values, SEV Type III, or Rosin-Rammler distribution) is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. This class includes the Gumbel and Frechet distributions.
The probability density for the Weibull distribution is
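$$p(x) = \frac{a}{\lambda}\left(\frac{x}{\lambda}\right)^{a-1} e^{-(x/\lambda)^a},$$
where a is the shape and $\lambda$ the scale.
The function has its peak (the mode) at $\lambda\left(\frac{a-1}{a}\right)^{1/a}$.
When a = 1, the Weibull distribution reduces to the exponential distribution.
References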
- 1
Waloddi Weibull, Royal Technical University, Stockholm, 1939 “A Statistical Theory Of The Strength Of Materials”, Ingeniorsvetenskapsakademiens Handlingar Nr 151, 1939, Generalstabens Litografiska Anstalts Forlag, Stockholm.
- 2
Waloddi Weibull, “A Statistical Distribution Function of Wide Applicability”, Journal Of Applied Mechanics ASME Paper 1951.
- 3
Wikipedia, “Weibull distribution”, https://en.wikipedia.org/wiki/Weibull_distribution
Examples
Draw samples from the distribution:
>>> a = 5. # shape >>> s = np.random.weibull(a, 1000)
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt >>> x = np.arange(1,100.)/50. >>> def weib(x,n,a): ... return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)
>>> count, bins, ignored = plt.hist(np.random.weibull(5.,1000)) >>> x = np.arange(1,100.)/50. >>> scale = count.max()/weib(x, 1., 5.).max() >>> plt.plot(x, weib(x, 1., 5.)*scale) >>> plt.show()
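The inverse-transform relation X = λ(−ln U)^(1/a) described at the top of this entry can be sketched directly. The example below is illustrative only: it builds U on (0, 1] from `rng.random()`, and compares the sample mean against λ·Γ(1 + 1/a), a standard result that is not stated in this docstring. The names `lam`, `x_inverse`, and `x_builtin` are assumptions.

```python
import math
import numpy as np

a, lam = 5.0, 2.0  # shape and scale (lam is the scale parameter lambda)
rng = np.random.default_rng(0)  # arbitrary seed

# rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1].
u = 1.0 - rng.random(100_000)
x_inverse = lam * (-np.log(u)) ** (1.0 / a)

# The same distribution by scaling the 1-parameter sampler.
x_builtin = lam * rng.weibull(a, size=100_000)

# Mean of the 2-parameter Weibull distribution.
theoretical_mean = lam * math.gamma(1.0 + 1.0 / a)

print(x_inverse.mean(), x_builtin.mean(), theoretical_mean)
```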
dask.array.random.zipf(a, size=None, chunks='auto', **kwargs)
Standard distributions
dask.array.stats.ttest_ind(a, b, axis=0, equal_var=True)
Calculate the T-test for the means of two independent samples of scores.
This docstring was copied from scipy.stats.ttest_ind.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.
- Parameters
- a, barray_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
- axisint or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
- equal_varbool, optional
If True (default), perform a standard independent 2 sample test that assumes equal population variances [1]. If False, perform Welch’s t-test, which does not assume equal population variance [2].
New in version 0.11.0.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional (Not supported in Dask)
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statisticfloat or array
The calculated t-statistic.
- pvaluefloat or array
The two-tailed p-value.
Notes
We can use this test, if we observe two independent samples from the same or different population, e.g. exam scores of boys and girls or of two ethnic groups. The test measures whether the average (expected) value differs significantly across samples. If we observe a large p-value, for example larger than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages.
References
Examples
>>> from scipy import stats >>> np.random.seed(12345678)
Test with sample with identical means:
>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) >>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500) >>> stats.ttest_ind(rvs1,rvs2) (0.26833823296239279, 0.78849443369564776) >>> stats.ttest_ind(rvs1,rvs2, equal_var = False) (0.26833823296239279, 0.78849452749500748)
ttest_ind underestimates p for unequal variances:
>>> rvs3 = stats.norm.rvs(loc=5, scale=20, size=500) >>> stats.ttest_ind(rvs1, rvs3) (-0.46580283298287162, 0.64145827413436174) >>> stats.ttest_ind(rvs1, rvs3, equal_var = False) (-0.46580283298287162, 0.64149646246569292)
When n1 != n2, the equal variance t-statistic is no longer equal to the unequal variance t-statistic:
>>> rvs4 = stats.norm.rvs(loc=5, scale=20, size=100) >>> stats.ttest_ind(rvs1, rvs4) (-0.99882539442782481, 0.3182832709103896) >>> stats.ttest_ind(rvs1, rvs4, equal_var = False) (-0.69712570584654099, 0.48716927725402048)
T-test with different means, variance, and n:
>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100) >>> stats.ttest_ind(rvs1, rvs5) (-1.4679669854490653, 0.14263895620529152) >>> stats.ttest_ind(rvs1, rvs5, equal_var = False) (-0.94365973617132992, 0.34744170334794122)
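The observation that the equal-variance and Welch statistics coincide for equal sample sizes but diverge otherwise follows from the two standard-error formulas. Below is a minimal NumPy sketch of the textbook statistics only (no p-values); `t_pooled` and `t_welch` are hypothetical helpers, not the SciPy or Dask implementation, and the data are synthetic.

```python
import numpy as np

def t_pooled(a, b):
    """Two-sample t statistic assuming equal population variances."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

def t_welch(a, b):
    """Welch's t statistic, which does not assume equal variances."""
    se2 = a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(se2)

rng = np.random.default_rng(0)  # arbitrary seed
x = rng.normal(5, 10, size=500)
y_same_n = rng.normal(5, 20, size=500)  # same size, larger variance
y_small = rng.normal(5, 20, size=100)   # smaller size, larger variance

print(t_pooled(x, y_same_n), t_welch(x, y_same_n))  # identical up to rounding
print(t_pooled(x, y_small), t_welch(x, y_small))    # noticeably different
```

With equal sizes both standard errors reduce to sqrt((s1² + s2²) / n), which is why the statistics agree and only the degrees of freedom (and hence the p-values) differ.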
dask.array.stats.ttest_1samp(a, popmean, axis=0, nan_policy='propagate')
Calculate the T-test for the mean of ONE group of scores.
This docstring was copied from scipy.stats.ttest_1samp.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations a is equal to the given population mean, popmean.
- Parameters
- aarray_like
Sample observation.
- popmeanfloat or array_like
Expected value in null hypothesis. If array_like, then it must have the same shape as a excluding the axis dimension.
- axisint or None, optional
Axis along which to compute test. If None, compute over the whole array a.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statisticfloat or array
t-statistic.
- pvaluefloat or array
Two-sided p-value.
Examples
>>> from scipy import stats
>>> np.random.seed(7654567) # fix seed to get the same result >>> rvs = stats.norm.rvs(loc=5, scale=10, size=(50,2))
Test if mean of random sample is equal to true mean, and different mean. We reject the null hypothesis in the second case and don’t reject it in the first case.
>>> stats.ttest_1samp(rvs,5.0) (array([-0.68014479, -0.04323899]), array([ 0.49961383, 0.96568674])) >>> stats.ttest_1samp(rvs,0.0) (array([ 2.77025808, 4.11038784]), array([ 0.00789095, 0.00014999]))
Examples using axis and non-scalar dimension for population mean.
>>> stats.ttest_1samp(rvs,[5.0,0.0]) (array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04])) >>> stats.ttest_1samp(rvs.T,[5.0,0.0],axis=1) (array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04])) >>> stats.ttest_1samp(rvs,[[5.0],[0.0]]) (array([[-0.68014479, -0.04323899], [ 2.77025808, 4.11038784]]), array([[ 4.99613833e-01, 9.65686743e-01], [ 7.89094663e-03, 1.49986458e-04]]))
dask.array.stats.ttest_rel(a, b, axis=0, nan_policy='propagate')
Calculate the t-test on TWO RELATED samples of scores, a and b.
This docstring was copied from scipy.stats.ttest_rel.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that 2 related or repeated samples have identical average (expected) values.
- Parameters
- a, barray_like
The arrays must have the same shape.
- axisint or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statisticfloat or array
t-statistic.
- pvaluefloat or array
Two-sided p-value.
Notes
Examples for use are scores of the same set of students in different exams, or repeated sampling from the same units. The test measures whether the average score differs significantly across samples (e.g. exams). If we observe a large p-value, for example greater than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages. Small p-values are associated with large t-statistics.
References
https://en.wikipedia.org/wiki/T-test#Dependent_t-test_for_paired_samples
Examples
>>> from scipy import stats >>> np.random.seed(12345678) # fix random seed to get same numbers
>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) >>> rvs2 = (stats.norm.rvs(loc=5,scale=10,size=500) + ... stats.norm.rvs(scale=0.2,size=500)) >>> stats.ttest_rel(rvs1,rvs2) (0.24101764965300962, 0.80964043445811562) >>> rvs3 = (stats.norm.rvs(loc=8,scale=10,size=500) + ... stats.norm.rvs(scale=0.2,size=500)) >>> stats.ttest_rel(rvs1,rvs3) (-3.9995108708727933, 7.3082402191726459e-005)
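A paired t statistic is the one-sample t statistic of the element-wise differences tested against zero. The sketch below illustrates that identity with NumPy; `t_paired` and `t_one_sample` are hypothetical helpers, and the "before/after" data are synthetic, invented only for this illustration.

```python
import numpy as np

def t_one_sample(a, popmean):
    """One-sample t statistic with the unbiased standard deviation."""
    return (a.mean() - popmean) / (a.std(ddof=1) / np.sqrt(len(a)))

def t_paired(a, b):
    """Paired t statistic computed from the differences a - b."""
    d = a - b
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

rng = np.random.default_rng(0)  # arbitrary seed
before = rng.normal(5, 10, size=500)
after = before + rng.normal(0.5, 0.2, size=500)  # small systematic shift

stat_paired = t_paired(before, after)
stat_diff = t_one_sample(before - after, 0.0)
print(stat_paired, stat_diff)
```

Because the large between-subject variation cancels in the differences, the paired statistic is far from zero even though the shift is small relative to the spread of either sample.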
dask.array.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)
Calculate a one-way chi-square test.
This docstring was copied from scipy.stats.chisquare.
Some inconsistencies with the Dask version may exist.
The chi-square test tests the null hypothesis that the categorical data has the given frequencies.
- Parameters
- f_obsarray_like
Observed frequencies in each category.
- f_exparray_like, optional
Expected frequencies in each category. By default the categories are assumed to be equally likely.
- ddofint, optional
“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.
- axis : int or None, optional
The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.
- Returns
- chisqfloat or ndarray
The chi-squared test statistic. The value is a float if axis is None or f_obs and f_exp are 1-D.
- pfloat or ndarray
The p-value of the test. The value is a float if ddof and the return value chisq are scalars.
See also
scipy.stats.power_divergence
Notes
This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.
The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not chi-square, in which case this test is not appropriate.
References
- 1
Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171022032306/http://vassarstats.net:80/textbook/ch8pt1.html
- 2
“Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test
Examples
When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies.
>>> from scipy.stats import chisquare >>> chisquare([16, 18, 16, 14, 12, 12]) (2.0, 0.84914503608460956)
With f_exp the expected frequencies can be given.
>>> chisquare([16, 18, 16, 14, 12, 12], f_exp=[16, 16, 16, 16, 16, 8]) (3.5, 0.62338762774958223)
When f_obs is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T >>> obs.shape (6, 2) >>> chisquare(obs) (array([ 2. , 6.66666667]), array([ 0.84914504, 0.24663415]))
By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.
>>> chisquare(obs, axis=None) (23.31034482758621, 0.015975692534127565) >>> chisquare(obs.ravel()) (23.31034482758621, 0.015975692534127565)
ddof is the change to make to the default degrees of freedom.
>>> chisquare([16, 18, 16, 14, 12, 12], ddof=1) (2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the chi-squared statistic with ddof.
>>> chisquare([16, 18, 16, 14, 12, 12], ddof=[0,1,2]) (2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we use axis=1:
>>> chisquare([16, 18, 16, 14, 12, 12], ... f_exp=[[16, 16, 16, 16, 16, 8], [8, 20, 20, 16, 12, 12]], ... axis=1) (array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
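The statistics in these examples are Pearson's sum of (observed − expected)² / expected, and they can be reproduced by hand. The sketch below computes only the statistic, not the p-value; `chisq_stat` is a hypothetical helper and not the SciPy or Dask function.

```python
import numpy as np

def chisq_stat(f_obs, f_exp=None, axis=0):
    """Pearson chi-squared statistic along `axis` (statistic only)."""
    f_obs = np.asarray(f_obs, dtype=float)
    if f_exp is None:
        # Uniform expectation: the mean of the observed frequencies.
        f_exp = f_obs.mean(axis=axis, keepdims=True)
    else:
        f_exp = np.asarray(f_exp, dtype=float)
    # Broadcasting aligns f_obs with f_exp before the reduction.
    return ((f_obs - f_exp) ** 2 / f_exp).sum(axis=axis)

obs = [16, 18, 16, 14, 12, 12]

stat_uniform = chisq_stat(obs)
stat_given = chisq_stat(obs, f_exp=[16, 16, 16, 16, 16, 8])
stat_broadcast = chisq_stat(
    obs,
    f_exp=[[16, 16, 16, 16, 16, 8], [8, 20, 20, 16, 12, 12]],
    axis=1,
)

print(stat_uniform, stat_given, stat_broadcast)
```

The three results match the statistics shown in the examples above (2.0, 3.5, and the pair 3.5 and 9.25).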
dask.array.stats.power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None)
Cressie-Read power divergence statistic and goodness of fit test.
This docstring was copied from scipy.stats.power_divergence.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the categorical data has the given frequencies, using the Cressie-Read power divergence statistic.
- Parameters
- f_obsarray_like
Observed frequencies in each category.
- f_exparray_like, optional
Expected frequencies in each category. By default the categories are assumed to be equally likely.
- ddofint, optional
“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.
- axis : int or None, optional
The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.
- lambda_float or str, optional
The power in the Cressie-Read power divergence statistic. The default is 1. For convenience, lambda_ may be assigned one of the following strings, in which case the corresponding numerical value is used:
String                Value   Description
"pearson"             1       Pearson's chi-squared statistic. In this case, the function is equivalent to stats.chisquare.
"log-likelihood"      0       Log-likelihood ratio. Also known as the G-test [3].
"freeman-tukey"       -1/2    Freeman-Tukey statistic.
"mod-log-likelihood"  -1      Modified log-likelihood ratio.
"neyman"              -2      Neyman's statistic.
"cressie-read"        2/3     The power recommended in [5].
- Returns
- statisticfloat or ndarray
The Cressie-Read power divergence test statistic. The value is a float if axis is None or if f_obs and f_exp are 1-D.
- pvaluefloat or ndarray
The p-value of the test. The value is a float if ddof and the return value stat are scalars.
See also
Notes
This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.
When lambda_ is less than zero, the formula for the statistic involves dividing by f_obs, so a warning or error may be generated if any value in f_obs is 0.
Similarly, a warning or error may be generated if any value in f_exp is zero when lambda_ >= 0.
The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not a chisquare, in which case this test is not appropriate.
This function handles masked arrays. If an element of f_obs or f_exp is masked, then data at that position is ignored, and does not count towards the size of the data set.
New in version 0.13.0.
References
- 1
Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
- 2
“Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test
- 3
“G-test”, https://en.wikipedia.org/wiki/G-test
- 4
Sokal, R. R. and Rohlf, F. J. “Biometry: the principles and practice of statistics in biological research”, New York: Freeman (1981)
- 5
Cressie, N. and Read, T. R. C., “Multinomial Goodness-of-Fit Tests”, J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984), pp. 440-464.
Examples
(See chisquare for more examples.)
When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies. Here we perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence >>> power_divergence([16, 18, 16, 14, 12, 12], lambda_='log-likelihood') (2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the f_exp argument:
>>> power_divergence([16, 18, 16, 14, 12, 12], ... f_exp=[16, 16, 16, 16, 16, 8], ... lambda_='log-likelihood') (3.3281031458963746, 0.6495419288047497)
When f_obs is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T >>> obs.shape (6, 2) >>> power_divergence(obs, lambda_="log-likelihood") (array([ 2.00657316, 6.77634498]), array([ 0.84823477, 0.23781225]))
By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None) (23.31034482758621, 0.015975692534127565) >>> power_divergence(obs.ravel()) (23.31034482758621, 0.015975692534127565)
ddof is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1) (2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the test statistic with ddof.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2]) (2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we must use axis=1:
>>> power_divergence([16, 18, 16, 14, 12, 12], ... f_exp=[[16, 16, 16, 16, 16, 8], ... [8, 20, 20, 16, 12, 12]], ... axis=1) (array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
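This docstring does not write out the statistic itself. The sketch below uses the commonly cited Cressie-Read form, 2 / (λ(λ + 1)) · Σ obs · ((obs / exp)^λ − 1), with the logarithmic limits at λ = 0 and λ = −1; treat that formula as an assumption here. `cressie_read` is a hypothetical 1-D helper that returns the statistic only.

```python
import numpy as np

def cressie_read(f_obs, f_exp=None, lambda_=1.0):
    """Cressie-Read power divergence statistic for 1-D frequencies."""
    f_obs = np.asarray(f_obs, dtype=float)
    if f_exp is None:
        # Uniform expectation: the mean of the observed frequencies.
        f_exp = np.full_like(f_obs, f_obs.mean())
    else:
        f_exp = np.asarray(f_exp, dtype=float)
    if lambda_ == 0:   # log-likelihood ratio (G-test) limit
        return 2.0 * np.sum(f_obs * np.log(f_obs / f_exp))
    if lambda_ == -1:  # modified log-likelihood ratio limit
        return 2.0 * np.sum(f_exp * np.log(f_exp / f_obs))
    terms = f_obs * ((f_obs / f_exp) ** lambda_ - 1.0)
    return 2.0 / (lambda_ * (lambda_ + 1.0)) * np.sum(terms)

obs = [16, 18, 16, 14, 12, 12]

g_uniform = cressie_read(obs, lambda_=0)
g_given = cressie_read(obs, f_exp=[16, 16, 16, 16, 16, 8], lambda_=0)
pearson = cressie_read(obs, lambda_=1)

print(g_uniform, g_given, pearson)
```

With `lambda_=0` the results agree with the G-test statistics shown in the examples above, and `lambda_=1` reproduces Pearson's chi-squared value of 2.0.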
dask.array.stats.skew(a, axis=0, bias=True, nan_policy='propagate')
Compute the sample skewness of a data set.
This docstring was copied from scipy.stats.skew.
Some inconsistencies with the Dask version may exist.
For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.
- Parameters
- andarray
Input array.
- axisint or None, optional
Axis along which skewness is calculated. Default is 0. If None, compute over the whole array a.
- biasbool, optional
If False, then the calculations are corrected for statistical bias.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- skewnessndarray
The skewness of values along an axis, returning 0 where all values are equal.
Notes
The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.
$$g_1 = \frac{m_3}{m_2^{3/2}}$$
where
$$m_i = \frac{1}{N}\sum_{n=1}^{N} (x[n] - \bar{x})^i$$
is the biased sample $i$th central moment, and $\bar{x}$ is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.
$$G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\,\frac{m_3}{m_2^{3/2}}.$$
References
- 1
Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 2.2.24.1
Examples
>>> from scipy.stats import skew >>> skew([1, 2, 3, 4, 5]) 0.0 >>> skew([2, 8, 0, 4, 1, 9, 9, 0]) 0.2650554122698573
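The two coefficients defined in the Notes can be computed directly from the central moments. This is a minimal 1-D sketch; `sample_skew` is a hypothetical helper, and unlike the documented function it does not special-case input whose values are all equal.

```python
import numpy as np

def sample_skew(x, bias=True):
    """Fisher-Pearson skewness g1, or the adjusted G1 when bias=False."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)  # biased second central moment
    m3 = np.mean(d**3)  # biased third central moment
    g1 = m3 / m2**1.5
    if bias:
        return g1
    # Bias-corrected coefficient G1.
    return np.sqrt(n * (n - 1)) / (n - 2) * g1

data = [2, 8, 0, 4, 1, 9, 9, 0]
g1 = sample_skew(data)
G1 = sample_skew(data, bias=False)
symmetric = sample_skew([1, 2, 3, 4, 5])

print(g1, G1, symmetric)
```

The biased value agrees with the second result shown in the example above, and the symmetric data set gives zero.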
dask.array.stats.skewtest(a, axis=0, nan_policy='propagate')
Test whether the skew is different from the normal distribution.
This docstring was copied from scipy.stats.skewtest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the skewness of the population that the sample was drawn from is the same as that of a corresponding normal distribution.
- Parameters
- aarray
The data to be tested.
- axisint or None, optional
Axis along which statistics are calculated. Default is 0. If None, compute over the whole array a.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statisticfloat
The computed z-score for this test.
- pvaluefloat
Two-sided p-value for the hypothesis test.
Notes
The sample size must be at least 8.
References
- 1
R. B. D’Agostino, A. J. Belanger and R. B. D’Agostino Jr., “A suggestion for using powerful and informative tests of normality”, American Statistician 44, pp. 316-321, 1990.
Examples
>>> from scipy.stats import skewtest
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8])
SkewtestResult(statistic=1.0108048609177787, pvalue=0.3121098361421897)
>>> skewtest([2, 8, 0, 4, 1, 9, 9, 0])
SkewtestResult(statistic=0.44626385374196975, pvalue=0.6554066631275459)
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8000])
SkewtestResult(statistic=3.571773510360407, pvalue=0.0003545719905823133)
>>> skewtest([100, 100, 100, 100, 100, 100, 100, 101])
SkewtestResult(statistic=3.5717766638478072, pvalue=0.000354567720281634)
-
dask.array.stats.
kurtosis
(a, axis=0, fisher=True, bias=True, nan_policy='propagate')¶ Compute the kurtosis (Fisher or Pearson) of a dataset.
This docstring was copied from scipy.stats.kurtosis.
Some inconsistencies with the Dask version may exist.
Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution.
If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.
Use kurtosistest to see if result is close enough to normal.
- Parameters
- aarray
Data for which the kurtosis is calculated.
- axisint or None, optional
Axis along which the kurtosis is calculated. Default is 0. If None, compute over the whole array a.
- fisherbool, optional
If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).
- biasbool, optional
If False, then the calculations are corrected for statistical bias.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
- Returns
- kurtosisarray
The kurtosis of values along an axis. If all values are equal, return -3 for Fisher’s definition and 0 for Pearson’s definition.
References
- 1
Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000.
Examples
In Fisher’s definition, the kurtosis of the normal distribution is zero. In the following example, the kurtosis is close to zero, because it was calculated from the dataset, not from the continuous distribution.
>>> from scipy.stats import norm, kurtosis
>>> data = norm.rvs(size=1000, random_state=3)
>>> kurtosis(data)
-0.06928694200380558
The distribution with a higher kurtosis has a heavier tail. The zero valued kurtosis of the normal distribution in Fisher’s definition can serve as a reference point.
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import scipy.stats as stats
>>> from scipy.stats import kurtosis
>>> x = np.linspace(-5, 5, 100)
>>> ax = plt.subplot()
>>> distnames = ['laplace', 'norm', 'uniform']
>>> for distname in distnames:
...     if distname == 'uniform':
...         dist = getattr(stats, distname)(loc=-2, scale=4)
...     else:
...         dist = getattr(stats, distname)
...     data = dist.rvs(size=1000)
...     kur = kurtosis(data, fisher=True)
...     y = dist.pdf(x)
...     ax.plot(x, y, label="{}, {}".format(distname, round(kur, 3)))
...     ax.legend()
The Laplace distribution has a heavier tail than the normal distribution. The uniform distribution (which has negative kurtosis) has the thinnest tail.
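The relationship between the Fisher and Pearson definitions described above can be sketched in plain Python. This is an illustration of the biased estimator only (the `bias=False` correction is not shown), and `sample_kurtosis` is a hypothetical helper, not the SciPy or Dask function.

```python
def sample_kurtosis(data, fisher=True):
    """Biased sample kurtosis m4 / m2**2; Fisher's definition subtracts 3.

    Assumes the data are not all equal, so the variance m2 is non-zero.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    pearson = m4 / m2 ** 2
    return pearson - 3.0 if fisher else pearson


data = [1, 2, 3, 4, 5]
print(sample_kurtosis(data, fisher=False))  # Pearson: normal ==> 3.0
print(sample_kurtosis(data, fisher=True))   # Fisher: normal ==> 0.0
```

The two results always differ by exactly 3, which is why the normal distribution maps to 3.0 under Pearson's definition and 0.0 under Fisher's.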
-
dask.array.stats.
kurtosistest
(a, axis=0, nan_policy='propagate')¶ Test whether a dataset has normal kurtosis.
This docstring was copied from scipy.stats.kurtosistest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the kurtosis of the population from which the sample was drawn is that of the normal distribution: kurtosis = 3(n-1)/(n+1).
- Parameters
- aarray
Array of the sample data.
- axisint or None, optional
Axis along which to compute test. Default is 0. If None, compute over the whole array a.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statisticfloat
The computed z-score for this test.
- pvaluefloat
The two-sided p-value for the hypothesis test.
Notes
Valid only for n>20. This function uses the method described in [1].
References
- 1
see e.g. F. J. Anscombe, W. J. Glynn, “Distribution of the kurtosis statistic b2 for normal samples”, Biometrika, vol. 70, pp. 227-234, 1983.
Examples
>>> import numpy as np
>>> from scipy.stats import kurtosistest
>>> kurtosistest(list(range(20)))
KurtosistestResult(statistic=-1.7058104152122062, pvalue=0.08804338332528348)
>>> np.random.seed(28041990)
>>> s = np.random.normal(0, 1, 1000)
>>> kurtosistest(s)
KurtosistestResult(statistic=1.2317590987707365, pvalue=0.21803908613450895)
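The reference value 3(n-1)/(n+1) in the null hypothesis depends on the sample size and approaches 3 as n grows. A small sketch, with `expected_normal_kurtosis` as a hypothetical helper name:

```python
def expected_normal_kurtosis(n):
    """Reference kurtosis 3(n-1)/(n+1) used in the null hypothesis above."""
    return 3.0 * (n - 1) / (n + 1)


# The reference value is noticeably below 3 for small samples.
for n in (20, 100, 1000):
    print(n, expected_normal_kurtosis(n))
```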
-
dask.array.stats.
normaltest
(a, axis=0, nan_policy='propagate')¶ Test whether a sample differs from a normal distribution.
This docstring was copied from scipy.stats.normaltest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that a sample comes from a normal distribution. It is based on D’Agostino and Pearson’s [1], [2] test that combines skew and kurtosis to produce an omnibus test of normality.
- Parameters
- aarray_like
The array containing the sample to be tested.
- axisint or None, optional
Axis along which to compute test. Default is 0. If None, compute over the whole array a.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- statistic: float or array
s^2 + k^2, where s is the z-score returned by skewtest and k is the z-score returned by kurtosistest.
- pvalue: float or array
A 2-sided chi squared probability for the hypothesis test.
References
- 1
D’Agostino, R. B. (1971), “An omnibus test of normality for moderate and large sample size”, Biometrika, 58, 341-348
- 2
D’Agostino, R. and Pearson, E. S. (1973), “Tests for departure from normality”, Biometrika, 60, 613-622
Examples
>>> import numpy as np
>>> from scipy import stats
>>> pts = 1000
>>> np.random.seed(28041990)
>>> a = np.random.normal(0, 1, size=pts)
>>> b = np.random.normal(2, 1, size=pts)
>>> x = np.concatenate((a, b))
>>> k2, p = stats.normaltest(x)
>>> alpha = 1e-3
>>> print("p = {:g}".format(p))
p = 3.27207e-11
>>> if p < alpha:  # null hypothesis: x comes from a normal distribution
...     print("The null hypothesis can be rejected")
... else:
...     print("The null hypothesis cannot be rejected")
The null hypothesis can be rejected
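The way the statistic is assembled from the two z-scores can be sketched as follows. This assumes the p-value is the upper tail of a chi-squared distribution with two degrees of freedom, for which the survival function has the closed form exp(-x/2). The function names and the z-scores are hypothetical, chosen only for illustration.

```python
import math


def omnibus_statistic(s, k):
    """Combine the skewtest z-score s and the kurtosistest z-score k."""
    return s * s + k * k


def chi2_sf_2dof(statistic):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical z-scores, not taken from any real data set.
k2 = omnibus_statistic(1.5, -2.0)
p = chi2_sf_2dof(k2)
print(k2, p)
```

Because the z-scores enter squared, a large deviation in either skewness or kurtosis is enough to produce a small p-value.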
-
dask.array.stats.
f_oneway
(*args)¶ Perform one-way ANOVA.
This docstring was copied from scipy.stats.f_oneway.
Some inconsistencies with the Dask version may exist.
The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.
- Parameters
- sample1, sample2, …array_like
The sample measurements for each group.
- Returns
- statisticfloat
The computed F-value of the test.
- pvaluefloat
The associated p-value from the F-distribution.
Notes
The ANOVA test has important assumptions that must be satisfied in order for the associated p-value to be valid:
1. The samples are independent.
2. Each sample is from a normally distributed population.
3. The population standard deviations of the groups are all equal. This property is known as homoscedasticity.
If these assumptions are not true for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (scipy.stats.kruskal) although with some loss of power.
The algorithm is from Heiman[2], pp.394-7.
References
- 1
R. Lowry, “Concepts and Applications of Inferential Statistics”, Chapter 14, 2014, http://vassarstats.net/textbook/
- 2
G.W. Heiman, “Understanding research methods and statistics: An integrated introduction for psychology”, Houghton, Mifflin and Company, 2001.
- 3
G.H. McDonald, “Handbook of Biological Statistics”, One-way ANOVA. http://www.biostathandbook.com/onewayanova.html
Examples
>>> import scipy.stats as stats
Here are some data [3] on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).
>>> tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
...              0.0659, 0.0923, 0.0836]
>>> newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835,
...            0.0725]
>>> petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
>>> magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764,
...            0.0689]
>>> tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]
>>> stats.f_oneway(tillamook, newport, petersburg, magadan, tvarminne)
(7.1210194716424473, 0.00028122423145345439)
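The F-value is the ratio of the between-group mean square to the within-group mean square. The plain-Python sketch below uses that textbook definition on the mussel data; `f_statistic` is a hypothetical helper, it is not the SciPy or Dask implementation, and it does not compute the p-value (which needs the F-distribution).

```python
def f_statistic(*groups):
    """One-way ANOVA F-value: between-group over within-group mean squares."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
             0.0659, 0.0923, 0.0836]
newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, 0.0725]
petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764, 0.0689]
tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]

f_value = f_statistic(tillamook, newport, petersburg, magadan, tvarminne)
print(f_value)
```

The printed value should agree with the F-value in the example above up to floating-point rounding.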
-
dask.array.stats.
moment
(a, moment=1, axis=0, nan_policy='propagate')¶ Calculate the nth moment about the mean for a sample.
This docstring was copied from scipy.stats.moment.
Some inconsistencies with the Dask version may exist.
A moment is a specific quantitative measure of the shape of a set of points. It is often used to calculate coefficients of skewness and kurtosis due to its close relationship with them.
- Parameters
- aarray_like
Input array.
- momentint or array_like of ints, optional
Order of central moment that is returned. Default is 1.
- axisint or None, optional
Axis along which the central moment is computed. Default is 0. If None, compute over the whole array a.
- nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
‘propagate’: returns nan
‘raise’: throws an error
‘omit’: performs the calculations ignoring nan values
- Returns
- n-th central momentndarray or float
The appropriate moment along the given axis or over all values if axis is None. The denominator for the moment calculation is the number of observations, no degrees of freedom correction is done.
Notes
The k-th central moment of a data sample is:
$$m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k$$
where n is the number of samples and $\bar{x}$ is the mean. This function uses exponentiation by squares [1] for efficiency.
References
Examples
>>> from scipy.stats import moment
>>> moment([1, 2, 3, 4, 5], moment=1)
0.0
>>> moment([1, 2, 3, 4, 5], moment=2)
2.0
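The Notes mention exponentiation by squares. The sketch below shows the general idea in plain Python: the power is built from repeated squarings, so an exponent k needs on the order of log2(k) multiplications. The helper names are hypothetical and the actual SciPy/Dask code is organized differently.

```python
def power_by_squaring(base, exponent):
    """Compute base**exponent for a non-negative integer exponent."""
    result = 1.0
    while exponent > 0:
        if exponent & 1:       # lowest bit set: this power of base contributes
            result *= base
        base *= base           # square for the next bit of the exponent
        exponent >>= 1
    return result


def central_moment(data, order=1):
    """k-th central moment, dividing by the number of observations."""
    n = len(data)
    mean = sum(data) / n
    return sum(power_by_squaring(x - mean, order) for x in data) / n


print(central_moment([1, 2, 3, 4, 5], order=1))
print(central_moment([1, 2, 3, 4, 5], order=2))
```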
-
dask.array.image.
imread
(filename, imread=None, preprocess=None)¶ Read a stack of images into a dask array
- Parameters
- filename: string
A globstring like ‘myfile.*.png’
- imread: function (optional)
Optionally provide custom imread function. Function should expect a filename and produce a numpy array. Defaults to skimage.io.imread.
- preprocess: function (optional)
Optionally provide custom function to preprocess the image. Function should expect a numpy array for a single image.
- Returns
- Dask array of all images stacked along the first dimension. All images will be treated as individual chunks.
Examples
>>> from dask.array.image import imread
>>> im = imread('2015-*-*.png')
>>> im.shape
(365, 1000, 1000, 3)
-
dask.array.gufunc.
apply_gufunc
(func, signature, *args, **kwargs)¶ Apply a generalized ufunc or similar python function to arrays.
signature determines if the function consumes or produces core dimensions. The remaining dimensions in given input arrays (*args) are considered loop dimensions and are required to broadcast naturally against each other.
In other terms, this function is like np.vectorize, but for the blocks of dask arrays. If the function itself shall also be vectorized, use vectorize=True for convenience.
- Parameters
- func: callable
Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.
- signature: string
Specifies what core dimensions are consumed and produced by func, according to the specification of numpy.gufunc signature [2].
- *args: numeric
Input arrays or scalars to the callable function.
- axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
- axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
- keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
- output_dtypes: Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
- output_sizes: dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
- vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
- allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
- **kwargs: dict
Extra keyword arguments to pass to func
- Returns
- Single dask.array.Array or tuple of dask.array.Array
References
Examples
>>> import dask.array as da
>>> import numpy as np
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> a = da.random.normal(size=(10, 20, 30), chunks=(5, 10, 30))
>>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a)
>>> mean.compute().shape
(10, 20)
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> a = da.random.normal(size=(20, 30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1, 40), chunks=(5, 1, 40))
>>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True)
>>> c.compute().shape
(10, 20, 30, 40)
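Since apply_gufunc is described as being like np.vectorize for blocks, the split between core and loop dimensions can be previewed with NumPy alone. The sketch below applies the same signature to in-memory arrays with the shapes used above; it illustrates the signature semantics only and involves no Dask chunking.

```python
import numpy as np


def outer_product(x, y):
    # Core operation on two 1-D inputs: (i),(j)->(i,j)
    return np.einsum("i,j->ij", x, y)


# np.vectorize loops over every dimension that is not named in the signature.
vectorized = np.vectorize(outer_product, signature="(i),(j)->(i,j)")

a = np.random.normal(size=(20, 30))      # loop dims (20,),   core dim i=30
b = np.random.normal(size=(10, 1, 40))   # loop dims (10, 1), core dim j=40
c = vectorized(a, b)

# Loop dims broadcast to (10, 20); core output dims (30, 40) are appended.
print(c.shape)  # (10, 20, 30, 40)
```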
-
dask.array.gufunc.
as_gufunc
(signature=None, **kwargs)¶ Decorator for dask.array.gufunc.
- Parameters
- signature: String
Specifies what core dimensions are consumed and produced by func, according to the specification of numpy.gufunc signature [2].
- axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
- axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
- keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
- output_dtypes: Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
- output_sizes: dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
- vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
- allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
- Returns
- Decorator for pyfunc that itself returns a gufunc.
References
Examples
>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10, 20, 30), chunks=(5, 10, 30))
>>> @da.as_gufunc("(i)->(),()", output_dtypes=(float, float))
... def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> mean, std = stats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=(20, 30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1, 40), chunks=(5, 1, 40))
>>> @da.as_gufunc("(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
... def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> c = outer_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
-
dask.array.gufunc.
gufunc
(pyfunc, **kwargs)¶ Binds pyfunc into dask.array.apply_gufunc when called.
- Parameters
- pyfunc: callable
Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.
- signature: String, keyword only
Specifies what core dimensions are consumed and produced by func, according to the specification of numpy.gufunc signature [2].
- axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
- axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
- keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
- output_dtypes: Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
- output_sizes: dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
- vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
- allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
- Returns
- Wrapped function
References
Examples
>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10, 20, 30), chunks=(5, 10, 30))
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> gustats = da.gufunc(stats, signature="(i)->(),()", output_dtypes=(float, float))
>>> mean, std = gustats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=(20, 30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1, 40), chunks=(5, 1, 40))
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> guouter_product = da.gufunc(outer_product, signature="(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
>>> c = guouter_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
-
dask.array.core.
map_blocks
(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)¶ Map a function across all blocks of a dask array.
- Parameters
- func: callable
Function to apply to every block in the array.
- args: dask arrays or other objects
- dtype: np.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
- chunks: tuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
- drop_axis: number or iterable, optional
Dimensions lost by the function.
- new_axis: number or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
- token: string, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
- name: string, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
- **kwargs:
Other keyword arguments to pass to function. Values must be constants (not dask.arrays).
See also
dask.array.blockwise
Generalized operation with control over block alignment.
Examples
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks
(10,)
>>> y.numblocks
(10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks.
>>> import numpy as np
>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])
Your block function can get information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (the positions of which are the first index), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example the chunk at index 4 in the first dimension), and the array location (for example the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.
>>> x.map_blocks(lambda x: x + 1, token='increment')
dask.array<increment, shape=(1000,), dtype=int64, chunksize=(100,), chunktype=numpy.ndarray>
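The 'array-location' entries in block_info follow directly from the chunk sizes: each block covers the half-open range between consecutive cumulative sums. A plain-Python sketch of that bookkeeping for one dimension, with `array_locations` as a hypothetical helper rather than a Dask function:

```python
from itertools import accumulate


def array_locations(chunks):
    """Return the (start, stop) range covered by each chunk along one axis."""
    stops = list(accumulate(chunks))
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))


chunks = (100,) * 10                 # ten blocks of 100 elements
locations = array_locations(chunks)
print(locations[4])                  # (400, 500), the block at chunk-location (4,)
```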
-
dask.array.core.
blockwise
(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)¶ Tensor operation: Generalized inner and outer products
A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.
- Parameters
- funccallable
Function to apply to individual tuples of blocks
- out_inditerable
Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
- *argssequence of Array, index pairs
Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)
- **kwargsdict
Extra keyword arguments to pass to function
- dtypenp.dtype
Datatype of resulting array.
- concatenatebool, keyword only
If true concatenate arrays along dummy indices, else provide lists
- adjust_chunksdict
Dictionary mapping index to function to be applied to chunk sizes
- new_axesdict, keyword only
New indexes and their dimension lengths
Examples
2D embarrassingly parallel operation from two arrays, x, and y.
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8') # z = x + y
Outer product multiplying x by y, two 1-d vectors
>>> z = blockwise(np.outer, 'ij', x, 'i', y, 'j', dtype='f8')
z = x.T
>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)
The transpose case above is illustrative because it does the same transposition both on each in-memory block, by calling np.transpose, and on the order of the blocks themselves, by switching the order of the index ij -> ji.
We can compose these same patterns with more variables and more complex in-memory functions
z = X + Y.T
>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')
Any index, like i, missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction). In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead, pass concatenate=True.
Inner product multiplying x by y, two 1-d vectors
>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8')
Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.
>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype)
New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype)
If the applied function changes the size of each chunk you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.
>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)
Include literals by indexing with None
>>> y = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)
-
dask.array.core.
normalize_chunks
(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)¶ Normalize chunks to tuple of tuples
This takes in a variety of input types and information and produces a full tuple-of-tuples result for chunks, suitable to be passed to Array or rechunk or any other operation that creates a Dask array.
- Parameters
- chunks: tuple, int, dict, or string
The chunks to be normalized. See examples below for more details
- shape: Tuple[int]
The shape of the array
- limit: int (optional)
The maximum block size to target in bytes, if freedom is given to choose
- dtype: np.dtype
- previous_chunks: Tuple[Tuple[int]] optional
Chunks from a previous array that we should use for inspiration when rechunking auto dimensions. If not provided but auto-chunking exists then auto-dimensions will prefer square-like chunk shapes.
Examples
Specify uniform chunk sizes
>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
Also passes through fully explicit tuple-of-tuples
>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
Cleans up lists to tuples
>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))
Expands integer inputs 10 -> (10, 10)
>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))
Expands dict inputs
>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))
The values -1 and None get mapped to full size
>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))
Use the value “auto” to automatically determine chunk sizes along certain dimensions. This uses the limit= and dtype= keywords to determine how large to make the chunks. The term “auto” can be used anywhere an integer can be used. See array chunking documentation for more information.
>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)
You can also use byte sizes (see dask.utils.parse_bytes) in place of “auto” to ask for a particular size
>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((250, 250, 250, 250, 250, 250, 250, 250),)
Respects null dimensions
>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))
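To make the uniform-size and full-size (-1 or None) rules from the examples above concrete, here is a pure-Python sketch of that expansion. It is not dask's implementation: the helper names expand_uniform and expand_shape are hypothetical, and it deliberately ignores "auto", byte-size strings, dict inputs, and zero-length dimensions.

```python
def expand_uniform(chunk_size, dim_length):
    """Split dim_length into blocks of chunk_size plus a smaller remainder block."""
    if chunk_size in (-1, None):
        # -1 and None mean "one chunk spanning the whole dimension".
        return (dim_length,)
    full, remainder = divmod(dim_length, chunk_size)
    blocks = (chunk_size,) * full
    if remainder:
        blocks += (remainder,)
    return blocks


def expand_shape(chunk_sizes, shape):
    """Apply expand_uniform to every dimension, giving a tuple of tuples."""
    return tuple(expand_uniform(c, n) for c, n in zip(chunk_sizes, shape))


uniform = expand_shape((2, 2), (5, 6))
full_size = expand_shape((5, -1), (10, 10))
```

Under these assumptions, `uniform` and `full_size` reproduce the tuple-of-tuples results shown for the corresponding normalize_chunks calls.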
Array Methods¶
-
class
dask.array.
Array
¶ Parallel Dask Array
A parallel nd-array comprised of many numpy arrays arranged in a grid.
This constructor is for advanced uses only. For normal use see the da.from_array function.
- Parameters
- dask: dict
Task dependency graph
- name: string
Name of array in dask
- shape: tuple of ints
Shape of the entire array
- chunks: iterable of tuples
block sizes along each dimension
- dtype: str or dtype
Typecode or data-type for the new Dask Array
- meta: empty ndarray
empty ndarray created with same NumPy backend, ndim and dtype as the Dask Array being created (overrides dtype)
See also
-
all
(axis=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.all.
Some inconsistencies with the Dask version may exist.
Returns True if all elements evaluate to True.
Refer to numpy.all for full documentation.
See also
numpy.all
equivalent function
-
any
(axis=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.any.
Some inconsistencies with the Dask version may exist.
Returns True if any of the elements of a evaluate to True.
Refer to numpy.any for full documentation.
See also
numpy.any
equivalent function
-
argmax
(axis=None, out=None)¶ This docstring was copied from numpy.ndarray.argmax.
Some inconsistencies with the Dask version may exist.
Return indices of the maximum values along the given axis.
Refer to numpy.argmax for full documentation.
See also
numpy.argmax
equivalent function
-
argmin
(axis=None, out=None)¶ This docstring was copied from numpy.ndarray.argmin.
Some inconsistencies with the Dask version may exist.
Return indices of the minimum values along the given axis of a.
Refer to numpy.argmin for detailed documentation.
See also
numpy.argmin
equivalent function
-
argtopk
(self, k, axis=-1, split_every=None)¶ The indices of the top k elements of an array.
See
da.argtopk
for docstring
-
astype
(self, dtype, **kwargs)¶ Copy of the array, cast to a specified type.
- Parameters
- dtype: str or dtype
Typecode or data-type to which the array is cast.
- casting: {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional
Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility.
‘no’ means the data types should not be cast at all.
‘equiv’ means only byte-order changes are allowed.
‘safe’ means only casts which can preserve values are allowed.
‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
‘unsafe’ means any data conversions may be done.
- copy: bool, optional
By default, astype always returns a newly allocated array. If this is set to False and the dtype requirement is satisfied, the input array is returned instead of a copy.
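A short sketch of how the casting rules play out, assuming dask and NumPy are installed; the input values are illustrative. The default ‘unsafe’ rule lets a float array be cast to integers, while requesting ‘safe’ for the same cast is expected to raise a TypeError.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1.5, 2.5, 3.5]), chunks=2)

# Default casting='unsafe': float64 -> int32 is allowed and truncates.
as_int = x.astype("i4")
values = as_int.compute()

# casting='safe' only allows value-preserving casts, so this one is rejected.
try:
    x.astype("i4", casting="safe")
    safe_cast_failed = False
except TypeError:
    safe_cast_failed = True
```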
-
property
blocks
¶ Slice an array by blocks
This allows blockwise slicing of a Dask array. You can perform normal Numpy-style slicing, but rather than slicing elements of the array you slice along blocks. For example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.
You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.
- Returns
- A Dask array
Examples
>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.blocks[0].compute()
array([0, 1])
>>> x.blocks[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.blocks[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.blocks[[-1, 0]].compute()
array([8, 9, 0, 1])
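The x.blocks[0, ::2] case mentioned in the description can be sketched in two dimensions as follows. The array contents and chunk sizes are illustrative assumptions.

```python
import numpy as np
import dask.array as da

a = np.arange(32).reshape(4, 8)
x = da.from_array(a, chunks=(2, 2))   # a 2 x 4 grid of 2 x 2 blocks

# First row of blocks, every other block: block columns 0 and 2.
row = x.blocks[0, ::2]
row_values = row.compute()

# The same selection written with ordinary element slicing in NumPy.
expected = np.hstack([a[:2, 0:2], a[:2, 4:6]])
```

Note that the result stays two-dimensional even though an integer was used for the first block index.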
-
choose
(choices, out=None, mode='raise')¶ This docstring was copied from numpy.ndarray.choose.
Some inconsistencies with the Dask version may exist.
Use an index array to construct a new array from a set of choices.
Refer to numpy.choose for full documentation.
See also
numpy.choose
equivalent function
-
clip
(min=None, max=None, out=None, **kwargs)¶ This docstring was copied from numpy.ndarray.clip.
Some inconsistencies with the Dask version may exist.
Return an array whose values are limited to [min, max]. One of max or min must be given.
Refer to numpy.clip for full documentation.
See also
numpy.clip
equivalent function
-
compute_chunk_sizes
(self)¶ Compute the chunk sizes for a Dask array. This is especially useful when the chunk sizes are unknown (e.g., when indexing one Dask array with another).
Notes
This function modifies the Dask array in-place.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array([-2, -1, 0, 1, 2], chunks=2)
>>> x.chunks
((2, 2, 1),)
>>> y = x[x <= 0]
>>> y.chunks
((nan, nan, nan),)
>>> y.compute_chunk_sizes()  # in-place computation
dask.array<getitem, shape=(3,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> y.chunks
((2, 1, 0),)
-
copy
(self)¶ Copy array. This is a no-op for dask.arrays, which are immutable
-
cumprod
(axis=None, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.cumprod.
Some inconsistencies with the Dask version may exist.
Return the cumulative product of the elements along the given axis.
Refer to numpy.cumprod for full documentation.
See also
numpy.cumprod
equivalent function
-
cumsum
(axis=None, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.cumsum.
Some inconsistencies with the Dask version may exist.
Return the cumulative sum of the elements along the given axis.
Refer to numpy.cumsum for full documentation.
See also
numpy.cumsum
equivalent function
-
dot
(b, out=None)¶ This docstring was copied from numpy.ndarray.dot.
Some inconsistencies with the Dask version may exist.
Dot product of two arrays.
Refer to numpy.dot for full documentation.
See also
numpy.dot
equivalent function
Examples
>>> a = np.eye(2)
>>> b = np.ones((2, 2)) * 2
>>> a.dot(b)
array([[2., 2.],
       [2., 2.]])
This array method can be conveniently chained:
>>> a.dot(b).dot(b)
array([[8., 8.],
       [8., 8.]])
-
flatten
([order])¶ This docstring was copied from numpy.ndarray.ravel.
Some inconsistencies with the Dask version may exist.
Return a flattened array.
Refer to numpy.ravel for full documentation.
See also
numpy.ravel
equivalent function
ndarray.flat
a flat iterator on the array.
-
property
itemsize
¶ Length of one array element in bytes
-
map_blocks
(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)¶ Map a function across all blocks of a dask array.
- Parameters
- func: callable
Function to apply to every block in the array.
- args: dask arrays or other objects
- dtype: np.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
- chunks: tuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
- drop_axis: number or iterable, optional
Dimensions lost by the function.
- new_axis: number or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
- token: string, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
- name: string, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
- **kwargs:
Other keyword arguments to pass to function. Values must be constants (not dask.arrays)
See also
dask.array.blockwise
Generalized operation with control over block alignment.
Examples
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,)) >>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks
(10,)
>>> y.numblocks
(10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks
>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])
Your block function can get information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (keyed by its position), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example (4,), the chunk at index 4 along the first dimension), and the array location (for example the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.
>>> x.map_blocks(lambda x: x + 1, token='increment')
dask.array<increment, shape=(100,), dtype=int64, chunksize=(10,), chunktype=numpy.ndarray>
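As a hedged illustration of using block_info for an input array, the sketch below adds each element's global position to a block. The function name add_offset is hypothetical, and dtype is passed explicitly so that dask does not have to rely on inferring it from the function.

```python
import numpy as np
import dask.array as da

x = da.zeros(10, chunks=5, dtype="i8")

def add_offset(block, block_info=None):
    # block_info[0] describes the first array argument; 'array-location'
    # holds one (start, stop) pair per dimension for this block.
    start, stop = block_info[0]["array-location"][0]
    return block + np.arange(start, stop)

y = x.map_blocks(add_offset, dtype="i8")
result = y.compute()
```

Because every block knows its own (start, stop) range, the blocks of zeros are turned into a global running index without any communication between blocks.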
-
map_overlap
(self, func, depth, boundary=None, trim=True, **kwargs)¶ Map a function over blocks of the array with some overlap
We share neighboring zones between blocks of the array, then map a function, then trim away the neighboring strips.
- Parameters
- func: function
The function to apply to each extended block
- depth: int, tuple, or dict
The number of elements that each block should share with its neighbors. If a tuple or dict then this can be different per axis.
- boundary: str, tuple, dict
How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan
- trim: bool
Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you.
- **kwargs:
Other keyword arguments valid in map_blocks
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
>>> import dask.array as da
>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])
-
max
(axis=None, out=None, keepdims=False, initial=<no value>, where=True)¶ This docstring was copied from numpy.ndarray.max.
Some inconsistencies with the Dask version may exist.
Return the maximum along a given axis.
Refer to numpy.amax for full documentation.
See also
numpy.amax
equivalent function
-
mean
(axis=None, dtype=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.mean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements along given axis.
Refer to numpy.mean for full documentation.
See also
numpy.mean
equivalent function
-
min
(axis=None, out=None, keepdims=False, initial=<no value>, where=True)¶ This docstring was copied from numpy.ndarray.min.
Some inconsistencies with the Dask version may exist.
Return the minimum along a given axis.
Refer to numpy.amin for full documentation.
See also
numpy.amin
equivalent function
-
moment
(self, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)¶ Calculate the nth centralized moment.
- Parameters
- order: int
Order of the moment that is returned, must be >= 2.
- axis: int, optional
Axis along which the central moment is computed. The default is to compute the moment of the flattened array.
- dtype: data-type, optional
Type to use in computing the moment. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
- keepdims: bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original array.
- ddof: int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.
- Returns
- moment: ndarray
References
- 1
Pebay, Philippe (2008), “Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments”, Technical Report SAND2008-6212, Sandia National Laboratories.
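A small sketch of what the central moment means in practice, assuming dask and NumPy are installed and using illustrative data. With the default ddof=0, the second central moment should agree with the variance, and higher orders can be checked against the direct NumPy formula mean((x - mean(x))**order).

```python
import numpy as np
import dask.array as da

data = np.array([1.0, 2.0, 3.0, 4.0])
x = da.from_array(data, chunks=2)

# Central moments computed chunk-wise by dask.
second = float(x.moment(2).compute())
fourth = float(x.moment(4).compute())

# Reference values computed directly with NumPy on the full array.
second_ref = float(np.mean((data - data.mean()) ** 2))
fourth_ref = float(np.mean((data - data.mean()) ** 4))
```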
-
property
nbytes
¶ Number of bytes in array
-
nonzero
()¶ This docstring was copied from numpy.ndarray.nonzero.
Some inconsistencies with the Dask version may exist.
Return the indices of the elements that are non-zero.
Refer to numpy.nonzero for full documentation.
See also
numpy.nonzero
equivalent function
-
property
partitions
¶ Slice an array by partitions. Alias of dask array .blocks attribute.
This alias allows you to write agnostic code that works with both dask arrays and dask dataframes.
This allows blockwise slicing of a Dask array. You can perform normal Numpy-style slicing, but rather than slicing elements of the array you slice along blocks. For example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.
You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.
- Returns
- A Dask array
Examples
>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.partitions[0].compute()
array([0, 1])
>>> x.partitions[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.partitions[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.partitions[[-1, 0]].compute()
array([8, 9, 0, 1])
>>> all(x.partitions[:].compute() == x.blocks[:].compute())
True
-
prod
(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)¶ This docstring was copied from numpy.ndarray.prod.
Some inconsistencies with the Dask version may exist.
Return the product of the array elements over the given axis
Refer to numpy.prod for full documentation.
See also
numpy.prod
equivalent function
-
ravel
([order])¶ This docstring was copied from numpy.ndarray.ravel.
Some inconsistencies with the Dask version may exist.
Return a flattened array.
Refer to numpy.ravel for full documentation.
See also
numpy.ravel
equivalent function
ndarray.flat
a flat iterator on the array.
-
rechunk
(self, chunks='auto', threshold=None, block_size_limit=None)¶ See da.rechunk for docstring
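A brief, illustrative sketch of the method, assuming dask is installed: rechunking changes only the block layout, not the values. The chunk sizes here are arbitrary choices for demonstration.

```python
import numpy as np
import dask.array as da

x = da.arange(12, chunks=3)          # chunks: (3, 3, 3, 3)

# A single integer gives uniform chunks along every dimension.
y = x.rechunk(4)

# A dict maps axis number to the new chunk size for that axis.
z = x.rechunk({0: 6})

same_values = np.array_equal(y.compute(), x.compute())
```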
-
repeat
(repeats, axis=None)¶ This docstring was copied from numpy.ndarray.repeat.
Some inconsistencies with the Dask version may exist.
Repeat elements of an array.
Refer to numpy.repeat for full documentation.
See also
numpy.repeat
equivalent function
-
reshape
(shape, order='C')¶ This docstring was copied from numpy.ndarray.reshape.
Some inconsistencies with the Dask version may exist.
Returns an array containing the same data with a new shape.
Refer to numpy.reshape for full documentation.
See also
numpy.reshape
equivalent function
Notes
Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).
-
round
(decimals=0, out=None)¶ This docstring was copied from numpy.ndarray.round.
Some inconsistencies with the Dask version may exist.
Return a with each element rounded to the given number of decimals.
Refer to numpy.around for full documentation.
See also
numpy.around
equivalent function
-
property
size
¶ Number of elements in array
-
squeeze
(axis=None)¶ This docstring was copied from numpy.ndarray.squeeze.
Some inconsistencies with the Dask version may exist.
Remove single-dimensional entries from the shape of a.
Refer to numpy.squeeze for full documentation.
See also
numpy.squeeze
equivalent function
-
std
(axis=None, dtype=None, out=None, ddof=0, keepdims=False)¶ This docstring was copied from numpy.ndarray.std.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation of the array elements along given axis.
Refer to numpy.std for full documentation.
See also
numpy.std
equivalent function
-
store
(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)¶ Store dask arrays in array-like objects, overwrite data in target
This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
- Parameters
- sources: Array or iterable of Arrays
- targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
- lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
- regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
- compute: boolean, optional
If true compute immediately, return dask.delayed.Delayed otherwise
- return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ...
>>> import h5py
>>> f = h5py.File('myfile.hdf5', mode='a')
>>> dset = f.create_dataset('/data', shape=x.shape,
...                         chunks=x.chunks,
...                         dtype='f8')
>>> store(x, dset)
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3])
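Because any object with numpy-style setitem works as a target, the behaviour can be sketched without HDF5 by storing into plain NumPy arrays. This is an illustrative example under that assumption; the array sizes and the region offsets are arbitrary.

```python
import numpy as np
import dask.array as da

source = np.arange(12, dtype="f8").reshape(3, 4)
x = da.from_array(source, chunks=(2, 2))

# Store chunk by chunk into a target of the same shape.
target = np.zeros_like(source)
da.store(x, target)

# Store into a sub-region of a larger target; the region's shape
# must match the shape of the source array.
big = np.zeros((5, 6))
da.store(x, big, regions=(slice(1, 4), slice(2, 6)))
```

Elements of `big` outside the region are left untouched.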
-
sum
(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)¶ This docstring was copied from numpy.ndarray.sum.
Some inconsistencies with the Dask version may exist.
Return the sum of the array elements over the given axis.
Refer to numpy.sum for full documentation.
See also
numpy.sum
equivalent function
-
swapaxes
(axis1, axis2)¶ This docstring was copied from numpy.ndarray.swapaxes.
Some inconsistencies with the Dask version may exist.
Return a view of the array with axis1 and axis2 interchanged.
Refer to numpy.swapaxes for full documentation.
See also
numpy.swapaxes
equivalent function
-
to_dask_dataframe
(self, columns=None, index=None, meta=None)¶ Convert dask Array to dask Dataframe
- Parameters
- columns: list or string
list of column names if DataFrame, single string if Series
- index: dask.dataframe.Index, optional
An optional dask Index to use for the output Series or DataFrame.
The default output index depends on whether the array has any unknown chunks. If there are any unknown chunks, the output has None for all the divisions (one per chunk). If all the chunks are known, a default index with known divisions is created.
Specifying index can be useful if you’re conforming a Dask Array to an existing dask Series or DataFrame, and you would like the indices to match.
- meta: object, optional
An optional meta parameter can be passed for dask to specify the concrete dataframe type to use for partitions of the Dask dataframe. By default, pandas DataFrame is used.
See also
-
to_delayed
(self, optimize_graph=True)¶ Convert into an array of dask.delayed objects, one per chunk.
- Parameters
- optimize_graph: bool, optional
If True [default], the graph is optimized before converting into dask.delayed objects.
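A minimal sketch of the result's layout, assuming dask is installed: the returned object is a NumPy object array with one Delayed per chunk, arranged like the block grid. The shape and chunk sizes below are illustrative.

```python
import numpy as np
import dask.array as da

x = da.ones((4, 6), chunks=(2, 3))    # a 2 x 2 grid of blocks

# One dask.delayed object per chunk, laid out like x.numblocks.
blocks = x.to_delayed()

# Computing a single Delayed yields that chunk as a NumPy array.
first = blocks[0, 0].compute()
```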
See also
-
to_hdf5
(self, filename, datapath, **kwargs)¶ Store array in HDF5 file
>>> x.to_hdf5('myfile.hdf5', '/x')
Optionally provide arguments as though to
h5py.File.create_dataset
>>> x.to_hdf5('myfile.hdf5', '/x', compression='lzf', shuffle=True)
See also
da.store
h5py.File.create_dataset
-
to_svg
(self, size=500)¶ Convert chunks from Dask Array into an SVG Image
- Parameters
- chunks: tuple
- size: int
Rough size of the image
- Returns
- text: An svg string depicting the array as a grid of chunks
Examples
>>> x.to_svg(size=500)
-
to_tiledb
(self, uri, *args, **kwargs)¶ Save array to the TileDB storage manager
See function to_tiledb() for argument documentation.
See https://docs.tiledb.io for details about the format and engine.
-
to_zarr
(self, *args, **kwargs)¶ Save array to the zarr storage format
See https://zarr.readthedocs.io for details about the format.
See function
to_zarr()
for parameters.
-
topk
(self, k, axis=-1, split_every=None)¶ The top k elements of an array.
See
da.topk
for docstring
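A short sketch of topk together with its companion argtopk, assuming dask and NumPy are installed and using illustrative data. A positive k selects the largest elements sorted from largest to smallest; a negative k is assumed here to select the smallest elements, sorted from smallest to largest, as described in the da.topk docstring.

```python
import numpy as np
import dask.array as da

data = np.array([5, 1, 8, 4, 6, 2])
x = da.from_array(data, chunks=3)

largest = x.topk(2).compute()        # the two largest values
positions = x.argtopk(2).compute()   # their indices in the original array
smallest = x.topk(-2).compute()      # the two smallest values
```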
-
trace
(offset=0, axis1=0, axis2=1, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.trace.
Some inconsistencies with the Dask version may exist.
Return the sum along diagonals of the array.
Refer to numpy.trace for full documentation.
See also
numpy.trace
equivalent function
-
transpose
(*axes)¶ This docstring was copied from numpy.ndarray.transpose.
Some inconsistencies with the Dask version may exist.
Returns a view of the array with axes transposed.
For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2D column vector, an additional dimension must be added. np.atleast_2d(a).T achieves this, as does a[:, np.newaxis]. For a 2-D array, this is a standard matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).
- Parameters
- axes: None, tuple of ints, or n ints
None or no argument: reverses the order of the axes.
tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.
n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)
- Returns
- out: ndarray
View of a, with axes suitably permuted.
See also
ndarray.T
Array property returning the array transposed.
ndarray.reshape
Give a new shape to an array without changing its data.
Examples
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.transpose()
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)
array([[1, 3],
       [2, 4]])
-
var
(axis=None, dtype=None, out=None, ddof=0, keepdims=False)¶ This docstring was copied from numpy.ndarray.var.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, along given axis.
Refer to numpy.var for full documentation.
See also
numpy.var
equivalent function
-
view
(self, dtype=None, order='C')¶ Get a view of the array as a new data type
- Parameters
- dtype:
The dtype by which to view the array. The default, None, results in the view having the same data-type as the original array.
- order: string
‘C’ or ‘F’ (Fortran) ordering
This reinterprets the bytes of the array under a new dtype. If that dtype does not have the same size as the original array then the shape will change.
Beware that both numpy and dask.array can behave oddly when taking shape-changing views of arrays under Fortran ordering. Under some versions of NumPy this function will fail when taking shape-changing views of Fortran ordered arrays if the first dimension has chunks of size one.
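The shape change can be seen in a small sketch with the default C ordering, assuming dask and NumPy are installed. The dtypes are chosen only for illustration: viewing 4-byte integers as 4-byte floats keeps the shape, while viewing them as 2-byte integers doubles the length of the last axis.

```python
import numpy as np
import dask.array as da

data = np.arange(4, dtype="i4")
x = da.from_array(data, chunks=2)

# Same item size: the shape and chunks are unchanged.
same_size = x.view("f4")

# Half the item size: the last axis (and each chunk on it) doubles.
halves = x.view("i2")
halves_values = halves.compute()
```

The bytes themselves are untouched in both cases; only their interpretation differs.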
-
property
vindex
¶ Vectorized indexing with broadcasting.
This is equivalent to numpy’s advanced indexing, using arrays that are broadcast against each other. This allows for pointwise indexing:
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> x = da.from_array(x, chunks=2)
>>> x.vindex[[0, 1, 2], [0, 1, 2]].compute()
array([1, 5, 9])
Mixed basic/advanced indexing with slices/arrays is also supported. The order of dimensions in the result follows those proposed for ndarray.vindex: the subspace spanned by arrays is followed by all slices.
Note: vindex provides more general functionality than standard indexing, but it also has fewer optimizations and can be significantly slower.