Up to date
This page is up to date for NumDot stable.
If you still find outdated information, please open an issue.
Changelog
Here you will find the release notes for each version of the library. Each section includes information about changes, improvements, and bug fixes.
Version 0.12 - 2026-06-17
Many bugs in this release were found by running NumDot against the Python array-api-tests conformance suite from the Consortium for Python Data API Standards.
Added
nd.where(condition, x, y) selects from x where condition is true and from y otherwise, with broadcasting across all three operands.
New elementwise math functions: nd.log2, nd.log10, nd.log1p, nd.expm1, nd.logaddexp, nd.hypot, nd.copysign, nd.signbit, and nd.floor_divide (Python-style flooring toward negative infinity, including for integer inputs).
nd.cumsum and nd.cumprod compute cumulative sums and products along an axis (or over the flattened input when axis is null).
nd.diff(a, n, axis) computes the n-th discrete difference along the given axis. Output shrinks by n along that axis (empty if n exceeds the axis length).
nd.meshgrid(arrays, indexing) builds coordinate grids from a list of 1-D arrays. indexing is a StringName accepting &"xy" (default, swaps the first two output axes) or &"ij".
nd.expand_dims(v, axis) inserts a length-1 dimension at the given axis (counts from the end if negative). Returns a view, no copy.
nd.broadcast_to(v, shape) returns a view of v stretched to shape. Front-padded axes and any input axes of length 1 are broadcast (zero-stride); the rest must match the target dimension or the call errors.
nd.squeeze(v, axes) accepts an optional axes argument (int or list) selecting which length-1 axes to drop. The requested axes must all be size 1 or the call errors. Without axes the previous behavior is unchanged: drop every length-1 axis.
nd.moveaxis accepts lists for src and dst (in addition to single ints), moving multiple axes in one call. nd.moveaxis(arr, [0, 1], [-1, -2]) moves the first two axes to the end in swapped order.
nd.roll(v, shift, axis) cyclically shifts elements; axis may be null (flatten), an int, or a list paired with shift. Negative and over-sized shifts are normalized.
nd.repeat(v, repeats, axis) repeats each element along axis. repeats is an int (every element) or an array (one count per element along the axis); a length-1 array broadcasts as a scalar. axis = null flattens first.
nd.argmax(a, axis) and nd.argmin(a, axis) return the int64 index of the max / min element along axis (flattened when axis is null).
nd.nonzero(a) returns one int64 1-D array per dimension with the indices of non-zero elements.
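The "Python-style" rounding noted for nd.floor_divide can be illustrated in plain Python. This is only a sketch of the two rounding conventions, not NumDot's implementation; trunc_divide is a hypothetical helper added here for contrast with C-style integer division.

```python
def trunc_divide(a: int, b: int) -> int:
    """C-style integer division: the quotient is rounded toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floor_divide(a: int, b: int) -> int:
    """Python-style integer division: the quotient is rounded toward negative infinity."""
    return a // b


# The two conventions only disagree when the operands have opposite signs
# and the division is inexact.
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, trunc_divide(a, b), floor_divide(a, b))
```

For (-7, 2) the truncating quotient is -3 while the flooring quotient is -4; for same-sign operands both agree.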
Changed
nd.split (and the nd.hsplit / nd.vsplit shorthands) now treats an integer indices_or_sections as the number of equal sub-arrays to produce, matching numpy. Previously the integer was interpreted as the size of each chunk, so nd.split(arr, 3, 0) on a length-3 axis returned one full-sized array instead of three length-1 arrays. The parameter has been renamed from indices_or_section_size to indices_or_sections to reflect this. Callers passing the chunk size will silently get different results: pass axis_length / old_value to recover the previous behavior.
Binary ops (nd.add, nd.subtract, comparisons, bitwise, ...) no longer widen the result dtype when one operand is a GDScript int / float / bool scalar. nd.add(uint8_arr, 5) now stays uint8 instead of returning int64, matching the Array API spec and numpy 2.x's NEP-50 promotion. To keep the old widening behavior, wrap the scalar as a typed array (e.g. nd.array(5, nd.Int64)).
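The nd.split change is easy to get wrong when migrating, so here is a minimal sketch of both interpretations using plain Python lists. Both function names are hypothetical. Two details are assumptions: raising on an uneven division is borrowed from numpy, and the short last chunk in the old-style helper is a guess, not documented NumDot behavior.

```python
def split_by_chunk_size(items, chunk_size):
    """Old interpretation: the integer is the size of each chunk."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def split_into_sections(items, sections):
    """New interpretation (numpy-style): the integer is the number of equal parts."""
    if len(items) % sections != 0:
        # Assumption: mirrors numpy.split, which rejects uneven divisions.
        raise ValueError("axis length is not divisible by the number of sections")
    size = len(items) // sections
    return [items[i * size:(i + 1) * size] for i in range(sections)]


arr = [10, 20, 30]
print(split_by_chunk_size(arr, 3))   # [[10, 20, 30]]
print(split_into_sections(arr, 3))   # [[10], [20], [30]]

# Migration rule from the note above: pass axis_length / old_value.
data = list(range(12))
old_value = 4
migrated = split_into_sections(data, len(data) // old_value)
print(migrated == split_by_chunk_size(data, old_value))  # True
```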
Fixed
nd.round and nd.rint now pass integer/bool arrays through unchanged (used to return null). nd.round also now works on complex arrays.
nd.clip accepts null for min and/or max to leave that side unbounded (used to error).
nd.clip follows the same scalar-promotion rule as binary ops, so nd.clip(uint8_arr, 0, 255) stays uint8.
Reductions (nd.sum, nd.mean, nd.min, nd.max, nd.std, nd.var, ...) accept axis lists in any order and with negative indices. nd.mean(arr, [1, 0]) used to return null.
nd.prod and nd.sum on narrow integer dtypes (int8..int32, uint8..uint32) accumulate at the wider output dtype (int64 / uint64) instead of overflowing. nd.prod([1291, 1291, 1291]) now returns 2151685171 instead of -2143282125.
nd.array(x, dtype) supports casting from complex sources: complex128 → complex64, complex → real (drops the imaginary part with a warning), and complex → bool. These previously errored.
Boolean conversion of complex arrays was inverted: ndb.all([1+0j]) came back false. nd.array(complex_arr, nd.Bool), ndb.all and ndb.any now return true for nonzero values.
nd.concatenate accepts null for axis, flattening inputs before concatenating.
nd.dumpb produces correct bytes for non-contiguous arrays (e.g. results of nd.flip / nd.transpose / strided slices). Saving and reloading these used to silently corrupt the data.
nd.transpose accepts negative axes in the permutation; nd.transpose(arr, [-1]) no longer returns null.
nd.flip accepts negative axes; nd.flip(arr, -1) used to crash.
Result dtype for nd.concatenate, nd.linspace, nd.arange, nd.matmul / nd.dot, and array-from-nested-array conversion follows numpy's result_type rules: uint8 + uint16 → uint16 (was int32), int32 + uint32 → int64, int64 + uint64 → float64.
nd.linspace(a, b, num) lands out[-1] exactly on b when endpoint is true (used to drift a few ULPs, e.g. 29.000000000000004 instead of 29.0).
nd.linspace(a, b, 1) returns [a] instead of [nan].
nd.bitwise_right_shift on negative signed integers now sign-extends when the shift count meets or exceeds the dtype's bit width (e.g. int32(-1) >> 32 now returns -1, was 0), matching the array-api spec.
nd.log no longer returns nan on complex inputs whose magnitude is near the dtype's maximum (e.g. complex64 with |z| ≈ 1.8e19).
nd.sign on complex inputs now returns z / |z| (e.g. nd.sign(0.5+1j) is 0.4472+0.8944j, not 1+0j), matching the array-api spec.
nd.abs on multi-element complex arrays no longer returns nan for elements whose magnitude is near the dtype's maximum (e.g. complex64 with |z| ≈ 1.8e19); 0-D scalars were already correct.
nd.eye with a very large k returns all zeros instead of placing a diagonal of ones (e.g. nd.eye(1, k=2**32) now returns [[0.0]], used to return [[1.0]]).
nd.divide on complex inputs whose magnitudes are near the dtype's maximum (e.g. complex64 with |denom| ≈ 1.8e19) used to return 0 or nan for the result; it now returns the correct quotient.
nd.arange with very large integer start / stop and a float step returned the wrong number of elements (e.g. nd.arange(-2305843009213692800, -2305843009213694530, -91.0) returned 17 instead of 20), and could return a non-empty array where it should have been empty.
nd.atan and nd.atanh on complex inputs at extreme magnitudes (e.g. atan(1+2.77e7j)) used to return ±π/2 with the wrong sign, inf, or nan; they now return the correct value.
nd.reshape from a 1-D array to a multi-dimensional shape used to silently produce column-major output (e.g. nd.reshape(nd.array([1, 2, 3, 4, 5, 6]), [2, 3]) returned [[1, 3, 5], [2, 4, 6]]); it now returns row-major [[1, 2, 3], [4, 5, 6]] to match numpy / NumDot's general convention.
nd.arange returns an empty array when step has the wrong sign for stop - start (used to return garbage data).
nd.arange with step = 0 is rejected with a clean error.
nd.arange with integer arguments above 2**53 could return the wrong number of elements; integer dtypes now use exact integer arithmetic.
Reductions over a non-last axis of a multi-dimensional array (e.g. nd.mean(matrix, 0), nd.sum, nd.std, nd.var, nd.max) returned an empty or wrong-shaped result on Linux and Windows builds; macOS was unaffected.
NDArray.to_godot_array returned wrong data (and could crash Godot) on arrays with more than two elements.
nd.concatenate / nd.hstack / nd.vstack no longer fail with a shape-mismatch error on a single NDArray.
Slice errors in array conversions (to_godot_array, to_packed_*, to_vector*, copy, as_type, iteration) raise a Godot error instead of crashing.
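Two of the numbers quoted in this list can be reproduced with plain Python integers. The sketch below is only an illustration of the arithmetic, not NumDot's code, and all helper names are hypothetical. The first part wraps a running product to int32, which is what a narrow accumulator does. The second part assumes the old nd.arange miscount came from rounding start and stop to float64 before subtracting; that is a plausible mechanism, not a confirmed one.

```python
import math


def wrap_int32(value: int) -> int:
    """Reduce a Python int to the value a two's-complement int32 would hold."""
    return (value + 2**31) % 2**32 - 2**31


def prod_int32(values):
    """Multiply while wrapping to int32 after every step (narrow accumulator)."""
    acc = 1
    for v in values:
        acc = wrap_int32(acc * v)
    return acc


def arange_length_float(start: int, stop: int, step: float) -> int:
    """Element count when start and stop are rounded to float64 first."""
    return max(0, math.ceil((float(stop) - float(start)) / step))


def arange_length_exact(start: int, stop: int, step: float) -> int:
    """Element count when the span is computed with exact integers first."""
    return max(0, math.ceil((stop - start) / step))


print(prod_int32([1291, 1291, 1291]))  # -2143282125 (overflowed)
print(math.prod([1291, 1291, 1291]))   # 2151685171 (wide accumulator)

start, stop, step = -2305843009213692800, -2305843009213694530, -91.0
print(arange_length_float(start, stop, step))  # 17
print(arange_length_exact(start, stop, step))  # 20
```

Near 2**61 adjacent float64 values are hundreds of integers apart, so rounding both endpoints changes the span from 1730 to 1536 and loses three elements.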
Version 0.11 - 2025-09-14
Fixed
nd.arange no longer produces garbage data.
nd.reshape did not provide a view (instead gave a copy) when given 0-D and 1-D arrays.
nd.reshape did not fill in the result array when given single-element 1-D arrays.
Multi-indexing used to crash when using more than 4 indices.
Version 0.10 - 2025-07-08
Careful, we have some breaking changes in this update! Make sure you read the section below closely before updating. I apologize for the inconvenience caused to running projects!
Added
nd.outer and nd.inner functions for dedicated vector multiplication.
nd.squeeze function.
Mathematical constants (pi, e, euler_gamma, inf, nan). These are currently added as functions, due to limitations of Godot's APIs.
Changed
array.get(null) will no longer select all elements. Instead, it will add a new dimension, analogous to np_array[None]. Use array.get(&":") to select all elements instead.
nd.reshape no longer re-interprets the previous shape (if layout is not row-major). Instead, it iterates the previous array in the correct order, filling elements one by one.
nd.flatten no longer makes a copy if it doesn't need to.
nd.reduce_dot is now called nd.sum_product.
Properties are now accessed without parentheses, e.g. array.shape instead of array.shape(). This holds for dtype, shape, size, buffer_dtype, buffer_size, buffer_size_in_bytes, ndim, strides, strides_layout, and strides_offset.
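The nd.reshape change can be pictured with a small strided-buffer model in plain Python. This is a sketch under stated assumptions, not NumDot's implementation: logical_order and reshape_rows are hypothetical helpers, strides are counted in elements rather than bytes, and only the 2-D case is handled.

```python
def logical_order(buffer, shape, strides):
    """Read a 2-D strided buffer in row-major logical order."""
    rows, cols = shape
    return [buffer[r * strides[0] + c * strides[1]]
            for r in range(rows) for c in range(cols)]


def reshape_rows(flat, rows, cols):
    """Fill a rows x cols nested list from a flat sequence, one element at a time."""
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Column-major storage of the matrix [[1, 2, 3], [4, 5, 6]].
buffer = [1, 4, 2, 5, 3, 6]
shape, strides = (2, 3), (1, 2)

# Re-interpreting the raw buffer scrambles the logical element order.
print(reshape_rows(buffer, 3, 2))  # [[1, 4], [2, 5], [3, 6]]

# Iterating in logical order first keeps the elements in sequence.
print(reshape_rows(logical_order(buffer, shape, strides), 3, 2))  # [[1, 2], [3, 4], [5, 6]]
```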
Fixed
array.get(0) and array.get(&"newaxis") no longer fail or crash the program.
Restored compatibility with older Linux OS by downgrading to GLIBC 2.35.
Version 0.9 - 2025-04-29
Changed
Updated xtensor to 0.26.0, xsimd to 13.2.0, xtl to 0.8.0. This may come with both improvements and regressions.
Fixed
nd.any and nd.all always incorrectly evaluated to true and false respectively.
Fixed nd.linspace for non-floating types.
Version 0.8 - 2025-03-05
The backend of NumDot has been completely rewritten! The changes cut the binary size in half and make some functions a lot faster. Some bugs may have been silently fixed or introduced in the process. Let us know if you run into any trouble! In addition, there are now unit tests that check the validity of functions against NumPy implementations as the ground truth. This should help make NumDot functions more reliable.
Added
Added nd.load and nd.saveb functions, to read from and write to the .npy format.
Added array_equiv function.
Added conversion functions for Plane, Quaternion, Projection and Basis.
Added NDArray functions for getting slices as variant types (if shape is compatible).
Added complex64 and complex128 array creation shorthand functions.
NumDot now appears in the plugins section of the editor preferences.
NumDot now builds for Android (x86_32, x86_64, arm32).
Changed
Optimizations in the build scheme and code architecture reduced the binary size by 50%.
array_equal now also checks for shape equivalence, and doesn't fail if the shapes are not broadcastable.
1-D array assignment is now about 3.5 times faster.
Single-slice indexing now has a lower latency.
Contiguous array conversions (from NDArray to godot arrays) have been optimized, and can be several times faster.
nd.add, nd.abs, nd.remainder and nd.pow no longer promote values to higher bit counts.
complex numbers can now be booleanized in some situations.
Fixed
Various functions resulting in bools were broken. This is now fixed.
NDArray found in godot arrays will now properly type hint the resulting array, avoiding accidental promotion.
arange was producing garbage data.
bitwise_left_shift and bitwise_right_shift were incorrectly promoting, and producing undefined behavior when the shift was larger than the bit count (now it defaults to 0).
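In C and C++, shifting a fixed-width integer by a count at or beyond its bit width is undefined behavior, which is why the shift count needs a guard. The sketch below models that guard in plain Python for 32-bit values. It is not NumDot's code and both helper names are hypothetical. The result of 0 for oversized shifts follows the note above, and the -1 result for negative right shifts follows the 0.12 entry on nd.bitwise_right_shift.

```python
INT32_BITS = 32


def left_shift_uint32(value: int, count: int) -> int:
    """Left shift of a uint32 value; oversized counts shift every bit out."""
    if count < 0:
        raise ValueError("negative shift count")
    if count >= INT32_BITS:
        return 0
    # Mask to 32 bits because Python ints never overflow on their own.
    return (value << count) & 0xFFFFFFFF


def right_shift_int32(value: int, count: int) -> int:
    """Arithmetic right shift of an int32 value with a guarded count."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit in int32")
    if count < 0:
        raise ValueError("negative shift count")
    if count >= INT32_BITS:
        # Only the sign survives: -1 for negative inputs, 0 otherwise.
        return -1 if value < 0 else 0
    return value >> count


print(left_shift_uint32(1, 31))   # 2147483648
print(left_shift_uint32(1, 32))   # 0
print(right_shift_int32(-1, 32))  # -1
print(right_shift_int32(5, 40))   # 0
```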
Version 0.7 - 2024-11-12
Added
Added complex number data types (complex64 and complex128).
Added real, imag, conjugate and angle functions for complex numbers.
Added complex_as_vector and vector_as_complex functions for convenient complex number creation and manipulation, similar to real and imag.
Added any layout type, which may bring tiny speed improvements.
Added fft and fft_freq functions.
Added pad function.
Added cross function.
Added ndarray.buffer_size and ndarray.buffer_dtype functions for investigation of underlying buffer types.
Added bitwise functions (bitwise_and, bitwise_or, bitwise_xor, bitwise_not, bitwise_left_shift, bitwise_right_shift).
Added matrix diagonal, diag and trace functions.
Added transpose and flatten to NDArray methods.
Added is_close, array_equal and all_close functions.
Added is_nan, is_inf and is_finite functions.
Changed
In-place adaptations of native godot types speed up conversions (to and from NumDot). In particular, in-place adaptations of packed arrays do not need to copy data on read, and will produce instantaneous copy-on-write copies on to_packed_xxx_array calls for the same type.
ndarray.array_size_in_bytes is now called ndarray.buffer_size_in_bytes.
Custom builds can now disable each function / feature individually. This allows for very fine control of what to include in a custom build, which can reduce NumDot builds down to almost 0 MB.
Removed
NUMDOT_DISABLE_GODOT_CONVERSION_FUNCTIONS to improve usability. A similar option may be re-added to de-optimize conversions to save space.
Functions no longer declare unnecessary default values. transpose() can now be called without passing a parameter, which reverses the axes.
Fixed
arange produced 0-size arrays when at least two arguments were passed.
Version 0.6 - 2024-10-28
Added
randn function (random sampling from a normal distribution).
For custom builds, OpenMP support (through the new openmp_threshold compile option, disabled by default). This requires your compiler to support OpenMP.
Support for web exports (wasm32).
Changed
Contiguous scalar assignment (e.g. array.set(0)) is now about 20x as fast as before.
Mask assignments with scalars are now a bit faster.
Fixed
Assignment from boolean to boolean arrays didn't work properly.
Setting with a mask of incompatible shape to the array didn't properly fail.
Version 0.5 - 2024-10-07
Added
Bounds checks are now enabled everywhere.
Negative indices are now supported everywhere.
Added boolean mask indexing, e.g. a.set(5, nd.greater(a, 5)).
Added index list indexing, e.g. a.set(5, [[0, 1], [4, 2]]).
Added a basic convolve function.
Added the sliding_window_view function.
Added array.copy() and nd.copy(array) functions.
Added positive and negative functions.
Added the count_nonzero function.
Added concatenate, hstack and vstack functions.
Added split, hsplit and vsplit functions.
Added the tile function.
Added scalar optimizations for binary functions. This will greatly accelerate calls like nd.add(array, 5), at the cost of some binary size. This behavior can be disabled with the build flag NUMDOT_DISABLE_SCALAR_OPTIMIZATION.
nd.matmul can now handle matrix-vector multiplication.
Changed
array.set(x) should now be slightly faster when only a single element is updated.
Accelerated sum_product. This also affects matmul, dot, and convolve operations.
Fixed
nd.range now behaves properly when called as nd.range(x, null) (i.e. range from x to end).
NDArray interpretation inside of Arrays would result in inhomogenous shape errors.
Fixed NDArray.to_godot_array() producing garbage data and shapes.
Fixed NDArray.to_packed_xxx producing arrays that were too large.
Fixed zeros_like and similar producing garbage arrays when the dtype is not given.
Version 0.4 - 2024-10-03
Added
Added NDRandomGenerator, created by nd.default_rng. It offers .random() for floats, .integers for ints and .spawn() for child generators.
Added new namespaces ndb, ndf and ndi, for full tensor reductions to bool, float and int, respectively.
Added nd.median.
NDArray is now iterable over the outermost dimension.
Added NDArray conversion functions to and from Color, Vector2, Vector3, Vector4, Vector2i, Vector3i, Vector4i, PackedVector2Array, PackedVector3Array, PackedVector4Array and PackedColorArray.
Added nd.as_array shorthands for every data type, e.g. nd.float32.
(Now really) added the logical_xor function.
Added nd.eye.
Added nd.empty_like, nd.full_like, nd.ones_like and nd.zeros_like.
Added NDArray.strides(), NDArray.strides_layout(), and NDArray.strides_offset(), through which you can inspect the strides properties of an NDArray / NDArray view.
Changed
nd.array and nd.as_array, NDArray.get_float, NDArray.get_int, NDArray.get_bool are now up to 2x faster.
NDArray.to_godot_array now slices into the outermost dimension instead of flattening the array. To get floats and ints directly, use .to_packedxxx.
NDArray.to_packed_xxx now requires 0D or 1D arrays to work. If the array is 2D, the conversion is not trivial, and a reshape should be used first.
NumDot now uses Vector4i as a surrogate for range objects. They are represented as (bitmask, start, stop, step). This optimizes range creation, interpretation and memory use.
Version 0.3 - 2024-09-25
Added
Added the dot and sum_product functions.
Added the matmul function.
nd.array([...]) can now handle more complex array inputs, e.g. an array of Vector2i.
Added the stack and unstack functions.
Added NDArray to_bool and get_bool functions.
nd.full now supports bools and arrays for the fill value.
Axes, shape and permutation parameters now support more argument types (including NDArrays).
Added NUMDOT_COPY_FOR_ALL_INPLACE_OPERATIONS flag. This flag allows custom builds to de-optimize in-place operations even for optimal types. This reduces the binary size.
Added NUMDOT_OPTIMIZE_ALL_INPLACE_OPERATIONS flag. This flag allows custom builds to optimize all in-place operations, even for non-optimal target types. This increases the binary size a lot and is not recommended.
Changed
In-place operations with optimal destination types are now optimized by default.
Removed
NUMDOT_ASSIGN_INPLACE_DIRECTLY_INSTEAD_OF_COPYING_FIRSTcompile flag.
Fixed
NDArray set didn't honor the index parameters, and didn't broadcast.
Version 0.2 - 2024-09-20
Added
Added an in-place API to NDArray objects, mirroring the nd API. In-place functions can substantially improve performance for small arrays, because creation of intermediate types is avoided.
Added the NUMDOT_ASSIGN_INPLACE_DIRECTLY_INSTEAD_OF_COPYING_FIRST compiler flag, which improves performance of same-type assignment while increasing the binary size.
Added the norm function (l0, l1, l2 and linf supported).
Added the logical_xor function.
Added the any and all functions.
Added the square function.
Added the clip function.
nd.array can now interpret multi-dimensional boolean arrays.
Documentation is now available in the editor.
Changed
Reduced the binary size by half. In exchange, operations that need a cast before running are now ~25% slower. The C define NUMDOT_CAST_INSTEAD_OF_COPY_FOR_ARGUMENTS lets you revert to the old behavior.
Optimized the compiler arguments for the release binary. On web, it optimizes for size (~30% decrease). For downloadable binaries, it optimizes for performance (2% to 30% increase). You can use custom builds to change the default behavior.
Fixed
Reduction functions now behave properly when casting (they used to crash or produce meaningless results).
Array creation could often lead to the wrong dtype.
nd.prod erroneously evaluated as nd.sum.
Version 0.1 - 2024-09-17
Initial release.