NDArray | A Multidimensional Array library for Swift | Machine Learning library
kandi X-RAY | NDArray Summary
NDArray is a multidimensional array library written in Swift that aims to become the equivalent of NumPy in Swift's emerging data science ecosystem. The project is in a very early stage and has a long but exciting road ahead!
NDArray Key Features
NDArray Examples and Code Snippets
import NDArray
let a = NDArray([
[1, 2, 3],
[4, 5, 6],
])
let b = NDArray([
[7, 8, 9],
[10, 11, 12],
])
print((a + b) * a)
/*
NDArray[2, 3]([
[8, 20, 36],
[56, 80, 108],
])
*/
import NDArray
struct Point: AdditiveArithmetic
Community Discussions
Trending Discussions on NDArray
QUESTION
From the numpy documentation:
numpy.ndarray.nbytes attribute
ndarray.nbytes: Total bytes consumed by the elements of the array.
Notes
Does not include memory consumed by non-element attributes of the array object.
The following code:
...
ANSWER
Answered 2022-Apr-14 at 07:29
In Python (and in general), one int64 consumes 8 bytes of memory.
Slicing x[0] gives you 10 elements:
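A hedged reconstruction of the arithmetic (the question's array is not shown above, so the shape below is assumed purely for illustration):

import numpy as np

# Hypothetical array: int64 elements, second dimension of 10 so that x[0]
# yields 10 elements as in the answer.
x = np.zeros((3, 10), dtype=np.int64)

print(x.nbytes)     # 3 * 10 elements * 8 bytes = 240
print(x[0].nbytes)  # 10 elements * 8 bytes = 80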
QUESTION
I am fairly new to TensorFlow and I am having trouble with Dataset. I work on Windows 10, and the TensorFlow version is 2.6.0 used with CUDA. I have 2 numpy arrays, X_train and X_test (already split). The train set is 5 GB and the test set is 1.5 GB. The shapes are:
X_train: (259018, 30, 30, 3)
Y_train: (259018, 1)
I create Datasets using the following code:
...
ANSWER
Answered 2021-Sep-03 at 09:23
That's working as designed. from_tensor_slices is really only useful for small amounts of data. Dataset is designed for large datasets that need to be streamed from disk.
The hard but ideal way to do this would be to write your numpy array data to TFRecords and then read them in as a dataset via TFRecordDataset. Here's the guide:
https://www.tensorflow.org/tutorials/load_data/tfrecord
The easier but less performant way would be Dataset.from_generator. Here is a minimal example:
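A sketch of that approach, assuming the array shapes from the question (variable names and the batch size are illustrative):

import numpy as np
import tensorflow as tf

# Placeholder arrays with the shapes from the question.
X_train = np.zeros((259018, 30, 30, 3), dtype=np.float32)
Y_train = np.zeros((259018, 1), dtype=np.float32)

def generator():
    # Yield one (image, label) pair at a time so the full arrays are never
    # materialized inside the TensorFlow graph.
    for x, y in zip(X_train, Y_train):
        yield x, y

dataset = tf.data.Dataset.from_generator(
    generator,
    output_signature=(
        tf.TensorSpec(shape=(30, 30, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(1,), dtype=tf.float32),
    ),
).batch(32)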
QUESTION
When passing a numpy.ndarray of uint8 to numpy.logical_and, it runs significantly faster if I apply .view(bool) to its inputs.
ANSWER
Answered 2022-Feb-22 at 20:23
This is a performance issue of the current Numpy implementation. I can also reproduce this problem on Windows (using an Intel Skylake Xeon processor with Numpy 1.20.3). np.logical_and(a, b) executes very inefficient scalar assembly code based on slow conditional jumps, while np.logical_and(a.view(bool), b.view(bool)) executes relatively fast SIMD instructions.
Currently, Numpy uses a specific implementation for bool types. Regarding the compiler used, the general-purpose implementation can be significantly slower if the compiler used to build Numpy failed to automatically vectorize the code, which is apparently the case on Windows (and explains why this is not the case on other platforms, since the compiler is likely not exactly the same). The Numpy code can be improved for non-bool types. Note that the vectorization of Numpy is an ongoing work and we plan to optimize this soon.
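A small benchmark sketch reproducing the reported effect (array size and repeat count are arbitrary choices for illustration):

import numpy as np
import timeit

# Two uint8 arrays holding only 0s and 1s, as logical inputs usually do.
a = np.random.randint(0, 2, size=10_000_000, dtype=np.uint8)
b = np.random.randint(0, 2, size=10_000_000, dtype=np.uint8)

t_uint8 = timeit.timeit(lambda: np.logical_and(a, b), number=10)
t_bool = timeit.timeit(lambda: np.logical_and(a.view(bool), b.view(bool)), number=10)
print(f"uint8 inputs: {t_uint8:.3f} s")
print(f"bool views:   {t_bool:.3f} s")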
Here is the assembly code executed by np.logical_and(a, b):
QUESTION
I have a set of images represented by a 3d ndarray. Abstractly, what I want to do is delete an entire image if any of its pixel values is NaN. Imagine we have the following ndarray:
...
ANSWER
Answered 2022-Feb-10 at 22:46
As already answered by Michael Szczesny, a more Pythonic way would be:
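The original snippet is not reproduced above; a minimal sketch of the idea, assuming images has shape (n_images, height, width):

import numpy as np

images = np.random.rand(4, 5, 5)
images[1, 2, 3] = np.nan  # poison one image with a NaN

# Keep only the images whose 2-D planes contain no NaN at all.
clean = images[~np.isnan(images).any(axis=(1, 2))]
print(clean.shape)  # (3, 5, 5)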
QUESTION
Currently I have the following Cython function, which modifies entries of a numpy array filled with zeros in order to sum non-zero values. Before I return the array, I would like to trim it and remove all the rows that are entirely zero. At the moment, I use the numpy expression myarray = myarray[~np.all(myarray == 0, axis=1)] to do so. I was wondering if there is (in general) a faster way to do this using a Cython/C function instead of relying on python/numpy. This is one of the last bits of pythonic interaction in my script (checked by using %%cython -a). But I don't really know how to proceed with this problem. In general, I don't know a priori the number of nonzero elements in the final array.
ANSWER
Answered 2022-Jan-29 at 11:54
If the highest dimension always contains a small number of elements, like 6, then your code is not the best one.
First of all, myarray == 0, np.all and ~ create temporary arrays that introduce some additional overhead, as they need to be written and read back. The overhead depends on the size of the temporary array, and the biggest one is myarray == 0.
Moreover, Numpy calls perform some unwanted checks that Cython is not able to remove. These checks introduce a constant-time overhead, so it can be quite significant for small input arrays but not for big input arrays.
Additionally, the code of np.all could be faster if it knew the exact size of the last dimension, which is not the case here. Indeed, the loop of np.all could theoretically be unrolled since the last dimension is small. Unfortunately, Cython does not optimize Numpy calls, and Numpy is compiled for a variable input size that is not known at compile time.
Finally, the computation can be parallelized if lenpropen is huge (otherwise this will not be faster and could actually be slower). However, note that a parallel implementation requires the computation to be done in two steps: np.all(myarray == 0, axis=1) needs to be computed in parallel, and then you can create the resulting array and write it by computing myarray[~result] in parallel. In sequential code, you can directly overwrite myarray by filtering lines in-place and then produce a view of the filtered lines. This pattern is known as the erase-remove idiom. Note that this assumes the array is contiguous.
To conclude, a faster implementation consists of writing 2 nested loops iterating on myarray, with a constant number of iterations for the innermost one. Depending on the size of lenpropen, you can either use a sequential in-place implementation based on the erase-remove idiom, or a parallel out-of-place implementation in two steps (with a temporary array).
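A plain-numpy sketch of the sequential erase-remove idea (illustrative only; the answer suggests writing the equivalent typed loops directly in Cython):

import numpy as np

def compact_nonzero_rows(arr):
    # Erase-remove in-place: copy each non-zero row forward over the zero
    # rows, then return a view of the compacted front part.
    write = 0
    for read in range(arr.shape[0]):
        if arr[read].any():
            if write != read:
                arr[write] = arr[read]
            write += 1
    return arr[:write]

myarray = np.array([[0, 0, 0], [1, 2, 3], [0, 0, 0], [4, 5, 6]], dtype=np.int64)
print(compact_nonzero_rows(myarray))  # [[1 2 3] [4 5 6]]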
QUESTION
I'm writing a C++ ndarray class. I need both dynamic-sized and compile-time-size-known arrays (free-store allocated and stack allocated, respectively). I want to support initializing from nested std::initializer_list.
The dynamic-sized one is OK: this works perfectly.
...
ANSWER
Answered 2022-Jan-29 at 16:25
Yes, there's no reason why constexpr std::initializer_list would be unusable in compile-time initialization.
From your code snippet, it's unclear whether you've used in-class initialization for the StaticArray members, so one of the issues you could've run into is that a constexpr constructor can't use a trivial constructor for members, which would initialize them to unspecified run-time values.
So the fix for your example is to default-initialize the StaticArray members and mark the constructor, checkDims, addList and data as constexpr. To initialize a runtime StaticArray with a constexpr std::initializer_list validated at compile time, you can make the initializer expression manifestly constant-evaluated using an immediate function.
As you probably realize, it is impossible to initialize a run-time variable at compile time, so that's the best one can do.
If what you wanted is to validate at compile time the dimensions of an std::initializer_list that depends on runtime variables, it can't be done: the std::initializer_list is not constexpr, so its size isn't either. Instead, you can define a wrapper type around Scalar, mark its default constructor as deleted, and accept an aggregate type of these wrappers in the StaticArray constructor, for example a nested std::array of the desired dimensions, or, to avoid double braces, a C-style multidimensional array. Then, if the dimensions don't match, compilation will fail for either of two reasons: too many initializers, or the use of the deleted default constructor.
The code below compiles on godbolt with every GCC, Clang, MSVC version that supports C++20.
QUESTION
import pandas as pd
df = pd.DataFrame({
"col1" : ["a", "b", "c"],
"col2" : [[1,2,3], [4,5,6,7], [8,9,10,11,12]]
})
df.to_parquet("./df_as_pq.parquet")
df = pd.read_parquet("./df_as_pq.parquet")
[type(val) for val in df["col2"].tolist()]
...
ANSWER
Answered 2021-Dec-15 at 09:24
You can't change this behavior in the API, either when loading the parquet file into an arrow table or when converting the arrow table to pandas.
But you can write your own function that looks at the schema of the arrow table and converts every list field to a Python list:
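A sketch of such a helper, assuming pyarrow is available (the function name is illustrative):

import pyarrow as pa
import pyarrow.parquet as pq

def read_parquet_with_python_lists(path):
    # Read the parquet file into an arrow table, convert to pandas, then turn
    # every list-typed column back into plain Python lists.
    table = pq.read_table(path)
    df = table.to_pandas()
    for field in table.schema:
        if pa.types.is_list(field.type):
            df[field.name] = df[field.name].apply(list)
    return df

df = read_parquet_with_python_lists("./df_as_pq.parquet")
print([type(val) for val in df["col2"].tolist()])  # [<class 'list'>, ...]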
QUESTION
I'm trying to understand the performance differences I am seeing by using various numba implementations of an algorithm. In particular, I would expect func1d from below to be the fastest implementation since it is the only algorithm that is not copying data; however, from my timings func1b appears to be fastest.
ANSWER
Answered 2021-Dec-21 at 04:01
Here, copying the data doesn't play a big role: the bottleneck is how fast the tanh function is evaluated. There are many algorithms: some of them are faster, some of them are slower, some are more precise, some less.
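A quick timing sketch of that claim (array size and repeat count are arbitrary illustrative choices):

import numpy as np
import timeit

x = np.random.rand(1_000_000)

# Copying the input barely changes the runtime; evaluating tanh dominates.
t_tanh = timeit.timeit(lambda: np.tanh(x), number=20)
t_copy_tanh = timeit.timeit(lambda: np.tanh(x.copy()), number=20)
print(f"tanh only:   {t_tanh:.3f} s")
print(f"copy + tanh: {t_copy_tanh:.3f} s")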
Different numpy distributions use different implementations of the tanh function; e.g. it could be the one from mkl/vml or the one from the gnu-math-library. Depending on the numba version, either the mkl/svml implementation or the gnu-math-library one is used.
The easiest way to look inside is to use a profiler, for example perf.
For the numpy-version on my machine I get:
QUESTION
I have a class template for an N-dimensional array:
...
ANSWER
Answered 2021-Nov-09 at 07:09
constinit has exactly and only one semantic: the expression used to initialize the variable must be a constant expression.
That's it. That's all it does: it causes a compile error if the initializing expression isn't a valid constant expression. In every other way, such a variable is identical to one without constinit. Indeed, the early versions of the proposal had constinit as an attribute rather than a keyword, since it didn't really do anything; it only gave a compile error for invalid initialization of the variable.
For this case, there is no reason not to make the variable constexpr. You clearly don't intend to change the variable, so there's no point in making it modifiable.
QUESTION
I am trying to segment lung CT images using Kmeans with the code below:
...
ANSWER
Answered 2021-Sep-20 at 00:21
For this problem, I don't recommend using Kmeans color quantization, since this technique is usually reserved for situations where there are various colors and you want to segment them into dominant color blocks. Take a look at this previous answer for a typical use case. Since your CT scan images are grayscale, Kmeans would not perform very well. Here's a potential solution using simple image processing with OpenCV:
Obtain binary image. Load the input image, convert to grayscale, apply Otsu's threshold, and find contours.
Create a blank mask to extract desired objects. We can use np.zeros() to create an empty mask with the same size as the input image.
Filter contours using contour area and aspect ratio. We search for the lung objects by ensuring that contours are within a specified area threshold as well as aspect ratio. We use cv2.contourArea(), cv2.arcLength(), and cv2.approxPolyDP() for contour perimeter and contour shape approximation. If we have found our lung object, we use cv2.drawContours() to fill in our mask with white to represent the objects that we want to extract.
Bitwise-and the mask with the original image. Finally, we convert the mask to grayscale and bitwise-and it with cv2.bitwise_and() to obtain our result.
Here is our image processing pipeline visualized step-by-step:
Grayscale -> Otsu's threshold
Detected objects to extract highlighted in green -> Filled mask
Bitwise-and to get our result -> Optional result with white background instead
Code
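The code block itself is not reproduced here; below is a hedged sketch of the pipeline the steps above describe (the file name, threshold direction, area bounds and aspect-ratio window are illustrative assumptions, not values from the answer):

import cv2
import numpy as np

image = cv2.imread('ct.png')  # hypothetical input file
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Otsu's threshold to obtain a binary image (inverted so lungs become white).
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Blank mask with the same size as the input image.
mask = np.zeros(image.shape, dtype=np.uint8)

# Filter contours by area and aspect ratio, then fill the kept ones in white
# (OpenCV 4.x return signature assumed for findContours).
contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    area = cv2.contourArea(c)
    x, y, w, h = cv2.boundingRect(c)
    aspect_ratio = w / float(h)
    if 2000 < area < 50000 and 0.3 < aspect_ratio < 3:
        cv2.drawContours(mask, [c], -1, (255, 255, 255), -1)

# Bitwise-and the mask with the original image to extract the objects.
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
result = cv2.bitwise_and(image, image, mask=mask)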
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported