KDTree | Simple C++ KD-Tree implementation | Dataset library
kandi X-RAY | KDTree Summary
Simple C++ static KD-Tree implementation with minimal functionality.
Community Discussions
Trending Discussions on KDTree
QUESTION
I have been looking at producing a multiplication function to be used in a method called Conflation. The method can be found in the article "An Optimal Method for Consolidating Data from Different Experiments". The Conflation equation is

$$\&(f_1,\dots,f_F)(x) = \frac{\prod_{i=1}^{F} f_i(x)}{\int \prod_{i=1}^{F} f_i(y)\,dy}$$
I know that two lists can be multiplied together using the following code and functions:
...ANSWER
Answered 2021-Apr-02 at 17:12

In the second `prod_pdf` you are using computed PDFs, while in the first you were using defined distributions. So, in the second `prod_pdf` you already have the PDF. Thus, in the for loop you simply need to do `p_pdf = p_pdf * pdf`.
From the paper you linked, we know that "For discrete input distributions, the analogous definition of conflation is the normalized product of the probability mass functions". So you need not only to take the product of PDFs but also to normalize it. Thus, rewriting the equation for a discrete distribution, we get

$$\&(p_1,\dots,p_F)(x_k) = \frac{\prod_{i=1}^{F} p_i(x_k)}{\sum_{j=1}^{N} \prod_{i=1}^{F} p_i(x_j)}$$

where $F$ is the number of distributions we need to conflate and $N$ is the length of the discrete variable $x$.
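A minimal sketch of that normalized product, assuming the input PDFs have already been evaluated on a shared grid (the names and distributions here are illustrative, not the asker's):

import numpy as np
from scipy import stats

# Two example input distributions evaluated on a shared discrete grid.
x = np.linspace(-5, 5, 1001)
pdfs = [stats.norm(0.0, 1.0).pdf(x), stats.norm(0.5, 1.2).pdf(x)]

# Conflation for discrete inputs: multiply the PDFs pointwise...
p_pdf = np.ones_like(x)
for pdf in pdfs:
    p_pdf = p_pdf * pdf
# ...then normalize so the result sums to 1 over the grid.
p_pdf = p_pdf / p_pdf.sum()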
QUESTION
For a dataframe containing coordinate columns (e.g. 'x', 'y'), I would like to check whether the associated value 'val' deviates from the mean of 'val' in the local neighbourhood (distance to coordinates < radius). I found the following approach, which is often used (e.g. here or here): building a KDTree and querying the local mean for each row. However, I'm wondering if there is a better solution that avoids iterating over the dataframe and therefore executes faster.
...ANSWER
Answered 2021-Feb-27 at 17:35

There might be a way to avoid looping altogether that I haven't figured out yet, but an easy improvement you can apply is to place the values you need into arrays and then perform vectorized operations on those arrays. I did some tests and this averaged around a 40% decrease in execution time.
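As a sketch of that idea, one batched tree query can replace the per-row loop; the column names follow the question, while the data and radius are made up:

import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

# Illustrative data; in practice df is the dataframe from the question.
df = pd.DataFrame(np.random.rand(1000, 3), columns=['x', 'y', 'val'])
radius = 0.1

coords = df[['x', 'y']].to_numpy()
vals = df['val'].to_numpy()

tree = cKDTree(coords)
neighbours = tree.query_ball_point(coords, r=radius)  # one index list per row

# Deviation of each point's 'val' from the mean of its local neighbourhood.
local_mean = np.array([vals[i].mean() for i in neighbours])
df['deviation'] = vals - local_mean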
QUESTION
I am trying to create a kd-tree through scipy's KDTree class, built from objects rather than pure coordinates. The objects each have an (x, y) tuple, and the tree is based upon this, but I would like to include the object itself as/in the node.
Is there some "easy" approach to this? I had a look at "scipy kdtree with metadata", which says to use a third dimension as an object pointer(?). Wouldn't the tree then apply this value to the comparison of neighbors? I am also in the same boat as this gentleman, in that I'd be happy to skip writing my own kd-tree for now.
PS. This is my first post, so be gentle with me ;)
...ANSWER
Answered 2021-Feb-09 at 12:53

The API of scipy's KDTree wants a 2D array of coordinates as input, not any sort of object array. In this array the rows are the points and the columns are the coordinates of those points.
In the question you link to, he doesn't mean that there is a third dimension but a third index. Suppose you are looking for a single nearest neighbor and you query using some point: the function will return a distance and an index. The index is a reference into the array with which you built the tree, and the distance is the distance between your query point and the matched tree point.
So to use this tree you could keep two arrays: one with the object coordinates and a second one with the objects themselves. They should be in the same order, so that when a query returns an index, it means the same thing in both arrays.
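A sketch of that two-array setup (the object class and points are made up for illustration):

import numpy as np
from scipy.spatial import KDTree

class Thing:  # hypothetical object carrying an (x, y) position
    def __init__(self, name, x, y):
        self.name, self.x, self.y = name, x, y

things = [Thing('a', 0, 0), Thing('b', 1, 1), Thing('c', 2, 0)]

# Parallel arrays: coordinates feed the tree, objects stay alongside.
coords = np.array([(t.x, t.y) for t in things])
tree = KDTree(coords)

dist, idx = tree.query((0.9, 0.8), k=1)
nearest = things[idx]  # the returned index maps back to the object list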
PS. This is my first answer, so also be gentle :D
QUESTION
So I'm trying to find the k nearest neighbors in a pyvista numpy array from an example mesh. With the neighbors received, I want to implement some region growing in my 3D model.
But unfortunately I receive some weird output, which you can see in the following picture.
It seems like I'm missing something in the KDTree implementation. I was following the answer to a similar question: https://stackoverflow.com/a/2486341/9812286
ANSWER
Answered 2021-Jan-25 at 21:18

You're almost there :) The problem is that you are using the points in the mesh to build the tree, but then extracting cells. Of course these are unrelated, in the sense that indices for points will give you nonsense when applied as indices of cells.
Either you have to use `extract_points`:
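A sketch of that point-based route, using a stand-in mesh rather than the asker's model:

import numpy as np
import pyvista as pv
from scipy.spatial import KDTree

mesh = pv.Sphere()  # stand-in for the example mesh

tree = KDTree(np.asarray(mesh.points))  # tree built on points...
d, idx = tree.query(mesh.points[0], k=20)

# ...so the returned indices must be used to extract points, not cells.
region = mesh.extract_points(idx)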
QUESTION
I've got a package which I've previously successfully built on ReadTheDocs, but this is no longer the case. My imports are as follows:
...ANSWER
Answered 2021-Jan-25 at 13:26

Based on feedback from a helpful user, I eventually arrived at a less hacky solution. Unfortunately, the discussion went AWOL because of an unhelpful user, who responded with `pip install -U numpy`, waited for me to figure it out, edited their answer, and requested that I accept it. Upon being denied, the answer and comment thread vanished. I don't even remember your name, helpful user, so I can't credit you for the tip.
Apparently ReadTheDocs uses an old pip, and requiring `pip>=19.0` makes scikit-learn not install from source. As such, I added that line to `docs/requirements.txt`, which I had previously set up as a ReadTheDocs requirements file. This resulted in some progress: now rather than scikit-learn complaining about numpy, it was numba. Still, some synapses connected, and I just handled each dependency problem that arose via `docs/requirements.txt`, the final contents of which are:
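A plausible shape for that file, where everything beyond the pip line is an illustrative guess rather than the author's actual pins:

pip>=19.0
numpy
scikit-learn
numba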
QUESTION
I need a memory- and time-efficient method to compute distances between about 50000 points in 1 to 10 dimensions, in Python. The methods I have tried so far were not very good:
- `scipy.spatial.distance.pdist` computes the full distance matrix
- `scipy.spatial.KDTree.sparse_distance_matrix` computes the sparse distance matrix up to a threshold
To my surprise, `sparse_distance_matrix` was badly underperforming. The example I used was 5000 points chosen uniformly from the unit 5-dimensional ball, where `pdist` returned the result in 0.113 seconds and `sparse_distance_matrix` returned it in 44.966 seconds, using a threshold of 0.1 for the maximum distance cutoff.
At this point, I would just stick with `pdist`, but with 50000 points it will produce a numpy array of roughly 1.25 x 10^9 entries (n(n-1)/2 pairwise distances), and I'm concerned that it will overload the runtime memory.
Does anyone know a better method, or see a glaring mistake in my implementations? Thanks in advance!
Here's what's needed to reproduce the output in Python 3:
...ANSWER
Answered 2021-Jan-18 at 19:40

import time

import numpy as np
from sklearn.neighbors import BallTree

# `sample` stands in for the question's points (5000 points in 5 dimensions).
sample = np.random.uniform(-1, 1, size=(5000, 5))

tic = time.monotonic()
tree = BallTree(sample, leaf_size=10)
d, i = tree.query(sample, k=1)  # nearest neighbour of every point
toc = time.monotonic()
print(f"Time taken from Sklearn BallTree = {toc-tic}")
QUESTION
After many tries and a lot of research, I still can't link OpenGL. The software I'm trying to compile was made on Ubuntu 18.04, where it compiled fine; I'm now on Ubuntu 20.04.
This is the CMakeLists used on Ubuntu 18.04:
...ANSWER
Answered 2021-Jan-10 at 22:26

It's hard to diagnose your problem without a minimal reproducible example. But what I see is that your system has both `libGL` and `libOpenGL`. This may mean that `libGL` is just a proxy for `libglvnd` and doesn't contain any of the GL API functions.
But you shouldn't rush to link directly to `libOpenGL`. Different systems may be configured differently. Instead, the correct way of locating OpenGL with CMake is to use `find_package(OpenGL)` and then include `OpenGL::GL` in your `target_link_libraries`.
Example dummy project:
CMakeLists.txt
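A minimal sketch of such a project, with placeholder project and target names:

cmake_minimum_required(VERSION 3.10)
project(gl_dummy)

find_package(OpenGL REQUIRED)

add_executable(gl_dummy main.cpp)
# Link the imported target instead of hard-coding libGL or libOpenGL.
target_link_libraries(gl_dummy PRIVATE OpenGL::GL)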
QUESTION
Using the nanoflann library for k-nearest-neighbor searches based on KD-trees, I encountered some very strange behavior. My code is a simple set of queries:
...ANSWER
Answered 2020-Dec-02 at 05:28

The result set appears to be stateful: it's always showing you the nearest overall neighbor of all the points queried so far. For instance, if you loop from 5 to 10 you get `5 50` for each iteration.
Reinitialize the result set each iteration and you'll get your desired behavior:
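A self-contained sketch of that fix against nanoflann's usual adaptor API (the point cloud, types, and names here are illustrative):

#include <array>
#include <cstddef>
#include <iostream>
#include <vector>
#include <nanoflann.hpp>

// Minimal dataset adaptor in the shape nanoflann expects.
struct PointCloud {
    std::vector<std::array<double, 2>> pts;
    size_t kdtree_get_point_count() const { return pts.size(); }
    double kdtree_get_pt(size_t idx, size_t dim) const { return pts[idx][dim]; }
    template <class BBOX> bool kdtree_get_bbox(BBOX&) const { return false; }
};

using KDTree2D = nanoflann::KDTreeSingleIndexAdaptor<
    nanoflann::L2_Simple_Adaptor<double, PointCloud>, PointCloud, 2>;

int main() {
    PointCloud cloud;
    for (double v = 0; v < 100; ++v) cloud.pts.push_back({v, 10 * v});

    KDTree2D index(2, cloud, nanoflann::KDTreeSingleIndexAdaptorParams(10));
    index.buildIndex();

    for (double q = 5; q <= 10; ++q) {
        const double query_pt[2] = {q, 10 * q};
        size_t ret_index = 0;
        double out_dist_sqr = 0;
        // The fix: a fresh, re-initialized result set per query, so it
        // cannot carry the previous iteration's best match forward.
        nanoflann::KNNResultSet<double> result_set(1);
        result_set.init(&ret_index, &out_dist_sqr);
        index.findNeighbors(result_set, query_pt, nanoflann::SearchParams());
        std::cout << q << " -> index " << ret_index << "\n";
    }
}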
QUESTION
Problem statement:
I have 150k points in a 3D space, with their coordinates stored in a matrix of dimension [150k, 3], in mm. I want to find all the neighbors of a given point `p` that are within a radius `r`. And I want to do that in the most accurate way.
How should I choose my `leafsize` parameter?
ANSWER
Answered 2020-Nov-25 at 19:30

The function `query_ball_point` will return the correct set of points for any version of the search tree. The `leafsize` parameter does not impact the results of the query, only its performance.
Imagine two trees built from the same data but with different leafsize parameters, and a query searching for all points inside a red circle.
In both cases, the code will only return the two points that lie inside the red circle. This is done by checking all points in all boxes of the tree that intersect the circle. This leads to a different amount of work (i.e., different performance) in each case. For the left tree (corresponding to a larger leafsize), the algorithm has to check if 13 points are inside the circle (6 in the upper intersecting box and 7 in the lower intersecting box). In the right tree (which has a smaller leaf size), only three points get processed (one in the upper intersecting box and two in the lower intersecting box).
Following this logic, you may think it just makes sense to always use a small leaf size: this will minimize the number of actual comparisons at the end of the algorithm (to decide whether the points actually lie in the query region). But it isn't that simple: a smaller leaf size will generate a deeper tree, adding cost to the construction time and to the tree-traversal time. Getting the right balance between tree-traversal performance and leaf-level comparisons really depends on the type of data going into the tree and the specific leaf-level comparisons you are doing. That is why scipy provides the leafsize parameter as an argument, so you can tune things to perform best on a particular algorithm.
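A sketch of measuring that trade-off on data shaped like the question's (the radius and leafsize values are made up):

import time

import numpy as np
from scipy.spatial import cKDTree

pts = np.random.rand(150_000, 3) * 100.0  # stand-in coordinates in mm
p, r = pts[0], 5.0

for leafsize in (8, 64, 512):
    tic = time.monotonic()
    tree = cKDTree(pts, leafsize=leafsize)
    neighbours = tree.query_ball_point(p, r)
    toc = time.monotonic()
    # The neighbour set is identical for every leafsize; only timing changes.
    print(leafsize, len(neighbours), f"{toc - tic:.3f}s")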
QUESTION
I'm on macOS Mojave 10.14.6 and I'm trying to compile some required extension modules in C and C++ from this repository with:
python setup.py build_ext --inplace
which gives me the following error:
...ANSWER
Answered 2020-Oct-15 at 15:58

Here are a few hints:

- Use gcc instead of llvm or clang for painless OpenMP support on macOS. Note that Apple's default gcc is just an alias for Apple clang, as you'll see with `gcc --version`. You can install the real gcc with Homebrew: `brew install gcc`.
- Then use `export CC='gcc-10'` (the newest version should be gcc 10.x) inside the same terminal window to use Homebrew's gcc temporarily as your C compiler.
- There's no need to set `CXXFLAGS` or `CFLAGS`. The required flags are set by distutils/setuptools inside the setup.py.
- You won't be able to compile `dmc_cuda_module` on macOS 10.14.6. The latest macOS version Nvidia offers CUDA drivers for is 10.13.6. So you might comment out this part of the setup.py and hope for the best that you don't need this module...
- Some of the Extensions inside the `setup.py` aren't including the numpy headers while using the numpy C-API. On macOS it's necessary to include the numpy headers for each Extension (see this comment). So you have to add `include_dirs=[numpy_include_dir]` to those Extensions.

Edit: As discussed in the chat, the error was due to the conda env ignoring the CC variable. After installing python+pip via homebrew and the required python packages via pip, this answer's steps worked for the OP.
All in all, here's a setup.py that worked for me (macOS 10.15.7, gcc-10):
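A sketch of the numpy-headers part of such a setup.py (module and file names are placeholders, not the repository's):

import numpy
from setuptools import setup
from setuptools.extension import Extension

numpy_include_dir = numpy.get_include()

ext_modules = [
    Extension(
        'example_module',                  # placeholder module name
        sources=['example_module.c'],
        include_dirs=[numpy_include_dir],  # numpy headers for the C-API
    ),
]

setup(name='example', ext_modules=ext_modules)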
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported