
numpy | fundamental package for scientific computing | Data Manipulation library

by numpy | Python | Version: v1.22.3 | License: BSD-3-Clause


kandi X-RAY | numpy Summary

numpy is a Python library typically used in Utilities, Data Manipulation, and NumPy applications. numpy has a build file available, a permissive license, and medium support. However, numpy has 152 bugs and 5 reported vulnerabilities. You can download it from GitHub.
NumPy is the fundamental package for scientific computing with Python.

Support

  • numpy has a medium-activity ecosystem.
  • It has 20,101 stars, 6,774 forks, and 560 watchers.
  • There were 10 major releases in the last 12 months.
  • There are 2,040 open issues and 8,413 closed ones. On average, issues are closed in 285 days. There are 230 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of numpy is v1.22.3.
numpy Support
Best in #Data Manipulation
Average in #Data Manipulation

Quality

  • numpy has 152 bugs (5 blocker, 0 critical, 140 major, 7 minor) and 1954 code smells.
numpy Quality
Best in #Data Manipulation
Average in #Data Manipulation

Security

  • numpy itself has no publicly reported vulnerabilities, and neither do its dependent libraries.
  • However, kandi's static code analysis flags 5 unresolved vulnerabilities (5 blocker, 0 critical, 0 major, 0 minor).
  • There are 26 security hotspots that need review.
numpy Security
Best in #Data Manipulation
Average in #Data Manipulation

License

  • numpy is licensed under the BSD-3-Clause License. This license is Permissive.
  • Permissive licenses have the least restrictions, and you can use them in most projects.
numpy License
Best in #Data Manipulation
Average in #Data Manipulation

Reuse

  • numpy releases are available to install and integrate.
  • A build file is available, so you can build the component from source.
  • Installation instructions are not available, but examples and code snippets are.
  • numpy saves you an estimated 126,151 person-hours of effort over developing the same functionality from scratch.
  • It has 132,876 lines of code, 10,561 functions, and 454 files.
  • It has medium code complexity; code complexity directly impacts the maintainability of the code.
numpy Reuse
Best in #Data Manipulation
Average in #Data Manipulation
Top functions reviewed by kandi - BETA

kandi has reviewed numpy and identified the functions below as its top functions. This is intended to give you instant insight into the functionality numpy implements and to help you decide whether it suits your requirements.

  • Create a configuration object.
  • Create a row from a text file.
  • Analyze the group.
  • Einsum operator.
  • Analyze a block.
  • Pad an array with a given padding.
  • Compute the gradient of a function.
  • Calculate the percentile of an array.
  • Compute the einsum path.
  • Read data from a file.
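Several of the functions in that list are public NumPy APIs. As a quick illustration of a few of them (the sample data here is made up):

```python
import numpy as np

a = np.array([1.0, 2.0, 4.0, 7.0])

# "Pad an array with a given padding": repeat the edge values once on each side
padded = np.pad(a, pad_width=1, mode="edge")   # [1. 1. 2. 4. 7. 7.]

# "Compute the gradient of a function": central differences in the interior,
# one-sided differences at the edges
grad = np.gradient(a)                          # [1.  1.5 2.5 3. ]

# "Calculate the percentile of an array": the 50th percentile is the median
median = np.percentile(a, 50)                  # 3.0

# "Einsum operator": here spelling out a plain dot product
dot = np.einsum("i,i->", a, a)                 # 70.0
```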

numpy Key Features

Website: https://www.numpy.org

Documentation: https://numpy.org/doc

Mailing list: https://mail.python.org/mailman/listinfo/numpy-discussion

Source code: https://github.com/numpy/numpy

Contributing: https://www.numpy.org/devdocs/dev/index.html

Bug reports: https://github.com/numpy/numpy/issues

Report a security vulnerability: https://tidelift.com/docs/security

a powerful N-dimensional array object

sophisticated (broadcasting) functions

tools for integrating C/C++ and Fortran code

useful linear algebra, Fourier transform, and random number capabilities
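A minimal sketch of those features together (all calls are standard NumPy APIs; the values are arbitrary):

```python
import numpy as np

# A powerful N-dimensional array object
m = np.arange(6, dtype=float).reshape(2, 3)

# Sophisticated (broadcasting) functions: the 1-D row is stretched
# across both rows of the 2-D array
shifted = m + np.array([10.0, 20.0, 30.0])     # shape (2, 3)

# Useful linear algebra ...
x = np.linalg.solve(np.eye(2), np.ones(2))     # [1. 1.]

# ... Fourier transform ...
spectrum = np.fft.fft(np.ones(4))              # [4.+0.j 0.+0.j 0.+0.j 0.+0.j]

# ... and random number capabilities
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=3)
```

(The C/C++/Fortran integration tools are the one feature that needs a build step, so they are omitted here.)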

default

python -c 'import numpy; numpy.test()'

Why is `np.sum(range(N))` very slow?

# Fast `sum` for ranges: compute the arithmetic series in closed form
# instead of iterating. Any other input type falls back to the original,
# more general `sum` via single dispatch.
from functools import singledispatch

def sum_range(range_, /, start=0):
    """Overloaded `sum` for range, compute arithmetic sum"""
    n = len(range_)
    if not n:
        return start
    return int(start + (n * (range_[0] + range_[-1]) / 2))

sum = singledispatch(sum)
sum.register(range, sum_range)

def test():
    """
    >>> sum(range(0, 100))
    4950
    >>> sum(range(0, 10, 2))
    20
    >>> sum(range(0, 9, 2))
    20
    >>> sum(range(0, -10, -1))
    -45
    >>> sum(range(-10, 10))
    -10
    >>> sum(range(-1, -100, -2))
    -2500
    >>> sum(range(0, 10, 100))
    0
    >>> sum(range(0, 0))
    0
    >>> sum(range(0, 100), 50)
    5000
    >>> sum(range(0, 0), 10)
    10
    """

if __name__ == "__main__":
    import doctest
    doctest.testmod()
-----------------------
# Timings (seconds) for the different approaches, measured in IPython with,
# e.g.:
#   %%timeit x = list(range(N))
#      ...: sum(x)
# sum_range:          1.4830789409988938
# sum_rangelist:      3.6745876889999636
# npsum_range:        16.216972655000063
# npsum_fromiterrange:3.47655400199983
# sum_arange:         16.656015603000924
# sum_list_arange:    19.500842117000502
# sum_arange_tolist:  4.004777374000696
# npsum_arange:       0.2332638230000157
# array_basic:        16.1631146109994
# array_dtype:        16.550737804000164
# array_iter:         3.9803170430004684
-----------------------
tmp = list(range(10_000_000))

# NumPy implicitly converts the list to a NumPy array, but it still has to
# detect the input type automatically.
np.sum(tmp)

tmp = list(range(10_000_000))

# The array is converted explicitly with a well-defined dtype, so there is
# no need for automatic type detection (note that the result is still wrong,
# since it does not fit in an np.int32).
tmp2 = np.array(tmp, dtype=np.int32)
result = np.sum(tmp2)
-----------------------
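The timings above reduce to one rule: keep the data in a typed NumPy buffer end to end. A minimal illustration of the slow and fast paths (N is arbitrary):

```python
import numpy as np

N = 1_000_000

# Slow path: np.sum must first materialize the range into an array,
# inferring the element type as it goes.
slow = np.sum(range(N))

# Fast path: np.arange fills a typed C buffer directly, so the reduction
# runs entirely in vectorized native code.
fast = np.sum(np.arange(N))

assert slow == fast == N * (N - 1) // 2
```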

Installing SciPy and scikit-learn on Apple M1

# SciPy:
python -m pip install --no-cache --no-use-pep517 pythran cython pybind11 gast"==0.4.0"
pyenv rehash
python -m pip install --no-cache --no-binary :all: --no-use-pep517 scipy"==1.7.1"

# Scikit-Learn
python -m pip install --no-use-pep517 scikit-learn"==0.24.2"
brew install openblas openssl@1.1 pkg-config pyenv pyenv-virtualenv
python -m pip install numpy==1.19.5
-----------------------


    >> /opt/homebrew/bin/brew install openblas
    >> export OPENBLAS=$(/opt/homebrew/bin/brew --prefix openblas)
    >> export CFLAGS="-falign-functions=8 ${CFLAGS}"
    >> git clone https://github.com/scipy/scipy.git
    >> cd scipy
    >> git submodule update --init
    >> /opt/homebrew/bin/pip3 install .
    >> /opt/homebrew/bin/pip3 install scikit-learn

-----------------------
RUN OPENBLAS="/opt/homebrew/opt/openblas" CFLAGS="-falign-functions=8 ${CFLAGS}" pip3 install scipy
RUN OPENBLAS="/opt/homebrew/opt/openblas" CFLAGS="-falign-functions=8 ${CFLAGS}" pip3 install scikit-learn
-----------------------
brew install openblas
export OPENBLAS=$(/opt/homebrew/bin/brew --prefix openblas)
export CFLAGS="-falign-functions=8 ${CFLAGS}"
# ^ no need to add to .zshrc, just doing this once.
pip install scikit-learn # ==0.24.1 if you want
Building wheels for collected packages: scikit-learn
  Building wheel for scikit-learn (pyproject.toml) ... done
  Created wheel for scikit-learn: filename=scikit_learn-1.0.1-cp38-cp38-macosx_12_0_arm64.whl size=6364030 sha256=0b0cc9a21af775e0c8077ee71698ff62da05ab62efc914c5c15cd4bf97867b31
Successfully built scikit-learn
Installing collected packages: scipy, scikit-learn
Successfully installed scikit-learn-1.0.1 scipy-1.7.3
Collecting click>=7.0
  Downloading click-8.0.3-py3-none-any.whl
Collecting grpcio>=1.28.1
  Downloading grpcio-1.42.0.tar.gz (21.3 MB)
     |████████████████████████████████| 21.3 MB 12.7 MB/s
  Preparing metadata (setup.py) ... done

## later in the process it installs using setuptools 
Running setup.py install for grpcio ... done
-----------------------
Python 3.9.10 (main, Jan 15 2022, 11:40:53) 
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
pip 22.0.3
brew install openblas gfortran
OPENBLAS="$(brew --prefix openblas)" pip install numpy==1.19.3
OPENBLAS="$(brew --prefix openblas)" pip install scipy==1.7.2
pip install cython
brew install libomp
export CC=/usr/bin/clang
export CXX=/usr/bin/clang++
export CPPFLAGS="$CPPFLAGS -Xpreprocessor -fopenmp"
export CFLAGS="$CFLAGS -I/opt/homebrew/Cellar/libomp/13.0.1/include"
export CXXFLAGS="$CXXFLAGS -I/opt/homebrew/Cellar/libomp/13.0.1/include"
export LDFLAGS="$LDFLAGS -L/opt/homebrew/Cellar/libomp/13.0.1/lib -lomp"
export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/libomp/13.0.1/lib
pip install scikit-learn==0.21.3
-----------------------
conda install --channel=conda-forge scikit-learn

How could I speed up my written python code: spheres contact detection (collision) using spatial searching

import numpy as np
from pyflann import FLANN

p = np.loadtxt("pos_large.csv", delimiter=",")
flann = FLANN()
flann.build_index(pts=p)
idx, dist = flann.nn_index(qpts=p, num_neighbors=50)
-----------------------
import numpy as np
import numba as nb
from scipy.spatial import cKDTree, distance

# Note: `radii` (the per-sphere radii array) is assumed to be defined at
# module level, as in the original question.
def ends_gap(poss, dia_max):
    particle_corsp_overlaps = []
    ends_ind = [np.empty([1, 2], dtype=np.int64)]

    kdtree = cKDTree(poss)

    for particle_idx in range(len(poss)):
        # Find the nearest point including the current one and
        # then remove the current point from the output.
        # The distances can be computed directly without a new query.
        cur_point = poss[particle_idx]
        nears_i_ind = np.array(kdtree.query_ball_point(cur_point, r=dia_max), dtype=np.int64)
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = distance.cdist(poss[nears_i_ind], cur_point[None, :]).squeeze()

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where([contact_check <= 0])[1]
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.array([np.repeat(particle_idx, len(sphere_olps_ind)), sphere_olps_ind], dtype=np.int64).T
        if particle_idx > 0:
            ends_ind.append(ends_ind_mod_temp)
        else:
            ends_ind[0][:] = ends_ind_mod_temp[0, 0], ends_ind_mod_temp[0, 1]

    ends_ind_org = np.concatenate(ends_ind)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)                                # <--- relatively high time consumer
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
@nb.jit('(float64[:, ::1], int64[::1], int64[::1], float64)')
def compute(poss, all_neighbours, all_neighbours_sizes, dia_max):
    particle_corsp_overlaps = []
    ends_ind_lst = [np.empty((1, 2), dtype=np.int64)]
    an_offset = 0

    for particle_idx in range(len(poss)):
        cur_point = poss[particle_idx]
        cur_len = all_neighbours_sizes[particle_idx]
        nears_i_ind = all_neighbours[an_offset:an_offset+cur_len]
        an_offset += cur_len
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = np.empty(len(nears_i_ind), dtype=np.float64)

        # Compute the distances
        x1, y1, z1 = poss[particle_idx]
        for i in range(len(nears_i_ind)):
            x2, y2, z2 = poss[nears_i_ind[i]]
            dist_i[i] = np.sqrt((x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2)

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where(contact_check <= 0)
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.empty((len(sphere_olps_ind), 2), dtype=np.int64)
        for i in range(len(sphere_olps_ind)):
            ends_ind_mod_temp[i, 0] = particle_idx
            ends_ind_mod_temp[i, 1] = sphere_olps_ind[i]

        if particle_idx > 0:
            ends_ind_lst.append(ends_ind_mod_temp)
        else:
            tmp = ends_ind_lst[0]
            tmp[:] = ends_ind_mod_temp[0, :]

    return particle_corsp_overlaps, ends_ind_lst

def ends_gap(poss, dia_max):
    kdtree = cKDTree(poss)
    tmp = kdtree.query_ball_point(poss, r=dia_max, workers=-1)
    all_neighbours = np.concatenate(tmp, dtype=np.int64)
    all_neighbours_sizes = np.array([len(e) for e in tmp], dtype=np.int64)
    particle_corsp_overlaps, ends_ind_lst = compute(poss, all_neighbours, all_neighbours_sizes, dia_max)
    ends_ind_org = np.concatenate(ends_ind_lst)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org

ends_gap(poss, dia_max)
Initial code with Scipy:             259 s
Initial default code with Numpy:     112 s
Optimized algorithm:                   1.37 s
Final optimized code:                  0.22 s
-----------------------
def ends_gap(poss, dia_max):
    particle_corsp_overlaps = []
    ends_ind = [np.empty([1, 2], dtype=np.int64)]

    kdtree = cKDTree(poss)

    for particle_idx in range(len(poss)):
        # Find the nearest point including the current one and
        # then remove the current point from the output.
        # The distances can be computed directly without a new query.
        cur_point = poss[particle_idx]
        nears_i_ind = np.array(kdtree.query_ball_point(cur_point, r=dia_max), dtype=np.int64)
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = distance.cdist(poss[nears_i_ind], cur_point[None, :]).squeeze()

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where([contact_check <= 0])[1]
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.array([np.repeat(particle_idx, len(sphere_olps_ind)), sphere_olps_ind], dtype=np.int64).T
        if particle_idx > 0:
            ends_ind.append(ends_ind_mod_temp)
        else:
            ends_ind[0][:] = ends_ind_mod_temp[0, 0], ends_ind_mod_temp[0, 1]

    ends_ind_org = np.concatenate(ends_ind)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)                                # <--- relatively high time consumer
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
@nb.jit('(float64[:, ::1], int64[::1], int64[::1], float64)')
def compute(poss, all_neighbours, all_neighbours_sizes, dia_max):
    particle_corsp_overlaps = []
    ends_ind_lst = [np.empty((1, 2), dtype=np.int64)]
    an_offset = 0

    for particle_idx in range(len(poss)):
        cur_point = poss[particle_idx]
        cur_len = all_neighbours_sizes[particle_idx]
        nears_i_ind = all_neighbours[an_offset:an_offset+cur_len]
        an_offset += cur_len
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = np.empty(len(nears_i_ind), dtype=np.float64)

        # Compute the distances
        x1, y1, z1 = poss[particle_idx]
        for i in range(len(nears_i_ind)):
            x2, y2, z2 = poss[nears_i_ind[i]]
            dist_i[i] = np.sqrt((x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2)

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where(contact_check <= 0)
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.empty((len(sphere_olps_ind), 2), dtype=np.int64)
        for i in range(len(sphere_olps_ind)):
            ends_ind_mod_temp[i, 0] = particle_idx
            ends_ind_mod_temp[i, 1] = sphere_olps_ind[i]

        if particle_idx > 0:
            ends_ind_lst.append(ends_ind_mod_temp)
        else:
            tmp = ends_ind_lst[0]
            tmp[:] = ends_ind_mod_temp[0, :]

    return particle_corsp_overlaps, ends_ind_lst

def ends_gap(poss, dia_max):
    kdtree = cKDTree(poss)
    tmp = kdtree.query_ball_point(poss, r=dia_max, workers=-1)
    all_neighbours = np.concatenate(tmp, dtype=np.int64)
    all_neighbours_sizes = np.array([len(e) for e in tmp], dtype=np.int64)
    particle_corsp_overlaps, ends_ind_lst = compute(poss, all_neighbours, all_neighbours_sizes, dia_max)
    ends_ind_org = np.concatenate(ends_ind_lst)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org

ends_gap(poss, dia_max)
Initial code with Scipy:             259 s
Initial default code with Numpy:     112 s
Optimized algorithm:                   1.37 s
Final optimized code:                  0.22 s
-----------------------
def ends_gap(poss, dia_max):
    particle_corsp_overlaps = []
    ends_ind = [np.empty([1, 2], dtype=np.int64)]

    kdtree = cKDTree(poss)

    for particle_idx in range(len(poss)):
        # Find the nearest point including the current one and
        # then remove the current point from the output.
        # The distances can be computed directly without a new query.
        cur_point = poss[particle_idx]
        nears_i_ind = np.array(kdtree.query_ball_point(cur_point, r=dia_max), dtype=np.int64)
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = distance.cdist(poss[nears_i_ind], cur_point[None, :]).squeeze()

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where([contact_check <= 0])[1]
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.array([np.repeat(particle_idx, len(sphere_olps_ind)), sphere_olps_ind], dtype=np.int64).T
        if particle_idx > 0:
            ends_ind.append(ends_ind_mod_temp)
        else:
            ends_ind[0][:] = ends_ind_mod_temp[0, 0], ends_ind_mod_temp[0, 1]

    ends_ind_org = np.concatenate(ends_ind)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)                                # <--- relatively high time consumer
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
@nb.jit('(float64[:, ::1], int64[::1], int64[::1], float64)')
def compute(poss, all_neighbours, all_neighbours_sizes, dia_max):
    particle_corsp_overlaps = []
    ends_ind_lst = [np.empty((1, 2), dtype=np.int64)]
    an_offset = 0

    for particle_idx in range(len(poss)):
        cur_point = poss[particle_idx]
        cur_len = all_neighbours_sizes[particle_idx]
        nears_i_ind = all_neighbours[an_offset:an_offset+cur_len]
        an_offset += cur_len
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = np.empty(len(nears_i_ind), dtype=np.float64)

        # Compute the distances
        x1, y1, z1 = poss[particle_idx]
        for i in range(len(nears_i_ind)):
            x2, y2, z2 = poss[nears_i_ind[i]]
            dist_i[i] = np.sqrt((x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2)

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where(contact_check <= 0)
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.empty((len(sphere_olps_ind), 2), dtype=np.int64)
        for i in range(len(sphere_olps_ind)):
            ends_ind_mod_temp[i, 0] = particle_idx
            ends_ind_mod_temp[i, 1] = sphere_olps_ind[i]

        if particle_idx > 0:
            ends_ind_lst.append(ends_ind_mod_temp)
        else:
            tmp = ends_ind_lst[0]
            tmp[:] = ends_ind_mod_temp[0, :]

    return particle_corsp_overlaps, ends_ind_lst

def ends_gap(poss, dia_max):
    kdtree = cKDTree(poss)
    tmp = kdtree.query_ball_point(poss, r=dia_max, workers=-1)
    all_neighbours = np.concatenate(tmp, dtype=np.int64)
    all_neighbours_sizes = np.array([len(e) for e in tmp], dtype=np.int64)
    particle_corsp_overlaps, ends_ind_lst = compute(poss, all_neighbours, all_neighbours_sizes, dia_max)
    ends_ind_org = np.concatenate(ends_ind_lst)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org

ends_gap(poss, dia_max)
Initial code with Scipy:             259 s
Initial default code with Numpy:     112 s
Optimized algorithm:                   1.37 s
Final optimized code:                  0.22 s
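All of the snippets above rely on globals `poss`, `radii` and `dia_max` defined elsewhere; a minimal synthetic stand-in (the array shapes and value ranges here are illustrative assumptions, not the original dataset) could look like:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
# Particle centres: float64, C-contiguous, as the Numba signatures require
poss = rng.uniform(0.0, 20.0, size=(n, 3))
# One radius per particle
radii = rng.uniform(0.1, 0.5, size=n)
# Search radius: twice the largest radius, i.e. the largest possible
# centre-to-centre distance at which two particles can still touch
dia_max = 2 * radii.max()
```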
-----------------------
def ends_gap(poss, dia_max):
    balltree = BallTree(poss, metric='euclidean')

    # tmp = balltree.query_radius(poss, r=dia_max)
    # all_neighbours = np.concatenate(tmp, dtype=np.int64)
    all_neighbours = balltree.query_radius(poss, r=dia_max)
    all_neighbours_sizes = np.array([len(e) for e in all_neighbours], dtype=np.int64)
    all_neighbours = np.concatenate(all_neighbours, dtype=np.int64)

    particle_corsp_overlaps, ends_ind_lst = compute(poss, all_neighbours, all_neighbours_sizes)
    ends_ind_org = np.concatenate(ends_ind_lst)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
def ends_gap(poss, dia_max):
    particle_corsp_overlaps = []
    ends_ind = []                                                                       # <------- this line is modified

    kdtree = cKDTree(poss)

    for particle_idx in range(len(poss)):
        cur_point = poss[particle_idx]
        nears_i_ind = np.array(kdtree.query_ball_point(cur_point, r=dia_max, return_sorted=True), dtype=np.int64)       # <------- this line is modified
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = distance.cdist(poss[nears_i_ind], cur_point[None, :]).squeeze()

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]

        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where([contact_check <= 0])[1]
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.array([np.repeat(particle_idx, len(sphere_olps_ind)), sphere_olps_ind], dtype=np.int64).T
        ends_ind.append(ends_ind_mod_temp)                                              # <------- this line is modified

    ends_ind_org = np.concatenate(ends_ind)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
@nb.jit('(float64[:, ::1], int64[::1], int64[::1])')
def compute(poss, all_neighbours, all_neighbours_sizes):
    particle_corsp_overlaps = []
    ends_ind_lst = []                                                                   # <------- this line is modified
    an_offset = 0

    for particle_idx in range(len(poss)):
        cur_len = all_neighbours_sizes[particle_idx]
        nears_i_ind = np.sort(all_neighbours[an_offset:an_offset+cur_len])              # <------- this line is modified
        an_offset += cur_len
        assert len(nears_i_ind) > 0

        if len(nears_i_ind) <= 1:
            continue

        nears_i_ind = nears_i_ind[nears_i_ind != particle_idx]
        dist_i = np.empty(len(nears_i_ind), dtype=np.float64)

        x1, y1, z1 = poss[particle_idx]
        for i in range(len(nears_i_ind)):
            x2, y2, z2 = poss[nears_i_ind[i]]
            dist_i[i] = np.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)

        contact_check = dist_i - (radii[nears_i_ind] + radii[particle_idx])
        connected = contact_check[contact_check <= 0]
        particle_corsp_overlaps.append(connected)

        contacts_ind = np.where(contact_check <= 0)
        contacts_sec_ind = nears_i_ind[contacts_ind]
        sphere_olps_ind = np.sort(contacts_sec_ind)

        ends_ind_mod_temp = np.empty((len(sphere_olps_ind), 2), dtype=np.int64)
        for i in range(len(sphere_olps_ind)):
            ends_ind_mod_temp[i, 0] = particle_idx
            ends_ind_mod_temp[i, 1] = sphere_olps_ind[i]
        ends_ind_lst.append(ends_ind_mod_temp)                                          # <------- this line is modified

    return particle_corsp_overlaps, ends_ind_lst


def ends_gap(poss, dia_max):
    balltree = BallTree(poss, metric='euclidean')                                       # <------- new code
    all_neighbours = balltree.query_radius(poss, r=dia_max)                             # <------- new code and modified
    all_neighbours_sizes = np.array([len(e) for e in all_neighbours], dtype=np.int64)   # <------- this line is modified
    all_neighbours = np.concatenate(all_neighbours, dtype=np.int64)                     # <------- this line is modified
    particle_corsp_overlaps, ends_ind_lst = compute(poss, all_neighbours, all_neighbours_sizes)
    ends_ind_org = np.concatenate(ends_ind_lst)
    ends_ind, ends_ind_idx = np.unique(np.sort(ends_ind_org), axis=0, return_index=True)
    gap = np.concatenate(particle_corsp_overlaps)[ends_ind_idx]
    return gap, ends_ind, ends_ind_idx, ends_ind_org
-----------------------
import numpy as np
from scipy import spatial
from timeit import default_timer


def load_data():
    centers = np.loadtxt("pos_large.csv", delimiter=",")
    radii = np.loadtxt("radii_large.csv")
    assert radii.shape + (3,) == centers.shape
    return centers, radii


def count_contacts(centers, radii):
    scaled_centers = centers / np.sqrt(centers.shape[1])
    max_radius = radii.max()
    tree = spatial.cKDTree(np.c_[scaled_centers, max_radius - radii])
    count = 0
    for i, x in enumerate(np.c_[scaled_centers, radii - max_radius]):
        for j in tree.query_ball_point(x, r=2 * max_radius, p=1):
            d = centers[i] - centers[j]
            r = radii[i] + radii[j]
            if i < j and np.inner(d, d) <= r * r:
                count += 1
    return count


def main():
    centers, radii = load_data()
    start = default_timer()
    print(count_contacts(centers, radii))
    end = default_timer()
    print(end - start)


if __name__ == "__main__":
    main()
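The 4-D/L1 trick in `count_contacts` above (folding the radii into an extra coordinate so that the L1 ball query over-approximates the Euclidean contact condition, then filtering candidates exactly) can be sanity-checked against a brute-force O(n²) pair count on small random data, with no CSV files needed. The data generation here is an illustrative assumption:

```python
import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 10.0, size=(200, 3))
radii = rng.uniform(0.1, 0.5, size=200)

def count_contacts_tree(centers, radii):
    # Same approach as above: the L1 query returns a superset of the
    # touching pairs, which the exact Euclidean test then filters.
    scaled = centers / np.sqrt(centers.shape[1])
    max_r = radii.max()
    tree = spatial.cKDTree(np.c_[scaled, max_r - radii])
    count = 0
    for i, x in enumerate(np.c_[scaled, radii - max_r]):
        for j in tree.query_ball_point(x, r=2 * max_r, p=1):
            d = centers[i] - centers[j]
            r = radii[i] + radii[j]
            if i < j and np.inner(d, d) <= r * r:
                count += 1
    return count

def count_contacts_bruteforce(centers, radii):
    # O(n^2) reference: count every unordered pair whose centre distance
    # is at most the sum of the two radii.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    r = radii[:, None] + radii[None, :]
    return int(np.count_nonzero(np.triu(d <= r, k=1)))
```

The superset property holds because `||dc||_1 / sqrt(3) <= ||dc||_2`, so any pair satisfying the Euclidean condition also falls inside the L1 query ball.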
-----------------------
import numpy as np
import numba as nb
from sklearn.neighbors import BallTree
from joblib import Parallel, delayed

def flatten_neighbours(arr):
    sizes = np.fromiter(map(len, arr), count=len(arr), dtype=np.int64)
    values = np.concatenate(arr, dtype=np.int64)
    return sizes, values

@delayed
def find_neighbours(searched_pts, ref_pts, max_dist):
    balltree = BallTree(ref_pts, leaf_size=16, metric='euclidean')
    res = balltree.query_radius(searched_pts, r=max_dist)
    return flatten_neighbours(res)

def vstack_neighbours(top_infos, bottom_infos):
    top_sizes, top_values = top_infos
    bottom_sizes, bottom_values = bottom_infos
    return np.concatenate([top_sizes, bottom_sizes]), np.concatenate([top_values, bottom_values])

@nb.njit('(Tuple([int64[::1],int64[::1]]), Tuple([int64[::1],int64[::1]]), int64)')
def hstack_neighbours(left_infos, right_infos, offset):
    left_sizes, left_values = left_infos
    right_sizes, right_values = right_infos
    n = left_sizes.size
    out_sizes = np.empty(n, dtype=np.int64)
    out_values = np.empty(left_values.size + right_values.size, dtype=np.int64)
    left_cur, right_cur, out_cur = 0, 0, 0
    right_values += offset
    for i in range(n):
        left, right = left_sizes[i], right_sizes[i]
        full = left + right
        out_values[out_cur:out_cur+left] = left_values[left_cur:left_cur+left]
        out_values[out_cur+left:out_cur+full] = right_values[right_cur:right_cur+right]
        out_sizes[i] = full
        left_cur += left
        right_cur += right
        out_cur += full
    return out_sizes, out_values

@nb.njit('(int64[::1], int64[::1], int64[::1], int64[::1])')
def reorder_neighbours(in_sizes, in_values, index, reverse_index):
    n = reverse_index.size
    out_sizes = np.empty_like(in_sizes)
    out_values = np.empty_like(in_values)
    in_offsets = np.empty_like(in_sizes)
    s, cur = 0, 0

    for i in range(n):
        in_offsets[i] = s
        s += in_sizes[i]

    for i in range(n):
        in_ind = reverse_index[i]
        size = in_sizes[in_ind]
        in_offset = in_offsets[in_ind]
        out_sizes[i] = size
        for j in range(size):
            out_values[cur+j] = index[in_values[in_offset+j]]
        cur += size

    return out_sizes, out_values

@nb.njit
def small_inplace_sort(arr):
    if len(arr) < 80:
        # Basic insertion sort
        i = 1
        while i < len(arr):
            x = arr[i]
            j = i - 1
            while j >= 0 and arr[j] > x:
                arr[j+1] = arr[j]
                j = j - 1
            arr[j+1] = x
            i += 1
    else:
        arr.sort()

@nb.jit('(float64[:, ::1], float64[::1], int64[::1], int64[::1])')
def compute(poss, radii, neighbours_sizes, neighbours_values):
    n, m = neighbours_sizes.size, np.max(neighbours_sizes)

    # Big buffers allocated with the maximum size.
    # Thanks to virtual memory, this does not take more memory than actually needed.
    particle_corsp_overlaps = np.empty(neighbours_values.size, dtype=np.float64)
    ends_ind_org = np.empty((neighbours_values.size, 2), dtype=np.float64)

    in_offset = 0
    out_offset = 0

    buff1 = np.empty(m, dtype=np.int64)
    buff2 = np.empty(m, dtype=np.float64)
    buff3 = np.empty(m, dtype=np.float64)

    for particle_idx in range(n):
        size = neighbours_sizes[particle_idx]
        cur = 0

        for i in range(size):
            value = neighbours_values[in_offset+i]
            if value != particle_idx:
                buff1[cur] = value
                cur += 1

        nears_i_ind = buff1[0:cur]
        small_inplace_sort(nears_i_ind)  # Note: bottleneck of this function
        in_offset += size

        if len(nears_i_ind) == 0:
            continue

        x1, y1, z1 = poss[particle_idx]
        cur = 0

        for i in range(len(nears_i_ind)):
            index = nears_i_ind[i]
            x2, y2, z2 = poss[index]
            dist = np.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
            contact_check = dist - (radii[index] + radii[particle_idx])
            if contact_check <= 0.0:
                buff2[cur] = contact_check
                buff3[cur] = index
                cur += 1

        particle_corsp_overlaps[out_offset:out_offset+cur] = buff2[0:cur]

        contacts_sec_ind = buff3[0:cur]
        small_inplace_sort(contacts_sec_ind)
        sphere_olps_ind = contacts_sec_ind

        for i in range(cur):
            ends_ind_org[out_offset+i, 0] = particle_idx
            ends_ind_org[out_offset+i, 1] = sphere_olps_ind[i]

        out_offset += cur

    # Truncate the views to their real size
    particle_corsp_overlaps = particle_corsp_overlaps[:out_offset]
    ends_ind_org = ends_ind_org[:out_offset]

    assert len(ends_ind_org) % 2 == 0
    size = len(ends_ind_org)//2
    ends_ind = np.empty((size,2), dtype=np.int64)
    ends_ind_idx = np.empty(size, dtype=np.int64)
    gap = np.empty(size, dtype=np.float64)
    cur = 0

    # Efficiently remove duplicate pairs (replaces the np.unique + np.sort step)
    for i in range(len(ends_ind_org)):
        left, right = ends_ind_org[i]
        if left < right:
            ends_ind[cur, 0] = left
            ends_ind[cur, 1] = right
            ends_ind_idx[cur] = i
            gap[cur] = particle_corsp_overlaps[i]
            cur += 1

    return gap, ends_ind, ends_ind_idx, ends_ind_org

def ends_gap(poss, radii):
    assert poss.size >= 1

    # Sort the balls
    index = np.argsort(radii)
    reverse_index = np.empty(index.size, np.int64)
    reverse_index[index] = np.arange(index.size, dtype=np.int64)
    sorted_poss = poss[index]
    sorted_radii = radii[index]

    # Split them into two groups: the small ones and the big ones
    split_ind = len(radii) * 3 // 4
    small_poss, big_poss = np.split(sorted_poss, [split_ind])
    small_radii, big_radii = np.split(sorted_radii, [split_ind])
    max_small_radii = sorted_radii[max(split_ind, 0)]
    max_big_radii = sorted_radii[-1]

    # Find the neighbours in parallel
    result = Parallel(n_jobs=4, backend='threading')([
        find_neighbours(small_poss, small_poss, small_radii+max_small_radii),
        find_neighbours(small_poss, big_poss,   small_radii+max_big_radii  ),
        find_neighbours(big_poss,   small_poss, big_radii+max_small_radii  ),
        find_neighbours(big_poss,   big_poss,   big_radii+max_big_radii    )
    ])
    small_small_neighbours = result[0]
    small_big_neighbours = result[1]
    big_small_neighbours = result[2]
    big_big_neighbours = result[3]

    # Merge the (segmented) arrays into one big array
    neighbours_sizes, neighbours_values = vstack_neighbours(
        hstack_neighbours(small_small_neighbours, small_big_neighbours, split_ind),
        hstack_neighbours(big_small_neighbours, big_big_neighbours, split_ind)
    )

    # Reverse the indices.
    # Note that the results in `neighbours_values` associated to 
    # `neighbours_sizes[i]` are subsets of `query_radius([poss[i]], r=dia_max)`
    # on a `BallTree(poss)`.
    res = reorder_neighbours(neighbours_sizes, neighbours_values, index, reverse_index)
    neighbours_sizes, neighbours_values = res

    # Finally compute the contacts with a method similar to the
    # previous one, but using much faster optimized code.
    return compute(poss, radii, neighbours_sizes, neighbours_values)

result = ends_gap(poss, radii)
Small dataset:
 - Reference optimized Numba code:    256 ms
 - This highly-optimized Numba code:   82 ms

Big dataset:
 - Reference optimized Numba code:    42.7 s  (takes about 7~8 GiB of RAM)
 - This highly-optimized Numba code:   4.2 s  (takes about  1  GiB of RAM)
-----------------------

TypeError: load() missing 1 required positional argument: 'Loader' in Google Colab

!pip install pyyaml==5.4.1
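Pinning to 5.4.1 works, but the underlying cause is that PyYAML 6.0 (which newer Colab images pull in) made the `Loader` argument to `yaml.load` mandatory. A forward-compatible sketch, assuming you control the loading code:

```python
import yaml

text = "name: test\nvalues: [1, 2, 3]\n"

# PyYAML >= 6.0 requires an explicit Loader:
data = yaml.load(text, Loader=yaml.SafeLoader)
# Equivalent shorthand that works on both old and new versions:
assert data == yaml.safe_load(text) == {"name": "test", "values": [1, 2, 3]}
```

Prefer `safe_load` unless you specifically need full object construction.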

How do I calculate square root in Python?

>>> import math
>>> math.sqrt(9)
3.0
>>> 9 ** (1/2)
3.0
>>> 9 ** .5  # Same thing
3.0
>>> 2 ** .5
1.4142135623730951
>>> 8 ** (1/3)
2.0
>>> 125 ** (1/3)
4.999999999999999
>>> (-25) ** .5  # Should be 5j
(3.061616997868383e-16+5j)
>>> 8j ** .5  # Should be 2+2j
(2.0000000000000004+2j)
>>> import cmath
>>> cmath.sqrt(-25)
5j
>>> cmath.sqrt(8j)
(2+2j)
>>> n = 10**30
>>> square = n**2
>>> x = square**.5
>>> x == n
False
>>> x - n  # how far off are they?
0.0
>>> int(x) - n  # how far off is the float from the int?
19884624838656
>>> import decimal
>>> decimal.Decimal('9') ** .5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for ** or pow(): 'decimal.Decimal' and 'float'
>>> decimal.Decimal('9') ** decimal.Decimal('.5')
Decimal('3.000000000000000000000000000')
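The `x == n` failure above is a float problem, not an integer one: `square**.5` routes through a 53-bit float. Since Python 3.8, `math.isqrt` computes exact integer square roots at any size:

```python
from math import isqrt

n = 10**30
assert isqrt(n * n) == n    # exact, unlike (n*n) ** .5
assert isqrt(26) == 5       # returns the floor of the square root
```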
-----------------------
>>> import numpy as np
>>> np.sqrt(25)
5.0
>>> np.sqrt([2, 3, 4])
array([1.41421356, 1.73205081, 2.        ])
>>> a = np.array([4, -1, np.inf])
>>> np.sqrt(a)
<stdin>:1: RuntimeWarning: invalid value encountered in sqrt
array([ 2., nan, inf])
>>> np.lib.scimath.sqrt(a)
array([ 2.+0.j,  0.+1.j, inf+0.j])
>>> a = a.astype(complex)
>>> np.sqrt(a)
array([ 2.+0.j,  0.+1.j, inf+0.j])
-----------------------
import sympy
sympy.sqrt(2)
# => sqrt(2)
sympy.sqrt(8) / sympy.sqrt(27)
# => 2*sqrt(6)/9
s = sympy.sqrt(2)
s**2
# => 2
type(s**2)
#=> <class 'sympy.core.numbers.Integer'>
(2**0.5)**2
# => 2.0000000000000004

from decimal import Decimal
(Decimal('2')**Decimal('0.5'))**Decimal('2')
# => Decimal('1.999999999999999999999999999')
from sympy import Symbol, integrate, pi, sqrt, exp, oo
x = Symbol('x')
integrate(exp(-x**2), (x, -oo, oo))
# => sqrt(pi)
integrate(exp(-x**2), (x, -oo, oo)) == sqrt(pi)
# => True
sympy.N(sympy.sqrt(2), 1_000_000)
# => 1.4142135623730950488016...........2044193016904841204
-----------------------
def int_squareroot(d: int) -> tuple[int, bool]:
    """Binary-search the integer square root of d >= 1 and report whether it is exact."""
    left, right = 1, d // 2 + 1
    while left < right - 1:
        x = (left + right) // 2
        if x**2 > d:
            right = x
        else:
            left = x
    return left, left**2 == d
-----------------------
from fractions import Fraction

def sqrt(x, n):
    x = x if isinstance(x, Fraction) else Fraction(x)
    upper = x + 1
    for i in range(0, n):
        upper = (upper + x/upper) / 2
    lower = x / upper
    if lower > upper:
        raise ValueError("Sanity check failed")
    return (lower, upper)
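To see the bracketing behaviour, here is the same routine restated so the snippet runs standalone (renamed `sqrt_bounds` here to avoid shadowing `math.sqrt`), applied to √2:

```python
from fractions import Fraction

def sqrt_bounds(x, n):
    # Newton's method on exact rationals: `upper` decreases toward the
    # root from above, so x/upper is a guaranteed lower bound.
    x = x if isinstance(x, Fraction) else Fraction(x)
    upper = x + 1
    for _ in range(n):
        upper = (upper + x / upper) / 2
    return x / upper, upper

lo, up = sqrt_bounds(2, 6)
assert lo * lo < 2 < up * up            # the true root lies in [lo, up]
assert up - lo < Fraction(1, 10**20)    # six iterations already pin ~28 digits
```

Because every step is exact rational arithmetic, the interval width is a rigorous error bound, not a float estimate.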
-----------------------
new_estimate = (estimate + num/estimate) / 2
import math

def newtons_method(num, estimate):
    # Compute a new estimate
    new_estimate = (estimate + num/estimate) / 2
    print(new_estimate)
    # Base case: stop once the estimate matches math.sqrt's value
    if new_estimate == math.sqrt(num):
        return True
    else:
        return newtons_method(num, new_estimate)
newtons_method(30,5)
5.5
5.477272727272727
5.4772255752546215
5.477225575051661
-----------------------
from math import isqrt

def str_sqrt(num, digits):
    """ Arbitrary precision square root

        num arg must be a string
        Return a string with `digits` after
        the decimal point

        Written by PM 2Ring 2022.01.26
    """

    int_part, _, frac_part = num.partition('.')
    num = int_part + frac_part

    # Determine the required precision
    width = 2 * digits - len(frac_part)

    # Truncate or pad with zeroes
    num = num[:width] if width < 0 else num + '0' * width
    s = str(isqrt(int(num)))

    if digits:
        # Pad, if necessary
        s = '0' * (1 + digits - len(s)) + s
        s = f"{s[:-digits]}.{s[-digits:]}"
    return s
print(str_sqrt("2.0", 30))
1.414213562373095048801688724209
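For comparison, the standard-library `decimal` module reaches the same arbitrary precision with less machinery; precision is set on the context in significant digits:

```python
from decimal import Decimal, getcontext

getcontext().prec = 32            # significant digits of the result
root = Decimal(2).sqrt()
assert str(root).startswith("1.41421356237309504880168872420")
```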

Using a pip requirements file in a conda yml file throws AttributeError: 'FileNotFoundError'

name: test
dependencies:
  - python>=3
  - pip
  - pip:
    - -r requirements.txt
name: test
dependencies:
  - python>=3
  - pip
  - pip:
    - -r file:/full/path/to/requirements.txt
    # - -r file:///full/path/to/requirements.txt # alternate syntax

Problem with memory allocation in Julia code

function problem2(c)
    N = zeros(Int, c+2)
    notseen = falses(c+1)

    for lN in 1:c+1
        notseen .= true
        @inbounds for i in 1:lN-1
            b = N[i] ⊻ N[lN-i]
            b <= c && (notseen[b+1] = false)
        end
        idx = findfirst(notseen)
        isnothing(idx) || (N[lN+1] = idx-1)
    end
    return count(==(0), N)
end
julia> problem(10000), problem2(10000)
(1475, 1475)
julia> using BenchmarkTools

julia> @btime problem(10000)
  4.938 s (163884 allocations: 3.25 GiB)
1475

julia> @btime problem2(10000)
  76.275 ms (4 allocations: 79.59 KiB)
1475
julia> function problem3(c)
           N = zeros(Int, c+2)
           notseen = Vector{Bool}(undef, c+1)

           for lN in 1:c+1
               notseen .= true
               @inbounds for i in 1:lN-1
                   b = N[i] ⊻ N[lN-i]
                   notseen[b+1] = false
               end
               idx = findfirst(notseen)
               isnothing(idx) || (N[lN+1] = idx-1)
           end
           return count(==(0), N)
       end
problem3 (generic function with 1 method)

julia> @btime problem3(10000)
  20.714 ms (3 allocations: 88.17 KiB)
1475
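The preallocate-and-reset idea transfers directly to Python; a rough NumPy sketch of `problem2` (0-based indexing, not benchmark-tuned):

```python
import numpy as np

def problem2_py(c):
    # Same trick as the Julia answer: allocate `notseen` once, then
    # reset it in place each outer iteration instead of reallocating.
    N = np.zeros(c + 2, dtype=np.int64)
    notseen = np.empty(c + 1, dtype=bool)
    for ln in range(1, c + 2):        # mirrors Julia's `for lN in 1:c+1`
        notseen[:] = True
        for i in range(1, ln):
            b = int(N[i - 1]) ^ int(N[ln - i - 1])
            if b <= c:
                notseen[b] = False
        idx = np.flatnonzero(notseen)
        if idx.size:
            N[ln] = idx[0]          # 0-based position == mex value
    return int((N == 0).sum())
```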

Efficient summation in Python

# method #2
start=time.time()
w=np.arange(0, n+1, dtype=object)  # np.object was removed in NumPy 1.24; plain object works everywhere
result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
start=time.time()
for i in range(100):
    result1 = summation(0, n, mysum)
print('Slow method:', time.time()-start)

# method #2
start=time.time()
for i in range(100):
    w=np.arange(0, n+1, dtype=object)
    result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
Slow method: 0.06906533241271973
Fast method: 0.008007287979125977
# method #3
import itertools
start=time.time()
for i in range(100):
    result3 = sum(x*x * ysum for x, ysum in enumerate(itertools.accumulate(range(n+1))))
print('Faster, pure python:', (time.time()-start))
Faster, pure python: 0.0009944438934326172
# method #4
start=time.time()
for i in range(100):
    w = np.arange(0, n+1, dtype=object)
    result2 = (w*w*np.cumsum(w)).sum()
print('Fast method x*x:', time.time()-start)
-----------------------
result = ((((12 * n + 45) * n + 50) * n + 15) * n - 2) * n // 120
from fractions import Fraction
import math
from functools import reduce

def naive(n):
    return sum(x**2 * sum(range(x+1)) for x in range(n+1))

def lcm(ints):
    return reduce(lambda r, i: r * i // math.gcd(r, i), ints)

def polynomial(xys):
    xs, ys = zip(*xys)
    n = len(xs)
    A = [[Fraction(x**i) for i in range(n)] for x in xs]
    b = list(ys)
    for _ in range(2):
        for i0 in range(n):
            for i in range(i0 + 1, n):
                f = A[i][i0] / A[i0][i0]
                for j in range(i0, n):
                    A[i][j] -= f * A[i0][j]
                b[i] -= f * b[i0]
        A = [row[::-1] for row in A[::-1]]
        b.reverse()
    coeffs = [b[i] / A[i][i] for i in range(n)]
    denominator = lcm(c.denominator for c in coeffs)
    coeffs = [int(c * denominator) for c in coeffs]
    horner = str(coeffs[-1])
    for c in coeffs[-2::-1]:
        horner += ' * n'
        if c:
            horner = f"({horner} {'+' if c > 0 else '-'} {abs(c)})"
    return f'{horner} // {denominator}'

print(polynomial((x, naive(x)) for x in range(6)))
((((12 * n + 45) * n + 50) * n + 15) * n - 2) * n // 120
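Both pieces appear in the answer above; restated standalone, the closed form can be sanity-checked against the naive double sum over many values:

```python
def naive(n):
    # O(n^2) reference: sum of x^2 * (0 + 1 + ... + x)
    return sum(x**2 * sum(range(x + 1)) for x in range(n + 1))

def closed(n):
    # Horner-form polynomial produced by the interpolation above
    return ((((12*n + 45)*n + 50)*n + 15)*n - 2) * n // 120

assert all(naive(k) == closed(k) for k in range(100))
```

Agreement at 100 points is conclusive here, since a degree-5 polynomial is fixed by any 6 of them.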
-----------------------
from sympy import summation
from sympy import symbols

n, x, y = symbols("n,x,y")
eq = summation(x ** 2 * summation(y, (y, 0, x)), (x, 0, n))
eq.evalf(subs={"n": 1000})
-----------------------
def function5():
    inner_sum = float()
    result = float()

    for x in range(0, n + 1):
        inner_sum += x
        result += x ** 2 * inner_sum
        
    return result
method 2   | 31 µs ± 2.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
method 3   | 116 µs ± 538 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
method 4   | 91 µs ± 356 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
function 5 | 217 µs ± 1.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
from numba import jit
function5 = jit(nopython=True)(function5)
59.8 ns ± 0.209 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

NumPy 1.21.2 may not yet support Python 3.10

pip install pipwin
pipwin install numpy
py -3.10 -mpip install pipwin
py -3.10 -mpipwin refresh
py -3.10 -mpipwin install numpy
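pipwin pulls unofficial Windows wheels. Once official cp310 wheels exist (NumPy 1.21.4 and later), plain pip suffices — a sketch worth trying first:

```shell
# Prefer an official binary wheel; fail rather than fall back to a source build.
# On Windows, replace `python` with `py -3.10` to target Python 3.10 explicitly.
python -m pip install --only-binary :all: "numpy>=1.21.4"
python -c "import numpy; print(numpy.__version__)"
```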
-----------------------
$ docker run -it --rm --name python310 -u 0 python:3.10 bash -c 'pip --version; pip install numpy --user --no-cache; pip show numpy; python -c "import numpy as np; print(np.ones(5))"'
pip 21.2.4 from /usr/local/lib/python3.10/site-packages/pip (python 3.10)
Collecting numpy
  Downloading numpy-1.21.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.9 MB)
     |████████████████████████████████| 15.9 MB 36.9 MB/s 
Installing collected packages: numpy
  WARNING: The scripts f2py, f2py3 and f2py3.10 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed numpy-1.21.4
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
Name: numpy
Version: 1.21.4
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
License: BSD
Location: /root/.local/lib/python3.10/site-packages
Requires: 
Required-by: 
[1. 1. 1. 1. 1.]

ImportError: cannot import name 'BatchNormalization' from 'keras.layers.normalization'

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    BatchNormalization, SeparableConv2D, MaxPooling2D, Activation, Flatten, Dropout, Dense
)
from tensorflow.keras import backend as K


class CancerNet:
    @staticmethod
    def build(width, height, depth, classes):
        model = Sequential()
        shape = (height, width, depth)
        channelDim = -1

        if K.image_data_format() == "channels_first":
            shape = (depth, height, width)
            channelDim = 1

        model.add(SeparableConv2D(32, (3, 3), padding="same", input_shape=shape))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        model.add(SeparableConv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(SeparableConv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        model.add(SeparableConv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(SeparableConv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(SeparableConv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=channelDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        model.add(Flatten())
        model.add(Dense(256))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.5))

        model.add(Dense(classes))
        model.add(Activation("softmax"))

        return model

# build() is a @staticmethod, so call it directly; the sizes here are illustrative.
model = CancerNet.build(width=48, height=48, depth=3, classes=2)
-----------------------
from tensorflow.keras.layers import BatchNormalization

Community Discussions

Trending Discussions on numpy
  • Why is `np.sum(range(N))` very slow?
  • Installing scipy and scikit-learn on apple m1
  • How could I speed up my written python code: spheres contact detection (collision) using spatial searching
  • Error while downloading the requirements using pip install (setup command: use_2to3 is invalid.)
  • TypeError: load() missing 1 required positional argument: 'Loader' in Google Colab
  • How do I calculate square root in Python?
  • Using a pip requirements file in a conda yml file throws AttributeError: 'FileNotFoundError'
  • Problem with memory allocation in Julia code
  • Efficient summation in Python
  • NumPy 1.21.2 may not yet support Python 3.10

QUESTION

Why is `np.sum(range(N))` very slow?

Asked 2022-Mar-29 at 14:31

I saw a video about the speed of loops in Python, which explained that sum(range(N)) is much faster than manually looping over the range and adding the variables together: the former runs in C thanks to built-in functions, while in the latter the summation is done in (slow) Python. I was curious what happens when NumPy is added to the mix. As I expected, np.sum(np.arange(N)) is the fastest, but sum(np.arange(N)) and np.sum(range(N)) are even slower than the naive for loop.

Why is this?

Here's the script I used to test, with some comments about the supposed causes of the slowdowns where I know them (taken mostly from the video), and the results I got on my machine (Python 3.10.0, NumPy 1.21.2):

updated script:

import numpy as np
from timeit import timeit

N = 10_000_000
repetition = 10

def sum0(N = N):
    s = 0
    i = 0
    while i < N: # condition is checked in python
        s += i
        i += 1 # both additions are done in python
    return s

def sum1(N = N):
    s = 0
    for i in range(N): # increment in C
        s += i # addition in python
    return s

def sum2(N = N):
    return sum(range(N)) # everything in C

def sum3(N = N):
    return sum(list(range(N)))

def sum4(N = N):
    return np.sum(range(N)) # very slow np.array conversion

def sum5(N = N):
    # much faster np.array conversion
    return np.sum(np.fromiter(range(N),dtype = int))

def sum5v2_(N = N):
    # much faster np.array conversion
    return np.sum(np.fromiter(range(N),dtype = np.int_))

def sum6(N = N):
    # possibly slow conversion to Py_long from np.int
    return sum(np.arange(N))

def sum7(N = N):
    # list returns a list of np.int-s
    return sum(list(np.arange(N)))

def sum7v2(N = N):
    # tolist conversion to python int seems faster than the implicit conversion
    # in sum(list()) (tolist returns a list of python int-s)
    return sum(np.arange(N).tolist())

def sum8(N = N):
    return np.sum(np.arange(N)) # everything in numpy (fortran libblas?)

def sum9(N = N):
    return np.arange(N).sum() # remove dispatch overhead

def array_basic(N = N):
    return np.array(range(N))

def array_dtype(N = N):
    return np.array(range(N),dtype = np.int_)

def array_iter(N = N):
    # np.sum's source code mentions to use fromiter to convert from generators
    return np.fromiter(range(N),dtype = np.int_)

print(f"while loop:         {timeit(sum0, number = repetition)}")
print(f"for loop:           {timeit(sum1, number = repetition)}")
print(f"sum_range:          {timeit(sum2, number = repetition)}")
print(f"sum_rangelist:      {timeit(sum3, number = repetition)}")
print(f"npsum_range:        {timeit(sum4, number = repetition)}")
print(f"npsum_iterrange:    {timeit(sum5, number = repetition)}")
print(f"npsum_iterrangev2:  {timeit(sum5v2_, number = repetition)}")
print(f"sum_arange:         {timeit(sum6, number = repetition)}")
print(f"sum_list_arange:    {timeit(sum7, number = repetition)}")
print(f"sum_arange_tolist:  {timeit(sum7v2, number = repetition)}")
print(f"npsum_arange:       {timeit(sum8, number = repetition)}")
print(f"nparangenpsum:      {timeit(sum9, number = repetition)}")
print(f"array_basic:        {timeit(array_basic, number = repetition)}")
print(f"array_dtype:        {timeit(array_dtype, number = repetition)}")
print(f"array_iter:         {timeit(array_iter,  number = repetition)}")

print(f"npsumarangeREP:     {timeit(lambda : sum8(N//1000), number = 100000*repetition)}")
print(f"npsumarangeREP:     {timeit(lambda : sum9(N//1000), number = 100000*repetition)}")

# Example output:
#
# while loop:         11.493371912998555
# for loop:           7.385945574002108
# sum_range:          2.4605720699983067
# sum_rangelist:      4.509678105998319
# npsum_range:        11.85120212900074
# npsum_iterrange:    4.464334709002287
# npsum_iterrangev2:  4.498494338993623
# sum_arange:         9.537815956995473
# sum_list_arange:    13.290120724996086
# sum_arange_tolist:  5.231948580003518
# npsum_arange:       0.241889145996538
# nparangenpsum:      0.21876695199898677
# array_basic:        11.736577274998126
# array_dtype:        8.71628468400013
# array_iter:         4.303306431000237
# npsumarangeREP:     21.240833958996518
# npsumarangeREP:     16.690092379001726

ANSWER

Answered 2021-Oct-16 at 17:42

From the CPython source code for sum: it first attempts a fast path that assumes all inputs are of the same type. If that assumption fails, it falls back to the general routine and simply iterates:

/* Fast addition by keeping temporary sums in C instead of new Python objects.
   Assumes all inputs are the same type.  If the assumption fails, default
   to the more general routine.
*/

I'm not entirely certain what is happening under the hood, but it is likely the repeated creation/conversion of C types to Python objects that is causing these slow-downs. It's worth noting that both sum and range are implemented in C.
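The fast-path claim can be illustrated directly: summing a list of Python ints stays on sum's C fast path, while a list of NumPy scalars falls back to generic object addition on every element. Timings are machine-dependent; a rough sketch:

```python
import timeit
import numpy as np

py_ints = list(range(100_000))
np_ints = list(np.arange(100_000))  # a list of np.int64 scalars

# Both compute the same total, but the second leaves sum's C-long fast path.
t_py = timeit.timeit(lambda: sum(py_ints), number=100)
t_np = timeit.timeit(lambda: sum(np_ints), number=100)
print(f"python ints: {t_py:.3f}s  numpy scalars: {t_np:.3f}s")
```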


This next bit is not really an answer to the question, but I wondered if we could speed up sum for python ranges as range is quite a smart object.

To do this I've used functools.singledispatch to override the built-in sum function specifically for the range type; then implemented a small function to calculate the sum of an arithmetic progression.

from functools import singledispatch

def sum_range(range_, /, start=0):
    """Overloaded `sum` for range, compute arithmetic sum"""
    n = len(range_)
    if not n:
        return start
    return int(start + (n * (range_[0] + range_[-1]) / 2))

sum = singledispatch(sum)
sum.register(range, sum_range)

def test():
    """
    >>> sum(range(0, 100))
    4950
    >>> sum(range(0, 10, 2))
    20
    >>> sum(range(0, 9, 2))
    20
    >>> sum(range(0, -10, -1))
    -45
    >>> sum(range(-10, 10))
    -10
    >>> sum(range(-1, -100, -2))
    -2500
    >>> sum(range(0, 10, 100))
    0
    >>> sum(range(0, 0))
    0
    >>> sum(range(0, 100), 50)
    5000
    >>> sum(range(0, 0), 10)
    10
    """

if __name__ == "__main__":
    import doctest
    doctest.testmod()

I'm not sure if this is complete, but it's definitely faster than looping.

Source https://stackoverflow.com/questions/69584027

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install numpy

You can download it from GitHub.
You can use numpy like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
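The virtual-environment recommendation above, as a minimal sketch (the activation path differs on Windows):

```shell
# Create an isolated environment, update the packaging tools, then install numpy.
python -m venv .venv
. .venv/bin/activate                 # Windows: .venv\Scripts\activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install numpy
```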

Support

The NumPy project welcomes your expertise and enthusiasm! Small improvements or fixes are always appreciated; issues labeled "good first issue" may be a good starting point. If you are considering larger contributions to the source code, please contact the maintainers through the mailing list first.
