pandas | powerful data analysis / manipulation library

 by pandas-dev | Python | Version: 2.2.2 | License: BSD-3-Clause

kandi X-RAY | pandas Summary

pandas is a Python library typically used in data science applications. It has a build file available, a permissive license, and high support. However, code analysis reports 2481 bugs and 4 vulnerabilities. You can install it with 'pip install pandas' or download it from GitHub or PyPI.

pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way towards this goal.

            Support

              pandas has a highly active ecosystem.
              It has 38689 star(s) with 16367 fork(s). There are 1109 watchers for this library.
              There were 7 major release(s) in the last 6 months.
              There are 3403 open issues and 21295 have been closed. On average issues are closed in 221 days. There are 115 open pull requests and 0 closed requests.
              It has a positive sentiment in the developer community.
              The latest version of pandas is 2.2.2

            Quality

              pandas has 2481 bugs (7 blocker, 0 critical, 2300 major, 174 minor) and 3690 code smells.

            Security

              pandas has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              pandas code analysis shows 4 unresolved vulnerabilities (4 blocker, 0 critical, 0 major, 0 minor).
              There are 22 security hotspots that need review.

            License

              pandas is licensed under the BSD-3-Clause License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              pandas releases are available to install and integrate.
              Deployable package is available in PyPI.
              Build file is available. You can build the component from source.
              Installation instructions, examples and code snippets are available.
              pandas saves you 667226 person hours of effort in developing the same functionality from scratch.
              It has 330153 lines of code, 22612 functions and 1317 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed pandas and discovered the below as its top functions. This is intended to give you an instant insight into pandas implemented functionality, and help decide if they suit your requirements.
            • Write the table to a LaTeX file.
            • Convert argument to datetime index.
            • Add numeric operations.
            • Normalize JSON data.
            • Read data from a JSON file.
            • Convert wide to long.
            • Merge two DataFrames.
            • Read data from an XML file.
            • Create a loc indexer.
            • Cut an array.
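A few of the listed capabilities (merging two DataFrames, converting to datetime, and cutting an array into bins) can be sketched with public pandas calls. The frames, column names, and bin edges below are invented for illustration and are not part of the kandi review.

```python
import pandas as pd

# Hypothetical sample data, made up for this sketch.
left = pd.DataFrame({"key": ["a", "b", "c"], "score": [15, 42, 88]})
right = pd.DataFrame({"key": ["a", "b"], "joined": ["2021-01-05", "2021-03-17"]})

# Merge two DataFrames: keep every row of `left`, match on "key".
merged = left.merge(right, on="key", how="left")

# Convert the string column to datetimes; the unmatched row becomes NaT.
merged["joined"] = pd.to_datetime(merged["joined"])

# Cut an array: bin the scores into two labeled intervals, (0, 50] and (50, 100].
merged["band"] = pd.cut(merged["score"], bins=[0, 50, 100], labels=["low", "high"])

print(merged)
```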

            pandas Key Features

            No Key Features are available at this moment for pandas.

            pandas Examples and Code Snippets

            Version 0.13.0 (January 3, 2014)
            Python | Lines of Code: 1207 | License: Permissive (BSD-3-Clause)
            - ``read_excel`` now supports an integer in its ``sheetname`` argument giving
              the index of the sheet to read in (:issue:`4301`).
            - Text parser now treats anything that reads like inf ("inf", "Inf", "-Inf",
              "iNf", etc.) as infinity. (:issue:`4220  
            Hierarchical indexing (MultiIndex)
            Python | Lines of Code: 1180 | License: Permissive (BSD-3-Clause)
            The :class:`MultiIndex` object is the hierarchical analogue of the standard
            :class:`Index` object which typically stores the axis labels in pandas objects. You
            can think of ``MultiIndex`` as an array of tuples where each tuple is unique. A
            What's new in 1.5.0 (September 19, 2022)
            Python | Lines of Code: 1152 | License: Permissive (BSD-3-Clause)
            .. _whatsnew_150.enhancements.pandas-stubs:
            The ``pandas-stubs`` library is now supported by the pandas development team, providing type stubs for the pandas API. Please visit
            pandas - make
            Python | Lines of Code: 238 | License: Non-SPDX (BSD 3-Clause "New" or "Revised" License)
            #!/usr/bin/env python3
            Python script for building documentation.
            To build the docs you must have all optional dependencies for pandas
            installed. See the installation instructions for a list of these.
                $ python clean
            pandas - announce
            Python | Lines of Code: 73 | License: Non-SPDX (BSD 3-Clause "New" or "Revised" License)
            #!/usr/bin/env python3
            Script to generate contributor and pull request lists
            This script generates contributor and pull request lists for release
            announcements using GitHub v3 protocol. Use requires an authentication token in
            order to have suffi  
            pandas - eval performance
            Python | Lines of Code: 72 | License: Non-SPDX (BSD 3-Clause "New" or "Revised" License)
            from timeit import repeat as timeit
            import numpy as np
            import seaborn as sns
            from pandas import DataFrame
            setup_common = """from pandas import DataFrame
            from numpy.random import randn
            df = DataFrame(randn(%d, 3), columns=list('abc'))
            Error tokenizing data. C error: Expected x fields in line 5, saw x
            Python | Lines of Code: 2 | License: Strong Copyleft (CC BY-SA 4.0)
            data = pd.read_csv(io.BytesIO(data.content), sep="^")
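The one-liner above depends on a `data.content` response that is not shown. As a self-contained sketch of the same idea, the bytes below are a hypothetical payload in which "^" separates fields and one value contains commas. Reading it with the default comma separator typically fails with a tokenizing error, while passing `sep="^"` parses it cleanly.

```python
import io

import pandas as pd

# Hypothetical payload (not from the original question).
raw = b"id^name\n1^a\n2^b\n3^c\n4^d,e,f\n"

# With the default sep=",", line 5 appears to hold more fields than the
# earlier lines, which typically raises pandas.errors.ParserError.
try:
    pd.read_csv(io.BytesIO(raw))
    default_failed = False
except pd.errors.ParserError:
    default_failed = True

# Naming the real separator lets the commas stay inside the value.
data = pd.read_csv(io.BytesIO(raw), sep="^")
print(data)
```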
            How to generate unique code from occurrence in pandas
            Python | Lines of Code: 10 | License: Strong Copyleft (CC BY-SA 4.0)
            df['unique_code'] = df['Code'] + '_' + df.groupby('Code').cumcount().add(1).astype(str).str.zfill(4)
               Id  Code unique_code
            0   1  A_01   A_01_0001
            1   2  C_03   C_03_0001
            2   3  A_01   A_01_0002
            3   4  C_02   C_02_0001
            How to prevent NaN when using str.lower in Python?
            Python | Lines of Code: 4 | License: Strong Copyleft (CC BY-SA 4.0)
            import pandas as pd
            df = pd.DataFrame({'col':['G5051', 'G5052', 5053, 'G5054']})
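The excerpt above stops before the fix itself. One common way to avoid the NaN, sketched here as an assumption rather than as the original answer's code, is to cast the column to strings before calling `.str.lower()`, because the `.str` accessor returns NaN for non-string elements such as the integer 5053.

```python
import pandas as pd

df = pd.DataFrame({"col": ["G5051", "G5052", 5053, "G5054"]})

# Calling .str.lower() directly leaves NaN where the element is not a string.
raw = df["col"].str.lower()

# Casting to str first turns 5053 into "5053", so every row survives.
safe = df["col"].astype(str).str.lower()

print(raw.tolist())
print(safe.tolist())
```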
            List in dataframe is different to the order it appears in the original list?
            Python | Lines of Code: 4 | License: Strong Copyleft (CC BY-SA 4.0)
            df = pd.DataFrame(np.array(data).T)
            df = pd.DataFrame(list(map(list, zip(*data))))

            Community Discussions


            Installing scipy and scikit-learn on apple m1
            Asked 2022-Mar-22 at 06:21

            The installation on the m1 chip for the following packages: Numpy 1.21.1, pandas 1.3.0, torch 1.9.0 and a few other ones works fine for me. They also seem to work properly while testing them. However when I try to install scipy or scikit-learn via pip this error appears:

            ERROR: Failed building wheel for numpy

            Failed to build numpy

            ERROR: Could not build wheels for numpy which use PEP 517 and cannot be installed directly

            Why should Numpy be built again when I have the latest version from pip already installed?

            Every previous installation was done using python3.9 -m pip install ... on Mac OS 11.3.1 with the apple m1 chip.

            Maybe somebody knows how to deal with this error, or whether it's just a matter of time.



            Answered 2021-Aug-02 at 14:33

            Please see this note from the scikit-learn documentation about

            Installing on Apple Silicon M1 hardware

            The recently introduced macos/arm64 platform (sometimes also known as macos/aarch64) requires the open source community to upgrade the build configuration and automation to properly support it.

            At the time of writing (January 2021), the only way to get a working installation of scikit-learn on this hardware is to install scikit-learn and its dependencies from the conda-forge distribution, for instance using the miniforge installers:


            The following issue tracks progress on making it possible to install scikit-learn from PyPI with pip:




            Error while downloading the requirements using pip install (setup command: use_2to3 is invalid.)
            Asked 2022-Mar-05 at 07:13

            Versions: pip 21.2.4, Python 3.6

            The command:



            Answered 2021-Nov-19 at 13:30

            It looks like setuptools>=58 breaks support for use_2to3:

            setuptools changelog for v58

            So you should pin setuptools to a version below 58 (setuptools<58) or avoid using packages with use_2to3 in the setup parameters.

            I was having the same problem, pip==19.3.1



            Mapping complex JSON to Pandas Dataframe
            Asked 2022-Feb-25 at 13:57

            I have a complex nested JSON object, which I am trying to unpack into a pandas df in a very specific way.

            JSON Object
            This is an extract of the JSON object, containing randomized data. It shows the hierarchy (including children) for one family (the 'Falconer Family'); the full JSON object contains hundreds of families.



            Answered 2022-Feb-16 at 06:41

            I think this gets you pretty close; might just need to adjust the various name columns and drop the extra data (I kept the grouping column).

            The main idea is to recursively use pd.json_normalize with pd.concat for all available children levels.

            EDIT: Put everything into a single function and added section to collapse the name columns like the expected output.
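The answer's own function is not reproduced here. The recursive idea it describes can be sketched as follows, assuming a hypothetical node shape in which each record has a "name" and a "children" list. The sample tree, the `collect` helper, and the "level" and "parent" columns are all invented for illustration.

```python
import pandas as pd

# Hypothetical nested record; the real JSON from the question is not shown.
family = {
    "name": "Falconer Family",
    "children": [
        {"name": "Alice", "children": [{"name": "Carol", "children": []}]},
        {"name": "Bob", "children": []},
    ],
}


def collect(nodes, level=0, parent=""):
    """Yield one DataFrame per non-empty group of sibling nodes."""
    if not nodes:
        return
    # Flatten this level, then drop the nested list handled by the recursion.
    frame = pd.json_normalize(nodes).drop(columns="children", errors="ignore")
    frame["level"] = level
    frame["parent"] = parent
    yield frame
    for node in nodes:
        yield from collect(node.get("children", []), level + 1, node["name"])


# Concatenate every level into one flat table.
flat = pd.concat(collect([family]), ignore_index=True)
print(flat)
```

Yielding frames and concatenating once at the end avoids repeatedly concatenating inside the recursion.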



            AttributeError: Can't get attribute 'new_block' on
            Asked 2022-Feb-25 at 13:18

            I was using pyspark on AWS EMR (4 r5.xlarge as 4 workers, each has one executor and 4 cores), and I got AttributeError: Can't get attribute 'new_block' on . Below is a snippet of the code that threw this error:



            Answered 2021-Aug-26 at 14:53

            I had the same error when using pandas 1.3.2 on the server and 1.2 on my client. Downgrading pandas to 1.2 solved the problem.



            How to update pandas DataFrame.drop() for Future Warning - all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
            Asked 2022-Feb-13 at 19:56

            The following code:



            Answered 2022-Feb-13 at 19:56

            From the documentation, pandas.DataFrame.drop has the following parameters:


            • labels: single label or list-like. Index or column labels to drop.
            • axis: {0 or ‘index’, 1 or ‘columns’}, default 0. Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
            • index: single label or list-like. Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).
            • columns: single label or list-like. Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
            • level: int or level name, optional. For MultiIndex, level from which the labels will be removed.
            • inplace: bool, default False. If False, return a copy. Otherwise, do operation inplace and return None.
            • errors: {‘ignore’, ‘raise’}, default ‘raise’. If ‘ignore’, suppress error and only existing labels are dropped.

            Moving forward, only labels (the first parameter) can be positional.

            So, for this example, the drop code should be as follows:
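The question's original code is not shown here, so the following is only a sketch on a hypothetical three-column frame. It shows two keyword-based spellings that keep `labels` as the only positional argument.

```python
import pandas as pd

# Hypothetical frame, invented for this sketch.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Older style that triggered the warning passed axis positionally:
#     df.drop(["b"], 1)

# Keyword style: `labels` positional, `axis` named.
by_axis = df.drop(["b"], axis=1)

# Equivalent and often clearer: name the columns directly.
by_columns = df.drop(columns=["b"])

print(by_columns)
```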



            Cannot set up a conda environment with python 3.10
            Asked 2022-Jan-31 at 10:35

            I am trying to set up a conda environment with python 3.10 installed. For some reason, no install commands for additional packages are working. For example, if I run conda install pandas, I get the error:



            Answered 2021-Oct-08 at 08:42

            That's a bug in conda; you can read more about it here:

            Right now there is a PR to fix it, but it's not in a released version. For now, just stick with



            ImportError: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic'
            Asked 2022-Jan-12 at 23:01

            I have this output:

            [Pandas-profiling] ImportError: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic'

            when trying to import pandas-profiling in this fashion:



            Answered 2021-Aug-09 at 19:19

            Pandas v1.3 renamed the ABCIndexClass to ABCIndex. The visions dependency of the pandas-profiling package hasn't caught up yet, and so throws an error when it can't find ABCIndexClass. Downgrading pandas to the 1.2.x series will resolve the issue.

            Alternatively, you can just wait for the visions package to be updated.
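For code you maintain yourself, the rename can be absorbed with a guarded import. This is only a sketch: `pandas.core.dtypes.generic` is a private module with no stability guarantee, and the alias `IndexBase` is a name made up for illustration.

```python
import pandas as pd

try:
    # Name available before pandas 1.3.
    from pandas.core.dtypes.generic import ABCIndexClass as IndexBase
except ImportError:
    # Name used from pandas 1.3 onward.
    from pandas.core.dtypes.generic import ABCIndex as IndexBase

# Either import supports the same isinstance check on index objects.
is_index = isinstance(pd.Index([1, 2, 3]), IndexBase)
print(is_index)
```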



            Merge two pandas DataFrame based on partial match
            Asked 2022-Jan-06 at 00:54

            Two DataFrames have city names that are not formatted the same way. I'd like to do a Left-outer join and pull geo field for all partial string matches between the field City in both DataFrames.



            Answered 2021-Sep-12 at 20:24

            This should do the job. It matches strings using Levenshtein distance.

            pip install thefuzz[speedup]
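The answer's thefuzz code is not reproduced here. As a stand-in, the same partial-match left join can be sketched with the standard library's `difflib`, which scores similarity with a ratio rather than Levenshtein distance. The city names, the `closest` helper, the `match_key` column, and the 0.6 cutoff are all assumptions made for illustration.

```python
import difflib

import pandas as pd

# Hypothetical data; the question's frames are not shown.
df1 = pd.DataFrame({"City": ["new york", "los angeles", "chicago", "boston"]})
df2 = pd.DataFrame(
    {"City": ["New York City", "Los Angeles", "Chicago IL"], "geo": ["NY", "LA", "CHI"]}
)

# Compare in lowercase so case differences do not affect the score.
candidates = df2["City"].str.lower().tolist()


def closest(name, cutoff=0.6):
    """Return the best candidate above the cutoff, or None if nothing is close."""
    hits = difflib.get_close_matches(name.lower(), candidates, n=1, cutoff=cutoff)
    return hits[0] if hits else None


df1["match_key"] = df1["City"].map(closest)
df2["match_key"] = df2["City"].str.lower()

# Left join keeps every row of df1; unmatched cities get NaN for geo.
merged = df1.merge(df2[["match_key", "geo"]], on="match_key", how="left")
print(merged)
```

The cutoff trades recall against false matches, so it is worth tuning on real data.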



            Create a new column in a Pandas DataFrame from existing column names
            Asked 2021-Nov-15 at 00:22

            I want to deconstruct a pandas DataFrame, using column headers as a new data-column and create a list with all combinations of the row index and columns. Easier to show than explain:



            Answered 2021-Nov-09 at 23:58

            The structure that you want your data in is very messy, so this is probably the best method given the data you want.
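The answer's code is not shown here. One common way to get every combination of row index and column header as rows, not necessarily the answerer's method, is `DataFrame.melt`. The frame and the names "row", "column", and "value" below are hypothetical.

```python
import pandas as pd

# Hypothetical frame, invented for this sketch.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])

# Move the index into a column, then unpivot so each header becomes data.
long = (
    df.rename_axis("row")
    .reset_index()
    .melt(id_vars="row", var_name="column", value_name="value")
)

# A plain list of (row, column, value) tuples, if a list is what is needed.
combos = list(long.itertuples(index=False, name=None))
print(long)
```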



            After conda update, python kernel crashes when matplotlib is used
            Asked 2021-Nov-06 at 19:03

            I have created this simple env with conda:



            Answered 2021-Nov-06 at 19:03
            Update 2021-11-06
            • The default pkgs/main channel for conda has reverted to using freetype 2.10.4 for Windows, per main / packages / freetype.
            • If you are still experiencing the issue, use conda list freetype to check the version: freetype != 2.11.0
              • If it is 2.11.0, then change the version per the solution, or conda update --all (providing your default channel isn't changed in the .condarc config file).
            • If this is occurring after installing Anaconda, or after updating conda or freetype since Oct 27, 2021:
            • Go to the Anaconda prompt and downgrade freetype 2.11.0 in any affected environment.
              • conda install freetype=2.10.4
            • Relevant to any package using matplotlib and any IDE
              • For example, pandas.DataFrame.plot and seaborn
              • Jupyter, Spyder, VSCode, PyCharm, command line.
            • An issue occurs after updating with the most current updates from conda, released Friday, Oct 29.
            • After updating with conda update --all, there's an issue with anything related to matplotlib in any IDE (not just Jupyter).
              • I tested this in JupyterLab, PyCharm, and python from the command prompt.
              • PyCharm: Process finished with exit code -1073741819
              • JupyterLab: kernel just restarts and there are no associated errors or Traceback
              • command prompt: a blank interactive matplotlib window will appear briefly, and then a new command line appears.
            • The issue seems to be with conda update --all in (base), then any plot API that uses matplotlib (e.g. seaborn and pandas.DataFrame.plot) kills the kernel in any environment.
            • I had to reinstall Anaconda without updating (base); after that, my other environments worked.
            • I have not figured out what specifically is causing the issue.
            • I tested the issue with python 3.8.12 and python 3.9.7
            • Current Testing:
              • Following is the conda revision log.
              • Prior to conda update --all this environment was working, but after the updates, plotting with matplotlib crashes the python kernel


            Community Discussions, Code Snippets contain sources that include Stack Exchange Network


            No vulnerabilities reported

            Install pandas

            To install pandas from source you need Cython in addition to the normal dependencies above. Cython can be installed from PyPI:


            The official documentation is hosted on
            Find more information at:

          • PyPI: pip install pandas
          • GitHub CLI: gh repo clone pandas-dev/pandas
