seaborn | Statistical data visualization in Python | Data Visualization library

by mwaskom | Python | Version: v0.11.2 | License: BSD-3-Clause

kandi X-RAY | seaborn Summary

seaborn is a Python library typically used in analytics, data visualization, and pandas applications. It has no reported bugs or vulnerabilities, a build file is available, it has a permissive license, and it has high support. You can download it from GitHub.

Support

  • seaborn has a highly active ecosystem.
  • It has 9320 stars and 1582 forks. There are 248 watchers for this library.
  • It had no major release in the last 12 months.
  • There are 105 open issues and 1892 have been closed. On average issues are closed in 66 days. There are 11 open pull requests and 0 closed requests.
  • It has a positive sentiment in the developer community.
  • The latest version of seaborn is v0.11.2

Quality

  • seaborn has 0 bugs and 0 code smells.

Security

  • seaborn has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
  • seaborn code analysis shows 0 unresolved vulnerabilities.
  • There are 0 security hotspots that need review.

License

  • seaborn is licensed under the BSD-3-Clause License. This license is Permissive.
  • Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

  • seaborn releases are available to install and integrate.
  • Build file is available. You can build the component from source.
  • Installation instructions are not available. Examples and code snippets are available.
  • seaborn saves you 13118 person hours of effort in developing the same functionality from scratch.
  • It has 26955 lines of code, 1355 functions and 102 files.
  • It has high code complexity. Code complexity directly impacts maintainability of the code.
Top functions reviewed by kandi - BETA

kandi has reviewed seaborn and identified the functions below as its top functions. This is intended to give you instant insight into the functionality seaborn implements and to help you decide whether it suits your requirements.

  • Plot a scatter plot
    • The legend
    • Draw a matplotlib figure
    • Add a legend
    • Combine the data into a single plot
  • Generate a joint -man plot
    • Inject kwargs into kwargs
    • Plot marginal properties for a function
    • Apply func to func
  • Plot an lmplot
  • Show a dark palette
  • Show a light palette
  • Plot the marginal properties of a function
  • Determine the colormap parameters
  • Generate a swarm plot
  • Choose the diverging palette
  • Call func on func
  • Extract the docstring from the file
  • Get the plot data
  • Combine the data into a new plot
  • Choose a cubehelix palette
  • Load an example dataset
  • Plot clusters of data
  • Move a legend
  • Plot the residual data
  • Set title and col_titles
  • Draw a logo
  • Choose a colorbrewer palette


                                        seaborn Key Features

                                        Statistical data visualization in Python

                                        seaborn: statistical data visualization
                                        =======================================
                                        
                                        [![PyPI Version](https://img.shields.io/pypi/v/seaborn.svg)](https://pypi.org/project/seaborn/)
                                        [![License](https://img.shields.io/pypi/l/seaborn.svg)](https://github.com/mwaskom/seaborn/blob/master/LICENSE)
                                        [![DOI](https://joss.theoj.org/papers/10.21105/joss.03021/status.svg)](https://doi.org/10.21105/joss.03021)
                                        [![Tests](https://github.com/mwaskom/seaborn/workflows/CI/badge.svg)](https://github.com/mwaskom/seaborn/actions)
                                        [![Code Coverage](https://codecov.io/gh/mwaskom/seaborn/branch/master/graph/badge.svg)](https://codecov.io/gh/mwaskom/seaborn)
                                        
                                        Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
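
As a rough illustration of that high-level interface, the sketch below draws a scatter plot colored by a category with a single call. The DataFrame and its column names (`weight`, `height`, `group`) are made up for the example, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import pandas as pd
import seaborn as sns

# Hypothetical toy data, not shipped with seaborn
df = pd.DataFrame({
    "weight": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "height": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2],
    "group": ["a", "a", "a", "b", "b", "b"],
})

# One call maps columns to x, y and color; seaborn returns the matplotlib Axes
ax = sns.scatterplot(data=df, x="weight", y="height", hue="group")
ax.set_title("height vs. weight")
```

Because the return value is an ordinary matplotlib `Axes`, it can be customized further with the usual matplotlib methods, as `set_title` does here.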
                                        
                                        
                                        Documentation
                                        -------------
                                        
                                        Online documentation is available at [seaborn.pydata.org](https://seaborn.pydata.org).
                                        
                                        The docs include a [tutorial](https://seaborn.pydata.org/tutorial.html), [example gallery](https://seaborn.pydata.org/examples/index.html), [API reference](https://seaborn.pydata.org/api.html), and other useful information.
                                        
                                        To build the documentation locally, please refer to [`doc/README.md`](doc/README.md).
                                        
                                        There is also a [FAQ](https://github.com/mwaskom/seaborn/wiki/Frequently-Asked-Questions-(FAQs)) page, currently hosted on GitHub.
                                        
                                        Dependencies
                                        ------------
                                        
                                        Seaborn supports Python 3.7+ and no longer supports Python 2.
                                        
                                        Installation requires [numpy](https://numpy.org/), [pandas](https://pandas.pydata.org/), and [matplotlib](https://matplotlib.org/). Some functions will optionally use [scipy](https://www.scipy.org/) and/or [statsmodels](https://www.statsmodels.org/) if they are available.
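
One way to see which of these packages are present in the current environment is to ask the import system, without actually importing anything. This is only a sketch: the helper name `available` is hypothetical, and the module lists simply restate the packages named above.

```python
from importlib.util import find_spec


def available(modules):
    """Map each top-level module name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in modules}


# Packages named in the paragraph above
required = available(["numpy", "pandas", "matplotlib"])
optional = available(["scipy", "statsmodels"])

print("required:", required)
print("optional:", optional)
```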
                                        
                                        
                                        Installation
                                        ------------
                                        
                                        The latest stable release (and required dependencies) can be installed from PyPI:
                                        
                                            pip install seaborn
                                        
                                        It is also possible to include optional dependencies (only relevant for v0.12+):
                                        
                                            pip install seaborn[all]
                                        
                                        Seaborn can also be installed with conda:
                                        
                                            conda install seaborn
                                        
                                        Note that the main anaconda repository typically lags PyPI in adding new releases, but conda-forge (`-c conda-forge`) typically updates quickly.
                                        
                                        Citing
                                        ------
                                        
                                        A paper describing seaborn has been published in the [Journal of Open Source Software](https://joss.theoj.org/papers/10.21105/joss.03021). The paper provides an introduction to the key features of the library, and it can be used as a citation if seaborn proves integral to a scientific publication.
                                        
                                        Testing
                                        -------
                                        
                                        Testing seaborn requires installing additional packages listed in `ci/utils.txt`.
                                        
                                        To test the code, run `make test` in the source directory. This will exercise both the unit tests and docstring examples (using [pytest](https://docs.pytest.org/)) and generate a coverage report.
                                        
                                        The doctests require a network connection (unless all example datasets are cached), but the unit tests can be run offline with `make unittests`.
                                        
                                        Code style is enforced with `flake8` using the settings in the [`setup.cfg`](./setup.cfg) file. Run `make lint` to check.
                                        
                                        Development
                                        -----------
                                        
                                        Seaborn development takes place on GitHub: https://github.com/mwaskom/seaborn
                                        
                                        Please submit bugs that you encounter to the [issue tracker](https://github.com/mwaskom/seaborn/issues) with a reproducible example demonstrating the problem. Questions about usage are more at home on StackOverflow, where there is a [seaborn tag](https://stackoverflow.com/questions/tagged/seaborn).

                                        Getting Error 0 when plotting boxplot of a filtered dataset

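                                        # Call from the question (column names as in the asker's dataset)
                                        sns.boxplot(data=df, x='rings', y='sex')

                                        # Minimal reproduction with a filtered DataFrame
                                        import seaborn as sns
                                        import pandas as pd

                                        df = pd.DataFrame({'Sex': ['M', 'M', 'F', 'F'],
                                                           'Rings': [1, 2, 3, 4]})
                                        df_m = df[df['Sex'] == 'M']
                                        df_f = df[df['Sex'] == 'F']
                                        # df_f keeps index labels 2 and 3, so there is no label 0;
                                        # in seaborn v0.11 this call can fail with KeyError: 0
                                        sns.boxplot(data=df_f['Rings'])

                                        # Workaround 1: pass the underlying NumPy array
                                        sns.boxplot(data=df_f['Rings'].values)

                                        # Workaround 2: pass the DataFrame and name the column
                                        sns.boxplot(data=df_f, y='Rings')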
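
The "Error 0" in the title is consistent with a label lookup for `0` on a Series whose index no longer contains that label. The pandas-only sketch below reproduces that lookup failure without seaborn. `reset_index(drop=True)` is an additional option that is not among the snippets above; it is shown here as an assumption about what would also avoid the problem.

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["M", "M", "F", "F"], "Rings": [1, 2, 3, 4]})
rings_f = df.loc[df["Sex"] == "F", "Rings"]

# Filtering keeps the original index labels
print(list(rings_f.index))  # → [2, 3]

# Label-based lookup of 0 fails because the filtered Series has no label 0
try:
    rings_f[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)  # → True

# Two ways to get data that starts at position 0 again
renumbered = rings_f.reset_index(drop=True)  # new labels 0, 1
as_array = rings_f.to_numpy()                # plain NumPy array, positional only
```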
                                        

                                        Resize axes of top and right joint marginal plots to match central plot with matplotlib

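                                        import matplotlib.pyplot as plt
                                        from matplotlib.transforms import Bbox

                                        # code added at the end, just before plt.show();
                                        # `axs` is the grid of Axes (at least 2x2) created earlier in the question's script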
                                        (x0m, y0m), (x1m, y1m) = axs[1, 0].get_position().get_points()  # main heatmap
                                        (x0h, y0h), (x1h, y1h) = axs[0, 0].get_position().get_points()  # horizontal histogram
                                        axs[0, 0].set_position(Bbox([[x0m, y0h], [x1m, y1h]]))
                                        (x0v, y0v), (x1v, y1v) = axs[1, 1].get_position().get_points()  # vertical histogram
                                        axs[1, 1].set_position(Bbox([[x0v, y0m], [x1v, y1m]]))
                                        
                                        plt.show()
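
For a version that runs on its own, the sketch below wraps the same position arithmetic in a helper and applies it to a small 2x2 grid. The helper name `align_marginals`, the random data, and the manual shrinking of the main axes (used only to create a misalignment to fix) are assumptions for illustration and are not taken from the original question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.transforms import Bbox


def align_marginals(axs):
    """Stretch the top and right marginal axes to the extent of the main axes axs[1, 0]."""
    (x0m, y0m), (x1m, y1m) = axs[1, 0].get_position().get_points()
    # Top histogram: take the horizontal extent of the main axes, keep its own height
    (_, y0h), (_, y1h) = axs[0, 0].get_position().get_points()
    axs[0, 0].set_position(Bbox([[x0m, y0h], [x1m, y1h]]))
    # Right histogram: take the vertical extent of the main axes, keep its own width
    (x0v, _), (x1v, _) = axs[1, 1].get_position().get_points()
    axs[1, 1].set_position(Bbox([[x0v, y0m], [x1v, y1m]]))


rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 500))

fig, axs = plt.subplots(2, 2, figsize=(6, 6),
                        gridspec_kw={"width_ratios": [4, 1], "height_ratios": [1, 4]})
axs[0, 1].axis("off")
axs[1, 0].hist2d(x, y, bins=20)
axs[0, 0].hist(x, bins=20)
axs[1, 1].hist(y, bins=20, orientation="horizontal")

# Shrink the main axes by hand to mimic a layout step that leaves them misaligned
(x0, y0), (x1, y1) = axs[1, 0].get_position().get_points()
axs[1, 0].set_position(Bbox([[x0, y0], [x1 - 0.1, y1 - 0.1]]))

align_marginals(axs)
```

Positions are expressed in figure coordinates, so the helper only copies the main axes' left/right edges to the top plot and its bottom/top edges to the right plot.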
                                        

                                        How to paste an Excel chart into PowerPoint placeholder using Python?

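                                        import win32com.client as win32

                                        # `outputPath` and `template` are defined elsewhere in the question's script
                                        xlApp = win32.Dispatch('Excel.Application')
                                        wb = xlApp.Workbooks.Open(outputPath+'Chart Pack.xlsb')

                                        pptApp = win32.Dispatch('PowerPoint.Application')
                                        ppt = pptApp.Presentations.Open(template)
                                        # Assumption: `window` (used below but not defined in this snippet)
                                        # is the PowerPoint document window showing `ppt`
                                        window = pptApp.ActiveWindow

                                        slide_num = 3
                                        LEFT_PLACEHOLDER = 3
                                        RIGHT_PLACEHOLDER = 2
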
                                        # Figure1
                                        window.View.GotoSlide(slide_num)
                                        wb.sheets('Charts').ChartObjects('Figure1').Copy()
                                        ppt.Slides.Item(slide_num).Shapes.Paste().Select()
                                        window.Selection.Cut()
                                        ppt.Slides.Item(slide_num).Shapes(LEFT_PLACEHOLDER).Select()
                                        window.View.Paste()
                                        
                                        # Figure2
                                        window.View.GotoSlide(slide_num)
                                        wb.sheets('Charts').ChartObjects('Figure2').Copy()
                                        ppt.Slides.Item(slide_num).Shapes.Paste().Select()
                                        window.Selection.Cut()
                                        ppt.Slides.Item(slide_num).Shapes(RIGHT_PLACEHOLDER).Select()
                                        window.View.Paste()
                                        

                                        How to install local package with conda

                                        # meta.yaml (conda-build recipe)
                                        package:
                                          name: jive
                                          version: "0.2.1"
                                        
                                        source:
                                          path: .
                                        
                                        build:
                                          script: python -m pip install --no-deps --ignore-installed .
                                        
                                        requirements:
                                          host:
                                             - python
                                             - pip
                                             - setuptools
                                          run:
                                             - python
                                             - numpy
                                             - pandas
                                             - scipy
                                             - seaborn
                                             - matplotlib
                                             - scikit-learn
                                             - statsmodels
                                             - joblib
                                             - bokeh
                                        

                                        How to add median and IQR to seaborn violinplot

                                        import seaborn as sns
                                        import matplotlib.pyplot as plt

                                        sns.set_theme(style="whitegrid", rc={'figure.figsize': (20, 14)})
                                        tips = sns.load_dataset("tips")
                                        ax = sns.violinplot(x="day", y="total_bill", hue="sex",
                                                            data=tips, palette="Set2", split=True,
                                                            scale="count", inner="quartile")

                                        # Label each quartile line with its y value, anchored at the line's first nonzero x
                                        for line in ax.lines:
                                            xs, ys = line.get_data()
                                            ax.text(xs[xs.nonzero()][0], ys[0], f'{ys[0]:.0f}', size='large')

                                        # Inspect the quartile lines; each prints as (x data, y data)
                                        for line in ax.lines:
                                            print(line.get_data())
                                        
                                        (array([-0.33143126,  0.        ]), array([13.6975, 13.6975]))
                                        (array([-0.37082654,  0.        ]), array([16.975, 16.975]))
                                        (array([-0.25003508,  0.        ]), array([22.36, 22.36]))
                                        (array([0.        , 0.38725949]), array([12.1625, 12.1625]))
                                        (array([0.        , 0.39493527]), array([13.785, 13.785]))
                                        (array([0.        , 0.25722566]), array([18.675, 18.675]))
                                        (array([0.61440819, 1.        ]), array([12.235, 12.235]))
                                        (array([0.62200517, 1.        ]), array([17.215, 17.215]))
                                        (array([0.71996702, 1.        ]), array([26.0825, 26.0825]))
                                        (array([1.        , 1.27676537]), array([11.35, 11.35]))
                                        (array([1.        , 1.35376254]), array([15.38, 15.38]))
                                        (array([1.        , 1.33610081]), array([16.27, 16.27]))
                                        (array([1.63945286, 2.        ]), array([13.905, 13.905]))
                                        (array([1.60991669, 2.        ]), array([18.24, 18.24]))
                                        (array([1.75431475, 2.        ]), array([24.165, 24.165]))
                                        (array([2.        , 2.17418026]), array([14.05, 14.05]))
                                        (array([2.        , 2.18356376]), array([18.36, 18.36]))
                                        (array([2.        , 2.12950778]), array([25.5625, 25.5625]))
                                        (array([2.62638135, 3.        ]), array([15.135, 15.135]))
                                        (array([2.61919886, 3.        ]), array([20.725, 20.725]))
                                        (array([2.73148987, 3.        ]), array([26.55, 26.55]))
                                        (array([3.        , 3.12054879]), array([15.175, 15.175]))
                                        (array([3.        , 3.12176923]), array([17.41, 17.41]))
                                        (array([3.        , 3.07327664]), array([24.8975, 24.8975]))
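
The snippet above reads the quartiles back from the lines seaborn has drawn. A possible alternative, sketched here under the assumption that computing the statistics directly is acceptable, is a pandas `groupby` that returns the quartiles and the IQR per group. The helper name `quartile_table` and the toy data are hypothetical; the numbers are not from the `tips` dataset.

```python
import pandas as pd

# Hypothetical toy data with two groups of five values each
df = pd.DataFrame({
    "day": ["Thur"] * 5 + ["Fri"] * 5,
    "total_bill": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0],
})


def quartile_table(data, group, value):
    """Return q1, median, q3 and IQR of `value` for each level of `group`."""
    q = data.groupby(group)[value].quantile([0.25, 0.5, 0.75]).unstack()
    q.columns = ["q1", "median", "q3"]
    q["iqr"] = q["q3"] - q["q1"]
    return q


stats = quartile_table(df, "day", "total_bill")
print(stats)
```

The resulting table could then feed `ax.text` calls, which avoids depending on the order in which the plot's lines were created.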
                                        

                                        Pandas agg define metric based on data type

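                                        import numpy as np
                                        import seaborn as sns

                                        iris = sns.load_dataset('iris')

                                        # Attempt from the question: pick an aggregation per column from its dtype.
                                        # The mapping is named agg_spec rather than dict to avoid shadowing the built-in.
                                        # Note that the answers below use 'count' rather than 'first' for object columns.
                                        agg_spec = {}

                                        def a(x):
                                            if x.dtype == np.dtype('float64'):
                                                agg_spec[x.name] = "mean"
                                            elif x.dtype == np.dtype('object'):
                                                agg_spec[x.name] = "first"

                                        iris.apply(a)  # called only for its side effect of filling agg_spec

                                        iris.agg(agg_spec)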
                                        
                                        import seaborn as sns
                                        iris = sns.load_dataset('iris')
                                        
                                        agg_method = {'float64': 'mean', 'object':  'count'}
                                        
                                        iris.agg({k: agg_method[str(v)] for k, v in iris.dtypes.items()})
                                        
                                        sepal_length      5.843333
                                        sepal_width       3.057333
                                        petal_length      3.758000
                                        petal_width       1.199333
                                        species         150.000000
                                        dtype: float64
                                        
                                        import pandas as pd

                                        # Alternative: aggregate numeric and object columns separately, then combine
                                        pd.concat([iris.mean(numeric_only=True),
                                                   iris.select_dtypes('object').count()]
                                                 )
                                        
                                        sepal_length      5.843333
                                        sepal_width       3.057333
                                        petal_length      3.758000
                                        petal_width       1.199333
                                        species         150.000000
                                        

                                        Adding columns & values per group occurrence in pandas after filtering

                                        df = (df_temp.join(pd.crosstab(df['sex'],pd.cut(df['age'], 
                                                                                        bins=range(0,9),
                                                                                        labels=range(1,9)))
                                                             .add_prefix('age_')))
                                        print (df)
                                                     fare  age_1  age_2  age_3  age_4  age_5  age_6  age_7  age_8
                                        sex                                                                      
                                        female  44.479818      4      6      2      5      4      2      1      2
                                        male    25.523893     10      4      4      5      0      1      2      2
                                        

                                        Matplotlib scatter plot marker type from dictionary

                                        import matplotlib.pyplot as plt
                                        import pandas as pd

                                        x = [4, 8, 1, 0, 2]
                                        y = [0.1, 1, 0.4, 0.8, 0.9]
                                        name = ["A", "A", "B", "A", "B"]
                                        df = pd.DataFrame(data=zip(x, y, name), columns=["x", "y", "name"])
                                        
                                        colors = {"A": "red", "B": "blue"}
                                        markers = {"A": "v", "B": "D"}
                                        
                                        fig, ax = plt.subplots(1, 1)
                                        
                                        # scatter() takes a single marker per call, so draw each group separately
                                        for name, group in df.groupby("name"):
                                            ax.scatter(
                                                x=group["x"],
                                                y=group["y"],
                                                facecolors="none",
                                                edgecolors=colors[name],
                                                marker=markers[name],
                                                label=name,
                                            )
                                        
                                        # Build the legend once, after every group has been drawn
                                        ax.legend(loc="lower right")
                                        
                                        

                                        Google Colab ModuleNotFoundError: No module named 'sklearn.externals.joblib'

                                        # Libraries to help with reading and manipulating data
                                        import numpy as np
                                        import pandas as pd
                                        
                                        # Libraries to help with data visualization
                                        import matplotlib.pyplot as plt
                                        import seaborn as sns
                                        
                                        sns.set()
                                        
                                        # Removes the limit for the number of displayed columns
                                        pd.set_option("display.max_columns", None)
                                        # Sets the limit for the number of displayed rows
                                        pd.set_option("display.max_rows", 200)
                                        
                                        # to split the data into train and test
                                        from sklearn.model_selection import train_test_split
                                        
                                        # to build linear regression_model
                                        from sklearn.linear_model import LinearRegression
                                        
                                        # to check model performance
                                        from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
                                        
                                        # I changed this part
                                        !pip install mlxtend
                                        import joblib
                                        import sys
                                        sys.modules['sklearn.externals.joblib'] = joblib
                                        from mlxtend.feature_selection import SequentialFeatureSelector as SFS
                                        

                                        Seaborn heatmap change date frequency of yticks

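                                        # Option 1: label only 1 January and 1 July of each year
                                        labels = [date if date.strftime('%m%d') in ['0101', '0701'] else ''
                                                  for date in df.index.date]
                                        
                                        # Option 2 (alternative to option 1): label every 183rd row
                                        labels = [date if row % 183 == 0 else ''
                                                  for row, date in enumerate(df.index.date)]
                                        
                                        # Pass the sparse label list so that only the selected rows show a tick label
                                        sns.heatmap(df, cmap='PuOr', cbar_kws={'label': 'Ice Velocity (m/yr)'},
                                                    vmin=df.values.min(), vmax=df.values.max(),
                                                    yticklabels=labels)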
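
The label-building step can be tried on its own, without plotting. The following is a minimal sketch that assumes a daily DatetimeIndex covering 2020 and 2021. The names `index`, `labels_by_date` and `labels_by_row` are illustrative and do not come from the original question.

```python
import pandas as pd

# Assumed stand-in for df.index: two years of daily timestamps
index = pd.date_range("2020-01-01", "2021-12-31", freq="D")

# Keep a label only on 1 January and 1 July; every other row gets ''
labels_by_date = [date if date.strftime("%m%d") in ("0101", "0701") else ""
                  for date in index.date]

# Alternative: keep a label on every 183rd row, regardless of the calendar
labels_by_row = [date if row % 183 == 0 else ""
                 for row, date in enumerate(index.date)]

# Collect the non-empty labels to see which rows would be annotated
shown_by_date = [label for label in labels_by_date if label != ""]
shown_by_row = [label for label in labels_by_row if label != ""]
print(shown_by_date)
print(shown_by_row)
```

Both lists have the same length as the index, so either can be passed as `yticklabels`. The date-based variant stays aligned with the calendar, while the row-based variant drifts if the data has gaps.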
                                        

                                        Community Discussions

                                        Trending Discussions on seaborn
                                        • Getting Error 0 when plotting boxplot of a filtered dataset
                                        • Resize axes of top and right joint marginal plots to match central plot with matplotlib
                                        • How to paste an Excel chart into PowerPoint placeholder using Python?
                                        • How to install local package with conda
                                        • How to add median and IQR to seaborn violinplot
                                        • Can't deploy streamlit app on share.streamlit.io
                                        • Pandas agg define metric based on data type
                                        • Heatmap error :'NoneType' object is not callable when using with dataframe
                                        • Adding columns & values per group occurrence in pandas after filtering
                                        • Matplotlib scatter plot marker type from dictionary

                                        QUESTION

                                        Getting Error 0 when plotting boxplot of a filtered dataset

                                        Asked 2022-Mar-11 at 15:48

                                        I am working on the Kaggle: Abalone dataset and I am facing a weird problem when plotting a boxplot.

                                        import matplotlib.pyplot as plt
                                        import pandas as pd
                                        import seaborn as sns
                                        
                                        df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data', header=None)
                                        df.columns = ['sex', 'Length', 'Diameter', 'Height', 'Whole weight', 'Shucked weight', 'Viscera weight', 'Shell weight', 'rings']
                                        

                                        If I run:

                                        plt.figure(figsize=(16,6))
                                        plt.subplot(121)
                                        sns.boxplot(data=df['rings'])
                                        

                                        working perfectly!

                                        If I filter the dataset by sex like this:

                                        df_f = df[df['sex']=='F']
                                        df_m = df[df['sex']=='M']
                                        df_i = df[df['sex']=='I']
                                        

                                        The resulting shapes are df_f = (1307, 9), df_m = (1528, 9) and df_i = (1342, 9).

                                        And I run:

                                        plt.figure(figsize=(16,6))
                                        plt.subplot(121)
                                        sns.boxplot(data=df_m['rings'])
                                        

                                        working perfectly!

                                        But if I run the code above for df_f and df_i datasets I get an error:

                                        ---------------------------------------------------------------------------
                                        KeyError                                  Traceback (most recent call last)
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
                                           3360             try:
                                        -> 3361                 return self._engine.get_loc(casted_key)
                                           3362             except KeyError as err:
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
                                        
                                        pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
                                        
                                        pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
                                        
                                        KeyError: 0
                                        
                                        The above exception was the direct cause of the following exception:
                                        
                                        KeyError                                  Traceback (most recent call last)
                                        /tmp/ipykernel_434828/3363262611.py in <module>
                                        ----> 1 sns.boxplot(data=df_f['Rings'])
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
                                             44             )
                                             45         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
                                        ---> 46         return f(**kwargs)
                                             47     return inner_f
                                             48 
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/seaborn/categorical.py in boxplot(x, y, hue, data, order, hue_order, orient, color, palette, saturation, width, dodge, fliersize, linewidth, whis, ax, **kwargs)
                                           2241 ):
                                           2242 
                                        -> 2243     plotter = _BoxPlotter(x, y, hue, data, order, hue_order,
                                           2244                           orient, color, palette, saturation,
                                           2245                           width, dodge, fliersize, linewidth)
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/seaborn/categorical.py in __init__(self, x, y, hue, data, order, hue_order, orient, color, palette, saturation, width, dodge, fliersize, linewidth)
                                            404                  width, dodge, fliersize, linewidth):
                                            405 
                                        --> 406         self.establish_variables(x, y, hue, data, orient, order, hue_order)
                                            407         self.establish_colors(color, palette, saturation)
                                            408 
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/seaborn/categorical.py in establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
                                             96                 if hasattr(data, "shape"):
                                             97                     if len(data.shape) == 1:
                                        ---> 98                         if np.isscalar(data[0]):
                                             99                             plot_data = [data]
                                            100                         else:
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/core/series.py in __getitem__(self, key)
                                            940 
                                            941         elif key_is_scalar:
                                        --> 942             return self._get_value(key)
                                            943 
                                            944         if is_hashable(key):
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/core/series.py in _get_value(self, label, takeable)
                                           1049 
                                           1050         # Similar to Index.get_value, but we do not fall back to positional
                                        -> 1051         loc = self.index.get_loc(label)
                                           1052         return self.index._get_values_for_loc(self, loc, label)
                                           1053 
                                        
                                        ~/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
                                           3361                 return self._engine.get_loc(casted_key)
                                           3362             except KeyError as err:
                                        -> 3363                 raise KeyError(key) from err
                                           3364 
                                           3365         if is_scalar(key) and isna(key) and not self.hasnans:
                                        
                                        KeyError: 0
                                        

                                        There are no missing values, and all values are int.

                                        What am I missing here?

                                        ANSWER

                                        Answered 2022-Mar-10 at 10:38

                                        If you want a box plot per value of a categorical column I suggest:

                                        sns.boxplot(data=df, x='rings', y='sex')
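
The traceback in the question points at a likely cause: the seaborn version shown evaluates `data[0]` on the Series it receives, and pandas treats that `0` as an index label rather than a position. A filtered Series keeps the row labels of the original frame, so the lookup only works if row 0 happens to survive the filter. The following is a minimal sketch of that behaviour on a small synthetic frame (not the Abalone data), assuming, as in the question, that the first row belongs to the `M` group.

```python
import pandas as pd

# Synthetic stand-in for the Abalone frame; row 0 is an 'M' row by assumption
df = pd.DataFrame({
    "sex": ["M", "F", "I", "M", "F"],
    "rings": [15, 7, 9, 10, 8],
})

rings_m = df[df["sex"] == "M"]["rings"]  # keeps index labels 0 and 3
rings_f = df[df["sex"] == "F"]["rings"]  # keeps index labels 1 and 4

# Label 0 exists only in the 'M' subset
print(0 in rings_m.index, 0 in rings_f.index)

# The same label-based lookup that appears in the traceback
try:
    rings_f[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)

# Resetting the index restores a label 0, so the lookup succeeds again
rings_f_reset = rings_f.reset_index(drop=True)
print(rings_f_reset[0])
```

Under that reading, passing `df_f['rings'].reset_index(drop=True)` to `sns.boxplot` may avoid the error, although the `x`/`y` form suggested in the answer sidesteps the issue without any filtering.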
                                        

                                        Source https://stackoverflow.com/questions/71422431

                                        Community Discussions, Code Snippets contain sources that include Stack Exchange Network

                                        Vulnerabilities

                                        No vulnerabilities reported

                                        Install seaborn

                                        You can download it from GitHub.
                                        You can use seaborn like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
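
After installing, it can be useful to confirm which version the active environment actually sees. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of seaborn.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if seaborn is installed in this environment, else None
print(installed_version("seaborn"))
```

Running this inside the virtual environment shows whether the package was installed there rather than system-wide.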

                                        Support

                                        For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search and ask on the Stack Overflow community page.

                                        • © 2022 Open Weaver Inc.