# How to Calculate Summary Statistics for a Pandas DataFrame

by ganesh Updated: Jan 24, 2023

Summary statistics are measures that summarize or describe a set of observations. For a pandas DataFrame, they describe the data held in its columns, such as the typical value and the spread of each column.

A pandas DataFrame's describe() method generates several summary statistics for the data at once. It returns a new DataFrame with a row for each statistic and a column for each numeric column of the original data.

- describe(): By default, this method computes the count, mean, standard deviation, minimum, maximum, and quartiles of each numeric column. You can select which column types to include in the summary with the include parameter and which to omit with the exclude parameter.
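As a minimal sketch of the include and exclude parameters, the example below uses a small made-up DataFrame (the column names and values are assumptions for illustration, not data from this article):

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [23, 35, 31, 42],
    "score": [88.5, 92.0, 79.5, 85.0],
})

# Default behavior: only the numeric columns are summarized.
numeric_summary = df.describe()

# include="all" summarizes every column; statistics that do not
# apply to a column (e.g. the mean of "name") are filled with NaN.
full_summary = df.describe(include="all")

# exclude=["number"] drops the numeric columns, leaving the text column.
text_summary = df.describe(exclude=["number"])

print(numeric_summary)
print(full_summary)
print(text_summary)
```

For non-numeric columns, describe() reports count, unique, top, and freq instead of the mean and quartiles.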

Other methods, including mean(), median(), min(), max(), and std(), can be used to obtain individual summary statistics for the data.
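The individual methods can be called directly on a DataFrame, as in this short sketch. The data is made up for illustration; each call returns a Series with one value per column.

```python
import pandas as pd

# Hypothetical numeric data, invented for this illustration.
df = pd.DataFrame({
    "age": [23, 35, 31, 42],
    "score": [88.5, 92.0, 79.5, 85.0],
})

means = df.mean()      # arithmetic mean of each column
medians = df.median()  # middle value of each column
minimums = df.min()    # smallest value of each column
maximums = df.max()    # largest value of each column
stds = df.std()        # sample standard deviation (ddof=1 by default)

print(means)
print(medians)
print(stds)

# A single column can be summarized the same way.
print(df["age"].mean())
```

These methods are handy when only one statistic is needed and the full describe() table would be more than necessary.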

Here is how to calculate summary statistics for a pandas DataFrame:

### Code

In this solution, we use the describe() method of the pandas library.
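A minimal sketch of this approach follows. The DataFrame contents are assumed for illustration and are not the article's original data; the point is the layout of the result and how to read a single value from it.

```python
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame({
    "age": [23, 35, 31, 42],
    "score": [88.5, 92.0, 79.5, 85.0],
})

# describe() returns a DataFrame: statistics are the rows (the index),
# and the numeric columns of df are the columns.
summary = df.describe()
print(summary)

# The row labels are the statistic names.
print(list(summary.index))

# Individual values can be looked up by statistic and column name.
mean_age = summary.loc["mean", "age"]
median_score = summary.loc["50%", "score"]  # the 50% quartile is the median
print(mean_age, median_score)
```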

### Dependent Libraries

pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

### Environment Tested

I tested this solution in the following versions. Be mindful of changes when working with other versions.

- The solution was created in Python 3.7.
- The solution was tested on pandas version 1.3.1.

Using this solution, we can create summary statistics for a DataFrame with the pandas library in Python in a few simple steps. The process also facilitates an easy-to-use, hassle-free way to build a hands-on working version of code that helps us summarize data in pandas.

