Plotting means drawing points or markers in a diagram. The plot() function draws a line from point to point and takes parameters that specify the points in the diagram. A scatter plot is a diagram in which a dot represents each value in the data set.
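As a rough illustration of that difference, the sketch below draws the same points once with plot() and once with scatter(). The sample values, the variable names, and the output file name are made up for this example, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Made-up sample points, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 1, 5, 3]

fig, (ax_line, ax_dots) = plt.subplots(1, 2, figsize=(8, 3))

# plot() joins consecutive points with line segments
lines = ax_line.plot(xs, ys)
ax_line.set_title("plot()")

# scatter() draws one marker per point and no connecting line
dots = ax_dots.scatter(xs, ys)
ax_dots.set_title("scatter()")

# Hypothetical file name; any path works
fig.savefig("plot_vs_scatter.png")
```

The left panel shows a single connected line, while the right panel shows five separate markers.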
To create a scatter plot from the data in a pandas DataFrame, we have two options:
- The first is to use the built-in pandas plot.scatter function on the DataFrame (for example, df.plot.scatter), after importing the pandas library with import pandas as pd.
- The second is to use matplotlib.pyplot.scatter.
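The two options can be sketched side by side as follows. This is a minimal example under assumptions: the DataFrame contents and the column names X and Y are invented for illustration, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data, for illustration only
df = pd.DataFrame({"X": [0.1, 0.4, 0.7, 0.9], "Y": [0.2, 0.5, 0.6, 1.0]})

# Option 1: the pandas plotting accessor; column names select the axes
ax_pandas = df.plot.scatter(x="X", y="Y", title="pandas plot.scatter")

# Option 2: matplotlib directly; the columns are passed as sequences
plt.figure()
points = plt.scatter(df["X"], df["Y"])
plt.title("matplotlib.pyplot.scatter")
```

Both calls draw through matplotlib, so the choice is mostly about convenience: the pandas version works with column names, while the matplotlib version gives direct control over each call.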
Scatter plots are commonly used to identify correlational relationships. Their main use is to observe and show the relationship between two numeric variables: the dots report not only the values of individual data points but also the patterns that emerge when the data is taken as a whole. They work best when we have two variables that pair well together, since plotting them against each other is a good way to see whether the correlation is positive or negative. The disadvantage of a scatter plot is that it does not give a quantitative measure of the relationship between the two variables, so it cannot show the precise extent of the association.
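Because the plot itself gives no number for the strength of the relationship, one common complement is a correlation coefficient. The sketch below uses the pandas Series.corr method, which computes the Pearson coefficient by default. The data is invented for illustration and is not part of the original solution.

```python
import pandas as pd

# Made-up, roughly linear data, for illustration only
df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5],
    "Y": [2.1, 3.9, 6.2, 8.1, 9.8],
})

# Pearson correlation: near +1 for a strong positive relationship,
# near -1 for a strong negative one, near 0 for no linear relationship
r = df["X"].corr(df["Y"])
print(f"Pearson r = {r:.3f}")
```

A value close to +1 or -1 confirms what a tight upward or downward band of dots suggests visually.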
Here is an example of how you can create a scatter plot using pandas in Python:
Fig: Preview of the output that you will get when you run this code from your IDE.
In this solution, we're using the pandas library.
import pandas as pd
# matplotlib is used to draw the scatter plot
from matplotlib import pyplot as plt

data = [
    (-0.76, -0.66, -1),
    ( 0.07,  0.59,  1),
    ( 0.73,  0.60, -1),
    ( 0.58, -0.46, -1),
    (-0.71,  0.90, -1),
    (-0.82, -0.13, -1),
    ( 0.20,  0.43,  1),
    (-0.72, -0.93, -1),
    ( 0.80, -0.79, -1),
    (-0.35, -0.52, -1),
    ( 0.53,  0.72,  1),
    (-0.70,  0.96, -1),
]
df = pd.DataFrame(data, columns=["X", "Y", "Sign"])

x = df['X']         # Values for the x-axis
y = df['Y']         # Values for the y-axis
signs = df['Sign']  # Values that select the marker for each point

for i in range(len(x)):
    plt.scatter(
        x[i], y[i],                                # X, Y coordinates
        s=100,                                     # Size of the markers
        linewidth=3,                               # Line width
        marker="+" if signs[i] > 0 else "_",       # Controls whether the marker is a '+' or a '-'
        color="green" if signs[i] > 0 else "red",  # Controls the color based on the sign
    )

plt.show()  # Display the figure
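The example above calls plt.scatter once per row. A possible alternative, sketched here and not part of the original solution, filters the DataFrame by sign and calls scatter once per group. It uses only a few rows of the example data; the variable names and the legend labels are assumptions made for this sketch, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# A few rows taken from the example data
df = pd.DataFrame(
    [(-0.76, -0.66, -1), (0.07, 0.59, 1), (0.73, 0.60, -1), (0.20, 0.43, 1)],
    columns=["X", "Y", "Sign"],
)

# Split the rows by sign with boolean masks
positive = df[df["Sign"] > 0]
negative = df[df["Sign"] <= 0]

fig, ax = plt.subplots()
# One scatter call per group instead of one call per point
pos_points = ax.scatter(positive["X"], positive["Y"], s=100, linewidths=3,
                        marker="+", color="green", label="+")
neg_points = ax.scatter(negative["X"], negative["Y"], s=100, linewidths=3,
                        marker="_", color="red", label="-")
ax.legend()
```

Grouping keeps the number of drawing calls fixed at two regardless of how many rows the DataFrame has, and it makes a legend straightforward to add.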
Follow these steps to get the output:
- Install pandas and matplotlib in your development environment (use any IDE you like).
- Copy the snippet using the 'copy' button and paste it into your IDE.
- Add the required dependencies and import them in the Python file.
- Run the file to generate the output.
I hope you found this useful. I have added links to the dependent libraries and version information in the following sections.
I tested this solution in the following versions. Be mindful of changes when working with other versions.
- The solution was created in PyCharm 2021.3.
- The solution was tested on Python 3.9.7.
- The pandas version is 1.5.2.
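To compare your own environment with the versions listed above, a short check like the following may help. It is only a convenience sketch that reads the version attributes of the installed packages.

```python
import sys

import matplotlib
import pandas as pd

# Print the versions that are active in the current environment
print("Python:", sys.version.split()[0])
print("pandas:", pd.__version__)
print("matplotlib:", matplotlib.__version__)
```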
Using this solution, we can create a scatter plot using pandas in Python in a few simple steps. The process also gives us an easy, hassle-free way to build a hands-on working version of the code.