Scatter Plot Matplotlib

In this class, we discuss scatter plots in Matplotlib.

The reader should have prior knowledge of bar and line charts.

A scatter plot is used to identify the relationship between two variables.

For each sub-category, we find the total sales and the total profit, and we use these two values to construct a scatter plot.

The superstore data set was discussed in our previous classes.

The code to get the sales and profit of each sub-category is given below.

import pandas as pd
df=pd.read_excel('sampledata.xls',sheet_name='Orders')
print(df.head())

temp=pd.DataFrame(df.groupby(['Sub-Category']).agg({'Sales':'sum','Profit':'sum'}))
print(temp)

x=temp['Sales'].values
y=temp['Profit'].values
print(x)
print(y)

We do not explain the above code here because it was covered in our previous classes.
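
If the superstore workbook is not at hand, the same aggregation can be tried on a small made-up table. The sketch below is only an illustration: the rows and their values are hypothetical, and only the column names Sub-Category, Sales, and Profit are taken from the code above.

```python
import pandas as pd

# Hypothetical rows standing in for the 'Orders' sheet; the values are invented.
orders = pd.DataFrame({
    "Sub-Category": ["Chairs", "Chairs", "Phones", "Phones", "Tables"],
    "Sales": [200.0, 300.0, 500.0, 250.0, 400.0],
    "Profit": [20.0, 30.0, 100.0, 50.0, -60.0],
})

# Sum sales and profit per sub-category, as in the code above.
summary = orders.groupby("Sub-Category").agg({"Sales": "sum", "Profit": "sum"})

sales = summary["Sales"].values    # x values for a scatter plot
profit = summary["Profit"].values  # y values for a scatter plot
print(summary)
```

Each sub-category ends up as one row of the summary, so it becomes one point in the scatter plot.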

Scatter Plot

The scatter plot code is provided below.

# simple scatter plot
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5))
plt.scatter(x, y, color ='blue')
plt.show()
Figure 1: Scatter Plot

We use the scatter function from matplotlib.pyplot.

The variable x holds the sales values.

The variable y holds the profit values.

We use the color parameter to set the color of the points.

By looking at the plot, we get an intuition of, for example, how many sub-categories have a profit of less than ten thousand.
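
The count that the plot suggests visually can also be computed directly. A minimal sketch, assuming a hypothetical array of profit values rather than the real superstore totals:

```python
import numpy as np

# Hypothetical profit totals; the real values would come from temp['Profit'].
profit = np.array([-1500.0, 3200.0, 8700.0, 12500.0, 26000.0, 41000.0])

# A boolean mask marks the entries below ten thousand; summing it counts them.
below_10k = int((profit < 10000).sum())
print(below_10k)
```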

By default, the points are drawn as circles. We can change the markers.

The concept of markers is explained in the line charts class.

Changing the Marker

In the program below, the marker is changed to a triangle and the face color of the figure is changed to green.

# Changing the Marker 
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
Figure 2: Changing the Marker

We use the parameter s to set the size of the marker. A larger value draws a larger marker.

# Changing the Marker size
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
Figure 3: Changing Marker Size
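
To see what s controls, the sketch below draws the same points twice with different sizes. In matplotlib, s is the marker area in points squared, so s=400 gives markers about twice as wide as s=100. The data values and variable names are made up for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4]
ys = [10, 20, 15, 30]

# Two axes side by side so the sizes can be compared directly.
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
small = left.scatter(xs, ys, color="blue", marker="^", s=100)
large = right.scatter(xs, ys, color="blue", marker="^", s=400)
left.set_title("s=100")
right.set_title("s=400")
```

Calling plt.show() afterwards displays the two plots.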

Color to Axes

We can use the axes function to create an axes object and then use its methods to change the color of the axes.

For a detailed understanding of axes, see the figure and subplot class.

To change the color of the axes, we use the set_facecolor method.

# Giving Color to axes
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
Figure 4: Add Color to Axes
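
The same plot can also be drawn through the axes object alone, since the object returned by plt.axes has its own scatter, set_xlabel, set_ylabel, and set_title methods. A minimal sketch with made-up data; the variable names xs and ys are hypothetical:

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4]
ys = [10, 20, 15, 30]

fig = plt.figure(figsize=(18, 5), facecolor="green")  # figure background
ax = plt.axes()
ax.set_facecolor("yellow")                            # plotting area background
ax.scatter(xs, ys, color="blue", marker="^", s=100)
ax.set_xlabel("Total Sale of Sub Category")
ax.set_ylabel("Total Profit of Sub Category")
ax.set_title("Scatter Plot Sales and Profit")

# The colors can be read back as RGBA tuples.
print(ax.get_facecolor(), to_rgba("yellow"))
```

Keeping every call on ax makes it clear which axes is being changed when a figure holds more than one.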

Different Categories

We can split the points into different categories when we draw scatter plots.

Profit below ten thousand is taken as one category.

Profit between ten thousand and twenty thousand is taken as a second category.

Profit above twenty thousand is taken as a third category.

We can assign different markers and colors to different categories.

Given below is the code to show different categories.

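# different categories
# Profit below 10000
temp1=temp.loc[temp['Profit']<10000]
x1=temp1['Sales'].values
y1=temp1['Profit'].values

# Profit from 10000 up to, but not including, 20000
# (>= keeps a value of exactly 10000 from being dropped)
temp2=temp.loc[(temp['Profit']>=10000) & (temp['Profit']<20000)]
x2=temp2['Sales'].values
y2=temp2['Profit'].values

# Profit of 20000 and above
temp3=temp.loc[temp['Profit']>=20000]
x3=temp3['Sales'].values
y3=temp3['Profit'].values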

# Different colors to different categories
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x1, y1, color ='blue',marker='^',s=100)
plt.scatter(x2, y2, color ='black',marker='^',s=100)
plt.scatter(x3, y3, color ='red',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
Figure 5: Different Categories
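
When the number of categories grows, repeating the filter and the scatter call for each one becomes tedious. One possible arrangement, sketched here with hypothetical sales and profit values, keeps the categories in a list and adds a legend so the colors can be read off the plot. The label texts and variable names are assumptions, not part of the program above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical totals per sub-category, for illustration only.
sales = np.array([50000, 90000, 120000, 200000, 310000, 330000])
profit = np.array([-2000, 6000, 9500, 15000, 24000, 45000])

# One (label, color, boolean mask) entry per profit band.
bands = [
    ("below 10,000", "blue", profit < 10000),
    ("10,000 to 20,000", "black", (profit >= 10000) & (profit < 20000)),
    ("20,000 and above", "red", profit >= 20000),
]

fig, ax = plt.subplots(figsize=(18, 5))
for label, color, mask in bands:
    # Each call draws only the points selected by the mask.
    ax.scatter(sales[mask], profit[mask], color=color, marker="^", s=100, label=label)
ax.legend()

# Number of points that fall in each band.
counts = [int(mask.sum()) for _, _, mask in bands]
print(counts)
```

Because the masks cover every profit value exactly once, the counts add up to the number of points.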

Dealing with Cluttered Scatter Plots

Suppose we have a cluttered scatter plot in which points from two categories overlap.

If two points are at exactly the same location on the coordinate axes, one of them hides the other.

In such situations, we can use the alpha parameter to keep both points visible.

In the example program given below, a circle marker is used for one category and a triangle marker for the other.

The circle marker is made mostly transparent so that a triangle marker drawn at the same location remains visible.

The alpha parameter controls how transparent or opaque a marker is. An alpha value near zero is transparent, and an alpha value near one is opaque.

The code to use the alpha parameter is shown below.

# dealing Clumsy scatter plots
import matplotlib.pyplot as plt
x=[1,2,3,4,5,6,7]
y1=[10,20,30,40,50,60,70]
y2=[10,15,25,40,50,55,65]
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x, y1, color ='red',marker='o',s=250,alpha=0.3)
plt.scatter(x, y2, color ='black',marker='^',s=100,alpha=0.99)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
Figure 6: Dealing with Cluttered Data
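
To check which points in the program above actually coincide, the two y lists can be compared position by position. A small sketch using the same lists; the name overlapping is introduced here for illustration:

```python
x = [1, 2, 3, 4, 5, 6, 7]
y1 = [10, 20, 30, 40, 50, 60, 70]
y2 = [10, 15, 25, 40, 50, 55, 65]

# Keep the x positions where both categories have the same y value.
overlapping = [xi for xi, a, b in zip(x, y1, y2) if a == b]
print(overlapping)
```

At these x positions the triangle sits exactly on top of the circle, which is where the transparency matters most.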