Scatter Plot Matplotlib
In this class, we discuss scatter plots in Matplotlib.
The reader should have prior knowledge of bar and line charts.
A scatter plot is used to identify the relationship between two variables.
We find the total sale and profit of each sub-category and use these two values to construct a scatter plot.
The superstore data set was discussed in our previous classes.
The code to get the sale and profit of each sub-category is given below.
import pandas as pd
df=pd.read_excel('sampledata.xls',sheet_name='Orders')
print(df.head())
temp=pd.DataFrame(df.groupby(['Sub-Category']).agg({'Sales':'sum','Profit':'sum'}))
print(temp)
x=temp['Sales'].values
y=temp['Profit'].values
print(x)
print(y)
We do not explain the above code here because it was covered in our previous classes.
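For readers who do not have the spreadsheet at hand, the grouping step can be tried on a tiny hand-made table. This is only a sketch: the orders DataFrame and its values are made up for illustration and are not taken from the superstore data set.

```python
import pandas as pd

# Hypothetical stand-in for the 'Orders' sheet; the values are made up.
orders = pd.DataFrame({
    "Sub-Category": ["Chairs", "Phones", "Chairs", "Tables", "Phones"],
    "Sales": [100.0, 250.0, 300.0, 400.0, 50.0],
    "Profit": [20.0, 60.0, -10.0, -80.0, 15.0],
})

# One row per sub-category, with Sales and Profit summed inside each group.
totals = orders.groupby("Sub-Category").agg({"Sales": "sum", "Profit": "sum"})

# Plain NumPy arrays, ready to pass to a plotting function.
sales = totals["Sales"].values
profit = totals["Profit"].values
print(totals)
```

Here groupby followed by agg already returns a DataFrame, so the sketch does not wrap the result in pd.DataFrame again.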
Scatter Plot
The scatter plot code is provided below.
# simple scatter plot
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5))
plt.scatter(x, y, color ='blue')
plt.show()
We use the scatter function from matplotlib.pyplot.
The variable x holds the sale values.
The variable y holds the profit values.
We use the parameter color to assign the color to the points.
By looking at the plot, we get an intuition for, say, how many sub-categories earn a profit of less than ten thousand.
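The impression read from the plot can be checked with a quick count. The sketch below assumes a small made-up array of profit totals (not the superstore values) and counts how many fall under a threshold.

```python
import numpy as np

# Hypothetical profit totals, one per sub-category; the values are made up.
profit = np.array([-3500.0, 4200.0, 9800.0, 15000.0, 31000.0])

# A boolean comparison gives True/False per entry; summing counts the True values.
below_zero = int((profit < 0).sum())
below_10k = int((profit < 10_000).sum())

print("Sub-categories with a loss:", below_zero)
print("Sub-categories with profit under ten thousand:", below_10k)
```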
By default, the points are drawn as circles. We can change the markers.
The concept of markers was explained in the class on line charts.
Changing the Marker
In the program below, the marker is changed to a triangle and the face color of the figure is changed to green.
# Changing the Marker
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
We use the parameter s to set the size of the marker. The value is the marker area in points squared, so a larger s gives a larger marker.
# Changing the Marker size
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
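The s parameter also accepts one value per point, which lets the marker size carry information. The sketch below uses made-up totals and scales the marker area with the sale value; the names sales, profit and sizes, and the choice of 400 as the largest area, are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical totals; the values are made up for illustration.
sales = np.array([1000.0, 4000.0, 9000.0])
profit = np.array([100.0, -200.0, 900.0])

# s is the marker area in points squared; the largest sale gets an area of 400.
sizes = 400 * sales / sales.max()

fig = plt.figure(figsize=(8, 4))
points = plt.scatter(sales, profit, color="blue", marker="^", s=sizes)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
# Call plt.show() here to display the figure.
```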
Color to Axes
We can use the axes function to create an axes object and use its methods to change the color of the axes.
For a detailed understanding of axes, see the class on figure and subplot.
To change the background color of the axes, we use set_facecolor.
# Giving Color to axes
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x, y, color ='blue',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
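The same figure can also be built by calling methods on the axes object directly. The sketch below is an alternative style, not the approach used in this class; it assumes small made-up x and y lists and uses plt.subplots to create the figure and the axes in one call.

```python
import matplotlib.pyplot as plt

# Hypothetical values standing in for the sale and profit totals.
x = [1000, 4000, 9000]
y = [100, -200, 900]

# plt.subplots returns the figure and a single axes object together.
fig, ax = plt.subplots(figsize=(18, 5), facecolor="green")
ax.set_facecolor("yellow")  # background of the plotting area only

ax.scatter(x, y, color="blue", marker="^", s=100)
ax.set_xlabel("Total Sale of Sub Category")
ax.set_ylabel("Total Profit of Sub Category")
ax.set_title("Scatter Plot Sales and Profit")
# Call plt.show() here to display the figure.
```

The figure face color fills the area around the plot, while the axes face color fills the area behind the points.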
Different Categories
We can split the points into different categories when we draw scatter plots.
A profit of less than ten thousand is considered one category.
A profit between ten thousand and twenty thousand is a second category.
A profit above twenty thousand is taken as a third category.
We can assign different markers and colors to different categories.
Given below is the code to show different categories.
# different categories
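# Category 1: profit below 10000
temp1=temp.loc[temp['Profit']<10000]
x1=temp1['Sales'].values
y1=temp1['Profit'].values
# Category 2: profit from 10000 up to (not including) 20000
temp2=temp.loc[(temp['Profit']>=10000) & (temp['Profit']<20000)]
x2=temp2['Sales'].values
y2=temp2['Profit'].values
# Category 3: profit of 20000 or more (>= so boundary values are not dropped)
temp3=temp.loc[(temp['Profit']>=20000)]
x3=temp3['Sales'].values
y3=temp3['Profit'].values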
# Different colors to different categories
import matplotlib.pyplot as plt
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x1, y1, color ='blue',marker='^',s=100)
plt.scatter(x2, y2, color ='black',marker='^',s=100)
plt.scatter(x3, y3, color ='red',marker='^',s=100)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
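When there are several categories, a legend tells the reader which color belongs to which one. The sketch below is one possible way to do this, assuming a small made-up table: pd.cut assigns each row to a profit band, and each band is drawn with its own color and label. The band names and the colors mapping are illustrative choices.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical totals; the values are made up for illustration.
totals = pd.DataFrame({
    "Sales": [20000.0, 45000.0, 90000.0, 120000.0, 300000.0],
    "Profit": [-3000.0, 5000.0, 10000.0, 15000.0, 25000.0],
})

# right=False makes each band include its lower edge: [10000, 20000), and so on.
bins = [-np.inf, 10_000, 20_000, np.inf]
labels = ["below 10k", "10k to 20k", "20k and above"]
totals["Band"] = pd.cut(totals["Profit"], bins=bins, labels=labels, right=False)

colors = {"below 10k": "blue", "10k to 20k": "black", "20k and above": "red"}

fig, ax = plt.subplots(figsize=(8, 4))
for label in labels:
    group = totals[totals["Band"] == label]
    # The label argument is what the legend displays for this set of points.
    ax.scatter(group["Sales"], group["Profit"],
               color=colors[label], marker="^", s=100, label=label)
ax.set_xlabel("Total Sale of Sub Category")
ax.set_ylabel("Total Profit of Sub Category")
ax.legend(title="Profit band")
# Call plt.show() here to display the figure.
```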
Dealing with Cluttered Scatter Plots
Suppose we have a cluttered scatter plot in which points overlap.
And we are dealing with two categories.
If two points from different categories fall at exactly the same location on the axes, one of them hides the other.
In such situations, we can use the alpha parameter to keep both points visible.
In the example program given below, one category uses the circle marker.
The triangle marker is chosen for the other category.
The circle marker is made semi-transparent so that the triangle marker remains visible.
The alpha parameter controls how transparent or opaque a marker is.
An alpha value near zero is transparent, and an alpha value near one is opaque.
The code to use the alpha parameter is shown below.
# dealing Clumsy scatter plots
import matplotlib.pyplot as plt
x=[1,2,3,4,5,6,7]
y1=[10,20,30,40,50,60,70]
y2=[10,15,25,40,50,55,65]
z=plt.figure(num=1,figsize=(18,5),facecolor ="green")
ax=plt.axes()
ax.set_facecolor("yellow")
plt.scatter(x, y1, color ='red',marker='o',s=250,alpha=0.3)
plt.scatter(x, y2, color ='black',marker='^',s=100,alpha=0.99)
plt.xlabel("Total Sale of Sub Category")
plt.ylabel("Total Profit of Sub Category")
plt.title("Scatter Plot Sales and Profit")
plt.show()
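To see which points actually coincide, the two series can be compared before plotting. The sketch below reuses the x, y1 and y2 values from the program above; the overlap array and the variable names circles and triangles are additions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5, 6, 7])
y1 = np.array([10, 20, 30, 40, 50, 60, 70])
y2 = np.array([10, 15, 25, 40, 50, 55, 65])

# Both series share the same x values, so two points coincide where y1 equals y2.
overlap = y1 == y2
print("x positions where the points coincide:", x[overlap].tolist())

fig, ax = plt.subplots(figsize=(8, 4))
# Large, semi-transparent circles behind small, nearly opaque triangles.
circles = ax.scatter(x, y1, color="red", marker="o", s=250, alpha=0.3)
triangles = ax.scatter(x, y2, color="black", marker="^", s=100, alpha=0.99)
# Call plt.show() here to display the figure.
```

With these values, the points coincide at x = 1, 4 and 5, which is where the triangle is drawn inside the faded circle.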