How to Superimpose Scatter Plots Using Matplotlib?

Matplotlib is one of the most widely used Python libraries for data visualization. It offers many plot types, such as line plots, bar graphs, and pie charts, which makes it a popular tool among data scientists and analysts.

One such plot is the scatter plot. A scatter plot uses dots to show the values of two numerical variables and is mainly used to observe the relationship between them.

In this article, we are going to superimpose scatter plots.

Superimposing scatter plots may sound difficult, but the approach is simple, and this article walks through it step by step.

Scatter Plots in Matplotlib

Scatter plots are among the plots that data scientists and analysts use most often. They help determine the relationship between the plotted variables, that is, how one variable changes when the other changes, and they also reveal trends in the data points.

The matplotlib library provides a dedicated function for scatter plots. The scatter function draws a scatter plot and accepts various customization options, such as the color of the dots, the size of the dots, and the marker shape.

The syntax is given below:

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)
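As a rough sketch of how a few of these parameters are passed, the snippet below sets the marker size (s), color (c), shape (marker), transparency (alpha), and outline color (edgecolors). The sizes list is a made-up illustration, not data from this article, and the sketch uses the Axes-level scatter method, which accepts the same parameters as plt.scatter.

```python
import matplotlib.pyplot as plt

x = [2, 2.5, 3, 2.8, 2.2, 2.4]
y = [3, 3.5, 4, 3.8, 3.2, 3.4]
# Hypothetical marker areas (in points squared), one per data point
sizes = [20, 40, 60, 80, 100, 120]

fig, ax = plt.subplots()
# s: marker area, c: fill color, marker: shape,
# alpha: transparency, edgecolors: outline color
points = ax.scatter(x, y, s=sizes, c='green', marker='^',
                    alpha=0.6, edgecolors='black')
# plt.show()  # uncomment to display the figure interactively
```

Passing a list to s gives each point its own size, while a single number would apply the same size to every point.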

Let us see a complete example of a scatter plot to understand it further.

import matplotlib.pyplot as plt
# Coordinates of the six points to plot
x = [2, 2.5, 3, 2.8, 2.2, 2.4]
y = [3, 3.5, 4, 3.8, 3.2, 3.4]
# Red markers with a black outline
plt.scatter(x, y, c='red', edgecolor='black')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Example Scatter Plot')
plt.show()

In the first line, we import the matplotlib.pyplot module under its conventional alias, plt.

The x and y lists hold the coordinates of the data points we are going to plot.

The plt.scatter function plots the data points. It takes x and y, the coordinates of the points, and the parameter c sets the color of the markers. The edgecolor parameter sets the color of each marker's outline; in this example, it is set to black.

The axis labels are set using plt.xlabel and plt.ylabel, and the title of the graph is set using plt.title.

Finally, the show function renders the figure and displays it.

The output is shown below.

[Figure: Example Scatter Plot]

Superimpose Scatter Plots Using Matplotlib

Superimposing multiple scatter plots with matplotlib simply means plotting several datasets on the same axes, so that their points share one graph and may overlap.

The datasets only visibly overlap where their values are close to each other, but they do not need any common points to be drawn together. Plotting them works just as in the previous example: each call to scatter adds another set of points to the current axes.
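To make the idea concrete, here is a minimal sketch with two tiny made-up datasets (not taken from this article). It uses an explicit Axes object so that we can check that both scatter calls end up as separate layers on the same axes.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Each scatter call returns a collection of points drawn on the same axes
first = ax.scatter([1, 2, 3], [1, 4, 9], color='b')
second = ax.scatter([1, 2, 3], [3, 3, 3], color='g')

# The axes now holds one collection per scatter call
n_layers = len(ax.collections)
print(n_layers)
# plt.show()  # uncomment to display the figure interactively
```

The two datasets share no point at all, yet they are still superimposed, because both are drawn on the same axes.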

Let us start off with a simple example.

Superimposing Two Scatter Plots

We are going to take two datasets and plot them on the same axes. To make the points overlap, we generate 100 evenly spaced X values and derive both datasets from them by adding random noise, so that the two sets of points lie close to each other over part of the range.

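import numpy as np
import matplotlib.pyplot as plt
# 100 evenly spaced values from 0 to 5
X = np.linspace(0, 5, 100)
# X plus uniform random noise in [0, 2)
Y1 = X + 2 * np.random.random(X.shape)
# X squared plus uniform random noise in [0, 1)
Y2 = X**2 + np.random.random(X.shape)
# Both calls draw on the same axes, so the plots are superimposed
plt.scatter(X, Y1, color='b')
plt.scatter(X, Y2, color='g')
plt.show()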

The numpy library and the matplotlib plotting interface are imported under the aliases np and plt.

The X variable consists of 100 evenly spaced numbers from 0 to 5, both endpoints included.

Y1 and Y2 are derived from X with random noise. Y1 is each value of X plus a random number between 0 and 2. Y2 is the square of each value of X plus a random number between 0 and 1.

Y1 and Y2 are plotted against X in blue and green, respectively, and the figure is displayed using the show function.
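When two plots overlap, a legend and some transparency can make them easier to read. The following is one possible variation of the example, not part of the original code: the fixed seed, the alpha value, and the label texts are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fixed seed (an assumption) so that the random noise is repeatable
rng = np.random.default_rng(0)
X = np.linspace(0, 5, 100)
Y1 = X + 2 * rng.random(X.shape)
Y2 = X**2 + rng.random(X.shape)

fig, ax = plt.subplots()
# alpha < 1 lets points underneath show through where the plots overlap
ax.scatter(X, Y1, color='b', alpha=0.6, label='Y1 = X + noise')
ax.scatter(X, Y2, color='g', alpha=0.6, label='Y2 = X**2 + noise')
# The legend picks up the label passed to each scatter call
legend = ax.legend()
# plt.show()  # uncomment to display the figure interactively
```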

The output is given below.

[Figure: Superimpose Two Scatter Plots]

The same example can be used to generate a different plot with different colors and markers.

import numpy as np
import matplotlib.pyplot as plt
X = np.linspace(0, 5, 100)
Y1 = X + 2 * np.random.random(X.shape)
Y2 = X**2 + np.random.random(X.shape)
# Red circles and blue thin diamonds, both outlined in black
plt.scatter(X, Y1, marker='o', c='red', edgecolor='black')
plt.scatter(X, Y2, marker='d', c='blue', edgecolor='black')
plt.show()

In this code, we only changed the colors and shapes of the markers. We also added edgecolor to outline each marker, which makes overlapping points easier to tell apart.
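Where the two plots overlap, one set of markers is drawn on top of the other. As a hedged sketch, the zorder parameter can be used to choose which dataset ends up in front. The specific zorder values and the fixed seed below are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
X = np.linspace(0, 5, 100)
Y1 = X + 2 * rng.random(X.shape)
Y2 = X**2 + rng.random(X.shape)

fig, ax = plt.subplots()
# Artists with a higher zorder are drawn later, so they appear on top
front = ax.scatter(X, Y1, marker='o', c='red', edgecolor='black', zorder=3)
back = ax.scatter(X, Y2, marker='d', c='blue', edgecolor='black', zorder=2)
# plt.show()  # uncomment to display the figure interactively
```

Here the red circles are drawn in front of the blue diamonds even though they were added first.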

[Figure: Superimpose Two Scatter Plots With Edgecolor]

Superimposing Three Scatter Plots

We are going to follow the same procedure and include one more dataset that overlaps the other two over part of the range.

import numpy as np
import matplotlib.pyplot as plt
X = np.linspace(0, 5, 100)
Y1 = X + 2 * np.random.random(X.shape)
Y2 = X**2 + np.random.random(X.shape)
# Constant 3 plus uniform random noise in [0, 1)
Y3 = 3 + np.random.random(X.shape)
plt.scatter(X, Y1, color='b', marker='d')
plt.scatter(X, Y2, color='g', marker='^')
plt.scatter(X, Y3, color='r', marker='x')
plt.show()

X, Y1, and Y2 are the same as in the previous example. The third dataset, Y3, does not depend on the values of X: it is the constant 3 plus a random number between 0 and 1, so its points form a horizontal band between 3 and 4.

We also changed the marker shape for each dataset. The d marker gives thin diamond-shaped points, ^ gives upward-pointing triangles, and x gives crosses.
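With three or more datasets, repeating the scatter call by hand gets tedious. One possible way to organize it, sketched below, is to keep each dataset together with its color and marker in a dictionary and loop over it. The dictionary layout, the labels, and the fixed seed are assumptions made for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)  # fixed seed, chosen only for repeatability
X = np.linspace(0, 5, 100)

# Hypothetical layout: label -> (y values, color, marker)
datasets = {
    'Y1': (X + 2 * rng.random(X.shape), 'b', 'd'),
    'Y2': (X**2 + rng.random(X.shape), 'g', '^'),
    'Y3': (3 + rng.random(X.shape), 'r', 'x'),
}

fig, ax = plt.subplots()
# One scatter call per dataset, all on the same axes
for name, (values, color, marker) in datasets.items():
    ax.scatter(X, values, color=color, marker=marker, label=name)
ax.legend()

labels = [collection.get_label() for collection in ax.collections]
print(labels)
# plt.show()  # uncomment to display the figure interactively
```

Adding a fourth dataset then only requires one more dictionary entry.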

[Figure: Superimpose Three Scatter Plots]

Conclusion

Let us recap what we have covered. Matplotlib is a powerful library for data visualization that supports many types of plots, such as line graphs, bar graphs, histograms, scatter plots, and even 3D visualization.

We learned how to create scatter plots with the matplotlib library using plt.scatter, and we saw an example of it.

Next, we learned how to superimpose scatter plots. Superimposing multiple plots simply means plotting several datasets on the same axes, where their points may overlap.

We saw how to superimpose two scatter plots with different markers and colors, and then how to overlay three scatter plots.

References

You can learn more about scatter plots in the official Matplotlib documentation

Stack Overflow answer chain on the same topic