Learning data visualization is crucial if you intend to build a career in fields such as data science, machine learning engineering, or data analysis, to name a few. Data visualization can be done with software such as Tableau and Power BI, but in this article we will use the Python programming language with the matplotlib library.
Python is used for almost everything, from artificial intelligence and computer vision to desktop applications, web applications, and data visualization. It has many libraries for data visualization, and one of the most widely used is matplotlib.
Matplotlib is a Python library for creating data visualizations and 2D graphs. This guide will help you install the library and start using it as a beginner, so let’s get started.
Installing The Environment
Before using this library, you first need an environment in which to run the code and see the visualizations, such as an IDE (integrated development environment). There are many options, such as PyCharm or the tools bundled with the Anaconda distribution, but we will use Jupyter Notebook in this example, so make sure to download and install it before starting this tutorial.
Installing The Libraries
Before making any visualization, you also need to install the matplotlib library. This can be done with pip, Python’s package installer. The leading ! tells Jupyter Notebook to run the line as a shell command.
!pip3 install matplotlib #Installing the library
Once you see the message confirming a successful installation, you are ready to start visualizing data.
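One quick way to confirm the installation, as a minimal sketch, is to import the package and print its version. The version number you see depends on your own installation.

```python
import matplotlib

# If this import succeeds, the library is available to the current kernel
print(matplotlib.__version__)
```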
Plotting The Data
Matplotlib is a huge library with many sub-packages, and the most widely used one is pyplot. Let’s start this tutorial by importing it and creating a simple plot.
import matplotlib.pyplot as plt  # Importing the library
plt.plot([13, 15, 17, 19])  # Creating a simple line plot
plt.show()  # Showing the plots
The picture above shows a simple line plot created with the plot() function from matplotlib.pyplot. We first imported the library under the alias plt, so we can type plt instead of matplotlib.pyplot every time. Finally, the plt.show() function displays the plot in the notebook.
The graph used our values for the Y-axis. Because we gave no X values, matplotlib filled in the positions 0, 1, 2, 3 on the X-axis. We can also supply the values for both axes ourselves. Let’s see how:
import matplotlib.pyplot as plt  # Importing the library
plt.plot([13, 15, 17, 19], [2, 4, 6, 8])  # Creating a simple line plot
plt.show()  # Showing the plots
When plt.plot() receives two lists, it uses the first one, [13, 15, 17, 19], as the X-values and the second one, [2, 4, 6, 8], as the Y-values. Note that the original values have moved from the Y-axis to the X-axis.
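To check which list ends up on which axis, one option is to inspect the line object that plt.plot() returns. The sketch below is an illustration added here, not part of the original example, and the variable names line_y_only and line_xy are chosen for this sketch. With a single list, the X positions come out as 0, 1, 2, 3.

```python
import matplotlib.pyplot as plt

# One list only: the values become the Y data
(line_y_only,) = plt.plot([13, 15, 17, 19])
print([float(v) for v in line_y_only.get_xdata()])  # X positions filled in by matplotlib
print([float(v) for v in line_y_only.get_ydata()])  # the values we passed

# Two lists: the first becomes X, the second becomes Y
(line_xy,) = plt.plot([13, 15, 17, 19], [2, 4, 6, 8])
print([float(v) for v in line_xy.get_xdata()])
print([float(v) for v in line_xy.get_ydata()])
```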
You may have noticed that the graph has neither an X-label nor a Y-label. You should add both, because they explain what the data in the graph represents.
This takes two new lines of code, plt.xlabel() and plt.ylabel(), which set the title of the X-axis and the Y-axis respectively.
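As a sketch of what those two calls look like in context, assuming the same data as the previous line plot and the label text used in the later examples:

```python
import matplotlib.pyplot as plt

plt.plot([13, 15, 17, 19], [2, 4, 6, 8])  # same data as the previous line plot
plt.xlabel("X-Label")  # title for the X-axis
plt.ylabel("Y-Label")  # title for the Y-axis
# Call plt.show() afterwards to display the figure, as in the earlier examples
```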
Matplotlib can draw many kinds of graphs, and one of them is the scatter plot. Let’s create a scatter plot together using the matplotlib library:
import matplotlib.pyplot as plt  # Importing the library
plt.scatter([13, 15, 17, 19, 22, 25], [2, 4, 6, 8, 10, 12])  # Creating a simple plot
plt.xlabel("X-Label")  # Naming the X-label
plt.ylabel("Y-Label")  # Naming the Y-label
plt.show()  # Showing the plots
The code is almost the same as in the previous examples, except that we’ve added two more data points and changed the plot function from plt.plot() to plt.scatter(). A scatter plot represents each pair of values as a point with an x-coordinate and a y-coordinate.
Each point is drawn as a small circle by default, but matplotlib lets you choose another shape. Let’s say that we want each point to be a “*”, for instance. For that, pass the marker parameter to the plt.scatter() function:
import matplotlib.pyplot as plt  # Importing the library
plt.scatter([13, 15, 17, 19, 22, 25], [2, 4, 6, 8, 10, 12], marker="*")  # Creating a simple plot
plt.xlabel("X-Label")  # Naming the X-label
plt.ylabel("Y-Label")  # Naming the Y-label
plt.show()  # Showing the plots
The picture above shows “*” instead of circles for each value. Many other markers are available, such as “o” and “x”, to name just two.
Bar plots are among the most widely used charts in data visualization. Fortunately, matplotlib also makes this chart very easy to create:
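To compare several markers side by side, here is a minimal sketch. The marker list and the vertical shift applied to each series are choices made for this illustration. It reuses the scatter data from the example above.

```python
import matplotlib.pyplot as plt

x = [13, 15, 17, 19, 22, 25]
y = [2, 4, 6, 8, 10, 12]
markers = ["o", "x", "*", "s", "^"]  # circle, x, star, square, triangle

series = []
for offset, marker in enumerate(markers):
    # Shift each series upward so the series do not sit on top of each other
    shifted_y = [value + 2 * offset for value in y]
    series.append(plt.scatter(x, shifted_y, marker=marker, label=f"marker={marker!r}"))

plt.xlabel("X-Label")
plt.ylabel("Y-Label")
plt.legend()  # shows which marker belongs to which series
# Call plt.show() afterwards to display the figure, as in the earlier examples
```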
import matplotlib.pyplot as plt  # Importing the library
x = ["A", "B", "C", "D"]  # Creating the X data
y = [3, 5, 7, 9]  # Creating the Y data
plt.bar(x, y)  # Plotting the data
plt.xlabel("X-Label")  # Naming the X-label
plt.ylabel("Y-Label")  # Naming the Y-label
plt.show()  # Showing the plots
First, we created some sample data and stored it in the x and y variables. Then we used the plt.bar() function to create the bar plot shown in the picture above.
This short guide has introduced matplotlib in a simple way and shown how to generate some of the most widely used charts in data visualization. You can take it as the starting point for your data visualization journey.
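To see how each category in x maps to a bar, one can loop over the container that plt.bar() returns, which holds one rectangle per bar. This is a small sketch added for illustration, and the variable name bars is chosen here.

```python
import matplotlib.pyplot as plt

x = ["A", "B", "C", "D"]
y = [3, 5, 7, 9]

bars = plt.bar(x, y)  # returns a container with one rectangle per bar

# Pair each category with the height of the bar drawn for it
for category, bar in zip(x, bars):
    print(category, float(bar.get_height()))
```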