In this tutorial, we will generate an interactive network graph from a pandas data frame.
Without any delay, let's begin!
Loading and Pre-processing Data
This section focuses on loading and pre-processing the dataset. The dataset chosen for this tutorial is the OpenFlights Airport dataset available on Kaggle. As of January 2017, the OpenFlights Airports Database contains data for over 10,000 airports across the globe.
In the code below, we import the pandas module and load the routes.csv file into the program. Of all the columns in the dataset, we only need the source and destination airports.
    import pandas as pd

    df = pd.read_csv('routes.csv')
    df = df[['Source airport', 'Destination airport']]
    df = df[:500]
    df.head()
To keep the processing simple and the computation light, we take only the first 500 rows of the dataset, and we display the first five of them using the head function.
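To make the column selection and row limit concrete, here is a minimal sketch on a small made-up frame. The rows and the extra "Airline" and "Stops" columns are assumptions for illustration only and are not taken from routes.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for routes.csv (not real data).
sample = pd.DataFrame({
    "Airline": ["2B", "2B", "2B", "2G"],
    "Source airport": ["AER", "ASF", "ASF", "BTK"],
    "Destination airport": ["KZN", "KZN", "MRV", "IKT"],
    "Stops": [0, 0, 0, 0],
})

# Keep only the two columns needed to describe a route.
routes = sample[["Source airport", "Destination airport"]]

# Limit the number of rows; head(3) selects the same rows as slicing with [:3].
routes = routes.head(3)

print(routes)
```

The same two steps apply to the full file, with 500 in place of 3.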
We will separate the source and destination nodes into two lists using the Python code below.
    sources = list(df['Source airport'])
    destinations = list(df['Destination airport'])
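The two lists line up by position, so pairing them gives one (source, destination) tuple per route. The sketch below shows this on a small made-up frame; the airport codes are hypothetical sample values, and the names `edges` and `airports` are introduced here only for illustration.

```python
import pandas as pd

# Hypothetical routes, used only to illustrate the list handling.
df = pd.DataFrame({
    "Source airport": ["AER", "ASF", "ASF"],
    "Destination airport": ["KZN", "KZN", "MRV"],
})

sources = df["Source airport"].tolist()
destinations = df["Destination airport"].tolist()

# Each position in the two lists describes one route (one directed edge).
edges = list(zip(sources, destinations))

# Every airport that appears on either end of a route will become a node.
airports = sorted(set(sources) | set(destinations))

print(edges)
print(airports)
```

Note that an airport can appear only as a destination ("KZN" and "MRV" here), which matters when the nodes are created later.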
In the next section, we will generate the network graph using the pyvis library.
Generation of Network Graph
We start by creating an empty graph with net.Network, passing a number of attributes that control how the graph is displayed. The next step is to add a node for each airport, using the airport code as its label and title.
After this, we add the edges using the add_edge function. Both steps are wrapped in exception handling so that a problematic row is skipped instead of stopping the script.
The code is shown below.
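    from pyvis import network as net
    from IPython.display import display, HTML

    # Create an empty directed graph and set its display attributes.
    g_from_data = net.Network(height='600px', width='50%',
                              bgcolor='white', font_color='black',
                              heading='A Networkx Graph from DataFrame',
                              directed=True)

    # Add every airport as a node. Destinations are included as well,
    # because add_edge fails for an endpoint that is not yet a node.
    for airport in sources + destinations:
        try:
            g_from_data.add_node(airport, label=airport, title=airport)
        except AssertionError:
            pass  # skip ids pyvis rejects, such as missing values

    # Add one directed edge per route.
    for src, dst in zip(sources, destinations):
        try:
            g_from_data.add_edge(src, dst)
        except AssertionError:
            pass  # skip routes whose endpoints were not added

    # Show the physics controls, write the HTML file, and display it.
    g_from_data.show_buttons(['physics'])
    g_from_data.show('A_Complete_Networkx_Graph_From_DataFrame.html')
    display(HTML('A_Complete_Networkx_Graph_From_DataFrame.html'))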
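One thing to watch for when adding edges: as far as we know, the add_edge function in pyvis fails when either endpoint has not been added as a node, and a broad try/except hides that silently. The sketch below is one possible way to handle it, not the only one. The helper name `add_routes` is hypothetical, and it assumes only that the graph object offers add_node and add_edge in the way the code above uses them.

```python
def add_routes(graph, sources, destinations):
    """Add airports as nodes and routes as edges; return the skipped routes.

    `graph` is assumed to expose add_node(id, label=..., title=...) and
    add_edge(src, dst), as used in the tutorial code. This helper is a
    hypothetical illustration, not part of pyvis.
    """
    skipped = []
    seen = set()
    for src, dst in zip(sources, destinations):
        # Skip rows with a missing or non-text airport code (e.g. NaN).
        if not (isinstance(src, str) and isinstance(dst, str)):
            skipped.append((src, dst))
            continue
        # Register both ends of the route before connecting them.
        for airport in (src, dst):
            if airport not in seen:
                graph.add_node(airport, label=airport, title=airport)
                seen.add(airport)
        graph.add_edge(src, dst)
    return skipped
```

With pyvis installed, this could be called as `add_routes(g_from_data, sources, destinations)` right after the graph is created, and the returned list shows which routes were left out instead of hiding them.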
Have a look at the generated network graph. Because it is interactive, you can drag nodes around, zoom in and out, and adjust the layout through the physics controls.
I hope you now understand how to generate a network graph from a pandas data frame using the pyvis library in Python. Thank you for reading!