Pandas is a widely used, easy-to-learn, and efficient open-source Python library for loading, cleaning, modifying, and analyzing data. This makes it a common choice when working with large datasets.
Pandas' built-in functions make it easy to read datasets in a variety of formats. The read_csv() function reads CSV (comma-separated values) files and can be configured in many ways. In this post, we look at how to read a CSV file from a given URL using read_csv().
What is the read_csv() function?
read_csv() is one of the built-in functions of the Pandas package. It parses a .csv file into a Pandas DataFrame, which makes the data easy to work with in Python. The parameters passed to this function can be changed in many ways to get the output format you want.
You can also pass the URL of a dataset to this function and access the data directly in your working IDE. A URL (Uniform Resource Locator) is the address of a resource on the web, such as a webpage or a file.
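As a minimal sketch of what read_csv() returns, the snippet below parses a small in-memory CSV instead of a file. The rows are made up for illustration and are not taken from any real dataset.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a real .csv file (illustrative only).
csv_text = "Name,Code,Amount\nAlice,A1,100\nBob,B2,250\n"

# read_csv() accepts a path, a URL, or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(type(df).__name__)  # DataFrame
print(df.shape)           # (2, 3): two rows, three columns
print(df.dtypes)          # the column types pandas inferred
```

The same call works unchanged when the first argument is a URL string, which is what the rest of this article relies on.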
To avoid errors while reading a CSV file from a URL, follow the steps below. Before starting with the implementation, install the Pandas package and import it in your working IDE:
# To install the package
pip install pandas
# To import and rename
import pandas as pd
Once you have imported Pandas, check the installed version, because the method discussed below requires Pandas 0.19.2 or above. To check the version, run one of the following.
# To print the version in a Colab Python notebook
print(pd.__version__)
# To check the version from the terminal
pip show pandas
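If you prefer to check the requirement in code rather than by eye, a small comparison like the one below is one way to do it. This is only a sketch: the helper name version_tuple is hypothetical, and the minimum version (0.19.2) is the one stated in this article.

```python
import re

import pandas as pd

# Minimum version stated in this article.
REQUIRED = (0, 19, 2)


def version_tuple(version):
    """Turn a version string like '1.5.3' into a tuple of ints: (1, 5, 3)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # A missing third component (as in '2.1') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


installed = version_tuple(pd.__version__)
if installed >= REQUIRED:
    print(f"pandas {pd.__version__} is recent enough")
else:
    print(f"pandas {pd.__version__} is too old, please upgrade")
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where "0.9" would sort after "0.19".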
If the installed version is below 0.19.2, upgrade the package. In a Colab Python notebook, packages come pre-installed. To upgrade there, install the version you need by specifying it explicitly, and then restart the runtime from the toolbar so that the upgrade takes effect.
!pip install pandas==[version]
# Runtime -> Restart runtime (or press Ctrl + M, then .)
To upgrade the package using the terminal, run the following line of code:
pip install --upgrade pandas
# If the above line throws an error, try this instead
pip3 install --upgrade pandas
By combining other packages and functions, you can still read CSV files from URLs with an outdated version of Pandas (below 0.19.2). Upgrading is preferred, however, because the method discussed here is more efficient and straightforward.
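For reference, one possible version of that workaround uses only the standard library to download the text and then hands it to Pandas. This is a sketch and not necessarily the approach the article has in mind: the helper name read_csv_via_urllib is hypothetical, and the code assumes the file is UTF-8 encoded.

```python
import io
import urllib.request

import pandas as pd


def read_csv_via_urllib(url, **kwargs):
    """Download the CSV text ourselves, then let pandas parse it."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")  # assumption: UTF-8 content
    # Any extra keyword arguments are passed straight through to read_csv().
    return pd.read_csv(io.StringIO(text), **kwargs)


# Usage (needs network access), with the sample URL from this article:
# df = read_csv_via_urllib(
#     "https://raw.githubusercontent.com/Tanishqa-10/AskPython/main/Sampledata.csv"
# )
```

With a recent Pandas version, none of this is needed, since pd.read_csv(url) does the download for you.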
Implementation of the read_csv() function
We will use the sample CSV file at the following URL to demonstrate the implementation.
# assigning the URL to a variable
url = "https://raw.githubusercontent.com/Tanishqa-10/AskPython/main/Sampledata.csv"
# passing the URL to the function
x = pd.read_csv(url)
print(x)
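Reading from a URL can fail for reasons that have nothing to do with Pandas, such as a mistyped address or a missing network connection. The sketch below wraps the call in basic error handling. The helper name load_csv is hypothetical, and the code assumes a reasonably recent Pandas in which the exception classes live in pd.errors.

```python
from urllib.error import HTTPError, URLError

import pandas as pd


def load_csv(source):
    """Return a DataFrame, or None if the source cannot be fetched or parsed."""
    try:
        return pd.read_csv(source)
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        print(f"Server returned HTTP {err.code} for {source}")
    except URLError as err:
        # The request never completed: bad host name, no network, and so on.
        print(f"Could not reach {source}: {err.reason}")
    except pd.errors.EmptyDataError:
        # The source was read successfully but contained nothing to parse.
        print(f"{source} contains no data")
    except pd.errors.ParserError as err:
        # The content was fetched but is not well-formed CSV.
        print(f"{source} is not valid CSV: {err}")
    return None
```

HTTPError is caught before URLError because it is a subclass of it. A caller would then check the result before using it, for example by calling load_csv(url) and printing the first rows with head() only when the result is not None.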
Customizing the Output
You can also pass additional parameters to read_csv() along with the URL to view the data in a particular way. For example, to view only certain columns, pass their names through the usecols parameter. Take a look at the code below for a better understanding.
# names of the columns to display
columns = ['Name', 'Code', 'Amount']
url = "https://raw.githubusercontent.com/Tanishqa-10/AskPython/main/Sampledata.csv"
print(pd.read_csv(url, usecols=columns))
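To see the effect of usecols without a network connection, the same idea can be tried on in-memory data. In the sketch below, only the column names Name, Code, and Amount come from the article. The City column and all of the values are invented for illustration.

```python
import io

import pandas as pd

# Invented sample data: the City column and every value are made up.
csv_text = (
    "Name,Code,City,Amount\n"
    "Alice,A1,Pune,100\n"
    "Bob,B2,Delhi,250\n"
)

wanted = ["Name", "Code", "Amount"]
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

print(list(df.columns))  # ['Name', 'Code', 'Amount']
print(df)
```

The City column is skipped while parsing, so it never appears in the result. If a name in the list does not exist in the file, read_csv() raises a ValueError, which makes typos in column names easy to spot.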
The read_csv() function provides many other parameters that can be used along with the URL to customize your output. For example, the sep parameter lets you read files that use a delimiter other than a comma.
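A few of these parameters are shown together in the sketch below. The semicolon-delimited data is invented purely to illustrate them. The parameters themselves (sep, nrows, dtype, index_col) are standard read_csv() options and work the same way when the first argument is a URL.

```python
import io

import pandas as pd

# Invented semicolon-delimited data, used only to illustrate the parameters.
csv_text = (
    "Name;Code;Amount\n"
    "Alice;A1;100\n"
    "Bob;B2;250\n"
    "Chen;C3;75\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",                  # the delimiter used in the data
    nrows=2,                  # read only the first two data rows
    dtype={"Amount": float},  # parse Amount as float instead of int
    index_col="Name",         # use the Name column as the row index
)

print(df)
print(df["Amount"].sum())  # 350.0
```

Only the rows for Alice and Bob are read because of nrows=2, so the sum covers 100 and 250.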
The tools that Pandas offers for reading and writing data are among its most essential features. The read_csv() function reads tabular data stored as a CSV file and loads it into memory. In this article, we learned how to use this function to read a CSV file from a URL in Python. Because things can go wrong when working with Python packages, we also looked at the precautions to take (installing Pandas, checking its version, and upgrading it if needed) to avoid errors during the implementation.