In this article, let us see how to create table-like structures in Python and work with their rows and columns. This is useful when building data science applications that have to handle large collections of data. We will cover basic operations such as creating, selecting, adding, and deleting rows and columns.
What is a Data Frame?
Python is widely used for data analytics and processing, so it needs a way to store data in a structured form, like a conventional table with rows and columns. The DataFrame object from the pandas library serves this purpose. A DataFrame holds its data in a two-dimensional form, where every row has an index label and every column has a name. Let us learn more about DataFrame rows and columns in this article.
Creating a simple DataFrame
Let us learn to create a simple DataFrame with an example.
import pandas as pd
data = {
"TotalScore": [420, 380, 390],
"MathScore": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Result
   TotalScore  MathScore
0         420         50
1         380         40
2         390         45
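Once a DataFrame exists, it can help to inspect its structure before working with it. The following is a minimal sketch, assuming the same sample data as above; the variable names `n_rows`, `n_cols`, `column_labels`, and `row_labels` are illustrative and not part of the original example.

```python
import pandas as pd

# Same sample data as in the example above.
df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape

# columns holds the column names; index holds the row labels.
column_labels = list(df.columns)
row_labels = list(df.index)

print(n_rows, n_cols)
print(column_labels)
print(row_labels)
```

Because no index was given when the DataFrame was created, pandas assigns the default integer labels starting at 0.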
Selectively Printing One Dataframe Column
Let us see how to select a desired column in Python. Consider the dataframe created in the example above. We can select a column by its name.
print(df[['MathScore']])
The above code prints only the ‘MathScore’ column, together with the row index.
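The number of brackets matters when selecting a column. The sketch below, which assumes the same sample data as earlier, compares single and double brackets; `math_series` and `math_frame` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# Single brackets with one name return a one-dimensional Series.
math_series = df["MathScore"]

# Double brackets (a list of names) return a DataFrame,
# even when the list contains only one name.
math_frame = df[["MathScore"]]

print(type(math_series).__name__)
print(type(math_frame).__name__)
print(math_series.tolist())
```

The list form is also how several columns are selected at once, for example `df[["TotalScore", "MathScore"]]`.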
Adding Columns to a Dataframe in Python
At times, we might want to add more columns as part of our data gathering. We can add a column to our dataframe by creating a list and assigning it to a new column name. The list must have one value for each row.
# creating a new list called name.
name = ['Rhema', 'Mehreen', 'Nitin']
# Using 'Name' as the column name
# and equating it to the list
df['Name'] = name
# Observe the result
print(df)
Output
   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
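A new column does not have to come from a hand-written list. As a minimal sketch, assuming the same sample data, a column can be computed from existing columns, and `insert()` can place a column at a chosen position. The column name `OtherScore` is a hypothetical one introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# Derive a column from existing ones (element-wise subtraction).
# "OtherScore" is a hypothetical name used for illustration.
df["OtherScore"] = df["TotalScore"] - df["MathScore"]

# insert(position, name, values) adds a column at a given position;
# position 0 makes it the first column.
df.insert(0, "Name", ["Rhema", "Mehreen", "Nitin"])

print(df)
```

Plain assignment always adds the new column at the end, so `insert()` is only needed when the column order matters.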
Deleting a column
We can use the drop() method in the pandas dataframe to delete a particular column.
# dropping passed columns
df.drop(["Name"], axis = 1, inplace = True)
Because inplace = True is set, the column ‘Name’ is removed from df itself and drop() returns nothing.
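If the original dataframe should stay unchanged, `drop()` can be called without `inplace`. This is a small sketch assuming the three-column sample data from the previous section; `without_name` is an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
    "Name": ["Rhema", "Mehreen", "Nitin"],
})

# Without inplace=True, drop() returns a new DataFrame and leaves df untouched.
# columns=[...] is equivalent to passing the labels together with axis=1.
without_name = df.drop(columns=["Name"])

print(list(df.columns))
print(list(without_name.columns))
```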
Working With Dataframe Rows
Now, let us try to understand the ways to perform these operations on rows.
Selecting a Row
To select rows from a dataframe, we can use either the loc[] indexer or the iloc[] indexer. With loc[], we retrieve a row by its index label. With iloc[], we retrieve a row by its integer position, counting from 0.
# importing pandas package
import pandas as pd
# making data frame from csv file
data = pd.read_csv("employees.csv", index_col ="Name")
# retrieving row by loc method
first = data.loc["Shubham"]
second = data.loc["Mariann"]
print(first, "\n\n\n", second)
In the above code, we are loading a CSV file as a dataframe and assigning the column ‘Name’ as its index value. Later we use the index of the rows to retrieve them.
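To compare loc[] and iloc[] without needing a CSV file, here is a minimal sketch with an in-memory dataframe. It is a hypothetical stand-in for employees.csv: the ‘Team’ and ‘Salary’ columns, their values, and the third name are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv; columns and values are invented.
data = pd.DataFrame(
    {"Team": ["Sales", "Finance", "Legal"], "Salary": [50000, 62000, 58000]},
    index=pd.Index(["Shubham", "Mariann", "Asha"], name="Name"),
)

# loc selects by index label.
by_label = data.loc["Mariann"]

# iloc selects by integer position (0-based), regardless of the labels.
by_position = data.iloc[1]

# Both expressions refer to the same row here.
print(by_label.equals(by_position))
```

Each result is a Series whose index is made up of the column names, so a single value can be read with, for example, `by_label["Team"]`.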
Creating a Dataframe Row in Python
To insert a new row into our dataframe, we can use the loc[] indexer or the concat() function. Older code often uses the DataFrame.append() method for this, but it was deprecated in pandas 1.4 and removed in pandas 2.0, so the example below uses concat() in its place. The code assumes that df still has the TotalScore, MathScore, and Name columns.
# adding a new row using the next index value
df.loc[len(df.index)] = [450, 80, 'Disha']
print(df)
# adding a single row from a dictionary
# (before pandas 2.0 this was df.append(new_data, ignore_index = True))
new_data = {'Name': 'Ripun', 'MathScore': 89, 'TotalScore': 465}
df = pd.concat([df, pd.DataFrame([new_data])], ignore_index = True)
print(df)
# adding several rows at once using the concat function
concat_data = {'Name':['Sara', 'Daniel'],
'MathScore':[89, 90],
'TotalScore':[410, 445]
}
df2 = pd.DataFrame(concat_data)
# ignore_index = True renumbers the rows, so no reset_index() call is needed
df3 = pd.concat([df, df2], ignore_index = True)
print(df3)
Output
Using loc[]
   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha
Using concat() with a single row
   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha
4         465         89    Ripun
Using concat() with several rows
   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha
4         465         89    Ripun
5         410         89     Sara
6         445         90   Daniel
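When several rows arrive one at a time, one common pattern is to collect them in a list first and concatenate once at the end, instead of growing the dataframe row by row. The following is a sketch of that pattern under the same sample data; `new_rows` and `combined` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
    "Name": ["Rhema", "Mehreen", "Nitin"],
})

# Collect the new rows as dictionaries first.
new_rows = [
    {"Name": "Disha", "MathScore": 80, "TotalScore": 450},
    {"Name": "Ripun", "MathScore": 89, "TotalScore": 465},
]

# Convert them to a DataFrame and concatenate once.
# Columns are matched by name, so the key order in the dictionaries
# does not matter; ignore_index=True renumbers the rows from 0.
combined = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)

print(combined)
```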
Deleting a Row
We can use the drop() method to delete rows as well. We pass the index label of the row, or a list of labels, as an argument to the method. Rows are the default axis, so no axis argument is needed.
# importing pandas module
import pandas as pd
# making data frame from csv file
data = pd.read_csv("employees.csv", index_col ="Name" )
# dropping passed values
data.drop(["Shubham", "Mariann"], inplace = True)
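The same idea works on a dataframe with the default integer index. This minimal sketch assumes the three-row sample data from earlier; `without_second` and `renumbered` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
    "Name": ["Rhema", "Mehreen", "Nitin"],
})

# With the default integer index, the row labels are 0, 1, 2.
# drop() returns a new DataFrame unless inplace=True is passed.
without_second = df.drop(index=1)

# The remaining rows keep their original labels (0 and 2);
# reset_index(drop=True) renumbers them from 0.
renumbered = without_second.reset_index(drop=True)

print(without_second)
print(renumbered)
```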
Conclusion
In this article, we have discussed various ways to work with rows and columns using pandas in Python. Data frames are two-dimensional structures that we can use to store data and to select, add, and delete rows and columns.
References
Find here the official documentation for dataframes – https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html