Working with DataFrame Rows and Columns in Python

DataFrame Rows And Columns

In this article, we will see how to create table-like structures in Python and how to work with their rows and columns. This is useful when building data science applications that need to handle large collections of data. We will cover basic operations such as creating, updating, and deleting rows and columns.

What is a Data Frame?

Python is widely used for data analytics and processing, so it needs a way to store data in a structured form, like a conventional table with rows and columns. The DataFrame object from the pandas library serves this purpose. A DataFrame is a two-dimensional, labeled structure: each column has a name and each row has an index label. The following sections show how to work with DataFrame rows and columns.

Creating a simple DataFrame

Let us learn to create a simple DataFrame with an example.

import pandas as pd

data = {
  "TotalScore": [420, 380, 390],
  "MathScore": [50, 40, 45]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df) 

Result

   TotalScore  MathScore
0         420         50
1         380         40
2         390         45
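
Once a DataFrame exists, it can help to check its size and labels before working with it. The following is a minimal sketch that rebuilds the same data and inspects it; the variable names n_rows, n_cols, column_names, and row_labels are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# shape is a (number of rows, number of columns) tuple
n_rows, n_cols = df.shape

# Column labels, in the order they were defined
column_names = list(df.columns)

# Row labels; pandas assigns 0, 1, 2, ... when no index is given
row_labels = list(df.index)

print(n_rows, n_cols)
print(column_names)
print(row_labels)
```

The row labels 0, 1, 2 are the same values that appear in the leftmost column of the printed table.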

Selectively Printing One Dataframe Column

Let us see how to select a particular column. Using the dataframe created above, we can select a column by its name. Passing a list of names inside the brackets, as in df[['MathScore']], returns a dataframe that contains only those columns.

print(df[['MathScore']])

The above code prints only the ‘MathScore’ column, along with the row index.
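
The number of brackets changes the type of the result. As a small sketch (the variable names as_series, as_frame, and both are illustrative), single brackets with one name give a Series, while a list of names gives a DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# Single brackets with one column name return a pandas Series
as_series = df["MathScore"]

# A list of names returns a DataFrame, even for a single column
as_frame = df[["MathScore"]]

# The same syntax selects several columns, in the order listed
both = df[["MathScore", "TotalScore"]]

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(list(both.columns))
```

A Series is convenient for calculations on one column, while the DataFrame form keeps the table layout with its column header.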

Adding Columns to a Dataframe in Python

At times, we may want to add more columns as we gather more data. We can add a column to our dataframe by creating a new list and assigning it to a new column name. The list must contain one value per row.

# creating  a new list called name.
name = ['Rhema', 'Mehreen', 'Nitin']
  
# Using 'Name' as the column name
# and equating it to the list
df['Name'] = name
  
# Observe the result
print(df)

Output

   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
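
A new column does not have to come from a list; it can also be computed from existing columns. The sketch below is an illustration under assumptions: the column name ‘OtherScore’ is hypothetical and not part of the original data. It also shows what happens when the list length does not match the number of rows.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
})

# Derived column, computed row by row from two existing columns.
# 'OtherScore' is a hypothetical name used only for this example.
df["OtherScore"] = df["TotalScore"] - df["MathScore"]

# A list with the wrong number of values is rejected with a ValueError
try:
    df["Name"] = ["Rhema", "Mehreen"]  # only 2 values for 3 rows
except ValueError as err:
    print("Could not add column:", err)

print(df)
```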

Deleting a column

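We can use the drop() method of the pandas dataframe to delete a particular column. Passing axis=1 tells drop() to look for the label among the columns, and inplace=True modifies the dataframe directly instead of returning a new one.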

# dropping passed columns
df.drop(["Name"], axis = 1, inplace = True)

Now the column ‘Name’ will be deleted from our dataframe.
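
If the original dataframe should be kept, drop() can be called without inplace=True, in which case it returns a new dataframe. This is a minimal sketch; the variable name without_name is illustrative, and columns=[...] is an equivalent way of writing the label list with axis=1.

```python
import pandas as pd

df = pd.DataFrame({
    "TotalScore": [420, 380, 390],
    "MathScore": [50, 40, 45],
    "Name": ["Rhema", "Mehreen", "Nitin"],
})

# Without inplace=True, drop() returns a new DataFrame
# and leaves df itself unchanged
without_name = df.drop(columns=["Name"])

print(list(df.columns))
print(list(without_name.columns))
```

Keeping the original untouched makes it easier to rerun later steps that still expect the ‘Name’ column.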

Working With Dataframe Rows

Now, let us try to understand the ways to perform these operations on rows.

Selecting a Row

To select rows from a dataframe, we can use either the loc[] indexer or the iloc[] indexer. With loc[], we retrieve a row by its index label. With iloc[], we retrieve a row by its integer position, counting from 0.

# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("employees.csv", index_col ="Name")
  
# retrieving row by loc method
first = data.loc["Shubham"]
second = data.loc["Mariann"]
  
  
print(first, "\n\n\n", second)

In the above code, we load a CSV file as a dataframe and use the ‘Name’ column as its index. We then use the index labels of the rows to retrieve them.
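
Since employees.csv may not be at hand, the difference between loc[] and iloc[] can be tried on a small in-memory table. The data below is a hypothetical stand-in: the ‘Team’ and ‘Salary’ columns, their values, and the third name are invented for this sketch.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv, indexed by 'Name'
data = pd.DataFrame({
    "Name": ["Shubham", "Mariann", "Asha"],
    "Team": ["Finance", "Marketing", "Sales"],
    "Salary": [52000, 61000, 58000],
}).set_index("Name")

# loc[] looks a row up by its index label
by_label = data.loc["Mariann"]

# iloc[] looks a row up by its integer position (0-based)
by_position = data.iloc[1]

# Position slices exclude the end: positions 0 and 1 only
first_two = data.iloc[0:2]

# Label slices include both end labels
label_slice = data.loc["Shubham":"Mariann"]

print(by_label)
print(first_two)
```

Here by_label and by_position refer to the same row, because ‘Mariann’ is the second row of the table.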

Creating a Dataframe Row in Python

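To insert a new row into our dataframe, we can use the loc[] indexer or the concat() function. Older code also uses the append() method of the dataframe, but it was deprecated in pandas 1.4 and removed in pandas 2.0, so the same step is written with concat() here. The code assumes that df still contains the ‘Name’ column added earlier.

# Adding a new row using the next index value.
# The scores are passed as integers so the score columns are not filled with strings.
df.loc[len(df.index)] = [450, 80, 'Disha']

print(df)

# Adding a single row from a dictionary.
# Before pandas 2.0 this was written as df.append(new_data, ignore_index=True).
new_data = {'Name': 'Ripun', 'MathScore': 89, 'TotalScore': 465}
df = pd.concat([df, pd.DataFrame([new_data])], ignore_index=True)

print(df)

# Adding several rows at once using the concat function
concat_data = {'Name': ['Sara', 'Daniel'],
               'MathScore': [89, 90],
               'TotalScore': [410, 445]}

df2 = pd.DataFrame(concat_data)

# ignore_index=True renumbers the rows from 0, so no reset_index() call is needed
df3 = pd.concat([df, df2], ignore_index=True)

print(df3)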

Output

After adding a row with loc[]

   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha

After adding the single row for Ripun

   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha
4         465         89    Ripun

After adding the rows from df2 with concat()

   TotalScore  MathScore     Name
0         420         50    Rhema
1         380         40  Mehreen
2         390         45    Nitin
3         450         80    Disha
4         465         89    Ripun
5         410         89     Sara
6         445         90   Daniel

Deleting a Row

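We can use the drop() method to delete rows as well. We pass the index label of the row, or a list of labels, as an argument to the method. Because the default is axis=0, drop() looks for these labels in the row index.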

# importing pandas module
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("employees.csv", index_col ="Name" )
  
# dropping passed values
data.drop(["Shubham", "Mariann"], inplace = True)
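
The same call can be tried without the CSV file. The sketch below uses a small hypothetical table (the ‘Team’ values and the name ‘Asha’ are invented for illustration) and also shows the errors="ignore" option of drop(), which skips labels that are not present instead of raising a KeyError.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv, indexed by name
data = pd.DataFrame(
    {"Team": ["Finance", "Marketing", "Sales"]},
    index=["Shubham", "Mariann", "Asha"],
)

# Drop two rows by label; without inplace=True a new DataFrame is returned
remaining = data.drop(index=["Shubham", "Mariann"])

# A missing label normally raises KeyError; errors="ignore" skips it
unchanged = data.drop(index=["Nobody"], errors="ignore")

print(remaining)
print(len(unchanged))
```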

Conclusion

In this article, we have discussed various ways to work with the rows and columns of a pandas dataframe in Python. In general, dataframes are two-dimensional structures that we can use to store data and perform various operations on it.

References

Find here the official documentation for dataframes – https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html