Pandas eval(): Evaluate a Python expression as a string


Hello and welcome to our tutorial! Pandas is one of the most widely used Python libraries for data analysis and manipulation. Its central data structure, the DataFrame, represents data as rows and columns.

While working with data, you will often need to perform operations that involve two or more columns of a DataFrame. The pandas.eval() function, and its DataFrame.eval() counterpart, can be useful in this situation.

pandas.eval() evaluates an operation on the columns of a DataFrame. The operation is written as a string expression. Because the expression is applied to entire columns at once, it is not suited to computations that target individual cells.
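As a rough illustration of what "entire columns at once" means, the sketch below evaluates the same product with the top-level pandas.eval() and with DataFrame.eval(). The frame name sales and its columns price and qty are made up for this illustration.

```python
import pandas as pd

# Hypothetical data, used only for this illustration
sales = pd.DataFrame({"price": [2.0, 3.5, 1.0], "qty": [4, 2, 10]})

# Top-level pandas.eval(): the frame is referenced by its variable name
total_a = pd.eval("sales.price * sales.qty")

# DataFrame.eval(): column names can be used directly in the expression
total_b = sales.eval("price * qty")

print(total_a.tolist())  # [8.0, 7.0, 10.0]
print(total_b.tolist())  # [8.0, 7.0, 10.0]
```

Each result has one value per row, because the expression is applied to whole columns rather than to single cells.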


Syntax of Pandas eval()

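pandas.eval(expr, parser='pandas', engine=None, local_dict=None, global_dict=None, resolvers=(), level=0, target=None, inplace=False)

The examples in this tutorial call the DataFrame method, which forwards its keyword arguments to pandas.eval():

DataFrame.eval(expr, inplace=False, **kwargs)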

Parameters:

  • expr: String. The expression to be evaluated.
  • inplace: Boolean, default False. Only matters when the expression contains an assignment. If True, the target object is modified directly. If False, a modified copy is returned and the source DataFrame is left unchanged.
  • target: Object, default None. The object that receives an assignment made in the expression. DataFrame.eval() sets it to the calling DataFrame.
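To make the target parameter more concrete, here is a minimal sketch that calls the top-level pandas.eval() with an assignment. The small frame is example data chosen for this illustration.

```python
import pandas as pd

# Example data for this illustration
df = pd.DataFrame({"A": [10, 3, 5], "B": [25, 30, 4]})

# The expression assigns to D, and target says which object receives it.
# inplace is False by default, so a modified copy is returned.
result = pd.eval("D = df.A + df.B", target=df)

print(result.columns.tolist())  # ['A', 'B', 'D']
print(df.columns.tolist())      # ['A', 'B']
```

When you call df.eval() instead, you do not pass target yourself, since the DataFrame acts as the target.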

Returns:

The result of the evaluated expression (for example, a Series or a DataFrame) if inplace=False. Returns None if inplace=True, in which case the target object is modified directly.
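The three return cases can be checked with a short sketch. The column name S and the variable names are illustrative only.

```python
import pandas as pd

# Example data for this illustration
df = pd.DataFrame({"A": [10, 3, 5], "B": [25, 30, 4]})

# No assignment in the expression: the result is a Series
series_result = df.eval("A + B")

# Assignment with inplace=False (the default): a new DataFrame is returned
frame_result = df.eval("S = A + B")

# Assignment with inplace=True: df is modified and None is returned
none_result = df.eval("S = A + B", inplace=True)

print(type(series_result).__name__)  # Series
print(type(frame_result).__name__)   # DataFrame
print(none_result)                   # None
```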


Examples of Pandas eval()

Let us first create a data frame.

import pandas as pd

# creating a data frame
data = {
    'A': [10, 3, 5, 12, 4],
    'B': [25, 30, 4, 98, 2],
    'C': [2, 15, 33, 50, 45]
}

df = pd.DataFrame(data)
df

Output:

    A   B   C
0  10  25   2
1   3  30  15
2   5   4  33
3  12  98  50
4   4   2  45

Example 1: Calculate the sum of the entries in columns using Pandas eval()

This example uses the DataFrame.eval() method to calculate the row-wise sum of columns A and B. The expression is passed as the string ‘A+B’, and the column names can be used in it directly.

df.eval('A+B')

Output:

0     35
1     33
2      9
3    110
4      6
dtype: int64

The output is calculated as:

A[0] + B[0] = 10 + 25 = 35
A[1] + B[1] = 3 + 30 = 33
A[2] + B[2] = 5 + 4 = 9
A[3] + B[3] = 12 + 98 = 110
A[4] + B[4] = 4 + 2 = 6

Similarly,

df.eval('A+B+C')

Output:

0     37
1     48
2     42
3    160
4     51
dtype: int64
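Expressions are not limited to a single operator. As a small sketch on the same data, the two expressions below differ only in their parentheses and follow normal Python operator precedence.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 3, 5, 12, 4],
    'B': [25, 30, 4, 98, 2],
    'C': [2, 15, 33, 50, 45]
})

# Multiplication binds tighter than addition: A + (B * C)
no_parens = df.eval('A + B * C')

# Parentheses force the addition to happen first: (A + B) * C
with_parens = df.eval('(A + B) * C')

print(no_parens.tolist())    # [60, 453, 137, 4912, 94]
print(with_parens.tolist())  # [70, 495, 297, 5500, 270]
```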

Example 2: Adding a New Column to a DataFrame

This example adds a new column named D to the DataFrame. Its values are computed by adding the values in columns A, B and C. The assignment inside the expression string is what creates the column.

df.eval('D = A+B+C')

Output:

    A   B   C    D
0  10  25   2   37
1   3  30  15   48
2   5   4  33   42
3  12  98  50  160
4   4   2  45   51

Since the inplace parameter is False by default, the DataFrame has not been modified here; the call returned a modified copy. To change the DataFrame itself, set inplace to True.

df.eval('D = A+B+C', inplace=True)
df

Output:

    A   B   C    D
0  10  25   2   37
1   3  30  15   48
2   5   4  33   42
3  12  98  50  160
4   4   2  45   51
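If the original frame should stay untouched, an alternative is to keep the default inplace=False and bind the returned copy to a new name. The sketch below does that and compares the result with DataFrame.assign(). The names with_d and via_assign are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 3, 5, 12, 4],
    'B': [25, 30, 4, 98, 2],
    'C': [2, 15, 33, 50, 45]
})

# eval() returns a copy with the new column; df itself is unchanged
with_d = df.eval('D = A+B+C')

# The same column built with ordinary column arithmetic
via_assign = df.assign(D=df['A'] + df['B'] + df['C'])

print('D' in df.columns)    # False
print(with_d['D'].tolist()) # [37, 48, 42, 160, 51]
```

Which style to prefer is mostly a matter of readability. The string form keeps longer formulas compact.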

Example 3: Using local variables with pandas eval()

In this example, a local variable named k is defined with a value of 10. DataFrame.eval() then evaluates the expression ‘C*@k’. The @ prefix tells eval() that k is a Python variable from the surrounding scope rather than a column name. The result is a Series in which every value of column C is multiplied by 10.

k = 10
df.eval('C*@k')

Output:

0     20
1    150
2    330
3    500
4    450
dtype: int64

With k = 10, each value in column C is multiplied by 10:

C[0]*10 = 2*10 = 20
C[1]*10 = 15*10 = 150
C[2]*10 = 33*10 = 330
C[3]*10 = 50*10 = 500
C[4]*10 = 45*10 = 450
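The @ prefix also works for variables that are local to a function, and more than one can appear in a single expression. The helper scale_and_shift below is a hypothetical function written only for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'C': [2, 15, 33, 50, 45]})

def scale_and_shift(frame, factor, offset):
    # factor and offset are local to this function;
    # the @ prefix makes them visible inside the expression
    return frame.eval('C * @factor + @offset')

result = scale_and_shift(df, 10, 1)
print(result.tolist())  # [21, 151, 331, 501, 451]
```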

Example 4: Perform multiple evaluations at once

This example performs multiple evaluations at once by writing one assignment per line inside a triple-quoted string ('''). Two columns are computed: E is the square of A, and F is the product of B and C. Because inplace is left at its default of False, the call returns a new DataFrame that contains both columns.

df.eval(
'''
E = A**2
F = B*C
'''
)

Output:

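    A   B   C    D    E     F
0  10  25   2   37  100    50
1   3  30  15   48    9   450
2   5   4  33   42   25   132
3  12  98  50  160  144  4900
4   4   2  45   51   16    90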

The output is calculated as:

Column E:

E[0] = A[0]**2 = 10**2 = 100
E[1] = A[1]**2 = 3**2 = 9
E[2] = A[2]**2 = 5**2 = 25
E[3] = A[3]**2 = 12**2 = 144
E[4] = A[4]**2 = 4**2 = 16

Column F:

F[0] = B[0]*C[0] = 25*2 = 50
F[1] = B[1]*C[1] = 30*15 = 450
F[2] = B[2]*C[2] = 4*33 = 132
F[3] = B[3]*C[3] = 98*50 = 4900
F[4] = B[4]*C[4] = 2*45 = 90
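In a multi-line expression, a later line can use a column that an earlier line has just assigned. The sketch below shows this with inplace=True on a smaller frame. The data and the formula for F are chosen only for illustration.

```python
import pandas as pd

# Example data for this illustration
df = pd.DataFrame({'A': [10, 3, 5], 'B': [25, 30, 4]})

# Each line must be an assignment; F reuses the E defined one line earlier
df.eval(
'''
E = A**2
F = E + B
''',
inplace=True
)

print(df['E'].tolist())  # [100, 9, 25]
print(df['F'].tolist())  # [125, 39, 29]
```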

Conclusion

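The Pandas eval() method lets you operate on DataFrame columns by passing the operation as a string expression. It parses the expression and returns the result. With an assignment in the expression, it can also add new columns, either to a copy or, with inplace=True, to the DataFrame itself.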

