What if one has a collection of dates that follow one another in a specific sequence? In machine learning, handling a very large volume of data is tricky, and so is finding out whether the dates in such a dataset are separated by a common interval.
This article explores the infer_freq( ) function, which finds the most likely frequency of the given input, and demonstrates how it works with suitable examples. Let us start by importing the pandas library using the code below.
import pandas as pd
Thereafter, we shall explore the infer_freq( ) function through each of the following sections.
- Syntax of the infer_freq( ) function
- Use cases for the infer_freq( ) function
- Potential Errors in using the infer_freq( ) function
Syntax of the infer_freq( ) function
Note that the input fed into the infer_freq( ) function should be a DatetimeIndex or a TimedeltaIndex. Input that does not hold dates or times cannot be used with the function. Following is the syntax, containing the mandatory and optional arguments of the infer_freq( ) function.
pandas.infer_freq(index, warn=True)
- index – DatetimeIndex or TimedeltaIndex containing the values from which the frequency is to be inferred.
- warn – set to ‘True’ by default, it is used to return a warning text should there be no inferable frequency. In the versions succeeding 1.5.0, however, this option seems to have been deprecated.
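As a minimal sketch of the call itself, the snippet below passes one DatetimeIndex and one TimedeltaIndex to infer_freq( ). The variable names dates and deltas are illustrative and do not come from this article, and the warn argument is deliberately left out since it may not be available in newer pandas versions.

```python
import pandas as pd

# Illustrative inputs (names are assumptions made for this sketch)
dates = pd.DatetimeIndex(["2023-03-01", "2023-03-02", "2023-03-03", "2023-03-04"])
deltas = pd.TimedeltaIndex(["1 days", "2 days", "3 days", "4 days"])

# infer_freq returns a frequency string, or None if no frequency can be inferred
date_freq = pd.infer_freq(dates)
delta_freq = pd.infer_freq(deltas)

print(date_freq, delta_freq)
```

Both inputs are spaced one day apart, so the same daily frequency string should be reported for each.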
Use cases for the infer_freq( ) function
Let us construct an input index from a set of dates with the DatetimeIndex constructor, as shown below.
I = pd.DatetimeIndex(["2023-01-01", "2023-01-08", "2023-01-15", "2023-01-22"])
Once done, we shall pass this index to the infer_freq( ) function to see how it fares.
pd.infer_freq(I)
Following is the result when the above code is run.
'W-SUN'
What one can infer from the above result is that the dates in the input index fall on Sundays of consecutive weeks, that is, they follow a weekly frequency anchored on Sunday.
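The Sunday interpretation can be cross-checked directly on the index. The sketch below is only a sanity check, not part of how infer_freq( ) works internally; it looks up the weekday name of each date and the gap between consecutive dates.

```python
import pandas as pd

I = pd.DatetimeIndex(["2023-01-01", "2023-01-08", "2023-01-15", "2023-01-22"])

# day_name() returns the weekday name of every date in the index
weekdays = I.day_name()
print(list(weekdays))

# Gap between each date and the one before it; every gap should be 7 days
gaps = I[1:] - I[:-1]
print(gaps.unique())
```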
Let us use another dataset, such as the one given below, to have a closer look at the infer_freq( ) function. This time the date_range( ) function is used to generate the dates. Note that date_range( ) also returns a DatetimeIndex rather than a Series.
I1 = pd.date_range(start='2023-01-01', end='2023-01-25', periods=25)
Now it is time to infer the frequency from the above dataset using the infer_freq( ) function.
pd.infer_freq(I1)
'D'
The above result conveys that the dates in the given dataset are separated by a frequency of one day. The infer_freq( ) function is not limited to indices either: it also accepts a Series of date-time values, in which case the values of the Series, not its index, are used. With that established, let us move on to the common errors encountered while using the infer_freq( ) function.
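To see infer_freq( ) working on an actual Series, the date values can be wrapped in pd.Series first. This is a small sketch; the names s and series_freq are illustrative. According to the pandas documentation, when a Series is passed the frequency is inferred from its values and not from its index.

```python
import pandas as pd

# A Series whose values are daily dates; its index is the default integer range 0..4
s = pd.Series(pd.date_range(start="2023-01-01", periods=5, freq="D"))

# The frequency is inferred from the values of s, not from s.index
series_freq = pd.infer_freq(s)
print(series_freq)
```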
Potential Errors in using the infer_freq( ) function
There are two types of errors one can encounter when using the infer_freq( ) function, viz.
- Type Error
- Value Error
The ‘Type’ error arises when the given input does not contain date-time data, whilst the ‘Value’ error arises when the given input contains fewer than 3 dates. Following is a demonstration to see how it works.
I2 = [['2023', '2021', '2020', '2023']]
pd.infer_freq(I2)
Output: ValueError: Need at least 3 dates to infer frequency
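Both error types can be reproduced with a small helper. The sketch below is illustrative only: try_infer is a hypothetical function written for this example, not part of pandas, and it simply reports which exception infer_freq( ) raised.

```python
import pandas as pd

def try_infer(values):
    """Return the inferred frequency, or a short description of the error raised."""
    try:
        return pd.infer_freq(values)
    except (TypeError, ValueError) as err:
        return f"{type(err).__name__}: {err}"

# Integers are not date-time data, so a TypeError is expected here
not_dates = pd.Index([1, 2, 3, 4])
type_result = try_infer(not_dates)

# Only two dates are given, so a ValueError is expected here
too_few = pd.DatetimeIndex(["2023-01-01", "2023-01-02"])
value_result = try_infer(too_few)

print(type_result)
print(value_result)
```

Catching the two exceptions like this lets a data pipeline fall back to a default frequency instead of stopping on malformed input.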
Now that we have reached the end of this article, we hope it has shown how to use the infer_freq( ) function from the pandas library. Here’s another article that details the usage of the cut( ) function from the pandas library in Python. There are numerous other enjoyable and equally informative articles in AskPython that might be of great help to those who are looking to level up in Python. Audere est facere!