Time Series Analysis Using Facebook Prophet – A Complete Guide


A sequence of data points recorded at regular intervals of time is called a time series. Because the points are evenly spaced, data analysts and engineers can observe trends in the data and how they change over time. All the data points in a time series describe the same object or quantity.

Examples of time series data include weather conditions, sales of a product across periods such as seasons and festivals, stock market data, and many more.

Time series analysis works on data collected over a specific period. Put simply, it is used to observe trends in time series data, identify hidden patterns, and draw new conclusions.

Prophet is an algorithm introduced by Facebook for time series analysis and forecasting. Let us see how we can use the Facebook Prophet library to perform time series analysis.


What Is Time Series Analysis?

Time series analysis is the statistical process of analyzing data collected over a period of time to observe and study the patterns in the underlying data. It is also used to identify noise or outliers in the data.

The observations and trends studied in this phase are supplied as input to time series forecasting, where we predict future values based on past trends.
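As a rough illustration of what "observing a trend" can mean in practice, here is a minimal sketch that smooths a short monthly series with a rolling mean. It uses pandas rather than Prophet, takes the first twelve values of the AirPassengers dataset used later in this guide, and assumes a window of three months purely for demonstration.

```python
import pandas as pd

# First 12 monthly values (1949) of the AirPassengers dataset.
passengers = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
index = pd.date_range("1949-01-01", periods=12, freq="MS")  # month-start dates
series = pd.Series(passengers, index=index, name="passengers")

# A 3-month rolling mean smooths short-term fluctuations to expose the trend.
# The window size of 3 is an arbitrary choice for illustration.
trend = series.rolling(window=3).mean()

# What remains after subtracting the smoothed trend; unusually large values
# can hint at noise or outliers.
residual = series - trend

print(pd.DataFrame({"observed": series, "trend": trend, "residual": residual}))
```

The first two rows of the trend column are empty because a three-month window needs three observations before it can produce a value.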

Applications of Time Series Analysis

Time series analysis can be applied in many areas, such as stock exchanges, banks, and weather or climate studies. Some real-world applications are:

  • Finance: In the financial domain, time series analysis is used to analyze historical data and observe trends and patterns for risk management, interest rate modeling, and credit score analysis
  • Medicine: Data in the medical field is voluminous. All the data about various diseases, symptoms, and patient reports is hard to manage. With time series analysis, we can track chronic diseases and allergies and analyze when they are likely to affect a patient in the future and with what severity
  • Weather Forecast: The basic idea is to observe trends in past climate or weather data of a place to predict what weather conditions the place will have over a period of time


What Is Facebook Prophet?

Prophet is a forecasting algorithm developed by researchers at Facebook (now Meta) and released as open source on February 23, 2017. The team introduced Prophet in a paper titled “Forecasting at Scale”. Prophet is available in Python and R, and it is free for everyone to download and use.


There are many other tools in Python for time series forecasting, like darts. What does Prophet bring to the table?

The Prophet library is designed to work with univariate time series data, that is, data in which a single variable is tracked over time.

It also handles non-linear data well. Prophet is based on an additive model in which non-linear trends are fit together with yearly, weekly, and daily seasonality, plus holiday effects, and it fits these components automatically.
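For readers who prefer a formula, the additive model described in the “Forecasting at Scale” paper can be summarized as follows. The symbols follow the paper's notation and are not names used by the library itself.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Here g(t) is the (possibly non-linear) trend, s(t) collects the periodic seasonal effects, h(t) captures holiday effects, and ε_t is the error term that the model does not explain.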

Facebook Prophet Installation

This library can be installed from PyPI (for Python) or CRAN (for R).

For Python, we can download the library using this command.

pip install prophet

Key Points of Using Prophet

When we want to perform time series forecasting or analysis with Prophet, we need to ensure that our data is stored in a data frame (in Python). The data frame should have two essential columns: ds and y. The ds column holds the dates, and y holds the numeric values we want to analyze with Prophet.

If our data frame contains this information under different column names, we can rename those columns. Let us see an example.
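A minimal sketch of that renaming step, using a small hypothetical frame built in memory (three rows in the layout of the dataset used in this guide). It renames by explicit mapping, which is one possible approach, and parses the dates so that ds holds datetime values.

```python
import pandas as pd

# Hypothetical frame whose columns are not yet named ds and y.
raw = pd.DataFrame({
    "Month": ["1949-01", "1949-02", "1949-03"],
    "#Passengers": [112, 118, 132],
})

# Rename by explicit mapping so the result does not depend on column order.
df = raw.rename(columns={"Month": "ds", "#Passengers": "y"})

# Prophet expects ds to hold dates, so parse the strings into datetimes.
df["ds"] = pd.to_datetime(df["ds"])

print(df.dtypes)
```

Mapping old names to new ones is a little more verbose than assigning a list of names, but it keeps working if the column order in the source file changes.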

Time Series Analysis Using Prophet

In this example, we use a dataset of the monthly number of airline passengers, starting from January 1949.

The columns of this dataset are Month and #Passengers, which we will rename to ds and y, respectively, later in the code.

import pandas as pd
import matplotlib.pyplot as plt
from prophet import Prophet

The pandas library is imported to read the dataset, and the Matplotlib library is imported to visualize it. Both are imported under their usual aliases, pd and plt. The last line imports the Prophet class from the prophet package.

The next step is to read the data file.

df = pd.read_csv("/kaggle/input/air-passengers/AirPassengers.csv")
df
Figure: Data Frame

It is always advised to know the details of the dataset, like the number of columns, their respective datatypes, and the size of the dataset. This can be done with the help of df.info().

Figure: Information about the data frame
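The inspection step can be sketched as follows. To keep the example runnable without the Kaggle file, it reads a five-row excerpt in the same layout from an in-memory string; with the real file, the same calls apply to the df loaded above.

```python
import io
import pandas as pd

# A five-row excerpt in the same layout as AirPassengers.csv, held in memory
# so the sketch runs without the Kaggle file.
csv_text = """Month,#Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
1949-05,121
"""
df = pd.read_csv(io.StringIO(csv_text))

df.info()               # column names, non-null counts, dtypes, memory usage
print(df.shape)         # (number of rows, number of columns)
print(df.isna().sum())  # count of missing values per column
```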

We can also visualize the data to know its characteristics. The code is given below.

df.plot()
plt.show()
Figure: Visualizing the Dataset

In the following code, we rename the columns to ds and y. We also convert the ds column from strings to pandas datetime values.

df.columns = ['ds', 'y']
df['ds']= pd.to_datetime(df['ds'])
df
Figure: Renaming the columns

We now fit the Prophet model to our data and make future predictions.

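model = Prophet()
model.fit(df)
# Extend the training dates by 365 rows. The default frequency is daily (freq='D');
# for monthly data, periods=12 with freq='MS' would add 12 month-start dates instead.
future = model.make_future_dataframe(periods=365)
model.predict(future)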
Figure: Future Predictions using Prophet

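In the output, yhat is the predicted value of the target variable. The yhat_lower and yhat_upper columns are the lower and upper bounds of its uncertainty interval (80% by default), and trend_lower and trend_upper are the corresponding bounds for the trend.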

Let us do some Forecasting!

In-Sample Forecasting

In in-sample forecasting, we predict values for dates that are already part of the training data and compare the predictions with the known values. In this way, we can check how well the model fits the data.

future = list()
for i in range(1, 13):
    date = '1958-%02d' % i
    future.append([date])
future = pd.DataFrame(future)
future.columns = ['ds']
future['ds']= pd.to_datetime(future['ds'])

We create a list called future to hold the dates we want predictions for. The for loop generates the twelve months of 1958; %02d formats the month number with two digits. Since our dataset spans the years 1949 to 1960, these dates fall inside the data the model was trained on. The list is then converted into a data frame, and the ds column is converted to datetime values.
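The same twelve dates can also be generated without a loop. The following is a minimal sketch of an alternative, assuming pandas' date_range with the "MS" (month start) frequency; the resulting frame has the same single ds column and could be passed to model.predict in the same way.

```python
import pandas as pd

# Twelve month-start dates for 1958; freq="MS" means "month start".
future = pd.DataFrame(
    {"ds": pd.date_range(start="1958-01-01", periods=12, freq="MS")}
)

print(future.head())
```

Changing the start date to "1961-01-01" would produce the dates for an out-of-sample year instead.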

forecast = model.predict(future)
forecast

The predictions for these dates are stored in a variable called forecast.

print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head())
model.plot(forecast)
plt.show()

Lastly, we are printing the required columns of the data frame and displaying the forecast.

Figure: In-Sample Forecasting

The black dots are the existing data points, and the blue curve is the forecasted data.
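Besides reading the plot, the in-sample fit can be checked numerically by comparing the predictions with the known values. The sketch below computes the mean absolute error with pandas. The observed values are the dataset's counts for the first three months of 1958, while the yhat numbers are made-up placeholders standing in for forecast[['ds', 'yhat']], not real Prophet output.

```python
import pandas as pd

# Observed passenger counts for the first three months of 1958.
actual = pd.DataFrame({
    "ds": pd.to_datetime(["1958-01-01", "1958-02-01", "1958-03-01"]),
    "y": [340, 318, 362],
})

# Placeholder predictions: made-up numbers, NOT real Prophet output.
predicted = pd.DataFrame({
    "ds": pd.to_datetime(["1958-01-01", "1958-02-01", "1958-03-01"]),
    "yhat": [350.0, 310.0, 360.0],
})

# Align the two frames on the date column, then average the absolute errors.
compared = actual.merge(predicted, on="ds", how="inner")
mae = (compared["y"] - compared["yhat"]).abs().mean()

print(f"Mean absolute error: {mae:.2f}")
```

With a real forecast, actual would be the rows of df for the same dates, and a smaller error would indicate a closer fit to the training data.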

Out-of-Sample Forecasting

We follow the same procedure as above, but this time with dates that lie beyond the end of the dataset.

future = list()
for i in range(1, 13):
    date = '1961-%02d' % i
    future.append([date])
future = pd.DataFrame(future)
future.columns = ['ds']
future['ds']= pd.to_datetime(future['ds'])
forecast = model.predict(future)
forecast

In the for loop, we have selected the dates from 1961. As we already know, our dataset has dates and the number of passengers from 1949 to 1960.

print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head())
model.plot(forecast)
plt.show()

The output is given below.

Figure: Out-of-Sample Forecasting

As you can already guess, the above graph shows the forecasted number of passengers for each month of 1961, the year after the dataset ends.

Summary

To recapitulate, we learned about the Prophet library. Initially developed by Facebook’s data science team, it is available for use in Python and R. We covered what it offers and how to install it.

We have seen a detailed example of time series analysis and forecasting using Prophet for both In-Sample and Out-of-Sample data points.

Dataset

AirPassengers-Kaggle

References

Time Series Wikipedia

You can find more about Prophet here

Prophet – PyPI