How to Render a Data Frame to a LaTeX Table?

How To Render A LaTeX Document From A Data Frame

LaTeX is a document preparation system primarily used in academic and scientific writing. Leslie Lamport created it as a set of macros built on top of Donald Knuth's TeX typesetting system.

Unlike word processors such as MS Word, where you format the text as you type, a LaTeX author writes a plain text document and describes the desired formatting with LaTeX commands.

The plain text is then compiled into the formatted output with the help of markup conventions similar in spirit to the tags of HTML documents.

This tutorial will show us how to get a LaTeX format from a data frame.

Before that, read this article to learn about data frames.

LaTeX Explained

Before we get a LaTeX table from a data frame, let us understand what LaTeX is all about.

As discussed above, a LaTeX document is a plain text file that is compiled into formatted output (typically a PDF) using markup commands.

LaTeX is widely used for scientific reports and academic research papers in fields such as mathematics and linguistics.

Why is it used, you may ask?

A LaTeX file is comparatively smaller than your average Word document. It also allows for cross-platform sharing, meaning a LaTeX document can be used across multiple platforms without issues.

It also supports saving templates, so you can reuse the same template or format multiple times, changing whatever is required.

The main advantage is that you can write mathematical equations elegantly without the hassle you might face while writing in word processors.

You can use many online platforms to create your first LaTeX project!

Let us see how we can create a LaTeX project.

The image below shows the syntax of the LaTeX document.

LaTeX Syntax

As you can see, the format is similar to the HTML syntax. Most LaTeX commands start with a backslash (\), and we use them to create a new entity such as a section or a fraction.

Curious how the LaTeX document turned out?

LaTeX Document

As you can see, I have tried to create a mixed fraction using the tools available in the editor. The document looks just like how you would want your big research paper to be! There is no need to select fonts, wonder whether a font is accepted, or struggle with text alignment. LaTeX does it all for you!
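As a rough illustration of the backslash-based syntax, a minimal document with a mixed fraction might look like the sketch below. It is an assumed example and not necessarily the exact source shown in the images; the section title and the numbers are made up, and the amsmath package is loaded only for the \dfrac command.

```latex
\documentclass{article}
\usepackage{amsmath} % assumed here for \dfrac

\begin{document}

\section{A Mixed Fraction}

A mixed fraction combines a whole number and a proper fraction:
\[
  3\dfrac{1}{2} = \frac{7}{2}
\]

\end{document}
```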

You can even try to create matrices and do much more using the tools in the editor linked in the references.

Let us understand what a data frame is.

What Is a Data Frame?

A data frame is a tabular structure that stores data in rows and columns. Each column of a data frame can hold a different data type. We can also assign index labels to name the rows of a data frame.

There might be some unwanted rows and columns in your data frame that you wish to remove.

Read this article to learn how to filter data frames.

Let us look at an example.

import pandas as pd

# Dictionary of column names mapped to their values
dc = {
    "Student": ["Alan", "Joe", "Bard"],
    "Marks": [80, 75, 89],
}
# Build the data frame and label the rows
df = pd.DataFrame(dc, index=["Student1", "Student2", "Student3"])
print("The data frame is:\n", df)

Dataframe

So in the above code, we are creating a dictionary that has student names and their marks. We import the pandas library to convert this dictionary into a data frame. We are also specifying the index, which will name the rows as given.
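To make the points about mixed data types and row labels concrete, here is a small sketch that reuses the same student data. The variable name joe_marks is introduced only for this illustration.

```python
import pandas as pd

# Same data as the example above: names are strings, marks are integers.
df = pd.DataFrame(
    {"Student": ["Alan", "Joe", "Bard"], "Marks": [80, 75, 89]},
    index=["Student1", "Student2", "Student3"],
)

# Each column keeps its own data type.
print(df.dtypes)

# The index labels let us pick a whole row, or a single cell, by name.
print(df.loc["Student2"])
joe_marks = df.loc["Student2", "Marks"]
print("Joe's marks:", joe_marks)
```

Using df.loc with a row label and a column name is usually easier to read than counting positions, which is one reason to give the rows meaningful names.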

DataFrame.to_latex Method Explained

We will see the syntax of the method and learn about its parameters.

The syntax is given below.

DataFrame.to_latex(buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, bold_rows=False, column_format=None, longtable=None, escape=None, encoding=None, decimal='.', multicolumn=None, multicolumn_format=None, multirow=None, caption=None, label=None, position=None)
| Name | Description | Type | Default |
|---|---|---|---|
| buf | The file path or buffer the output is written to. If it is not provided, the output is returned as a string. | str, path, or buffer | None |
| columns | The subset of columns to include in the output. If it is not specified, all the columns of the data frame are written. | list of labels | None |
| col_space | The minimum width of each column. If it is not specified, the default width is used. | int | None |
| header | Whether the column names of the data frame are written to the LaTeX output. If a list of strings is given, the strings are used as aliases for the existing column names. | bool or list of str | True |
| index | Whether the row names (the index) are written to the output. | bool | True |
| na_rep | The text used to represent missing values in the output. | str | 'NaN' |
| formatters | Formatter functions applied to the elements of the columns, selected by position (list) or by name (dictionary). Each function must return a string. | list or dict of functions | None |
| float_format | The format applied to floating-point numbers, for example '%.2f'. | str or one-parameter function | None |
| sparsify | For a data frame with a hierarchical index: when True, a multi-index key is printed only where it changes; when False, every key is printed on every row. When not given, the value is read from the pandas options. | bool | None |
| index_names | Whether the names of the indexes are printed. | bool | True |
| bold_rows | Whether the row labels (the index values) are printed in bold. | bool | False |
| column_format | The LaTeX column specification, for example 'rcl' for three columns. By default, 'r' is used for columns of numbers and 'l' for every other column. | str | None |
| longtable | Use a longtable environment instead of tabular. This requires \usepackage{longtable} in the LaTeX preamble. When not given, the value is read from the pandas options. | bool | None |
| escape | When True, LaTeX special characters in the column names and values are escaped. When False, they are written unchanged. When not given, the value is read from the pandas options. | bool | None |
| encoding | The encoding of the output file, used when buf is a file path. utf-8 is used when it is not given. | str | None |
| decimal | The character used as the decimal separator, for example the '.' in 3.14. In much of Europe, ',' is used instead. | str | '.' |
| multicolumn | Use \multicolumn to render multi-index columns. When not given, the value is read from the pandas options. | bool | None |
| multicolumn_format | The alignment used for multicolumns, similar to column_format. Only used when multicolumn is enabled. When not given, the value is read from the pandas options. | str | None |
| multirow | Use \multirow to render multi-index rows. This requires \usepackage{multirow} in the LaTeX preamble. When not given, the value is read from the pandas options. | bool | None |
| caption | The caption of the table. The LaTeX syntax is \caption[short_caption]{full_caption}; a tuple (full_caption, short_caption) sets both, and if only one string is passed, no short caption is set. | str or tuple | None |
| label | The label placed inside \label{} in the output, used with \ref{} in the main LaTeX document. | str | None |
| position | The LaTeX positional argument for tables, placed after \begin{} in the output. | str | None |

Arguments of df.to_latex (every argument has a default value, so all of them are optional)

Return type: This method will return a string if buf is set to None. Else, it returns None.

Let us look at a few examples of how to use this method.

Creating a LaTeX Format From a Numerical Data Frame

Create a data frame with numerical values and print the LaTeX format.

import pandas as pd

df = pd.DataFrame({
    "Cost": [100, 120, 80, 250],
    "Selling price": [150, 100, 100, 230],
    "Profit/loss": ["Profit", "Loss", "Profit", "Loss"],
})
# Use the Cost column as the index of the data frame
df = df.set_index("Cost")
print("The data frame is:\n", df)
print("********")
print("The LaTeX table is:")
print(df.to_latex(index=True))

The pandas library that we imported in the first line is used to create a data frame.

The next statement creates a data frame named df from a dictionary using the pd.DataFrame method.

After that, we make the Cost column the index of the data frame using the set_index method.

Next, we print the data frame, followed by print("********"), which separates the two outputs to avoid confusion, and the message “The LaTeX table is:”.

Lastly, we call the df.to_latex method with the index parameter set to True so that the index (Cost) is included in the LaTeX output.

The below image shows the difference between the LaTeX outputs when we use set_index and when we don’t.

LaTeX With Index Vs. Without

The image on the left is the output when we use set_index. The image on the right is when we don’t use set_index.

Writing the LaTeX to a File With Caption

In this example, we are going to create a data frame and convert it to a LaTeX table, but write the output to a separate file using the buf parameter of the method.

import pandas as pd

ytus = {
    "Channels": ["Channel1", "Channel2", "Channel3", "Channel4"],
    "Numhourswatched": [12, 8, 6, 3],
}
df = pd.DataFrame(ytus, index=["Day1", "Day2", "Day3", "Day4"])
print("The dataframe is:\n", df)
print("*************")
# The first argument is buf: the file the LaTeX table is written to
df.to_latex("output.tex", index=True, caption="Number of hours spent on youtube")

To explain the code briefly, we have imported the pandas library to work with the method. We are creating a dictionary that holds the YouTube stats of a user: the channels watched and the number of hours spent on each. All this data is stored in ytus.

This dictionary is then converted into a data frame with the help of the method pd.DataFrame. We are also naming the rows of the data frame after the days (Day1 to Day4).

Next, we print the data frame. This data frame is then written as a LaTeX table to a file named output.tex (tex is the extension for LaTeX files, just like csv is the extension for comma-separated value files). We are including the index in the LaTeX file, along with the caption “Number of hours spent on youtube.”

Let us see the data frame first, followed by the LaTeX file.

Dataframe

The file output.tex is written to the current working directory of your system or environment. It contains only the table, so it has to be included in a full LaTeX document before you can convert it into a PDF.

LaTeX File

As seen in the image on the left, the output.tex file is created (marked in red). We can see that the LaTeX document also has the caption (marked in yellow) that we have given in the code.

LaTeX Document With a Label

We have seen how to add a caption to the LaTeX file; now, let us try to add a label to it.

import pandas as pd

nm = {
    "Name": ["Jane", "Doe", "Kamal"],
    "Marks": [80, 65, 79],
}
df = pd.DataFrame(nm)
print("The data frame is:\n", df)
# Write the table to df.tex; the label goes inside \label{...}
df.to_latex("df.tex", index=True, label="This is a sample document")

As usual, we are importing the pandas library, as we need it for both pd.DataFrame and df.to_latex.

A new dictionary named nm is initialized, containing the names and marks of three students, Jane, Doe, and Kamal.

Now, we are using the method pd.DataFrame to convert nm to a data frame. This data frame is named df.

In the last line, we create a new file called df.tex to store the LaTeX table with the label ‘This is a sample document.’ The label is placed inside \label{} so that the table can be cross-referenced with \ref{} elsewhere in the LaTeX document.

LaTeX With Label

You can customize your LaTeX document by implementing the other fields of the method.

Summary

We have reached the end of this tutorial. To sum up, we have learned what a LaTeX file is, why we should use LaTeX, and why it is so popular.

We have used an online LaTeX editor called Overleaf to create our first LaTeX project and understand its syntax. You can explore this editor to print tables, matrices, and more.

We have understood what a data frame is and how we can store data in a table-like format using the method pd.DataFrame. We have seen an example of a data frame.

Next, we observed the syntax of the method df.to_latex, which renders a data frame as a LaTeX table. We have also understood the parameters this method accepts in detail.

Coming to the examples, we first took a data frame, set one of its columns as the index, converted the data frame to LaTeX, and compared the outputs with and without set_index.

In the second example, we wrote the LaTeX table rendered from a data frame to a separate file with a caption.

Lastly, we wrote the LaTeX table to a file with a label.

References

Refer to the official pandas documentation to learn more about the method used in this tutorial.

Want to create your first LaTeX project? Try Overleaf.