Python struct Module

The Python struct module provides a simple Pythonic interface for reading and writing data laid out like C structures. It is a handy tool when you need to exchange data with C code and don’t have the time to write the tooling in C itself, which is a low-level language.

This module converts Python values to the bytes of a C structure and vice versa. On the Python side, the structure is represented as a bytes object, since C has no notion of objects, only raw bytes laid out in memory.

Let’s understand how we can use this module to have a Python interface to C structures.


Python struct Module Methods

Since we are concerned with C structures, let’s look at some of the functions this module provides.

struct.pack()

This packs values into a Python bytes object. Since the data is stored as raw bytes, C programs can read the output of pack() produced by a Python program.

Format: struct.pack(format, v1, v2, …)

v1, v2, … are the values to pack into the bytes object. They represent the field values of the C structure. Just as a C structure with n fields needs exactly n values, the number of arguments must exactly match the number of values the format requires.

Here, format is the format string that describes the packing. It is needed because the C side expects each field to have a specific datatype and size. The table below lists the most common format characters. Each value needs its own format character to specify its datatype.

Format   C Datatype   Python type
c        char         bytes of length 1
?        _Bool        bool
h        short        integer
l        long         integer
i        int          integer
f        float        float
d        double       float
s        char[]       bytes

Let’s understand this using some examples.

The below snippet stores the 3 integers 1, 2 and 3 in a byte object using pack(). Since the size of an integer is 4 bytes on my machine, you see 3 blocks of 4 bytes, which correspond to 3 integers in C.

import struct

# We pack 3 integers, so the format needs three 'i' characters ('3i' is equivalent)
variable = struct.pack('iii', 1, 2, 3)
print(type(variable), variable)

variable_2 = struct.pack('iic', 1, 2, b'A')
print(variable_2)

Output

<class 'bytes'> b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00'
b'\x01\x00\x00\x00\x02\x00\x00\x00A'
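The other format characters from the table can be combined in the same way. The following is a minimal sketch rather than part of the original example: it assumes a '<' prefix (little-endian, standard sizes, no padding) so that the result is the same on every machine, and it uses a count in front of 's' to give the length of the bytes field.

```python
import struct

# '<' selects little-endian byte order with standard sizes and no padding.
# This prefix is an assumption added for this sketch; the examples above
# use the native default instead.
fmt = '<?hif5s'

# One value per format character; '5s' consumes a single 5-byte bytes object.
packed = struct.pack(fmt, True, 2, 3, 1.5, b'hello')

print(len(packed))   # 1 + 2 + 4 + 4 + 5 = 16 bytes
print(packed)
```

Note that a count in front of 's' is a length ('5s' is one 5-byte field), whereas a count in front of other characters is a repeat ('3i' means the same as 'iii').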

If a value does not match the type its format character requires, the module raises the exception struct.error.

import struct

# Error!! Incorrect datatype assignment
variable = struct.pack('ccc', 1, 2, 3)
print(type(variable), variable)

Output

struct.error: char format requires a bytes object of length 1
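In practice you may want to catch this exception instead of letting the program stop. Here is one possible sketch; try_pack is a hypothetical helper written for this illustration, not a function of the struct module.

```python
import struct

def try_pack(fmt, *values):
    """Hypothetical helper: return the packed bytes, or None on struct.error."""
    try:
        return struct.pack(fmt, *values)
    except struct.error as exc:
        print('Packing failed:', exc)
        return None

ok = try_pack('ccc', b'1', b'2', b'3')   # three bytes objects of length 1
bad_type = try_pack('ccc', 1, 2, 3)      # integers are not valid for 'c'
bad_count = try_pack('iii', 1, 2)        # too few values for the format

print(ok)
```

The same exception covers both a wrong type and a wrong number of values, so a single except clause handles either mistake.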

struct.unpack()

This function unpacks a packed value into its original representation according to the given format. It always returns a tuple, with one element for each value described by the format, even when there is only one.

Format: struct.unpack(format, buffer)

This unpacks the bytes in buffer according to the format specifier. The size of buffer must exactly match the size the format requires.

This is the reverse of struct.pack(). Let’s take one of the byte strings that we produced earlier with pack() and try to get back the Python values passed to it using unpack().

import struct

byte_str = b'\x01\x00\x00\x00\x02\x00\x00\x00A'

# Using the same format specifier as before, since
# we want to get Python values for the same byte-string
tuple_vals = struct.unpack('iic', byte_str)
print(tuple_vals)

Output

(1, 2, b'A')

As you can see, we get back our original Python values in this tuple, provided we use the same format specifier for both pack() and unpack().
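Two details of unpack() are easy to trip over: the result is a tuple even for a single value, and the length of the input must equal the size of the format. A short sketch of both, reusing the 'iic' data from above:

```python
import struct

byte_str = struct.pack('iic', 1, 2, b'A')

# Even a single value comes back wrapped in a tuple, so unpack it
# with a one-element target. The slice keeps only the first integer.
(first,) = struct.unpack('i', byte_str[:struct.calcsize('i')])
print(first)

# The input is one byte longer than 'ii' requires, so this fails.
try:
    struct.unpack('ii', byte_str)
except struct.error as exc:
    print('Unpacking failed:', exc)
```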


struct.calcsize()

This function returns the size in bytes of the struct described by a given format specifier, which is also the length of the bytes object that pack() produces for that format.

Format: struct.calcsize(fmt)

import struct

print('C Integer Size in Bytes:', struct.calcsize('i'))
print('Size of 3 characters in Bytes:', struct.calcsize('ccc'))

Output

C Integer Size in Bytes: 4
Size of 3 characters in Bytes: 3
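The size is not always the sum of the individual field sizes. In the default native mode, the module may add padding bytes so that fields are aligned the way a C compiler would align them. The sketch below compares this with standard mode; the '<' prefix is an addition that the examples above do not use, and the native result depends on the platform.

```python
import struct

# Native mode (the default, same as '@') may insert padding so that
# the int that follows the char starts on an aligned address.
native_size = struct.calcsize('ci')

# Standard mode ('<', '>', '=' or '!') uses fixed sizes and no padding.
standard_size = struct.calcsize('<ci')

print('Native size of "ci":', native_size)        # platform dependent
print('Standard size of "<ci":', standard_size)   # 1 + 4 = 5
```

If the bytes have to match a struct compiled by a C compiler on the same machine, native mode is usually the one to use. For files or network data with a fixed layout, an explicit prefix avoids surprises.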

struct.pack_into()

This function is used to pack values into an existing writable buffer, such as a string buffer created with the ctypes module.

Format: struct.pack_into(fmt, buffer, offset, v1, v2, …)

Here, fmt refers to the format specifier, as always. buffer is the writable buffer that will receive the packed values. offset is a required argument that gives the position, counted in bytes from the start of the buffer, at which packing begins.

This does not return any value and simply stores the packed values in the buffer.

import struct 
import ctypes 

# We will create a string buffer having a size
# equal to that of a struct with 'iic' values.
buf_size = struct.calcsize('iic') 

# Create the string buffer
buff = ctypes.create_string_buffer(buf_size) 
  
# struct.pack_into() writes the packed data into buff in place
struct.pack_into('iic', buff, 0, 1, 2, b'A')

print(buff)

# Display the contents of the buffer
print(buff[:])

Output

<ctypes.c_char_Array_9 object at 0x7f4bccef1040>
b'\x01\x00\x00\x00\x02\x00\x00\x00A'

Indeed, we get our packed values in the buffer string.
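The buffer does not have to come from ctypes, and the offset does not have to be 0. As a small sketch, assuming a bytearray as the buffer and the fixed-size '<ic' format (both choices made for this illustration), two records can be written into one buffer in any order:

```python
import struct

record_size = struct.calcsize('<ic')   # 4 + 1 = 5 bytes per record
buf = bytearray(2 * record_size)       # zero-filled, writable buffer

# Write the second record first, starting at byte offset 5.
struct.pack_into('<ic', buf, record_size, 20, b'B')
# Then write the first record at offset 0.
struct.pack_into('<ic', buf, 0, 10, b'A')

print(bytes(buf))
```

Because pack_into() writes in place, no intermediate bytes objects are created, which can help when filling a large preallocated buffer.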


struct.unpack_from()

Similar to unpack(), a counterpart exists for unpacking values from a buffer string. This does the reverse of struct.pack_into().

Format: struct.unpack_from(fmt, buffer, offset=0)

This will return a tuple of values, similar to struct.unpack().

import struct 
import ctypes 

# We will create a string buffer having a size
# equal to that of a struct with 'iic' values.
buf_size = struct.calcsize('iic') 

# Create the string buffer
buff = ctypes.create_string_buffer(buf_size) 
  
# struct.pack_into() writes the packed data into buff in place
struct.pack_into('iic', buff, 0, 1, 2, b'A')

print(struct.unpack_from('iic', buff, 0))

Output

(1, 2, b'A')
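The offset argument is most useful when several records are stored back to back. The following sketch steps through such data one record at a time; the '<ic' format and the sample records are made up for this illustration.

```python
import struct

fmt = '<ic'
record_size = struct.calcsize(fmt)

# Three records stored one after another in a single bytes object.
samples = [(1, b'A'), (2, b'B'), (3, b'C')]
data = b''.join(struct.pack(fmt, number, tag) for number, tag in samples)

# Read one record at every multiple of the record size.
records = [
    struct.unpack_from(fmt, data, offset)
    for offset in range(0, len(data), record_size)
]
print(records)
```

Unlike unpack(), unpack_from() only requires that the buffer holds at least enough bytes from the offset onward, so no slicing is needed.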

Conclusion

In this article, we learned how to use the Python struct module to work with C-style structures.
