Protecting Python Code from Unauthorized Access

Companies often develop proprietary Python applications containing valuable intellectual property or algorithms that provide them a competitive advantage. However, as Python code is easy to decompile and reverse engineer, businesses worry about their code getting stolen or misused if accessed by unauthorized third parties.

So how can you prevent your Python code from being read by unauthorized users? While there are no bulletproof methods, this article provides an overview of techniques that make your code harder for potential attackers to access and understand.


Potential attackers can use tools like uncompyle6 to easily convert compiled Python code (.pyc files) back into equivalent Python source code. They can then steal algorithms or intellectual property from your code.
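
To see why compiled code is easy to recover, here is a minimal sketch using only the standard library. It is not a decompiler like uncompyle6; it only shows that a compiled code object (the core of a .pyc file) still carries function names, variable names, and constants. The function body and the name secret_factor are made up for illustration.

```python
import dis
import marshal

# Illustrative source; secret_factor is a made-up name for this sketch.
SOURCE = '''
def calc_area(r):
    secret_factor = 3.14159
    return secret_factor * r * r
'''

# Compile to a code object, as CPython does before writing a .pyc file.
module_code = compile(SOURCE, "myscript.py", "exec")

# A .pyc file is essentially a small header plus this marshalled code object.
payload = marshal.dumps(module_code)
recovered = marshal.loads(payload)

# The function's own code object is stored among the module's constants.
func_code = next(c for c in recovered.co_consts if hasattr(c, "co_code"))

print(func_code.co_name)      # the function name survives compilation
print(func_code.co_varnames)  # argument and local variable names survive
print(func_code.co_consts)    # literal constants survive
dis.dis(func_code)            # human-readable bytecode listing
```

Everything printed here is available to anyone holding the compiled file, which is why compiling to .pyc alone offers little protection.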

Some common situations where you’d want to protect Python code:

Developing a SaaS web application with proprietary business logic running on servers
Distributing a Python desktop or mobile app to end users
Licensing a Python library/module to other companies

Fortunately, developers can use obfuscation, encryption, bindings, license keys, and other tricks to make it harder for attackers to misuse Python code.

What Are Some Python Code Protection Techniques?

Here’s an overview of popular techniques to protect Python source code:

1. Obfuscate Code

Code obfuscation modifies and obscures code to make it harder to understand, while retaining original functionality. For Python, it typically does things like:

Renames identifiers like variables and functions to meaningless names
Encrypts literal strings
Removes comments and docstrings
Changes code structure without affecting logic flow

This forces attackers to spend more manual effort to understand what the code is doing before they can steal critical pieces.
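
As a rough illustration of identifier renaming and docstring removal, here is a toy sketch built on the standard ast module. It is not how pyArmor or any other tool listed here works internally; the Renamer class and the _x0-style names are made up for this example. It also ignores many cases a real obfuscator must handle, such as keyword arguments, attributes, imports, and names used before they are assigned.

```python
import ast

SOURCE = '''
def calc_area(radius):
    """Return the area of a circle."""
    scale = 3.14159
    return scale * radius * radius
'''

class Renamer(ast.NodeTransformer):
    """Toy obfuscator: renames functions, arguments, and assigned variables."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Give each original identifier a meaningless replacement.
        if name not in self.mapping:
            self.mapping[name] = f"_x{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        # Drop a leading docstring, if present.
        if ast.get_docstring(node) is not None:
            node.body = node.body[1:] or [ast.Pass()]
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Rename names defined in this source; leave builtins untouched.
        if node.id in self.mapping or isinstance(node.ctx, ast.Store):
            node.id = self._alias(node.id)
        return node

renamer = Renamer()
tree = renamer.visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
obfuscated = ast.unparse(tree)  # requires Python 3.9+
print(obfuscated)

# The renamed code still behaves like the original.
namespace = {}
exec(obfuscated, namespace)
print(namespace[renamer.mapping["calc_area"]](2))
```

The logic is untouched, but the names that documented its intent are gone, which is the effect the list above describes.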

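Example obfuscators: pyArmor, pyShield, PyObfuscate. cx_Freeze is sometimes listed alongside them, but it is a packaging tool that freezes scripts into executables rather than an obfuscator.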

2. Encrypt Code

Encryption transforms code into ciphertext that can only be decrypted at runtime using a secret key. This prevents attackers from directly viewing source code. Encrypted code is decrypted in memory during execution.
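
The decrypt-in-memory idea can be sketched with the standard library alone. The cipher below is a toy stream cipher (a SHA-256 keystream XORed with the data), chosen only to keep the example self-contained; it is not audited, it is not what any tool named in this article uses, and the key name and value are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (SHA-256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_crypt(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

SOURCE = b"def calc_area(r):\n    return 3.14159 * r * r\n"
KEY = b"demo-key-not-for-production"  # hypothetical key for this sketch

# This ciphertext is what would be shipped instead of the source.
ciphertext = xor_crypt(SOURCE, KEY)

# At runtime: decrypt in memory and execute without writing source to disk.
namespace = {}
exec(xor_crypt(ciphertext, KEY).decode("utf-8"), namespace)
print(namespace["calc_area"](2))
```

The sketch also exposes the main weakness of this approach: the key has to be available to the running program, so an attacker who finds it can decrypt the code as well.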

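Example encryptors: pyconcrete, PyProtect. PyCrypto, by contrast, is a general-purpose cryptography library (now unmaintained, with PyCryptodome as its successor), not a tool for encrypting source code.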

3. Compile to Binary/Native Code

Converting Python to binary code or native machine code (C extensions) makes it harder to apply reverse engineering techniques. Compiled binary code can also be obfuscated using traditional protections like stripping symbols.

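Examples: Cython, Nuitka. PyInstaller is often mentioned alongside them, but it bundles the interpreter and compiled bytecode into an executable rather than translating the code to machine code, so the bytecode can still be extracted.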

4. Code Splitting

Breaking code into chunks and only distributing non-critical chunks can help protect intellectual property. Critical pieces can be hosted on private cloud servers and accessed as a service.
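
A minimal sketch of this split, using only the standard library: the proprietary function stays on the server, and the distributed client contains only a thin stub that calls it over HTTP. The names proprietary_score and remote_score, the /score path, and the JSON shape are all assumptions made for this example; a real service would also need authentication and TLS.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def proprietary_score(values):
    """Stand-in for the algorithm that never leaves the server."""
    return sum(v * v for v in values)

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": proprietary_score(payload["values"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

def remote_score(base_url, values):
    """Client-side stub: the only part shipped to end users."""
    request = urllib.request.Request(
        base_url + "/score",
        data=json.dumps({"values": values}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any system proxy, since this demo talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())["score"]

# Demo: run the "private" server locally on a free port.
server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f"http://127.0.0.1:{server.server_port}"

result = remote_score(base_url, [1, 2, 3])
print(result)

server.shutdown()
server.server_close()
```

An attacker who obtains the client learns only the request format, not how the score is computed.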

5. License Keys

Requiring end users to enter a license key or credentials to enable code execution can prevent unauthorized parties from directly using your software. Keys can be verified offline or via a central server.

6. Anti-Debugging/VM Detection

Code can contain tripwires that detect an attached debugger, or that raise errors when running inside the virtual machines and emulators commonly used for reverse engineering. This hampers automated attacks.
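
A simple debugger tripwire might look like the sketch below. It is a heuristic, not a robust defense: it checks whether a trace function is installed and whether common debugger modules (pdb, pydevd, debugpy) have been imported. The function names are made up for this example, and test runners or coverage tools can trigger the same signals.

```python
import sys

def debugger_attached() -> bool:
    """Return True if a trace function is set or a debugger module is loaded."""
    if sys.gettrace() is not None:
        return True
    return any(name in sys.modules for name in ("pdb", "pydevd", "debugpy"))

def guarded_entry_point():
    """Refuse to run the sensitive code path while a debugger is detected."""
    if debugger_attached():
        raise RuntimeError("Debugger detected; refusing to run.")
    return "running normally"

print(debugger_attached())
```

Because the check itself is ordinary Python, it can be patched out by anyone who can edit the code, so it is usually combined with obfuscation or native compilation.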

Now let’s explore some of these techniques through examples.

Obfuscating Code with pyArmor

pyArmor is a popular command-line tool for obfuscating Python scripts. Let’s use it to protect a simple Python script, myscript.py:

# myscript.py

import math

def calc_area(r):
    return math.pi * r * r

radius = 5
print(f"Area of circle with radius {radius} is: {calc_area(radius)}")

First, install pyarmor:

pip3 install pyarmor

Now obfuscate the script:

pyarmor gen myscript.py

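This generates an obfuscated myscript.py in the dist folder, alongside the pyarmor_runtime_000000 package that it imports. Opening the script shows that the original source is gone, replaced by a single call that passes an opaque byte string to the pyArmor runtime: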

# Pyarmor 8.4.6 (trial), 000000, non-profits, 2024-01-16T23:54:04.492165
from pyarmor_runtime_000000 import __pyarmor__
__pyarmor__(__name__, __file__, b'PY000000\x00\x03\x0b\x00\xa7\r\r\n\x80\x00\x01\x00\x08\x00\x00\x00\x04\x00\x00\x00@\x00\x00\x00A\x04\x00\x00\x12\t\x04\x00\x04\xefb\x88Z\x95\xe3\xc9\x07\x01\xc6\x13PO6&\x00\x00\x00\x00\x00\x00\x00\x00\x1cN\xa0\xe1\xc95\x98\x8b\x95\xe7\xd0\x81\x1a\x1c\xbd\xed\xe9\x9b\xa4;\xb4\xea(\x9b7\xd8\xfa\x11\x86\x91\xe0\xf7\xd7\x8d\xd0:\xaf\xa4:\xec\xd0\xae\\\x11\xf71\\]\xd7{x\xcf\x84m\xcc\xf7>_Kl\x057&\x91*79W\xd4%\xa4\x9c{\x8c\x02\xbe\xc6\xf9Niu\xe2R2\xc0\xa9\xf9.\xe7\xd4\x92P\x87\x1b7_3\x7f\xc8~\xf2\x9f\xc7\xf5w\xbfU%\\c\xac\x15\x15\x8aPW-\xd5\xc9\xfdB\xf0O\x9f\xd3\xa0#OE9\xa0w\xb4\xf7\xce#\x8d/\xcc\xf5O\xcf\x17e#\xce\x88\x13\xc3a\xbb\xaf\xf2\x08\xfb*U\xb4\x19l\x8b\x01\x8f\x92\xed{\xd1;\x15\xaf\x06\x1f\xbauC\x06|ep\x86\xb6~*\x7f~\xa4\xab\x99\xaay\xe2\x85N_\xf16\x9eS\x01\x1e\xccG\xfd\xd0|\n\xde\x14\x1f\xf6\x03?x\x1f{ \x89\x95\xefM\\\xd6\x02\xfe3\t2Y\xaf\x98\xf5X\xe5 \xbcq\xce89\x97\x0b\xd0\x82\x9a\x8e\x82\x9e`>\xaeo\x92\xfe"\xbb\xcerT\x93B\xc0\xfc\x12J\x06\xa6\x8fH\xc9\x8b\xd9\xf6e]\x11\xcb\xcd\xbb\x82A`\xd3\xdf\xd3:N\xbc\xddn\xc2\xe2J\xde\xb3\xd1zu\x04+\xed\xbc2?\xa9\xd1\x18h\xef\x83\x8e\x91\xeb_C\xe2\xf1\xf2\x87+<\xd1\x7f4\x1d\xbbJ[\xf3\x16\x8a\xbd\x80\x83\xee\xdaf\x06h\x8a\x03\xf8\x1a>d\xcd\x0f\tDBI[*\xeaR\xda\xf6\xd5\xd0\xf6\xb2\x8a;\xc5\xb3\xd6XB\x0bg\xe7\xe0\xf1Bf\xa6\x82\x9e\xe7(\x9b\xd2\x8e\xce\x7f\x04\xee\xd1\x8cy\xa6O$6E\xad\xa6\xb4\xe3\x7f\x1b\x00h\xed\xc7\xeb\xa7\x18+p\xb0_\x1alu\x0cw\xb5V\x19\xc8\xa2\xb4\xd7\x82S\x91Gn\xa6\xc5\x85\x9dl\x19\xbf\x7f\x16\x06kdb\x83p\x00\xbe\xad^\xde\x08\xe9\\y\x86&\x9c\xb1M\xb8\x9eV\x94LUAc\x02>H\xd2i\xf8\x88s\x03\xb2\xe7*\'\x92\x80\xd3!S\x02:_\xe3g\x1a=T\x81(\xbd<H\xc4E\xc7\xf9\x9f;O\xd0V\xe9\n\xbd\xd0\x8eP\xeaS\xa4\x15\x9c\xa4\xbc\x10x\xf5S\xe2\xd4\x19q|\x95e\x0e-....

The logic remains unchanged, but the source is no longer readable. Recovering variable names or the inner workings requires manually reverse engineering the runtime and its payload.

Let’s run the obfuscated script normally:

python3 dist/myscript.py

The output remains identical. pyArmor has added protection without affecting functionality.

Encrypting Code with pyShield

pyShield is another popular open-source tool. It packs the whole script into an encoded payload that is unpacked at runtime.

Install pyShield:

pip3 install pyshield

Encrypt the sample script:

pyshield -f myscript.py -l3 -o encrypted_myscript.py

This writes encrypted_myscript.py, which contains the transformed source code.

Trying to read the encrypted script reveals nothing useful:

sh-3.2$ cat encrypted_myscript.py 

供伲亪何俢伙保俭仙两='仗乪仮佶佲俳侕侰伥侓';保佀侚侠伧伴事乵仃伥='侟你佈佻丗佪仆代倌俪';佒侹侈乜伿何佛了仓侹='临乴俯丣伏倆佾侈仆佭';亱俠争俒以俟侭亱亅倊='低佺丨伞作侌乣侻伙仢';丷两亄丫伦俎侟俨俏侍='佒佀买丹伩俁伤丱俆丠';侘亮伌佱乭亖乭丠信亥='乽來俙仔仇亰亐佰住丠';伀仰倄佒佀丟他乌侟俾='些们便侷佦佪倏伍伦伆';何仮亝丣佮亸伝們俸亘='俷佅予侑倓亾仳乸伃俠';伎义亓仦俼乃乭乔侟仼='乙侤伂佖低丟伩仪使俓';個亞乒伞侖东佇伃侉乘='伎亇俲仂两侜仳仡乩亜';他仑串乥丠俁俢佼乙佧='俟仱俟俲亘伖乨侯俻佼';亟侗俣严丳倅乣亾俽俢='侖五亰伤举佅侭伹俽侧';估伀仅仼乏俓侠乽仅仔='令伩俖伪仦似佌仟丯亲';乹仈仔亳体佗仮俏五佬='乛仝乔二何侩乤严侈付';亹仭俑伝侥俓倔佻個乧='侱仾亯仁仌侑从伪们佋';佞仝侮丯丽亲伐乂侷书='伐仁俱位乑亣予伓以亍';倇俷丰仚仺侗佾你伵亍='伊伢侔伥仸使仕侶亇亾';伽倌仪乸佬佊伏亼伈仔='亠伝低亵仆仄亖丛俵仢';佒乕们产侟伓乇乍倄侜='丝侌世侗俚佐倅仙乴侚';亦众伫仛伴侗乖侏俁伆='亝丗丝佤乥乿佗倌倄俪';仹个乽什依俸义乵们佦='俈佻俁倏俻丶丙仇些倏';乊侼佀俫侖侑仧侂倃侊='丘乣俬亹仆倂侇丛伝从';俓侌佟义侬佈倇侸倃侞='佨佷伮俤俖俌亓伟侩仪';于便亮倓亲乥併俵亪为='乤佞俍仝仔侁亥俴佹丧';倂丸乪介丰伎他体了佇='亅乒乯佶仚仾丯亪亿丝';俠丩住伆仉会亼乛伃伭='伆佃伙侍买亞丱侎侧体';侎主佢临乬仰伫伣仜亚='乺亃久代亽仓侉亡佻丹';乚亊亩係世世伮争体中='伍修俍伦丵乾个侱亃侱';俙俅俉伐侇俾倄佐仙佔='俹亱佌丵亢侅佊伐予么';侣乴仜何似亪佂产佑们='佱俿亯倍乐佩亭個乢二';exec(__import__('base64').b64decode(b'5L+65L255Lmh5L6z5Lmw5L2o5L255LqM5L2A5L2+PSfkvqvkuKzkv4zkuq/kuqLkupHkuqfkv7nku73kupAnCuS4quS9m+S8iOS7puS8tuS8keS9uOS9jeS+heS4pD0n5L235L215L2k5L+J5Lm45LmT5L2L5L+i5LqO5LymJwrku6HkuqPkvp/kvZfkvJTkvYDkv6jkuLHkur3kuJw9J+S/seS8uOS9tOS9iuS4oeS6u+S4o+S+quS7nOS6pCcK5L+X5LyR5LqZ5L2s5L+q5Lq55L+l5Lid5YCD5LqQPSfkv6Lku5Pkubzkv67kvafkuLXku7LlgJLkvb7ku5AnCuS7qeS/heWAheS9p+S/jOS5qeS4p+S6iuS8seS/qz0n5Lis5L6C5L2O5L+E5Li55L2E5L+o5Liu5LuK5LijJwrkubPku5TkvJvku6bkuKvkvbvku6HkvIrkvr/kv6A9J+S+p+S6iOS7veS9l+S+ruS7r+S8oeS5teS8jOWAjicK5Lij5LmE5Lqs5L2i5L+65LmX5L6i5Lml5Lqq5LyaPSfku4nkvprkvp3ku5zkvKfkuKvkuYDkvLnkuKLkuaknCuS+guS7nOS8heS6juS8tOS+hOS5pOS8meS8iOWAhj0n5Ly25L2H5LmF5LyJ5LqE5LyF5LqO5L645Lu45L+8Jwrkupzku6vku7nkuq7kvZ/kvZzkupHkurHku7HkvYA9J+S8l+S4oeS9s+S/veS9o+S7juS9uuS9h+S7r+S7vScK5Lit5L225YCE5Luo5Lqd5L+F5L+w5LiZ5Lip5L2APSflgJHkv6/kvK/lgITkvK3kvaHkvLnkvKTkuLPkvpInCuS7s+S/hOWAgOS8u+S8i+S/uuS/oeS5q+S5heS8sD0n5LmJ5L+65YCG5L265Lqi5L6m5YCL5L6e5L2d5Ly1JwrkuL/kuq3kupHku4rkubDkvJ3ku6Dkub3ku4/ku7I9J+S6nOS8geS/v+S/n+S6hOS/ruS/leS8m+WAiOS9ticK5YCL5YCR5Lu95Lif5L+h5LmM5YCG5LmJ5L6R5L26PSfkuJ7kvKvkvIPkv4jkvI3kv4rkvbXkur/lgIzkv4wnCuS7reS+reS8qeS+veS6vOS6muS+k+S4muS5geS7lT0n5YCG5YCG5L+I5Lyp5L+i5L2W5Ly+5L6m5Ly85Lu5Jwrku4fkvJbkvbbkvLzkuoLkvaLkuJbkvbLkvprkvaQ9J+S4ne
S7veS+oOS9geWAhuS+heS+keS6p+S/leS5mScK5Liw5L2k5L+p5L6b5Lmk5L6h5LyV5Lq65Lir5L+nPSfkuKzkvbvkurLkuKXkv4jkupXkvLPkvK/kvqHkurUnCuS6lOS8uuS8oeS8iuS5jOS9vuS5r+S7peS4veS5sj0n5YCH5LqB5L+p5L+z5YCI5Lm55Lql5Luv5Lu25L61Jwrkvo7kvL7lgIvkvY/kvrXkubbkvpDkv77kubPku6Y9J+S6nOS9veWAgOS+ouS5jeS4meS6m+WAh+S4tOS5oycK5L2Z.....

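The script is now unreadable at a glance: a long series of assignments with CJK identifiers, followed by an exec() call on a Base64-encoded payload. You can still run it directly with python3 encrypted_myscript.py.

No separate loader is needed. As the exec(__import__('base64').b64decode(...)) call in the listing shows, the script decodes its own payload and executes it at runtime, so the original code does not have to change.

The output matches the original. Keep in mind that Base64 is an encoding rather than encryption, so this outer layer deters casual inspection but can be reversed without a key.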
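
The exec-on-Base64 pattern visible at the end of the listing can be sketched in a few lines. This is an illustration of the general pattern only, not pyShield's actual implementation, which appears to add further layers.

```python
import base64

SOURCE = "def calc_area(r):\n    return 3.14159 * r * r\n"

# Pack: encode the source and wrap it in a one-line loader.
encoded = base64.b64encode(SOURCE.encode("utf-8"))
packed = f"exec(__import__('base64').b64decode({encoded!r}))"
print(packed)

# Run: the packed line behaves like the original module.
namespace = {}
exec(packed, namespace)
print(namespace["calc_area"](2))

# Unpack: anyone can reverse the encoding, because no key is involved.
recovered = base64.b64decode(encoded).decode("utf-8")
print(recovered)
```

Replacing exec with print in a packed script is often enough to reveal the next layer, which is why encoding alone should be treated as a speed bump.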


Licensing Approach with License Keys

Instead of directly protecting code, you can control access using license keys. Users must supply a valid license key before they can use your Python program or library.

Here is sample code to add a basic license key check:

import sys

# Hard-coded set of valid keys (for demonstration only)
VALID_KEYS = {
    'ASDFF-SDASD-FFASF-FDSFD',
}

key = input("Enter license key: ").strip()

if key not in VALID_KEYS:
    print("Invalid license key. Exiting.")
    sys.exit(1)  # non-zero exit status signals failure

print("Valid key. Continue execution...")

This checks if the user-entered key matches a hard-coded set of valid keys before allowing further execution.

You can make this more sophisticated by:

Storing keys in an external encrypted database or file
Dynamically generating keys at runtime
Verifying keys via a web API
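
One way to avoid a hard-coded key list is to make each key carry a signature, so that validity can be checked by recomputing it. The sketch below uses an HMAC from the standard library; the key format, the SECRET value, and the function names are assumptions for illustration, not an established scheme.

```python
import base64
import hashlib
import hmac

# Hypothetical signing secret; it must stay with the vendor.
SECRET = b"vendor-signing-secret"

def issue_key(customer_id: str) -> str:
    """Create a license key of the form '<customer_id>.<signature>'."""
    digest = hmac.new(SECRET, customer_id.encode(), hashlib.sha256).digest()
    signature = base64.b32encode(digest[:10]).decode()  # 16 characters
    return f"{customer_id}.{signature}"

def verify_key(key: str) -> bool:
    """Check that the signature matches the customer ID embedded in the key."""
    customer_id, sep, _ = key.rpartition(".")
    if not sep:
        return False
    expected = issue_key(customer_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(key.encode(), expected.encode())

key = issue_key("acme-001")
print(key)
print(verify_key(key))
print(verify_key("INVALID"))
```

Because HMAC is symmetric, verify_key needs the same secret as issue_key, so this check belongs on a server behind a web API. For offline verification inside shipped code, a public-key signature is the usual choice, since only the public half needs to be distributed.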

Usage:

$ python3 myscript.py
Enter license key: INVALID
Invalid license key. Exiting.

$ python3 myscript.py
Enter license key: ASDFF-SDASD-FFASF-FDSFD
Valid key. Continue execution...

Real-World Business Use Cases

Here are some examples of how companies can utilize code protection in real-world apps:

Web Security SaaS Company

Develops web vulnerability scanner in Python
Critical scanning and parsing algorithms represent core IP
Obfuscates code before shipping to prevent clones from emerging

Python Library Company

Sells proprietary machine learning modules to other companies
Encrypts algorithms to prevent unauthorized use
Requires customers to supply license key that enables encrypted code

Game Developer

PC game written in Python and compiled to executable using PyInstaller
Hackers reverse engineer compiled binary to steal in-game assets
Applies anti-debugging tricks to detect debuggers/emulators and disrupt attackers

Summary

While no technique can completely prevent unauthorized access, strategically combining protections introduces enough friction to deter all but the most skilled and determined attackers. The level of protection should be proportional to the sensitivity of the intellectual property.

For many Python applications, basic obfuscation paired with centralized license key checks offers reasonable protection without much extra overhead. Encryption and anti-tampering mechanisms can add further protection when highly sensitive code is distributed to untrusted environments.

Applying these practices lets creators ship Python software containing proprietary logic and algorithms with a much lower risk of critical pieces leaking out.