Insecure deserialization - Python

Need

Secure deserialization process

Context

  • Usage of Python 3 for writing and executing Python code
  • Usage of Pickle for object serialization and deserialization

Description

Non compliant code

import pickle

def deserialize_object(serialized_object):
    return pickle.loads(serialized_object)

The function deserialize_object takes a serialized object as an input and returns the deserialized object. The function uses pickle.loads to deserialize the object.

The vulnerability is that the function passes its input to pickle.loads without verifying where the data came from. A pickle stream is not plain data: it can instruct the unpickler to import and call arbitrary callables (for example through an object's __reduce__ method), so a crafted payload runs attacker-chosen code during deserialization. This is known as insecure deserialization.
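As a minimal and deliberately harmless sketch of this behavior, the class below (the name Payload is hypothetical) returns a callable and its arguments from __reduce__. The expression passed to eval stands in for whatever an attacker would choose to run.

```python
import pickle


class Payload:
    # pickle stores the (callable, args) pair returned here and calls it
    # when the bytes are loaded. The object itself is never reconstructed.
    def __reduce__(self):
        return (eval, ("6 * 7",))


malicious_bytes = pickle.dumps(Payload())

# eval("6 * 7") runs inside pickle.loads, before the caller sees any result.
result = pickle.loads(malicious_bytes)
```

Here result is the value computed by eval, not a Payload instance, which shows that loading the bytes was enough to run the embedded expression.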

Moreover, the function does not restrict or check the type of the object it gets back. Any picklable type can be produced, which can lead to unexpected behavior and further security issues in the code that consumes the result.

Insecure deserialization can lead to various types of attacks, including code execution, denial of service, or even complete system takeover, depending on the context and the specific payload used.

For example, in a Django backend application that unpickles data taken from a request, an attacker could exploit this vulnerability to execute arbitrary code on the server, potentially leading to serious security breaches.

Steps

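  • Replace the pickle module with a data-only format such as JSON (json.loads) or YAML loaded with yaml.safe_load, which do not execute arbitrary code during deserialization. Avoid yaml.load with an unsafe loader, because it can construct arbitrary Python objects.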
  • If you must use pickle, ensure that the serialized data is coming from a trusted and secure source. Never deserialize data received from an untrusted or unauthenticated source.
  • Validate the parsed data against a schema before using it. This will ensure that the incoming data is in the expected format and type.
  • Consider authenticating the serialized data before deserializing it, for example with an HMAC or a digital signature. This lets the receiver detect data that an attacker has tampered with.
  • If the serialized data is confidential, encrypt it with a current authenticated encryption scheme. Encryption without authentication hides the data from an attacker but does not by itself prevent modification.
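For the case where pickle cannot be replaced, one possible hardening measure is to restrict which globals the unpickler may resolve by overriding pickle.Unpickler.find_class. The sketch below is illustrative only: the names ALLOWED_GLOBALS, RestrictedUnpickler and restricted_loads, and the contents of the allow-list, are assumptions and not part of the original example. This narrows the attack surface but does not make untrusted pickle data safe.

```python
import io
import pickle

# Hypothetical allow-list of (module, name) pairs the application expects.
ALLOWED_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    # Drop-in counterpart to pickle.loads that applies the allow-list.
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers such as dictionaries and lists still load, while a payload that references a callable outside the allow-list (eval, os.system and so on) is rejected with pickle.UnpicklingError before the callable is invoked.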

Compliant code

import hashlib
import hmac
import json

from jsonschema import ValidationError, validate

# JSON schema describing the only object shape this function accepts
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
}


def deserialize_object(serialized_object, secret_key):
    # Expected input: "<JSON payload>.<hex HMAC-SHA256 of the payload>".
    # secret_key must be bytes.
    try:
        payload, signature = serialized_object.rsplit(".", 1)
    except ValueError:
        raise ValueError("Invalid serialized object or signature")

    # Verify authenticity before parsing. An explicit check is used instead of
    # assert, because assert statements are removed when Python runs with -O.
    calculated_signature = hmac.new(secret_key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(calculated_signature.encode(), signature.encode()):
        raise ValueError("Invalid serialized object or signature")

    # Deserialize the object (malformed JSON raises a ValueError subclass)
    deserialized_object = json.loads(payload)

    # Validate the deserialized object against the schema
    try:
        validate(instance=deserialized_object, schema=schema)
    except ValidationError:
        raise ValueError("Invalid data format")

    return deserialized_object

The above code provides a secure way to deserialize objects in Python. It uses the json module for deserialization, which is safer than pickle as it does not allow the execution of arbitrary code during deserialization.

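The jsonschema module is used to validate the format and type of the deserialized object against a predefined schema. This ensures that the incoming data is in the expected format and type. As written, the schema checks the types of name and age when they are present; it does not mark them as required or reject additional properties.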

Before deserialization, the code verifies the authenticity of the serialized data using a message authentication code. The signature is calculated with the HMAC-SHA256 algorithm and a secret key shared between the producer and the consumer of the data. The hmac.compare_digest function is used to compare the calculated signature with the provided signature in a way that is resistant to timing attacks.

If the signature verification fails, or if the deserialized object does not match the schema, the function raises a ValueError.

This approach ensures that the serialized data was produced by a party that holds the secret key and that it has not been tampered with, provided the key itself stays secret. It also provides a way to ensure that the deserialized data is in the expected format and type.
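The verification described above expects input of the form payload.signature, but the producing side is not shown. A minimal sketch of that side, assuming the same HMAC-SHA256 scheme and a bytes key, could look like the following. The function name serialize_object and the example key are hypothetical.

```python
import hashlib
import hmac
import json


def serialize_object(obj, secret_key: bytes) -> str:
    # Serialize to compact JSON, then append the hex HMAC-SHA256 of the payload.
    payload = json.dumps(obj, separators=(",", ":"))
    signature = hmac.new(secret_key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


# Example key for illustration only; a real key should come from a secret store.
token = serialize_object({"name": "Alice", "age": 30}, b"example-key-for-illustration-only")
```

Because the hexadecimal signature never contains a dot, splitting on the last dot recovers the payload even when the JSON itself contains dots, for example in floating-point numbers.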

References