Lack of data validation - Trust boundary violation - Python

Need

Enforce strict data validation and trust boundaries

Context

  • Usage of Python 3 for developing Python applications
  • Usage of Django for building web applications

Description

Non compliant code

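from django.http import JsonResponse
from django.views import View


class MyView(View):
    def get(self, request, *args, **kwargs):
        # Attacker-controlled: read from the query string and never validated
        untrusted_data = request.GET.get('untrusted_data')
        # Defined by the application itself
        trusted_data = "This is some trusted data"
        # Both values are placed side by side in the same structure
        data = {
            "trusted_data": trusted_data,
            "untrusted_data": untrusted_data
        }
        return JsonResponse(data)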

In this Django view, the get method mixes trusted and untrusted data in the same data structure. The trusted_data variable is a string defined in the application code, so it can be trusted. The untrusted_data value, however, comes from the request's GET parameters, which the user fully controls, so it cannot be trusted.

The problem is that untrusted_data is not validated before it is placed alongside trusted_data. An attacker can supply any string, and whatever consumes the resulting structure has no way to tell which values were checked and which were not, so it may handle attacker-controlled content as if it had been vetted.

This is a trust boundary violation because data is not separated and handled according to its level of trust. All data is treated the same, regardless of where it comes from or how far it can be trusted.
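
To make the risk concrete, the following sketch reproduces what the view does to a query string using only the standard library, without Django. The helper build_response_body and the sample payload are hypothetical stand-ins for the view and for an attacker's request; the only point is that the supplied value reaches the response unchanged.

```python
import json
from urllib.parse import parse_qs


def build_response_body(query_string: str) -> str:
    """Hypothetical stand-in for MyView.get: no validation is performed."""
    params = parse_qs(query_string)
    # Like request.GET.get('untrusted_data'): last value for the key, or None
    untrusted_data = params.get("untrusted_data", [None])[-1]
    trusted_data = "This is some trusted data"
    return json.dumps({
        "trusted_data": trusted_data,
        "untrusted_data": untrusted_data,
    })


# Attacker-chosen value, URL-encoded in the query string
body = build_response_body("untrusted_data=%3Cscript%3Ealert(1)%3C%2Fscript%3E")
print(body)
```

Whether such a value causes harm depends on what the consumer does with it, but nothing in the structure marks it as unchecked.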

Steps

  • Separate trusted and untrusted data
  • Validate and sanitize untrusted data
  • Ensure trusted data is not affected by untrusted data
  • Use appropriate data structures or structured messages to keep trusted and untrusted data separate
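
The steps above can be sketched without Django as follows. This is a minimal illustration under stated assumptions: the names TrustedData, ValidatedInput, and validate_username, as well as the allowlist pattern, are hypothetical and do not appear in the original view.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical allowlist: 3 to 30 letters, digits, or underscores
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,30}")


@dataclass(frozen=True)
class TrustedData:
    """Values defined by the application itself."""
    message: str


@dataclass(frozen=True)
class ValidatedInput:
    """User input that has passed validation."""
    username: str


def validate_username(raw: Optional[str]) -> Optional[ValidatedInput]:
    """Return a ValidatedInput, or None if the raw value is rejected."""
    if raw is None or USERNAME_PATTERN.fullmatch(raw) is None:
        return None
    return ValidatedInput(username=raw)


trusted = TrustedData(message="This is some trusted data")
accepted = validate_username("alice_01")
rejected = validate_username("<script>alert(1)</script>")
print(accepted, rejected)
```

Using distinct, immutable types means a raw string cannot be passed where validated input is expected without going through the validator first, and trusted values cannot be overwritten by later code.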

Compliant code

from django.http import JsonResponse
from django.views import View
from django.core.validators import validate_email
from django.core.exceptions import ValidationError


class MyView(View):
    def get(self, request, *args, **kwargs):
        untrusted_data = request.GET.get('untrusted_data')
        trusted_data = "This is some trusted data"

        # Validate untrusted data; replace it with a placeholder if invalid
        try:
            validate_email(untrusted_data)
        except ValidationError:
            untrusted_data = "Invalid data"

        # Separate data structures for trusted and untrusted data
        trusted_data_structure = {
            "trusted_data": trusted_data,
        }

        untrusted_data_structure = {
            "untrusted_data": untrusted_data,
        }

        # Combine the data structures
        data = {**trusted_data_structure, **untrusted_data_structure}

        return JsonResponse(data)

The above code addresses the vulnerability by validating the untrusted value before it is used and by building the trusted and untrusted data in separate dictionaries, so that untrusted input cannot alter the trusted data while it is being processed.

The untrusted data is validated with Django's built-in validate_email function, which checks whether the value is a valid email address. If it is not, a ValidationError is raised and the untrusted data is replaced with a placeholder string indicating that the data is invalid. This is validation rather than sanitization: the value is either accepted unchanged or discarded.

The two dictionaries are then combined into a single dictionary with the ** unpacking operator so that they can be returned in one JsonResponse. The merge happens only after validation, and the key names still identify which value came from which source. Because later keys overwrite earlier ones in such a merge, the two dictionaries must not share key names.

This approach ensures that untrusted data crosses the trust boundary only after it has been validated, instead of being mixed unchecked with trusted data in the same data structure or structured message, which fixes the trust boundary violation.
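
As an alternative to merging with **, a response can keep the two origins under separate top-level keys so that consumers can still tell them apart. The sketch below assumes one possible response shape and is not the layout used by the original view; build_envelope and the key names "trusted" and "untrusted" are hypothetical. It also shows why a flat merge needs distinct key names.

```python
def build_envelope(trusted: dict, untrusted: dict) -> dict:
    """Nest each origin under its own key instead of flattening them."""
    return {"trusted": dict(trusted), "untrusted": dict(untrusted)}


trusted = {"role": "viewer"}
# Hypothetical case where user input supplies a colliding key name
untrusted = {"role": "admin"}

flat = {**trusted, **untrusted}  # later keys win: the trusted value is lost
nested = build_envelope(trusted, untrusted)

print(flat)    # {'role': 'admin'}
print(nested)  # {'trusted': {'role': 'viewer'}, 'untrusted': {'role': 'admin'}}
```

The nested form changes the response format, so it only fits where clients can be adapted to read the two sections separately.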

References