CSV injection - Scala

Need

Prevention of CSV injection attacks

Context

  • Usage of Scala for functional and object-oriented programming in the JVM environment
  • Usage of play.api.mvc for handling HTTP requests and responses in Play Framework
  • Usage of play.api.libs.iteratee for asynchronous stream processing in Play Framework

Description

Non compliant code

import play.api.mvc._
import play.api.libs.iteratee._

def exportCSV = Action {
  val data = List(
    List("Name", "Age", "Email"),
    List("John", "30", "john@example.com"),
    // Attacker-controlled value that a spreadsheet will read as a formula
    List("Jane", "=1+1", "jane@example.com")
  )

  // Fields are joined verbatim: nothing is escaped or neutralized
  val content = data.map(_.mkString(",")).mkString("\n")

  Ok.chunked(Enumerator(content.getBytes)).as("text/csv")
}

The Scala code above is a simple Play Framework controller action that exports a list of users as a CSV file. The data list holds a header row followed by one row per user, each represented as a list of strings. The content string is built by joining the fields of each row with commas and then joining the rows with a newline character.

The vulnerability is that user data is written to the CSV file without any sanitization or escaping. If a field contains a formula (such as "=1+1" in the row for "Jane"), that formula is written to the file unchanged. When the file is opened in spreadsheet software such as Excel, the application may evaluate the field as a formula instead of displaying it as text. This is a CSV injection attack.
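To make the problem concrete, the following minimal sketch reproduces the joining logic described above outside of Play and shows that the formula reaches the output untouched. The names VulnerableCsvDemo, rows, and render are illustrative and not part of the original controller.

```scala
object VulnerableCsvDemo {
  // Sample rows; the second field of the data row is attacker-controlled.
  val rows: List[List[String]] = List(
    List("Name", "Age", "Email"),
    List("Jane", "=1+1", "jane@example.com")
  )

  // Same approach as the non-compliant action: join fields and rows verbatim.
  def render(rows: List[List[String]]): String =
    rows.map(_.mkString(",")).mkString("\n")

  def main(args: Array[String]): Unit =
    println(render(rows))
}
```

The second line of the rendered text still contains the raw =1+1 field, which is what the spreadsheet later interprets.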

An attacker can inject formulas to carry out a range of attacks. Depending on the spreadsheet application and its security settings, a formula may read data from the victim's spreadsheet or computer and send it to a remote server, or even launch arbitrary commands (for example, through Dynamic Data Exchange in Excel). This can lead to serious security breaches when exported CSV files are used in a sensitive context.

To fix this vulnerability, sanitize user data before writing it to the CSV file. A simple approach is to prepend a single quote (') to any field that begins with a character a spreadsheet treats as the start of a formula. The spreadsheet then treats the field as text, even if it looks like a formula.

Steps

  • Sanitize user input: Before including any user input in the CSV file, make sure to sanitize it to remove any potentially harmful characters or formulas.
  • Validate input data: Validate the input data to ensure that it meets the expected format and does not contain any malicious content.
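  • Escape special characters: When including user input in the CSV file, escape characters that have a special meaning in CSV, such as double quotes, commas, and line breaks. The usual convention is to wrap the field in double quotes and double any embedded double quote.
  • Use a CSV library: Instead of generating the CSV content manually, use a CSV library that handles formatting and escaping automatically. Note that quoting alone does not stop a spreadsheet from evaluating a field that starts with a formula character, so formula neutralization is still needed.
  • Disable formula evaluation: If the exported CSV file is intended to be opened in Excel, advise users to keep protections against automatic formula and external-content evaluation enabled in the spreadsheet application. This is a client-side setting outside the exporter's control, so treat it as a complement to sanitization rather than a replacement.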
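The validation and escaping steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the object CsvFieldHelpers, the function names, and the rule that an age is one to three digits are all hypothetical choices made for this example.

```scala
object CsvFieldHelpers {
  // Hypothetical allow-list rule for the "Age" column: 1 to 3 digits only.
  // Anything else, including "=1+1", is rejected before export.
  def isValidAge(value: String): Boolean =
    value.matches("[0-9]{1,3}")

  // Quote a field only when it contains a character that is special in CSV:
  // comma, double quote, or a line break. Embedded double quotes are doubled.
  def escapeCsvField(value: String): String = {
    val needsQuoting =
      value.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r')
    if (needsQuoting) "\"" + value.replace("\"", "\"\"") + "\""
    else value
  }
}
```

Validation with an allow-list rejects formula payloads in columns with a known format, while escaping keeps free-text fields from breaking the row structure. Neither step replaces neutralizing leading formula characters in free-text fields.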

Compliant code

import play.api.mvc._
import play.api.libs.iteratee._

// Prefix fields that start with a formula character so they are read as text
def sanitizeInput(input: String): String = {
  if (input.startsWith("=") || input.startsWith("+") || input.startsWith("-") || input.startsWith("@")) {
    "'" + input
  } else {
    input
  }
}

def exportCSV = Action {
  val data = List(
    List("Name", "Age", "Email"),
    List("John", "30", "john@example.com"),
    List("Jane", "=1+1", "jane@example.com")
  )

  // Sanitize every field of every row before building the CSV text
  val sanitizedData = data.map(_.map(sanitizeInput))
  val content = sanitizedData.map(_.mkString(",")).mkString("\n")

  Ok.chunked(Enumerator(content.getBytes)).as("text/csv")
}

The code above is a Play Framework controller action that exports a CSV file. It first creates the list of data to include in the file, then converts it to CSV text with fields separated by commas and rows separated by newlines.

The vulnerability in the original code was that it did not sanitize the user input before including it in the CSV file. This could allow a user to inject formulas into the CSV file, which could be interpreted by Excel when the CSV file is opened. This is known as a CSV injection vulnerability.

The fixed code adds a new function, sanitizeInput, which sanitizes user input before it is written to the CSV file. The function checks whether the input starts with one of the characters that Excel treats as the start of a formula (=, +, -, @). If it does, the function prepends a single quote (') to the input. Excel then treats the field as text rather than a formula, which prevents the CSV injection.

The exportCSV function then uses this sanitizeInput function to sanitize all the data before it is included in the CSV file. This ensures that any user input is properly sanitized, preventing CSV injection vulnerabilities.
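As a possible hardening of the approach described above, the sketch below combines formula neutralization with quoting of every field. It is an illustration under assumptions rather than the article's own fix: the object HardenedCsv and its function names are hypothetical, and treating a leading tab or carriage return as additional trigger characters follows commonly cited guidance (for example, OWASP's notes on CSV injection) rather than the original code.

```scala
object HardenedCsv {
  // Leading characters treated as formula triggers. Tab and carriage return
  // are an assumption added here; the original code checks only = + - @.
  private val formulaTriggers: Set[Char] = Set('=', '+', '-', '@', '\t', '\r')

  // Prepend a single quote when the field starts with a trigger character.
  def neutralizeFormula(value: String): String =
    if (value.headOption.exists(formulaTriggers.contains)) "'" + value
    else value

  // Wrap the field in double quotes and double any embedded double quote.
  def quoteField(value: String): String =
    "\"" + value.replace("\"", "\"\"") + "\""

  // Neutralize first, then quote, then join the fields with commas.
  def toCsvLine(row: Seq[String]): String =
    row.map(v => quoteField(neutralizeFormula(v))).mkString(",")

  // Rows are joined with CRLF, the line break used by RFC 4180.
  def toCsv(rows: Seq[Seq[String]]): String =
    rows.map(toCsvLine).mkString("\r\n")
}
```

One trade-off of this design is that legitimate values starting with a trigger character, such as the negative number -5, are also prefixed, and the added quote becomes part of the exported value for any program that parses the file.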

References