Trojan Source, an attack that allows adding code changes invisible to the developer

A few days ago, researchers at the University of Cambridge published a technique for subtly inserting malicious code into application source code.

The attack method, already listed as CVE-2021-42574, is called Trojan Source. It is based on crafting text that looks different to the compiler or interpreter than it does to the person viewing the code.

About Trojan Source

The method relies on placing special Unicode control characters in code comments and string literals. These characters change the display order of bidirectional text: with their help, some parts of the text can be displayed from left to right and others from right to left.

In everyday practice, these control characters are used, for example, to insert Hebrew or Arabic strings into a code file. However, if they are used to combine text with different directions on the same line, passages displayed from right to left can overlap existing normal text displayed from left to right.

With this method, a malicious construct can be added to the code and then hidden from anyone viewing it: right-to-left characters placed in the following comment or inside a string literal cause completely different characters to be drawn over the malicious insert. Such code remains syntactically valid, but it is interpreted differently from how it is displayed.
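As a rough illustration of which characters are involved, the sketch below lists the explicit directional formatting characters defined by the Unicode Bidirectional Algorithm, using Python's standard `unicodedata` module. The list `BIDI_CONTROLS` and the helper `describe` are names chosen for this example, not part of the researchers' publication.

```python
import unicodedata

# Code points of the explicit directional formatting characters
# (embeddings, overrides, isolates and their terminators).
BIDI_CONTROLS = [0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
                 0x2066, 0x2067, 0x2068, 0x2069]


def describe(code_point: int) -> str:
    """Return 'U+XXXX  CLASS  NAME' for a single code point."""
    ch = chr(code_point)
    bidi_class = unicodedata.bidirectional(ch)
    return f"U+{code_point:04X}  {bidi_class:<3}  {unicodedata.name(ch)}"


if __name__ == "__main__":
    for cp in BIDI_CONTROLS:
        print(describe(cp))
```

None of these characters has a visible glyph, which is why they are easy to miss in review.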

The researchers summarize their findings as follows: We have discovered ways to manipulate the encoding of source code files so that human viewers and compilers see different logic. One particularly pernicious method uses Unicode directionality override characters to display the code as an anagram of its true logic. We have verified that this attack works against C, C++, C#, JavaScript, Java, Rust, Go, and Python, and we suspect that it will work against most other modern languages.

While reviewing the code, the developer is shown the visual order of the characters and sees an innocuous-looking comment in a text editor, web interface or IDE. The compiler or interpreter, however, uses the logical order of the characters and processes the malicious code as is, regardless of the bidirectional text in the comment. Several popular code editors (VS Code, Emacs, Atom) are affected, as are interfaces for viewing code in repositories (GitHub, GitLab, Bitbucket, and all Atlassian products).
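One way a reviewer could see the logical order that the compiler works with is to replace each invisible control character with a visible tag. The following is a minimal sketch under that assumption; the function name `reveal` and the sample line are hypothetical and only serve the illustration.

```python
import unicodedata

# Bidi classes of the explicit directional formatting characters.
CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF",
                   "LRI", "RLI", "FSI", "PDI"}


def reveal(line: str) -> str:
    """Return the line in logical order with bidi controls shown as <CLASS>."""
    out = []
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        out.append(f"<{bidi_class}>" if bidi_class in CONTROL_CLASSES else ch)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical line; the controls are written as escapes so that this
    # listing itself contains no invisible characters.
    sample = "x = 1  # \u202enote\u202c"
    print(reveal(sample))
```

Because the tags are plain ASCII, the output is rendered in the same order by every viewer.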

There are several ways to use the method for malicious purposes: adding a hidden "return" statement, which terminates the function prematurely; enclosing in a comment expressions that appear to the reviewer to be valid code (for example, to disable important checks); and assigning string values other than the ones displayed, which causes string comparisons and validation to fail.
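The hidden "return" variant can be sketched in Python as follows. This is an illustrative, assumed example rather than the researchers' exact proof of concept: the function `subtract_funds`, the `bank` dictionary and the helper `run_demo` are hypothetical. The source is built from an escaped string, so the listing contains no invisible characters; `\u2067` is RIGHT-TO-LEFT ISOLATE. In logical order the docstring ends before `;return`, while a bidi-aware viewer may draw that tail as though it were still part of the docstring.

```python
# Source text as the interpreter sees it (logical order).
SOURCE = (
    "bank = {'alice': 100}\n"
    "def subtract_funds(account, amount):\n"
    "    ''' Subtract funds from bank account then \u2067''' ;return\n"
    "    bank[account] -= amount\n"
    "    return\n"
)


def run_demo() -> int:
    """Execute SOURCE and return alice's balance after a withdrawal attempt."""
    namespace: dict = {}
    exec(SOURCE, namespace)
    namespace["subtract_funds"]("alice", 30)
    return namespace["bank"]["alice"]


if __name__ == "__main__":
    # The balance is unchanged because the function returns right after
    # the docstring, before the subtraction is reached.
    print(run_demo())
```

The subtraction line is syntactically present and looks reachable to a reviewer, yet it never runs.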

In addition, another attack variant was proposed (CVE-2021-42694), which involves homoglyphs: characters that look alike but have different meanings and different Unicode code points. In some languages these characters can be used in function and variable names to mislead developers. For example, two functions with visually indistinguishable names can be defined to perform different actions. Without detailed analysis, it is not immediately clear which of the two functions is called in a particular place.
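A small sketch of the homoglyph problem: the two names below look the same in many fonts but are different strings, and a crude script check based on Unicode character names can tell them apart. The helpers `scripts_used` and `is_mixed_script` are assumptions made for this example; taking the first word of a character name is only an approximation of its script, not a complete confusable-detection scheme.

```python
import unicodedata


def scripts_used(identifier: str) -> set[str]:
    """Approximate the scripts in an identifier from Unicode character names."""
    return {unicodedata.name(ch).split()[0]
            for ch in identifier if ch.isalpha()}


def is_mixed_script(identifier: str) -> bool:
    """Flag identifiers whose letters come from more than one script."""
    return len(scripts_used(identifier)) > 1


latin = "sayHello"
lookalike = "say\u041dello"  # U+041D is CYRILLIC CAPITAL LETTER EN

if __name__ == "__main__":
    print(latin == lookalike)
    print(sorted(scripts_used(lookalike)))
    print(is_mixed_script(latin), is_mixed_script(lookalike))
```

A check like this produces false positives for legitimately multilingual code, so it is better suited to a warning than to a hard error.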

As a protective measure, it is recommended that compilers, interpreters and build tools that support Unicode report an error or warning when comments, string literals or identifiers contain unpaired control characters that change the output direction. These characters should also be explicitly forbidden in programming language specifications and handled in code editors and repository interfaces.

Fixes for the vulnerability have already been prepared for GCC, LLVM/Clang, Rust, Go, Python and binutils. GitHub, Bitbucket and Jira are also preparing a solution, as is GitLab.

Finally, if you are interested in learning more, you can consult the details at the following link.
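The recommended check can be approximated with a short script. The sketch below flags lines whose directional controls are not paired. It works per line rather than per comment, literal or identifier, and it uses stricter pairing than the Unicode Bidirectional Algorithm itself, so treat it as a simplified assumption and not as the logic used by any particular compiler. The function name `unbalanced_bidi_lines` is hypothetical.

```python
import unicodedata

# Each opening control and the terminator expected to close it.
OPENERS = {"LRE": "PDF", "RLE": "PDF", "LRO": "PDF", "RLO": "PDF",
           "LRI": "PDI", "RLI": "PDI", "FSI": "PDI"}
CLOSERS = {"PDF", "PDI"}


def unbalanced_bidi_lines(source: str) -> list[int]:
    """Return 1-based numbers of lines whose bidi controls are not paired."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        stack = []          # terminators still expected on this line
        balanced = True
        for ch in line:
            bidi_class = unicodedata.bidirectional(ch)
            if bidi_class in OPENERS:
                stack.append(OPENERS[bidi_class])
            elif bidi_class in CLOSERS:
                if stack and stack[-1] == bidi_class:
                    stack.pop()
                else:
                    balanced = False  # terminator without matching opener
        if stack or not balanced:
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    text = "ok\n''' doc \u2067''' ;return\npaired \u2067abc\u2069\n"
    print(unbalanced_bidi_lines(text))
```

Run over a repository in a pre-commit hook or CI job, a check of this kind would surface suspicious lines before a reviewer has to rely on how an editor renders them.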

