Directory traversal attack
A directory traversal (or path traversal) attack exploits insufficient security validation or sanitization of user-supplied file names, such that characters representing "traverse to parent directory" are passed through to the operating system's file system API. An affected application can be exploited to gain unauthorized access to the file system.
Directory traversal is also known as the ../ (dot dot slash) attack, directory climbing, and backtracking. Some forms of this attack are also canonicalization attacks.
Example
A typical example of a vulnerable application in PHP code is:
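<?php
// Default template, overridden by a cookie that the client fully controls.
$template = 'red.php';
if (isset($_COOKIE['TEMPLATE'])) {
    $template = $_COOKIE['TEMPLATE'];
}
// Vulnerable: the cookie value is appended to the path without any validation.
include "/home/users/phpguru/templates/" . $template;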
An attack against this system could be to send the following HTTP request:
GET /vulnerable.php HTTP/1.0
Cookie: TEMPLATE=../../../../../../../../../etc/passwd
The server would then generate a response such as:
HTTP/1.0 200 OK
Content-Type: text/html
Server: Apache

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh
daemon:*:1:1::/tmp:
phpguru:f8fk3j1OIf31.:182:100:Developer:/home/users/phpguru/:/bin/csh
The repeated ../ characters after /home/users/phpguru/templates/ have caused include() to traverse to the root directory, and then include the Unix password file /etc/passwd.
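The path resolution behind this attack can be reproduced without touching the file system. The following is a minimal sketch in Python rather than PHP; the function name resolve is hypothetical, and posixpath.normpath only approximates what the operating system does when it opens the path (it does not follow symbolic links).

```python
import posixpath

# Directory from the PHP example above.
TEMPLATE_DIR = "/home/users/phpguru/templates/"


def resolve(user_value: str) -> str:
    """Append the user-supplied name to the template directory and collapse '..' segments."""
    return posixpath.normpath(TEMPLATE_DIR + user_value)


print(resolve("red.php"))
# /home/users/phpguru/templates/red.php
print(resolve("../../../../../../../../../etc/passwd"))
# /etc/passwd
```

Only four ../ segments are needed to reach the root from this directory; the surplus ones are harmless because the parent of the root directory is the root itself.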
The Unix /etc/passwd file is commonly used to demonstrate directory traversal because attackers often read it as a first step toward cracking passwords. On more recent Unix systems, however, /etc/passwd no longer contains the hashed passwords; they are stored in /etc/shadow, which unprivileged users on the machine cannot read. Even in that case, reading /etc/passwd still reveals the list of user accounts.
Variations
Directory traversal in its simplest form uses the ../ pattern. Some common variations are listed below:
Microsoft Windows
Microsoft Windows and DOS directory traversal uses the ..\ or ../ patterns.[1]
Each partition has a separate root directory (labeled C:\, where C stands for the drive letter of the partition), and there is no common root directory above that. This means that for most directory traversal vulnerabilities on Windows, attacks are limited to a single partition.
Directory traversal has been the cause of numerous Microsoft vulnerabilities.[2][3]
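The effect of both separators, and the fact that .. cannot climb past the root of a drive, can be sketched with Python's ntpath module, which applies Windows path rules on any platform. The base directory and the function name resolve_windows are hypothetical.

```python
import ntpath

# Hypothetical web root on the C: partition.
BASE = "C:\\inetpub\\wwwroot\\"


def resolve_windows(user_value: str) -> str:
    """Append a user-supplied name to BASE and normalize it using Windows path rules."""
    return ntpath.normpath(BASE + user_value)


# Backslashes and forward slashes are treated the same way.
print(resolve_windows("..\\..\\Windows\\win.ini"))  # C:\Windows\win.ini
print(resolve_windows("../../Windows/win.ini"))     # C:\Windows\win.ini

# Extra '..' segments stop at the root of the drive; the drive letter never changes.
escaped = resolve_windows("..\\" * 6 + "Windows\\win.ini")
print(ntpath.splitdrive(escaped)[0])  # C:
```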
Percent encoding in URIs
Some web applications attempt to prevent directory traversal by scanning the path of a request URI for patterns such as ../. This check is sometimes mistakenly performed before percent-decoding, causing URIs containing patterns like %2e%2e%2f to be accepted despite being decoded into ../ before actual use.[4]
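The ordering mistake can be illustrated with a short Python sketch. Both function names are hypothetical, and a substring scan is used only because it is the kind of check described above; it is not a complete defense on its own.

```python
from urllib.parse import unquote


def naive_is_safe(raw_path: str) -> bool:
    """Flawed check: scans for '../' before percent-decoding."""
    return "../" not in raw_path


def decoded_is_safe(raw_path: str) -> bool:
    """Decode first, then scan the form that will actually be used."""
    return "../" not in unquote(raw_path)


raw = "/files/%2e%2e%2f%2e%2e%2fetc/passwd"
print(naive_is_safe(raw))    # True: the encoded pattern slips through
print(unquote(raw))          # /files/../../etc/passwd
print(decoded_is_safe(raw))  # False
```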
Double encoding
Percent-decoding may accidentally be performed more than once: once before validation and again afterwards. This makes the application vulnerable to recursively percent-encoded input such as %252e%252e%252f (a single percent-decoding pass turns %25 into a literal % sign). This kind of vulnerability notably affected versions 5.0 and earlier of Microsoft's IIS web server software.[5]
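The following Python sketch shows how a doubly encoded payload survives a check that follows a single decoding pass and only becomes dangerous when a later component decodes it again. The function name decode_once_and_check is hypothetical.

```python
from urllib.parse import unquote


def decode_once_and_check(raw: str) -> str:
    """Percent-decode exactly once, validate the result, and return it."""
    decoded = unquote(raw)
    if "../" in decoded or "..\\" in decoded:
        raise ValueError("path traversal sequence in request")
    return decoded


payload = "%252e%252e%252f"
first_pass = decode_once_and_check(payload)  # passes validation
second_pass = unquote(first_pass)            # accidental second decode
print(first_pass)   # %2e%2e%2f
print(second_pass)  # ../
```

The validated value is only safe if nothing decodes it a second time, so one reasonable design is to decode in exactly one place and pass the decoded string unchanged to the file system call.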
UTF-8
A badly implemented UTF-8 decoder may accept characters encoded using more bytes than necessary, leading to alternative character representations, such as %2f and %c0%af both representing /. This is specifically forbidden by the UTF-8 standard,[6] but has still led to directory traversal vulnerabilities in software such as the IIS web server.[7]
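To make the overlong form concrete, the sketch below contrasts a deliberately flawed two-byte decoder with Python's built-in strict decoder. The function lenient_decode_two_bytes is hypothetical and exists only to show the arithmetic; it is not taken from any real product.

```python
def lenient_decode_two_bytes(first: int, second: int) -> str:
    """Decode a two-byte UTF-8 sequence WITHOUT rejecting overlong forms (deliberately flawed)."""
    # 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy
    code_point = ((first & 0x1F) << 6) | (second & 0x3F)
    return chr(code_point)


def strict_decode(data: bytes) -> str:
    """Python's decoder rejects overlong sequences with UnicodeDecodeError."""
    return data.decode("utf-8")


# %c0%af: 0xC0 contributes no payload bits and 0xAF contributes 0x2F, the code for '/'.
print(lenient_decode_two_bytes(0xC0, 0xAF))  # /

try:
    strict_decode(bytes([0xC0, 0xAF]))
except UnicodeDecodeError:
    print("rejected by strict decoder")
```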
Archives
Some archive formats like zip allow for directory traversal attacks: an entry in the archive can carry a path containing ../ segments, so that naive extraction backtracks out of the destination directory and overwrites files elsewhere on the filesystem. Code that extracts archive files can be written to check that the paths of the files in the archive do not engage in path traversal.
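Such a check might look like the following Python sketch. The function name safe_extract is hypothetical. Python's own zipfile module already strips .. components and leading slashes during extraction, so the explicit check here mainly illustrates the idea and rejects a suspicious archive outright instead of silently rewriting its paths.

```python
import os
import zipfile


def safe_extract(archive_path: str, dest_dir: str) -> None:
    """Extract a zip archive only if every entry resolves inside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as archive:
        # Validate every entry before writing anything to disk.
        for member in archive.namelist():
            target = os.path.realpath(os.path.join(dest_root, member))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member!r}")
        archive.extractall(dest_root)
```

Checking all entries first means a malicious archive is refused as a whole, rather than being partially extracted before the bad entry is reached.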
Prevention
A possible algorithm for preventing directory traversal would be to:
- Process URI requests that do not result in a file request, e.g., executing a hook into user code, before continuing below.
- When a URI request for a file/directory is to be made, build a full path to the file/directory if it exists, and normalize all characters (e.g., %20 converted to spaces).
- It is assumed that a 'Document Root' fully qualified, normalized, path is known, and this string has a length N. Assume that no files outside this directory can be served.
- Ensure that the first N characters of the fully qualified path to the requested file are exactly the same as the 'Document Root'.
- If so, allow the file to be returned.
- If not, return an error, since the request is clearly out of bounds from what the web-server should be allowed to serve.
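The steps above can be sketched in Python as follows. This is one possible reading of the algorithm, not a definitive implementation: the function name resolve_request_path is hypothetical, and os.path.commonpath is used in place of a raw comparison of the first N characters, because a plain prefix match would also accept a sibling directory whose name merely begins with the document root (for example /var/www-private when the root is /var/www).

```python
import os
from urllib.parse import unquote


def resolve_request_path(document_root: str, uri_path: str) -> str:
    """Map a request path to a file path, refusing anything outside document_root."""
    # Fully qualified, normalized document root (symbolic links resolved).
    root = os.path.realpath(document_root)
    # Normalize characters once (e.g. %20 -> space), then build the full path.
    relative = unquote(uri_path).lstrip("/")
    candidate = os.path.realpath(os.path.join(root, relative))
    # The candidate must be the root itself or lie beneath it.
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError("request is outside the document root")
    return candidate
```

A caller would serve the returned path on success and translate PermissionError into an error response.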
Using a hard-coded predefined file extension to suffix the path does not necessarily limit the scope of the attack to files of that file extension.
<?php
include($_GET['file'] . '.html');
The user can use the NULL character (indicating the end of the string) to make PHP ignore everything after the $_GET value, including the appended extension. (This is PHP-specific; PHP 5.3.4 and later reject paths containing NULL bytes.)
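Because a suffix alone does not constrain the path, one possible mitigation is to validate the user-supplied name against a strict allowlist before appending the extension. The Python sketch below is an assumption added for illustration, not part of the algorithm above; the pattern and the function name page_file are hypothetical.

```python
import re

# Hypothetical allowlist: letters, digits, underscore and hyphen only.
PAGE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def page_file(name: str) -> str:
    """Return the file name for a page, rejecting separators, dots and NUL bytes."""
    if not PAGE_NAME.fullmatch(name):
        raise ValueError("invalid page name")
    return name + ".html"


print(page_file("about"))  # about.html
```

With this check in place, input such as ../../etc/passwd followed by a NULL character is rejected before any path is built.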
See also
- Chroot jails may be subject to directory traversal if incorrectly created. Possible directory traversal attack vectors are open file descriptors to directories outside the jail. The working directory is another possible attack vector.
References
- [1] "Naming Files, Paths, and Namespaces". Microsoft. Quote: "File I/O functions in the Windows API convert '/' to '\' as part of converting the name to an NT-style name."
- [2] Burnett, Mark (December 20, 2004). "Security Holes That Run Deep". SecurityFocus.
- [3] "Microsoft: Security Vulnerabilities (Directory Traversal)". CVE Details.
- [4] "Path Traversal". OWASP.
- [5] "CVE-2001-0333". Common Vulnerabilities and Exposures.
- [6] "RFC 2279 - UTF-8, a transformation format of ISO 10646". IETF.
- [7] "CVE-2002-1744". Common Vulnerabilities and Exposures.
Resources
- Open Web Application Security Project
- The WASC Threat Classification – Path Traversal
- Path Traversal Vulnerability Exploitation and Remediation
- CWE Common Weakness Enumeration - Path Traversal
External links
- DotDotPwn – The Directory Traversal Fuzzer
- Conviction for using directory traversal
- Bugtraq: IIS %c1%1c remote command execution
- Cryptogram Newsletter, July 2001