How to prevent XSS with HTML/PHP?




How do I prevent XSS (cross-site scripting) using just HTML and PHP?



I've seen numerous other posts on this topic, but I have not found an article that clearly and concisely states how to actually prevent XSS.





Just a note that this won't solve the case where you might want to use user input as an HTML attribute. For example, the source URL of an image. Not a common case, but an easy one to forget.
– Michael Mior
May 16 '11 at 17:12




9 Answers



Basically, you need to use the function htmlspecialchars() whenever you output something to the browser that came from user input.





The correct way to use this function is something like this:


echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
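A minimal sketch of how that call might be wrapped so it is applied the same way everywhere. The helper name e() and the sample values are assumptions for illustration, not part of the answer above. It also touches the attribute case raised in the comment on the question: ENT_QUOTES only helps when the attribute value is wrapped in quotes.

```php
<?php
// Hypothetical helper (the name e() is an assumption): escape a value
// for use in HTML body text or inside a quoted attribute value.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Sample untrusted values, hard-coded here for illustration.
$comment = '<script>alert(1)</script>';
$title   = '" onmouseover="alert(1)';

// Body text: the markup is displayed as text instead of being executed.
echo '<p>' . e($comment) . '</p>', "\n";

// Attribute: always quote the attribute, otherwise escaping quotes does not help.
echo '<span title="' . e($title) . '">hover</span>', "\n";
```

This does not validate URLs: a value such as javascript:alert(1) in an href or src attribute still needs a separate scheme check.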



Google Code University also has these very educational videos on Web Security:



How To Break Web Software - A look at security vulnerabilities in web software



What Every Engineer Needs to Know About Security and Where to Learn It





@TimTim: For most cases, yeah. However, when you need to allow HTML input things get a little trickier and if this is the case I recommend you use something like htmlpurifier.org
– Alix Axel
Jan 3 '10 at 20:23





@Alix Axel, so is your answer to use htmlspecialchars or to use htmlpurifier.org?
– TimTim
Jan 3 '10 at 20:39





If you need to accept HTML input use HTML Purifier, if not use htmlspecialchars().
– Alix Axel
Jan 3 '10 at 20:41







htmlspecialchars or htmlentities ? Check here stackoverflow.com/questions/46483/…
– kiranvj
Nov 16 '12 at 6:19





Most of the time this is correct, but it is not as simple as that. You need to consider putting untrusted strings into HTML, JS, and CSS, and also putting untrusted HTML into HTML. Look at this: owasp.org/index.php/…
– bronze man
May 29 '14 at 17:43



One of my favorite OWASP references is the Cross-Site Scripting explanation because, while there are a large number of XSS attack vectors, following a few rules can defend against the majority of them!



This is the PHP Security Cheat Sheet.





Me too.. This is XSS Filter Evasion Cheat Sheet owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
– user1285107
Oct 16 '12 at 0:54





Not exactly XSS, but I think XSS and CSRF are commonly mixed up and both are really dangerous: owasp.org/index.php/…
– Simon
Jul 20 '15 at 15:26






This page no longer exists
– Mazzy
Sep 4 at 13:23





@Mazzy last cache web.archive.org/web/20180817180409/owasp.org/index.php/…
– Wahyu Kristianto
Sep 7 at 3:57



One of the most important steps is to sanitize any user input before it is processed and/or rendered back to the browser. PHP has some "filter" functions that can be used.



XSS attacks usually take the form of inserting a link to some off-site JavaScript that is malicious towards the user. Read more about it here.



You'll also want to test your site - I can recommend the Firefox add-on XSS Me.





What do I need to make sure I sanitize the input exactly from. Is there one particular character/string that I have to watch out for?
– TimTim
Jan 3 '10 at 20:14





@TimTim - no. All user input should always be considered as inherently hostile.
– zombat
Jan 3 '10 at 20:28





Besides, internal data (from employees, sysadmins, etc.) can be unsafe too. You should identify and monitor (logging the date and user) any data that is displayed with interpretation.
– jedema
Oct 4 at 8:40



In order of preference:

A templating engine's context-aware escaping, such as e('html_attr')

htmlentities($var, ENT_QUOTES | ENT_HTML5, $charset), where $charset is usually 'UTF-8'



Also, make sure you escape on output, not on input.
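A short sketch of why escaping belongs at output time. The sample string is an assumption. Escaping when the data arrives and again when it is rendered double-encodes it, and a value that was escaped for HTML is wrong for any other context, such as a URL.

```php
<?php
// Raw value as the user typed it; this is what would be stored.
$raw = 'Tom & Jerry <3';

// Escaping on input and again on output double-encodes the text.
$escapedOnInput = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$doubleEncoded  = htmlspecialchars($escapedOnInput, ENT_QUOTES, 'UTF-8');

// Escaping only at output time, for the context at hand, keeps it correct.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'); // HTML body or quoted attribute
$forUrl  = rawurlencode($raw);                          // URL path segment or query value

echo $doubleEncoded, "\n";
echo $forHtml, "\n";
echo $forUrl, "\n";
```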



Cross-posting this as a consolidated reference from the SO Documentation beta which is going offline.



Cross-site scripting is the unintended execution of remote code by a web client. Any web application might expose itself to XSS if it takes input from a user and outputs it directly on a web page. If input includes HTML or JavaScript, remote code can be executed when this content is rendered by the web client.



For example, if a third-party site contains a JavaScript file:


// http://example.com/runme.js
document.write("I'm running");



And a PHP application directly outputs a string passed into it:


<?php
echo '<div>' . $_GET['input'] . '</div>';



If an unchecked GET parameter contains <script src="http://example.com/runme.js"></script> then the output of the PHP script will be:




<div><script src="http://example.com/runme.js"></script></div>



The 3rd party JavaScript will run and the user will see "I'm running" on the web page.



As a general rule, never trust input coming from a client. Every GET, POST, and cookie value could be anything at all, and should therefore be validated. When outputting any of these values, escape them so they will not be evaluated in an unexpected way.



Keep in mind that even in the simplest applications data can be moved around and it will be hard to keep track of all sources. Therefore it is a best practice to always escape output.



PHP provides a few ways to escape output depending on the context.



PHP's filter functions allow the input data to a PHP script to be sanitized or validated in many ways. They are useful when saving or outputting client input.
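As a rough sketch of the validation side of these filter functions: the helper names, the range limits, and the fallback values below are assumptions made for illustration. Validation narrows what is accepted, but the accepted value should still be escaped when it is output.

```php
<?php
// Hypothetical helper: accept a page number only if it is an integer in range.
function read_page_number($value): int
{
    $page = filter_var($value, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 1000],
    ]);
    // filter_var() returns false when validation fails; fall back to page 1.
    return $page === false ? 1 : $page;
}

// Hypothetical helper: return the address if it looks like an email, else null.
function read_email($value): ?string
{
    $email = filter_var($value, FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

var_dump(read_page_number('42'));
var_dump(read_page_number('<script>'));
var_dump(read_email('user@example.com'));
var_dump(read_email('"><script>'));
```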



htmlspecialchars will convert any "HTML special characters" into their HTML encodings, meaning they will then not be processed as standard HTML. To fix our previous example using this method:




<?php
echo '<div>' . htmlspecialchars($_GET['input']) . '</div>';
// or
echo '<div>' . filter_input(INPUT_GET, 'input', FILTER_SANITIZE_SPECIAL_CHARS) . '</div>';



Would output:


<div>&lt;script src=&quot;http://example.com/runme.js&quot;&gt;&lt;/script&gt;</div>



Everything inside the <div> tag will not be interpreted as a JavaScript tag by the browser, but instead as a simple text node. The user will safely see:




<script src="http://example.com/runme.js"></script>



When outputting a dynamically generated URL, PHP provides the urlencode function to safely output valid URLs. So, for example, if a user is able to input data that becomes part of another GET parameter:




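<?php
// Percent-encode the untrusted value before placing it in the query string.
$input = urlencode($_GET['input']);
// or
// (FILTER_SANITIZE_URL only strips characters that are not allowed in URLs;
// it does not percent-encode the value.)
$input = filter_input(INPUT_GET, 'input', FILTER_SANITIZE_URL);
echo '<a href="http://example.com/page?input=' . $input . '">Link</a>';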



Any malicious input will be converted to an encoded URL parameter.
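One possible way to combine the two layers of encoding that a link needs, sketched under the assumption that the helper name build_link() and its parameters are free to choose. The query string is percent-encoded first, and the finished URL is then HTML-escaped because it is placed inside an attribute.

```php
<?php
// Hypothetical helper: build a link whose query string carries untrusted data.
function build_link(string $baseUrl, array $params, string $label): string
{
    // http_build_query() percent-encodes each key and value.
    $url = $baseUrl . '?' . http_build_query($params);

    // The finished URL is HTML-escaped because it sits inside an attribute.
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo build_link(
    'http://example.com/page',
    ['input' => '"><script>alert(1)</script>'],
    'Link'
), "\n";
```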



Sometimes you will want to accept HTML or other kinds of code as input. You will need to maintain a list of authorised elements (whitelist) and unauthorised ones (blacklist).



You can download standard lists from the OWASP AntiSamy website. Each list is suited to a specific kind of interaction (eBay API, TinyMCE, etc.), and they are open source.



There are existing libraries that filter HTML and prevent XSS attacks in the general case, perform at least as well as the AntiSamy lists, and are very easy to use. One example is HTML Purifier.


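<?php
function xss_clean($data)
{
// Fix &entity\n;
$data = str_replace(array('&amp;','&lt;','&gt;'), array('&amp;amp;','&amp;lt;','&amp;gt;'), $data);
$data = preg_replace('/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data);
$data = preg_replace('/(&#x*[0-9A-F]+);*/iu', '$1;', $data);
$data = html_entity_decode($data, ENT_COMPAT, 'UTF-8');

// Remove any attribute starting with "on" or xmlns
$data = preg_replace('#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data);

// Remove javascript: and vbscript: protocols
$data = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data);

// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data);

// Remove namespaced elements (we do not need them)
$data = preg_replace('#</*\w+:\w[^>]*+>#i', '', $data);

do
{
// Remove really unwanted tags
$old_data = $data;
$data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
}
while ($old_data !== $data);

// we are done...
return $data;
}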





You shouldn't use preg_replace as it uses eval on your input. owasp.org/index.php/PHP_Security_Cheat_Sheet#Code_Injection
– CrabLab
Mar 11 '17 at 17:19





Many frameworks help handle XSS in various ways. When rolling your own, or if there's some XSS concern, we can leverage filter_input_array (available in PHP 5 >= 5.2.0 and PHP 7).
I typically add this snippet to my SessionController, because all calls go through there before any other controller interacts with the data. In this manner, all user input gets sanitized in one central location. If this is done at the beginning of a project, or before your database is poisoned, you shouldn't have any issues at the time of output. It stops garbage in, garbage out.


/* Prevent XSS input */
/* Note: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1. */
$_GET = filter_input_array(INPUT_GET, FILTER_SANITIZE_STRING);
$_POST = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING);
/* I prefer not to use $_REQUEST...but for those who do: */
$_REQUEST = (array)$_POST + (array)$_GET + (array)$_REQUEST;



The above will remove ALL HTML & script tags. If you need a solution that allows safe tags, based on a whitelist, check out HTML Purifier.



If your database is already poisoned or you want to deal with XSS at time of output, OWASP recommends creating a custom wrapper function for echo, and using it EVERYWHERE you output user-supplied values:




// XSS mitigation functions
function xssafe($data, $encoding = 'UTF-8')
{
    return htmlspecialchars($data, ENT_QUOTES | ENT_HTML401, $encoding);
}

function xecho($data)
{
    echo xssafe($data);
}



You are also able to set some XSS related HTTP response headers via header(...)





X-XSS-Protection "1; mode=block"



to make sure the browser's XSS protection mode is enabled.



Content-Security-Policy "default-src 'self'; ..."



to enable browser-side content security. See this one for Content Security Policy (CSP) details: http://content-security-policy.com/
Especially setting up CSP to block inline-scripts and external script sources is helpful against XSS.



For a general list of useful HTTP response headers concerning the security of your web app, look at OWASP: https://www.owasp.org/index.php/List_of_useful_HTTP_headers
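A sketch of how these headers might be sent from PHP. The helper build_csp() and the example policy are assumptions, and a real policy depends on which script and style sources the site actually uses. Several current browsers reportedly no longer act on X-XSS-Protection, so the Content-Security-Policy header is likely the more useful of the two.

```php
<?php
// Hypothetical helper: turn a directive map into a Content-Security-Policy value.
function build_csp(array $directives): string
{
    $parts = [];
    foreach ($directives as $name => $sources) {
        $parts[] = $name . ' ' . implode(' ', $sources);
    }
    return implode('; ', $parts);
}

// Example policy (an assumption): same-origin content only, no plugins.
$policy = build_csp([
    'default-src' => ["'self'"],
    'script-src'  => ["'self'"],
    'object-src'  => ["'none'"],
]);

// Headers must be sent before any output.
if (!headers_sent()) {
    header('Content-Security-Policy: ' . $policy);
    header('X-XSS-Protection: 1; mode=block');
}
```

A policy without 'unsafe-inline' in script-src blocks inline scripts, which is the setting recommended above.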



Use htmlspecialchars in PHP. In client-side code, try to avoid using:



element.innerHTML = "…";
element.outerHTML = "…";
document.write(…);
document.writeln(…);





when the value being written is controlled by the user.



Also, obviously, try to avoid eval(var). If you have to use any of them, JS-escape and HTML-escape the values first. You might have to do more, but for the basics this should be enough.
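One way the "JS escaping" step could look on the PHP side, as a sketch. The helper name js_literal(), the element id, and the sample value are assumptions. The value is encoded as a JavaScript literal with json_encode() and then assigned through textContent instead of innerHTML.

```php
<?php
// Hypothetical helper: encode a PHP value as a JavaScript literal that is
// safe to place inside an inline <script> block.
function js_literal($value): string
{
    // The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes, so the
    // value cannot close the script element or break out of a string.
    $json = json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
    );
    if ($json === false) {
        throw new InvalidArgumentException('Value cannot be encoded as JSON');
    }
    return $json;
}

// Sample untrusted value, hard-coded here for illustration.
$name = '</script><script>alert(1)</script>';

echo '<script>', "\n";
echo 'var userName = ' . js_literal($name) . ';', "\n";
echo 'document.getElementById("greeting").textContent = userName;', "\n";
echo '</script>', "\n";
```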





