Friday, February 14, 2014

How URL encoding relates to XSS

Correct URLs


All URIs must be percent-encoded as specified in RFC 3986.

URLs may consist of:
* reserved characters - delimiters, left unencoded only when used with their special meaning: !*'();:@&=+$,/?#[]
* unreserved characters - never need to be encoded: a-z A-Z 0-9 -._~
* any other character - must be encoded as its UTF-8 bytes, each byte written as a %xx hex escape sequence

This means:
http://example.com/příliš/žluťoučký;kůň=úpěl?ďábelské=ódy is a nice-looking but invalid URL.

The correct form is:
http://example.com/p%C5%99%C3%ADli%C5%A1/%C5%BElu%C5%A5ou%C4%8Dk%C3%BD;k%C5%AF%C5%88=%C3%BAp%C4%9Bl?%C4%8F%C3%A1belsk%C3%A9=%C3%B3dy
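
A minimal Python sketch of that conversion, assuming the standard library's `urllib.parse.quote`; the helper name `encode_part` is made up for illustration. Each part is encoded on its own with `safe=""`, and the delimiters (`/`, `;`, `=`, `?`) are added afterwards, so they keep their special meaning.

```python
from urllib.parse import quote


def encode_part(text: str) -> str:
    """Percent-encode one URL part as UTF-8 bytes (hypothetical helper).

    safe="" means that even "/" is encoded, so a part can never
    introduce a delimiter by accident.
    """
    return quote(text, safe="")


# Encode every part separately, then join the parts with the delimiters.
path = "/" + encode_part("příliš") + "/" + encode_part("žluťoučký")
params = ";" + encode_part("kůň") + "=" + encode_part("úpěl")
query = "?" + encode_part("ďábelské") + "=" + encode_part("ódy")

url = "http://example.com" + path + params + query
print(url)
```

Running it prints the encoded URL shown above.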

Question: Are URLs in my web app broken?
Answer: Unless you always stick to ASCII ... probably yes :)
Better answer: It mostly doesn't matter, browsers are smart enough to correctly parse the simple ones

Semantics of URL parts

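Important to know ⁉
  • URL = decode(encode(URL))
  • URL != encode(decode(URL)) in general, because decoding throws away which characters were delimiters and which were data
Decoding depends on correct knowledge of the parts being decoded! Both URLs below decode to the same string, although the second one has an extra path segment: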
http://example.com/evaluate/3%2B2%2F5 ⇝ http://example.com/evaluate/3+2/5
http://example.com/evaluate/3%2B2/5 ⇝ http://example.com/evaluate/3+2/5
=> ENCODING is responsible for carrying that knowledge into a correctly constructed URL
=> only then can DECODING correctly identify the parts
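
To see why the order matters, here is a small Python sketch (standard library only; `decode_path_segments` is a name invented for this example). Decoding the whole path first makes the two example paths indistinguishable, while splitting on the `/` delimiter first and decoding each segment afterwards keeps them apart.

```python
from urllib.parse import unquote


def decode_path_segments(raw_path: str) -> list:
    """Split on the "/" delimiter first, then decode each segment."""
    return [unquote(segment) for segment in raw_path.split("/") if segment]


first = "/evaluate/3%2B2%2F5"
second = "/evaluate/3%2B2/5"

# Wrong order: decode everything at once. The difference is gone.
print(unquote(first) == unquote(second))  # True

# Right order: split on the delimiters, then decode the parts.
print(decode_path_segments(first))   # ['evaluate', '3+2/5']
print(decode_path_segments(second))  # ['evaluate', '3+2', '5']
```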

Exercise from [2]:
http://example.com/:@-._~!$&'()*+,=;:@-._~!$&'()*+,=:@-._~!$&'()*+,==?/?:@-._~!$'()*+,;=/?:@-._~!$'()*+,;==#/?:@-._~!$&'()*+,;=

What matters are the delimiters, i.e. the reserved characters.
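
If you want to check your answer to the exercise, Python's `urllib.parse.urlsplit` can do the splitting. This is only a sketch of one way to look at it, not something taken from [2]. Note how the first `?` ends the path and the first `#` ends the query, while the same characters later on are plain data.

```python
from urllib.parse import urlsplit

# The exercise URL, broken over three lines at the "?" and "#" delimiters.
url = (
    "http://example.com/:@-._~!$&'()*+,=;:@-._~!$&'()*+,=:@-._~!$&'()*+,=="
    "?/?:@-._~!$'()*+,;=/?:@-._~!$'()*+,;=="
    "#/?:@-._~!$&'()*+,;="
)

parts = urlsplit(url)
print("host:    ", parts.netloc)
print("path:    ", parts.path)
print("query:   ", parts.query)
print("fragment:", parts.fragment)
```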

URLs & HTML = XSS ?


URLs in HTML = replace & with &amp;. But what about URLs and XSS?

Q: Can there be XSS in a URL?
A: Inside an unencoded URL ... YES!
Example URL: http://localhost/'"><img src=x onerror=alert(1)>
⇝ <a href="http://localhost/'"><img src=x onerror=alert(1)>">link</a>

Q: Can there be XSS in a correctly encoded URL?
A: Yes!
Example URL: http://localhost/'+alert(1)+'
Correctly encoded: http://localhost/'+alert(1)+' (unchanged: ', +, ( and ) are all allowed in a path)
<script>location.href='http://localhost/'+alert(1)+'';</script>

Summary

Q: Does it matter if I don't percent-encode my URLs?
A: Only when the parts contain reserved characters.

=> ENCODE user-supplied parts of the URL

http://example.com/evaluate/ + encode("3+2/5") ⇝ http://example.com/evaluate/3%2B2%2F5

Q: Can there be XSS in a URL?
A: Yes.

=> ESCAPE user-supplied URLs based on the surrounding context (see the OWASP cheat sheet)!

⇝ <a href="http&#x3a;&#x2f;&#x2f;localhost&#x2f;&#x27;&#x22;&#x3e;&#x3c;img&#x20;src&#x3d;x&#x20;onerror&#x3d;alert&#x28;1&#x29;&#x3e;">link</a>

<script>location.href='http\x3a\x2f\x2flocalhost\x2f\x27\x22\x3e\x3cimg\x20src\x3dx\x20onerror\x3dalert\x281\x29\x3e';</script>
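
The two escaped snippets above can be produced roughly like this in Python. It is a sketch, not the OWASP reference implementation: `html.escape` is the standard-library function, `escape_js_string` is a hypothetical helper that escapes every character except ASCII letters and digits, and the named entities that `html.escape` emits (`&quot;`, `&lt;`, ...) differ from the numeric ones shown above but serve the same purpose inside a quoted attribute.

```python
import html


def escape_html_attr(value: str) -> str:
    """Escape a value for use inside a quoted HTML attribute."""
    return html.escape(value, quote=True)


def escape_js_string(value: str) -> str:
    """Escape a value for use inside a quoted JavaScript string literal.

    Hypothetical helper: ASCII letters and digits pass through,
    everything else becomes a hex or unicode escape sequence.
    """
    out = []
    for ch in value:
        code = ord(ch)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif code < 0x100:
            out.append(f"\\x{code:02x}")
        elif code < 0x10000:
            out.append(f"\\u{code:04x}")
        else:
            # Characters beyond the BMP need a UTF-16 surrogate pair.
            code -= 0x10000
            out.append(f"\\u{0xD800 + (code >> 10):04x}")
            out.append(f"\\u{0xDC00 + (code & 0x3FF):04x}")
    return "".join(out)


evil = "http://localhost/'\"><img src=x onerror=alert(1)>"
print('<a href="' + escape_html_attr(evil) + '">link</a>')
print("<script>location.href='" + escape_js_string(evil) + "';</script>")
```

The second printed line is the same as the escaped script snippet above; the first one differs only in the entity style.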

Where to continue...


(for serious readers, ordered by simplicity)