Wednesday, November 28, 2007

[Tips]Bypass htmlentities

author: superhei
date: 2007-11-27
http://www.ph4nt0m.org

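Gareth Heyes published a post on his blog titled "htmlentities is badly designed": http://www.thespanner.co.uk/2007/11/26/htmlentities-is-badly-designed/

The gist is that, with its default parameters, htmlentities() does not escape the single quote ('), which can lead to XSS and similar problems. The PHP manual describes the function as follows: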

htmlentities
(PHP 3, PHP 4, PHP 5)

htmlentities -- Convert all applicable characters to HTML entities
Description
string htmlentities ( string string [, int quote_style [, string charset]] )


This function is identical to htmlspecialchars() in all ways, except with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.

Like htmlspecialchars(), the optional second quote_style parameter lets you define what will be done with 'single' and "double" quotes. It takes on one of three constants with the default being ENT_COMPAT:

Table 1. Available quote_style constants

Constant Name   Description
ENT_COMPAT      Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES      Will convert both double and single quotes.
ENT_NOQUOTES    Will leave both double and single quotes unconverted.


In htmlspecialchars():

'&' (ampersand) becomes '&amp;'

'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.

''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.

'<' (less than) becomes '&lt;'

'>' (greater than) becomes '&gt;'
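The quote handling described above can be sketched as follows. This is a minimal illustration, not part of the original post: `$input` is a hypothetical attacker-controlled value that ends up inside a single-quoted attribute, and the flags and charset are passed explicitly because the defaults differ between PHP versions.

```php
<?php
// Hypothetical attacker-controlled value (an assumption for illustration).
$input = "' onmouseover='alert(1)";

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
$compat = htmlentities($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlentities($input, ENT_QUOTES, 'UTF-8');

// ENT_NOQUOTES: neither kind of quote is converted.
$noquotes = htmlentities('"x"', ENT_NOQUOTES, 'UTF-8');

// With ENT_COMPAT the single quote survives and closes the attribute early.
echo "<a href='" . $compat . "'>link</a>\n";

// With ENT_QUOTES the single quote is emitted as &#039; and stays inside the value.
echo "<a href='" . $quotes . "'>link</a>\n";
```

With ENT_COMPAT, the first line of output contains an injected onmouseover attribute; with ENT_QUOTES, the value stays inside the href.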

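So htmlentities($variable, ENT_QUOTES); is safer than htmlentities($variable);. However, htmlentities() is only a character-processing function, and in many situations XSS and similar attacks remain possible, for example through encodings: utf7, utf8...

A quick test: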

<?php
echo htmlspecialchars($_GET['url'], ENT_QUOTES);
?>

Submit:

url=%2bADw-SCRIPT%2bAD4-alert(document.cookie)%2bADw-%2fSCRIPT%2bAD4-
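To see why this payload gets through, the sketch below parses the query string the way PHP would, applies htmlspecialchars(), and then decodes the result as UTF-7. The helper `utf7_decode_sketch` is a hypothetical, simplified decoder written only for this illustration (it handles ASCII-range characters only); it stands in for a browser that interprets the page as UTF-7.

```php
<?php
// Simplified UTF-7 decoder (hypothetical helper, ASCII-range characters only).
// Each "+<base64>-" run holds UTF-16BE code units encoded in base64.
function utf7_decode_sketch(string $s): string
{
    return preg_replace_callback('#\+([A-Za-z0-9/+]*)-#', function (array $m): string {
        if ($m[1] === '') {
            return '+'; // "+-" encodes a literal plus sign
        }
        $out = '';
        foreach (unpack('n*', base64_decode($m[1])) as $unit) {
            $out .= $unit < 0x80 ? chr($unit) : '?';
        }
        return $out;
    }, $s);
}

$query = 'url=%2bADw-SCRIPT%2bAD4-alert(document.cookie)%2bADw-%2fSCRIPT%2bAD4-';
parse_str($query, $params);   // URL-decodes %2b to "+" and %2f to "/"
$raw = $params['url'];

// No &, <, >, " or ' is present, so escaping changes nothing.
$escaped = htmlspecialchars($raw, ENT_QUOTES);
var_dump($escaped === $raw);

// What a client treating the response as UTF-7 would see.
$decoded = utf7_decode_sketch($escaped);
echo $decoded, "\n";
```

The escaped output is identical to the input, and the markup appears only once the text is decoded as UTF-7. The attack therefore depends on the client treating the response as UTF-7, which is possible when the page does not declare its charset explicitly.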

There are also many double-encoding cases that may get past htmlentities.
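One hypothetical double-encoding scenario is sketched below. The post does not say which cases it has in mind; this sketch assumes an application that calls urldecode() on a value after it has already been escaped, and the payload is made up for illustration.

```php
<?php
// Payload URL-encoded twice: "<" is sent as %253C, "/" as %252F, ">" as %253E.
$query = 'url=%253Cscript%253Ealert(1)%253C%252Fscript%253E';

// PHP decodes the query string once, leaving one layer of encoding in place.
parse_str($query, $params);
$once = $params['url'];

// There is nothing for htmlentities() to convert, so the value passes unchanged.
$escaped = htmlentities($once, ENT_QUOTES, 'UTF-8');
var_dump($escaped === $once);

// Assumed application bug: a second decode applied after escaping.
$final = urldecode($escaped);
echo $final, "\n";
```

In this scenario the escaping is applied to data that is not yet in its final form. Escaping has to be the last transformation before output.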