
This queue is for tickets about the HTML-Scrubber CPAN distribution.

Report information
The Basics
Id: 69947
Status: open
Priority: 0/
Queue: HTML-Scrubber

People
Owner: Nobody in particular
Requestors: sangeeth2k [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: HTML scrubber validation fails for string 'a>b'
Date: Mon, 1 Aug 2011 14:33:40 -0700
To: bug-HTML-Scrubber [...] rt.cpan.org
From: sangeetha Madangopal <sangeeth2k [...] gmail.com>
Hello,

I am validating HTML::Scrubber to see if it can be used in the project that I am working on. HTML::Scrubber does not handle a string like 'a<b': after scrubbing it returns only 'a' and drops '<b'. I looked at the HTML::Parser documentation to see if I could fix this issue, but had no luck. Please help me configure the scrubber so that strings like 'a<b' pass validation.

Your help is much appreciated!

Thank you,
Sangeetha

Sample script:

    use strict;
    use warnings;
    use Test::More qw( no_plan );
    use_ok('HTML::Scrubber');
    use Data::Dump qw(dump);

    my @rules = (
        script => 0,
        img    => {
            src   => qr{^(?!(?:(java|vb))?script)}i,
            alt   => 1,
            align => 1,
            '*'   => ,
        },
    );

    my @default = (
        0 =>    # default rule, deny all tags
        {
            '*'          => 1,    # default rule, allow all attributes
            'href'       => qr{^(?!(?:(java|vb))?script)}i,
            'src'        => qr{^(?!(?:(java|vb))?script)}i,
            'data'       => qr{^(?!(?:(java|vb))?script)}i,
            'background' => qr{^(?!(?:(java|vb))?script)}i,
            'style'      => 0,
            'data'       => qr{^(?!http://)}i,
            'cite'       => '(?i-xsm:^(?!(?:(java|vb))?script))',
            'language'   => 0,
            'name'       => 1,    # could be sneaky, but hey ;)
            'onblur'      => 0,
            'onchange'    => 0,
            'onclick'     => 0,
            'ondblclick'  => 0,
            'onerror'     => 0,
            'onfocus'     => 0,
            'onkeydown'   => 0,
            'onkeypress'  => 0,
            'onkeyup'     => 0,
            'onload'      => 0,
            'onmousedown' => 0,
            'onmousemove' => 0,
            'onmouseout'  => 0,
            'onmouseover' => 0,
            'onmouseup'   => 0,
            'onreset'     => 0,
            'onselect'    => 0,
            'onsubmit'    => 0,
            'onunload'    => 0,
            'src'         => 0,
            'type'        => 0,
            'allowscriptaccess' => 0,
        }
    );

    my $scrubber = HTML::Scrubber->new( rules => \@rules, default => \@default );
    $scrubber->default(1);

    my $scrubbed_string;
    my $orig_string;
    my @positive_case_strings = (
        'a<b',
        '>x<',
    );

    foreach my $line (@positive_case_strings) {
        $scrubbed_string = $scrubber->scrub($line);
        is(lc($scrubbed_string),lc($line),"XSS controlled \n Orig:$line \n Scrubbed:$scrubbed_string\n");
    }

Result:

    perl s.t
    ok 1 - use HTML::Scrubber;
    Odd number of elements in anonymous hash at s.t line 7.
    not ok 2 - XSS controlled
    #  Orig:a<b
    #  Scrubbed:a
    #
    #   Failed test 'XSS controlled
    #  Orig:a<b
    #  Scrubbed:a
    # '
    #   at s.t line 71.
    #          got: 'a'
    #     expected: 'a<b'
    not ok 3 - XSS controlled
    #  Orig:>x<
    #  Scrubbed:&gt;x
    #
    #   Failed test 'XSS controlled
    #  Orig:>x<
    #  Scrubbed:&gt;x
    # '
    #   at s.t line 71.
    #          got: '&gt;x'
    #     expected: '>x<'
    1..3
    # Looks like you failed 2 tests of 3.
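Separately from the reported behaviour, the "Odd number of elements in anonymous hash" line in the output comes from the `'*' => ,` entry in the `img` rule, which has a key but no value. The following is a minimal sketch in core Perl that reproduces only that warning; the variable names (`$img_rule`, `$img_rule_fixed`) are illustrative, and the value `0` in the corrected hash is an assumption, since the ticket does not say which value was intended.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# A key followed by "=> ," has no value, so the list has seven elements
# and Perl warns at runtime when it builds the hash.
my $img_rule = { src => 1, alt => 1, align => 1, '*' => , };

# Giving the key an explicit value avoids the warning.
# The value 0 is an assumption; the intended value is not stated in the ticket.
my $img_rule_fixed = { src => 1, alt => 1, align => 1, '*' => 0 };

print "warning: $_" for @warnings;
```

With the warning gone, the two failing tests would presumably still fail, since they concern how '<' and '>' in text are handled rather than the rule definition.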
Unconvinced this is a bug: a bare '<' in text is invalid HTML (the character should be escaped as &lt;), so all bets are off.
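If the strings in question are plain text rather than markup, one way to act on this reply is to escape them before they are treated as HTML. The sketch below uses only core Perl; `escape_text` is a hypothetical helper written for illustration, not part of HTML::Scrubber. `HTML::Entities::encode_entities` from the HTML-Parser distribution does a similar job. How the scrubber then renders the escaped string is not shown in this ticket, so the example stops at the escaping step.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are significant in HTML text.
sub escape_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # first, so the entities added below are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# The two inputs from the sample script.
for my $raw ( 'a<b', '>x<' ) {
    print "$raw becomes ", escape_text($raw), "\n";
}
```

This only makes sense for input that should contain no tags at all; applied to real markup, it would neutralise every tag, including the ones the rules are meant to allow.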

