This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id:
26436
Status:
resolved
Priority:
Low/Low
Queue:

People
Owner:
Jeff.Fearn [...] gmail.com
Requestors:
eharrison [...] realestate.com.au
Cc:
AdminCc:

BugTracker
Severity:
Important
Broken in:
3.23
Fixed in:
(no value)



Subject: as_trimmed_text in HTML::Element does not trim &nbsp;
sub as_trimmed_text {
    my $text = shift->as_text(@_);
    $text =~ s/[\n\r\f\t ]+$//s;
    $text =~ s/^[\n\r\f\t ]+//s;
    $text =~ s/[\n\r\f\t ]+/ /g;
    return $text;
}

This fails to trim &nbsp; from $text, which is commonly used in HTML.

The following would resolve the problem:

sub as_trimmed_text {
    my $text = shift->as_text(@_);
    $text =~ s/[\n\r\f\t\xA0 ]+$//s;
    $text =~ s/^[\n\r\f\t\xA0 ]+//s;
    $text =~ s/[\n\r\f\t\xA0 ]+/ /g;
    return $text;
}
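To make the difference between the two versions concrete, here is a minimal sketch that applies both sets of regexes to a plain string. The helper names trim_ascii and trim_ascii_and_nbsp are hypothetical and are not part of HTML::Element. They take a string rather than an element so the example runs without HTML::Tree installed, and they assume the text holds decoded characters, with the non-breaking space stored as the single character 0xA0.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper: same regexes as the 3.23 method quoted
# in the report, applied to a plain string instead of an element.
sub trim_ascii {
    my ($text) = @_;
    $text =~ s/[\n\r\f\t ]+$//s;    # strip trailing whitespace
    $text =~ s/^[\n\r\f\t ]+//s;    # strip leading whitespace
    $text =~ s/[\n\r\f\t ]+/ /g;    # collapse inner runs to one space
    return $text;
}

# Hypothetical stand-alone helper: the proposed variant, with \xA0 added
# to each character class.
sub trim_ascii_and_nbsp {
    my ($text) = @_;
    $text =~ s/[\n\r\f\t\xA0 ]+$//s;
    $text =~ s/^[\n\r\f\t\xA0 ]+//s;
    $text =~ s/[\n\r\f\t\xA0 ]+/ /g;
    return $text;
}

# Sample text with non-breaking spaces at both ends and in the middle.
my $sample = "\xA0 Price:\xA0\xA0100 \xA0\n";

my $before = trim_ascii($sample);           # only the final "\n" is removed
my $after  = trim_ascii_and_nbsp($sample);  # "Price: 100"

# Show non-breaking spaces visibly when printing.
for my $result ($before, $after) {
    (my $shown = $result) =~ s/\xA0/<A0>/g;
    print "[$shown]\n";
}
```

With this sample the first helper leaves every non-breaking space in place, while the second strips them at the ends and collapses the inner run to a single space.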
From: perl@cjmweb.net
On Mon Apr 16 22:41:10 2007, gzminiz wrote:
> sub as_trimmed_text {
> This fails to trim &nbsp; from $text which is commonly used in HTML
> The following would resolve the problem:
This behavior is as designed. U+00A0 (&nbsp;) is not considered whitespace in the HTML specification; see http://www.w3.org/TR/html4/struct/text.html#h-9.1

That said, it wouldn't hurt if this were mentioned in the docs for as_trimmed_text.
Updated the docs to be clearer about which whitespace characters will be trimmed.
From: dma_k@mail.ru
On Fri Apr 20 02:31:18 2007, CJM wrote:
> This behavior is as designed. U+00A0 (&nbsp;) is not considered
> whitespace in the HTML specification; see
> http://www.w3.org/TR/html4/struct/text.html#h-9.1
Pity. Trimming it would be useful in many cases, and it is what API consumers expect. Maybe one could introduce another helper that also trims non-breaking spaces? Or accept an additional option as an argument, e.g. as_trimmed_text('trim_nbsp' => 1).
Hi, what I did was add a parameter, extra_chars, that allows the user to supply a string of additional characters to be used in the regexes, e.g. to remove the encoded or un-encoded &nbsp;:

$h->as_trimmed_text(extra_chars => ' \xA0');
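As a rough illustration of how such an option could work, the sketch below splices a caller-supplied string into the character class used by the three regexes. It is only an assumption about the approach, not the code shipped in HTML::Tree: the function name trim_with_extra_chars is hypothetical, and it operates on a plain string rather than an element.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function, not the HTML::Tree implementation.
# The extra_chars option mirrors the parameter name mentioned above.
sub trim_with_extra_chars {
    my ($text, %options) = @_;
    my $extra = defined $options{extra_chars} ? $options{extra_chars} : '';

    # The caller's string is interpolated into the character class as-is,
    # so regex escapes such as \xA0 are honoured, but characters that are
    # special inside a class (']', '^', '-') would need escaping.
    my $ws = qr/[\n\r\f\t $extra]+/;

    $text =~ s/$ws\z//;     # strip trailing run
    $text =~ s/\A$ws//;     # strip leading run
    $text =~ s/$ws/ /g;     # collapse inner runs to one space
    return $text;
}

my $sample = "\xA0 a\xA0\xA0b \xA0";

# Without the option, non-breaking spaces are left untouched.
my $default = trim_with_extra_chars($sample);

# With the option, they are treated like any other whitespace: "a b".
my $trimmed = trim_with_extra_chars($sample, extra_chars => '\xA0');

print length($default), " ", length($trimmed), "\n";
```

Passing the characters in rather than hard-coding \xA0 keeps the default behavior aligned with the HTML specification's definition of whitespace while letting callers opt in to trimming more.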
Subject: 4.0 released
Hi, HTML::Tree version 4.0 has been released, which includes a fix for this issue. Cheers, Jeff.

