This queue is for tickets about the HTML-Tidy CPAN distribution.

Report information
The Basics
Id: 11120
Status: resolved
Priority: 0/
Queue: HTML-Tidy

People
Owner: Nobody in particular
Requestors: anders [...] it.lth.se
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Date: Thu, 20 Jan 2005 10:55:40 +0100
From: Anders Ardo <anders [...] it.lth.se>
To: bug-html-tidy [...] rt.cpan.org
CC: anders.ardo [...] it.lth.se
Subject: Loading of tidy config files / small patch
Hi Andy Lester,

I'm using your HTML::Tidy with success - thanks! It's used to clean HTML files inside a focused Web crawler. In this context it would be extremely handy to be able to influence the output from Tidy with some of its many configuration options. So here is a small patch that implements that. Could you please have a look at it and see if it merits inclusion in the distribution? Thanks.

The approach taken is to provide the configuration filename as a parameter to the new() method and then use it in calls to the internal _tidy_clean procedure. An alternative would of course be to have a new method that sets the config-file name more explicitly.

The patch passes your tests and meets my requirements, although I haven't tested it extensively or added a test to the 'make test' section.

The other small change I've made is to add a "\n" to the end of the HTML string to be cleaned. It turned out that in a few cases tidy produced incomplete output (which is disastrous in my application). If you clean the included t.html, the output ends with a '<p>' instead of '</body></html>' as it should. Adding "\n" to the end of the HTML string fixes that.

t.pl is a small test script, usage: ./t.pl < t.html
tidy.cfg is a Tidy configuration file used by t.pl

Please let me know if there is anything else I can do to get this patch into the distribution.

Cheers
Anders

--
Anders Ardö
Department of Information Technology, Lund Institute of Technology
Tel: +46 46 2227522 ; URL: http://www.it.lth.se/anders/
Attachment: tidycfg.tgz (application/x-gtar, 2.2k)
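
The two changes described in the report can be sketched roughly as follows. This is not the attached patch, whose contents are not shown in this ticket. It assumes the configuration file is passed to new() as config_file in a hash reference, which is how released HTML::Tidy versions document it; the patch itself may have used a different calling form. The helper names with_trailing_newline and clean_html are hypothetical, and the newline helper only mirrors the workaround from the report (the module may already do this internally).

```perl
use strict;
use warnings;

# Hypothetical helper: make sure the markup ends with a newline before it
# is handed to Tidy, mirroring the workaround described in the report.
sub with_trailing_newline {
    my ($html) = @_;
    return $html =~ /\n\z/ ? $html : $html . "\n";
}

# Hypothetical wrapper: clean a string with HTML::Tidy, optionally using a
# Tidy configuration file.
# Assumption: config_file is given in a hash reference to new(); the patch
# in this ticket may have passed the filename differently.
sub clean_html {
    my ($html, $config_file) = @_;
    require HTML::Tidy;    # loaded lazily, only when cleaning is requested
    my $tidy = HTML::Tidy->new(
        defined $config_file ? { config_file => $config_file } : ()
    );
    return $tidy->clean( with_trailing_newline($html) );
}
```

A script in the spirit of t.pl could then read the document from standard input and call, for example, print clean_html($html, 'tidy.cfg');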

From: rhesa
[anders@it.lth.se - Thu Jan 20 05:08:31 2005]:

Thanks a million for this patch! It solves all of my issues with HTML::Tidy :-)
RT-Send-CC: rhesa [...] cpan.org
This is going into 1.05_02 that I'm releasing tonight. If nothing goes wrong in the few days that follow, I'll release it as 1.06. Thanks very much.

