
This queue is for tickets about the Filter-Simple CPAN distribution.

Report information
The Basics
Id: 68672
Status: open
Priority: 0/
Queue: Filter-Simple

People
Owner: Nobody in particular
Requestors: chm [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in:
  • 0.84
  • 0.87
Fixed in: (no value)



Subject: sequences of /= lines get masked as comments
If you use FILTER_ONLY with 'all', the source filter
sees every line as expected, and badfilter catches
and fixes the $a:$b syntax error.

If you use FILTER_ONLY with the 'code_no_comments'
argument instead, the code is mis-parsed: the source
filter never sees the text between the two lines that
contain /=, so the syntax fix is missed and compilation
fails.
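
For reference, here is a minimal sketch of what the substitution in the filter is meant to do when it does see the offending line. The helper name fix_colon is hypothetical and only used for illustration; the regex is the one from badfilter.pm, applied here to a plain string rather than through Filter::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of badfilter.pm): applies the same
# substitution that the FILTER_ONLY callback applies to $_.
sub fix_colon {
    my ($src) = @_;
    $src =~ s/(\w+):\$/$1,\$/;   # turn "word:$" into "word,$"
    return $src;
}

my $line = '$lu->($row:$n1,$r1);';
print fix_colon($line), "\n";    # prints: $lu->($row,$n1,$r1);
```

With 'all' the callback receives this line and rewrites it; with 'code_no_comments' the line apparently never reaches the callback, so the colon survives and perl reports the syntax error.
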
Subject: badfilter.pm
package badfilter;

use Filter::Simple;

FILTER_ONLY
   code_no_comments =>
   # all =>
 sub {
    s/(\w+):\$/$1,\$/;
 };

1;

The bisect.pm file with the failing lines example
did not get attached so here it is.  This is what
I get if I run 'perl -c bisect.pm' as attached:

$ cat badfilter.pm
package badfilter;

use Filter::Simple;

FILTER_ONLY
   code_no_comments =>
   # all =>
 sub {
    s/(\w+):\$/$1,\$/;
 };

1;

$ cat bisect.pm
use badfilter;

$b /= 1;

$lu->($row:$n1,$r1);

$b /= 1;

$ perl -c bisect.pm
syntax error at bisect.pm line 5, near "$row:"
bisect.pm had compilation errors.


If I edit badfilter.pm to change from using
'code_no_comments' to 'all' in the FILTER_ONLY
specification, I get this:

$ cat badfilter.pm
package badfilter;

use Filter::Simple;

FILTER_ONLY
   # code_no_comments =>
   all =>
 sub {
    s/(\w+):\$/$1,\$/;
 };

1;

$ perl -c bisect.pm
bisect.pm syntax OK


Subject: bisect.pm
use badfilter;

$b /= 1;

$lu->($row:$n1,$r1);

$b /= 1;

Tracing through the code in Filter::Simple, I determined
that the problem is the extract_quotelike() call from the
Text::Balanced module: it treats the two lines containing
/= as the beginning and end of a multiline match operator
written without the leading m.

Hardwiring the value of $allow_raw_match to 0 makes
the extract_quotelike() work correctly.  I don't know what
would be entailed to work around the problem better.
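
A minimal sketch of that mis-parse, calling extract_quotelike() directly rather than through Filter::Simple. The input string is an assumption made for illustration: it is the text of bisect.pm starting at the first slash, roughly what the extractor would be looking at once the leading "$b " has been consumed.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_quotelike);

# Assumed input: bisect.pm from the first "/" onward.
my $text = "/= 1;\n\n\$lu->(\$row:\$n1,\$r1);\n\n\$b /= 1;\n";

# In list context extract_quotelike() returns the extracted string,
# the remainder and the skipped prefix, and leaves $text unmodified.
my ($extracted, $remainder) = extract_quotelike($text);

# Guard against a failed extraction so the prints never warn.
$extracted = '' unless defined $extracted;
$remainder = '' unless defined $remainder;

print "extracted: [$extracted]\n";
print "remainder: [$remainder]\n";
```

If the diagnosis above holds, the extracted string runs from the first slash to the second one, so the $row:$n1 line ends up inside what is taken for a quotelike and is hidden from a 'code_no_comments' filter.
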

Subject: Re: [rt.cpan.org #68672] sequences of /= lines get masked as comments
Date: Wed, 8 Jun 2011 12:26:57 +1000
From: Damian Conway <damian [...] conway.org>

> Hardwiring the value of $allow_raw_match to 0 makes
> the extract_quotelike() work correctly.  I don't know what
> would be entailed to work around the problem better.

[Comment only, as I'm no longer maintaining the module]

Reimplementing the filtering framework around PPI would be the
only way to significantly improve its accuracy. But that would
then require PPI added to the Perl core, which would be an
excellent addition, but seems very unlikely to happen. :-(

Damian

Subject: Re: [rt.cpan.org #68672] sequences of /= lines get masked as comments
Date: Wed, 08 Jun 2011 07:43:46 -0400
From: chm <devel.chm.01 [...] gmail.com>

On 6/7/2011 10:27 PM, damian@conway.org via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=68672>
>
>> Hardwiring the value of $allow_raw_match to 0 makes
>> the extract_quotelike() work correctly.  I don't know what
>> would be entailed to work around the problem better.
>
> [Comment only, as I'm no longer maintaining the module]
>
> Reimplementing the filtering framework around PPI would be the
> only way to significantly improve its accuracy. But that would
> then require PPI added to the Perl core, which would be an
> excellent addition, but seems very unlikely to happen. :-(
>
> Damian

Thanks for the reply.  I managed to submit reports for
Filter::Simple and Text::Balanced issues for this problem:
If Text::Balanced were to expose the internal functionality
that would address this problem for our uses.

--Chris

