This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information

The Basics
Id: 132648
Status: resolved
Priority: Low/Low

People
Owner: Nobody in particular
Requestors: mlsmith997 [...] gmail.com
Cc:
AdminCc:



Subject: Different parse result between debug on/off
Date: Sun, 17 May 2020 17:38:18 +0400
To: bug-Regexp-Grammars@rt.cpan.org
From: Malan Smith <mlsmith997@gmail.com>

Dear Damian

Thank you for the time and effort you put in to make a great Perl feature.

In module version 1.055, I get a different parse result depending on whether debug mode is on or off.  The result with debugging on is the one I assume is correct.

When I use the following parser (to parse a list of #tags), only the first tag is stored when debug is off, but all tags are stored when debug is on.

===

$parser_tags = qr{
    <logfile: temp.log>
    <debug: on>
    <nocontext:>
    <hashtags>
    <rule: hashtags>        ([#]<[tagword]>)*
    <rule: tagword>            [a-z]+
}xms;

$text = "#this #is #a #test";

===

With debug on:
$VAR1 = 'hashtags';
$VAR2 = {
          'tagword' => [
                         'this',
                         'is',
                         'a',
                         'test'
                       ]
        };

With debug off:
$VAR1 = 'hashtags';
$VAR2 = {
          'tagword' => [
                         'this'
                       ]
        };

I am using Windows 10.0.18363.836, and Strawberry Perl 5.30.2.1 with version string:
This is perl 5, version 30, subversion 0 (v5.30.0) built for MSWin32-x64-multi-thread

I attach a minimum working example and the log files.  I really appreciate any ideas you might have regarding this.

Best regards
Malan Smith

Message body is not shown because sender requested not to inline it.

Subject: Re: [rt.cpan.org #132648] Different parse result between debug on/off
Date: Fri, 22 May 2020 22:17:42 +0000
To: bug-Regexp-Grammars@rt.cpan.org
From: Damian Conway <damian@conway.org>

Thanks so much for this exemplary bug report, Malan.

The weird behaviour was caused by the debugger
injecting extra whitespace into the pattern (whilst it was
injecting the extra code blocks needed to do the debugging).

That caused your ([#]<[tagword]>)* to effectively become
(<ws>[#]<[tagword]>)* and the extra space before the [#] caused
the surrounding rule to match arbitrary whitespace before each #
in the string, which meant the (...)* repetition was able to match
all the tags, instead of just the first.

BTW, I presume you've by now discovered that
the original bug in your grammar was that:

    <rule: hashtags>        ([#]<[tagword]>)*

should have been:

    <rule: hashtags>        (<.ws>[#]<[tagword]>)*

or even just:

    <rule: hashtags>        ( [#]<[tagword]>)*
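
The difference between the original rule and the corrected ones can be illustrated with a minimal sketch that uses only core Perl regexes, not Regexp::Grammars itself. It assumes that `\s*` is a fair stand-in for the whitespace matching that a space (or <.ws>) inside the parentheses adds; the helper name `tags_matched` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the tag words consumed from the start of $text
# by a repeated group, with or without leading whitespace inside the group.
sub tags_matched {
    my ($text, $allow_ws) = @_;

    # \s* plays the role of the whitespace matching that a space (or <.ws>)
    # inside the parentheses would add. This is an analogy, not the module's code.
    my $group = $allow_ws ? qr{\s*[#][a-z]+} : qr{[#][a-z]+};

    # Capture the whole run of repetitions that matched at the start of the string.
    my ($run) = $text =~ m{\A((?:$group)*)};

    # Pull the individual tag words out of the matched run.
    return [ $run =~ m{[#]([a-z]+)}g ];
}

my $text = "#this #is #a #test";

print "without whitespace: @{ tags_matched($text, 0) }\n";   # this
print "with whitespace:    @{ tags_matched($text, 1) }\n";   # this is a test
```

Without whitespace inside the group, the repetition stops at the first space after "#this"; with it, each repetition can skip the space before the next "#".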


Meanwhile, I very much appreciate your reporting this issue.
Debuggers really shouldn't change the code they're debugging!
I've just uploaded a new release of the module that fixes that problem.

Thanks again,
Damian
