Skip Menu |
 

This queue is for tickets about the SQL-Tokenizer CPAN distribution.

Report information
The Basics
Id: 27797
Status: resolved
Priority: 0/
Queue: SQL-Tokenizer

People
Owner: izut [...] cpan.org
Requestors: jamesml [...] planetolsen.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.07
Fixed in: 0.08



Subject: Fails to tokenize SQL segments containing empty strings (single quotes with nothing inbetween)
Download (untitled) / with headers
text/plain 1.1k
Thank you for a handy module! It seems like using empty strings (two single-quote marks with nothing inbetween) seems to cause a hiccup in the parser and it fails to properly tokenize the string. The following portion of a SQL statement does not get processed into tokens properly: nvl(reward_type,'')='' and group_code = 'XXXX' It gets incorrectly processed into (notice the 4th line, where there are several SQL components all inside of the same "token"): '(', 'reward_type', ',', '\'\')=\'\' and group_code = \'', 'XXXX\'', But if I try to parse this: nvl(reward_type,'x')='x' and group_code = 'XXXX' It gets properly parsed into: 'nvl', '(', 'reward_type', ',', '\'x\'', ')', '=', '\'x\'', ' ', 'and', ' ', 'group_code', ' ', '=', ' ', '\'XXXX\'', I've looked at the regex, and it'll take me some digesting to figure it out so I thought I might throw a bug report your way in case you might be able to figure it out more quickly. Thanks again.. --James Other info: SQL::Tokenizer 0.07 Perl 5.8.6 SunOS 5.10 Generic sun4u sparc SUNW,Sun-Fire-V440
From: igor.sutton [...] gmail.com
Download (untitled) / with headers
text/plain 1.3k
On Thu Jun 28 10:51:37 2007, samej71 wrote: Show quoted text
> Thank you for a handy module! > > It seems like using empty strings (two single-quote marks with nothing > inbetween) seems to cause a hiccup in the parser and it fails to > properly tokenize the string. > > The following portion of a SQL statement does not get processed into > tokens properly: > > nvl(reward_type,'')='' and group_code = 'XXXX' > > It gets incorrectly processed into (notice the 4th line, where there are > several SQL components all inside of the same "token"): > > '(', > 'reward_type', > ',', > '\'\')=\'\' and group_code = \'', > 'XXXX\'', > > But if I try to parse this: > > nvl(reward_type,'x')='x' and group_code = 'XXXX' > > It gets properly parsed into: > > 'nvl', > '(', > 'reward_type', > ',', > '\'x\'', > ')', > '=', > '\'x\'', > ' ', > 'and', > ' ', > 'group_code', > ' ', > '=', > ' ', > '\'XXXX\'', > > I've looked at the regex, and it'll take me some digesting to figure it > out so I thought I might throw a bug report your way in case you might > be able to figure it out more quickly. > > Thanks again.. > > --James > > Other info: > SQL::Tokenizer 0.07 > Perl 5.8.6 > SunOS 5.10 Generic sun4u sparc SUNW,Sun-Fire-V440
I'll be adding a test for it. Thanks for your response.
From: jamesml [...] planetolsen.com
Download (untitled) / with headers
text/plain 1.2k
On Thu Jun 28 10:51:37 2007, samej71 wrote: Show quoted text
> I've looked at the regex, and it'll take me some digesting to figure it > out so I thought I might throw a bug report your way in case you might > be able to figure it out more quickly.
It seems like it was much more simple to fix this than I originally thought. I believe the following simple addition to the regex may fix it: | '' # empty single quoted string | "" # empty double quoted string This segment needs to be added before the other quoted string checks. I inserted them at line 94 to give me: my @query = $query =~ m{ ( (?:>=|<=|==) # >=, <= and == operators | [\(\),=;] # punctuation (parenthesis, comma) | '' # empty single quoted string | "" # empty double quoted string | ".*?(?:(?:""){1,}"|(?<!["\\])"(?!")|\\"{2}) # anything inside double quotes, ungreedy | '.*?(?:(?:''){1,}'|(?<!['\\])'(?!')|\\'{2}) # anything inside single quotes, ungreedy. | --[\ \t\S]* # comments | /\*[\ \t\n\S]*?\*/ # C style comments | [^\s\(\),=;]+ # everything that doesn't matches with above | \n # newline | [\t\ ]+ # any kind of white spaces ) }smxgo;
Subject: Re: [rt.cpan.org #27797] Fails to tokenize SQL segments containing empty strings (single quotes with nothing inbetween)
Date: Thu, 28 Jun 2007 12:53:54 -0300
To: bug-SQL-Tokenizer [...] rt.cpan.org
From: Igor Sutton Lopes <igor.sutton [...] gmail.com>
Download (untitled) / with headers
text/plain 1.6k
On Jun 28, 2007, at 12:09 PM, James via RT wrote: Show quoted text
> > Queue: SQL-Tokenizer > Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=27797 > > > On Thu Jun 28 10:51:37 2007, samej71 wrote:
>> I've looked at the regex, and it'll take me some digesting to >> figure it >> out so I thought I might throw a bug report your way in case you >> might >> be able to figure it out more quickly.
> > It seems like it was much more simple to fix this than I originally > thought. I believe the following simple addition to the regex may > fix it: > > | > '' # empty single quoted string > | > "" # empty double quoted string > > This segment needs to be added before the other quoted string > checks. I > inserted them at line 94 to give me: > > my @query = $query =~ m{ > ( > (?:>=|<=|==) # >=, <= and == operators > | > [\(\),=;] # punctuation (parenthesis, comma) > | > '' # empty single quoted string > | > "" # empty double quoted string > | > ".*?(?:(?:""){1,}"|(?<!["\\])"(?!")|\\"{2}) > # anything inside double quotes, ungreedy > | > '.*?(?:(?:''){1,}'|(?<!['\\])'(?!')|\\'{2}) > # anything inside single quotes, ungreedy. > | > --[\ \t\S]* # comments > | > /\*[\ \t\n\S]*?\*/ # C style comments > | > [^\s\(\),=;]+ # everything that doesn't matches with above > | > \n # newline > | > [\t\ ]+ # any kind of white spaces > ) > }smxgo; >
Unfortunately it breaks the other kind of quotation, like '''word''' - since it is a valid quotation scheme on some databases. I'm checking some way to provide this correctly.
Download PGP.sig
application/pgp-signature 186b

Message body not shown because it is not plain text.

Subject: Re: [rt.cpan.org #27797] Fails to tokenize SQL segments containing empty strings (single quotes with nothing inbetween)
Date: Thu, 28 Jun 2007 13:19:52 -0300
To: bug-SQL-Tokenizer [...] rt.cpan.org
From: Igor Sutton Lopes <igor.sutton [...] gmail.com>
Download (untitled) / with headers
text/plain 1.1k
On Jun 28, 2007, at 12:09 PM, James via RT wrote: Show quoted text
> This segment needs to be added before the other quoted string > checks. I > inserted them at line 94 to give me: > > my @query = $query =~ m{ > ( > (?:>=|<=|==) # >=, <= and == operators > | > [\(\),=;] # punctuation (parenthesis, comma) > | > '' # empty single quoted string > | > "" # empty double quoted string > | > ".*?(?:(?:""){1,}"|(?<!["\\])"(?!")|\\"{2}) > # anything inside double quotes, ungreedy > | > '.*?(?:(?:''){1,}'|(?<!['\\])'(?!')|\\'{2}) > # anything inside single quotes, ungreedy. > | > --[\ \t\S]* # comments > | > /\*[\ \t\n\S]*?\*/ # C style comments > | > [^\s\(\),=;]+ # everything that doesn't matches with above > | > \n # newline > | > [\t\ ]+ # any kind of white spaces > ) > }smxgo;
I figured out what needed, and now I'm uploading the new version - 0.08. At this moment, PAUSE is indexing the new file and it is supposed to be finished soon.
Download PGP.sig
application/pgp-signature 186b

Message body not shown because it is not plain text.

Modified regex to solve issue with single and doubled quotes empty strings.


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

Please report any issues with rt.cpan.org to rt-cpan-admin@bestpractical.com.