Skip Menu |
 

Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the CGI CPAN distribution.

Report information
The Basics
Id: 83265
Status: resolved
Priority: 0/
Queue: CGI

People
Owner: MARKSTOS [...] cpan.org
Requestors: rrt [...] sc3d.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 3.63
Fixed in: (no value)



Subject: url() incorrectly unescapes request_uri
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 2598
Download (untitled) / with headers
text/plain 2.5k
My CGI script gets a request such as: GET /~rrt/Software/DarkGlass/Why%20DarkGlass%3F HTTP/1.1 Note the "%3F" at the end of the URL: it's an escaped question mark. My script then asks for the URL using "url()", and gets: "Software/DarkGlass/Why DarkGlass" The question mark has been stripped off, because after unescaping in the line: my $request_uri = unescape($self->request_uri) || ''; the code then strips the query off: my $uri = $rewrite && $request_uri ? $request_uri : $script_name; $uri =~ s/\?.*$//s; # remove query string But this is wrong: if a query string is passed, it uses an unescaped question mark. The "unescape" routine is itself undocumented, being part of CGI::Util which is for internal use only. I'm not an expert, so I don't know if some unescaping should be being done, but it seems that at the very least, question marks should not be unescaped at that point. However, since the question mark denoting a query string should be a literal question mark character in the original request URI, it seems to be that it should be safe to remove the query string before unescaping the URI. The code in question has not been changed since the start of CGI.pm's git history, so I can't find any relevant commit logs. If I'm correct, I'm still not exactly sure what a minimum patch would be, as while the obvious thing to do would be to move the call of unescape to after the stripping of the query string, that would mean applying unescape to the value of $script_name in some circumstances, which might be OK, or might not, I don't know. But I guess $script_name shouldn't contain a query string, so in fact, stripping the query string should only be needed in the case that $uri is set to $request_uri, so it should be possible to rewrite the relevant lines from: my $request_uri = unescape($self->request_uri) || ''; my $query_str = $self->query_string; my $rewrite_in_use = $request_uri && $request_uri !~ /^\Q$script_name/; my $uri = $rewrite && $request_uri ? $request_uri : $script_name; $uri =~ s/\?.*$//s; # remove query string to: my $request_uri = $self->request_uri || ''; my $query_str = $self->query_string; $request_uri =~ s/\?.*$//s; # remove query string $request_uri = unescape($request_uri); my $rewrite_in_use = $request_uri && $request_uri !~ /^\Q$script_name/; my $uri = $rewrite && $request_uri ? $request_uri: $script_name;
From mark [...] summersault.com Mon Feb 11 10: 42:11 2013
MIME-Version: 1.0
X-Spam-Status: No, score=-6.202 tagged_above=-99.9 required=10 tests=[AWL=0.033, BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, SPF_SOFTFAIL=0.665] autolearn=ham
In-Reply-To: <rt-3.8.HEAD-14426-1360596581-520.83265-4-0 [...] rt.cpan.org>
X-Spam-Flag: NO
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-14426-1360596581-520.83265-4-0 [...] rt.cpan.org>
X-Virus-Scanned: Debian amavisd-new at bestpractical.com
Message-ID: <5119113B.6090107 [...] summersault.com>
Content-Type: text/plain; charset=UTF-8
X-RT-Original-Encoding: utf-8
X-Spam-Score: -6.202
Received: from localhost (localhost [127.0.0.1]) by hipster.bestpractical.com (Postfix) with ESMTP id 440E8240446 for <cpan-bug+CGI [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 10:42:11 -0500 (EST)
Received: from hipster.bestpractical.com ([127.0.0.1]) by localhost (hipster.bestpractical.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DBulOfEhtpAv for <cpan-bug+CGI [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 10:42:09 -0500 (EST)
Received: from la.mx.develooper.com (x1.develooper.com [207.171.7.70]) by hipster.bestpractical.com (Postfix) with SMTP id 869022403CF for <bug-CGI [...] rt.cpan.org>; Mon, 11 Feb 2013 10:42:09 -0500 (EST)
Received: (qmail 4645 invoked by uid 103); 11 Feb 2013 15:42:08 -0000
Received: from x16.dev (10.0.100.26) by x1.dev with QMQP; 11 Feb 2013 15:42:08 -0000
Received: from tanagra.summersault.com (HELO tanagra.summersault.com) (12.161.105.149) by 16.mx.develooper.com (qpsmtpd/0.84/v0.84-167-g4ed6cab) with ESMTP; Mon, 11 Feb 2013 07:42:04 -0800
Received: (qmail 46099 invoked from network); 11 Feb 2013 15:42:01 -0000
Received: from simba.summersault.com (HELO ?192.168.97.182?) (192.168.97.182) by tanagra.summersault.com with SMTP; 11 Feb 2013 15:42:01 -0000
Delivered-To: cpan-bug+CGI [...] hipster.bestpractical.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
Subject: Re: [rt.cpan.org #83265] url() incorrectly unescapes request_uri
Return-Path: <mark [...] summersault.com>
X-Spam-Check-BY: 16.mx.develooper.com
X-Original-To: cpan-bug+CGI [...] hipster.bestpractical.com
X-RT-Mail-Extension: cgi
Date: Mon, 11 Feb 2013 10:41:47 -0500
X-Spam-Level:
To: bug-CGI [...] rt.cpan.org
X-Enigmail-Version: 1.5
Content-Transfer-Encoding: 7bit
From: Mark Stosberg <mark [...] summersault.com>
RT-Message-ID: <rt-3.8.HEAD-27811-1360597332-906.83265-0-0 [...] rt.cpan.org>
Content-Length: 682
Download (untitled) / with headers
text/plain 682b
Show quoted text
> My CGI script gets a request such as: > > GET /~rrt/Software/DarkGlass/Why%20DarkGlass%3F HTTP/1.1 > > Note the "%3F" at the end of the URL: it's an escaped question mark. > > My script then asks for the URL using "url()", and gets: > > "Software/DarkGlass/Why DarkGlass" > > The question mark has been stripped off
In this article, I describe a number of Perl solutions for handling URI escaping and unescaping: http://mark.stosberg.com/blog/2010/12/percent-encoding-uris-in-perl.html Why not run your sample URI through a few of them and see how their fare in handling encoded question marks? I think that would help clarify the standard and common behaviors. Mark
MIME-Version: 1.0
In-Reply-To: <rt-3.8.HEAD-27811-1360597332-906.83265-0-0 [...] rt.cpan.org>
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-14426-1360596581-520.83265-4-0 [...] rt.cpan.org> <5119113B.6090107 [...] summersault.com> <rt-3.8.HEAD-27811-1360597332-906.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-26949-1360597752-1634.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
From: rrt [...] sc3d.org
X-RT-Original-Encoding: utf-8
Content-Length: 884
Download (untitled) / with headers
text/plain 884b
On Mon Feb 11 10:42:12 2013, mark@summersault.com wrote: Show quoted text
> > My CGI script gets a request such as: > > > > GET /~rrt/Software/DarkGlass/Why%20DarkGlass%3F HTTP/1.1 > > > > Note the "%3F" at the end of the URL: it's an escaped question mark. > > > > My script then asks for the URL using "url()", and gets: > > > > "Software/DarkGlass/Why DarkGlass" > > > > The question mark has been stripped off
> > In this article, I describe a number of Perl solutions for handling URI > escaping and unescaping: > > http://mark.stosberg.com/blog/2010/12/percent-encoding-uris-in-perl.html > > Why not run your sample URI through a few of them and see how their fare > in handling encoded question marks? > > I think that would help clarify the standard and common behaviors.
Possibly I wasn't clear: I don't have a problem with the way unescape works, it's just being applied too early.
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-13344-1360611257-357.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
Content-Length: 93
Please provide a new failing automated test for the test suite which showcases the problem.
MIME-Version: 1.0
In-Reply-To: <rt-3.8.HEAD-13344-1360611257-357.83265-0-0 [...] rt.cpan.org>
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
References: <rt-3.8.HEAD-13344-1360611257-357.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-12503-1360611917-511.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
From: rrt [...] sc3d.org
X-RT-Original-Encoding: utf-8
Content-Length: 669
Download (untitled) / with headers
text/plain 669b
On Mon Feb 11 14:34:17 2013, MARKSTOS wrote: Show quoted text
> Please provide a new failing automated test for the test suite which > showcases the problem.
Before I do that, I'd like to be confident I've understood the problem, since I'm no expert at this stuff. As it happens, I think the problem and the fix may be simpler than I previously thought: I don't think that url() should call unescape at all. The routine reads a URI from either the $self->request_url or the $script_name; it then returns a URL. So, what is it doing unescaping at all? If you agree with my analysis, then the fix is simply to remove the call to "unescape", and a failing test should be easy to write.
From mark [...] summersault.com Mon Feb 11 15: 45:07 2013
MIME-Version: 1.0
X-Spam-Status: No, score=-6.202 tagged_above=-99.9 required=10 tests=[AWL=0.033, BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, SPF_SOFTFAIL=0.665] autolearn=ham
In-Reply-To: <rt-3.8.HEAD-12503-1360611917-1046.83265-5-0 [...] rt.cpan.org>
X-Spam-Flag: NO
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-13344-1360611257-357.83265-5-0 [...] rt.cpan.org> <rt-3.8.HEAD-12503-1360611917-1046.83265-5-0 [...] rt.cpan.org>
X-Virus-Scanned: Debian amavisd-new at bestpractical.com
Message-ID: <51195839.1050608 [...] summersault.com>
Content-Type: text/plain; charset=UTF-8
X-RT-Original-Encoding: utf-8
X-Spam-Score: -6.202
Received: from localhost (localhost [127.0.0.1]) by hipster.bestpractical.com (Postfix) with ESMTP id 4CF882405CE for <cpan-bug+CGI [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 15:45:07 -0500 (EST)
Received: from hipster.bestpractical.com ([127.0.0.1]) by localhost (hipster.bestpractical.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hEZGZZZaiHjD for <cpan-bug+CGI [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 15:45:02 -0500 (EST)
Received: from la.mx.develooper.com (x1.develooper.com [207.171.7.70]) by hipster.bestpractical.com (Postfix) with SMTP id 00B242405C8 for <bug-CGI [...] rt.cpan.org>; Mon, 11 Feb 2013 15:45:01 -0500 (EST)
Received: (qmail 31067 invoked by uid 103); 11 Feb 2013 20:45:00 -0000
Received: from x16.dev (10.0.100.26) by x1.dev with QMQP; 11 Feb 2013 20:45:00 -0000
Received: from tanagra.summersault.com (HELO tanagra.summersault.com) (12.161.105.149) by 16.mx.develooper.com (qpsmtpd/0.84/v0.84-167-g4ed6cab) with ESMTP; Mon, 11 Feb 2013 12:44:58 -0800
Received: (qmail 73994 invoked from network); 11 Feb 2013 20:44:55 -0000
Received: from simba.summersault.com (HELO ?192.168.97.182?) (192.168.97.182) by tanagra.summersault.com with SMTP; 11 Feb 2013 20:44:55 -0000
Delivered-To: cpan-bug+CGI [...] hipster.bestpractical.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
Subject: Re: [rt.cpan.org #83265] url() incorrectly unescapes request_uri
Return-Path: <mark [...] summersault.com>
X-Spam-Check-BY: 16.mx.develooper.com
X-Original-To: cpan-bug+CGI [...] hipster.bestpractical.com
X-RT-Mail-Extension: cgi
Date: Mon, 11 Feb 2013 15:44:41 -0500
X-Spam-Level:
To: bug-CGI [...] rt.cpan.org
X-Enigmail-Version: 1.5
Content-Transfer-Encoding: 7bit
From: Mark Stosberg <mark [...] summersault.com>
RT-Message-ID: <rt-3.8.HEAD-12503-1360615508-875.83265-0-0 [...] rt.cpan.org>
Content-Length: 1148
Download (untitled) / with headers
text/plain 1.1k
Show quoted text
> Before I do that, I'd like to be confident I've understood the problem, > since I'm no expert at this stuff. As it happens, I think the problem > and the fix may be simpler than I previously thought: I don't think that > url() should call unescape at all. The routine reads a URI from either > the $self->request_url or the $script_name; it then returns a URL. So, > what is it doing unescaping at all?
The authority on the issue would be the relevant RFCs for the CGI and HTTP protocols. I linked to the relevant resources through the article I liked above. My recollection at the moment is that that is correct for the request to be URI-unescaped as it comes it. If it was wrong to apply unescaping here, I think a few more people would have noticed in the 10+ years CGI.pm has been in use. I think it's far more likely that there's an edge case with handling encoded question marks in URIs. The other way to increase your confidence is to apply the test you develop against the other modules I reference in the article above. If all other solutions but CGI.pm produce the same result, you can be certain there's a bug in CGI.pm. Mark
MIME-Version: 1.0
In-Reply-To: <rt-3.8.HEAD-12503-1360615508-875.83265-0-0 [...] rt.cpan.org>
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-13344-1360611257-357.83265-5-0 [...] rt.cpan.org> <rt-3.8.HEAD-12503-1360611917-1046.83265-5-0 [...] rt.cpan.org> <51195839.1050608 [...] summersault.com> <rt-3.8.HEAD-12503-1360615508-875.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-14426-1360617903-1483.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
From: rrt [...] sc3d.org
X-RT-Original-Encoding: utf-8
Content-Length: 3281
Download (untitled) / with headers
text/plain 3.2k
On Mon Feb 11 15:45:08 2013, mark@summersault.com wrote: Show quoted text
> > > The authority on the issue would be the relevant RFCs for the CGI and > HTTP protocols. I linked to the relevant resources through the article I > liked above.
As far as I can tell, that article only links to RFCs that define URLs and URIs, not the RFCs for CGI and HTTP. I looked at RFC3875, the definition of CGI 1.1: http://www.ietf.org/rfc/rfc3875 It says: In all cases, the string is first processed with regard to any reserved characters present, and then the resulting data can be URL-decoded by replacing "%" escape sequences by their character values. This suggests that at least the first part of my interpretation is correct, namely, the URL should not be unescaped until after the query string has been removed. Show quoted text
> If it was wrong to apply unescaping > here, I think a few more people would have noticed in the 10+ years > CGI.pm has been in use.
I'm not so sure. First, the url() function/method to which I refer is not itself strictly covered by any spec: it's a consumer of the CGI interface, it doesn't implement it. Therefore, what it does is largely up to you. Secondly, I've had a FIXME sitting in my code for some years mentioning that '+' characters seem to be incorrectly unescaped, and wondering whether it's a bug in CGI.pm, but I never got around to investigating further until now. However, the url() method does claim to return a URL, and at present it seems it doesn't really do this, as it really returns a URL-decoded string. RFC 3986 (pointed to by your artice) says: "When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters." That seems to say pretty much the same as the CGI RFC, i.e. CGI.pm shouldn't unescape before taking the URL apart. It goes on to say: "The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time." But the unescape function also decodes characters in the reserved set, so it can't be used when it is. Finally, supporting my assertion that the URL method should not percent-decode at all, RFC3986 says: "Once produced, a URI is always in its percent-encoded form." Since the url() method/function returns a URL/URI, it shouldn't percent-decode it. (BTW, I agree with your assertion that it's best to use the terms "percent-decode" and "percent-encode", and I'll try to use them and them alone in future!) Show quoted text
> The other way to increase your confidence is to apply the test you > develop against the other modules I reference in the article above. If > all other solutions but CGI.pm produce the same result, you can be > certain there's a bug in CGI.pm.
I'm sorry, I don't really understand this suggestion. As far as I can tell, the "other modules" you refer to are escaping/unescaping modules. I am suggesting that the URL should not be unescaped: i.e. I would not call any of those modules. I don't see how not using a different module from the one I'm currently not using would change things. Apologies if I'm being dense here!
From mark [...] summersault.com Mon Feb 11 16: 35:43 2013
MIME-Version: 1.0
X-Spam-Status: No, score=-6.202 tagged_above=-99.9 required=10 tests=[AWL=0.033, BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, SPF_SOFTFAIL=0.665] autolearn=ham
In-Reply-To: <rt-3.8.HEAD-14426-1360617904-1713.83265-5-0 [...] rt.cpan.org>
X-Spam-Flag: NO
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-13344-1360611257-357.83265-5-0 [...] rt.cpan.org> <rt-3.8.HEAD-12503-1360611917-1046.83265-5-0 [...] rt.cpan.org> <51195839.1050608 [...] summersault.com> <rt-3.8.HEAD-12503-1360615508-875.83265-5-0 [...] rt.cpan.org> <rt-3.8.HEAD-14426-1360617904-1713.83265-5-0 [...] rt.cpan.org>
X-Virus-Scanned: Debian amavisd-new at bestpractical.com
Message-ID: <51196413.7090004 [...] summersault.com>
Content-Type: text/plain; charset=UTF-8
X-RT-Original-Encoding: utf-8
X-Spam-Score: -6.202
Received: from localhost (localhost [127.0.0.1]) by hipster.bestpractical.com (Postfix) with ESMTP id D428C2405CE for <cpan-bug+cgi [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 16:35:43 -0500 (EST)
Received: from hipster.bestpractical.com ([127.0.0.1]) by localhost (hipster.bestpractical.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id um2T3xLOzERm for <cpan-bug+cgi [...] hipster.bestpractical.com>; Mon, 11 Feb 2013 16:35:39 -0500 (EST)
Received: from la.mx.develooper.com (x1.develooper.com [207.171.7.70]) by hipster.bestpractical.com (Postfix) with SMTP id C07322405C8 for <bug-cgi [...] rt.cpan.org>; Mon, 11 Feb 2013 16:35:37 -0500 (EST)
Received: (qmail 2661 invoked by uid 103); 11 Feb 2013 21:35:37 -0000
Received: from x16.dev (10.0.100.26) by x1.dev with QMQP; 11 Feb 2013 21:35:37 -0000
Received: from tanagra.summersault.com (HELO tanagra.summersault.com) (12.161.105.149) by 16.mx.develooper.com (qpsmtpd/0.84/v0.84-167-g4ed6cab) with ESMTP; Mon, 11 Feb 2013 13:35:31 -0800
Received: (qmail 79168 invoked from network); 11 Feb 2013 21:35:29 -0000
Received: from simba.summersault.com (HELO ?192.168.97.182?) (192.168.97.182) by tanagra.summersault.com with SMTP; 11 Feb 2013 21:35:29 -0000
Delivered-To: cpan-bug+cgi [...] hipster.bestpractical.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
Subject: Re: [rt.cpan.org #83265] url() incorrectly unescapes request_uri
Return-Path: <mark [...] summersault.com>
X-Spam-Check-BY: 16.mx.develooper.com
X-Original-To: cpan-bug+cgi [...] hipster.bestpractical.com
X-RT-Mail-Extension: cgi
Date: Mon, 11 Feb 2013 16:35:15 -0500
X-Spam-Level:
To: bug-cgi [...] rt.cpan.org
Content-Transfer-Encoding: 7bit
From: Mark Stosberg <mark [...] summersault.com>
RT-Message-ID: <rt-3.8.HEAD-12503-1360618544-1342.83265-0-0 [...] rt.cpan.org>
Content-Length: 91
Thanks for your help, and the detailed follow-up research and references. I'll take look.
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-11426-1360940430-516.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
Content-Length: 1479
Download (untitled) / with headers
text/plain 1.4k
I looked into this further. I found that REQUEST_URI is not actually defined in the CGI specification. However, it was returned by Apache. I suppose the spec for the expected contents of it is "whatever Apache provides". Here's some actual responses from Apache 2.2 when given URLs: Base case. Nothing interesting here: ----------------------------------- Browser URL: /file?a=b 'QUERY_STRING' => 'a=b', 'SCRIPT_URL' => '/file', 'SCRIPT_URI' => 'http://example.com/file', 'REQUEST_URI' => '/file?a=b', Including query string after unencoded '?' ------------------------------------------- This case looked suspicious because of '&' appended to the query string, which led me to the third cas. Browser URL: /file%3F?a=b 'QUERY_STRING' => '&a=b', 'SCRIPT_URL' => '/file?', 'SCRIPT_URI' => 'http://example.com/file?', 'REQUEST_URI' => '/file%3F?a=b', Query String elements after encoded and unencoded '?' ---------------------------------------------------- Whoa. Apache is behaving inconsistently here. According to 'QUERY_STRING', the query string starts the encoded question mark, but according to the SCRIPT_URL and SCRIPT_URI, it starts only after the unencoded question mark. Browser URL: /file%3Fa=b?c=d 'QUERY_STRING' => 'a=b&c=d', 'SCRIPT_URL' => '/file?a=b', 'SCRIPT_URI' => 'http://www.mark.net.adoptapet.com/file?a=b', 'REQUEST_URI' => '/file%3Fa=b?c=d', I'll discuss this inconsistenncy with Apache developers. Mark
MIME-Version: 1.0
In-Reply-To: <rt-3.8.HEAD-11426-1360940430-516.83265-0-0 [...] rt.cpan.org>
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
References: <rt-3.8.HEAD-11426-1360940430-516.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-16387-1360940677-552.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
From: rrt [...] sc3d.org
X-RT-Original-Encoding: utf-8
Content-Length: 148
Download (untitled) / with headers
text/plain 148b
On Fri Feb 15 10:00:30 2013, MARKSTOS wrote: Show quoted text
> I looked into this further.
Thanks very much for looking into this. Good luck with the Apache devs!
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
Content-Disposition: inline
Content-Type: text/plain; charset="UTF-8"
Message-ID: <rt-3.8.HEAD-11426-1360949671-1261.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
Content-Length: 150
Download (untitled) / with headers
text/plain 150b
Here's my post to the Apache developers list: https://mail-archives.apache.org/mod_mbox/httpd- dev/201302.mbox/%3Ckflkpv%24qnr%241%40ger.gmane.org%3E
MIME-Version: 1.0
In-Reply-To: <rt-3.8.HEAD-11426-1360949671-1261.83265-0-0 [...] rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-3.8.HEAD-11426-1360949671-1261.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.16-8066-1377023341-1054.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 757
Download (untitled) / with headers
text/plain 757b
On Fri Feb 15 12:34:31 2013, MARKSTOS wrote: Show quoted text
> Here's my post to the Apache developers list: > > https://mail-archives.apache.org/mod_mbox/httpd- > dev/201302.mbox/%3Ckflkpv%24qnr%241%40ger.gmane.org%3E
The Apache devs clarified that there was a bug with my RewriteRule. I should have added a [B] flag. However, CGI.pm has a related bug, in that it decodes the REQUEST_URI before looking for the query string, so it will return wrong results when url() is called for a case like this, even when Rewrite is properly configured. It appears that was your original conclusion, and I see you've provided a proposed patch for it. Could you prepare some automated tests for it as well, and see if fixing this breaks anything else in the test suite? Mark
From rrt [...] sc3d.org Tue Aug 20 18: 53:17 2013
MIME-Version: 1.0
X-Spam-Status: No, score=-8.119 tagged_above=-99.9 required=10 tests=[AWL=-1.885, BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-5, SPF_SOFTFAIL=0.665] autolearn=ham
In-Reply-To: <rt-4.0.16-8066-1377023341-1643.83265-6-0 [...] rt.cpan.org>
X-Spam-Flag: NO
X-RT-Interface: API
References: <RT-Ticket-83265 [...] rt.cpan.org> <rt-3.8.HEAD-11426-1360949671-1261.83265-6-0 [...] rt.cpan.org> <rt-4.0.16-8066-1377023341-1643.83265-6-0 [...] rt.cpan.org>
X-Virus-Scanned: Debian amavisd-new at bestpractical.com
X-Received: by 10.152.45.106 with SMTP id l10mr3186351lam.12.1377039171477; Tue, 20 Aug 2013 15:52:51 -0700 (PDT)
X-Received: by 10.152.45.106 with SMTP id l10mr3186342lam.12.1377039171323; Tue, 20 Aug 2013 15:52:51 -0700 (PDT)
Message-ID: <CAOnWdoiHFe=6FpZke-QJaZ4u4=ooKH_MsVdSU-az94xPKP3z7w [...] mail.gmail.com>
Content-Type: multipart/alternative; boundary="001a11c29b44a6198c04e468ea10"
X-Spam-Score: -8.119
Received: from localhost (localhost [127.0.0.1]) by hipster.bestpractical.com (Postfix) with ESMTP id BCE6924062D for <cpan-bug+CGI [...] hipster.bestpractical.com>; Tue, 20 Aug 2013 18:53:17 -0400 (EDT)
Received: from hipster.bestpractical.com ([127.0.0.1]) by localhost (hipster.bestpractical.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bJCyCbg7u7Yb for <cpan-bug+CGI [...] hipster.bestpractical.com>; Tue, 20 Aug 2013 18:53:14 -0400 (EDT)
Received: from la.mx.develooper.com (x1.develooper.com [207.171.7.70]) by hipster.bestpractical.com (Postfix) with SMTP id C5D962403C2 for <bug-CGI [...] rt.cpan.org>; Tue, 20 Aug 2013 18:53:13 -0400 (EDT)
Received: (qmail 18550 invoked by alias); 20 Aug 2013 22:53:13 -0000
Received: from exprod7og124.obsmtp.com (HELO exprod7og124.obsmtp.com) (64.18.2.26) by la.mx.develooper.com (qpsmtpd/0.28) with SMTP; Tue, 20 Aug 2013 15:53:06 -0700
Received: from mail-lb0-f173.google.com ([209.85.217.173]) (using TLSv1) by exprod7ob124.postini.com ([64.18.6.12]) with SMTP ID DSNKUhPzRHq36CRBcsl0b4+OzKQyTcFIHBE6 [...] postini.com; Tue, 20 Aug 2013 15:53:03 PDT
Received: by mail-lb0-f173.google.com with SMTP id 10so36105lbf.18 for <bug-CGI [...] rt.cpan.org>; Tue, 20 Aug 2013 15:52:51 -0700 (PDT)
Received: by 10.152.111.33 with HTTP; Tue, 20 Aug 2013 15:52:51 -0700 (PDT)
Delivered-To: cpan-bug+CGI [...] hipster.bestpractical.com
Subject: Re: [rt.cpan.org #83265] url() incorrectly detects beginning of query string by decoding request_uri
Return-Path: <rrt [...] sc3d.org>
X-Spam-Check-BY: la.mx.develooper.com
X-Original-To: cpan-bug+CGI [...] hipster.bestpractical.com
X-RT-Mail-Extension: cgi
X-Google-Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=2soiWMsLXJqdL1aov1S8pR/moZBeFBb1ZpVPLX5p2bk=; b=jRAuxCtmEJHDyE3aYV2BDmwH6RWo1SkegokU00KOHdl0m2moXtwv4BR2NJAKJqDjVs pVCQI3O4M71NDHVFpI5Ebor4/ra1rI9aAxGnau1g3pj1BGJUFrYjGPeGNmviUxrNKvpu NGBQ6PfpVHCmWvFeta41IEr1Al/gSbM+BbblUuQC4qPm9rsIkqJ0TktGorg1xU2i4++w PSlY/9fF47j1FCpWfGgFTnQB/8jL5xDpSxvriryG+BSkUn33ge4jsxEl7qhOV6IfXpv3 LYUOP+GVArbQj/woh1WNniArxujbpBAyWU0S0zihsYswRt8f6TqtgI7OHaykzvc6UrGd eaTQ==
Date: Tue, 20 Aug 2013 23:52:51 +0100
X-Spam-Level:
To: bug-CGI [...] rt.cpan.org
X-GM-Message-State: ALoCoQnrUqKop9m4n6DLb/Uqs6BCkvoBbSxC8L0JowgK3A7/l+92eTdZdVerPEGoV6pi3NJXD3i3Zqz73qa3Uzcb1dM/HTi/GU6EWL4oV4fjIIuPni80nENfbeQoyLBcklDKLbKu07M1OVK2Vq6op+IpvGL6PS+sXtrh8u3KJxIhZ7CCUoSgSDI=
From: Reuben Thomas <rrt [...] sc3d.org>
RT-Message-ID: <rt-4.0.16-12141-1377039198-126.83265-0-0 [...] rt.cpan.org>
Content-Length: 0
content-type: text/plain; charset="utf-8"
X-RT-Original-Encoding: utf-8
Content-Length: 988
Download (untitled) / with headers
text/plain 988b
On 20 August 2013 19:29, Mark Stosberg via RT <bug-CGI@rt.cpan.org> wrote: Show quoted text
> The Apache devs clarified that there was a bug with my RewriteRule. I > should have added a [B] flag. > > However, CGI.pm has a related bug, in that it decodes the > REQUEST_URI before looking for the query string, so it will return wrong > results when url() is called for a case like this, even when Rewrite is > properly configured. > > It appears that was your original conclusion, and I see you've provided a > proposed patch for it. Could you prepare some automated tests for it as > well, and see if fixing this breaks anything else in the test suite? >
Thanks very much for following this up. Looking back at the bug history, I see I proposed two patches: one which moved the call to "unescape", and one which removed the call altogether (and it was the second one I ended up considering correct). Could you please confirm which behaviour you think is the right one, and why? -- http://rrt.sc3d.org
content-type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-RT-Original-Encoding: utf-8
Content-Length: 1373
X-RT-Interface: REST
MIME-Version: 1.0
X-Mailer: MIME-tools 5.504 (Entity 5.504)
RT-Message-ID: <rt-4.0.18-30698-1400761059-468.83265-0-0 [...] rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: binary
Content-Length: 241
Download (untitled) / with headers
text/plain 241b
This issue has been copied to: https://github.com/leejo/CGI.pm/issues/112 please take all future correspondence there. This ticket will remain open but please do not reply here. This ticket will be closed when the github issue is dealt with.
MIME-Version: 1.0
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-10523-1411225714-810.83265-0-0 [...] rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 394
Download (untitled) / with headers
text/plain 394b
commit b4c99a9036d50c1894ae9e2920a387abd85f790f Author: Lee Johnson <lee@givengain.ch> Date: Sat Sep 20 16:07:00 2014 +0100 resolve #112 [rt.cpan.org #83265] - strip query string first before calling unescape otherwise we incorrectly remove any previously encoded ? chars in the URI Changes | 3 +++ lib/CGI.pm | 8 ++++---- 2 files changed, 7 insertions(+), 4 deletions(-)


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

Please report any issues with rt.cpan.org to rt-cpan-admin@bestpractical.com.