
This queue is for tickets about the WWW-Curl CPAN distribution.

Report information
The Basics
Id: 61569
Status: resolved
Priority: 0
Queue: WWW-Curl

People
Owner: Nobody in particular
Requestors: andy.jenkinson [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: redirect handling
Date: Wed, 22 Sep 2010 18:08:34 +0100
To: bug-WWW-Curl [...] rt.cpan.org
From: Andy Jenkinson <andy.jenkinson [...] gmail.com>
Hi,

Not being familiar with the inner workings, I'm not sure if this can/should be resolved in WWW::Curl, but here goes:

When setting CURLOPT_FOLLOWLOCATION to 1, curl will follow redirects and fetch the new response appropriately. However, the filehandles specified in CURLOPT_WRITEHEADER (and I assume CURLOPT_WRITEDATA but I have not tested this) are written to multiple times - once per server response. This means that in a typical 301/302 redirect situation the header filehandle will contain two headers once finished, ending up looking like:

HTTP/1.1 301 Moved Permanently
Location: http://uri.of.redirect
... etc

HTTP/1.1 200 OK
... etc

If not addressable efficiently in WWW::Curl, I suggest listing as a limitation just to make others aware?

Cheers,
Andy
Hey,

I think it's default behaviour for libcurl to output all header information if a redirect happened and header data was requested.

The reasoning is a bit deep, but I think it's because either the application is not interested in headers, only content (the case where just the body gets processed), or the application is interested in the headers (for example, to build an HTTP::Response object from them). In the latter case, one of the reasons you want to hang on to the 301/302's header data is that it might contain cookies (think of the "POST /foo/login -> 301/302 -> GET /" case).

So basically the application writer has three choices to resolve this:

1. Don't bother with headers at all. Most information can be extracted from getinfo[1] easily.
2. Disable CURLOPT_FOLLOWLOCATION and do redirect handling in the application code. The URL to redirect to is available from getinfo with the CURLINFO_REDIRECT_URL constant. One thing to watch for is the POST->redirect->GET behaviour that's common in browsers.
3. Implement the method suggested in the CURLOPT_HEADERFUNCTION[2] documentation, that is, delimit HTTP responses based on the HTTP status line.

I will be updating the documentation to link to this bug report, along with a short explanation, in 4.14.

[1] http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html
[2] http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTHEADERFUNCTION
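
For anyone landing here later, option 3 might look roughly like the sketch below. It is written in Python only to keep the splitting logic self-contained and is not taken from WWW::Curl or libcurl. The names `split_responses`, `final_status` and `SAMPLE` are made up for illustration, and the `Content-Type` line in the sample is an assumed stand-in for the "... etc" in the report. The idea is simply that every HTTP status line starts a new block of headers, so the text accumulated in the header filehandle can be cut back into one block per response.

```python
import re

# An HTTP status line, e.g. "HTTP/1.1 301 Moved Permanently" or "HTTP/2 200".
STATUS_LINE = re.compile(r"^HTTP/\d+(?:\.\d+)?\s+(\d{3})")

# Assumed sample of what the header filehandle holds after one redirect.
SAMPLE = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://uri.of.redirect\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)


def split_responses(raw_headers):
    """Split accumulated header text into one list of lines per response."""
    blocks = []
    for line in raw_headers.split("\n"):
        line = line.rstrip("\r")
        if STATUS_LINE.match(line):
            blocks.append([line])      # a status line opens a new response
        elif line and blocks:
            blocks[-1].append(line)    # header line of the current response
        # Blank lines only mark the end of a header block, so they are skipped.
    return blocks


def final_status(raw_headers):
    """Return the status code of the last response, or None if there is none."""
    blocks = split_responses(raw_headers)
    if not blocks:
        return None
    return int(STATUS_LINE.match(blocks[-1][0]).group(1))


if __name__ == "__main__":
    for block in split_responses(SAMPLE):
        print(block)
    print(final_status(SAMPLE))
```

Keeping every block, rather than only the last one, leaves the headers of the intermediate 301/302 (cookies, for instance) available to the application, which is the reason given above for libcurl passing them on in the first place.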
Subject: Re: [rt.cpan.org #61569] redirect handling
Date: Sun, 17 Oct 2010 09:26:45 +0100
To: "bug-WWW-Curl [...] rt.cpan.org" <bug-WWW-Curl [...] rt.cpan.org>
From: Andy Jenkinson <andy.jenkinson [...] gmail.com>
Thank you, very helpful!

On 16 Oct 2010, at 21:53, "Balint Szilakszi via RT" <bug-WWW-Curl@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=61569 >
Release 4.14 is out, thus I'm resolving this ticket.