This queue is for tickets about the WWW-Curl CPAN distribution.

Report information
The Basics
Id: 61569
Status: resolved
Priority: 0/
Queue: WWW-Curl

People
Owner: Nobody in particular
Requestors: andy.jenkinson [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



From andy.jenkinson [...] gmail.com Wed Sep 22 13:05:40 2010
Subject: redirect handling
Hi,

Not being familiar with the inner workings, I'm not sure if this can/should be resolved in WWW::Curl, but here goes:

When setting CURLOPT_FOLLOWLOCATION to 1, curl will follow redirects and fetch the new response appropriately. However, the filehandles specified in CURLOPT_WRITEHEADER (and I assume CURLOPT_WRITEDATA but I have not tested this) are written to multiple times - once per server response. This means that in a typical 301/302 redirect situation the header filehandle will contain two headers once finished, ending up looking like:

HTTP/1.1 301 Moved Permanently
Location: http://uri.of.redirect
... etc

HTTP/1.1 200 OK
... etc

If not addressable efficiently in WWW::Curl, I suggest listing as a limitation just to make others aware?

Cheers,
Andy

On Wed Sep 22 13:08:48 2010, andy.jenkinson@gmail.com wrote:
> Hi,
>
> Not being familiar with the inner workings, I'm not sure if this
> can/should be resolved in WWW::Curl, but here goes:
>
> When setting CURLOPT_FOLLOWLOCATION to 1, curl will follow redirects
> and fetch the new response appropriately. However, the filehandles
> specified in CURLOPT_WRITEHEADER (and I assume CURLOPT_WRITEDATA
> but I have not tested this) are written to multiple times - once
> per server response. This means that in a typical 301/302 redirect
> situation the header filehandle will contain two headers once
> finished, ending up looking like:
>
> HTTP/1.1 301 Moved Permanently
> Location: http://uri.of.redirect
> ... etc
>
> HTTP/1.1 200 OK
> ... etc
>
> If not addressable efficiently in WWW::Curl, I suggest listing as a
> limitation just to make others aware?
>
> Cheers,
> Andy

Hey,

I think it's default behaviour for libcurl to output all header information, if a redirect happened and header data was requested.

The reasoning is a bit deep, but I think it's because either the application is not interested in headers, only content - in the case that the body gets processed, or the application is interested in the headers (for example, to build an HTTP::Response object from it). In the latter case, one of the reasons you want to hang on to the 301/302's header data is that it might contain cookies (think of the "POST /foo/login -> 301/302 -> GET /" case).

So basically the application writer has three choices to resolve this:

1. Don't bother with headers at all. Most information can be extracted from getinfo[1] easily.
2. Disable CURLOPT_FOLLOWLOCATION and do redirect handling in the application code. The url to redirect to is available from getinfo with the CURLINFO_REDIRECT_URL constant. One thing to watch for is the POST->redirect->GET behaviour that's common in browsers.
3. Implement the suggested method in the CURLOPT_HEADERFUNCTION[2] documentation, that is delimit http responses based on the http status line.

I will be updating the documentation to link to this bugreport along a short explanation in 4.14

[1] http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html
[2] http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTHEADERFUNCTION
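
As a rough illustration of choice 3, the accumulated header text can be split into one block per response by starting a new block at every HTTP status line. The sketch below is written in Python purely for illustration and works on the captured text only; it is not WWW::Curl code, and the same logic would sit inside a Perl header callback. The function name `split_header_blocks`, the regular expression, and the `Set-Cookie` line in the sample input are all assumptions made for the example, not something taken from the ticket.

```python
import re

# Matches an HTTP status line such as "HTTP/1.1 301 Moved Permanently".
STATUS_LINE = re.compile(r"^HTTP/\d(?:\.\d)?\s+(\d{3})")


def split_header_blocks(raw):
    """Split accumulated header text into (status, headers) pairs, one per response."""
    blocks = []
    for line in raw.splitlines():
        line = line.strip()
        match = STATUS_LINE.match(line)
        if match:
            # A status line marks the start of a new response block.
            blocks.append((int(match.group(1)), []))
        elif line and blocks and ":" in line:
            # Ordinary "Name: value" header belonging to the current block.
            name, value = line.split(":", 1)
            blocks[-1][1].append((name.strip(), value.strip()))
    return blocks


# Sample input shaped like the output described in the report.
# The Set-Cookie header is hypothetical, added to show the cookie case.
raw = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://uri.of.redirect\r\n"
    "Set-Cookie: session=abc\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

blocks = split_header_blocks(raw)

# The last block belongs to the final response after all redirects.
final_status, final_headers = blocks[-1]

# Cookies set anywhere along the redirect chain, including by the 301 itself.
cookies = [
    value
    for _, headers in blocks
    for name, value in headers
    if name.lower() == "set-cookie"
]
```

Taking `blocks[-1]` gives the headers of the final response only, while iterating over every block keeps data such as cookies set by the intermediate 301/302, which is the reason given above for libcurl passing all of the headers through.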

From andy.jenkinson [...] gmail.com Sun Oct 17 04:26:58 2010
Subject: Re: [rt.cpan.org #61569] redirect handling
Thank you, very helpful!

On 16 Oct 2010, at 21:53, "Balint Szilakszi via RT" <bug-WWW-Curl@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=61569 >
>
> On Wed Sep 22 13:08:48 2010, andy.jenkinson@gmail.com wrote:
>> Hi,
>>
>> Not being familiar with the inner workings, I'm not sure if this
>> can/should be resolved in WWW::Curl, but here goes:
>>
>> When setting CURLOPT_FOLLOWLOCATION to 1, curl will follow redirects
>> and fetch the new response appropriately. However, the filehandles
>> specified in CURLOPT_WRITEHEADER (and I assume CURLOPT_WRITEDATA
>> but I have not tested this) are written to multiple times - once
>> per server response. This means that in a typical 301/302 redirect
>> situation the header filehandle will contain two headers once
>> finished, ending up looking like:
>>
>> HTTP/1.1 301 Moved Permanently
>> Location: http://uri.of.redirect
>> ... etc
>>
>> HTTP/1.1 200 OK
>> ... etc
>>
>> If not addressable efficiently in WWW::Curl, I suggest listing as a
>> limitation just to make others aware?
>>
>> Cheers,
>> Andy
>
> Hey,
>
> I think it's default behaviour for libcurl to output all header
> information, if a redirect happened and header data was requested.
>
> The reasoning is a bit deep, but I think it's because either the
> application is not interested in headers, only content - in the case
> that the body gets processed, or the application is interested in the
> headers (for example, to build an HTTP::Response object from it). In the
> latter case, one of the reasons you want to hang on to the 301/302's
> header data is that it might contain cookies (think of the "POST
> /foo/login -> 301/302 -> GET /" case).
>
> So basically the application writer has three choices to resolve this:
>
> 1. Don't bother with headers at all. Most information can be extracted
> from getinfo[1] easily.
> 2. Disable CURLOPT_FOLLOWLOCATION and do redirect handling in the
> application code. The url to redirect to is available from getinfo with
> the CURLINFO_REDIRECT_URL constant. One thing to watch for is the
> POST->redirect->GET behaviour that's common in browsers.
> 3. Implement the suggested method in the CURLOPT_HEADERFUNCTION[2]
> documentation, that is delimit http responses based on the http status line.
>
> I will be updating the documentation to link to this bugreport along a
> short explanation in 4.14
>
> [1] http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html
> [2]
> http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTHEADERFUNCTION

Release 4.14 is out, thus I'm resolving this ticket.

