This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 51204
Status: resolved
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: GAAS [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.37
Fixed in: (no value)

Attachments
0001-Make-the-UTF8-encoder-croak-when-callback-CHECK-is-p.patch



Subject: Callback CHECK not supported for UTF-8 decoder/encoder
I was surprised how much time I ended up spending debugging why I could not get:

    Encode::decode("UTF-8", $octets, sub { sprintf "%%%02X", shift })

to work. My program behaved very oddly and the behaviour changed in random ways. It turns out Encode just silently turns the CODE address into an integer and treats that as flags to decide how to behave. Very confusing. The least that should happen is that the decoder croaks when a CODE reference is passed in, but I would be much happier if such callbacks could simply be made to work.
Patch to make Encode::decode("UTF-8", $bytes, sub {}) croak.
From 32ca48a2d4d46aa42855266f4e37da8a7342a92d Mon Sep 17 00:00:00 2001
From: Gisle Aas <gisle@aas.no>
Date: Sun, 8 Nov 2009 12:15:58 +0100
Subject: [PATCH] Make the UTF8 encoder croak when callback CHECK is passed in

---
 Encode.xs | 22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/Encode.xs b/Encode.xs
index e5f4c9a..e9ccd3f 100644
--- a/Encode.xs
+++ b/Encode.xs
@@ -401,19 +401,26 @@ MODULE = Encode PACKAGE = Encode::utf8 PREFIX = Method_
 PROTOTYPES: DISABLE
 
 void
-Method_decode_xs(obj,src,check = 0)
+Method_decode_xs(obj,src,check_sv = &PL_sv_no)
 SV * obj
 SV * src
-int check
+SV * check_sv
 PREINIT:
     STRLEN slen;
     U8 *s;
     U8 *e;
     SV *dst;
     bool renewed = 0;
+    int check;
 CODE:
 {
     dSP; ENTER; SAVETMPS;
+    if (SvROK(check_sv)) {
+        croak("UTF-8 decoder doesn't support callback CHECK");
+    }
+    else {
+        check = SvIV(check_sv);
+    }
     if (src == &PL_sv_undef) src = newSV(0);
     s = (U8 *) SvPV(src, slen);
     e = (U8 *) SvEND(src);
@@ -464,18 +471,25 @@ CODE:
 }
 
 void
-Method_encode_xs(obj,src,check = 0)
+Method_encode_xs(obj,src,check_sv = &PL_sv_no)
 SV * obj
 SV * src
-int check
+SV * check_sv
 PREINIT:
     STRLEN slen;
     U8 *s;
     U8 *e;
     SV *dst;
     bool renewed = 0;
+    int check;
 CODE:
 {
+    if (SvROK(check_sv)) {
+        croak("UTF-8 encoder doesn't support callback CHECK");
+    }
+    else {
+        check = SvIV(check_sv);
+    }
     if (src == &PL_sv_undef) src = newSV(0);
     s = (U8 *) SvPV(src, slen);
     e = (U8 *) SvEND(src);
-- 
1.6.2.95.g934f7
Thanks, applied in my repo. VERSION++ soon.

Dan the Maintainer Thereof

On Sun Nov 08 06:18:02 2009, GAAS wrote:
> Patch to make Encode::decode("UTF-8", $bytes, sub {}) croak.
Hi, the ticket seems closed though.

Some of my modules expected that this wouldn't croak but be simply ignored.
I would be much happier if such callbacks could work, too. :-)

I tested under Encode.pm 2.23:

perl -MEncode -e 'print Encode::decode("UTF-8", "\x80", Encode::FB_XMLCREF), "\n"'
&#x80;

=> The CHECK value works, as a single \x80 octet is malformed in UTF-8.
I guess this is one of the cases where CHECK is needed for decode().

perl -MEncode -e 'print Encode::decode("UTF-8", "\x80", sub { sprintf "%%%02X", shift }), "\n"'
&#128;

=> The CHECK coderef is ignored and doesn't work. I think this should work, however.
This triggers "UTF-8 decoder doesn't support callback CHECK" under Encode 2.38.

Thank you both for maintaining the great module anyway.

Ref:
http://kawa.at.webry.info/200911/article_12.html
Fixed in Version 2.39. Thanks for your insight. Indeed, it's useful on decode. For encode, it has no use since it always succeeds.

Dan the Encode Maintainer

On Mon Nov 23 18:17:14 2009, KAWASAKI wrote:
> Hi, the ticket seems closed though.
>
> Some of my modules expected that this wouldn't croak but be simply
> ignored.
> I would be much happier if such callbacks could work, too. :-)
>
> I tested under Encode.pm 2.23:
>
> perl -MEncode -e 'print Encode::decode("UTF-8", "\x80", Encode::FB_XMLCREF), "\n"'
> &#x80;
>
> => CHECK values works as single \x80 octets is malformed in UTF-8.
> I guess this is one of cases that CHECK is needed for decode().
>
> perl -MEncode -e 'print Encode::decode("UTF-8", "\x80", sub { sprintf "%%%02X", shift }), "\n"'
> &#128;
>
> => CHECK coderef is ignored and doesn't work. I think this should work
> however.
> This invokes "UTF-8 decoder doesn't support callback CHECK" under
> Encode 2.38.
>
> Thank you both for maintaining the great module anyway.
>
> Ref:
> http://kawa.at.webry.info/200911/article_12.html

