This queue is for tickets about the Unicode-Normalize CPAN distribution.

Report information
The Basics
Id: 102766
Status: resolved
Priority: 0/
Queue: Unicode-Normalize

People
Owner: Nobody in particular
Requestors: philkime [...] kime.org.uk
Cc: ether [...] cpan.org
SREZIC [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 1.23



Subject: 1.18 - 20x slower than 1.17?
Date: Sat, 14 Mar 2015 18:38:39 +0100
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Kime Philip <philkime [...] kime.org.uk>
I have a project heavily using Unicode::Normalize and after upgrading to 1.18 from 1.17, the program runs about 20x slower. I saw in the release notes about removing XSUB - does this mean 1.18 is all pure perl now? I looked at the referenced perldelta but it wasn’t clear if there was something else I need to do to use 1.18?

--
Dr Philip Kime
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Sun, 15 Mar 2015 19:40:36 +0900
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
Hello,

The latest Unicode::Normalize, 1.18, provides pure perl only. Older distributions of Unicode::Normalize, 0.06-1.17, have both pure perl and XS. The pure perl code of 1.18 is the same as that of 1.17, so the change from 1.17 to 1.18 was just the removal of XS.

Currently the C-level Unicode support in perl seems unstable or experimental, with the perl5 porters trying to fix it on EBCDIC systems.

1.17 is available from BackPAN. If speed is important, using 1.18 is not recommended, even though it is the latest release.

Regards, SADAHIRO Tomoyuki
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Sun, 15 Mar 2015 15:26:05 +0100
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Kime Philip <philkime [...] kime.org.uk>
Ok, thanks - do you plan to add XS back in if this changes? It’s quite important for projects which need to do a lot of normalisation …

PK

--
Dr Philip Kime
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Mon, 16 Mar 2015 21:09:38 +0900
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
Hello,

Sadly, it is uncertain.

perl5-porters has no objection to the removal of XS from Unicode::Normalize.
http://www.nntp.perl.org/group/perl.perl5.porters/2013/12/msg210313.html

Efficiency seems to be less important than portability. There would be no guarantee that a possible newer API is compatible with the older one.

Regards, SADAHIRO Tomoyuki
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Mon, 16 Mar 2015 21:13:20 +0100
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Kime Philip <philkime [...] kime.org.uk>
That is bad - Unicode::Collate and Unicode::Normalize are really important modules and performance is extremely important. I use them a lot to develop the LaTeX biblatex/biber systems, which rely on these modules. You have done fantastic work with them. I hope the XS versions can return …

PK
-- Dr Philip Kime
Maybe the issue should be raised again at p5p? If I understand it correctly, then here's a trade-off between EBCDIC compatibility and massive slowdown for all platforms, and this does not feel right.

Regards,
    Slaven
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Tue, 17 Mar 2015 21:10:57 +0900
To: bug-Unicode-Normalize [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
Hello,

Other people may be able to do so; I won't, as I already did it in vain. Users' and other people's opinions may make a difference.

Regards, SADAHIRO Tomoyuki
As far as I can tell, this simple patch (against 1.17, the last release that bundles the XS implementation) that replaces all the deprecated functions by their suggested counterparts makes the dist build and test fine on any perl from 5.6.1 to the latest 5.23.

I haven't tested on an EBCDIC platform, but I don't think it can possibly be worse than before.

Vincent
Subject: unichr_1.17.patch
--- Unicode-Normalize-1.17/Normalize.xs 2013-10-04 23:38:11.000000000 -0300
+++ Unicode-Normalize-1.17/Normalize.xs 2015-07-14 15:06:25.000000000 -0300
@@ -21,14 +21,14 @@
 #include "unfexc.h"
 
 /* Perl 5.6.1 ? */
-#ifndef uvuni_to_utf8
-#define uvuni_to_utf8 uv_to_utf8
-#endif /* uvuni_to_utf8 */
+#ifndef uvchr_to_utf8
+#define uvchr_to_utf8 uv_to_utf8
+#endif /* uvchr_to_utf8 */
 
 /* Perl 5.6.1 ? */
-#ifndef utf8n_to_uvuni
-#define utf8n_to_uvuni utf8_to_uv
-#endif /* utf8n_to_uvuni */
+#ifndef utf8n_to_uvchr
+#define utf8n_to_uvchr utf8_to_uv
+#endif /* utf8n_to_uvchr */
 
 /* UTF8_ALLOW_BOM is used before Perl 5.8.0 */
 #ifndef UTF8_ALLOW_BOM
@@ -49,7 +49,7 @@
 
 #define AllowAnyUTF (UTF8_ALLOW_SURROGATE|UTF8_ALLOW_BOM|UTF8_ALLOW_FE_FF|UTF8_ALLOW_FFFF)
 
-/* check if the string buffer is enough before uvuni_to_utf8(). */
+/* check if the string buffer is enough before uvchr_to_utf8(). */
 /* dstart, d, and dlen should be defined outside before. */
 #define Renew_d_if_not_enough_to(need) STRLEN curlen = d - dstart; \
         if (dlen < curlen + (need)) { \
@@ -58,7 +58,7 @@
             d = dstart + curlen; \
         }
 
-/* if utf8n_to_uvuni() sets retlen to 0 (if broken?) */
+/* if utf8n_to_uvchr() sets retlen to 0 (if broken?) */
 #define ErrRetlenIsZero "panic (Unicode::Normalize %s): zero-length character"
 
 /* utf8_hop() hops back before start. Maybe broken UTF-8 */
@@ -197,10 +197,10 @@
     if (! Hangul_IsS(uv))
         return d;
 
-    d = uvuni_to_utf8(d, (lindex + Hangul_LBase));
-    d = uvuni_to_utf8(d, (vindex + Hangul_VBase));
+    d = uvchr_to_utf8(d, (lindex + Hangul_LBase));
+    d = uvchr_to_utf8(d, (vindex + Hangul_VBase));
     if (tindex)
-        d = uvuni_to_utf8(d, (tindex + Hangul_TBase));
+        d = uvchr_to_utf8(d, (tindex + Hangul_TBase));
     return d;
 }
 
@@ -231,7 +231,7 @@
 
     while (p < e) {
         STRLEN retlen;
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "decompose");
         p += retlen;
@@ -251,7 +251,7 @@
             }
             else {
                 Renew_d_if_not_enough_to(UTF8_MAXLEN)
-                d = uvuni_to_utf8(d, uv);
+                d = uvchr_to_utf8(d, uv);
             }
         }
     }
@@ -276,7 +276,7 @@
     while (p < e) {
         U8 curCC;
         STRLEN retlen;
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "reorder");
         p += retlen;
@@ -316,14 +316,14 @@
 
             for (i = 0; i < cc_pos; i++) {
                 Renew_d_if_not_enough_to(UTF8_MAXLEN)
-                d = uvuni_to_utf8(d, seq_ptr[i].uv);
+                d = uvchr_to_utf8(d, seq_ptr[i].uv);
             }
             cc_pos = 0;
         }
 
         if (curCC == 0) {
             Renew_d_if_not_enough_to(UTF8_MAXLEN)
-            d = uvuni_to_utf8(d, uv);
+            d = uvchr_to_utf8(d, uv);
         }
     }
     if (seq_ext)
@@ -353,7 +353,7 @@
     while (p < e) {
         U8 curCC;
         STRLEN retlen;
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "compose");
         p += retlen;
@@ -369,7 +369,7 @@
             }
             else {
                 Renew_d_if_not_enough_to(UTF8_MAXLEN)
-                d = uvuni_to_utf8(d, uv);
+                d = uvchr_to_utf8(d, uv);
                 continue;
             }
         }
@@ -428,7 +428,7 @@
         /* output */
         {
             Renew_d_if_not_enough_to(UTF8_MAXLEN)
-            d = uvuni_to_utf8(d, uvS); /* starter (composed or not) */
+            d = uvchr_to_utf8(d, uvS); /* starter (composed or not) */
         }
 
         if (cc_pos) {
@@ -436,7 +436,7 @@
 
             for (i = 0; i < cc_pos; i++) {
                 Renew_d_if_not_enough_to(UTF8_MAXLEN)
-                d = uvuni_to_utf8(d, seq_ptr[i]);
+                d = uvchr_to_utf8(d, seq_ptr[i]);
             }
             cc_pos = 0;
         }
@@ -623,7 +623,7 @@
 
     preCC = 0;
     for (p = s; p < e; p += retlen) {
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "checkNFD or -NFKD");
 
@@ -660,7 +660,7 @@
 
     preCC = 0;
     for (p = s; p < e; p += retlen) {
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "checkNFC or -NFKC");
 
@@ -718,7 +718,7 @@
         U8 *sCan;
         UV uvLead;
         STRLEN canlen = 0;
-        UV uv = utf8n_to_uvuni(p, e - p, &retlen, AllowAnyUTF);
+        UV uv = utf8n_to_uvchr(p, e - p, &retlen, AllowAnyUTF);
         if (!retlen)
             croak(ErrRetlenIsZero, "checkFCD or -FCC");
 
@@ -727,7 +727,7 @@
         if (sCan) {
             STRLEN canret;
             canlen = (STRLEN)strlen((char *) sCan);
-            uvLead = utf8n_to_uvuni(sCan, canlen, &canret, AllowAnyUTF);
+            uvLead = utf8n_to_uvchr(sCan, canlen, &canret, AllowAnyUTF);
             if (!canret)
                 croak(ErrRetlenIsZero, "checkFCD or -FCC");
         }
@@ -758,7 +758,7 @@
             U8* pCan = utf8_hop(eCan, -1);
             if (pCan < sCan)
                 croak(ErrHopBeforeStart);
-            uvTrail = utf8n_to_uvuni(pCan, eCan - pCan, &canret, AllowAnyUTF);
+            uvTrail = utf8n_to_uvchr(pCan, eCan - pCan, &canret, AllowAnyUTF);
             if (!canret)
                 croak(ErrRetlenIsZero, "checkFCD or -FCC");
             preCC = getCombinClass(uvTrail);
@@ -897,7 +897,7 @@
         p = utf8_hop(p, -1);
         if (p < s)
             croak(ErrHopBeforeStart);
-        uv = utf8n_to_uvuni(p, e - p, NULL, AllowAnyUTF);
+        uv = utf8n_to_uvchr(p, e - p, NULL, AllowAnyUTF);
         if (getCombinClass(uv) == 0) /* Last Starter found */
             break;
     }
Thank you, but sorry, any patch that might break on EBCDIC platforms cannot be applied, since Unicode::Normalize is incorporated in perl.
RT-Send-CC: rsn10260 [...] nifty.com, rjbs [...] manxome.org
Unicode::Normalize has not been ported to EBCDIC platforms. There is therefore no reason to reject a patch to it on the basis that it won't work on EBCDIC. (I haven't examined this patch at all.) p5p wants the XS version of Unicode::Normalize back. I was very disappointed that you took it away. I most likely will be the one to port it to EBCDIC at some point, and when I do, I will make sure that the changed version I come up with does not cause it to break on any ASCII version that it currently runs on, and will issue a pull request so that you won't have to do anything but critique my work.
SADAHIRO, as per the last comment, it looks like XS could be put back in?
As far as I know, after perl 5.8.0 there were two periods when people worked on porting perl to EBCDIC platforms. Here is one mail from each that I found:

* 23 Jul 2003
  http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/2003-07/msg00004.html
  [ AW: Perl 5.8.1 release candidate 2 now available for testing (a first result for BS2000) ]

* 17 Jun 2006
  http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2006-06/msg00593.html
  ( IBM z/OS Unix source code fixes for Perl 5.8.7 )

I have not yet received any report that Unicode::Normalize works on those platforms, nor that it does not. I cannot be sure either way.

P.S. IBM seems to have announced that perl worked there: the User's Guide shows that at least part of the Encode module would work; there is no comment about Unicode::Normalize.

* http://www-03.ibm.com/systems/z/os/zos/features/unix/ported/perl/

  What is Perl for z/OS?
  Perl for z/OS provides a port of the Perl (version 5.8.7) scripting language to the z/OS UNIX System Services platform. Perl (Practical Extraction and Report Language) is a very popular general-purpose programming language that is widely used on UNIX and other computing platforms. This port of Perl to the z/OS platform offers an enhancement over other versions of Perl in that it is preconfigured and precompiled and is designed to address ASCII/EBCDIC conversion and provide Unicode support.
RT-Send-CC: rjbs [...] manxome.org, rsn10260 [...] nifty.com
The EBCDIC port in 5.8, even if it worked well in Unicode, which I'm sceptical of, was broken in the many intervening years and releases where there were no EBCDIC smokers. The port I was referring to was the one for 5.22, which required many changes. Unicode::Normalize was not included there. Here is the perl5220delta.pod text:

=item z/OS running EBCDIC Code Page 1047

Core perl now works on this EBCDIC platform. Earlier perls also worked, but, even though support wasn't officially withdrawn, recent perls would not compile and run well. Perl 5.20 would work, but had many bugs which have now been fixed. Many CPAN modules that ship with Perl still fail tests, including C<Pod::Simple>. However the version of C<Pod::Simple> currently on CPAN should work; it was fixed too late to include in Perl 5.22. Work is under way to fix many of the still-broken CPAN modules, which likely will be installed on CPAN when completed, so that you may not have to wait until Perl 5.24 to get a working version.

Let me put it another way, I have a commit bit to core perl. If this patch looks ok to you, and you put the XS version back into U::N, I will personally and quickly apply it to blead (after looking it over myself).
> SADAHIRO, as per the last comment, it looks like XS could be put back in?
As the XS lacks backward compatibility, I shall no longer maintain it. Whoever wants to do so, however, can maintain it.

P.S. I have not followed recent core changes. I wrote it for perl 5.6.1 and 5.8.0, which are now out of date.
SADAHIRO, if it is acceptable to you, I will have P5P take charge of making a new release, ensuring backward compatibility, and keeping this module maintained. Is that acceptable? -- rjbs
RT-Send-CC: rsn10260 [...] nifty.com, rjbs [...] manxome.org, philkime [...] kime.org.uk, SREZIC [...] cpan.org, perl [...] profvince.com
I will test Vincent's patch. The XS needs to get back into core somehow
> SADAHIRO, if it is acceptable to you, I will have P5P take charge of > making a new release, ensuring backward compatibility, and keeping > this module maintained. > > Is that acceptable?
Yes, no problem. P5P should be appropriate. I do not need to continue as its maintainer.

P.S. The releases of 1.18 and 1.19 don't include any meaningful change for its XS. The following commits might be reverted:

* 2015-07-14 Steve Hay
  Upgrade Unicode-Normalize from version 1.18 to 1.19
* 2014-05-28 Steve Hay
  Upgrade Unicode::Normalize from version 1.17 to 1.18
RT-Send-CC: SREZIC [...] cpan.org, rjbs [...] manxome.org, perl [...] profvince.com, rsn10260 [...] nifty.com, Stromeko [...] NexGo.DE
Unicode::Normalize 1.23 has been released, which restores the XSUB interface.

unicode.org publishes a comprehensive normalization test suite. There was essentially no speed difference when this was run on the .xs versus the pure perl versions. This was not written to do benchmarks. But I found it surprising that it didn't show some differences in light of what you said about a 20x speed difference. Apparently other overhead dwarfs that.
CC: SREZIC [...] cpan.org, rjbs [...] manxome.org, perl [...] profvince.com, rsn10260 [...] nifty.com
Subject: Re: [rt.cpan.org #102766] 1.18 - 20x slower than 1.17?
Date: Wed, 28 Oct 2015 09:38:02 +0100
To: bug-Unicode-Normalize [...] rt.cpan.org
From: ASSI <Stromeko [...] NexGo.DE>
On 26.10.2015 at 22:42, Karl Williamson via RT wrote:
> Unicode::Normalize 1.23 has been released, which restores the XSUB
> interface.

Thank you very much.

> unicode.org publishes a comprehensive normalization test suite.
> There was essentially no speed difference when this was run on the
> .xs versus the pure perl versions. This was not written to do
> benchmarks. But I found it surprising that it didn't show some
> differences in light of what you said about a 20x speed difference.
> Apparently other overhead dwarfs that.
I have confirmation that Biber works with acceptable speed with version 1.23 again on Cygwin. I don't know what exactly Biber is doing to trigger this large performance difference between PP and XS; it might be useful to try to find out.

--
Achim. (on the road :-)
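One way to start investigating the PP vs. XS difference would be a small timing script run once against each installed version. The following is only a sketch, not anything taken from Biber: the sample text, repeat count, and iteration count are made-up assumptions, and it uses only the core Benchmark module plus the exported NFD and NFC functions.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);
use Unicode::Normalize qw(NFD NFC);

# Hypothetical workload: a short string mixing precomposed characters,
# a combining sequence and Hangul, repeated to approximate a large file.
my $sample = "Caf\x{E9} na\x{EF}ve \x{C5}ngstr\x{F6}m e\x{301} \x{D55C}\x{AE00} ";
my $text   = $sample x 2_000;

print "Unicode::Normalize version: $Unicode::Normalize::VERSION\n";

# Time a full NFD-then-NFC round trip, as a program that normalises on
# both input and output would perform it.
my $t = timeit(20, sub { my $out = NFC(NFD($text)); });
print "20 round trips: ", timestr($t), "\n";
```

Running the same script under 1.17, 1.18 and 1.23 and comparing the reported times should show whether the slowdown reproduces outside Biber for a given kind of text.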
On Wed Oct 28 04:38:26 2015, Stromeko@NexGo.DE wrote:
> I have confirmation that Biber works with acceptable speed with version
> 1.23 again on Cygwin. I don't know what exactly Biber is doing to
> trigger this large performance difference PP vs. XS, it would maybe be
> useful to try and find out.
Many thanks for 1.23.

Biber just does a lot of normalisation on reading and writing files. As per standard practice, it converts to NFD on points of entry and NFC on points of exit, and this happens in several areas of operation. I can imagine that the characteristics of particular test suites can make this invisible for small amounts of text.
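The NFD-on-entry, NFC-on-exit pattern described above can be sketched roughly as follows. This is an illustration only, not Biber's actual code: the helper names read_normalised and write_normalised are hypothetical, and the example assumes UTF-8 input and output handled with the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFD NFC);

# Entry point (hypothetical helper): decode raw UTF-8 bytes,
# then normalise the resulting character string to NFD.
sub read_normalised {
    my ($bytes) = @_;
    return NFD(decode('UTF-8', $bytes));
}

# Exit point (hypothetical helper): normalise to NFC,
# then encode back to UTF-8 bytes.
sub write_normalised {
    my ($string) = @_;
    return encode('UTF-8', NFC($string));
}

my $input    = "Caf\xC3\xA9";             # UTF-8 bytes for "Café", precomposed
my $internal = read_normalised($input);   # "Cafe" followed by U+0301
my $output   = write_normalised($internal);

printf "internal length: %d code points\n", length $internal;
printf "output length:   %d bytes\n",       length $output;
```

Because every string passes through both functions at least once, a program structured like this calls the normaliser on all of its text, which is where a slower implementation would be felt most.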

