This queue is for tickets about the Math-GMPz CPAN distribution.

Report information

The Basics
Id: 123268
Status: resolved
Priority: Low/Low
Queue:

People
Owner: Nobody in particular
Requestors: ribasushi [...] leporine.io
Cc:
AdminCc:

BugTracker
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Upgraded perl-string handling is surprising ( the API does not warn, and the docs do not discuss it )
Consider the output of the following simple one-liner:

    perl -e '
      use warnings;
      use strict;
      use Math::GMPz;

      my $gmpz = Math::GMPz::Rmpz_init2_nobless(8);
      my $byte = "\xff";

      for my $up ( 0, 1, 0, 1, 0 ) {
        $up ? utf8::upgrade($byte) : utf8::downgrade($byte);

        Math::GMPz::Rmpz_import(
          $gmpz,
          length( $byte ),
          1,
          1,
          0,
          0,
          $byte,
        );

        warn Math::GMPz::Rmpz_get_str( $gmpz, 16 );
      }
    '

My main surprise came from the fact that the documentation did not discuss this eventuality at all, so I assumed I did not need to guard against it. Then, weeks later, one of my inputs suffered a change that caused implicit upgrades, and nothing made sense afterwards.

I am not sure whether the fix should be in the docs, in the code, or both. I am only reporting the confusing behavior.
Apologies for taking over 3 years to notice this.

In the demo, I'm seeing $gmpz set to 0xc3 after each utf8::upgrade($byte), and to 0xff after each utf8::downgrade($byte). This matches my expectations, but probably only because I expect GMP's mpz_import() to read in, byte by byte, the actual data contained in the string that was passed to it.

I've just pushed to the math-gmpz github repo a new version of GMPz.pod that notes (in the Rmpz_import documentation):

    NOTE: The actual data contained in $bstr is read in, byte by byte.
    Hence, eg., a string containing any non-ASCII characters will
    therefore assign a different value to $rop if a
    utf8::upgrade($bstr) has been done.

I've also added some tests in t/imp_exp.t to check that Rmpz_import is behaving the way I expect.

I'll close this ticket now. However, do feel welcome to comment and reopen it, especially if I've missed something. I don't personally do anything with utf8 at all, and it took me a few minutes to work out just what was going on, and what the mindset should be. I certainly don't rule out the possibility that I've got something wrong.

AFAICT, if one wants Rmpz_import to produce the same result irrespective of utf8::upgrade(), then one needs to perform a utf8::downgrade() of the string before passing it to Rmpz_import(). Of course, if all of the bytes are less than or equal to 0x7f, then there's no need to do anything.

Cheers,
Rob
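
The downgrade-before-import workaround suggested above could be wrapped up roughly as follows. This is only a sketch: import_octets is a hypothetical helper name, not something Math::GMPz provides, and it assumes the same Rmpz_import argument layout used in the one-liner from the report (one-byte words, most significant first, no nails).

```perl
use strict;
use warnings;
use Math::GMPz;

# Hypothetical helper (NOT part of Math::GMPz): import the characters of
# $bstr as bytes, regardless of the string's internal representation.
sub import_octets {
    my ($rop, $bstr) = @_;

    # Work on a copy so the caller's string is left untouched.
    my $octets = $bstr;

    # With a true second argument, utf8::downgrade returns false instead of
    # dying when the string holds a character above 0xFF.
    utf8::downgrade($octets, 1)
        or die "string contains characters above 0xFF; not a byte string\n";

    # After the downgrade, length() in characters equals the byte count.
    # Arguments: rop, count, order, size, endian, nails, data.
    Math::GMPz::Rmpz_import($rop, length($octets), 1, 1, 0, 0, $octets);
    return $rop;
}

# Repeat the experiment from the report, going through the helper.
my $z    = Math::GMPz::Rmpz_init();
my $byte = "\xff";
my @hex;

for my $up (0, 1, 0, 1, 0) {
    $up ? utf8::upgrade($byte) : utf8::downgrade($byte);
    import_octets($z, $byte);
    push @hex, Math::GMPz::Rmpz_get_str($z, 16);
}

print "@hex\n";
```

The intent is that every iteration reports the same value, whether or not $byte was upgraded beforehand. Dying on characters above 0xFF is a design choice of this sketch; a caller that really wants the UTF-8 encoding imported would instead encode the string explicitly before the call.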

