
Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Spreadsheet-ParseExcel CPAN distribution.

Maintainer(s)' notes

If you are reporting a bug in Spreadsheet::ParseExcel, here are some pointers:

1) State the issues as clearly and as concisely as possible. A simple program or Excel test file (see below) will often explain the issue better than a lot of text.

2) Provide information on your system, version of perl and module versions. The following program will generate everything that is required. Put this information in your bug report.

    #!/usr/bin/perl -w

    print "\n    Perl version   : $]";
    print "\n    OS name        : $^O";
    print "\n    Module versions: (not all are required)\n";

    my @modules = qw(
                      Spreadsheet::ParseExcel
                      Scalar::Util
                      Unicode::Map
                      Spreadsheet::WriteExcel
                      Parse::RecDescent
                      File::Temp
                      OLE::Storage_Lite
                      IO::Stringy
                    );

    for my $module (@modules) {
        my $version;
        eval "require $module";

        if (not $@) {
            $version = $module->VERSION;
            $version = '(unknown)' if not defined $version;
        }
        else {
            $version = '(not installed)';
        }

        printf "%21s%-24s\t%s\n", "", $module, $version;
    }

    __END__

3) Upgrade to the latest version of Spreadsheet::ParseExcel (or at least test on a system with an upgraded version). The issue you are reporting may already have been fixed.

4) Create a small example program that demonstrates your problem. The program should be as small as possible. A few lines of code are worth tens of lines of text when trying to describe a bug.

5) Supply an Excel file that demonstrates the problem. This is very important. If the file is big, or contains confidential information, try to reduce it down to the smallest Excel file that represents the issue. If you don't wish to post a file here then send it to me directly: jmcnamara@cpan.org

6) Say if the test file was created by Excel, OpenOffice, Gnumeric or something else. Say which version of that application you used.

7) If you are submitting a patch you should check with the maintainer whether the issue has already been patched or if a fix is in the works. Patches should be accompanied by test cases.

Asking a question

If you would like to ask a more general question there is the Spreadsheet::ParseExcel Google Group.

Report information
The Basics
Id: 63386
Status: open
Priority: 0/
Queue: Spreadsheet-ParseExcel

People
Owner: Nobody in particular
Requestors: ftl [...] dnv.com
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 0.58
Fixed in: (no value)



Subject: Better detection of date-fields
Dear Sir,

Thank you for a very useful module!

At line 2099, in sub _NewCell in ParseExcel.pm, I recommend extending the date-format detection regexp to allow an optional [$-409] in front and a ;@ at the end, as these seem to be quite common date format extensions (as in "[$-409]d\\-mmm\\-yy;@").

See http://stackoverflow.com/questions/894805/excel-number-format-what-is-409 for details about 409 and other regional settings.

In addition, this regexp should be made configurable, so that unusual local formats can be accepted on request. I have also added a few more separator characters, and replaced }i with dmyDMY.

Hopefully this may contribute to further enhancements of this very fine and useful module.

Best regards,
Frode T. Lie

Code to insert at line 2099:

    my $date_re = ($oBook->{DateRegExp} ||= qr{^(?:\[\$\-\d+\])?[dmyDMY][;,-\s\\/dmyDMY]*(?:;\@)?$});
    if ( $FmtStr =~ $date_re ) {
        $rhKey{Type} = "Date";
    }
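
Editorial sketch: the proposal above can be tried outside the module by running the regexp against a few format strings. This is a minimal illustration, not the module's code. The sample format strings are made up for the example, and the hyphen has been moved to the end of the character class (an assumed tidy-up, so that Perl does not read `,-\s` as a range).

```perl
use strict;
use warnings;

# Variant of the regexp proposed above. Assumption: the hyphen is placed last
# in the character class so it is unambiguously a literal character.
my $date_re = qr{^(?:\[\$-\d+\])?[dmyDMY][;,\s\\/dmyDMY.-]*(?:;\@)?$};

# Illustrative format strings (not taken from any attached workbook).
my @formats = (
    'd-mmm-yy',
    '[$-409]d\-mmm\-yy;@',
    'dd/mm/yyyy',
    '0.00',
    '#,##0',
);

# Report how each format string would be classified by the regexp.
for my $fmt (@formats) {
    my $type = $fmt =~ $date_re ? 'Date' : 'Numeric';
    printf "%-24s %s\n", $fmt, $type;
}
```

The first three strings are classified as dates and the last two as numeric. A digit-only `\d+` is kept here to mirror the proposal; locale flags that contain hexadecimal letters would need a wider character class.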

Hi Frode,

I'm not really clear on what the issue is. Perhaps you could submit an example program and Excel file along the lines of the following guidelines:

https://rt.cpan.org/Dist/Display.html?Queue=Spreadsheet-ParseExcel

John.
--
From: ftl [...] dnv.com
Sorry for the missing example. See the attachment.

Actually, quite a few international formats exist, so making the regexp customizable would also make sense (e.g. $oBook->{DateRegExp} = qr/^detect [my] date formats*/; ).

In the example, I have extended the regexp to

    qr{^(?:\[\$\-\d+\])?[dmyDMY][;,-\s\\/dmyDMY\.]*(?:;\@)?$};

including . as a possible separator.
Subject: us_date.pl
    use strict;
    use Data::Dump qw( dump );
    require Spreadsheet::ParseExcel;

    my $oBook = Spreadsheet::ParseExcel::Workbook->Parse('us_date.xls');

    #get the cell B2
    foreach my $i (1..5) {
        my $oCell = $oBook->{Worksheet}[0]{Cells}[1][$i];

        #get the text format string for this cell (its index)
        my $FmtIdx = $oCell->get_format()->{FmtIdx};
        my $format_string = $oBook->{FormatStr}{$FmtIdx};
        print "Format index $FmtIdx, \nwhich is defined as '$format_string', \n".
              "but is detected as type: ".$oCell->type()."\n";

        my $alt_regexp = qr{^(?:\[\$\-\d+\])?[dmyDMY][;,-\s\\/dmyDMY\.]*(?:;\@)?$};
        print "\nWith regexp: $alt_regexp \nthe format is detected as '".
              ($format_string =~ $alt_regexp ? "Date" : "Numeric")."'\n\n\n";
    }
Subject: us_date.xls


On Mon Nov 29 16:24:59 2010, ftl wrote:
> Actually, quite many international formats exists, so making the
> regexp customizeable would also make sense

Hi,

The type() and ChkType() methods are fundamentally flawed and I would prefer to deprecate them. Excel doesn't have a native date cell type, so reporting the cell type based on a regular expression is never going to match all possible cases.

I would prefer a scheme where the format was more readily available so that the end-user could decide on the type themselves. Something like this:

    my $num_format = $cell->get_num_format();

    if ($num_format =~ /some regex/) {
        # Do something
    }

When I get a chance I'll add this to the interface.

In the meantime, if you would like to add a testcase for type() it would be very useful. Something along the lines of:

http://cpansearch.perl.org/src/JMCNAMARA/Spreadsheet-ParseExcel-0.58/t/20_number_format_default.t
http://cpansearch.perl.org/src/JMCNAMARA/Spreadsheet-ParseExcel-0.58/t/21_number_format_user.t

If you are familiar with Git you can fork the code on GitHub:

https://github.com/jmcnamara/spreadsheet-parseexcel

John.
--
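
Editorial sketch: the scheme described above, where the caller decides the type from the raw format string, could look roughly like this on the user's side. The code works on plain strings because `get_num_format()` is only proposed in this message. `classify_format`, the rule table and the regexps are hypothetical names and patterns chosen for illustration, and the hexadecimal locale id is an assumption.

```perl
use strict;
use warnings;

# Hypothetical user-side rules: each pairs a type name with a regexp that is
# applied to the raw number format string. The first matching rule wins.
my @rules = (
    [ Percent => qr{%} ],
    [ Date    => qr{^(?:\[\$-[0-9A-Fa-f]+\])?[dmy][dmy\s\\/.,-]*(?:;\@)?$}i ],
    [ Time    => qr{^\[?[hms]+\]?(?::[ms]+)*(?:\s*AM/PM)?$}i ],
);

# Return the name of the first rule that matches, or 'Numeric' if none does.
sub classify_format {
    my ($format) = @_;
    for my $rule (@rules) {
        my ($type, $re) = @$rule;
        return $type if $format =~ $re;
    }
    return 'Numeric';
}

print classify_format($_), "\t$_\n" for 'yyyy-mm-dd', 'h:mm:ss', '0.0%', '0.00';
```

Keeping the rules in a table owned by the caller means locale-specific patterns can be added without changing the parser, which is the point of leaving the decision to the end-user.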

On Sat Nov 27 15:28:51 2010, ftl wrote:
> an optional [$-409] in front, and a ;@ at the end, as this seems to be
> quite common date format extentions. (as in "[$-409]d\\-mmm\\-yy;@")

What does the trailing ';@' mean?

On Tue Mar 11 11:54:33 2014, DOUGW wrote:
> What does the trailing ';@' mean ?

The '@' means 'text' format for a number. The ';' is a separator.

A format can include the following 4 sections:

    positive_num_format;negative_num_format;zeroes_format;text_format

See the following (expand show all):

http://office.microsoft.com/en-gb/excel-help/create-or-delete-a-custom-number-format-HP005199500.aspx?redir=0

Having said that, the format in the first post doesn't look correct.

John
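
Editorial sketch: the section layout described above can be illustrated by splitting a format string on ';'. This is a naive split that ignores semicolons inside quoted literals or after a backslash escape. `format_sections` is a hypothetical helper, and treating a final section that contains '@' as the text section is an assumption based on the discussion in this ticket.

```perl
use strict;
use warnings;

# Split a number format into named sections (hypothetical helper).
sub format_sections {
    my ($format) = @_;
    my @parts = split /;/, $format, -1;
    my %section;

    # Assumption: a final section containing '@' is the text section,
    # regardless of how many sections precede it.
    $section{text} = pop @parts if @parts && $parts[-1] =~ /\@/;

    # The remaining sections are assigned by position.
    my @names = qw(positive negative zero);
    for my $i (0 .. $#parts) {
        last if $i > $#names;
        $section{ $names[$i] } = $parts[$i];
    }
    return \%section;
}

my $sections = format_sections('[$-409]d\-mmm\-yy;@');
for my $name (qw(positive negative zero text)) {
    next unless exists $sections->{$name};
    print "$name: $sections->{$name}\n";
}
```

With the format from the first post, this reports a positive section holding the date pattern and a text section holding '@', with no negative or zero sections.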

Actually, the initial format is correct, even if it looks strange. It is the main reason for the original report.

The [$-nnn] is a locale flag for date formats, which makes it even more sensible to include it in the "heuristic date detection algorithm" (which should be customizable, as I presume that e.g. non-Norwegians would not expect date formats to contain åååå instead of yyyy).

And for date formats, negative numbers are not allowed, so only ;@ is needed at the end (try ;@@ at the end and enter some text to see the effect).

By the way, "[$-409]d\\-mmm\\-yy;@" was Perl-encoded, so "[$-409]d\-mmm\-yy;@" would make more sense to enter in Excel.

Kind regards,
Frode

On Tue 11 March 2014 at 12:20:05, JMCNAMARA wrote:
> On Tue Mar 11 11:54:33 2014, DOUGW wrote:
> > What does the trailing ';@' mean ?
>
> The '@' means 'text' format for a number. The ';' is a separator.
>
> A format can include the following 4 sections:
>
> positive_num_format;negative_num_format;zeroes_format;text_format
>
> See the following (expand show all):
>
> http://office.microsoft.com/en-gb/excel-help/create-or-delete-a-custom-number-format-HP005199500.aspx?redir=0
>
> Having said that, the format in the first post doesn't look correct.
>
> John
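
Editorial sketch: one way to handle the [$-nnn] locale flag discussed above is to strip it (and a trailing ;@) before applying any date heuristic. `strip_locale_flag` is a hypothetical helper, not part of Spreadsheet::ParseExcel, and reading the id as hexadecimal is an assumption (0x409 is the Windows locale id for US English).

```perl
use strict;
use warnings;

# Remove a leading [$-nnn] locale flag and a trailing ;@ text section.
# Returns the bare pattern and the locale id as a hex string (undef if absent).
sub strip_locale_flag {
    my ($format) = @_;
    my $lcid;
    if ($format =~ s/^\[\$-([0-9A-Fa-f]+)\]//) {
        $lcid = $1;
    }
    $format =~ s/;\@$//;
    return ($format, $lcid);
}

my ($pattern, $lcid) = strip_locale_flag('[$-409]d\-mmm\-yy;@');
printf "pattern=%s lcid=0x%s (%d)\n", $pattern, $lcid, hex $lcid;
```

Normalising the string first keeps the date regexp itself short, and the locale id stays available if a caller wants to apply locale-specific letters such as the Norwegian åååå mentioned above.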

