
This queue is for tickets about the bioperl CPAN distribution.

Report information
The Basics
Id: 64656
Status: new
Priority: 0/
Queue: bioperl

Owner: Nobody in particular
Requestors: paolo [...]

Bug Information
Severity: Important
Broken in: 1.6.1
Fixed in: (no value)

Subject: Bio::Tools::GFF Invalid GFF3 output
I was trying to write GFF3 files from Bio::SeqI and Bio::SeqFeatureI objects using Bio::Tools::GFF. The results are far from encouraging:

1) The various elements are not identified by the mandatory ID attribute, nor are they linked through a Parent attribute.
2) Bio::Tools::GFF is unable to parse nested Bio::SeqFeature::Generic objects.
3) The complete lack of a controlled vocabulary for attributes leads to the insertion of invalid GFF3 attributes.
4) Phase information is missing when parsing Bio::Seq objects loaded from GenBank format files.

I have attached a simple script that takes a file in GenBank format and translates it into GFF3 format. I check the output with an on-line GFF3 validator.

Now: I understand that Bio::Tools::GFF should be replaced by the corresponding functionality in Bio::SeqIO. Is there any estimate of when? Is there any plan to identify and standardize attribute names when populating objects through Bio::SeqIO, so that these attributes can be properly translated into the equivalent ones when exporting to a specific format?

Currently I do not have much time available for development, and I would hate to spend it reinventing wheels. I would therefore appreciate it if you could point me to existing resources that could help me create valid GFF3 files from BioPerl objects (containing all the necessary elements, of course).

I cannot guarantee any commitment to contributing to the development of BioPerl. However, I would also appreciate instructions on how I could help by contributing to the codebase.

Thanks.

Paolo Amedeo
text/x-perl 469b
#!/usr/local/bin/perl
use strict;
use warnings;

use Bio::SeqIO;
use Bio::Tools::GFF;
use File::Basename;

my $usage = basename($0) . ' gbk_file gff_output_file';
die "$usage\n\n" unless @ARGV == 2;

# Read sequences from the GenBank file, write features as GFF3.
my $seq_in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
my $out    = Bio::Tools::GFF->new(-gff_version => 3, -file => ">$ARGV[1]");

while (my $seq = $seq_in->next_seq()) {
    # Top-level features only; nested sub-features are not expanded here.
    my @features = $seq->get_SeqFeatures();
    $out->write_feature(@features);
}
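
For reference, points 1 and 4 of the report can be illustrated without BioPerl. The sketch below is not Bio::Tools::GFF code and is not a proposed patch. It only shows, in core Perl, the line layout that the GFF3 specification describes: nine tab-separated columns, an ID on each feature, a Parent attribute on each child, a phase on CDS rows, and percent-encoding of reserved characters in column 9. The hash layout, the function names gff3_escape and gff3_lines, and the generated-ID scheme (feature type plus a counter) are all assumptions made for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Percent-encode characters that are reserved in GFF3 column 9:
# '%', ';', '=', '&', ',' and control characters (including tab/newline).
sub gff3_escape {
    my ($value) = @_;
    $value =~ s/([%;=&,\x00-\x1F\x7F])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

my %count;    # per-type counters for generated IDs (hypothetical scheme)

# Turn a nested feature hash (hypothetical layout, not a BioPerl object)
# into GFF3 lines, parent first, then its children recursively.
sub gff3_lines {
    my ($feat, $parent_id) = @_;

    my %attr = %{ $feat->{attributes} || {} };
    $attr{ID} //= $feat->{type} . ++$count{ $feat->{type} };
    $attr{Parent} = $parent_id if defined $parent_id;

    # CDS rows must carry a phase (0, 1 or 2); refuse to guess one.
    die "CDS $attr{ID} has no phase\n"
        if $feat->{type} eq 'CDS' && !defined $feat->{phase};

    # Emit ID first, Parent second, then the remaining attributes sorted.
    my @keys = grep { exists $attr{$_} } qw(ID Parent);
    push @keys, sort grep { $_ ne 'ID' && $_ ne 'Parent' } keys %attr;
    my $col9 = join ';', map { "$_=" . gff3_escape($attr{$_}) } @keys;

    my @cols = (
        $feat->{seqid},  $feat->{source} // '.', $feat->{type},
        $feat->{start},  $feat->{end},           $feat->{score} // '.',
        $feat->{strand} // '.', $feat->{phase} // '.', $col9,
    );

    my @lines = (join "\t", @cols);
    push @lines, gff3_lines($_, $attr{ID}) for @{ $feat->{children} || [] };
    return @lines;
}

# Example: a gene containing an mRNA, which contains one CDS.
my %common = (seqid => 'chr1', source => 'example', strand => '+');
my $gene = {
    %common, type => 'gene', start => 1000, end => 2000,
    attributes => { Name => 'abc;1' },    # ';' will be percent-encoded
    children   => [ {
        %common, type => 'mRNA', start => 1000, end => 2000,
        children => [ {
            %common, type => 'CDS', start => 1100, end => 1900, phase => 0,
        } ],
    } ],
};

print "##gff-version 3\n";
print "$_\n" for gff3_lines($gene);
```

With real BioPerl objects the same traversal would start from the nested features of each Bio::SeqFeatureI object, and the phase of each CDS would have to be derived from the GenBank record rather than supplied by hand; neither step is shown here.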
