Trees | Indices | Help |
---|
|
Parse UniGene flat file format files such as the Hs.data file.

Here is an overview of the flat file format that this parser deals with:

Line types/qualifiers:

    ID            UniGene cluster ID
    TITLE         Title for the cluster
    GENE          Gene symbol
    CYTOBAND      Cytological band
    EXPRESS       Tissues of origin for ESTs in cluster
    RESTR_EXPR    Single tissue or development stage contributes
                  more than half the total EST frequency for this gene.
    GNM_TERMINUS  Genomic confirmation of presence of a 3' terminus;
                  T if a non-templated polyA tail is found among
                  a cluster's sequences; else
                  I if templated As are found in genomic sequence or
                  S if a canonical polyA signal is found on
                  the genomic sequence
    GENE_ID       Entrez gene identifier associated with at least one
                  sequence in this cluster; to be used instead of LocusLink.
    LOCUSLINK     LocusLink identifier associated with at least one
                  sequence in this cluster; deprecated in favor of GENE_ID
    HOMOL         Homology
    CHROMOSOME    Chromosome. For plants, CHROMOSOME refers to mapping
                  on the Arabidopsis genome.
    STS           STS
        ACC=      GenBank/EMBL/DDBJ accession number of STS [optional field]
        UNISTS=   Identifier in NCBI's UniSTS database
    TXMAP         Transcript map interval
        MARKER=   Marker found on at least one sequence in this cluster
        RHPANEL=  Radiation Hybrid panel used to place marker
    PROTSIM       Protein similarity data for the sequence with the
                  highest-scoring protein similarity in this cluster
        ORG=      Organism
        PROTGI=   Sequence GI of protein
        PROTID=   Sequence ID of protein
        PCT=      Percent alignment
        ALN=      Length of aligned region (aa)
    SCOUNT        Number of sequences in the cluster
    SEQUENCE      Sequence
        ACC=      GenBank/EMBL/DDBJ accession number of sequence
        NID=      Unique nucleotide sequence identifier (gi)
        PID=      Unique protein sequence identifier (used for non-ESTs)
        CLONE=    Clone identifier (used for ESTs only)
        END=      End (5'/3') of clone insert read (used for ESTs only)
        LID=      Library ID; see Hs.lib.info for library name and tissue
        MGC=      5' CDS-completeness indicator; if present, the clone
                  associated with this sequence is believed CDS-complete.
                  A value greater than 511 is the gi of the CDS-complete
                  mRNA matched by the EST; otherwise the value is an
                  indicator of the reliability of the test indicating
                  CDS completeness; higher values indicate more reliable
                  CDS-completeness predictions.
        SEQTYPE=  Description of the nucleotide sequence. Possible values
                  are mRNA, EST and HTC.
        TRACE=    The Trace ID of the EST sequence, as provided by the
                  NCBI Trace Archive
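The STS, TXMAP, PROTSIM and SEQUENCE lines carry `KEY=value` sub-qualifiers. As a rough illustration of how such a line body could be turned into a dictionary, here is a minimal sketch. It is not the module's own implementation: the helper name `parse_qualifiers` is hypothetical, the `;` separator between qualifiers is an assumption about the file layout, and the sample accession and identifiers are made up.

```python
def parse_qualifiers(value):
    """Split 'KEY=value; KEY=value' text into a dict.

    Assumption: qualifiers are separated by ';'. Keys are lower-cased
    here purely for convenience.
    """
    qualifiers = {}
    for field in value.split(";"):
        field = field.strip()
        if not field:
            continue  # skip empty fragments, e.g. after a trailing ';'
        key, _, val = field.partition("=")
        qualifiers[key.lower()] = val
    return qualifiers


# Made-up SEQUENCE line body, for illustration only.
example = "ACC=X00000.1; NID=g0000001; SEQTYPE=mRNA"
sequence = parse_qualifiers(example)
print(sequence["acc"], sequence["seqtype"])
```

Using `partition` rather than `split("=")` keeps any further `=` characters inside the value intact.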
| Class | Description |
|---|---|
| SequenceLine | Store the information for one SEQUENCE line from a UniGene file. |
| ProtsimLine | Store the information for one PROTSIM line from a UniGene file. |
| STSLine | Store the information for one STS line from a UniGene file. |
| Record | Store a UniGene record. |
| UnigeneSequenceRecord | Store the information for one SEQUENCE line from a UniGene file (DEPRECATED). |
| UnigeneProtsimRecord | Store the information for one PROTSIM line from a UniGene file (DEPRECATED). |
| UnigeneSTSRecord | Store the information for one STS line from a UniGene file (DEPRECATED). |
| UnigeneRecord | Store a UniGene record (DEPRECATED). |
| _RecordConsumer | This class is DEPRECATED; please use the read() function in this module instead. |
| _Scanner | Scans a UniGene flat file format file (DEPRECATED). |
| RecordParser | This class is DEPRECATED; please use the read() function in this module instead. |
| Iterator | This class is DEPRECATED; please use the parse() function in this module instead. |
UG_INDENT = 12
__package__ =
|
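The page does not say what `UG_INDENT = 12` is used for. One plausible reading, and it is only an assumption, is that each flat-file line has a fixed-width tag column of 12 characters followed by the value. The sketch below splits a line on that basis; `split_line` is a hypothetical helper and the sample lines are made up.

```python
UG_INDENT = 12  # same value as the module variable listed above


def split_line(line):
    """Split a flat-file line into (tag, value).

    Assumption: the tag occupies the first UG_INDENT columns and the
    value starts at column UG_INDENT.
    """
    tag = line[:UG_INDENT].rstrip()
    value = line[UG_INDENT:].rstrip()
    return tag, value


# Made-up lines, padded to the assumed tag width with ljust.
lines = [
    "ID".ljust(UG_INDENT) + "Hs.0000",
    "TITLE".ljust(UG_INDENT) + "Hypothetical cluster title",
    "SCOUNT".ljust(UG_INDENT) + "3",
]

for line in lines:
    tag, value = split_line(line)
    print(tag, "->", value)
```

Slicing on a fixed column, rather than splitting on whitespace, keeps values that contain spaces (such as titles) in one piece.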
Generated by Epydoc 3.0.1 on Sat Aug 20 10:37:29 2011 | http://epydoc.sourceforge.net |