Kielce, Poland 2007

gp2fasta

gp2fasta converts gp files in NCBI GenPept or GenBank format to FASTA. Its main purpose is to create FASTA files with short but still accurate sequence headers.

For example:
>Strpur-115729834-h
PNQILMQFRLDDNGSSYYKELASIIYGASPEFELAIFTVCFKENPNALSTFTMAGGITQKVQTWDYNGGYIGSAYFSV

stands for:
gi: 115729834
organism: Strongylocentrotus purpuratus
additional: h (hypothetical protein)
sequence: PNQILMQFRLDDNGSSYYKELASIIYGASPEFELAIFTVCFKENPNALSTFTMAGGITQKVQTWDYNGGYIGSAYFSV
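
To make the header layout concrete, here is a minimal Python sketch that splits such a short header back into its fields. The function name parse_short_header is hypothetical and not part of gp2fasta; the field order (organism, id, additional) is an assumption inferred from the example above.

```python
def parse_short_header(header, separator="-"):
    """Split a short FASTA header such as '>Strpur-115729834-h' into fields.

    Assumed field order (taken from the example, not from gp2fasta itself):
    organism, id, optional additional-information codes.
    """
    fields = header.strip().lstrip(">").split(separator)
    if len(fields) < 2:
        raise ValueError("expected at least organism and id in: " + header)
    return {
        "organism": fields[0],
        "id": fields[1],
        # The additional codes are optional, so default to an empty string.
        "additional": fields[2] if len(fields) > 2 else "",
    }


if __name__ == "__main__":
    print(parse_short_header(">Strpur-115729834-h"))
```

The sketch assumes that the separator does not occur inside any field. That holds for six-letter organism codes and numeric GI numbers, but may not hold for other option combinations.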
 
Options:
- for id: GI or LOCUS;
- for organism: e.g. Mus musculus, M.musculus or Musmus;
- detailed definition;
- additional information:
    P -> PREDICTED
    s -> similar
    h -> hypothetical protein
    u -> unnamed protein product
    n -> novel
    p -> putative
    o -> open reading frame

The selected fields are joined with a separator (in this example "-").
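
The options above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the gp2fasta implementation: the names format_organism, additional_flags and build_header are hypothetical, the keyword matching is a plain case-sensitive substring search, and the six-letter organism code is assumed to be the first three letters of the genus plus the first three letters of the species.

```python
# Keyword -> one-letter code, copied from the "additional information" list.
ADDITIONAL_CODES = [
    ("PREDICTED", "P"),
    ("similar", "s"),
    ("hypothetical protein", "h"),
    ("unnamed protein product", "u"),
    ("novel", "n"),
    ("putative", "p"),
    ("open reading frame", "o"),
]


def format_organism(name, style="short"):
    """Return the organism as 'Mus musculus', 'M.musculus' or 'Musmus'."""
    words = name.split()
    if len(words) < 2:
        raise ValueError("expected 'Genus species', got: " + name)
    genus, species = words[0], words[1]
    if style == "full":
        return genus + " " + species
    if style == "abbrev":
        return genus[0] + "." + species
    # Assumed rule for the six-letter form: 3 letters of genus + 3 of species.
    return genus[:3].capitalize() + species[:3].lower()


def additional_flags(definition):
    """Collect the codes of all keywords found in the record definition."""
    return "".join(code for keyword, code in ADDITIONAL_CODES
                   if keyword in definition)


def build_header(organism, identifier, definition,
                 style="short", separator="-"):
    """Join organism, id and additional codes into a short FASTA header."""
    parts = [format_organism(organism, style), identifier]
    flags = additional_flags(definition)
    if flags:
        parts.append(flags)
    return ">" + separator.join(parts)


if __name__ == "__main__":
    print(build_header("Strongylocentrotus purpuratus", "115729834",
                       "hypothetical protein"))
```

With a definition that contains only "hypothetical protein", the call in the last lines reproduces the example header >Strpur-115729834-h. How the real tool orders or combines several codes is not stated on this page, so the concatenation order used here is an assumption.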






Example file

Free Qt4 version of gp2fasta with GUI


My other projects:

MetaDisorder - Prediction of Intrinsically Unstructured Proteins (protein disorder) from amino acid sequence only.
Protein isoelectric point calculator
Shannon entropy calculator - A worked example of how to calculate and interpret information entropy.



Please send suggestions and comments to: lukaskoz@o2.pl
Author: Lukasz Kozlowski
Last modified: 29.10.2012

