Difference between revisions of "FASTA"
From Genome Analysis Wiki
Latest revision as of 19:39, 23 March 2010
A simple text format for storing DNA sequences.
A FASTA file can store one or more DNA sequences. Each record begins with a one-line header: a > character (which must be the first character on the line), followed by a sequence label and optional commentary. This header line is followed by the sequence, which can wrap over multiple lines as needed. Typically, each line has about 50 characters, and it is recommended that every line of a sequence have the same length to facilitate indexing. Nearly all programs that support the FASTA format recognize A, C, T, G and N as valid characters in the sequence. Many also recognize IUPAC codes.
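The record layout described above can be read with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name read_fasta is hypothetical, and it assumes that the label is the first whitespace-delimited token after the > character, that the rest of the header line is commentary, and that blank lines can be ignored. It does not validate the characters in the sequence.

```python
import io


def read_fasta(handle):
    """Yield (label, comment, sequence) for each record in a FASTA stream.

    Assumption: the label is the first whitespace-delimited token of the
    header line and everything after it is free-form commentary.
    """
    label, comment, chunks = None, "", []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if label is not None:
                yield label, comment, "".join(chunks)
            header = line[1:].split(None, 1)
            label = header[0] if header else ""
            comment = header[1] if len(header) > 1 else ""
            chunks = []
        elif label is not None and line.strip():
            # Sequence lines may wrap; join them back into one string.
            chunks.append(line.strip())
    if label is not None:
        yield label, comment, "".join(chunks)


# A small in-memory record with 50 bases per full line.
example = (
    ">sequenceName Comments about the sequence len=120\n"
    + "ACTGACTGAC" * 5 + "\n"
    + "ACTGACTGAC" * 5 + "\n"
    + "ACTGACTGAC" * 2 + "\n"
)
records = list(read_fasta(io.StringIO(example)))
```

Because the function is a generator, it holds only one record in memory at a time, which may matter for large multi-sequence files.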
Example
>sequenceName Comments about the sequence len=120
ACTGACTGACACTGACTGACACTGACTGACACTGACTGACACTGACTGAC
ACTGACTGACACTGACTGACACTGACTGACACTGACTGACACTGACTGAC
ACTGACTGACACTGACTGAC
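The example also illustrates why equal line lengths help with indexing: when every full line holds the same number of bases, the byte position of any base can be computed directly instead of scanning the file. The sketch below is illustrative only. The names base_offset and fetch_base are hypothetical, and it assumes 50 bases per line, a single-byte "\n" line ending, and 0-based positions.

```python
import io


def base_offset(seq_start, line_bases, line_bytes, pos):
    """Byte offset of the 0-based base `pos` within one record.

    seq_start  -- byte offset of the first base (just after the header line)
    line_bases -- bases per full sequence line (assumed constant)
    line_bytes -- bytes per full line, including the line ending
    """
    return seq_start + (pos // line_bases) * line_bytes + pos % line_bases


# The example record as bytes, wrapped at 50 bases per line.
header = b">sequenceName Comments about the sequence len=120\n"
data = header + (b"ACTGACTGAC" * 5 + b"\n") * 2 + b"ACTGACTGAC" * 2 + b"\n"
handle = io.BytesIO(data)


def fetch_base(handle, pos):
    """Seek straight to one base without reading the lines before it."""
    handle.seek(base_offset(len(header), 50, 51, pos))
    return handle.read(1)


# Base 50 is the first base on the second sequence line.
first_of_second_line = fetch_base(handle, 50)
```

If lines within a sequence had differing lengths, this arithmetic would no longer hold and a reader would have to scan line by line.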