Jennifer Shelton edited introduction.tex  over 8 years ago

Commit id: f056c3403c7e7387f346190333b4e22ac13bc123


\subsection{FASTA file format specifications versus recommendations}
FASTA file format requirements are minimal \cite{FASTA_format}. Each sequence is preceded by a header/description line that begins with a \verb|>|. Sequence lines can include any standard IUB/IUPAC single-character symbol for nucleic acids or amino acids, including the ambiguity codes that indicate a set of possible residues or bases. They can also include \verb|-| to indicate alignment gaps and \verb|*| to indicate stop codons. It is often recommended to wrap the sequence lines of a FASTA file. It is also common practice to use the first `word' in a header (i.e., any character string to the left of the first space in the header) as the unique sequence id. Although these features are common, they are not required, which leads to format compatibility issues with tools that treat these conventions as required features.
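
As an illustration of the distinction above, the following minimal Python sketch reads FASTA records without assuming that sequence lines are wrapped, and separately reports first `words' that are shared by more than one header. The function names \verb|read_fasta| and \verb|duplicate_first_words| are hypothetical and chosen only for this example; they do not refer to any existing tool. The sketch does not validate sequence symbols, and it ignores any sequence lines that appear before the first header.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; sequence lines may be wrapped or not."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Join wrapped lines; lines before the first header are ignored.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def duplicate_first_words(headers: Iterable[str]) -> List[str]:
    """Return the first 'words' that occur in more than one header."""
    # The first 'word' is everything to the left of the first space.
    counts = Counter(h.split(" ", 1)[0] for h in headers)
    return sorted(word for word, n in counts.items() if n > 1)


# Hypothetical input: one wrapped record, one unwrapped record, same first word.
example = [
    ">seq1 sample A",
    "ACGT",
    "ACGT",
    ">seq1 sample B",
    "ACGTNNAC-GT*",
]
records = list(read_fasta(example))
repeated_ids = duplicate_first_words(header for header, _ in records)
```

For this input, both records are read even though only the first is wrapped, and \verb|repeated_ids| contains \verb|seq1|, the kind of collision that a tool relying on the first-word convention would need to detect or reject.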