Procedure to run the FASTA program
Four steps are required to run the FASTA program.
Step 1: Specify the tool input (sequence and database).
Step 2: Enter the input sequence.
Step 3: Set up the parameters.
Step 4: Submit the query for processing.
Figure 1: FASTA home page
Step 1: Specify the tool input
Select the database to search: A database is required to run the sequence similarity search, and multiple databases can be searched at the same time. The available databases are:
UniProt Knowledgebase
UniProtKB/Swiss-Prot
UniProtKB/Swiss-Prot isoforms
UniProtKB/TrEMBL
UniProtKB Taxonomic Subsets
UniProt Clusters
Patents
Structure
Figure 2: Selecting the database
Step 2: Entering the input sequence
The query sequence can be entered directly in GCG, FASTA, EMBL, GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot formats.
Sequence file upload
A file containing a valid sequence in any of the formats mentioned above can be uploaded as the query for the sequence similarity search. The sequence type setting indicates the type of the query sequence (PROTEIN / DNA / RNA). See the Simulator tab for details on how to retrieve a query sequence.
Figure 3: Entering the input sequence
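As an illustration of what a FASTA-format query looks like before it is pasted or uploaded, the following is a minimal Python sketch that reads FASTA-formatted text and makes a rough guess at the sequence type. The function names and the example sequence are hypothetical and are not part of the FASTA tool; the type guess is a simple heuristic, not the check the server performs.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined together
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_sequence_type(sequence):
    """Roughly guess DNA, RNA or PROTEIN from the letters present (heuristic)."""
    letters = set(sequence.upper())
    if letters <= set("ACGTN"):
        return "DNA"
    if letters <= set("ACGUN"):
        return "RNA"
    return "PROTEIN"


# Hypothetical example query, for illustration only.
example = """>query1 hypothetical example sequence
MKTAYIAKQR
QISFVKSHFS
"""
records = parse_fasta(example)
for header, sequence in records:
    print(header, len(sequence), guess_sequence_type(sequence))
```

A short sequence made only of A, C and G is ambiguous between DNA and RNA, so the sequence type should still be set explicitly on the form.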
Step 3: Setting up parameters
The user has to specify the program and the scoring matrix. The available programs are FASTA, FASTX, FASTY, SSEARCH, GGSEARCH and GLSEARCH. Substitution matrices are used for scoring alignments; the available matrices are BLOSUM50, BLOSUM62, BLASTP62, PAM120, PAM250, MDM10, MDM20 and MDM40, with BLOSUM50 as the default. The other parameters include:
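Gap open and gap extension penalties: Gaps in an alignment usually arise from insertion or deletion mutations. The gap open penalty is charged when a gap is started, and the gap extension penalty is charged for each further position in the same gap. Lower penalties allow more gaps and therefore tend to give higher alignment scores, but gaps also increase the uncertainty of the alignment.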
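Ktup: The word size (k-tuple length) used in the initial comparison of the query with the database sequences. Larger values make the search faster but less sensitive.
Expectation value (E-value): The number of matches with an equal or better score that would be expected by chance. It decreases exponentially as the score assigned to an alignment between two sequences increases.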
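Strand, Histogram, Filter: Strand selects which strand of a nucleotide query is searched. Histogram gives a graphical representation of the score distribution. Filter masks low-complexity regions of the query in the sequence similarity search.
Statistical estimates, Scores, Alignments, Sequence range and Database range: These options control how the statistics are estimated, how many scores and alignments are reported, and which region of the query and which range of the database are searched.
HSPs, Score format, Translation table: HSPs and Score format control how the scores are reported. The translation table specifies the genetic code used in translation.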
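To make the role of the Ktup word size concrete, here is a minimal Python sketch of word matching between a query and a subject sequence. It only counts shared words per diagonal, which is a simplified picture of the first stage of a FASTA-style search and not the program's actual implementation; the function names and the example sequences are hypothetical.

```python
from collections import defaultdict


def index_words(sequence, ktup):
    """Map each ktup-length word to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - ktup + 1):
        index[sequence[i:i + ktup]].append(i)
    return index


def shared_word_diagonals(query, subject, ktup):
    """Count shared words on each diagonal (subject position - query position)."""
    index = index_words(query, ktup)
    diagonals = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], []):
            diagonals[j - i] += 1
    return dict(diagonals)


# Hypothetical sequences, for illustration only.
query = "ACGTACGT"
subject = "TTACGTAA"
print(shared_word_diagonals(query, subject, 4))
print(shared_word_diagonals(query, subject, 6))
```

With these sequences a word size of 4 finds shared words on two diagonals, while a word size of 6 finds none, which illustrates why a larger Ktup is faster but can miss weaker similarities.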
Figure 4: Setting up parameters
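The effect of the gap penalties on a score can be sketched in Python as follows. This is an illustration under stated assumptions: fixed match and mismatch scores stand in for a substitution matrix such as BLOSUM50, the first position of a gap costs the open penalty and each further position costs the extension penalty, and all numeric values are hypothetical rather than the tool's defaults.

```python
def score_alignment(aligned_query, aligned_subject, match=5, mismatch=-4,
                    gap_open=-10, gap_extend=-2):
    """Score a gapped pairwise alignment with affine gap penalties.

    Assumed convention: the first position of a gap costs gap_open and every
    further position of the same gap costs gap_extend.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_side = None  # which sequence currently carries a gap, if any
    for a, b in zip(aligned_query, aligned_subject):
        if a == "-" or b == "-":
            side = "query" if a == "-" else "subject"
            # Continuing a gap in the same sequence is an extension.
            score += gap_extend if side == gap_side else gap_open
            gap_side = side
        else:
            score += match if a == b else mismatch
            gap_side = None
    return score


# Hypothetical alignment with one gap of length 2 in the subject.
print(score_alignment("ACGTTACG", "ACG--ACG"))
print(score_alignment("ACGTTACG", "ACG--ACG", gap_open=-2, gap_extend=-1))
```

The same alignment scores higher when the penalties are lowered, which is why low gap penalties tend to produce higher-scoring but less reliable alignments.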
Step 4: Submission
Click Submit to run the search; the result page opens in another window. The job runs interactively, and the result is displayed in the browser when processing is complete. The result can also be sent by email if a valid email address is entered in the text box.
Figure 5: Submission
FASTA Result Analysis
Summary Table
The result page lists the aligned sequences found by the sequence similarity search, together with information such as the database ID, the source of the sequence, gene expression, molecule type, nucleotide sequence, genomics, protein sequences, ontologies, enzymes, protein families and literature. This is followed by the sequence length, score, identities, positives and E-value.
Figure 6: Summary table
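The identities column of the summary table can be understood with a small Python sketch that computes percent identity from a gapped pairwise alignment. It assumes that columns containing a gap count towards the overlap length; the function name and the example alignment are hypothetical, and the positives column is not covered because it depends on the chosen substitution matrix.

```python
def identity_stats(aligned_query, aligned_subject):
    """Return (identical columns, overlap length, percent identity).

    Assumption: gap columns are included in the overlap length.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    overlap = 0
    for a, b in zip(aligned_query, aligned_subject):
        overlap += 1
        if a == b and a != "-":
            identical += 1
    percent = 100.0 * identical / overlap if overlap else 0.0
    return identical, overlap, percent


# Hypothetical alignment, for illustration only.
print(identity_stats("ACGTTACG", "ACG--ACG"))
```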
Tool output
The tool output gives the complete statistical details of the sequence similarity search.
Figure 7: Tool output
Visual output
The FASTA visual output shows the matched regions of the query and subject sequences, colour-coded according to their E-values.
Figure 8: Visual output
This experiment uses: FASTaye (FASTA), http://www.ebi.ac.uk/Tools/sss/fasta/