
Procedure to run FASTA program

Four steps are required to run the FASTA program.

Step 1: Specify the tool input (sequence and database).

Step 2: Enter the input sequence.

Step 3: Set up the parameters.

Step 4: Submit the query for processing.

Figure 1: FASTA home page

Step 1: Specify the tool input

Select the database to search: Databases are required to run the sequence similarity search, and multiple databases can be searched at the same time. The available databases are:

UniProt Knowledgebase

UniProtKB/Swiss-Prot

UniProtKB/Swiss-Prot isoforms

UniProtKB/TrEMBL

UniProtKB Taxonomic Subsets

UniProt Clusters

Patents

Structure

Figure 2: Selecting the database

Step 2: Entering the input sequence

The query sequence can be entered directly in GCG, FASTA, EMBL, GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot formats.

Sequence file upload

A file containing a valid sequence in any of the formats mentioned above can be used as the query for the sequence similarity search. The sequence type option indicates the type of the query sequence (PROTEIN / DNA / RNA). Go to the Simulator tab to learn how to retrieve the query sequence.

Figure 3: Entering the input sequence
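Before pasting or uploading a query, it can help to check that it is well-formed. The following is a minimal sketch, assuming the query is in FASTA format (a `>` header line followed by sequence lines). The function names and the example record are hypothetical, and the type guess is only a heuristic based on the letters used, not the method the FASTA server applies.

```python
def parse_fasta(text):
    """Parse FASTA-format text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_sequence_type(sequence):
    """Guess PROTEIN, DNA or RNA from the letters used (a heuristic only)."""
    # Ignore the unknown-base code, gap characters and stop symbols.
    letters = set(sequence.upper()) - {"N", "-", "*"}
    if letters <= set("ACGT"):
        return "DNA"
    if letters <= set("ACGU"):
        return "RNA"
    return "PROTEIN"


# Hypothetical example record, made up for illustration.
example = """>demo_query hypothetical example record
MKTAYIAKQR
QISFVKSHFS
"""
for header, sequence in parse_fasta(example):
    print(header, len(sequence), guess_sequence_type(sequence))
```

A short nucleotide query that happens to contain only A, C and G would be reported as DNA, so the sequence type should still be set explicitly in the form.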

Step 3: Setting up parameters

The user has to specify the program and the scoring matrix. The available programs are FASTA, FASTX, FASTY, SSEARCH, GGSEARCH and GLSEARCH. Substitution matrices are used for scoring alignments; the available matrices are BLOSUM50, BLOSUM62, BLASTP62, PAM120, PAM250, MDM10, MDM20 and MDM40, with BLOSUM50 as the default. The other parameters include:

Gap open and gap extension penalties: Gaps in an alignment most commonly arise from mutations (insertions or deletions). If the gap penalties are low, more gaps are allowed and the search tends to return higher-scoring alignments, but each added gap also increases the uncertainty of the alignment.

Ktup: The word size used for the initial comparison between the query and the database sequences. A larger value makes the search faster but less sensitive.

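Expectation value (E-value): The number of alignments with an equal or better score that would be expected by chance in a search of the same database. It decreases exponentially as the score assigned to an alignment between two sequences increases, so lower E-values indicate more significant matches.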

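Strand, Histogram, Filter: Strand selects which strand of a nucleotide query is searched. Histogram adds a graphical representation of the score distribution to the output. Filter masks low-complexity regions of the query in the sequence similarity search.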

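Statistical estimates, Scores, Alignments, Sequence range and Database range: These set the statistical estimation method, limit the number of scores and alignments reported, and specify the range of the query (sequence range) and of the database (database range) used in the search.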

HSPs, Score format, Translation table: HSPs and Score format set how scores are reported. The translation table specifies the genetic code used in translation.

Figure 4: Setting up parameters
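The effect of Ktup and of the two gap penalties can be illustrated with a small sketch. It is not the FASTA implementation: it only counts identical words of length ktup shared by two sequences, grouped by diagonal, and computes the cost of a gap under one common affine convention (the first gap position costs the open penalty, each further position the extension penalty). Tools differ in how they define these two penalties, so the convention, the function names and the example sequences are all assumptions.

```python
from collections import defaultdict


def ktup_index(sequence, ktup):
    """Map every word of length ktup to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - ktup + 1):
        index[sequence[i:i + ktup]].append(i)
    return index


def shared_words_by_diagonal(query, subject, ktup):
    """Count identical ktup-words shared by two sequences, per diagonal.

    The diagonal is (subject position - query position). Many hits on one
    diagonal suggest an ungapped region of similarity.
    """
    index = ktup_index(query, ktup)
    diagonals = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], []):
            diagonals[j - i] += 1
    return dict(diagonals)


def affine_gap_cost(gap_length, gap_open, gap_extend):
    """Cost of one gap, assuming: open for the first position, extend after."""
    if gap_length <= 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


# Made-up sequences for illustration.
query = "ACDEFGHIK"
subject = "XXACDEFGYY"
print(shared_words_by_diagonal(query, subject, ktup=1))
print(shared_words_by_diagonal(query, subject, ktup=2))
print(affine_gap_cost(3, gap_open=10, gap_extend=2))
```

With a larger ktup there are fewer words to look up and fewer chance hits, which is why the search gets faster but can miss weaker similarities.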

Step 4: Submission

Clicking Submit opens the result page in another window. The search runs interactively: when the job is complete, the result is displayed in the browser. The result can also be sent by email if a valid address is entered in the text box.

Figure 5: Submission
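The four steps can be summarised as one record of choices that is checked before submission. The sketch below is only an illustration of that idea: the lists of programs, matrices and sequence types come from the text above, but the function `build_job` and the dictionary keys are hypothetical and are not the field names of the web form or of any EBI interface. Nothing is sent over the network.

```python
# Choices listed in Steps 2 and 3 above.
PROGRAMS = {"FASTA", "FASTX", "FASTY", "SSEARCH", "GGSEARCH", "GLSEARCH"}
MATRICES = {"BLOSUM50", "BLOSUM62", "BLASTP62", "PAM120", "PAM250",
            "MDM10", "MDM20", "MDM40"}
SEQUENCE_TYPES = {"PROTEIN", "DNA", "RNA"}


def build_job(sequence, databases, program="FASTA", matrix="BLOSUM50",
              sequence_type="PROTEIN", email=None):
    """Collect the choices from Steps 1 to 4 into one dictionary (hypothetical keys)."""
    if not databases:
        raise ValueError("Step 1: select at least one database")
    if not sequence.strip():
        raise ValueError("Step 2: a query sequence is required")
    if sequence_type not in SEQUENCE_TYPES:
        raise ValueError(f"Step 2: unknown sequence type {sequence_type!r}")
    if program not in PROGRAMS:
        raise ValueError(f"Step 3: unknown program {program!r}")
    if matrix not in MATRICES:
        raise ValueError(f"Step 3: unknown matrix {matrix!r}")
    job = {
        "databases": list(databases),
        "sequence": sequence,
        "sequence_type": sequence_type,
        "program": program,
        "matrix": matrix,
    }
    if email:
        job["email"] = email  # optional: where the result should be sent
    return job


# Made-up query for illustration; BLOSUM50 is used by default.
job = build_job(">demo_query\nMKTAYIAKQR", ["UniProtKB/Swiss-Prot"])
print(job["program"], job["matrix"], job["databases"])
```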

FASTA Result Analysis

Summary Table

The result page lists the aligned sequences found by the sequence similarity search, with the database ID and source of each sequence and related information such as gene expression, molecule type, nucleotide sequences, genomics, protein sequences, ontologies, enzymes, protein families and literature. These are followed by the sequence length, score, identities, positives and E-value.

Figure 6: Summary table
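The identities and positives columns can be understood with a short sketch that counts them for one pairwise alignment. This is a simplified illustration, not how the server computes its table: identities are positions where both residues are the same, and positives are positions whose substitution score is greater than zero. Here the positive-scoring pairs are a small hand-picked set (`SIMILAR_PAIRS`, an assumption), whereas a real search takes them from the chosen substitution matrix. The aligned sequences are made up.

```python
# Toy set of residue pairs assumed to score positive; a real search takes
# this from the chosen substitution matrix (for example BLOSUM50).
SIMILAR_PAIRS = {frozenset(p) for p in ("IL", "IV", "LV", "KR", "DE", "FY", "ST")}


def identities_and_positives(aligned_query, aligned_subject):
    """Return (identities, positives, alignment length) for two aligned rows.

    Both rows must have the same length, with '-' marking gap positions.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    identities = positives = 0
    for a, b in zip(aligned_query.upper(), aligned_subject.upper()):
        if a == "-" or b == "-":
            continue  # gap positions count as neither
        if a == b:
            identities += 1
            positives += 1  # identical residues also count as positives
        elif frozenset((a, b)) in SIMILAR_PAIRS:
            positives += 1
    return identities, positives, len(aligned_query)


# Made-up alignment for illustration.
ident, pos, length = identities_and_positives("MKV-LDE", "MRVALEE")
print(f"identities {ident}/{length}, positives {pos}/{length}")
```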

Tool output

The tool output gives the complete statistical details of the sequence similarity search.

Figure 7: Tool output

Visual output

The FASTA visual output shows the matches between the query sequence and the subject sequences, with their E-values, in a colour-coded scheme.

Figure 8: Visual output

This experiment uses: FASTA (pronounced "FAST-Aye"), http://www.ebi.ac.uk/Tools/sss/fasta/
