- Secondary structure analysis of a protein using SOPMA
In this exercise, you will learn how to analyze the secondary structure of a protein using SOPMA. The structure of a protein plays a central role in its function: a protein must bind other molecules with high specificity to carry out its function properly, and that specificity depends on its shape. For this reason, every protein adopts a particular structure. Protein structure is described at four levels: primary, secondary, tertiary, and quaternary. A protein is synthesized as a primary sequence, which then folds to form secondary and tertiary structure; proteins made of more than one chain also assemble into a quaternary structure.
All proteins are made up of long chains of amino acids that fold into a three-dimensional shape. An amino acid is an organic compound in which a central α carbon is bonded to a hydrogen atom, two functional groups (an amino group and a carboxyl group), and a side chain called the R group. The α carbon is the carbon atom directly attached to the carboxyl group. The side chain is what distinguishes one amino acid from another. Twenty standard amino acids, differing in their R groups, are found in human proteins. An R group can be hydrophobic or hydrophilic. Hydrophobic side chains tend to move away from the aqueous environment, while hydrophilic side chains are attracted to it. Some hydrophilic side chains carry groups that make them acidic, and others carry groups that make them basic, so basic side chains are attracted to acidic ones. Interactions of this kind help the protein settle into its native conformation, the correctly folded and functional state of the protein.
Amino acids are linked to each other by peptide bonds. A peptide bond is a covalent bond formed when the carboxyl group of one amino acid reacts with the amino group of another. One molecule of water is released in this reaction. A short chain of amino acids held together by peptide bonds is called a peptide, and each amino acid in a peptide is called a residue. Every peptide has two ends, the N-terminus and the C-terminus. The N-terminus is the start of the chain, where the residue has a free amino group (-NH2); the C-terminus is the end of the chain, where the residue has a free carboxyl group (-COOH).
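Because each peptide bond releases one water molecule, a chain of n residues contains n - 1 peptide bonds and weighs less than the sum of its free amino acids. The following is a minimal sketch of that bookkeeping. The function names are hypothetical, and the mass table is deliberately limited to three amino acids with approximate average masses.

```python
# Approximate average masses (in daltons) of three free amino acids.
# This small table is an illustrative assumption, not a complete reference.
FREE_AMINO_ACID_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}
WATER_MASS = 18.015  # approximate mass of one water molecule


def peptide_bond_count(sequence: str) -> int:
    """A linear chain of n residues has n - 1 peptide bonds."""
    return max(len(sequence) - 1, 0)


def peptide_mass(sequence: str) -> float:
    """Sum of the free amino acid masses minus one water per peptide bond."""
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    return total - peptide_bond_count(sequence) * WATER_MASS


if __name__ == "__main__":
    # The sequence is written from the N-terminus (left) to the C-terminus (right).
    print(peptide_bond_count("GAS"))
    print(round(peptide_mass("GA"), 3))
```

By convention, the one-letter sequence is read from the N-terminus to the C-terminus, so in "GAS" glycine carries the free amino group and serine the free carboxyl group.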
The primary structure of a protein is its linear sequence of amino acids. It is synthesized during translation, in which the sequence of an mRNA is read to build the protein; the mRNA itself is produced from DNA by transcription. DNA (deoxyribonucleic acid) is the genetic material that carries the information needed for the development and functioning of living organisms. The information is stored as a genetic code written with four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Between the two strands of DNA, adenine always pairs with thymine and guanine with cytosine. Each base is bonded to a sugar and a phosphate group to form a nucleotide. Base pairing holds the two strands together in a twisted, ladder-like structure called the double helix. RNA differs from DNA in its sugar (ribose instead of deoxyribose) and in one base: it contains uracil (U) instead of thymine. mRNA (messenger RNA) is the RNA molecule produced by transcription of DNA. The mRNA has the same sequence as the DNA coding strand, except that thymine is replaced by uracil.
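The base-pairing and transcription rules described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and transcription is modelled in the simplest way, by taking the DNA coding strand and replacing thymine with uracil.

```python
# Pairing rules between the two strands of DNA: A with T, G with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the bases that pair with each base of a DNA strand."""
    return "".join(DNA_COMPLEMENT[base] for base in strand.upper())


def transcribe(coding_strand: str) -> str:
    """Write the mRNA for a DNA coding strand: same sequence, with U for T."""
    return coding_strand.upper().replace("T", "U")


if __name__ == "__main__":
    print(complement("ATGC"))    # TACG
    print(transcribe("ATGGTC"))  # AUGGUC
```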
Secondary structure arises from hydrogen bonding between the amide groups of the polypeptide backbone. A hydrogen bond is the attraction of a hydrogen atom that is covalently bonded to an electronegative atom (such as N, O, or F) towards another electronegative atom. When both atoms belong to the same molecule the hydrogen bond is intramolecular; when they belong to two different molecules it is intermolecular. Alpha helices and beta sheets are the two most important secondary structures in proteins. The alpha helix has a right-handed helical conformation. It is stabilized by hydrogen bonds between the carbonyl (C=O) group of one residue and the amino (N-H) group of the residue four positions further towards the C-terminus. Stretches of the chain with an extended, zigzag backbone are called strands (e.g. beta strands). Beta sheets are sheet-like structures made up of beta strands that lie side by side and are connected through hydrogen bonds.
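The alpha-helix hydrogen-bonding pattern can be made concrete by listing which residues are paired. The sketch below assumes an idealized helix that spans the whole peptide, numbers residues from 1 at the N-terminus, and uses a hypothetical function name.

```python
def helix_hbond_pairs(n_residues: int) -> list:
    """List (i, i + 4) pairs for an idealized alpha helix.

    Each pair means: the C=O group of residue i accepts a hydrogen bond
    from the N-H group of residue i + 4 (residues numbered from 1).
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


if __name__ == "__main__":
    print(helix_hbond_pairs(6))  # [(1, 5), (2, 6)]
```

A peptide shorter than five residues yields no pairs, because no residue has a partner four positions further along the chain.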
The three-dimensional structure of a protein can be determined experimentally by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. In X-ray crystallography, X-rays whose wavelength is comparable to the distances between atoms are diffracted by the electrons of the atoms in a protein crystal. The resulting diffraction pattern is recorded as an array of small spots on an X-ray film. These patterns are used to calculate the coordinates of the atoms in the protein. NMR spectroscopy relies on the fact that certain atomic nuclei placed in a strong magnetic field absorb electromagnetic radiation of a characteristic frequency. Electromagnetic radiation is a form of energy made up of oscillating electric and magnetic fields; it includes X-rays, gamma rays, radio waves, and visible light.
The Self-Optimized Prediction Method with Alignment (SOPMA) is a tool for predicting the secondary structure of a protein. Given a query (the primary sequence of a protein), SOPMA predicts its secondary structure. SOPMA builds on the homologue method of Levin et al., which rests on the observation that short homologous sequences of amino acids tend to form similar secondary structures. It relies on a database of 126 chains of non-homologous proteins. When the user submits an unknown protein, SOPMA compares it against the proteins in this database to find those with similar properties and evolutionary history.
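The idea that short similar segments tend to share secondary structure can be illustrated with a toy predictor. This is not SOPMA's actual algorithm or scoring scheme: the two-entry database, the 5-residue window, the exact-identity score, and the threshold are all assumptions chosen only to keep the sketch small. States are written as H (helix), E (strand), and C (coil).

```python
from collections import Counter

# Hypothetical toy database of (sequence, known per-residue state) pairs.
# It stands in for a real database of proteins with known structure.
TOY_DATABASE = [
    ("AEELLKKA", "CHHHHHHC"),
    ("VTVYVGSG", "EEEEECCC"),
]


def window_identity(a: str, b: str) -> int:
    """Count positions at which two equal-length windows have the same residue."""
    return sum(x == y for x, y in zip(a, b))


def predict(query: str, database=TOY_DATABASE, window: int = 5, min_identity: int = 3) -> str:
    """Predict one state per residue by voting over similar database windows."""
    half = window // 2
    votes = [Counter() for _ in query]
    for start in range(len(query) - window + 1):
        query_window = query[start:start + window]
        for sequence, states in database:
            for s in range(len(sequence) - window + 1):
                if window_identity(query_window, sequence[s:s + window]) >= min_identity:
                    # The central residue of the query window inherits the
                    # state of the central residue of the matching window.
                    votes[start + half][states[s + half]] += 1
    # Residues that received no votes default to coil.
    return "".join(v.most_common(1)[0][0] if v else "C" for v in votes)


if __name__ == "__main__":
    print(predict("AEELLKKA"))
```

Residues near either end of the query never sit at the centre of a window, so in this sketch they always fall back to coil; a real predictor handles chain ends and scoring far more carefully.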