Objective
To integrate microarray expression data with a biological network created using a systems biology tool.
Theory
The cell is the most important principal component of life, in which many reactions take place. The cell contains the genetic material, DNA, which encodes proteins. Proteins carry out many important functions, some of which include acting as enzymes, providing energy for metabolic activities, giving shape to the cell, and enabling cell motility. Along with these activities, proteins play a very important role in cell signaling.
The gene is the main unit of heredity underlying all these actions. Genes hold the information needed to build and maintain cellular organisms. Gene expression is the process by which the information in a gene is used to produce a functional product. Gene expression can be observed by different experimental techniques such as Northern blot, serial analysis of gene expression (SAGE), Western blot, DNA microarray, RNA-Seq, etc.
Microarray technique
There are various traditional molecular biology techniques for analyzing genes, but these methods cannot analyze a large number of genes simultaneously. The DNA microarray is an advanced technology that enables molecular biologists to analyze the expression of many genes in a quick and efficient manner.
A microarray is a glass slide on which DNA molecules are arranged in an orderly manner at specific spots. Each spot may contain millions of identical copies of a DNA molecule. To determine which genes are switched on or off, the researcher first needs the mRNA molecules present in the cell. The mRNA is labeled by using reverse transcriptase to generate cDNA complementary to it. Fluorescent nucleotides are added to the cDNA, with a different dye for each condition. The labeled cDNA samples are then placed on the microarray slide, where they bind (hybridize) to the DNA on the chip.
Figure 1: Schematic representation of Gene expression using microarray technique
( Image source : en.wikipedia.org/wiki/File:Microarray-schema.jpg )
Following hybridization, the spots on the microarray are excited by a laser and scanned at suitable wavelengths to detect the red and green dyes. If a particular gene is expressed at a higher level in one condition (for example, healthy) than in the other (diseased), the spot appears red, assuming the healthy sample carries the red dye; if it is expressed at a lower level, the spot appears green. At the end of the experiment, an image of the microarray is obtained. This image is then processed and analyzed, and the resulting expression data can be represented in different ways:
- Absolute measurement
- Relative measurement or expression ratio
- Log2 (expression ratio)
- Discrete values
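The representations listed above can be illustrated with a minimal Python sketch. The gene names, the intensity values, and the two-fold threshold used for discretization are assumptions made for illustration only; they are not data from an actual experiment.

```python
import math

# Hypothetical background-corrected spot intensities: (red channel, green channel).
# These absolute measurements are made up for illustration.
intensities = {
    "geneA": (2000.0, 500.0),
    "geneB": (300.0, 1200.0),
    "geneC": (800.0, 800.0),
}

def expression_ratio(red, green):
    """Relative measurement: red intensity divided by green intensity."""
    return red / green

def log2_ratio(red, green):
    """Log2 of the expression ratio; up- and down-regulation become symmetric around 0."""
    return math.log2(red / green)

def discretize(log_ratio, threshold=1.0):
    """Discrete value: +1 (up), -1 (down), or 0 (unchanged).

    The default threshold of 1.0 (a two-fold change) is an assumed cut-off.
    """
    if log_ratio >= threshold:
        return 1
    if log_ratio <= -threshold:
        return -1
    return 0

for gene, (red, green) in intensities.items():
    lr = log2_ratio(red, green)
    print(gene, expression_ratio(red, green), lr, discretize(lr))
```

The log2 form is often convenient because a four-fold increase and a four-fold decrease give values of equal size and opposite sign.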
Thus, the microarray technique provides the expression levels of thousands of genes. These genes can be represented graphically in the form of biological networks, and the gene expression data can be used to analyze and validate these networks.
DNA, RNA, proteins, and metabolites are all ingredients of a cell, which in turn is part of a tissue. Different tissues form the organs of an organism, and many organisms form an ecosystem. Over time, these organisms undergo evolution, which results in phylogenetic relationships. These relationships can be described as a network. At the molecular level, biological networks include gene regulatory, signal transduction, protein interaction, and metabolic networks.
How are these complex pathways or networks represented on a computer?
From a mathematical or computer science perspective, a network is represented as a graph. A graph has nodes, also called vertices, which are linked to each other through edges. A network is thus simply a collection of nodes and the links connecting them. The minimum information required to form a network is the set of connectivity rules, i.e., which node connects to which node. In biological terms, the links might refer to the functions of particular proteins or genes (the nodes) or to their expression patterns. Networks are collections of interactions and contain interlinked pathways; every pathway is a subset of a network.
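As a minimal sketch of this idea, the connectivity rules can be stored as a list of node pairs and turned into an adjacency map. The protein names below are hypothetical placeholders, and an undirected graph is assumed.

```python
from collections import defaultdict

# Connectivity rules: which node connects to which node.
# The protein names are hypothetical placeholders.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P3", "P4")]

def build_adjacency(edge_list):
    """Build an undirected adjacency map {node: set of neighbours} from an edge list."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

def degree(adjacency, node):
    """Number of edges attached to a node (0 if the node is not in the network)."""
    return len(adjacency.get(node, set()))

network = build_adjacency(edges)
for node in sorted(network):
    print(node, sorted(network[node]), degree(network, node))
```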
Figure 2: Example of a complex biological network (protein-protein interaction network)
Why design or model biological networks?
Modeling a biological network represents a biological scenario, such as a pathway, and can help bring out hidden properties of that system. The model can help predict the dynamic behavior of the network, which can then be compared with experimental results. Based on this comparison, the model can be corrected or its accuracy improved.
What are the different kinds of biological networks?
There are different kinds of biological networks, mainly depending upon the biological process studied. One can distinguish gene regulatory networks, protein-protein interaction networks, and metabolic networks.
Protein interaction networks: the nodes are proteins and the edges are interactions between proteins. Signal transduction networks are protein interaction networks in which the proteins regulate the transmission of signals within the cell. Their edges are directed and can have an activating or inhibiting role.
Gene regulatory networks govern gene expression. The nodes are genes, proteins, or mRNAs, and the edges can represent inhibition or activation.
Metabolic pathways or networks describe the set of biochemical reactions that regulate a biological process. The nodes can be substrates or products, and the edges reflect reactions or their regulation.
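Directed edges with an activation or inhibition role can be sketched as (source, target, effect) triples. The node names and the two effect labels below are illustrative assumptions, not entries from a real regulatory database.

```python
# Each directed edge is (source, target, effect); effect is "activation" or "inhibition".
# All names are hypothetical.
regulatory_edges = [
    ("TF1", "geneA", "activation"),
    ("TF1", "geneB", "inhibition"),
    ("geneA", "geneC", "activation"),
]

def regulators_of(edges, target):
    """Return {regulator: effect} for every directed edge that ends at target."""
    return {src: effect for src, tgt, effect in edges if tgt == target}

def targets_of(edges, source):
    """Return {target: effect} for every directed edge that starts at source."""
    return {tgt: effect for src, tgt, effect in edges if src == source}

print(targets_of(regulatory_edges, "TF1"))
print(regulators_of(regulatory_edges, "geneC"))
```

Because the edges are directed, "TF1" regulates "geneA" in this sketch but not the other way round, which an undirected protein interaction network would not distinguish.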
The modeling, reverse engineering, and analysis of these complex macromolecular networks have interested computational biologists and led them to develop specific tools such as Cytoscape, CellDesigner, E-Cell, JDesigner, etc.
Cytoscape is an open source platform for visualizing and analyzing complex networks, released under the GNU LGPL (Lesser General Public License). It is available for all major platforms. It is also extensible through a plug-in architecture that supports different computational analyses. Biological networks are organized as graphs in which each node is connected to other nodes through edges (called interactions). The software becomes more powerful when it is used in connection with large databases of genetic and protein-protein interactions.
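One possible way to prepare a network and its expression data for Cytoscape is sketched below in Python. It assumes the network is written in Cytoscape's Simple Interaction Format (SIF), in which each line holds a source node, an interaction type, and a target node, and that the expression values are supplied as a separate tab-delimited node table. The function names, node names, interaction labels, and the "log2_ratio" column name are hypothetical.

```python
def to_sif(edges):
    """Serialize (source, interaction, target) triples as SIF text, one edge per line."""
    return "\n".join(f"{src}\t{kind}\t{tgt}" for src, kind, tgt in edges)

def to_node_table(expression, column="log2_ratio"):
    """Serialize {node: value} as a tab-delimited table with a header row."""
    lines = [f"name\t{column}"]
    for node in sorted(expression):
        lines.append(f"{node}\t{expression[node]}")
    return "\n".join(lines)

def nodes_without_expression(edges, expression):
    """List network nodes that have no expression value to map onto them."""
    nodes = {src for src, _, _ in edges} | {tgt for _, _, tgt in edges}
    return sorted(nodes - set(expression))

# Hypothetical network and expression values (log2 ratios).
edges = [("TF1", "activates", "geneA"), ("TF1", "inhibits", "geneB")]
expression = {"geneA": 2.0, "geneB": -2.0}

print(to_sif(edges))
print(to_node_table(expression))
print(nodes_without_expression(edges, expression))
```

The two text outputs could be saved to files and loaded through Cytoscape's network and table import functions, after which the expression column can drive node colors; the exact import steps depend on the Cytoscape version and are not shown here.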