- To study genes and their features.
- To retrieve gene information from the NCBI Gene database.
The National Center for Biotechnology Information (NCBI) is a resource for molecular biology, computational biology, biochemistry, and genetics information. It aids the development of new technologies for managing the biological processes that control health and disease. A database is a collection of information stored and managed on a computer in an organized manner. Biological databases are libraries of biological information collected from the literature, experiments, and other analyses, which can easily be accessed, updated, and retrieved. A composite database integrates several primary databases.
NCBI was established on November 4, 1988 as a part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH). It can be accessed at http://www.ncbi.nlm.nih.gov/. It maintains collaborations with several NIH institutes, industry, and other government agencies.
The major functions of NCBI are:
- Create public databases for storing, retrieving, and analyzing knowledge about molecular biology, biochemistry, and genetics.
- Conduct research in computational biology, for analyzing the structure and function of biological molecules.
- Develop software tools for analyzing genomic data.
- Disseminate biomedical information.
- Coordinate efforts to gather biotechnology information worldwide.
NCBI supports various databases that can be accessed online through the Entrez search engine. Entrez is a powerful search engine that allows users to search and retrieve data from NCBI. It integrates the scientific literature, nucleotide and protein databases, protein domain data, population study datasets, expression data, pathways and systems of interacting molecules, complete genome details, and taxonomic information into a tightly interlinked system. Users can search and retrieve results from different databases within the same interface.
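Entrez can also be queried programmatically through NCBI's E-utilities web service. The following is a minimal sketch, assuming the public ESearch endpoint and its `db`, `term`, `retmax`, and `retmode` parameters. The helper name `build_esearch_url` and the search term `insulin` are illustrative only and are not part of this exercise. The code only constructs request URLs; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service (assumed here).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, term, max_results=20):
    """Return an ESearch URL that looks up `term` in one Entrez database."""
    params = {
        "db": database,          # Entrez database name, e.g. "gene"
        "term": term,            # the search text
        "retmax": max_results,   # maximum number of record IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


# The same term can be sent to several Entrez databases in turn.
for db in ("gene", "protein", "pubmed"):
    print(build_esearch_url(db, "insulin", max_results=5))
```

Each printed URL could be opened in a browser or fetched with an HTTP client to see the matching record identifiers for that database.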
NCBI is a composite database, and the databases it supports are accessed online through the Entrez search engine. The NCBI Gene database provides information on genes from organisms whose genomes have been completely sequenced. A gene is the unit of heredity, through which a living organism inherits features from its ancestors. Genetics is the branch of science that deals with the study of genes. Every organism has different genes corresponding to different characters; some characters are visible, like hair color and height, and some are not, like the biological processes inside the body. Genes are made of DNA, a long molecule that is the hereditary material. DNA lies in chromosomes (each a single piece of coiled DNA carrying many genes) and is passed from generation to generation. An allele is an alternative form of a gene; different alleles give rise to different characters that are passed from parents to offspring. The patterns of this inheritance were discovered by Gregor Mendel and are known as “Mendel’s Laws of Inheritance”. The total genetic content of an organism constitutes its genome. In humans, there are about 20,000 genes located on 23 pairs of chromosomes. Known genes, their products, and their functions are provided in the NCBI Gene database.
The information in the Gene database is the result of curation and automated integration of data from NCBI’s Reference Sequence project (RefSeq). It contains information for various species, including nomenclature, associated pathways, RefSeqs, phenotypes, and links to genomes. It can be accessed at http://www.ncbi.nlm.nih.gov/gene. To search the Gene database, type a query, preferably a gene name or a disease name, into the query box; the list of genes associated with the search is then displayed. Users can also search for records by GeneID, a unique identifier assigned by NCBI. The ‘Limits’ feature allows users to filter the search according to their needs.
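The two entry points described above, a text query and a GeneID, can be sketched as a small helper that builds the matching web address. This is an illustration under assumptions: the function name `gene_lookup_url` is hypothetical, and the `/gene/<GeneID>` and `/gene/?term=` address patterns are assumed from how the Gene website currently presents records and searches. The sample GeneID 672 is the human BRCA1 record.

```python
from urllib.parse import quote_plus

# Address of the Gene database (https form of the URL given above).
GENE_URL = "https://www.ncbi.nlm.nih.gov/gene"


def gene_lookup_url(query):
    """Return a record URL for a numeric GeneID, otherwise a search URL."""
    query = query.strip()
    if query.isdigit():
        # A purely numeric query is treated as a GeneID.
        return f"{GENE_URL}/{query}"
    # Anything else is treated as a gene name or disease name.
    return f"{GENE_URL}/?term={quote_plus(query)}"


print(gene_lookup_url("672"))            # direct link to one gene record
print(gene_lookup_url("breast cancer"))  # list of genes matching the text
```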
There are different ways to query NCBI Gene. The bases on which queries can be made are listed in Table 1.
Table 1: Different methods of querying the Gene database
There are two main ways to search for genes: enter a particular disease condition and look through the genes involved, or collect gene names from articles, interaction experiments, or pathways and enter those names directly. Gene names are assigned by the gene nomenclature committee specific to each organism. The HUGO Gene Nomenclature Committee (HGNC) is the committee that assigns unique gene names for Homo sapiens (human). Genes can also be searched by these names.
Users can filter results by applying limits to the query. To make the search specific, searches can be combined with the Boolean operators AND, OR, and NOT (uppercase only). The search can be limited to a particular organism by selecting it from the drop-down menu under “Limit by Chromosomal Region”. The “Exclude” option removes certain information from the result set, and the “Include” option adds the specific information the user needs. Users can limit by RefSeq (Reference Sequence) status, which locates the genomic RefSeq for the reference assembly. Users can limit the search by taxonomy, which restricts it to an organism or a group of organisms. Users can also filter the search by date, selected from the drop-down menu. The “Advanced” search option allows a detailed search for specific queries. After finishing a search, the “Clear” button can be used to start a new query.
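The way Boolean operators and limits combine into a single query string can be sketched as follows. This is a minimal illustration, not the Gene interface itself: the helper names `tagged` and `combine` are hypothetical, and the field tags `[Gene Name]` and `[Organism]` are assumed to be the Entrez Gene field names for a gene symbol and an organism. The gene symbols are sample values.

```python
def tagged(value, field):
    """Attach an Entrez field tag to a value, quoting multi-word values."""
    if " " in value:
        value = f'"{value}"'
    return f"{value}[{field}]"


def combine(operator, *terms):
    """Join terms with a Boolean operator and parenthesize the group."""
    operator = operator.upper()  # operators must be uppercase: AND, OR, NOT
    if operator not in ("AND", "OR", "NOT"):
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(terms) + ")"


# Either of two genes, restricted to human records.
either_gene = combine("or", tagged("BRCA1", "Gene Name"), tagged("BRCA2", "Gene Name"))
query = combine("and", either_gene, tagged("Homo sapiens", "Organism"))
print(query)
```

The printed string can be pasted into the Gene query box. Parentheses make the grouping explicit, so the organism limit applies to both gene names rather than only the last one.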