Procedure
- Follow the instructions at https://vlab.amrita.edu/index.php?sub=3&brch=311&sim=1835&cnt=2 to install R on a personal computer.
- Install the SeqinR package, for example with > install.packages("seqinr")
- To load the SeqinR package, enter > library("seqinr")
- To retrieve the sequence for a particular NCBI accession, use the R function getncbiseq(). This function is not part of SeqinR, so it must first be entered into R as follows:
> getncbiseq <- function(accession)
{
  require("seqinr") # this function requires the SeqinR R package
  # ACNUC databases to search, in this order:
  dbs <- c("genbank", "refseq", "refseqViruses", "bacterial")
  for (db in dbs)
  {
    choosebank(db) # open a connection to ACNUC database 'db'
    # query 'db' for the accession; an error means it is not stored there:
    resquery <- try(query(".tmpquery", paste("AC=", accession, sep = "")), silent = TRUE)
    if (!(inherits(resquery, "try-error")))
    {
      # the accession was found: retrieve the sequence from the query result
      seq <- getSequence(resquery$req[[1]])
      closebank()
      return(seq)
    }
    closebank() # not found here: close the connection and try the next database
  }
  print(paste("ERROR: accession", accession, "was not found"))
}
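The function builds its query string from the prefix AC= and the accession using paste(). As a small base-R sketch of that step (the variable names with_space and no_space are only illustrative), note that paste() inserts a space between its arguments unless sep = "" is given; the form without a space matches the query string "AC=NC_001477" used later in this procedure.

```r
accession <- "NC_001477"

# paste() separates its arguments with a single space by default
with_space <- paste("AC=", accession)

# sep = "" joins the pieces without a space; paste0() does the same
no_space <- paste("AC=", accession, sep = "")

print(with_space)  # "AC= NC_001477"
print(no_space)    # "AC=NC_001477"
```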
EXAMPLE
- After entering the function getncbiseq() into R, the user can retrieve a sequence from the NCBI Nucleotide database, for example the sequence with accession NC_001477.
- Enter the code:
> dengueseq <- getncbiseq("NC_001477")
- Here dengueseq is a variable: a vector that contains the nucleotide sequence, one nucleotide per element.
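To illustrate what such a vector looks like, here is a minimal sketch that uses a short, made-up stand-in (toyseq is hypothetical and is not the dengue sequence), so it runs without a database connection. The same base-R calls can be applied to dengueseq once it has been retrieved.

```r
# A short, made-up stand-in for the character vector returned by getSequence()
toyseq <- c("a", "g", "t", "t", "g", "t", "t", "a", "g", "t")

length(toyseq)                # number of nucleotides in the vector
toyseq[1:5]                   # the first five nucleotides
table(toyseq)                 # how many times each base occurs
paste(toyseq, collapse = "")  # the sequence joined into a single string
```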
Procedure to use the simulator
- Enter the following code in the command window:
library("seqinr") # load the seqinr package in R
choosebank("genbank") # connect to the GenBank ACNUC database
closebank() # close that connection before opening another database
choosebank("refseqViruses") # choose the database to search
test_query <- query("Dengue1", "AC=NC_001477") # run the query: list name "Dengue1", accession NC_001477
attributes(test_query) # view the attributes of the query result
dengueseq <- getSequence(test_query$req[[1]]) # get the DNA sequence
seqname <- getName(test_query) # get the name of the retrieved sequence
# optionally, save the sequence to a FASTA file:
# write.fasta(sequences = dengueseq, names = seqname, file.out = "denguevirus.fasta")
closebank() # close the connection when finished
Figure: R console for querying the NCBI database in R
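The commented-out write.fasta() line in the code above saves the retrieved sequence to a FASTA file. As a rough sketch of that step, the example below writes a short, made-up sequence to a temporary file and reads it back with read.fasta(). The names toyseq, toyname and outfile are hypothetical; with a real query result, dengueseq and seqname would take their place.

```r
library("seqinr")

# Made-up stand-ins for the retrieved sequence and its name
toyseq  <- c("a", "g", "t", "t", "g", "t", "t", "a", "g", "t")
toyname <- "toy_sequence"

# Write to a temporary file so that nothing in the working directory is overwritten
outfile <- file.path(tempdir(), "toy.fasta")
write.fasta(sequences = toyseq, names = toyname, file.out = outfile)

# Read the file back; read.fasta() returns a list with one element per sequence
back <- read.fasta(file = outfile, seqtype = "DNA")
names(back)              # the sequence name taken from the FASTA header
as.character(back[[1]])  # the nucleotides as a plain character vector
```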
Note
This experiment retrieves sequence data from NCBI databases such as GenBank using R. The choosebank() function in the seqinr library connects R to an ACNUC database that holds the GenBank data, and query() searches that database for a given accession. The user can then retrieve species-specific sequence data and save it in a variable.