[BioC] getHomolog in biomaRt

Steve Pederson stephen.pederson at student.adelaide.edu.au
Tue Apr 10 14:36:25 CEST 2007


Hi,

I'm still on a steep learning curve with R and am trying to convert a 
large batch of mouse entrezIDs to their homologous human entrezIDs. When I 
send the IDs as a batch to biomaRt, the search result doesn't contain the 
query ID (could this be added as a suggestion for the next update?), so the 
results can't be matched back to the original IDs. For example:

 > getHomolog( id = c("73663","66645","74855"), to.type = "entrezgene",
               from.type = "entrezgene", from.mart = mouse, to.mart = human )
      V1
1 55269

As a result, I'm sending the IDs one at a time via a quick function that I 
set up. The run regularly fails, and I get the following error message:
Error in read.table(con, sep = "\t", header = FALSE, quote = "", comment.char = "",  :
         no lines available in input

This is an example of the exact code that causes it:
library(biomaRt)
human <- useMart("ensembl","hsapiens_gene_ensembl")
mouse <- useMart("ensembl","mmusculus_gene_ensembl")
getHomolog( id = "380768", to.type = "entrezgene", from.type = "entrezgene",
            from.mart = mouse, to.mart = human )

The response here is not NULL; my code is already set up to handle a NULL response.

My main question is: does anyone know how I can stop the loop from 
aborting when I receive this error message, which I think originates 
externally? If I can record which specific IDs cause the error, I can 
exclude them from the original batch, but I find the error handling 
described in the R help a bit murky. My actual function 
(biomaRt.conversion) is included below.
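
To make the question concrete, here is a minimal sketch of the behaviour I am after, using base R's tryCatch(). The names safe.lookup and fake.lookup are hypothetical and only for illustration: fake.lookup is a stand-in for the real getHomolog() call and simply raises an error for one ID, so the example runs without a connection to biomaRt. This is one possible pattern, not necessarily the recommended one.

```r
# Hypothetical helper: apply `lookup` to each ID and record the IDs that
# raise an error instead of letting the error abort the loop.
# `lookup` is an assumption standing in for a call such as getHomolog().
safe.lookup <- function(ids, lookup)
  {
    results <- list()
    failed <- c()
    for (id in ids)
      {
        res <- tryCatch(lookup(id),
                        error = function(e)
                          {
                            # Return a marker object so the loop can carry on
                            structure(list(message = conditionMessage(e)),
                                      class = "lookup.failure")
                          })
        if (inherits(res, "lookup.failure"))
          failed <- c(failed, id)
        else
          results[id] <- list(res) # list() keeps a NULL result as an entry
      }
    list(results = results, failed = failed)
  }

# Stand-in for the real query: raises an error for one ID
fake.lookup <- function(id)
  {
    if (id == "380768") stop("no lines available in input")
    paste0("homolog.of.", id)
  }

out <- safe.lookup(c("73663", "380768", "66645"), fake.lookup)
out$failed          # the IDs that raised an error
names(out$results)  # the IDs that were looked up without an error
```

With something like this, the IDs collected in out$failed could be excluded from the original batch or retried later.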

Unfortunately, I don't have any MySQL experience (yet) so that isn't an 
option for me as an alternative.

The list consists of the IDs that could not be matched in 
ProbeMatchDB2.0, as that database maps via Unigene:
http://brainarray.mbni.med.umich.edu/Brainarray/Database/ProbeMatchDB/ncbi_probmatch_para_step1.asp

Thanks,

Steve



biomaRt.conversion <- function(x,from.id,to.id,from.sp,to.sp)
   {
     # x is the initial vector of ids
     # from.id & to.id are the type of codes (e.g. entrez or unigene)
     # from.sp & to.sp can only be "human" or "mouse"
     # Warnings are suppressed in case no match exists
     homologs <- c()
     no.homolog <- c()
     if (from.sp=="human") mart1 <- useMart("ensembl","hsapiens_gene_ensembl")
     if (to.sp=="human") mart2 <- useMart("ensembl","hsapiens_gene_ensembl")
     if (from.sp=="mouse") mart1 <- useMart("ensembl","mmusculus_gene_ensembl")
     if (to.sp=="mouse") mart2 <- useMart("ensembl","mmusculus_gene_ensembl")
     for (i in seq_along(x))
       {
         suppressWarnings(hum <- getHomolog( id = x[i], to.type = to.id,
             from.type = from.id, from.mart = mart1, to.mart = mart2))
         if (!is.null(hum)) # if a homolog was found
           {
             # A duplicate removal stage: keep the first occurrence of each
             # ID (assumes the IDs are in the first column of the result)
             hum <- unique(as.character(hum[,1]))
             # One row per (query ID, homolog) pair
             homologs <- rbind(homologs,cbind(x[i],hum))
           }
         else # if no homolog was found
           {
             no.homolog <- c(no.homolog,x[i])
           }
       }
     # Only name the columns if at least one homolog was found
     if (!is.null(homologs))
       {
         colnames(homologs) <- c(paste(from.sp,"ID",sep="."),paste(to.sp,"ID",sep="."))
       }
     list(homologs=data.frame(homologs),no.homolog=no.homolog)
   }


