Type: | Package |
Title: | Quick Access to Homologene and Gene Annotation Updates |
Version: | 1.4.68.19.3.27 |
Depends: | R (≥ 3.1.2) |
Imports: | dplyr (≥ 0.7.4), magrittr (≥ 1.5), purrr (≥ 0.2.5), readr (≥ 1.3.1), R.utils (≥ 2.8.0) |
Suggests: | testthat (≥ 1.0.2) |
Date: | 2019-03-28 |
BugReports: | https://github.com/oganm/homologene/issues |
URL: | https://github.com/oganm/homologene |
Description: | A wrapper for the homologene database by the National Center for Biotechnology Information ('NCBI'). It allows searching for gene homologs across species. Data in this package can be found at ftp://ftp.ncbi.nih.gov/pub/HomoloGene/build68/. The package also includes an updated version of the homologene database where gene identifiers and symbols are replaced with their latest (at the time of submission) version and functions to fetch latest annotation data to keep updated. |
License: | MIT + file LICENSE |
LazyData: | true |
RoxygenNote: | 6.1.1 |
NeedsCompilation: | no |
Packaged: | 2019-03-28 20:51:16 UTC; omancarci |
Author: | Ogan Mancarci [aut, cre], Leon French [ctb] |
Maintainer: | Ogan Mancarci <ogan.mancarci@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2019-03-28 23:10:03 UTC |
Attempt to automatically translate a gene list
Description
Given a query gene list and a target gene list, the function tries to find the homology pairing
that matches the query list to the target list. The query list is a short list of genes, while the
target list is supposed to represent a large number of genes from the target species. The default
output will be the largest possible list. If returnAllPossible = TRUE then all possible pairings
with any matches are returned. It is possible to limit the search by setting possibleOrigins and
possibleTargets. Note that gene symbols of some species are more similar to each other than others.
Using this with small gene lists and without providing any possibleOrigins or possibleTargets might
return multiple hits, or if returnAllPossible = TRUE a wrong match can be returned.
Usage
autoTranslate(genes, targetGenes, possibleOrigins = NULL,
possibleTargets = NULL, returnAllPossible = FALSE,
db = homologene::homologeneData)
Arguments
genes |
A list of genes to match the target. Symbols or NCBI ids |
targetGenes |
The target list. This list is supposed to represent a large number of genes from the target species. |
possibleOrigins |
Taxonomic identifiers of possible origin species |
possibleTargets |
Taxonomic identifiers of possible target species |
returnAllPossible |
If TRUE, returns all possible pairings with non-zero gene matches. If FALSE (default), returns the best match. |
db |
Homologene database to use. |
Value
A data frame if returnAllPossible = FALSE and a list of data frames if TRUE.
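The pairing search described above can be illustrated with a small base R sketch. This is not the package's implementation: toy_db, its values and the helper count_matches are hypothetical, and the column names HID, Gene.Symbol and Taxonomy are assumed for illustration. It counts, for every origin/target species pair, how many translated query genes appear in the target list, and shows why restricting the possible origins helps when species share gene symbols.

```r
# Toy stand-in for a homologene-style table (illustrative values only).
# Mouse (10090) and rat (10116) share symbols here; human is 9606.
toy_db <- data.frame(
  HID = c(1L, 1L, 1L, 2L, 2L, 2L),
  Gene.Symbol = c("Eno2", "ENO2", "Eno2", "Mog", "MOG", "Mog"),
  Taxonomy = c(10090L, 9606L, 10116L, 10090L, 9606L, 10116L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count target-list matches for each species pairing.
count_matches <- function(genes, targetGenes, db,
                          origins = unique(db$Taxonomy),
                          targets = unique(db$Taxonomy)) {
  pairs <- expand.grid(origin = origins, target = targets)
  pairs <- pairs[pairs$origin != pairs$target, ]
  pairs$matches <- mapply(function(o, t) {
    # homology groups hit by the query genes in the candidate origin species
    hids <- db$HID[db$Taxonomy == o & db$Gene.Symbol %in% genes]
    # symbols of the same groups in the candidate target species
    translated <- db$Gene.Symbol[db$Taxonomy == t & db$HID %in% hids]
    sum(translated %in% targetGenes)
  }, pairs$origin, pairs$target)
  # keep only pairings with at least one match
  pairs[pairs$matches > 0, ]
}

query <- c("Eno2", "Mog")
target <- c("ENO2", "MOG")

# Without restrictions, mouse->human and rat->human are equally good.
all_hits <- count_matches(query, target, toy_db)
print(all_hits)

# Restricting the possible origins removes the ambiguity.
mouse_only <- count_matches(query, target, toy_db, origins = 10090L)
print(mouse_only)
```

In this toy data the unrestricted search keeps two equally good pairings, which is the ambiguity the note about small gene lists warns about.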
Download gene history file
Description
Downloads and reads the gene history file from the NCBI website. This file is needed by other functions.
Usage
getGeneHistory(destfile = NULL, justRead = FALSE)
Arguments
destfile |
Path of the output file. If NULL a temp file will be used |
justRead |
If TRUE and destfile exists, it reads the file instead of downloading the latest one from NCBI |
Value
A data frame with latest gene history information
Download gene symbol information
Description
This function downloads the gene_info file from the NCBI website and returns the gene symbols for current IDs.
Usage
getGeneInfo(destfile = NULL, justRead = FALSE, chunk_size = 1e+06)
Arguments
destfile |
Path of the output file. If NULL a temp file will be used |
justRead |
If TRUE and destfile exists, it reads the file instead of downloading the latest one from NCBI |
chunk_size |
Chunk size to be used with |
Value
A data frame with gene symbols for each current gene id
Get the latest homologene file
Description
This function downloads the latest homologene file from NCBI. Note that Homologene
has not been updated since 2014 so the output will be identical to homologeneData
included in this package. This function is here for futureproofing purposes.
Usage
getHomologene(destfile = NULL, justRead = FALSE)
Arguments
destfile |
Path of the output file. If NULL a temp file will be used |
justRead |
If TRUE and destfile exists, it reads the file instead of downloading the latest one from NCBI |
Value
A data frame with homology groups, gene ids and gene symbols
Get homologues of given genes
Description
Given a list of genes and a taxid, returns a data frame including the genes and their corresponding homologues.
Usage
homologene(genes, inTax, outTax, db = homologene::homologeneData)
Arguments
genes |
A vector of gene symbols or NCBI ids |
inTax |
taxid of the species that the input genes are coming from |
outTax |
taxid of the species in which you are seeking homologues |
db |
Homologene database to use. |
Examples
homologene(c('Eno2','17441'), inTax = 10090, outTax = 9606)
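As a rough illustration of what such a lookup involves, the sketch below joins genes of two species through a shared homology group id using base R only. It is not the package's code: toy_db, the HID values and the helper toy_homologs are made up for illustration, and the column names HID, Gene.Symbol, Gene.ID and Taxonomy are assumptions.

```r
# Toy stand-in for a homologene-style table (HID values are made up).
toy_db <- data.frame(
  HID = c(1L, 1L, 2L, 2L),
  Gene.Symbol = c("Eno2", "ENO2", "Mog", "MOG"),
  Gene.ID = c(13807L, 2026L, 17441L, 4340L),
  Taxonomy = c(10090L, 9606L, 10090L, 9606L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: match genes by symbol or id in the input species,
# then pair them with genes of the output species in the same group.
toy_homologs <- function(genes, inTax, outTax, db) {
  is_query <- db$Gene.Symbol %in% genes | as.character(db$Gene.ID) %in% genes
  in_rows <- db[db$Taxonomy == inTax & is_query, ]
  out_rows <- db[db$Taxonomy == outTax, ]
  # join on the homology group id; shared columns get .in / .out suffixes
  merged <- merge(in_rows, out_rows, by = "HID", suffixes = c(".in", ".out"))
  merged[, c("Gene.Symbol.in", "Gene.Symbol.out", "Gene.ID.in", "Gene.ID.out")]
}

res <- toy_homologs(c("Eno2", "17441"), inTax = 10090, outTax = 9606, db = toy_db)
print(res)
```

The query mixes a symbol and an id, as in the example above, so the helper checks both columns before joining.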
homologeneData
Description
List of gene homologues used by homologene functions
Usage
homologeneData
Format
An object of class data.frame with 275237 rows and 4 columns.
homologeneData2
Description
A modified copy of the homologene database. Homologene was last updated in 2014 and many of its gene IDs and symbols are out of date. Here the IDs and symbols are replaced with their most current version. Last update: Wed Mar 27 16:34:11 2019
Usage
homologeneData2
Format
An object of class data.frame with 269592 rows and 4 columns.
Version of homologene used
Description
Version of homologene used
Usage
homologeneVersion
Format
An object of class integer of length 1.
Human/mouse wrapper for homologene
Description
Human/mouse wrapper for homologene
Usage
human2mouse(genes, db = homologene::homologeneData)
Arguments
genes |
A vector of gene symbols or NCBI ids |
db |
Homologene database to use. |
Examples
human2mouse(c('ENO2','4340'))
Mouse/human wrapper for homologene
Description
Mouse/human wrapper for homologene
Usage
mouse2human(genes, db = homologene::homologeneData)
Arguments
genes |
A vector of gene symbols or NCBI ids |
db |
Homologene database to use. |
Examples
mouse2human(c('Eno2','17441'))
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
Names and ids of included species
Description
Names and ids of included species
Usage
taxData
Format
An object of class data.frame with 21 rows and 2 columns.
Update homologene database
Description
Creates an updated version of the homologene database. This is done by downloading
the latest gene annotation information and tracing changes in gene symbols and
identifiers over history. homologeneData2 was created using this function over the
original homologeneData. This function requires downloading large amounts of data
from the NCBI ftp servers.
Usage
updateHomologene(destfile = NULL,
baseline = homologene::homologeneData2, gene_history = NULL,
gene_info = NULL)
Arguments
destfile |
Optional. Path of the output file. |
baseline |
The baseline homologene file to be used. By default uses the
|
gene_history |
A gene history data frame, possibly returned by |
gene_info |
A gene info data frame that contains ID-symbol matches,
possibly returned by |
Value
Homologene database in a data frame with updated gene IDs and symbols
Update gene IDs
Description
Given a list of gene ids and gene history information, traces changes in the gene's name to get the latest valid ID.
Usage
updateIDs(ids, gene_history)
Arguments
ids |
Gene ids |
gene_history |
Gene history information, probably returned by |
Value
A character vector. New ids for genes that changed ids, or "-" for discontinued genes. Ids that did not change are returned as the input itself.
Examples
## Not run:
gene_history = getGeneHistory()
updateIDs(c("4340964", "4349034", "4332470", "4334151", "4323831"),gene_history)
## End(Not run)
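The tracing idea can be tried offline with a toy history table. The sketch below is an assumption-laden illustration, not the package's implementation: toy_history, its values and the helper toy_update_ids are hypothetical, and the column names GeneID and Discontinued_GeneID are modeled on the NCBI gene history layout.

```r
# Toy gene history: each row says a discontinued id was replaced by GeneID,
# with "-" meaning the gene was discontinued without a replacement.
toy_history <- data.frame(
  GeneID = c("20", "30", "-"),
  Discontinued_GeneID = c("10", "20", "40"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: follow replacements until the id is current.
toy_update_ids <- function(ids, history) {
  vapply(ids, function(id) {
    current <- id
    repeat {
      hit <- match(current, history$Discontinued_GeneID)
      if (is.na(hit)) break          # not discontinued: id is current
      current <- history$GeneID[hit] # move to the replacement
      if (current == "-") break      # discontinued with no replacement
    }
    current
  }, character(1), USE.NAMES = FALSE)
}

# "10" was replaced twice, "40" was discontinued, "99" never changed.
updated <- toy_update_ids(c("10", "40", "99"), toy_history)
print(updated)
```

The loop assumes the history contains no cycles; a real history file may need a guard against that.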