Abstract
Tximeta performs numerous annotation and metadata gathering tasks on behalf of users during the import of transcript counts and abundance from quantification tools such as salmon. Data are imported as SummarizedExperiment objects with associated GenomicRanges metadata. Correct metadata is added automatically via reference sequence digests, facilitating genomic analyses and assisting in computational reproducibility.
The tximeta package (Love et al. 2020) extends the tximport package (Soneson, Love, and Robinson 2015) for import of transcript-level quantification data into R/Bioconductor. It automatically adds annotation metadata when the RNA-seq data has been quantified with salmon (Patro et al. 2017) our related tools. To our knowledge, tximeta is the only package for RNA-seq data import that can automatically identify and attach transcriptome metadata based on the unique sequence collection of the reference transcripts. For more details on these packages – including the motivation for tximeta and description of similar work – consult the References below.
Note: tximeta() requires that the
entire output of the quantificaation tool is present in
the output directories and unmodified in order to identify the
provenance of the reference transcripts. In general, it’s a good idea to
not modify or re-arrange the output directory of bioinformatic software
as other downstream software often rely on and assume a consistent
directory structure. For sharing multiple samples, one can use, for
example, tar -czf to bundle up a set of salmon
output directories. For tips on using tximeta() with other
quantifiers see the other quantifiers
section below.
tximeta() import starts with sample tableThe first step using tximeta() is to read in the sample
table, which will become the column data, colData,
of the final object, a SummarizedExperiment. The sample table
should contain all the information we need to identify the
salmon quantification directories.
Here we will use a salmon quantification file in the
tximportData package to demonstrate the usage of
tximeta. We do not have a sample table, so we construct one
in R. It is recommended to keep a sample table as a CSV or TSV file
while working on an RNA-seq project with multiple samples.
dir <- system.file("extdata/salmon_dm", package="tximportData")
files <- file.path(dir, "SRR1197474", "quant.sf")
file.exists(files)## [1] TRUE
## files
## 1 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library/tximportData/extdata/salmon_dm/SRR1197474/quant.sf
## names condition
## 1 SRR1197474 A
tximeta() expects at least two columns in
coldata:
files - a pointer to the quant.sf
filesnames - the unique names that should be used to
identify samplesOther columns will be propagated to colData of the
SummarizedExperiment output.
tximeta()Normally, we would just run tximeta like so:
However, to avoid downloading remote GTF files during this vignette,
we will point to a GTF file saved locally (in the tximportData
package). We link the transcriptome of the salmon index to its
locally saved GTF. The standard recommended usage of
tximeta(),
in particular for quantifiction with reference transcripts with a pre-computed digest, would be the code
chunk above. If one were adding another set of reference transcripts,
one would normally specify a remote GTF source, not a local one.
This following code is therefore not recommended for a typical
workflow, but is particular to the vignette code.
indexDir <- file.path(dir, "Dm.BDGP6.22.98_salmon-0.14.1")
fastaFTP <- c("ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/cdna/Drosophila_melanogaster.BDGP6.22.cdna.all.fa.gz",
"ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/ncrna/Drosophila_melanogaster.BDGP6.22.ncrna.fa.gz")
gtfPath <- file.path(dir,"Drosophila_melanogaster.BDGP6.22.98.gtf.gz")
suppressPackageStartupMessages(library(tximeta))
makeLinkedTxome(indexDir=indexDir,
source="LocalEnsembl",
organism="Drosophila melanogaster",
release="98",
genome="BDGP6.22",
fasta=fastaFTP,
gtf=gtfPath,
write=FALSE)## saving linkedTxome in bfc (first time)
## importing salmon quantification files
## reading in files with read_tsv
## 1
## found matching linked transcriptome:
## [ LocalEnsembl - Drosophila melanogaster - release 98 ]
## building TxDb with 'txdbmaker' package
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ...
## Warning in .makeTxDb_normarg_chrominfo(chrominfo): genome version information
## is not available for this TxDb object
## OK
## generating transcript ranges
## Warning in checkAssays2Txps(assays, txps):
##
## Warning: the annotation is missing some transcripts that were quantified.
## 5 out of 33706 txps were missing from GTF/GFF but were in the indexed FASTA
## (e.g. this can occur with transcripts located on haplotype chromosomes).
## In order to build a ranged SummarizedExperiment, these txps were removed.
## To keep these txps, and to skip adding ranges, use skipMeta=TRUE
##
## Example missing txps: [FBtr0307759, FBtr0084079, FBtr0084080, ...]
This warning, “Warning: the annotation is missing some
transcripts that were quantified.”, is common and occurs when
annotation sources provide transcripts in the FASTA file that aren’t
annotated in the GTF file. This is upstream of tximeta(),
but this package will notify you the number of transcripts for which
this is the case.
tximeta() recognized the computed digest of the
transcriptome that the files were quantified against, it accessed the
GTF file of the transcriptome source, found and attached the transcript
ranges, and added the appropriate transcriptome and genome metadata. A
digest is a small string of alphanumeric characters that
uniquely identifies the collection of sequences that were used for
quantification (it is the hash value from applying the hash function to
a particular collection of nucleotide sequences). We sometimes also call
this value a “checksum” (in the tximeta paper), and we sometimes call
the table of pre-computed digests or linked digests a “hash table”.
Note that a remote GTF is only downloaded once, and a local or remote
GTF is only parsed to build a TxDb or EnsDb once: if
tximeta() recognizes that it has seen this salmon
index before, it will use a cached version of the metadata and
transcript ranges.
tximeta() makes use of Bioconductor packages for storing
transcript databases as TxDb or EnsDb objects, which
both are connected by default to sqlite backends. For
GENCODE and RefSeq GTF files, tximeta() uses the
txdbmaker package to parse the GTF and build a TxDb.
For Ensembl GTF files, tximeta() will first attempt to
obtain the correct EnsDb object using AnnotationHub.
The ensembldb package (Rainer, Gatto,
and Weichenberger 2019) contains classes and methods for
extracting relevant data from Ensembl files. If the EnsDb has
already been made available on AnnotationHub, tximeta()
will download the database directly, which saves the user time parsing
the GTF into a database (to avoid this, set useHub=FALSE).
If the relevant EnsDb is not available on AnnotationHub,
tximeta() will build an EnsDb using
ensembldb after downloading the GTF file. Again, the
download/construction of a transcript database occurs only once, and
upon subsequent usage of tximeta functions, the cached version
will be used.
The following digests for human, mouse, and drosophila reference
transcripts are supported in this version of tximeta():
| source | organism | releases |
|---|---|---|
| GENCODE | Homo sapiens | 23-49 |
| GENCODE | Mus musculus | M6-M38 |
| Ensembl | Homo sapiens | 76-115 |
| Ensembl | Mus musculus | 76-115 |
| Ensembl | Drosophila melanogaster | 79-115 |
| RefSeq | Homo sapiens | p1-p13 |
| RefSeq | Mus musculus | p2-p6 |
For Ensembl transcriptomes, we support the combined protein coding (cDNA) and non-coding (ncRNA) sequences, as well as the protein coding alone (although the former approach combining coding and non-coding transcripts is recommended for more accurate quantification).
tximeta() also supports linked transcriptomes,
a mechanism to link quantification data to key metadata. This can be
used to resolve many potential situations where one of the above
pre-computed digests won’t be a match (due to modifying, re-ordering,
adding, or removing transcripts, which changes the value of the compute
digest). See the Linked
transcriptomes section below for a demonstration. (The
makeLinkedTxome function was used above to avoid downloading
the GTF during the vignette building process.)
For oarfish quantification (Zare
Jousheghani, Singh, and Patro 2025) using the
--annotated and --novel reference transcripts,
use the tximix() function described below.
We have our coldata from before. Note that we’ve removed
files.
## DataFrame with 1 row and 2 columns
## names condition
## <character> <character>
## SRR1197474 SRR1197474 A
Here we show the three matrices that were imported.
## [1] "counts" "abundance" "length"
If there were inferential replicates (Gibbs samples or bootstrap
samples), these would be imported as additional assays named
"infRep1", "infRep2", …
tximeta() has imported the correct ranges for the
transcripts:
## GRanges object with 33701 ranges and 3 metadata columns:
## seqnames ranges strand | tx_id gene_id
## <Rle> <IRanges> <Rle> | <integer> <CharacterList>
## FBtr0070129 X 656673-657899 + | 28756 FBgn0025637
## FBtr0070126 X 656356-657899 + | 28752 FBgn0025637
## FBtr0070128 X 656673-657899 + | 28755 FBgn0025637
## FBtr0070124 X 656114-657899 + | 28750 FBgn0025637
## FBtr0070127 X 656356-657899 + | 28753 FBgn0025637
## ... ... ... ... . ... ...
## FBtr0114299 2R 21325218-21325323 + | 9300 FBgn0086023
## FBtr0113582 3R 5598638-5598777 - | 24474 FBgn0082989
## FBtr0091635 3L 1488906-1489045 + | 13780 FBgn0086670
## FBtr0113599 3L 261803-261953 - | 16973 FBgn0083014
## FBtr0113600 3L 831870-832008 - | 17070 FBgn0083057
## tx_name
## <character>
## FBtr0070129 FBtr0070129
## FBtr0070126 FBtr0070126
## FBtr0070128 FBtr0070128
## FBtr0070124 FBtr0070124
## FBtr0070127 FBtr0070127
## ... ...
## FBtr0114299 FBtr0114299
## FBtr0113582 FBtr0113582
## FBtr0091635 FBtr0091635
## FBtr0113599 FBtr0113599
## FBtr0113600 FBtr0113600
## -------
## seqinfo: 25 sequences from an unspecified genome; no seqlengths
We have appropriate genome information, which prevents us from making bioinformatic mistakes:
## Seqinfo object with 25 sequences from an unspecified genome; no seqlengths:
## seqnames seqlengths isCircular genome
## 2L <NA> <NA> <NA>
## 2R <NA> <NA> <NA>
## 3L <NA> <NA> <NA>
## 3R <NA> <NA> <NA>
## 4 <NA> <NA> <NA>
## ... ... ... ...
## 211000022280481 <NA> <NA> <NA>
## 211000022280494 <NA> <NA> <NA>
## 211000022280703 <NA> <NA> <NA>
## mitochondrion_genome <NA> <NA> <NA>
## rDNA <NA> <NA> <NA>
The se object has associated metadata that allows
tximeta to link to locally stored cached databases and
other Bioconductor objects. In further sections, we will show examples
functions that leverage this databases for adding exon information,
summarize transcript-level data to the gene level, or add identifiers.
However, first we mention that the user can easily access the cached
database with the following helper function. In this case,
tximeta has an associated EnsDb object that we can
retrieve and use in our R session:
## loading existing TxDb created: 2025-09-29 21:59:02
## [1] "TxDb"
## attr(,"package")
## [1] "GenomicFeatures"
The database returned by retrieveDb is either a
TxDb in the case of GENCODE or RefSeq GTF annotation file, or
an EnsDb in the case of an Ensembl GTF annotation file. For
further use of these two database objects, consult the
GenomicFeatures vignettes and the ensembldb vignettes,
respectively (both Bioconductor packages).
Because the SummarizedExperiment maintains all the metadata of its creation, it also keeps a pointer to the necessary database for pulling out additional information, as demonstrated in the following sections.
If necessary, the tximeta package can pull down the remote
source to build a TxDb, but given that we’ve already built a TxDb once,
it simply loads the cached version. In order to remove the cached TxDb
and regenerate, one can remove the relevant entry from the
tximeta file cache that resides at the location given by
getTximetaBFC().
The se object created by tximeta, has the
start, end, and strand information for each transcript. Here, we swap
out the transcript GRanges for exons-by-transcript
GRangesList (it is a list of GRanges, where each
element of the list gives the exons for a particular transcript).
## loading existing TxDb created: 2025-09-29 21:59:02
## generating exon ranges
## GRanges object with 2 ranges and 3 metadata columns:
## seqnames ranges strand | exon_id exon_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] X 656673-656740 + | 72949 FBtr0070129-E1 1
## [2] X 657099-657899 + | 72952 FBtr0070129-E2 2
## -------
## seqinfo: 25 sequences from an unspecified genome; no seqlengths
As with the transcript ranges, the exon ranges will be generated once
and cached locally. As it takes a non-negligible amount of time to
generate the exon-by-transcript GRangesList, this local caching
offers substantial time savings for repeated usage of
addExons with the same transcriptome.
We have implemented addExons to work only on the
transcript-level SummarizedExperiment object. We provide some
motivation for this choice in ?addExons. Briefly, if it is
desired to know the exons associated with a particular gene, we feel
that it makes more sense to pull out the relevant set of
exons-by-transcript for the transcripts for this gene, rather than
losing the hierarchical structure (exons to transcripts to genes) that
would occur with a GRangesList of exons grouped per gene.
Likewise, the tximeta package can make use of the cached
TxDb database for the purpose of summarizing transcript-level
quantifications and bias corrections to the gene-level. After
summarization, the rowRanges reflect the start and end
position of the gene, which in Bioconductor are defined by the leftmost
and rightmost genomic coordinates of all the transcripts. As with the
transcript and exons, the gene ranges are cached locally for repeated
usage. The transcript IDs are stored as a CharacterList column
tx_ids.
Note: you can also use summarizeToGene
on an object created with skipMeta=TRUE which therefore
does not have ranges or an associated TxDb by setting
skipRanges=TRUE and providing your own tx2gene
table which is passed to tximport.
## loading existing TxDb created: 2025-09-29 21:59:02
## obtaining transcript-to-gene mapping from database
## generating gene ranges
## assignRanges='range': gene ranges assigned by total range of isoforms
## see details at: ?summarizeToGene,SummarizedExperiment-method
## summarizing abundance
## summarizing counts
## summarizing length
## GRanges object with 17208 ranges and 2 metadata columns:
## seqnames ranges strand | gene_id
## <Rle> <IRanges> <Rle> | <character>
## FBgn0000003 3R 6822498-6822796 + | FBgn0000003
## FBgn0000008 2R 22136968-22172834 + | FBgn0000008
## FBgn0000014 3R 16807214-16830049 - | FBgn0000014
## FBgn0000015 3R 16927212-16972236 - | FBgn0000015
## FBgn0000017 3L 16615866-16647882 - | FBgn0000017
## ... ... ... ... . ...
## FBgn0286199 3R 24279572-24281576 + | FBgn0286199
## FBgn0286203 2R 5413744-5456095 + | FBgn0286203
## FBgn0286204 3R 8950246-8963037 - | FBgn0286204
## FBgn0286213 3L 13023352-13024762 + | FBgn0286213
## FBgn0286222 X 6678424-6681845 + | FBgn0286222
## tx_ids
## <CharacterList>
## FBgn0000003 FBtr0081624
## FBgn0000008 FBtr0071763,FBtr0100521,FBtr0342981,...
## FBgn0000014 FBtr0306337,FBtr0083388,FBtr0083387,...
## FBgn0000015 FBtr0415463,FBtr0415464,FBtr0083385,...
## FBgn0000017 FBtr0112790,FBtr0345369,FBtr0075357,...
## ... ...
## FBgn0286199 FBtr0084600
## FBgn0286203 FBtr0299918,FBtr0299920,FBtr0299921,...
## FBgn0286204 FBtr0082014,FBtr0334329
## FBgn0286213 FBtr0075878
## FBgn0286222 FBtr0070953,FBtr0070954
## -------
## seqinfo: 25 sequences from an unspecified genome; no seqlengths
We also offer a new type of range assignment, based on the most
abundant isoform rather than the leftmost to rightmost coordinate. See
the assignRanges argument of ?summarizeToGene.
Using the most abundant isoform arguably will reflect more accurate
genomic distances than the default option.
For more explanation about why this may be a better choice, see the following tutorial chapter:
https://tidyomics.github.io/tidy-ranges-tutorial/gene-ranges-in-tximeta.html
In the below diagram, the pink feature is the set of all exons belonging to any isoform of the gene, such that the TSS is on the right side of this minus strand feature. However, the blue feature is the most abundant isoform (the brown features are the next most abundant isoforms). The pink feature is therefore not a good representation for the locus.
We would like to add support to easily map transcript or gene identifiers from one annotation to another. This is just a prototype function, but we show how we can easily add alternate IDs given that we know the organism and the source of the transcriptome. (This function currently only works for GENCODE and Ensembl gene or transcript IDs but could be extended to work for arbitrary sources.)
##
## mapping to new IDs using org.Dm.eg.db
## if all matching IDs are desired, and '1:many mappings' are reported,
## set multiVals='list' to obtain all the matching IDs
## 'select()' returned 1:many mapping between keys and columns
## DataFrame with 17208 rows and 3 columns
## gene_id tx_ids REFSEQ
## <character> <CharacterList> <character>
## FBgn0000003 FBgn0000003 FBtr0081624 NR_001992
## FBgn0000008 FBgn0000008 FBtr0071763,FBtr0100521,FBtr0342981,... NM_001014543
## FBgn0000014 FBgn0000014 FBtr0306337,FBtr0083388,FBtr0083387,... NM_001170161
## FBgn0000015 FBgn0000015 FBtr0415463,FBtr0415464,FBtr0083385,... NM_001275719
## FBgn0000017 FBgn0000017 FBtr0112790,FBtr0345369,FBtr0075357,... NM_001104153
## ... ... ... ...
## FBgn0286199 FBgn0286199 FBtr0084600 NM_142982
## FBgn0286203 FBgn0286203 FBtr0299918,FBtr0299920,FBtr0299921,... NM_001144134
## FBgn0286204 FBgn0286204 FBtr0082014,FBtr0334329 NM_001275464
## FBgn0286213 FBgn0286213 FBtr0075878 NM_168534
## FBgn0286222 FBgn0286222 FBtr0070953,FBtr0070954 NM_132111
The following code chunk demonstrates how to build a DESeqDataSet and begin a differential expression analysis.
suppressPackageStartupMessages(library(DESeq2))
# here there is a single sample so we use ~1.
# expect a warning that there is only a single sample...
suppressWarnings({dds <- DESeqDataSet(gse, ~1)})## using counts and average transcript lengths from tximeta
The Swish method in the fishpond package directly
works with the SummarizedExperiment output from
tximeta, and can perform differential analysis on transcript
expression taking into account inferential replicates, e.g. bootstrap or
Gibbs samples, which are imported and arranged by tximeta
if these were generated during quantification.
# un-evaluated code
library(fishpond)
y <- se
y <- scaleInfReps(y)
y <- labelKeep(y)
y <- swish(y, x="condition")
# ... see Swish vignette in fishpond packageWe have a convenient wrapper function that will build a DGEList object for use with edgeR.
suppressPackageStartupMessages(library(edgeR))
y <- makeDGEList(gse)
# ... see edgeR User's Guide for further stepsThe following code chunk demonstrates the code inside of the above wrapper function, and produces the same output.
cts <- assays(gse)[["counts"]]
normMat <- assays(gse)[["length"]]
normMat <- normMat / exp(rowMeans(log(normMat)))
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
y <- scaleOffset(y, t(t(log(normMat)) + o))
# ... see edgeR User's Guide for further stepsFor limma with voom transformation we recommend, as in the tximport vignette to generate counts-from-abundance instead of providing an offset for average transcript length.
## loading existing TxDb created: 2025-09-29 21:59:02
## obtaining transcript-to-gene mapping from database
## loading existing gene ranges created: 2025-09-29 21:59:05
## assignRanges='range': gene ranges assigned by total range of isoforms
## see details at: ?summarizeToGene,SummarizedExperiment-method
## summarizing abundance
## summarizing counts
## summarizing length
Above we generated counts-from-abundance when calling
summarizeToGene. The counts-from-abundance status is then
stored in the metadata:
## [1] "lengthScaledTPM"
The following information is attached to the
SummarizedExperiment by tximeta:
## [1] "tximetaInfo" "quantInfo" "countsFromAbundance"
## [4] "level" "txomeInfo" "txdbInfo"
## List of 31
## $ salmon_version : chr "0.14.1"
## $ samp_type : chr "none"
## $ opt_type : chr "vb"
## $ quant_errors :List of 1
## ..$ : list()
## $ num_libraries : int 1
## $ library_types : chr "ISR"
## $ frag_dist_length : int 1001
## $ seq_bias_correct : logi FALSE
## $ gc_bias_correct : logi TRUE
## $ num_bias_bins : int 4096
## $ mapping_type : chr "mapping"
## $ num_valid_targets : int 33706
## $ num_decoy_targets : int 0
## $ num_eq_classes : int 70718
## $ serialized_eq_classes : logi FALSE
## $ length_classes : int [1:5, 1] 867 1533 2379 3854 71382
## $ index_seq_hash : chr "7ba5e9597796ea86cf11ccf6635ca88fbc37c2848d38083c23986aa2c6a21eae"
## $ index_name_hash : chr "b6426061057bba9b7afb4dc76fa68238414cf35b4190c95ca6fc44280d4ca87c"
## $ index_seq_hash512 : chr "05f111abcda1efd2e489ace6324128cdaaa311712a28ed716d957fdfd8706ec41ca9177ebf12f54e99c2a89582d06f31c5e09dc1dce2d13"| __truncated__
## $ index_name_hash512 : chr "ccdf58f23e48c8c53cd122b5f5990b5adce9fec87ddf8bd88153afbe93296d87b818fba89d12dbc20c882f7d98353840394c5040fea7432"| __truncated__
## $ num_bootstraps : int 0
## $ num_processed : int 42422337
## $ num_mapped : int 34098209
## $ num_decoy_fragments : int 0
## $ num_dovetail_fragments : int 2048810
## $ num_fragments_filtered_vm : int 989383
## $ num_alignments_below_threshold_for_mapped_fragments_vm: int 267540
## $ percent_mapped : num 80.4
## $ call : chr "quant"
## $ start_time : chr "Sat Oct 12 13:55:01 2019"
## $ end_time : chr "Sat Oct 12 14:08:11 2019"
## List of 9
## $ index : chr "Dm.BDGP6.22.98_salmon-0.14.1"
## $ source : chr "LocalEnsembl"
## $ organism : chr "Drosophila melanogaster"
## $ release : chr "98"
## $ genome : chr "BDGP6.22"
## $ fasta :List of 1
## ..$ : chr [1:2] "ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/cdna/Drosophila_melanogaster.BDGP6.22.cdna.all.fa.gz" "ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/ncrna/Drosophila_melanogaster.BDGP6.22.ncrna.fa.gz"
## $ gtf : chr "/Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library/tximportData/extdata/salmon_dm/Drosophila_"| __truncated__
## $ sha256 : chr "7ba5e9597796ea86cf11ccf6635ca88fbc37c2848d38083c23986aa2c6a21eae"
## $ linkedTxome: logi TRUE
## List of 3
## $ version :Classes 'package_version', 'numeric_version' hidden list of 1
## ..$ : int [1:3] 1 27 5
## $ type : chr "salmon"
## $ importTime: POSIXct[1:1], format: "2025-09-29 17:58:57"
## Named chr [1:12] "TxDb" "GenomicFeatures" ...
## - attr(*, "names")= chr [1:12] "Db type" "Supporting package" "Data source" "Organism" ...
dir <- system.file("extdata/oarfish", package="tximportData")
names <- paste0("rep", 2:4)
files <- file.path(dir, paste0("sgnex_h9_", names, ".quant.gz"))
coldata <- data.frame(files, names)
se <- tximix(coldata, type="oarfish")## reading in files with read_tsv
## 1 2 3
## returning un-ranged SummarizedExperiment, other tximix functions:
## -- tximixInspectDigests() to check matching digests
## -- makeLinkedTxome/makeLinkedTxpData() to link digests to metadata
## -- tximixUpdate() to update metadata and optionally add ranges
## # A tibble: 2 × 6
## index source organism release linkedTxome small_digest
## <chr> <chr> <chr> <chr> <lgl> <chr>
## 1 annotated GENCODE Homo sapiens 48 FALSE 6fc626
## 2 novel <NA> <NA> <NA> NA 43158f
# this chunk not evaluated as it involves
# downloading 56 Mb file from ftp.ebi.ac.uk
se_update <- tximixUpdate(se)
mcols(se_update)# define novel set so we can add metadata
novel <- data.frame(
seqnames = paste0("chr", rep(1:22, each = 500)),
start = 1e6 + 1 + 0:499 * 1000,
end = 1e6 + 1 + 0:499 * 1000 + 1000 - 1,
strand = "+",
tx_name = paste0("novel", 1:(22 * 500)),
gene_id = paste0("novel_gene", rep(1:(22 * 10), each = 50)),
type = "protein_coding"
)
head(novel)## seqnames start end strand tx_name gene_id type
## 1 chr1 1000001 1001000 + novel1 novel_gene1 protein_coding
## 2 chr1 1001001 1002000 + novel2 novel_gene1 protein_coding
## 3 chr1 1002001 1003000 + novel3 novel_gene1 protein_coding
## 4 chr1 1003001 1004000 + novel4 novel_gene1 protein_coding
## 5 chr1 1004001 1005000 + novel5 novel_gene1 protein_coding
## 6 chr1 1005001 1006000 + novel6 novel_gene1 protein_coding
## building TxDb with 'txdbmaker' package
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ...
## Warning in .get_cds_IDX(mcols0$type, mcols0$phase): The "phase" metadata column contains non-NA values for features of type
## stop_codon. This information was ignored.
## Warning in .makeTxDb_normarg_chrominfo(chrominfo): genome version information
## is not available for this TxDb object
## OK
## generating transcript ranges
## --annotated index: adding metadata for 385669 transcripts
## --novel index: no transcript metadata found
## consider to add a 'linkedTxome', or 'linkedTxpData'
## txpData: adding transcript metadata for 11000 transcripts
## building RangedSE: subsetting to 396669 out of 423044 rows with range data
tximeta makes use of BiocFileCache to store
transcript and other databases, so saving the relevant databases in a
centralized location used by other Bioconductor packages as well. It is
possible that an error can occur in connecting to these databases,
either if the files were accidentally removed from the file system, or
if there was an error generating or writing the database to the cache
location. In each of these cases, it is easy to remove the entry in the
BiocFileCache so that tximeta will know to
regenerate the transcript database or any other missing database.
If you have used the default cache location, then you can obtain access to your BiocFileCache with:
## Loading required package: dbplyr
Otherwise, you can recall your particular tximeta cache
location with getTximetaBFC().
You can then inspect the entries in your BiocFileCache using
bfcinfo and remove the entry associated with the missing
database with bfcremove. See the BiocFileCache vignette for
more details on finding and removing entries from a BiocFileCache.
Note that there may be many entries in the BiocFileCache location,
including .sqlite database files and serialized
.rds files. You should only remove the entry associated
with the missing database, e.g. if R gave an error when trying to
connect to the TxDb associated with GENCODE v99 human transcripts, you
should look for the rid of the entry associated with the
human v99 GTF from GENCODE.
tximeta automatically imports relevant metadata when the
transcriptome matches a known source – known in the sense that
it is in the set of pre-computed digests in tximeta
(GENCODE, Ensembl, and RefSeq for human and mouse). tximeta
also facilitates the linking of transcriptomes used in building the
salmon index with relevant public sources, in the case that
these are not part of this pre-computed set known to
tximeta. The linking of the transcriptome source with the
quantification files is important in the case that the transcript
sequence no longer matches a known source (uniquely combined or filtered
FASTA files), or if the source is not known to tximeta.
Combinations of coding and non-coding human, mouse, and fruit fly
Ensembl transcripts should be automatically recognized by
tximeta and does not require making a linkedTxome.
As the package is further developed, we plan to roll out support for all
common transcriptomes, from all sources.
Note: if you are using salmon in alignment
mode, then there is no salmon index, and without the salmon index, there
is no digest. We don’t have a perfect solution for this yet, but you can
still summarize transcript counts to gene with a tx2gene
table that you construct manually (see tximport vignette
for example code). Just specify the arguments,
skipMeta=TRUE, txOut=FALSE, tx2gene=tx2gene, when calling
tximeta and it will perform summarization to gene level as
in tximport.
We now demonstrate how to make a linkedTxome and how to share and load a linkedTxome. We point to a salmon quantification file which was quantified against a transcriptome that included the coding and non-coding Drosophila melanogaster transcripts, as well as an artificial transcript of 960 bp (for demonstration purposes only).
dir <- system.file("extdata/salmon_dm", package="tximportData")
file <- file.path(dir, "SRR1197474.plus", "quant.sf")
file.exists(file)## [1] TRUE
Trying to import the files gives a message that tximeta
couldn’t find a matching transcriptome, so it returns an non-ranged
SummarizedExperiment.
## importing salmon quantification files
## reading in files with read_tsv
## 1
## couldn't find matching transcriptome, returning non-ranged SummarizedExperiment
If the transcriptome used to generate the salmon index does not match any transcriptomes from known sources (e.g. from combining or filtering known transcriptome files), there is not much that can be done to automatically populate the metadata during quantification import. However, we can facilitate the following two cases:
tximeta offers functionality to assist reproducible
analysis in both of these cases.
To make this quantification reproducible, we make a
linkedTxome which records key information about the sources
of the transcript FASTA files, and the location of the relevant GTF
file. It also records the digest of the transcriptome that was computed
by salmon during the index step.
Source: when creating the linkedTxome
one must specify the source of the transcriptome. See
?linkedTxome for a note on the implications of this text
string. For canonical GENCODE or Ensembl transcriptomes, one can use
"GENCODE" or "Ensembl", but for modified or
otherwise any transcriptomes defined by a local database, it is
recommended to use a different string, "LocalGENCODE" or
`“LocalEnsembl”, which will avoid tximeta pulling canonical
GENCODE or Ensembl resources from AnnotationHub.
Multiple GTF/GFF files: linkedTxome and
tximeta do not currently support multiple GTF/GFF files,
which is a more complicated case than multiple FASTA, which is
supported. Currently, we recommend that users should add or combine
GTF/GFF files themselves to create a single GTF/GFF file that contains
all features used in quantification, and then upload such a file to
Zenodo, which can then be linked as shown below. Feel free to
contact the developers on the Bioconductor support site or GitHub Issue
page for further details or feature requests.
Stringtie: A special note for building on top of
Stringtie-generated transcripts: it is a good idea to change gene
identifiers, to not include a period ., as the
period will later be used to separate transcript versions from gene
identifiers. This can be done before building the salmon index, by
changing periods in the gene identifier to an underscore. See this GitHub
issue for details.
By default, linkedTxome will write out a JSON file which
can be shared with others, linking the digest of the index with the
other metadata, including FASTA and GTF sources. By default, it will
write out to a file with the same name as the indexDir, but
with a .json extension added. This can be prevented with
write=FALSE, and the file location can be changed with
jsonFile.
First we specify the path where the salmon index is located.
Typically you would not use system.file and
file.path to locate this directory, but simply define
indexDir to be the path of the salmon directory on
your machine. Here we use system.file and
file.path because we have included parts of a
salmon index directory in the tximeta package itself
for demonstration of functionality in this vignette.
Now we provide the location of the FASTA files and the GTF file for this transcriptome.
Note: the basename for the GTF file is used as a
unique identifier for the cached versions of the TxDb and the
transcript ranges, which are stored on the user’s behalf via
BiocFileCache. This is not an issue, as GENCODE, Ensembl, and
RefSeq all provide GTF files which are uniquely identified by their
filename,
e.g. Drosophila_melanogaster.BDGP6.22.98.gtf.gz.
The recommended usage of tximeta would be to specify a
remote GTF source, as seen in the commented-out line below:
fastaFTP <- c("ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/cdna/Drosophila_melanogaster.BDGP6.22.cdna.all.fa.gz",
"ftp://ftp.ensembl.org/pub/release-98/fasta/drosophila_melanogaster/ncrna/Drosophila_melanogaster.BDGP6.22.ncrna.fa.gz",
"extra_transcript.fa.gz")
#gtfFTP <- "ftp://path/to/custom/Drosophila_melanogaster.BDGP6.22.98.plus.gtf.gz"Instead of the above commented-out FTP location for the GTF file, we
specify a location within an R package. This step is just to avoid
downloading from a remote FTP during vignette building. This use of
file.path to point to a file in an R package is specific to
this vignette and should not be used in a typical workflow. The
following GTF file is a modified version of the release 98 from Ensembl,
which includes description of a one transcript, one exon artificial gene
which was inserted into the transcriptome (for demonstration purposes
only).
Finally, we create a linkedTxome. In this vignette, we point
to a temporary directory for the JSON file, but a more typical workflow
would write the JSON file to the same location as the salmon
index by not specifying jsonFile.
makeLinkedTxome performs two operation: (1) it creates a
new entry in an internal table that links the transcriptome used in the
salmon index to its sources, and (2) it creates a JSON file
such that this linkedTxome can be shared.
tmp <- tempdir() # just for vignette demo, make temp directory
jsonFile <- file.path(tmp, paste0(basename(indexDir), ".json"))
makeLinkedTxome(indexDir=indexDir,
source="LocalEnsembl", organism="Drosophila melanogaster",
release="98", genome="BDGP6.22",
fasta=fastaFTP, gtf=gtfPath,
jsonFile=jsonFile)## writing linkedTxome to /var/folders/r0/l4fjk6cj5xj0j3brt4bplpl40000gt/T//RtmpTbvn63/Dm.BDGP6.22.98.plus_salmon-0.14.1.json
## saving linkedTxome in bfc
After running makeLinkedTxome, the connection between
this salmon index (and its digest) with the sources is saved
for persistent usage. Note that because we added a single transcript of
960bp to the FASTA file used for quantification, tximeta
could tell that this was not quantified against release 98 of the
Ensembl transcripts for Drosophila melanogaster. Only when the
correct set of transcripts were specified does tximeta
recognize and import the correct metadata.
With use of tximeta and a linkedTxome, the
software figures out if the remote GTF has been accessed and compiled
into a TxDb before, and on future calls, it will simply load
the pre-computed metadata and transcript ranges.
Note the warning that 5 of the transcripts are missing from the GTF
file and so are dropped from the final output. This is a problem coming
from the annotation source, and not easily avoided by
tximeta.
## importing salmon quantification files
## reading in files with read_tsv
## 1
## found matching linked transcriptome:
## [ LocalEnsembl - Drosophila melanogaster - release 98 ]
## building TxDb with 'txdbmaker' package
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ...
## Warning in .makeTxDb_normarg_chrominfo(chrominfo): genome version information
## is not available for this TxDb object
## OK
## generating transcript ranges
## Warning in checkAssays2Txps(assays, txps):
##
## Warning: the annotation is missing some transcripts that were quantified.
## 5 out of 33707 txps were missing from GTF/GFF but were in the indexed FASTA
## (e.g. this can occur with transcripts located on haplotype chromosomes).
## In order to build a ranged SummarizedExperiment, these txps were removed.
## To keep these txps, and to skip adding ranges, use skipMeta=TRUE
##
## Example missing txps: [FBtr0307759, FBtr0084079, FBtr0084080, ...]
We can see that the appropriate metadata and transcript ranges are attached.
## GRanges object with 33702 ranges and 3 metadata columns:
## seqnames ranges strand | tx_id gene_id
## <Rle> <IRanges> <Rle> | <integer> <CharacterList>
## Newgene 3R 1-960 + | 20219 Newgene
## FBtr0070129 X 656673-657899 + | 28757 FBgn0025637
## FBtr0070126 X 656356-657899 + | 28753 FBgn0025637
## FBtr0070128 X 656673-657899 + | 28756 FBgn0025637
## FBtr0070124 X 656114-657899 + | 28751 FBgn0025637
## ... ... ... ... . ... ...
## FBtr0114299 2R 21325218-21325323 + | 9300 FBgn0086023
## FBtr0113582 3R 5598638-5598777 - | 24475 FBgn0082989
## FBtr0091635 3L 1488906-1489045 + | 13780 FBgn0086670
## FBtr0113599 3L 261803-261953 - | 16973 FBgn0083014
## FBtr0113600 3L 831870-832008 - | 17070 FBgn0083057
## tx_name
## <character>
## Newgene Newgene
## FBtr0070129 FBtr0070129
## FBtr0070126 FBtr0070126
## FBtr0070128 FBtr0070128
## FBtr0070124 FBtr0070124
## ... ...
## FBtr0114299 FBtr0114299
## FBtr0113582 FBtr0113582
## FBtr0091635 FBtr0091635
## FBtr0113599 FBtr0113599
## FBtr0113600 FBtr0113600
## -------
## seqinfo: 25 sequences from an unspecified genome; no seqlengths
## Seqinfo object with 25 sequences from an unspecified genome; no seqlengths:
## seqnames seqlengths isCircular genome
## 2L <NA> <NA> <NA>
## 2R <NA> <NA> <NA>
## 3L <NA> <NA> <NA>
## 3R <NA> <NA> <NA>
## 4 <NA> <NA> <NA>
## ... ... ... ...
## 211000022280481 <NA> <NA> <NA>
## 211000022280494 <NA> <NA> <NA>
## 211000022280703 <NA> <NA> <NA>
## mitochondrion_genome <NA> <NA> <NA>
## rDNA <NA> <NA> <NA>
The following code removes the entire table with information about the linkedTxomes. This is just for demonstration, so that we can show how to load a JSON file below.
Note: Running this code will clear any information about linkedTxomes. Don’t run this unless you really want to clear this table!
library(BiocFileCache)
if (interactive()) {
bfcloc <- getTximetaBFC()
} else {
bfcloc <- tempdir()
}
bfc <- BiocFileCache(bfcloc)
bfcinfo(bfc)## # A tibble: 9 × 10
## rid rname create_time access_time rpath rtype fpath last_modified_time etag
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 BFC1 link… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 2 BFC2 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 3 BFC3 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 4 BFC4 exon… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 5 BFC5 gene… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 6 BFC6 genc… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 7 BFC7 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 8 BFC8 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 9 BFC9 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## # ℹ 1 more variable: expires <dbl>
# only run the next line if you want to remove your linkedTxome table!
bfcremove(bfc, bfcquery(bfc, "linkedTxomeTbl")$rid)
bfcinfo(bfc)## # A tibble: 8 × 10
## rid rname create_time access_time rpath rtype fpath last_modified_time etag
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 BFC2 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 2 BFC3 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 3 BFC4 exon… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 4 BFC5 gene… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 5 BFC6 genc… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 6 BFC7 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 7 BFC8 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 8 BFC9 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## # ℹ 1 more variable: expires <dbl>
If a collaborator or the Suppmentary Files for a publication shares a
linkedTxome JSON file, we can likewise use
tximeta to automatically assemble the relevant metadata and
transcript ranges. This implies that the other person has used
tximeta with the function makeLinkedTxome
demonstrated above, pointing to their salmon index and to the
FASTA and GTF source(s).
We point to the JSON file and use loadLinkedTxome and
then the relevant metadata is saved for persistent usage. In this case,
we saved the JSON file in a temporary directory.
## saving linkedTxome in bfc (first time)
Again, using tximeta figures out whether it needs to
access the remote GTF or not, and assembles the appropriate object on
the user’s behalf.
## importing salmon quantification files
## reading in files with read_tsv
## 1
## found matching linked transcriptome:
## [ LocalEnsembl - Drosophila melanogaster - release 98 ]
## loading existing TxDb created: 2025-09-29 22:00:17
## loading existing transcript ranges created: 2025-09-29 22:00:17
## Warning in checkAssays2Txps(assays, txps):
##
## Warning: the annotation is missing some transcripts that were quantified.
## 5 out of 33707 txps were missing from GTF/GFF but were in the indexed FASTA
## (e.g. this can occur with transcripts located on haplotype chromosomes).
## In order to build a ranged SummarizedExperiment, these txps were removed.
## To keep these txps, and to skip adding ranges, use skipMeta=TRUE
##
## Example missing txps: [FBtr0307759, FBtr0084079, FBtr0084080, ...]
Finally, we clear the linkedTxomes table again so that the above examples will work. This is just for the vignette code and not part of a typical workflow.
Note: Running this code will clear any information about linkedTxomes. Don’t run this unless you really want to clear this table!
if (interactive()) {
bfcloc <- getTximetaBFC()
} else {
bfcloc <- tempdir()
}
bfc <- BiocFileCache(bfcloc)
bfcinfo(bfc)## # A tibble: 9 × 10
## rid rname create_time access_time rpath rtype fpath last_modified_time etag
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 BFC2 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 2 BFC3 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 3 BFC4 exon… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 4 BFC5 gene… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 5 BFC6 genc… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 6 BFC7 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 7 BFC8 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 8 BFC9 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 9 BFC10 link… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## # ℹ 1 more variable: expires <dbl>
# only run the next line if you want to remove your linkedTxome table!
bfcremove(bfc, bfcquery(bfc, "linkedTxomeTbl")$rid)
bfcinfo(bfc)## # A tibble: 8 × 10
## rid rname create_time access_time rpath rtype fpath last_modified_time etag
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 BFC2 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 2 BFC3 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 3 BFC4 exon… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 4 BFC5 gene… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 5 BFC6 genc… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 6 BFC7 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 7 BFC8 Dros… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## 8 BFC9 txpR… 2025-09-29… 2025-09-29… /var… rela… 7999… NA <NA>
## # ℹ 1 more variable: expires <dbl>
For alevin quantification, one should point to the
quants_mat.gz file that contains the counts for all of the
cells. In order to tximeta() to work with alevin
quantification, it requires that alevin was run using gene IDs
in the tgMap step and not gene symbols.
tximeta can import the output from any quantifiers that
are supported by tximport, and if these are not
salmon, alevin, or Sailfish output, it will
simply return a non-ranged SummarizedExperiment by default.
An alternative solution is to wrap other quantifiers in workflows
that include metadata information JSON files along with each
quantification file. One can place these files in
aux_info/meta_info.json or any relative location specified
by customMetaInfo, for example
customMetaInfo="meta_info.json". This JSON file is located
relative to the quantification file and should contain a tag
index_seq_hash with an associated value of the SHA-256 hash
(digest) of the reference transcripts. For computing the hash value of
the reference transcripts, see the FastaDigest python
package. The hash value used by salmon is the SHA-256 hash
value of the reference sequences stripped of the header lines, and
concatenated together with the empty string (so only cDNA sequences
combined without any new line characters). FastaDigest can be
installed with pip install fasta_digest.
This vignette described the use of tximeta to import
quantification data into R/Bioconductor with automatic detection and
addition of metadata. The SummarizedExperiment produced by
tximeta can then be provided to downstream statistical
analysis packages as described above. The tximeta package does
not contain any functionality for automated differential analysis.
The ARMOR workflow
does automate a variety of differential analyses, and make use of
tximeta for creation of a SummarizedExperiment
with attached annotation metadata. ARMOR stands for ``An Automated
Reproducible MOdular Workflow for Preprocessing and Differential
Analysis of RNA-seq Data’’ and is described in more detail in the
article by Orjuela et al. (2019).
tximeta makes use of the default BiocFileCache location, unless otherwise specified by the user. As of BiocFileCache version > 1.15.1, the default caching location used by BiocFileCache has changed. In order to continue to use the same cache (without re-downloading files), please follow the steps in the BiocFileCache vignette, under the heading Default Caching Location Update.
The development of tximeta has benefited from suggestions from these and other individuals in the community:
## Loading required package: usethis
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.5.1 Patched (2025-09-10 r88807)
## os macOS Ventura 13.7.7
## system aarch64, darwin20
## ui X11
## language (EN)
## collate C
## ctype en_US.UTF-8
## tz America/New_York
## date 2025-09-29
## pandoc 3.1.12.3 @ /opt/homebrew/bin/ (via rmarkdown)
## quarto 1.7.32 @ /usr/local/bin/quarto
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-8 2024-09-12 [2] CRAN (R 4.5.0)
## AnnotationDbi * 1.71.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## AnnotationFilter 1.33.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## AnnotationHub 3.99.6 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## archive 1.1.12 2025-03-20 [2] CRAN (R 4.5.0)
## Biobase * 2.69.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## BiocFileCache * 2.99.6 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## BiocGenerics * 0.55.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## BiocIO 1.19.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## BiocManager 1.30.26 2025-06-05 [2] CRAN (R 4.5.0)
## BiocParallel 1.43.4 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## BiocVersion 3.22.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## biomaRt 2.65.16 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## Biostrings 2.77.2 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## bit 4.6.0 2025-03-06 [2] CRAN (R 4.5.0)
## bit64 4.6.0-1 2025-01-16 [2] CRAN (R 4.5.0)
## bitops 1.0-9 2024-10-03 [2] CRAN (R 4.5.0)
## blob 1.2.4 2023-03-17 [2] CRAN (R 4.5.0)
## bslib 0.9.0 2025-01-30 [2] CRAN (R 4.5.0)
## cachem 1.1.0 2024-05-16 [2] CRAN (R 4.5.0)
## cli 3.6.5 2025-04-23 [2] CRAN (R 4.5.0)
## codetools 0.2-20 2024-03-31 [2] CRAN (R 4.5.1)
## crayon 1.5.3 2024-06-20 [2] CRAN (R 4.5.0)
## curl 7.0.0 2025-08-19 [2] CRAN (R 4.5.0)
## DBI 1.2.3 2024-06-02 [2] CRAN (R 4.5.0)
## dbplyr * 2.5.1 2025-09-10 [2] CRAN (R 4.5.0)
## DelayedArray 0.35.3 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## DESeq2 * 1.49.4 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## devtools * 2.4.5 2022-10-11 [2] CRAN (R 4.5.0)
## dichromat 2.0-0.1 2022-05-02 [2] CRAN (R 4.5.0)
## digest 0.6.37 2024-08-19 [2] CRAN (R 4.5.0)
## dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.5.0)
## edgeR * 4.7.5 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.5.0)
## ensembldb 2.33.2 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## evaluate 1.0.5 2025-08-27 [2] CRAN (R 4.5.0)
## farver 2.1.2 2024-05-13 [2] CRAN (R 4.5.0)
## fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.5.0)
## filelock 1.0.3 2023-12-11 [2] CRAN (R 4.5.0)
## fs 1.6.6 2025-04-12 [2] CRAN (R 4.5.0)
## generics * 0.1.4 2025-05-09 [2] CRAN (R 4.5.0)
## GenomeInfoDb 1.45.11 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## GenomicAlignments 1.45.4 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## GenomicFeatures * 1.61.6 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## GenomicRanges * 1.61.5 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## ggplot2 4.0.0 2025-09-11 [2] CRAN (R 4.5.0)
## glue 1.8.0 2024-09-30 [2] CRAN (R 4.5.0)
## gtable 0.3.6 2024-10-25 [2] CRAN (R 4.5.0)
## hms 1.1.3 2023-03-21 [2] CRAN (R 4.5.0)
## htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.5.0)
## htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.5.0)
## httpuv 1.6.16 2025-04-16 [2] CRAN (R 4.5.0)
## httr 1.4.7 2023-08-15 [2] CRAN (R 4.5.0)
## httr2 1.2.1 2025-07-22 [2] CRAN (R 4.5.0)
## IRanges * 2.43.2 2025-09-26 [2] Bioconductor 3.22 (R 4.5.1)
## jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.5.0)
## jsonlite 2.0.0 2025-03-27 [2] CRAN (R 4.5.0)
## KEGGREST 1.49.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## knitr 1.50 2025-03-16 [2] CRAN (R 4.5.0)
## later 1.4.4 2025-08-27 [2] CRAN (R 4.5.0)
## lattice 0.22-7 2025-04-02 [2] CRAN (R 4.5.1)
## lazyeval 0.2.2 2019-03-15 [2] CRAN (R 4.5.0)
## lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.5.0)
## limma * 3.65.4 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## locfit 1.5-9.12 2025-03-05 [2] CRAN (R 4.5.0)
## magrittr 2.0.4 2025-09-12 [2] CRAN (R 4.5.0)
## Matrix 1.7-4 2025-08-28 [2] CRAN (R 4.5.1)
## MatrixGenerics * 1.21.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## matrixStats * 1.5.0 2025-01-07 [2] CRAN (R 4.5.0)
## memoise 2.0.1 2021-11-26 [2] CRAN (R 4.5.0)
## mime 0.13 2025-03-17 [2] CRAN (R 4.5.0)
## miniUI 0.1.2 2025-04-17 [2] CRAN (R 4.5.0)
## org.Dm.eg.db * 3.21.0 2025-09-10 [2] Bioconductor
## pillar 1.11.1 2025-09-17 [2] CRAN (R 4.5.0)
## pkgbuild 1.4.8 2025-05-26 [2] CRAN (R 4.5.0)
## pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.5.0)
## pkgload 1.4.1 2025-09-23 [2] CRAN (R 4.5.0)
## png 0.1-8 2022-11-29 [2] CRAN (R 4.5.0)
## prettyunits 1.2.0 2023-09-24 [2] CRAN (R 4.5.0)
## profvis 0.4.0 2024-09-20 [2] CRAN (R 4.5.0)
## progress 1.2.3 2023-12-06 [2] CRAN (R 4.5.0)
## promises 1.3.3 2025-05-29 [2] CRAN (R 4.5.0)
## ProtGenerics 1.41.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## purrr 1.1.0 2025-07-10 [2] CRAN (R 4.5.0)
## R6 2.6.1 2025-02-15 [2] CRAN (R 4.5.0)
## rappdirs 0.3.3 2021-01-31 [2] CRAN (R 4.5.0)
## RColorBrewer 1.1-3 2022-04-03 [2] CRAN (R 4.5.0)
## Rcpp 1.1.0 2025-07-02 [2] CRAN (R 4.5.0)
## RCurl 1.98-1.17 2025-03-22 [2] CRAN (R 4.5.0)
## readr 2.1.5 2024-01-10 [2] CRAN (R 4.5.0)
## remotes 2.5.0 2024-03-17 [2] CRAN (R 4.5.0)
## restfulr 0.0.16 2025-06-27 [2] CRAN (R 4.5.0)
## rjson 0.2.23 2024-09-16 [2] CRAN (R 4.5.0)
## rlang 1.1.6 2025-04-11 [2] CRAN (R 4.5.0)
## rmarkdown 2.30 2025-09-28 [2] CRAN (R 4.5.1)
## Rsamtools 2.25.3 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## RSQLite 2.4.3 2025-08-20 [2] CRAN (R 4.5.0)
## rtracklayer 1.69.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## S4Arrays 1.9.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## S4Vectors * 0.47.4 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## S7 0.2.0 2024-11-07 [2] CRAN (R 4.5.0)
## sass 0.4.10 2025-04-11 [2] CRAN (R 4.5.0)
## scales 1.4.0 2025-04-24 [2] CRAN (R 4.5.0)
## Seqinfo * 0.99.2 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## sessioninfo 1.2.3 2025-02-05 [2] CRAN (R 4.5.0)
## shiny 1.11.1 2025-07-03 [2] CRAN (R 4.5.0)
## SparseArray 1.9.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## statmod 1.5.0 2023-01-06 [2] CRAN (R 4.5.0)
## stringi 1.8.7 2025-03-27 [2] CRAN (R 4.5.0)
## stringr 1.5.2 2025-09-08 [2] CRAN (R 4.5.0)
## SummarizedExperiment * 1.39.2 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## tibble 3.3.0 2025-06-08 [2] CRAN (R 4.5.0)
## tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.5.0)
## txdbmaker 1.5.6 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## tximeta * 1.27.5 2025-09-29 [1] Bioconductor 3.22 (R 4.5.1)
## tximport 1.37.2 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## tzdb 0.5.0 2025-03-15 [2] CRAN (R 4.5.0)
## UCSC.utils 1.5.0 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## urlchecker 1.0.1 2021-11-30 [2] CRAN (R 4.5.0)
## usethis * 3.2.1 2025-09-06 [2] CRAN (R 4.5.0)
## utf8 1.2.6 2025-06-08 [2] CRAN (R 4.5.0)
## vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.5.0)
## vroom 1.6.6 2025-09-19 [2] CRAN (R 4.5.0)
## withr 3.0.2 2024-10-28 [2] CRAN (R 4.5.0)
## xfun 0.53 2025-08-19 [2] CRAN (R 4.5.0)
## XML 3.99-0.19 2025-08-22 [2] CRAN (R 4.5.0)
## xtable 1.8-4 2019-04-21 [2] CRAN (R 4.5.0)
## XVector 0.49.1 2025-09-29 [2] Bioconductor 3.22 (R 4.5.1)
## yaml 2.3.10 2024-07-26 [2] CRAN (R 4.5.0)
##
## [1] /private/var/folders/r0/l4fjk6cj5xj0j3brt4bplpl40000gt/T/RtmpCbvQDG/Rinst760d25272e93
## [2] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
## * ── Packages attached to the search path.
##
## ──────────────────────────────────────────────────────────────────────────────