% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/addWeights.R
\name{addWeights}
\alias{addWeights}
\title{Calculate weights for the regulons by computing co-association between TF and target gene expression}
\usage{
addWeights(
  regulon,
  expMatrix = NULL,
  peakMatrix = NULL,
  exp_assay = "logcounts",
  peak_assay = "PeakMatrix",
  method = c("wilcoxon", "corr", "MI"),
  clusters = NULL,
  exp_cutoff = 1,
  peak_cutoff = 0,
  block_factor = NULL,
  min_targets = 10,
  tf_re.merge = FALSE,
  aggregateCells = FALSE,
  useDim = "IterativeLSI_ATAC",
  cellNum = 10,
  BPPARAM = BiocParallel::SerialParam(progressbar = TRUE)
)
}
\arguments{
\item{regulon}{A DataFrame object with columns named \code{tf} (regulator) and \code{target}.}

\item{expMatrix}{A SingleCellExperiment object containing gene expression information}

\item{peakMatrix}{A SingleCellExperiment object or matrix containing peak accessibility with
peaks in the rows and cells in the columns}

\item{exp_assay}{String specifying the name of the assay to be retrieved from the SingleCellExperiment object}

\item{peak_assay}{String indicating the name of the assay in peakMatrix for chromatin accessibility}

\item{method}{String specifying the method of weights calculation. Three options are available: \code{wilcoxon}, \code{corr} and \code{MI}.}

\item{clusters}{A vector corresponding to the cluster labels of the cells}

\item{exp_cutoff}{A scalar indicating the minimum expression of a transcription factor above which
a cell is considered to express that transcription factor.}

\item{peak_cutoff}{A scalar indicating the minimum peak accessibility above which a peak is
considered open.}

\item{block_factor}{String specifying the field in the colData of the SingleCellExperiment object to be used as blocking factor (such as batch)}

\item{min_targets}{Integer specifying the minimum number of targets for each TF in the regulon. Default is 10.}

\item{tf_re.merge}{A logical to indicate whether to consider both TF expression and chromatin accessibility. See details.}

\item{aggregateCells}{A logical to indicate whether to aggregate cells into groups determined by cellNum. This option can be used to
overcome data sparsity when using \code{wilcoxon}.}

\item{useDim}{String indicating the name of the dimensionality reduction matrix in expMatrix used for cell aggregation}

\item{cellNum}{A numeric specifying the number of cells per cluster for cell aggregation. Default is 10.}

\item{BPPARAM}{A BiocParallelParam object specifying whether summation should be parallelized. Use BiocParallel::SerialParam() for
serial evaluation and use BiocParallel::MulticoreParam() for parallel evaluation}
}
\value{
A DataFrame with columns of corr and/or MI added to the regulon. TFs not found in the expression matrix and regulons not
meeting the minimum number of targets are filtered out.
}
\description{
Calculate weights for the regulons by computing co-association between TF and target gene expression
}
\details{
This function estimates the regulatory potential of a transcription factor on its target genes, in other words,
the magnitude of gene expression changes induced by transcription factor activity, using one of three methods:
\itemize{
\item{\code{corr} - correlation between TF and target gene expression}
\item{\code{MI} - mutual information between the TF and target gene expression}
\item{\code{wilcoxon} - effect size of the Wilcoxon test comparing target gene expression in cells jointly expressing all three elements (TF, RE and TG) vs
cells that do not}}
Two measures (\code{corr} and \code{wilcoxon}) give both the magnitude and the direction of changes, whereas \code{MI} always outputs
positive weights. The correlation and mutual information statistics are computed on the pseudobulked gene expression or accessibility
matrices, whereas the Wilcoxon method groups cells based on the joint expression of the transcription factor (TF), regulatory element (RE) and target gene (TG) in each single cell.
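
As a rough illustration of the grouping behind \code{wilcoxon} (a base R sketch, not the package's implementation), the snippet below assumes that a cell counts as active when TF expression exceeds \code{exp_cutoff} and accessibility of the regulatory element exceeds \code{peak_cutoff}. The vectors \code{tf}, \code{re} and \code{tg} are hypothetical per-cell values for a single TF-RE-TG triplet, and the effect size calculation itself is not shown.
\preformatted{
# hypothetical per-cell values for one TF-RE-TG triplet (8 cells)
tf <- c(0, 2, 3, 0, 1.5, 0, 2.5, 0)                  # TF expression
re <- c(1, 1, 2, 0, 0, 1, 3, 0)                      # accessibility of the regulatory element
tg <- c(0.2, 1.4, 2.1, 0.1, 0.3, 0.4, 1.9, 0.0)      # target gene expression

exp_cutoff <- 1
peak_cutoff <- 0

# assumption: a cell is "active" when both TF and RE exceed their cutoffs
active <- tf > exp_cutoff & re > peak_cutoff

# compare target gene expression between active and inactive cells
wilcox.test(tg[active], tg[!active])
}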

When using the \code{corr} method, the default practice is to compute weights by correlating the pseudobulk target gene expression with
the pseudobulk TF gene expression. However, an inhibitor of a TF often does not alter the gene expression of that TF.
In rare cases, cells may even compensate by increasing the expression of the TF. In such cases, the activity of the TF,
if computed from the TF-TG correlation, may show a spurious increase. As an alternative to TF gene expression alone,
we may correlate the product of TF and RE against TG. When \code{tf_re.merge} is \code{TRUE}, we take the product of
the TF gene expression and the chromatin accessibility of the RE.
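
The idea behind \code{tf_re.merge} can be sketched in base R as follows. This is an illustration only and does not reproduce the package's internal computation. The vectors \code{tf}, \code{re} and \code{tg} are hypothetical pseudobulk values across five clusters, chosen so that TF expression is nearly constant while accessibility varies.
\preformatted{
# hypothetical pseudobulk values per cluster (not produced by addWeights)
tf <- c(2.0, 2.1, 1.9, 2.0, 2.2)   # TF expression, nearly constant across clusters
re <- c(0.1, 0.5, 0.9, 0.2, 0.8)   # chromatin accessibility of the regulatory element
tg <- c(0.3, 1.1, 1.8, 0.5, 1.7)   # target gene expression

cor(tf, tg)        # weight based on TF expression alone
cor(tf * re, tg)   # weight based on the product of TF expression and accessibility
}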
}
\examples{
# create a mock SingleCellExperiment object for gene expression matrix
expMatrix <- scuttle::mockSCE()
expMatrix <- scuttle::logNormCounts(expMatrix)
expMatrix$cluster <- sample(LETTERS[1:5], ncol(expMatrix), replace=TRUE)

# create a mock SingleCellExperiment object for peak matrix
peakMatrix <- scuttle::mockSCE()
rownames(peakMatrix) <- 1:2000

# create a mock regulon
regulon <- S4Vectors::DataFrame(tf=c(rep('Gene_0001',5), rep('Gene_0002',10)),
                      idxATAC=1:15,
                      target=c(paste0('Gene_000',2:6), paste0('Gene_00',11:20)))

# add weights to regulon
regulon.w <- addWeights(regulon=regulon, expMatrix=expMatrix, exp_assay='logcounts',
                        peakMatrix=peakMatrix, peak_assay='counts', clusters=expMatrix$cluster,
                        min_targets=5, method='wilcoxon')

# add weights with cell aggregation
expMatrix <- scater::runPCA(expMatrix)
regulon.w <- addWeights(regulon=regulon, expMatrix=expMatrix, exp_assay='logcounts',
                        peakMatrix=peakMatrix, peak_assay='counts', clusters=expMatrix$cluster,
                        min_targets=5, method='wilcoxon', aggregateCells=TRUE, cellNum=3, useDim='PCA')

}
\author{
Xiaosai Yao, Shang-yang Chen, Tomasz Wlodarczyk
}
