\name{trinarySimilarity}
\alias{trinarySimilarity}
\title{
Compute the Tanimoto similarity coefficient between the bioactivity profiles
of two compounds, each represented as a column in a compound vs. target sparse matrix
}
\description{
This computes Tanimoto similarity coefficients between bioactivity profiles in a
sparse-matrix-aware way, where only commonly tested targets are considered. The computation is
trinary in that each compound is a column in a compound vs. target matrix with three possible values
(2=active, 1=inactive, 0=untested or inconclusive) as generated by the \code{perTargetMatrix} function.
A comparison returns \code{NA} unless at least one of two minimum thresholds is satisfied:
either a minimum number of shared screened targets, or a minimum number of shared active targets,
as performed in Dancik, V. et al. (see references).
}
\usage{
trinarySimilarity(queryMatrix, targetMatrix, 
    minSharedScreenedTargets = 12, minSharedActiveTargets = 3)
}
\arguments{
  \item{queryMatrix}{
    This is a compound vs. target sparse matrix representing the bioactivity profile for a single
    compound across one or more assays or targets. The format must be a
    \code{dgCMatrix} sparse matrix as computed by the \code{perTargetMatrix} function with the option
    \code{useNumericScores = FALSE}. It must contain exactly one column, which can be extracted
    from a larger compound vs. target sparse matrix with
    \code{queryMatrix[,colNumber,drop=FALSE]}, where \code{colNumber} is the desired compound
    column number.
}
  \item{targetMatrix}{
    This is a compound vs. target sparse matrix representing the bioactivity profiles for one or more
    compounds across one or more assays or targets. The format must be a
    \code{dgCMatrix} sparse matrix as computed by the \code{perTargetMatrix} function with the option
    \code{useNumericScores = FALSE}. Similarity will be computed between the query and each
    column of this matrix individually.
}
  \item{minSharedScreenedTargets}{
    A \code{numeric} value specifying the minimum number of shared screened targets needed for a meaningful
    similarity computation. If both this threshold and \code{minSharedActiveTargets} are unsatisfied,
    the returned result will be \code{NA} instead of a computed value. The default of 12 was
    taken from Dancik, V. et al. (see references), where it was experimentally determined to result
    in meaningful predictions.
}
  \item{minSharedActiveTargets}{
    A \code{numeric} value specifying the minimum number of shared active targets needed for a meaningful
    similarity computation. If both this threshold and \code{minSharedScreenedTargets} are unsatisfied,
    the returned result will be \code{NA} instead of a computed value. The default of 3 was
    taken from Dancik, V. et al. (see references), where it was experimentally determined to result
    in meaningful predictions.
}
}
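\details{
The following is a sketch of the computation as inferred from the description above,
not a statement of the internal implementation. Let \eqn{S} be the set of targets screened
for both compounds (value 1 or 2 in both columns), and let \eqn{A_q} and \eqn{A_t} be the
subsets of \eqn{S} against which the query and target compound, respectively, are active
(value 2). Under that reading, the coefficient is
\deqn{T = \frac{|A_q \cap A_t|}{|A_q \cup A_t|}}{T = |A_q intersect A_t| / |A_q union A_t|}
which is undefined when \eqn{A_q \cup A_t}{A_q union A_t} is empty.
}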
\value{
A \code{numeric} vector where each element represents the Tanimoto similarity between
the \code{queryMatrix} and a given column in the \code{targetMatrix}, where only the shared
set of commonly screened targets is considered. If both the \code{minSharedScreenedTargets}
and \code{minSharedActiveTargets} thresholds are unsatisfied, an \code{NA} will be returned for the
given similarity value.
An \code{NA} will also be returned if the Tanimoto coefficient is undefined due
to a zero in the denominator, which occurs when neither compound was found active
against any of the commonly screened targets.
}
\references{
Tanimoto similarity coefficient: Tanimoto TT (1957) IBM Internal Report, 17th Nov. See also Jaccard P (1901) Bulletin de la Societe Vaudoise des Sciences Naturelles 37, 241-272.

Dancik, V. et al. Connecting Small Molecules with Similar Assay Performance 
Profiles Leads to New Biological Hypotheses. J Biomol Screen 19, 771-781 (2014).
}
\author{
Tyler Backman
}
\seealso{
\code{\link{perTargetMatrix}},
\code{\link{getBioassaySetByCids}},
\code{\link{bioactivityFingerprint}}
}
\examples{
## connect to a test database
extdata_dir <- system.file("extdata", package="bioassayR")
sampleDatabasePath <- file.path(extdata_dir, "sampleDatabase.sqlite")
sampleDB <- connectBioassayDB(sampleDatabasePath)

## retrieve activity data for three compounds
assays <- getBioassaySetByCids(sampleDB, c("2244","3715","133021"))

## collapse assays into perTargetMatrix
targetMatrix <- perTargetMatrix(assays)

## compute similarity between first column and all columns
queryMatrix <- targetMatrix[,1,drop=FALSE]
trinarySimilarity(queryMatrix, targetMatrix)

## disconnect from sample database
disconnectBioassayDB(sampleDB)
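
## A minimal base-R sketch of the trinary comparison described in this help page.
## It is written independently of the package internals, so treat it as an
## illustration under stated assumptions, not as the actual implementation.
## The profiles, variable names, and small thresholds (3 and 1) are hypothetical.
profileA <- c(2, 1, 0, 2, 1)   # 2=active, 1=inactive, 0=untested
profileB <- c(2, 2, 1, 0, 1)

## only targets screened for both compounds are considered
sharedScreened <- profileA > 0 & profileB > 0
nShared <- sum(sharedScreened)

## count shared actives, and targets where at least one compound is active
bothActive <- sum(sharedScreened & profileA == 2 & profileB == 2)
eitherActive <- sum(sharedScreened & (profileA == 2 | profileB == 2))

## assumed rule: NA unless at least one threshold is met and the denominator is non-zero
sketchSimilarity <- if ((nShared >= 3 || bothActive >= 1) && eitherActive > 0) {
    bothActive / eitherActive
} else {
    NA
}
sketchSimilarity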
}
\keyword{ utilities }
