% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/neighSmooth.R
\name{neighSmooth}
\alias{neighSmooth}
\title{Euclidean neighbor smoothing}
\usage{
neighSmooth(
  focusData,
  euclidSpaceData,
  neighRows = "default",
  ctrlRows = NULL,
  kNeighK = "default",
  kMeansK = "default",
  kMeansCenters = NULL,
  kMeansClusters = NULL,
  method = "mean",
  nCores = detectCores() - 1
)
}
\arguments{
\item{focusData}{The data that should be smoothed. Should be a matrix with
the variables to be smoothed as columns.}

\item{euclidSpaceData}{The data cloud in which the nearest neighbors for the
events should be identified. Can be a vector, matrix or dataframe. Note that
if this data has more than 10 dimensions, the first step of the algorithm is
to reduce it to 10 dimensions with a PCA, using fast.prcomp from gmodels. If
the function is called iteratively on the same data, it may therefore be
more efficient to run the PCA once beforehand.}

\item{neighRows}{The rows in the dataset that correspond to the neighbors
of the focusData points. "default" is all the focusData points, but a subset
can be provided instead. This can be used to increase robustness, e.g. by
running 100 iterations with different sets of neighbors, each with the same
number of points from each group/individual.}

\item{ctrlRows}{Optionally, a set of control rows that are used to remove
background signal from the neighRows data before the result is returned.}

\item{kNeighK}{The number of nearest neighbors. "default" is the maximum of
100 and the number of neighbor rows divided by 10000. Trying multiple
different values is recommended.}

\item{kMeansK}{The number of clusters in the initial step of the algorithm.
A higher number leads to shorter runtime, but potentially lower accuracy.
This is not used if kMeansCenters is provided. "default" is the maximum of 1
and the number of cells in euclidSpaceData divided by 1000.}

\item{kMeansCenters}{Here, a pre-clustering of the data can be provided, in
which case the clustering will not be performed internally. This is advisable
if, for example, a bootstrapping scheme is used to define the neighRows
iteratively, as the k-means step can be quite time-consuming. This argument
takes the cluster centers, or centroids.}

\item{kMeansClusters}{See kMeansCenters. Here, the clusters, rather than the
centroids, are provided, if a pre-clustering is used.}

\item{method}{The method to use for the smoothing. Three values are possible:
mean (default), median and mode.}

\item{nCores}{The number of cores used. Defaults to the number of cores in
the computer minus 1.}
}
\value{
An object of the same dimensions as focusData that has been smoothed.
}
\description{
This function constructs a variable that, for each event, shows the average
value of its Euclidean k-nearest neighbors. It builds on the same idea as has
been put forward in the following two works:
\itemize{
\item Burns TJ (2019). Sconify: A toolkit for performing KNN-based statistics
for flow and mass cytometry data. R package version 1.4.0.
\item Hart GT, Tran TM, Theorell J, Schlums H, Arora G, Rajagopalan S, et al.
Adaptive NK cells in people exposed to Plasmodium falciparum correlate
with protection from malaria. J Exp Med. 2019 Jun 3;216(6):1280–90.
}
First, the k nearest neighbors are identified for cell x. Then, the average
value of these k nearest neighbors is returned as the result for cell x.
}
\examples{
data(testData)
data(testDataSNE)
euclidSpaceData <-
    testData[, c(
        "SYK", "CD16", "CD57", "EAT.2",
        "CD8", "NKG2C", "CD2", "CD56"
    )]
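# Optional, hedged sketch: running a PCA beforehand for data with more than
# 10 dimensions, as suggested for the euclidSpaceData argument. The object
# toyHighDim is a hypothetical 15-dimensional data set, and stats::prcomp is
# used here only for illustration, whereas the function is documented to use
# fast.prcomp from gmodels. Whether this matches the internal preprocessing
# exactly, for example regarding centering and scaling, is an assumption.
set.seed(1)
toyHighDim <- matrix(rnorm(200 * 15), ncol = 15)
toyPca <- prcomp(toyHighDim)
# Keep the first 10 principal components as the reduced data cloud
toyReduced <- toyPca$x[, seq_len(10)]
dim(toyReduced)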
\dontrun{
# Smooth the group label over the Euclidean neighbors of each event
smoothGroupVector <- neighSmooth(
    focusData = as.numeric(testData$label),
    euclidSpaceData = euclidSpaceData
)
}
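# Illustration only, not the internal implementation of neighSmooth: a
# brute-force base R sketch of the idea in the description, where each point
# receives the mean value of its k nearest Euclidean neighbors. The objects
# toySpace, toyFocus and toyK are hypothetical. In this sketch, each point is
# counted among its own neighbors, which is an assumption.
set.seed(1)
toySpace <- matrix(rnorm(40), ncol = 2)
toyFocus <- rnorm(20)
toyK <- 5
# Full pairwise Euclidean distance matrix, feasible only for small data
toyDist <- as.matrix(dist(toySpace))
toySmoothed <- vapply(seq_len(nrow(toySpace)), function(i) {
    # Indices of the toyK closest points to point i
    neighbors <- order(toyDist[i, ])[seq_len(toyK)]
    mean(toyFocus[neighbors])
}, numeric(1))
head(toySmoothed)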
}
