% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils_clustering.R
\name{collapseSimilarChimeras}
\alias{collapseSimilarChimeras}
\title{Call clustering multiple times to collapse similar reads into duplex groups}
\usage{
collapseSimilarChimeras(
  gi,
  read_stats_df,
  maxgap = 5,
  niter = 2,
  minoverlap = 10,
  min_nodes = 10
)
}
\arguments{
\item{gi}{\code{GInteractions} object}

\item{read_stats_df}{\code{tibble} with the 'read_id' and 'duplex_id' fields.
'read_id' identifies a unique read; 'duplex_id' identifies the entry obtained by
collapsing identical reads, i.e. two identical reads correspond to two unique
'read_id' values and a single 'duplex_id' with n_reads = 2}

\item{maxgap}{Maximum relative shift between the overlapping read arms}

\item{niter}{Number of times clustering will be called}

\item{minoverlap}{Minimum required overlap for either read arm}

\item{min_nodes}{Minimum count of nodes to finish the interaction merging}
}
\value{
A list with the following keys:
\describe{
\item{gi_updated}{ \code{GInteractions} object containing both the collapsed
duplex groups and the unchanged reads that were not collapsed}
\item{stats_df}{ \code{tibble} with the mapping from each unique read to its
temporary duplex group, together with information about the time and memory
required for the function call}
}
}
\description{
Calls the clustering algorithm several times and collapses highly similar
reads into temporary duplex groups (DGs).
}
\details{
Calling this procedure before global read clustering
substantially reduces the time required for calling DGs.
Collapsed duplex groups are aggregated only from reads that are shifted
by a few nucleotides relative to each other.
These DGs are temporary until full library clustering is called.
To keep track of the mapping from the temporary DGs to the input, a dedicated
data frame is returned. The 'duplex_id' column is added or updated as the
identifier of the temporary duplex group.
The number of reads under a single 'duplex_id' is recorded in the 'n_reads' field.
}
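\examples{
\dontrun{
# Illustrative sketch only, not a tested example. It assumes that gi
# (a GInteractions object with chimeric reads) and read_stats_df (a tibble
# with the read_id and duplex_id columns) were prepared in earlier steps
# that are not shown here. The argument values repeat the documented defaults.
res <- collapseSimilarChimeras(
    gi,
    read_stats_df,
    maxgap = 5,
    niter = 2,
    minoverlap = 10,
    min_nodes = 10
)

# Access the two documented return keys
gi_collapsed <- res$gi_updated # collapsed DGs plus unchanged reads
mapping_df <- res$stats_df # mapping from reads to temporary DGs
}
}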
