% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cbaf-processOneStudy.R
\name{processOneStudy}
\alias{processOneStudy}
\title{Check expression/methylation profile for various subgroups of a cancer
 study.}
\usage{
processOneStudy(genesList, submissionName, studyName, desiredTechnique,
  desiredCaseList = FALSE, validateGenes = TRUE, calculate =
  c("frequencyPercentage", "frequencyRatio", "meanValue"), cutoff = NULL,
  round = TRUE, topGenes = TRUE, shortenStudyNames = TRUE, geneLimit = 50,
  rankingMethod = "variation", heatmapFileFormat = "TIFF", resolution = 600,
  RowCex = "auto", ColCex = "auto", heatmapMargines = "auto",
  rowLabelsAngle = 0, columnLabelsAngle = 45, heatmapColor = "RdBu",
  reverseColor = TRUE, transposedHeatmap = FALSE, simplifyBy = FALSE,
  genesToDrop = FALSE, transposeResults = FALSE)
}
\arguments{
\item{genesList}{a list that contains at least one gene group}

\item{submissionName}{a character string containing the name of interest. It
is used for naming the process.}

\item{studyName}{a character string showing the desired cancer name. It is a
standard cancer study name that can be found on cbioportal.org, such as
\code{"Acute Myeloid Leukemia (TCGA, NEJM 2013)"}.}

\item{desiredTechnique}{a character string that is one of the following
techniques: \code{"RNA-Seq"}, \code{"RNA-SeqRTN"}, \code{"microRNA-Seq"},
\code{"microarray.mRNA"}, \code{"microarray.microRNA"} or
\code{"methylation"}.}

\item{desiredCaseList}{a numeric vector that contains the indices of the
desired cancer subgroups, assuming the user knows them. If
 desiredCaseList is left as \code{FALSE}, the function will show the available
 subgroups and ask the user to enter the desired ones during the
 process. The default value is \code{FALSE}.}

\item{validateGenes}{a logical value that, if set to \code{TRUE}, causes
the function to check each cancer study to find whether or not each gene has
a record. If a cancer study doesn't have a record for a specific gene, the
function looks for alternative gene names that cbioportal might use instead
of the given gene name.}

\item{calculate}{a character vector that contains the statistical procedures
the user wants the function to compute. The complete results can be obtained
with \code{c("frequencyPercentage", "frequencyRatio", "meanValue",
"medianValue")}. This tells the function to compute the following:
\code{"frequencyPercentage"}, which is the percentage of samples with a
value greater than the cutoff, relative to the total sample size, for
every study / study subgroup;
\code{"frequencyRatio"}, which shows the number of selected samples and the
total number of samples behind each frequency percentage, for every
study / study subgroup;
\code{"meanValue"}, which contains the mean value of the selected samples for
each study;
\code{"medianValue"}, which shows the median value of the selected samples for
each study.
The default input is \code{calculate = c("frequencyPercentage",
"frequencyRatio", "meanValue")}.}

\item{cutoff}{a number used to limit samples to those that are greater than
a specific number (the cutoff). The default value for methylation data is 0.8,
 while gene expression studies use a default value of 2. For methylation
 studies, the unit is \code{average of relevant locations}; for the rest, it is
 \code{"log z-score"}. To change the cutoff to any desired number, set
 \code{cutoff = desiredNumber}, in which desiredNumber is the
 number of interest.}

\item{round}{a logical value that, if set to \code{TRUE}, will force the
function to round all the calculated values to two decimal places. The
default value is \code{TRUE}.}

\item{topGenes}{a logical value that, if set to \code{TRUE}, causes the
function to create three data frames that contain the five top genes for each
cancer. To get all three data frames, \code{"frequencyPercentage"},
\code{"meanValue"} and \code{"medianValue"} must be included in
\code{calculate}.}

\item{shortenStudyNames}{a logical value. If it is set to
\code{TRUE}, the function will try to remove the last part of the cancer names
in order to shorten them. The removed segment usually contains the name of
the scientific group that conducted the experiment.}

\item{geneLimit}{if a large number of genes exists in at least one gene group,
this option can be used to limit the number of genes that are shown on the
heatmap. For instance, \code{geneLimit=50} will limit the heatmap to the 50
 genes showing the most variation across multiple studies / study subgroups.
 The default value is \code{50}.}

\item{rankingMethod}{a character value that determines how genes will be
ranked prior to drawing the heatmap. \code{"variation"} orders the genes based
on unique values in one or a few cancer studies, while \code{"highValue"} ranks
 the genes when they contain high values in multiple / many cancer studies.
 This option is useful when the number of genes is so large that the user has
 to limit the number of genes on the heatmap with \code{geneLimit}.}

\item{heatmapFileFormat}{This option enables the user to select the desired
image file format of the heatmaps. The default value is \code{"TIFF"}. Other
supported formats include \code{"JPG"}, \code{"BMP"}, \code{"PNG"}, and
\code{"PDF"}.}

\item{resolution}{a number. This option can be used to adjust the resolution
of the output heatmaps in dots per inch. The default value is 600.}

\item{RowCex}{a number that specifies letter size in heatmap row names,
which ranges from 0 to 2. If \code{RowCex = "auto"}, the function will
automatically determine the best RowCex.}

\item{ColCex}{a number that specifies letter size in heatmap column names,
which ranges from 0 to 2. If \code{ColCex = "auto"}, the function will
automatically determine the best ColCex.}

\item{heatmapMargines}{a numeric vector that is used to set heatmap margins.
If \code{heatmapMargines = "auto"}, the function will automatically
determine the best possible margins. Otherwise, enter the desired margins as,
e.g., \code{c(10, 10)}.}

\item{rowLabelsAngle}{a number that determines the angle at which the
gene names are shown in heatmaps. The default value is 0 degrees.}

\item{columnLabelsAngle}{a number that determines the angle at which the
study / study subgroup names are shown in heatmaps. The default value is 45
degrees.}

\item{heatmapColor}{a character string that defines the heatmap color. The
default value is \code{'RdBu'}. \code{'RdGr'} is also a popular color in
genomic studies. To see the rest of the colors, please type
\code{library(RColorBrewer)} and then \code{display.brewer.all()}.}

\item{reverseColor}{a logical value that reverses the color gradient for
heatmap(s).}

\item{transposedHeatmap}{a logical value that transposes heatmap rows to
columns and vice versa.}

\item{simplifyBy}{a number that tells the function to change the values
smaller than that to zero. The purpose behind this option is to facilitate
recognizing candidate genes. Therefore, it is not suited for publications. It
has the same unit as \code{cutoff}.}

\item{genesToDrop}{a character vector. Gene names within this vector will be
omitted from the heatmap. The default value is \code{FALSE}.}

\item{transposeResults}{a logical value that, if set to \code{TRUE}, makes the
function swap the rows and columns of the resulting data.}
}
\value{
a BiocFileCache object that contains some or all of the following
groups, based on what the user has chosen: \code{ObtainedData},
\code{validationResults}, \code{frequencyPercentage},
\code{Top.Genes.of.Frequency.Percentage}, \code{frequencyRatio},
\code{meanValue}, \code{Top.Genes.of.Mean.Value}, \code{medianValue},
\code{Top.Genes.of.Median.Value}. It also saves these results in one Excel
file for convenience. Based on preference, three heatmaps for frequency
percentage, mean value and median value can be generated. If more than one
 group of genes is entered, the output for each group will be stored in a
 separate sub-directory.
}
\description{
This function obtains the requested data for the given genes
across multiple subgroups of a cancer. It can check whether or not all genes
are included in subgroups of a cancer study and, if not, looks for the
alternative gene names. Then it calculates the frequency percentage, frequency
ratio, mean value and median value of samples greater than a specific value in
 the selected subgroups of the cancer. Furthermore, it looks for the five
 genes that have the highest values in each cancer study subgroup.
}
\details{
\tabular{ll}{
Package: \tab cbaf \cr
Type: \tab Package \cr
Version: \tab 1.31.1 \cr
Date: \tab 2025-10-26 \cr
License: \tab Artistic-2.0 \cr
}
}
\examples{
genes <- list(K.demethylases = c("KDM1A", "KDM1B", "KDM2A", "KDM2B", "KDM3A",
 "KDM3B", "JMJD1C", "KDM4A"), K.methyltransferases = c("SUV39H1", "SUV39H2",
 "EHMT1", "EHMT2", "SETDB1", "SETDB2", "KMT2A"))

processOneStudy(genes, "test", "Breast Invasive Carcinoma (TCGA, Cell 2015)",
"RNA-Seq", desiredCaseList = c(2,3,4,5), calculate = c("frequencyPercentage",
"frequencyRatio"), heatmapMargines = c(16, 10), RowCex = 1, ColCex = 1)
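
# Illustrative sketch only: the values below are made up, and this is not the
# internal code of processOneStudy(). It shows, in base R, what the statistics
# requested through calculate mean for one gene in one study subgroup.
# All object names starting with toy are hypothetical.
toyValues <- c(0.5, 2.4, 3.0, 1.9, 2.2, -0.7, 4.0, 1.2)  # made-up z-scores
toyCutoff <- 2                       # documented default for expression data

# Samples whose value is greater than the cutoff
toySelected <- toyValues[toyValues > toyCutoff]

# frequencyPercentage: selected samples relative to all samples, times 100
toyFrequencyPercentage <- round(100 * length(toySelected) / length(toyValues), 2)

# frequencyRatio: selected and total sample sizes (the string layout used
# here is an assumption, not necessarily the layout the package produces)
toyFrequencyRatio <- paste(length(toySelected), length(toyValues), sep = "/")

# meanValue and medianValue of the selected samples
toyMeanValue <- round(mean(toySelected), 2)
toyMedianValue <- round(median(toySelected), 2)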

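# Illustrative sketch only, on a made-up result matrix: the effect described
# for simplifyBy (values smaller than the threshold become zero) and for
# transposeResults (rows and columns are swapped). This is an assumption about
# the behaviour as documented, not the code of the package itself.
toyMatrix <- matrix(c(12, 3, 45, 8, 60, 1), nrow = 2,
  dimnames = list(c("GENE1", "GENE2"),
                  c("subgroupA", "subgroupB", "subgroupC")))
toySimplifyBy <- 10                           # hypothetical threshold

toySimplified <- toyMatrix
toySimplified[toySimplified < toySimplifyBy] <- 0   # small values set to zero

toyTransposed <- t(toySimplified)             # subgroups in rows, genes in columns
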
}
\author{
Arman Shahrisa, \email{shahrisa.arman@hotmail.com} [maintainer,
copyright holder]

Maryam Tahmasebi Birgani, \email{tahmasebi-ma@ajums.ac.ir}
}
