% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/manhattan_data_preprocess.R
\name{manhattan_data_preprocess}
\alias{manhattan_data_preprocess}
\alias{manhattan_data_preprocess.default}
\alias{manhattan_data_preprocess.data.frame}
\alias{manhattan_data_preprocess,GRanges-method}
\title{Preprocess GWAS Result}
\usage{
manhattan_data_preprocess(x, ...)

\method{manhattan_data_preprocess}{default}(x, ...)

\method{manhattan_data_preprocess}{data.frame}(
  x,
  chromosome = NULL,
  signif = c(5e-08, 1e-05),
  pval.colname = "pval",
  chr.colname = "chr",
  pos.colname = "pos",
  highlight.colname = NULL,
  chr.order = NULL,
  signif.col = NULL,
  chr.col = NULL,
  highlight.col = NULL,
  preserve.position = FALSE,
  thin = NULL,
  thin.n = 1000,
  thin.bins = 200,
  pval.log.transform = TRUE,
  chr.gap.scaling = 1,
  ...
)

\S4method{manhattan_data_preprocess}{GRanges}(
  x,
  chromosome = NULL,
  signif = c(5e-08, 1e-05),
  pval.colname = "pval",
  highlight.colname = NULL,
  chr.order = NULL,
  signif.col = NULL,
  chr.col = NULL,
  highlight.col = NULL,
  preserve.position = FALSE,
  thin = NULL,
  thin.n = 100,
  thin.bins = 200,
  pval.log.transform = TRUE,
  chr.gap.scaling = 1,
  ...
)
}
\arguments{
\item{x}{a data frame or any extension of a data frame (e.g. a tibble).
At a bare minimum, it should contain chromosome, position, and p-value columns.}

\item{...}{Additional arguments for manhattan_data_preprocess.}

\item{chromosome}{a character. This is supplied if a manhattan plot of a single chromosome is
desired. If \code{NULL}, then all the chromosomes in the data will be plotted.}

\item{signif}{a numeric vector. Significance thresholds (p-values) to be drawn on the
manhattan plot. At least one value should be provided. Defaults to \code{c(5e-08, 1e-05)}.}

\item{pval.colname}{a character. Column name of \code{x} containing the p-values.}

\item{chr.colname}{a character. Column name of \code{x} containing chromosome.}

\item{pos.colname}{a character. Column name of \code{x} containing position.}

\item{highlight.colname}{a character. If you desire to color certain points
(e.g. significant variants) rather than color by chromosome, you can specify the
category in this column, and provide the color mapping in \code{highlight.col}.
Ignored if \code{NULL}.}

\item{chr.order}{a character vector. Order of chromosomes presented in manhattan plot.}

\item{signif.col}{a character vector of the same length as \code{signif}. It contains
colors for the lines drawn at \code{signif}. If \code{NULL}, the smallest value is colored
black while others are grey.}

\item{chr.col}{a character vector of the same length as \code{chr.order}. It contains colors
for the chromosomes. Names of the vector should match \code{chr.order}. If \code{NULL}, default
colors are applied using \code{RColorBrewer}.}

\item{highlight.col}{a character vector. It contains color mapping for the values from
\code{highlight.colname}.}

\item{preserve.position}{a logical. If \code{TRUE}, the width of each chromosome reflects the
number of variants and the position of each variant is correctly scaled. If \code{FALSE}, the
width of each chromosome is equal and the variants are equally spaced.}

\item{thin}{a logical. If \code{TRUE}, \code{thinPoints} will be applied. Defaults to \code{TRUE} if
\code{chromosome} is \code{NULL}. Defaults to \code{FALSE} if \code{chromosome} is supplied.}

\item{thin.n}{an integer. Maximum number of points per horizontal partition of the plot.
Defaults to 1000 in the \code{data.frame} method and 100 in the \code{GRanges} method.}

\item{thin.bins}{an integer. Number of bins to partition the data. Defaults to 200.}

\item{pval.log.transform}{a logical. If \code{TRUE}, the p-value will be transformed to -log10(p-value).}

\item{chr.gap.scaling}{a scaling factor for the gaps between chromosomes, if you wish to change them.
This can also be set in \code{manhattan_plot}.}
}
\value{
an \code{MPdata} object. This object contains all the necessary components
for constructing a manhattan plot.
}
\description{
Preprocesses the result of a genome-wide association study (GWAS)
before making a manhattan plot.
It accepts a \code{data.frame}, which at a bare minimum should
contain chromosome, position, and p-value columns.
Additional options, such as chromosome color, label column names,
and colors for specific variants, are provided here.
}
\details{
\code{manhattan_data_preprocess} gathers the information needed to draw a manhattan plot
and organizes it as an \code{MPdata} S3 object.

A new position is calculated for each point and stored in the data.frame as
\code{"new_pos"}. By default, all chromosomes will have the same width, with each
point being equally spaced. This behavior is changed when \code{preserve.position = TRUE}.
The width of each chromosome will scale to the number of points and the points will
reflect the original positions.
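
As a rough sketch of the two layouts, the calls below preprocess the same data both ways.
The toy data frame \code{toy} and the object names are made up for illustration and are not
part of the package; the column names are chosen to match the default
\code{pval.colname}, \code{chr.colname}, and \code{pos.colname}.

\preformatted{
# hypothetical toy data: 3 chromosomes with 20 variants each
toy <- data.frame(
  chr = rep(1:3, each = 20),
  pos = c(replicate(3, sort(sample(1:1000, 20)))),
  pval = runif(60)
)

# default: chromosomes get equal width and points are equally spaced
mp_equal <- manhattan_data_preprocess(toy, chr.order = as.character(1:3))

# preserve.position = TRUE: spacing follows the original positions
mp_scaled <- manhattan_data_preprocess(
  toy, chr.order = as.character(1:3), preserve.position = TRUE
)
}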

\code{chr.col} and \code{highlight.col} map the data values to colors. If they are
an unnamed vector, then the function will try its best to match the values of
\code{chr.colname} or \code{highlight.colname} to the colors. If they are a named vector,
then they are expected to map all values to a color. If \code{highlight.colname} is
supplied, then \code{chr.col} is ignored.
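
A minimal sketch of a named \code{highlight.col} mapping is shown below. The toy data frame,
the \code{group} column, its two labels, and the 0.5 cutoff are illustrative assumptions
rather than package defaults.

\preformatted{
# hypothetical toy data using the default column names
toy <- data.frame(
  chr = rep(1:3, each = 20),
  pos = c(replicate(3, sort(sample(1:1000, 20)))),
  pval = runif(60)
)

# label each variant; the cutoff is arbitrary and chosen only for this toy data
toy$group <- ifelse(toy$pval <= 0.5, "low", "high")

# named vector: every value found in the group column is mapped to a color
mp_highlight <- manhattan_data_preprocess(
  toy,
  chr.order = as.character(1:3),
  highlight.colname = "group",
  highlight.col = c("low" = "black", "high" = "grey")
)
}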

Feeding a \code{data.frame} directly into \code{manhattan_plot}
does preprocessing and plotting in one step. If you plan on making multiple plots
with different graphic options, you can instead preprocess once with this function and
then generate each plot from the result.
}
\examples{

gwasdat <- data.frame(
  "chromosome" = rep(1:5, each = 30),
  "position" = c(replicate(5, sample(1:300, 30))),
  "pvalue" = rbeta(150, 1, 1)^5
)

manhattan_data_preprocess(
  gwasdat, pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
  chr.order = as.character(1:5)
)
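
# Sketch: preprocess once, then reuse the result for several plots.
# `mpdat` and the plot titles are illustrative; this assumes manhattan_plot()
# accepts the MPdata object and a `plot.title` argument.
mpdat <- manhattan_data_preprocess(
  gwasdat, pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
  chr.order = as.character(1:5)
)
manhattan_plot(mpdat, plot.title = "First version")
manhattan_plot(mpdat, plot.title = "Second version")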

}
