% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/findGSEP.R
\name{findGSEP}
\alias{findGSEP}
\title{Estimate genome size of polyploid species using k-mer frequencies.}
\usage{
findGSEP(
  path,
  samples,
  sizek,
  exp_hom,
  ploidy,
  range_left,
  range_right,
  xlimit,
  ylimit,
  output_dir = "outfile"
)
}
\arguments{
\item{path}{is the histo file location (mandatory).}

\item{samples}{is the histo file name (mandatory).}

\item{sizek}{is the size of k used to generate the histo file (mandatory).
k is involved in calculating heterozygosity if the genome is heterozygous.}

\item{exp_hom}{a rough average k-mer coverage used to locate the homozygous regions.
The k-mer frequency histogram usually shows several peaks, and one has to
determine which peak corresponds to the homozygous regions and which to the
heterozygous regions. The argument is optional, but it must be provided
to estimate the size of a heterozygous genome.
The value of exp_hom must satisfy fp < exp_hom < 2*fp, where fp is the frequency of the homozygous peak.
If not provided, it defaults to 0, which assumes the genome is homozygous.}

\item{ploidy}{is the ploidy level of the species (mandatory).}

\item{range_left}{is the left range for estimation (optional). The default is exp_hom*0.2,
which normally does not need to be changed.}

\item{range_right}{is the right range for estimation (optional). The default is exp_hom*0.2,
which normally does not need to be changed.}

\item{xlimit}{is the x-axis range (optional). If not given, a suitable range is calculated
automatically; this normally does not need to be changed.}

\item{ylimit}{is the y-axis range (optional). If not given, a suitable range is calculated
automatically; this normally does not need to be changed.}

\item{output_dir}{is the path to which output files are written (optional).
If not specified, the default "outfile" shown in the usage is used.}
}
\value{
No return value, called for side effects. The function generates PDF, PNG, and CSV files in the specified output directory.
}
\description{
findGSEP estimates the genome size of polyploid species
by iteratively fitting k-mer frequencies
with a normal distribution model.

To use findGSEP, one needs to prepare a histo file,
which contains two tab-separated columns.
The first column gives frequencies at which k-mers occur in reads,
while the second column gives counts of such distinct k-mers.
The k-mer size k and the corresponding histo file are required for any estimation.

Required R packages include pracma and fGarch; see DESCRIPTION for the full list.
}
\examples{
\dontrun{
test_histo <- system.file("extdata","example.histo",package = "findGSEP")
path <- dirname(test_histo)
samples <- basename(test_histo)
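
# Optional sketch, not part of findGSEP: inspect the histogram before choosing exp_hom.
# Assumes the two-column histo layout given in the description; the column names
# freq and count and the cutoff of 5 are illustrative assumptions.
histo <- read.table(file.path(path, samples), col.names = c("freq", "count"))
# Ignore the low-frequency k-mers, which are usually sequencing errors,
# then find the frequency of the tallest remaining peak.
keep <- histo$freq > 5
peak_freq <- histo$freq[keep][which.max(histo$count[keep])]
peak_freq
# The tallest peak is not necessarily the homozygous one, so check it against
# a plot of the histogram. If it is the homozygous peak, exp_hom should lie
# strictly between peak_freq and 2 * peak_freq.
plot(histo$freq[keep], histo$count[keep], type = "l",
     xlab = "k-mer frequency", ylab = "number of distinct k-mers")
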
sizek <- 21
exp_hom <- 200
ploidy <- 3
range_left <- exp_hom*0.2
range_right <- exp_hom*0.2
xlimit <- -1
ylimit <- -1
output_dir <- tempdir()

findGSEP(path, samples, sizek, exp_hom, ploidy, range_left, range_right, xlimit, ylimit, output_dir)
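
# Optional sketch: list the PDF, PNG, and CSV files written to output_dir.
# The exact file names are not documented here, so this only filters by
# extension; the pattern is an assumption and may also match unrelated files.
list.files(output_dir, pattern = "[.](pdf|png|csv)$", full.names = TRUE)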
}
}
