% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/VCG_sampler.R
\name{VCG_sampler}
\alias{VCG_sampler}
\title{VCG Sampler for Energy Distance Balancing}
\usage{
VCG_sampler(formula, data, n, c_w = NULL, random = FALSE, plot = TRUE)
}
\arguments{
\item{formula}{A formula specifying the treated indicator and the covariates, e.g., `treated ~ cov1 + cov2 | stratum`, where the optional term after `|` names a stratification variable. The treated variable must be binary (0 = pool, 1 = treated).}

\item{data}{A data frame containing the variables specified in the formula.}

\item{n}{Integer. Number of observations to sample from the pool, or an integer vector giving the number to sample from each stratum.}

\item{c_w}{Optional vector of positive covariate weights reflecting the relative importance of each covariate for balancing. Default: `NULL`.}

\item{random}{Logical. If `TRUE`, pool units are sampled with probability inversely proportional to their energy distance; if `FALSE`, the units with the lowest energy distance are selected (deterministic). Default: `FALSE`.}

\item{plot}{Logical. If `TRUE`, a visualization of the balancing effect is returned together with the data frame. Default: `TRUE`.}
}
\value{
If `plot = TRUE`, returns a list with:
\itemize{
  \item A data frame with added columns:
    \itemize{
      \item `VCG`: Indicator for selected pool units; `VCG == 1` marks the units selected for the VCG.
      \item `e_weights`: Energy weights used for selection.
      \item `<treated>_balanced`: A factor indicating balanced treated assignment.
    }
  \item A ggplot2 object showing the median and MAD differences before and after balancing,
        with a 95\% permutation ellipse approximating typical random deviations.
}
If `plot = FALSE`, returns only the modified data frame.
}
\description{
Performs energy-distance-based balancing: a subset of the pool is selected according to its energy distance so that the resulting comparison approximates a randomized controlled trial. Optionally, the balancing results are visualized.
}
\details{
If `random = FALSE`, the function selects the `n` pool units with the lowest energy distance and assigns them to the VCG group.
If `random = TRUE`, the function samples `n` units from the pool with sampling probability inversely proportional to energy distance.
The quality of covariate balancing is visualized using differences in medians and median absolute deviations (MADs).
Permutation ellipses are generated by randomly permuting the pool and treated group labels to estimate the variability expected by chance.
Only the extents along the x and y axes are computed directly; the rest of the ellipse is interpolated between them.
The ellipse is therefore intended as a visual approximation rather than a precise statistical test.
}
\examples{

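# Simulate 35 pool units (treated = 0) and 15 treated units (treated = 1)
set.seed(1)  # make the simulated covariates reproducible
dat <- data.frame(
  cov1    = rnorm(50, 10, 1),
  cov2    = rnorm(50, 7, 1),
  cov3    = rnorm(50, 5, 1),
  treated = rep(c(0, 1), c(35, 15))
)

# Select 5 pool units as the VCG (deterministic selection, with plot)
VCG_sampler(treated ~ cov1 + cov2 + cov3, data = dat, n = 5)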
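
# Illustrative sketch only, not the package's internal code: the two
# selection modes described in Details, applied to hypothetical distances.
# `d_pool` and `n_pick` are made-up names and values for this illustration.
d_pool <- c(0.8, 0.2, 1.5, 0.4, 0.9, 0.3)  # hypothetical energy distances
n_pick <- 3

# random = FALSE: take the n_pick pool units with the smallest distance
det_idx <- order(d_pool)[seq_len(n_pick)]
det_idx

# random = TRUE: draw n_pick units without replacement, using weights
# inversely proportional to distance (assumes all distances are > 0)
p <- (1 / d_pool) / sum(1 / d_pool)
rnd_idx <- sample(seq_along(d_pool), size = n_pick, prob = p)
rnd_idx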

}
