The goal of skewsamp is to provide access to sample size estimation methods for group comparisons where the underlying data are skewed and thus violate the assumptions for common methods of sample size estimation.

In particular, skewsamp offers an approach based on generalized linear models (GLM) as described by Cundill & Alexander (2015) and the “NECDF” (Noether Empirical Distribution Function) approach based on the nonparametric Wilxocon-Mann-Whitney test in the location shift paradigm as described by Chakraborti, Hong, & van de Wiel (2006).

Installation

Skewsamp is available directly from CRAN:

install.packages("skewsamp")

You can also install the latest development version directly from github:

# install.packages("devtools") # needed for the installation
devtools::install_github("https://github.com/jobrachem/skewsamp)

Documentation

All function are documented, so that you can use R’s builtin help system. You can also refer to the online documentation, which includes a list of all functions.

Simulation study

We verified the correctness of our implementation through extensive simulations. The data, code and final report are available on the Open Science Framework. Note that the report is written in german.

The simulations revealed that the GLM-based approach (Cundill & Alexander, 2015) works robustly. The nonparametric NECDF approach is dependent on pilot data and can provide significant underestimations of the required sample sizes. Please consult the report linked above for further details.

Usage

Example 1

Sample size determination in the GLM approach for gamma-distributed data:

library(skewsamp)
skewsamp::n_gamma(mean0 = 1, effect = 0.5, shape0 = 1, alpha = 0.05, power = 0.9)
#> Estimated sample size for group difference.
#> Generalized Regression, Gamma Distribution, link: log 
#> 
#> N (total)         87.48 
#> n0 (Group 0)      43.74 
#> n1 (Group 1)      43.74 
#> 
#> Effect size       0.5 
#> Effect type       1 - (mean1/mean0) 
#> Type I error      0.05 
#> Target power      0.9 
#> Two-sided         TRUE 
#> 
#> Call: skewsamp::n_gamma(mean0 = 1, effect = 0.5, shape0 = 1, alpha = 0.05, 
#>     power = 0.9)

Example 2

Sample size determination in the location shift approach. This approach requires pilot data, which we draw from an exponential distribution for the sake of the example:

library(skewsamp)
skewsamp::n_locshift(s1 = rexp(10), s2 = rexp(10), delta = 0.5, alpha = 0.05, power = 0.9)
#> Estimated sample size for group difference.
#> Wilcoxon-Mann-Whitney Test, Location shift 
#> 
#> N (total)         48.46 
#> n0 (Group 0)      24.23 
#> n1 (Group 1)      24.23 
#> 
#> Effect size       0.5 
#> Effect type       location shift 
#> Type I error      0.05 
#> Target power      0.9 
#> Two-sided         FALSE 
#> 
#> Call: skewsamp::n_locshift(s1 = rexp(10), s2 = rexp(10), delta = 0.5, 
#>     alpha = 0.05, power = 0.9)

References