Bootstrapping in R

Bootstrapping is a method for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. The method is especially useful when the sampling distribution of the estimator does not follow a standard distribution.
Bootstrapping can be used in the following scenarios:

Small sample sizes with unknown population distribution
When the assumption of normality does not hold
Skewed data
A non-linear combination of variables (e.g., a ratio)
A location statistic other than the mean (e.g., the median or a difference in medians)
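
The resampling idea described above can be sketched in base R without any package. This is a minimal illustration only: the vector x holds hypothetical values (not the data used in this article), and the names boot_medians and boot_se are made up for the example.

```r
# Hypothetical sample of 10 values (an assumption, not the article's data)
x <- c(0.3, 0.5, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 3.6)

set.seed(123)   # make the resampling reproducible
R <- 1000       # number of bootstrap replicates

# Each replicate: draw n values with replacement, then compute the median
boot_medians <- replicate(
  R,
  median(sample(x, size = length(x), replace = TRUE))
)

# The spread of the replicates estimates the standard error of the median
boot_se <- sd(boot_medians)
boot_se
```

The 1000 values in boot_medians approximate the sampling distribution of the median, which is what the boot package automates.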

Let us see how to perform bootstrapping in R.

The objective is to generate the sampling distribution of the sample median. The original sample contains 10 values and has been imported into R as bootdata.

bootdata

R provides the function boot() in the package boot to generate bootstrap replicates of a statistic applied to data. This function supports both parametric and nonparametric resampling.

install.packages("boot")
library(boot)

The function boot() computes a statistic (in our case, the median) on a specified number of resamples (say 1000). The statistic is defined by a user-written function. The following function f accepts the data and a vector of indices i, which boot() draws at random with replacement, and calculates the median of the resampled data.

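f <- function(data, i) {
  # i holds the row indices of one bootstrap resample
  # data[i, ] assumes bootdata has rows (e.g. a one-column data frame);
  # for a plain vector, use data[i] instead
  d <- data[i, ]
  med <- median(d)
  return(med)
}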
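
To see what the index argument does, the function can be called by hand. The sketch below assumes a hypothetical one-column data frame, sample_df, standing in for bootdata; the values are invented for illustration.

```r
# Hypothetical one-column data frame standing in for bootdata
sample_df <- data.frame(
  x = c(0.3, 0.5, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 3.6)
)

# Same statistic function as in the article
f <- function(data, i) {
  d <- data[i, ]   # one-column data frame: this yields a numeric vector
  median(d)
}

n <- nrow(sample_df)

# Indices 1..n reproduce the original sample, giving the observed median
observed <- f(sample_df, 1:n)

# One bootstrap resample: n indices drawn with replacement
set.seed(1)
idx <- sample(n, replace = TRUE)
resampled <- f(sample_df, idx)

observed
resampled
```

Passing 1:n returns the statistic on the original sample, while a vector with repeated indices returns the statistic for one resample.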

boot() then evaluates this function on 1000 resamples.

bootobject <- boot(data = bootdata, statistic = f, R = 1000)

data= is the original sample (a vector, matrix or data frame).
statistic= is a function which, when applied to the data, returns a vector containing the statistic(s) of interest.
R= is the number of bootstrap replicates. Generally, 1000 or more replicates are generated to approximate the sampling distribution.

Note that boot() calls f once per replicate, passing the original data and a freshly drawn vector of indices each time.
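
The object returned by boot() stores both the observed statistic and all replicates. As a rough sketch, again using a hypothetical data frame sample_df in place of bootdata, the components t0 and t can be inspected directly:

```r
library(boot)

# Hypothetical one-column data frame standing in for bootdata
sample_df <- data.frame(
  x = c(0.3, 0.5, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 3.6)
)

f <- function(data, i) median(data[i, ])

set.seed(42)   # make the resampling reproducible
b <- boot(data = sample_df, statistic = f, R = 1000)

b$t0                          # statistic computed on the original sample
dim(b$t)                      # one row per replicate, one column per statistic
boot_bias <- mean(b$t) - b$t0 # corresponds to the "bias" column when printed
boot_se   <- sd(b$t)          # corresponds to the "std. error" column
hist(b$t, main = "Bootstrap medians", xlab = "median")
```

The histogram of b$t is the bootstrap approximation of the sampling distribution of the median.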

bootobject


ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = bootdata, statistic = f, R = 1000)


Bootstrap Statistics :
    original  bias    std. error
t1*      1.1 0.05645   0.3337398

This bootstrap object can be used for further analysis. For instance, suppose we want a 95% confidence interval for the median:

boot.ci(bootobject, type = "perc")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = bootobject, type = "perc")

Intervals : 
Level     Percentile     
95%   ( 0.45,  1.70 )  
Calculations and Intervals on Original Scale
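
The percentile interval is essentially the 2.5% and 97.5% quantiles of the bootstrap replicates. The following sketch compares the two on a hypothetical data frame sample_df (not the article's data); boot.ci() uses its own interpolation between order statistics, so the results may differ slightly from quantile().

```r
library(boot)

# Hypothetical one-column data frame standing in for bootdata
sample_df <- data.frame(
  x = c(0.3, 0.5, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 3.6)
)

f <- function(data, i) median(data[i, ])

set.seed(42)
b <- boot(data = sample_df, statistic = f, R = 1000)

# Percentile interval from boot.ci(); columns 4 and 5 hold the endpoints
ci <- boot.ci(b, conf = 0.95, type = "perc")
perc_ci <- ci$percent[4:5]

# Direct quantiles of the replicates for comparison
manual_ci <- quantile(b$t, probs = c(0.025, 0.975))

perc_ci
manual_ci
```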

boot.ci() generates five types of equi-tailed two-sided nonparametric confidence intervals.
Types include:

Intervals using the normal approximation ("norm")
Basic bootstrap method ("basic")
Studentized intervals ("stud")
Bootstrap percentile method ("perc")
Adjusted bootstrap percentile (BCa) method ("bca")
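
Several interval types can be requested in one call by passing a character vector to type. The sketch below uses a hypothetical data frame sample_df and asks for three of the five types; "bca" can be requested the same way, and "stud" is left out because studentized intervals additionally need a variance estimate for each replicate, which the median function used here does not return.

```r
library(boot)

# Hypothetical one-column data frame standing in for bootdata
sample_df <- data.frame(
  x = c(0.3, 0.5, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 3.6)
)

f <- function(data, i) median(data[i, ])

set.seed(42)
b <- boot(data = sample_df, statistic = f, R = 1000)

# Request three interval types at once
ci <- boot.ci(b, conf = 0.95, type = c("norm", "basic", "perc"))

ci$normal[2:3]    # normal approximation: lower, upper
ci$basic[4:5]     # basic bootstrap: lower, upper
ci$percent[4:5]   # percentile: lower, upper
```

Comparing the endpoints across types gives a quick sense of how sensitive the interval is to the method, which matters for small, skewed samples.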