A Python package for multivariate sampling and resampling techniques for simulation, random sample generation, estimation, and experimental design
Author: Siavash Tabrizian - [email protected]
Several sampling techniques can be used to generate a sample. This package lets the user generate samples with three methods:
1 - Crude Monte Carlo sampling (Simple random sampling/SRS):
The unbiased sample mean estimator
$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$
can be used in this case to estimate the population mean $\mu$. In this sampling technique, to obtain $n$ observations, $n$ vectors of random numbers are first generated from $U(0,1)^R$, where $R$ is the number of random variables. Each random number is then mapped through the CDF of the corresponding random variable to obtain a value from its distribution.
Sampling steps for generating observations:
For i = 1, ..., n:
    For j = 1, ..., R:
        1. build the cumulative distribution of the jth random variable (CDF)
        2. draw a random number from [0,1] interval = r
        3. find the value of the random variable for r using the CDF of jth random variable
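The steps above can be sketched in a few lines of standard-library Python. This is only an illustration, not the package's own implementation: the function names `build_cdf`, `inverse_cdf`, and `crude_monte_carlo` are hypothetical, and each discrete random variable is assumed to be given as a pair `[values, probabilities]`, the format used in the experiment at the end of this README.

```python
import bisect
import itertools
import random


def build_cdf(probs):
    """Step 1: cumulative sums of the probabilities."""
    return list(itertools.accumulate(probs))


def inverse_cdf(values, cdf, r):
    """Step 3: first value whose cumulative probability is at least r."""
    idx = bisect.bisect_left(cdf, r)
    # Guard against round-off when the last cumulative sum is slightly below 1.
    return values[min(idx, len(values) - 1)]


def crude_monte_carlo(rvs, n, rng=None):
    """Draw n observations; each observation holds one value per random variable."""
    rng = random.Random() if rng is None else rng
    cdfs = [build_cdf(probs) for _, probs in rvs]
    sample = []
    for _ in range(n):
        obs = []
        for (values, _), cdf in zip(rvs, cdfs):
            r = rng.random()  # step 2: r drawn uniformly from [0, 1)
            obs.append(inverse_cdf(values, cdf, r))
        sample.append(obs)
    return sample
```

For example, `crude_monte_carlo([[[0.0, 1.0, 2.0], [0.1, 0.4, 0.5]]], 5)` returns five one-element observations whose values come from `{0.0, 1.0, 2.0}`.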
======================================
2 - Antithetic Sampling
In this sampling technique, to obtain $n$ observations, $n/2$ vectors of random numbers are first generated from $U(0,1)^R$, where $R$ is the number of random variables. Using the CDF, two values are then taken from the distribution for each random number $r$: one using $r$ and one using $1-r$.
Sampling steps:
For i = 1, ..., n/2:
    For j = 1, ..., R:
        1. build the cumulative distribution of the jth random variable (CDF)
        2. draw a random number from [0,1] interval = r
        3. find the value of the random variable for r using the CDF of jth random variable
        4. find the value of the random variable for 1-r using the CDF of jth random variable
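A minimal sketch of the antithetic steps above, again with standard-library Python only. The function name `antithetic_sample` is hypothetical, the `[values, probabilities]` input format is an assumption, and requiring an even `n` is a simplification made for this example.

```python
import bisect
import itertools
import random


def antithetic_sample(rvs, n, rng=None):
    """Draw n observations (n even) as n/2 antithetic pairs."""
    if n % 2:
        raise ValueError("n must be even")
    rng = random.Random() if rng is None else rng
    cdfs = [list(itertools.accumulate(probs)) for _, probs in rvs]

    def invert(j, r):
        # Value of the jth random variable at cumulative probability r.
        values = rvs[j][0]
        idx = bisect.bisect_left(cdfs[j], r)
        return values[min(idx, len(values) - 1)]

    sample = []
    for _ in range(n // 2):
        rs = [rng.random() for _ in rvs]  # one uniform number per variable
        sample.append([invert(j, r) for j, r in enumerate(rs)])        # uses r
        sample.append([invert(j, 1.0 - r) for j, r in enumerate(rs)])  # uses 1-r
    return sample
```

The two observations in each pair are negatively correlated when the values of a variable are listed in increasing order, which is what can reduce the variance of the sample mean compared with independent draws.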
======================================
3 - Latin Hypercube Sampling (LHS)
In this sampling technique, to obtain $n$ observations, the probability range of each random variable is first stratified into $n$ intervals. A random permutation of the intervals is then generated for each random variable. Taken together, the permutations define $n$ hypercubes in the sample space, and one random observation is drawn from each hypercube.
Sampling steps:
Generate $R$ random permutations of \{1,...,n\}, one per random variable = p^j (j = 1, ..., R)
For i = 1, ..., n:
    For j = 1, ..., R:
        1. build the cumulative distribution of the jth random variable (CDF)
        2. draw a random number from the interval [(p^j_i - 1)/n, p^j_i/n] = r
        3. find the value of the random variable for r using the CDF of jth random variable
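The LHS steps can be sketched as follows. This is an illustrative version under stated assumptions, not the package's code: the name `latin_hypercube_sample` is hypothetical, the strata are the `n` equal-width subintervals of [0, 1) indexed from 0, and the input format is the `[values, probabilities]` pair used elsewhere in this README.

```python
import bisect
import itertools
import random


def latin_hypercube_sample(rvs, n, rng=None):
    """Draw n observations so that each variable uses every stratum exactly once."""
    rng = random.Random() if rng is None else rng
    cdfs = [list(itertools.accumulate(probs)) for _, probs in rvs]

    # One independent random permutation of the n strata per random variable.
    perms = []
    for _ in rvs:
        perm = list(range(n))
        rng.shuffle(perm)
        perms.append(perm)

    sample = []
    for i in range(n):
        obs = []
        for j, (values, _) in enumerate(rvs):
            k = perms[j][i]
            r = (k + rng.random()) / n  # uniform draw inside [k/n, (k+1)/n)
            idx = bisect.bisect_left(cdfs[j], r)
            obs.append(values[min(idx, len(values) - 1)])
        sample.append(obs)
    return sample
```

Because every stratum is visited once per variable, a fair two-point variable sampled with `n = 10` yields five observations of each value (up to draws that land exactly on a stratum boundary), instead of a random split.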
This section describes the second class of the sampling module, which covers simulation and resampling:
1 - Monte Carlo simulation:
The simulation runs a number of replications, and in each replication a sample is generated using one of the techniques from the previous section. The final estimate is the sample mean of the estimates obtained across the replications.
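As a rough illustration of the replication scheme, the sketch below averages per-replication sample means. The name `monte_carlo_estimate` and its parameters are hypothetical. `sampler` stands for any function that returns a list of `n` observations, for example one of the three techniques above.

```python
import statistics


def monte_carlo_estimate(sampler, evalfunc, n, replications):
    """Return the mean of the per-replication estimates and the estimates themselves."""
    estimates = []
    for _ in range(replications):
        sample = sampler(n)  # fresh sample of n observations
        outputs = [evalfunc(obs) for obs in sample]
        estimates.append(statistics.fmean(outputs))  # estimate of this replication
    return statistics.fmean(estimates), estimates
```

The spread of `estimates` across replications also gives a simple view of how variable a single-replication estimate is.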
2 - Bootstrapping:
In this resampling technique, a number of smaller samples are generated from a given larger sample. The estimate can then be obtained using the sample mean estimator.
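A minimal bootstrap sketch, assuming the resamples are drawn with replacement (the usual bootstrap convention, which this README does not state explicitly). The name `bootstrap_estimate` and its parameters are hypothetical.

```python
import random
import statistics


def bootstrap_estimate(values, n_resamples, resample_size, rng=None):
    """Mean of the resample means, plus the individual resample means."""
    rng = random.Random() if rng is None else rng
    means = []
    for _ in range(n_resamples):
        # Draw resample_size observations with replacement from the given sample.
        resample = rng.choices(values, k=resample_size)
        means.append(statistics.fmean(resample))
    return statistics.fmean(means), means
```

Here `values` is a list of already evaluated outputs, for instance `[evalfunc(obs) for obs in sample]`.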
3 - Jackknife:
This is another resampling technique for generating a set of smaller samples from a given larger sample. In this method, $n$ samples are generated from a sample of size $n$. In sample $i$, observation $i$ is removed from the original sample, which leads to $n$ samples of size $n-1$. As with bootstrapping, the estimate can be obtained using the sample mean estimator.
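The leave-one-out construction can be sketched as follows. The name `jackknife_estimate` is hypothetical, and `values` is assumed to be a list of already evaluated outputs.

```python
import statistics


def jackknife_estimate(values):
    """Mean of the n leave-one-out sample means, plus those means."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    loo_means = []
    for i in range(n):
        reduced = values[:i] + values[i + 1:]  # sample i: observation i removed
        loo_means.append(statistics.fmean(reduced))
    return statistics.fmean(loo_means), loo_means
```

For the sample mean, the average of the leave-one-out means equals the mean of the original sample, so the individual leave-one-out means are mainly useful for judging the variability of the estimator.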
The experiment folder contains some experiments that use this package. exp1 has 4 random variables, each given as a list of values and a list of the corresponding probabilities, and its evaluation function is defined in evalfunc. It can be used to test the law of large numbers as the sample size increases.
var1 = [[0.0,1.0,2.0],[0.1,0.4,0.5]]
var2 = [[1.5,2.5,3.5,4.5,8.0],[0.05,0.05,0.2,0.4,0.3]]
var3 = [[0.1,7.0],[0.05,0.95]]
var4 = [[0.0,0.05,0.07,0.9,0.4],[0.2,0.2,0.5,0.05,0.05]]
RVs = [var1, var2, var3, var4]
def evalfunc(obs):
    out = 0.0
    out += 10*obs[0]*obs[1]
    out += 100*obs[2]
    out += (10*obs[3])**2
    return out
dist = ProbDist(RVs, evalfunc) #instance of the distribution with its evaluation function
sampl = samp_gen(dist)
resampl = resampling(sampl)
visual = visualsamp(resampl,'Res/')
visual.lawlargevs()
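To see what the law of large numbers predicts for this experiment without relying on the package, the standalone sketch below computes the exact mean of `evalfunc` by enumerating all value combinations and compares it with a running sample mean. It assumes the four random variables are independent, which this README does not state. The helper names `exact_mean` and `running_means` are hypothetical and not part of the package.

```python
import itertools
import random

var1 = [[0.0, 1.0, 2.0], [0.1, 0.4, 0.5]]
var2 = [[1.5, 2.5, 3.5, 4.5, 8.0], [0.05, 0.05, 0.2, 0.4, 0.3]]
var3 = [[0.1, 7.0], [0.05, 0.95]]
var4 = [[0.0, 0.05, 0.07, 0.9, 0.4], [0.2, 0.2, 0.5, 0.05, 0.05]]
RVs = [var1, var2, var3, var4]


def evalfunc(obs):
    return 10 * obs[0] * obs[1] + 100 * obs[2] + (10 * obs[3]) ** 2


def exact_mean(rvs, func):
    """Expected value of func, assuming the random variables are independent."""
    total = 0.0
    for combo in itertools.product(*(zip(vals, probs) for vals, probs in rvs)):
        prob = 1.0
        for _, p in combo:
            prob *= p  # joint probability of this combination
        total += prob * func([v for v, _ in combo])
    return total


def running_means(rvs, func, n, rng):
    """Sample mean of func after 1, 2, ..., n independent observations."""
    total, means = 0.0, []
    for i in range(1, n + 1):
        obs = [rng.choices(vals, weights=probs)[0] for vals, probs in rvs]
        total += func(obs)
        means.append(total / i)
    return means


if __name__ == "__main__":
    target = exact_mean(RVs, evalfunc)
    means = running_means(RVs, evalfunc, 20000, random.Random(0))
    for n in (10, 100, 1000, 20000):
        print(n, means[n - 1], abs(means[n - 1] - target))
```

The printed gap between the running mean and the exact mean should tend to shrink as the sample size grows, although individual runs fluctuate.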