[Computational Statistics] 9. Random Number Generation

Computational Statistics

Table of Contents

  1. Computer Arithmetic
  2. LU decomposition
  3. Cholesky Decomposition
  4. QR Decomposition
  5. Iterative Methods
  6. Non-linear Equations
  7. Optimization
  8. Numerical Integration
  9. Random Number Generation
  10. Monte Carlo Integration


sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] scatterplot3d_0.3-41

loaded via a namespace (and not attached):
 [1] fansi_0.5.0       digest_0.6.27     utf8_1.2.1        crayon_1.4.1     
 [5] IRdisplay_1.0     repr_1.1.3        lifecycle_1.0.0   jsonlite_1.7.2   
 [9] evaluate_0.14     pillar_1.6.1      rlang_0.4.11      uuid_0.1-4       
[13] vctrs_0.3.8       ellipsis_0.3.2    IRkernel_1.1      tools_3.6.3      
[17] compiler_3.6.3    base64enc_0.1-3   pbdZMQ_0.3-5      htmltools_0.5.1.1
# install.packages("ggplot2")

library(scatterplot3d)
library(ggplot2)
Registered S3 methods overwritten by 'tibble':
  method     from  
  format.tbl pillar
  print.tbl  pillar

Registered S3 methods overwritten by 'ggplot2':
  method         from 
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang

Random number generation

Uniform random number generation

Goal

To generate $U_i \stackrel{iid}{\sim} \text{unif}(0, 1)$.

Congruential generator

\begin{align}
x_{i+1} &= a x_i \mod m, \quad i = 1, 2, \dotsc \\
u_i &= x_i / m.
\end{align}
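As a minimal sketch, the recursion above translates directly into R; the multiplier a, modulus m, and seed x0 are left to the caller:

# multiplicative congruential generator: x_{i+1} = a * x_i mod m, u_i = x_i / m
lcg <- function(n, a, m, x0) {
    x <- numeric(n)
    x[1] <- x0
    for (i in seq_len(n - 1)) {
        x[i + 1] <- (a * x[i]) %% m  # parentheses needed: %% binds tighter than * in R
    }
    x / m  # scale to [0, 1)
}
lcg(5, a = 7^5, m = 2^31 - 1, x0 = 1)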

set.seed(2020)  # seeds the generator state; does not mean x_0 = 2020
runif(5)
set.seed(2020)  # same seed results in same "random" sequence
runif(5)

[1] 0.6469028 0.3942258 0.6185018 0.4768911 0.1360972

[1] 0.6469028 0.3942258 0.6185018 0.4768911 0.1360972

Good RNGs should have long periods and should produce samples that appear to be drawn from a uniform distribution. If the sample size is much smaller than the period of the RNG, the sample should appear random.
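As a quick sanity check (one test, not a certification of RNG quality), we can compare a runif() sample against the $\text{unif}(0, 1)$ CDF with the Kolmogorov-Smirnov test from base R:

set.seed(2020)
ks.test(runif(1000), "punif")  # a large p-value gives no evidence against uniformity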

Autocorrelation

# IBM System/360 RANDU generator
# a = 2^16 + 3 = 65539
# m = 2^31
n <- 2000
a <- 65539
m <- 2^31
u <- vector(length=n)
u[1] <- 1
for (i in seq_len(n-1)) {
    u[i+1] <- (a * u[i]) %% m  # parenthesize: %% binds tighter than * in R
}
scatterplot3d(u[1:(n-2)], u[2:(n-1)], u[3:n], angle=160)


# Early MATLAB RNG
# a = 7^5
# m = 2^31 - 1
n <- 2000
a2 <- 7^5
m2 <- 2^31 - 1
v <- vector(length=n)
v[1] <- 1
for (i in seq_len(n-1)) {
    v[i+1] <- (a2 * v[i]) %% m2  # parenthesize: %% binds tighter than * in R
}
scatterplot3d(v[1:(n-2)], v[2:(n-1)], v[3:n], angle=20)


A simple modification is to introduce shuffling in the sequence, which we won’t cover in detail.

R’s RNG

RNGkind()

[1] "Mersenne-Twister" "Inversion"        "Rejection"

Transformation methods

From now on, we assume that the problem of generating uniform random numbers has been solved for practical purposes.

Inverse CDF method

For a random variable $X$, let $F$ be its cumulative distribution function (CDF), that is, $F(x) = P(X \le x)$. Recall that $F$ is right-continuous and nondecreasing. Also, if $F$ is strictly increasing, the random variable $F(X)$ is uniformly distributed on $[0, 1]$. Below, we generalize this result.

The inverse CDF of $X$ is defined as \(F^{-1}(u) = \inf\{x: F(x) \ge u\},\) which coincides with the usual inverse of $F$ if $F$ is strictly increasing.

Proposition 1. Let $X$ be a random variable with CDF $F$. Then the following holds.

  1. If $F$ is continuous, then $U = F(X)$ is a uniform random variable on $[0, 1]$.
  2. If $F$ is not continuous, then $P[F(X) \le y] \le y$ for all $y \in [0, 1]$.
  3. If $U$ is uniform on $[0, 1]$, then $F^{-1}(U)$ has CDF $F$.

Proof (of part 1).

Let $u \in (0, 1)$. By continuity of $F$, there is $y$ such that $F(y) = u$; take $y = \sup\{x: F(x) = u\}$, which is attained since this set is closed. If $X \le y$ then $F(X) \le F(y) = u$; conversely, if $X > y$ then $F(X) \ge u$, and $F(X) = u$ would contradict the choice of $y$. Hence \(P[F(X) \le u] = P[X \le y] = F(y) = u.\)

Exponential distribution

For $\text{Exp}(1)$, the CDF is $F(x) = 1 - e^{-x}$, so $F^{-1}(u) = -\log(1 - u)$; since $1 - U \sim \text{unif}(0, 1)$ whenever $U$ is, we may take $X = -\log U$.

# Exp(1)
n <- 1000
u <- runif(n)
x <- -log(u)
expdata <- data.frame(x)
plt <- ggplot(expdata, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


Cauchy distribution

From the density of the Cauchy distribution \(f(x) = \frac{\beta}{\pi(\beta^2 + (x-\mu)^2)},\) its CDF is $F(x) = \frac{1}{2} + \frac{1}{\pi}\tan^{-1}\left(\frac{x-\mu}{\beta}\right)$. Thus \(F^{-1}(u) = \mu + \beta\tan(\pi u - \pi/2) = \mu - \frac{\beta}{\tan(\pi u)} .\)

# standard Cauchy (beta=1, mu=0)
n <- 1000
u <- runif(n)
x <- -1/tan(pi * u)
#hist(x, breaks=40)
cauchydata <- data.frame(x)
plt <- ggplot(cauchydata, aes(x=x)) +  
    geom_histogram(binwidth=10, fill="white", color="black") + 
    geom_density(aes(y=10 * ..count..), color="purple") + 
    xlim(-200, 200)
print(plt)
Warning message:
“Removed 4 rows containing non-finite values (stat_bin).”
Warning message:
“Removed 4 rows containing non-finite values (stat_density).”
Warning message:
“Removed 2 rows containing missing values (geom_bar).”


Discrete uniform distribution

$X \sim \text{unif}(\{1, 2, \dotsc, k\})$. It is easy to verify that $F(x) = \frac{1}{k}\lfloor x \rfloor$ for $x \in [0, k]$ and $F^{-1}(u) = \lceil ku \rceil$.

k <- 10
n <- 1000
u <- runif(n)
x <- ceiling(k * u)
table(x)
x
  1   2   3   4   5   6   7   8   9  10 
108 108  92  93  97  85  94 118 108  97 

Geometric distribution

If $X \sim \text{geom}(p)$, then its probability mass function is $p(x) = (1-p)^{x-1}p$ for $x = 1, 2, \dotsc$.

For $Y \sim \text{Exp}(\lambda)$,
\begin{align}
P(\lceil Y \rceil = k) &= P(k-1 < Y \le k) = F_Y(k) - F_Y(k-1) \\
&= (1 - e^{-\lambda k}) - (1 - e^{-\lambda(k-1)}) = e^{-\lambda(k-1)}(1 - e^{-\lambda}) \\
&= (1 - p)^{k-1} p
\end{align}
if $\lambda$ satisfies $p = 1 - e^{-\lambda}$, i.e., $\lambda = -\log(1-p)$.

For this $\lambda$, $X = \lceil Y \rceil = \lceil -\frac{1}{\lambda}\log U \rceil = \lceil \frac{\log U}{\log(1-p)}\rceil$.

gengeom <- function(p, nsamp=1) {
    u <- runif(nsamp)
    y <- log(u) / log(1 - p)
    ceiling(y)
}    
nsamp <- 1000
p <- 0.3
x <- gengeom(p, nsamp)
geomdata <- data.frame(x)
plt <- ggplot(geomdata, aes(x=x)) + geom_histogram(binwidth=0.25)
print(plt)


Normal random numbers

For $X \sim N(0, 1)$, inverse CDF $\Phi^{-1}$ does not have a closed form.

Box-Muller

Generates $X, Y \stackrel{iid}{\sim} N(0, 1)$.

The method transforms the random Cartesian coordinates $(X, Y)$ to polar coordinates $(R, \Theta)$. Since $(X, Y) = (R\cos\Theta, R\sin\Theta)$, \(\iint f_{XY}(x, y)\,dx\,dy = \iint \frac{1}{2\pi}\exp\left(-\frac{x^2 + y^2}{2}\right)dx\,dy = \iint \frac{1}{2\pi}\exp\left(-\frac{r^2}{2}\right)r\,dr\,d\theta.\)

Hence $R$ has density $f_R(r) = r\exp(-\frac{r^2}{2})$ and $\Theta$ is uniform on $[0, 2\pi]$, independent of $R$. Since \(P(R > \rho) = P(R^2 > \rho^2) = \int_\rho^{\infty} r\exp(-\frac{r^2}{2})dr = \exp(-\frac{1}{2}\rho^2),\) random variable $R^2$ is exponentially distributed with $\lambda = 1/2$.

Thus for independent $U, V \sim \text{unif}(0, 1)$, set \(R = (-2\log U)^{1/2}, \quad \Theta = 2\pi V .\) Then $(X, Y) = (R\cos\Theta, R\sin\Theta)$ are independently $N(0, 1)$.

boxmuller <- function(nsamp) {
    n <- ceiling(nsamp / 2)
    u <- runif(n)
    v <- runif(n)
    r <- sqrt(-2 * log(u))
    theta <- 2 * pi * v
    x <- r * cos(theta)
    y <- r * sin(theta)
    samp <- c(x, y)
    samp[seq_len(nsamp)]
}
#hist(c(x, y))
n <- 1000
normdata1 <- data.frame(x = boxmuller(n))
plt <- ggplot(normdata1, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


Marsaglia

Marsaglia's polar method avoids trigonometric function calls: sample $(U, V)$ uniformly from the unit disk by rejection from $[-1, 1]^2$, and set $\cos\Theta = U/\tau$, $\sin\Theta = V/\tau$ with $\tau = \sqrt{U^2 + V^2}$. Since $\tau^2 \sim \text{unif}(0, 1)$, we may take $R = \sqrt{-2\log \tau^2} = \sqrt{-4\log\tau}$.

marsaglia <- function(nsamp) {
    n <- ceiling(nsamp / 2)
    it <- 0
    x <- numeric(n)
    y <- numeric(n)
    while (it < n) {
        u <- 2 * runif(1) - 1
        v <- 2 * runif(1) - 1
        tau <- sqrt(u^2 + v^2)
        if (tau > 1 || tau == 0) next  # reject points outside the unit disk (or at the origin)
        it <- it + 1                   # increment first: R vectors are 1-indexed
        x[it] <- sqrt(-4 * log(tau)) * u / tau
        y[it] <- sqrt(-4 * log(tau)) * v / tau
    }
    samp <- c(x, y)
    samp[seq_len(nsamp)]    
}
n <- 1000
normdata2 <- data.frame(x = marsaglia(n))
plt <- ggplot(normdata2, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


Random numbers by definition

Bernoulli

  1. Set the success probability $p$.
  2. Generate $U \sim \text{unif}(0, 1)$.
  3. Let $X = \mathbb{I}_{\{U \le p\}}$.
  4. Then $X \sim \text{Ber}(p)$.
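A minimal sketch of this recipe (the helper name genber is introduced here for illustration):

genber <- function(nsamp, p) {
    u <- runif(nsamp)
    as.integer(u <= p)  # indicator of the event {U <= p}
}
table(genber(1000, 0.3))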

Binomial

  1. Set the success probability $p$
  2. Generate $n$ independent $X_i \sim \text{Ber}(p)$.
  3. Let $X_n = \sum_{i=1}^n X_i$.
  4. Then $X_n \sim B(n, p)$.

Negative binomial

  1. Set the success probability $p$
  2. Generate $r$ independent $X_i \sim \text{Geom}(p)$.
  3. Let $X_r = \sum_{i=1}^r X_i$.
  4. Then $X_r \sim \text{NegBin}(r, p)$.

Poisson

  1. Generate $U_i \stackrel{iid}{\sim} \text{unif}(0, 1)$.
  2. Find $N$ such that $\prod_{i=1}^N U_i \ge e^{-\lambda} > \prod_{i=1}^{N+1} U_i$.
  3. Then $N \sim \text{Poi}(\lambda)$.
## Binomial random number generation
genbin <- function(n, p) {
    u <- runif(n)
    sum(u < p)  # count of successes among n Bernoulli(p) trials
}
n <- 10; p <- 0.6
nsamp <- 1000
x <- numeric(nsamp)
for (i in seq_len(nsamp)) {
    x[i] <- genbin(n, p)
}
bindata <- data.frame(x)
plt <- ggplot(bindata, aes(x=x)) + geom_histogram(binwidth=0.25)
print(plt)


# Negative binomial random number generation
gengeom <- function(p, nsamp=1) {
    u <- runif(nsamp)
    y <- log(u) / log(1 - p)
    ceiling(y)
}    
nsamp <- 1000
p <- 0.6
r <- 5
x <- numeric(nsamp)
for (i in seq_len(r)) {
    x <- x + gengeom(p, nsamp)
}
negbindata <- data.frame(x)
plt <- ggplot(negbindata, aes(x=x)) + geom_histogram(binwidth=0.25)
print(plt)


# Poisson random number generation
genpoi <- function(lam, maxiter=1000) {
    u_cum <- 1.0
    k <- 0
    while (u_cum > exp(-lam) && k < maxiter) {
        u <- runif(1)
        u_cum <- u_cum * u
        k <- k + 1
    }
    k - 1  # N satisfies prod_{i=1}^{N} U_i >= exp(-lam) > prod_{i=1}^{N+1} U_i
}
lam <- 3  # Poisson rate
nsamp <- 1000
x <- numeric(nsamp)
for (i in seq_len(nsamp)) {
    x[i] <- genpoi(lam)
}
poidata <- data.frame(x)
plt <- ggplot(poidata, aes(x=x)) + geom_histogram(binwidth=0.25)
print(plt)


Chi-square

  1. Generate $Z_1, \dotsc, Z_{\nu} \stackrel{iid}{\sim} N(0, 1)$.
  2. Let $X_{\nu} = \sum_{i=1}^{\nu} Z_i^2$.
  3. Then $X_{\nu} \sim \chi^2(\nu)$.

Alternatively, for even $\nu$:

  1. Generate $U_i \stackrel{iid}{\sim} \text{unif}(0, 1)$.
  2. Let $X_{\nu} = -2\log(\prod_{i=1}^{\nu/2} U_i)$.
  3. Then $X_{\nu} \sim \chi^2(\nu)$.

This is because $\chi^2(\nu) = \text{Gamma}(\nu/2, 2)$, where 2 is the scale parameter.

Student’s $t$

  1. Generate $Z \sim N(0, 1)$ and $X \sim \chi^2(\nu)$ independently.
  2. Let $T = Z / \sqrt{X/\nu}$.
  3. Then $T \sim t(\nu)$.

$F$

  1. Generate $X_1 \sim \chi^2(\nu_1)$ and $X_2 \sim \chi^2(\nu_2)$ independently.
  2. Let \(F = \frac{X_1/\nu_1}{X_2/\nu_2}.\)
  3. Then $F \sim F(\nu_1, \nu_2)$.
## chi-square random number generation
genchisq1 <- function(nsamp, nu) {
    z <- matrix(rnorm(nsamp * nu), nrow=nsamp)
    rowSums(z^2)
}
nu <- 6
n <- 1000
chisqdata1 <- data.frame(x = genchisq1(n, nu))
plt <- ggplot(chisqdata1, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


## chi-square random number generation 2 (requires even nu)
genchisq2 <- function(nsamp, nu) {
    stopifnot(nu %% 2 == 0)  # this method only works for even degrees of freedom
    u <- matrix(runif(nsamp * nu / 2), nrow=nsamp)
    -2 * log(apply(u, 1, prod))
}
nu <- 6
n <- 1000
chisqdata2 <- data.frame(x = genchisq2(n, nu))
plt <- ggplot(chisqdata2, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


## Student's t random number generation
gent <- function(nsamp, nu) {
    z <- rnorm(nsamp)
    chisq <- genchisq1(nsamp, nu)
    z / sqrt(chisq / nu)  # T = Z / sqrt(X / nu)
}
nu <- 6
n <- 1000
tdata <- data.frame(x = gent(n, nu))
plt <- ggplot(tdata, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


# F random number generation
genF <- function(nsamp, nu1, nu2) {
    chisq1 <- genchisq1(nsamp, nu1)
    chisq2 <- genchisq1(nsamp, nu2)
    (chisq1 / nu1) / (chisq2 / nu2)  # ratio of independent scaled chi-squares
}
nu1 <- 10; nu2 <- 6
n <- 1000
Fdata <- data.frame(x = genF(n, nu1, nu2))
plt <- ggplot(Fdata, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)


Acceptance-rejection sampling (or just rejection sampling)

Suppose we want to sample from a distribution with a complicated density $f(x)$. The methods above are hard to apply, since neither the CDF nor its inverse is analytically available.

John von Neumann, while working on the Manhattan Project, suggested the following (see the sketch after this list):

  1. Find the “envelope” of $f$, i.e., a simple density $g(x)$ such that \(f(x) \le c g(x) =: h(x)\) for all $x$, for some constant $c > 0$.

  2. Sample a random variable $X$ distributed according to $g$.

  3. Accept $X$ as a representative of $f$ if
    • $U \le \displaystyle\frac{f(X)}{h(X)}$,
    • where $U$ is a uniform random variable on $[0, 1]$ drawn independently.
  4. Reject $X$ otherwise and go to Step 2.
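A generic sketch of this loop, with the target density f, the envelope density g, a sampler rg from g, and the constant cc (named to avoid masking R's c()) all supplied by the caller:

ar_sample <- function(nsamp, f, g, rg, cc) {
    out <- numeric(nsamp)
    i <- 0
    while (i < nsamp) {
        x <- rg(1)     # Step 2: draw a candidate from g
        u <- runif(1)  # independent uniform for the acceptance test
        if (u <= f(x) / (cc * g(x))) {  # Step 3: accept with probability f(x)/h(x)
            i <- i + 1
            out[i] <- x
        }              # Step 4: otherwise try again
    }
    out
}
# e.g., half-normal target with Exp(1) envelope and c = sqrt(2e/pi):
# ar_sample(1000, f = function(x) sqrt(2/pi) * exp(-x^2/2),
#           g = dexp, rg = rexp, cc = sqrt(2 * exp(1) / pi))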

Why does AR sampling work?

Definition (Body of a function). For a nonnegative, integrable function $f$, its body is defined and denoted by \(B_f = \{(x, y): 0 \le y \le f(x) \}.\)

Theorem 1. Suppose random variable $X$ has density $g$, $U\sim\text{unif}[0, 1]$ is independent of $X$, and there exists $c > 0$ such that $f(x) \le c g(x)$ for all $x$. Then, the random vector $(X, cg(X)U)$ is uniformly distributed over the body $B_{cg}$ of $cg$.

Proof of Theorem 1

We want to show that the joint density of $(X, cg(X)U)$ is \(\frac{1}{|B_{cg}|}\mathbb{I}_{B_{cg}}(x, y).\)

Let $Y = cg(X)U$. Then $Y \mid \{X = x\} = cg(x)U \sim \text{unif}[0, cg(x)]$. That is, the conditional density of $Y$ given $X$ is \(p_{Y|X}(y|x) = \frac{1}{cg(x)}\mathbb{I}_{[0, cg(x)]}(y).\) By construction, the marginal density of $X$ is given by $p_X(x) = g(x)$. Therefore the joint density of $(X, Y)$ is
\begin{align}
p_{XY}(x, y) &= p_X(x)p_{Y|X}(y|x) = \frac{1}{c}\mathbb{I}_{[0, cg(x)]}(y) \\
&= \begin{cases} 1/c, & \text{if } 0 \le y \le c g(x), \\ 0, & \text{otherwise} \end{cases} \\
&= \frac{1}{c}\mathbb{I}_{B_{cg}}(x,y).
\end{align}
Now since \(1 = \int_{-\infty}^{\infty}\int_0^{cg(x)}\frac{1}{c}\,dy\,dx = \frac{1}{c}|B_{cg}|,\) we have $p_{XY}(x, y) = \frac{1}{|B_{cg}|}\mathbb{I}_{B_{cg}}(x, y)$ as desired.

Consequently, accepting exactly when $U \le f(X)/(cg(X))$, i.e., when $(X, cg(X)U) \in B_f$, makes the accepted points uniformly distributed over $B_f$; the $x$-marginal of this uniform distribution is $\frac{1}{|B_f|}\int_0^{f(x)}dy = f(x)$ (as $|B_f| = 1$), so the accepted $X$ has density $f$.

Example: Marsaglia

One way to sample uniformly from the unit disk is to sample $(U, V)$ from $\text{unif}[-1, 1]\times\text{unif}[-1, 1]$ and discard the pair if $U^2 + V^2 > 1$, resampling until acceptance.

Here $X = (U, V)$ has envelope density $g \equiv 1/4$ on the square, the target is the uniform density $f \equiv 1/\pi$ on the disk, and $c = 4/\pi$; the acceptance ratio $f(X)/(cg(X))$ is just the indicator of the disk, so no additional uniform draw is needed.

Example: gamma random numbers

Recall that the Gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$ has density \(f_{\Gamma}(x; \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}x^{\alpha-1}e^{-x/\beta}\) for $x \ge 0$. If $X \sim \text{Gamma}(\alpha, \beta)$, then $cX \sim \text{Gamma}(\alpha, c\beta)$. Hence it suffices to sample from $\text{Gamma}(\alpha, 1)$. Furthermore, if $X\sim \text{Gamma}(\alpha, 1)$ with $\alpha > 1$, then $X \stackrel{d}{=} Y + Z$, where $Y \sim \text{Gamma}(\lfloor \alpha \rfloor, 1)$ and $Z \sim \text{Gamma}(\alpha - \lfloor \alpha \rfloor, 1)$ is independent of $Y$. Here $Y$ can be generated by summing $\lfloor \alpha \rfloor$ independent $\text{Exp}(1)$ random variables. Therefore we only need to sample from $\text{Gamma}(\alpha, 1)$ with $\alpha \in (0, 1)$; a sketch combining these reductions appears at the end of this example.

If $0 < \alpha < 1$, we see that \(x^{\alpha - 1}e^{-x} \le \begin{cases} x^{\alpha - 1}, & \text{if } 0 \le x \le 1, \\ e^{-x}, & \text{otherwise}. \end{cases}\) Thus we choose \(h(x) = \begin{cases} x^{\alpha - 1}/\Gamma(\alpha), & \text{if } 0 \le x \le 1, \\ e^{-x}/\Gamma(\alpha), & \text{otherwise}. \end{cases}\) leading to \(g(x) = \begin{cases} x^{\alpha - 1}/(1/\alpha + 1/e), & \text{if } 0 \le x \le 1, \\ e^{-x}/(1/\alpha + 1/e), & \text{otherwise}. \end{cases}\) and \(c = \frac{1}{\Gamma(\alpha)}\left(\frac{1}{\alpha} + \frac{1}{e}\right).\) Density $g$ has cdf \(G(x) = \begin{cases} x^{\alpha}/(1 + \alpha/e), & \text{if } 0 \le x \le 1, \\ \frac{1 + \alpha/e - \alpha e^{-x}}{1 + \alpha/e}, & \text{otherwise}. \end{cases}\) whose inverse is \(G^{-1}(u) = \begin{cases} [(1 + \alpha/e)u]^{1/\alpha}, & \text{if } 0 \le u \le 1/[1+\alpha/e], \\ -\log(1/\alpha + 1/e) - \log(1 - u), & 1/[1+\alpha/e] \le u < 1. \end{cases}\)

gengamma_ar <- function(nsamp, alph) {
    # sample X from g
    v <- runif(nsamp)  # unif rv for inverse method
    idx <- v > 1 / (1 + alph * exp(-1))
    x <- numeric(nsamp)
    x[idx]  = -log(1 / alph + exp(-1)) - log(1 - v[idx])
    x[!idx] = ((1 + alph * exp(-1)) * v[!idx])^(1 / alph)
    
    # test acceptance
    u <- runif(nsamp)
    idx2 <- (x > 1)
    accept <- logical(nsamp)
    accept[idx2]  <- (u[idx2] < x[idx2]^(alph - 1))
    accept[!idx2] <- (u[!idx2] < exp(-x[!idx2]))
    
    x[accept]
}

n <- 2000
alph <- 0.5
x <- gengamma_ar(n, alph) 
length(x)
length(x) / n

1501

0.7505

gamma(0.5) / (1 / alph + exp(-1) )   # acceptance ratio

0.748540580270693

gamdata <- data.frame(x = x)
plt <- ggplot(gamdata, aes(x=x)) +  
    geom_histogram(binwidth=0.25, fill="white", color="black") + 
    geom_density(aes(y=0.25 * ..count..), color="purple")
print(plt)

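Combining the reductions described at the start of this example with gengamma_ar, a hedged sketch of a general Gamma sampler (the name gengamma1 is introduced here for illustration):

gengamma1 <- function(alpha, beta = 1) {
    # one draw from Gamma(alpha, beta) with shape alpha > 0 and scale beta
    k <- floor(alpha)
    y <- if (k > 0) sum(-log(runif(k))) else 0  # sum of k Exp(1) draws ~ Gamma(k, 1)
    frac <- alpha - k
    z <- 0
    if (frac > 0) {
        repeat {  # retry the AR sampler until one draw is accepted
            z <- gengamma_ar(1, frac)
            if (length(z) == 1) break
        }
    }
    beta * (y + z)  # scaling: beta * Gamma(alpha, 1) = Gamma(alpha, beta)
}
hist(replicate(1000, gengamma1(2.5, 2)), breaks = 40)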

Multivariate random numbers

Batch methods

Multinomial

Suppose we want to sample $(X_1, \dotsc, X_k) \sim \text{mult}(n, p_1, \dotsc, p_k)$, where $\sum_{i=1}^k p_i = 1$ and $\sum_{i=1}^k X_i = n$. The batch approach draws $n$ iid categorical variables $C_j$ with $P(C_j = i) = p_i$ and sets $X_i = \#\{j: C_j = i\}$, as sketched below.
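A batch sketch using base R's sample() for the categorical draws and tabulate() for the counts:

rmultinom_batch <- function(n, p) {
    k <- length(p)
    draws <- sample(k, size = n, replace = TRUE, prob = p)  # n iid categorical draws
    tabulate(draws, nbins = k)  # X_i = number of draws equal to i
}
rmultinom_batch(100, c(0.2, 0.3, 0.5))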

Multivariate normal

Suppose we want to sample $\mathbf{X} = (X_1, \dotsc, X_p)$ from MVN $N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. If we can find $\mathbf{L} \in \mathbb{R}^{p \times p}$ such that $\mathbf{L}^T\mathbf{L} = \boldsymbol{\Sigma}$ (e.g., the Cholesky factor of $\boldsymbol{\Sigma}$), then \(\mathbf{L}^T\mathbf{Z} + \boldsymbol{\mu}, \quad \mathbf{Z} \sim N(\mathbf{0}, \mathbf{I}_p)\) follows the desired distribution.
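A sketch using R's chol(), which returns the upper-triangular factor $\mathbf{L}$ with t(L) %*% L equal to Sigma (the name rmvnorm_chol is ours):

rmvnorm_chol <- function(nsamp, mu, Sigma) {
    p <- length(mu)
    L <- chol(Sigma)                         # upper triangular; t(L) %*% L == Sigma
    Z <- matrix(rnorm(p * nsamp), nrow = p)  # p x nsamp matrix of standard normals
    t(t(L) %*% Z + mu)                       # each column is L^T z + mu; nsamp x p result
}
Sigma <- matrix(c(2, 1, 1, 2), 2, 2)
samp <- rmvnorm_chol(1000, mu = c(0, 1), Sigma = Sigma)
cov(samp)  # should be close to Sigma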

Multivariate $t$

A multivariate $t$ distribution with degrees of freedom $\nu$, scale matrix $\boldsymbol{\Sigma}$, and location vector $\boldsymbol{\mu}$ is the distribution of the random vector \(\mathbf{T} = \frac{1}{\sqrt{\chi^2_{\nu}/\nu}}\mathbf{X} + \boldsymbol{\mu},\) where $\mathbf{X} \sim N(0, \boldsymbol{\Sigma})$, and $\chi^2_{\nu}$ is the chi-square random variable with $\nu$ degrees of freedom, independent of $\mathbf{X}$.
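Following the definition, a sketch built on the rmvnorm_chol helper above and base R's rchisq (the name rmvt_sketch is ours):

rmvt_sketch <- function(nsamp, nu, mu, Sigma) {
    p <- length(mu)
    X <- rmvnorm_chol(nsamp, mu = rep(0, p), Sigma = Sigma)  # rows ~ N(0, Sigma)
    w <- sqrt(rchisq(nsamp, df = nu) / nu)                   # independent chi-square scaling
    sweep(X / w, 2, mu, "+")  # divide row i by w[i], then shift every row by mu
}
tsamp <- rmvt_sketch(1000, nu = 5, mu = c(0, 0), Sigma = diag(2))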

Sequential sampling

In many cases we can sample a random vector by sampling each component in turn and conditioning:
\begin{align}
p_{X_1, \dotsc, X_p}(x_1, \dotsc, x_p) &= p_{X_1}(x_1)\prod_{j=2}^p p_{X_j|X_1, \dotsc, X_{j-1}}(x_j | x_1, \dotsc, x_{j-1}) \\
&= p_{X_1}(x_1)\, p_{X_2|X_1}(x_2|x_1) \prod_{j=3}^p p_{X_j|X_1, \dotsc, X_{j-1}}(x_j | x_1, \dotsc, x_{j-1}) \\
&= \dotsb
\end{align}

Multinomial

For $(X_1, \dotsc, X_k) \sim \text{mult}(n, p_1, \dotsc, p_k)$, it is immediate that $X_1 \sim B(n, p_1)$. Given $X_1 = x_1$, $(X_2, \dotsc, X_k) \sim \text{mult}(n - x_1, p_2 / (1 - p_1), \dotsc, p_k / (1 - p_1))$. Hence $X_2 \mid \{X_1 = x_1\} \sim B(n - x_1, p_2 / (1 - p_1))$, and so on; see the sketch below.
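A sketch of this sequential scheme with base R's rbinom (the name rmultinom_seq is ours):

rmultinom_seq <- function(n, p) {
    k <- length(p)
    x <- numeric(k)
    remaining <- n  # trials not yet allocated
    ptail <- 1      # probability mass not yet consumed
    for (j in seq_len(k - 1)) {
        x[j] <- rbinom(1, size = remaining, prob = p[j] / ptail)
        remaining <- remaining - x[j]
        ptail <- ptail - p[j]
    }
    x[k] <- remaining  # the last count is forced
    x
}
rmultinom_seq(100, c(0.2, 0.3, 0.5))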

Multivariate normal

If we want to sample $\mathbf{X} = (X_1, \dotsc, X_p)$ from MVN with mean $\boldsymbol{\mu} = (\mu_1, \dotsc, \mu_p)^T$ and covariance matrix $\boldsymbol{\Sigma} = (\sigma_{ij})$, note that the first component satisfies $X_1 \sim N(\mu_1, \sigma_{11})$. From the conditional distribution formula for the multivariate normal, \((X_2, \dotsc, X_p) \mid \{X_1 = x_1\} \sim N(\bar{\boldsymbol{\mu}}, \bar{\boldsymbol{\Sigma}}), \quad \bar{\boldsymbol{\mu}} = \boldsymbol{\mu}_2 + \boldsymbol{\Sigma}_{12}^T(x_1 - \mu_1)/\sigma_{11}, ~ \bar{\boldsymbol{\Sigma}} = \boldsymbol{\Sigma}_{22} - \boldsymbol{\Sigma}_{12}^T\boldsymbol{\Sigma}_{12}/\sigma_{11},\) where we partition
\begin{align}
\boldsymbol{\mu} &= (\mu_1, \boldsymbol{\mu}_2^T)^T, \\
\boldsymbol{\Sigma} &= \begin{bmatrix} \sigma_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{12}^T & \boldsymbol{\Sigma}_{22} \end{bmatrix}.
\end{align}
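A minimal bivariate ($p = 2$) sketch of this recursion (the name rbvnorm_seq is ours):

rbvnorm_seq <- function(nsamp, mu, Sigma) {
    # X1 ~ N(mu1, sigma11)
    x1 <- rnorm(nsamp, mean = mu[1], sd = sqrt(Sigma[1, 1]))
    # X2 | {X1 = x1} ~ N(mu2 + sigma12 (x1 - mu1) / sigma11, sigma22 - sigma12^2 / sigma11)
    condmean <- mu[2] + Sigma[1, 2] * (x1 - mu[1]) / Sigma[1, 1]
    condvar  <- Sigma[2, 2] - Sigma[1, 2]^2 / Sigma[1, 1]
    x2 <- rnorm(nsamp, mean = condmean, sd = sqrt(condvar))
    cbind(x1, x2)
}
samp <- rbvnorm_seq(1000, mu = c(0, 1), Sigma = matrix(c(2, 1, 1, 2), 2, 2))
cov(samp)  # should be close to Sigma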