[통계계산] 6. Non-linear Equations

Computational Statistics

Featured image

[통계계산] 6. Non-linear Equations


목차

  1. Computer Arithmetic
  2. LU decomposition
  3. Cholesky Decomposition
  4. QR Decomposition
  5. Interative Methods
  6. Non-linear Equations
  7. Optimization
  8. Numerical Integration
  9. Random Number Generation
  10. Monte Carlo Integration


sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] fansi_0.5.0       digest_0.6.27     utf8_1.2.1        crayon_1.4.1     
 [5] IRdisplay_1.0     repr_1.1.3        lifecycle_1.0.0   jsonlite_1.7.2   
 [9] evaluate_0.14     pillar_1.6.1      rlang_0.4.11      uuid_0.1-4       
[13] vctrs_0.3.8       ellipsis_0.3.2    IRkernel_1.1      tools_3.6.3      
[17] compiler_3.6.3    base64enc_0.1-3   pbdZMQ_0.3-5      htmltools_0.5.1.1

Nonlinear equation solve

Problem

Univariate problems

\[f: \mathbb{R} \to \mathbb{R}\]

Bisection

bisect <- function(f, l, r, tol=1e-6) {
    iter <- 0
    while ( r - l > tol ) {
        m <- (l + r) / 2
        if ( f(l) * f(m) < 0 ) {
            r <- m
        } else {
            l <- m
        }
        iter <- iter + 1
    }
    ret <- list(val = (l + r) / 2, iter = iter)
}
F <- function(x) { pt(x, 5) - 0.95 }   # Student's t distribution with 5 df, 95%ile
F(1.291)
F(2.582)
ret <- bisect(F, 1.291, 2.582)
ret$val
ret$iter

-0.0765841085370957

0.0253437879217759

2.01504860901833

21

qt(0.95, 5)

2.01504837333302

Fixed-point iteration

Example

Find a root of $f(x) = x^4 - \sin x$ on $(0, 1]$.

x <- 1.0
for (i in 1:10) {
    x <- (sin(x))^0.25
    print(x)
}
[1] 0.9577668
[1] 0.9509906
[1] 0.9498498
[1] 0.9496563
[1] 0.9496234
[1] 0.9496178
[1] 0.9496169
[1] 0.9496167
[1] 0.9496167
[1] 0.9496167
x <- 1.0
for (i in 1:10) {
    x <- sin(x) / x^3
    print(x)
}
[1] 0.841471
[1] 1.251418
[1] 0.4844576
[1] 4.096051
[1] -0.01187393
[1] 7092.524
[1] -2.604367e-12
[1] 1.474333e+23
[1] -2.289953e-70
[1] 1.906983e+139

Contraction mapping theorem

Theorem. Suppose the function $F$ defined on a closed interval $I \subset \mathbb{R}$ satisfies

  1. $F(x) \in I$ whenever $x \in I$
  2. $ F(x) - F(y) \le L x - y $ for all $x, y \in I$ (Lipschitz continuity).

Then, if $L \in [0, 1)$, $F$ has a unique fixed point $x^{\star} \in I$. Furthermore, fixed-point iteration $x^{k+1} = F(x^{k})$ converges to $x^{\star}$ regardless of the initial point $x^{(0)}$.

Proof. From condition 2, \begin{align} |x^{(k+1)} - x^{(k)}| &= |F(x^{(k)} - F(x^{(k-1)})||
&\le L|x^{(k)} - x^{(k-1)}|
&\vdots
&\le L^k|x^{(1)} - x^{(0)}| \end{align
} Using the triangular inequality, for $m > n$ we have

\begin{align} |x^{(n)} - x^{(m)}| &= |x^{(n)} - x^{(n+1)} + x^{(n+1)} - x^{(n+2)} + \dotsb + x^{(m-1)} - x^{(m)}|
&\le \sum_{k=n}^{m-1} |x^{(k)} - x^{(k+1)}|
&\le \sum_{k=n}^{m-1} L^k |x^{(1)} - x^{(0)})| \tag{
}
&\le \sum_{k=n}^{\infty} L^k |x^{(1)} - x^{(0)})|
= \frac{L^n}{1 - L} |x^{(1)} - x^{(0)}|. \end{align*} Thus the sequence ${x^{(k)}}$ is Cauchy and convergent. Since the interval $I$ is closed, $x^{\star} = \lim_k x^{(k)} \in I$.

Now by (Lipschitz) continuity of $F$, and the definition $x^{k+1} = F(x^{(k)})$, we see that $x^{\star}$ is a fixed point of $F$.

If there exists another fixed point $y^{\star} \neq x^{\star}$, then \(|x^{\star} - y^{\star}| = |F(x^{\star}) - F(y^{\star})| \le L|x^{\star} - y^{\star}|,\) which is a contradiction since $L \in (0, 1)$.

Remark. By sending $m \to \infty$ in inequality (*), we have the bound \(|x^{(n)} - x^{\star}| \le \frac{L^n}{1 - L}|x^{(1)} - x^{(0)}|\) which provides the rate of convergence to $x^{\star}$. This geometric rate of convergence is called linear convergence.

Back to example

Obvious choice of $F$

Newton’s method (Newton-Raphson)

Examples

F <- function(x) { pt(x, 5) - 0.95 }   # Student's t distribution with 5 df, 95%ile
Fprime <- function(x) { dt(x, 5) }  # derivative of F (density function)
#x <- 1.291
x <- 2.582
options(digits=20)
for (i in 1:10) {
    x <- x - (F(x) / Fprime(x))
    print(x)
}
[1] 1.7338465618414033997
[1] 1.968962641494075072
[1] 2.0136622631340750367
[1] 2.0150470922847292243
[1] 2.0150483733319273227
[1] 2.015048373333023779
[1] 2.015048373333023779
[1] 2.015048373333023779
[1] 2.015048373333023779
[1] 2.015048373333023779
qt(0.95, 5)

2.01504837333302

Multivariate problems

Goal: to solve \begin{align} f_1(x_1, \dotsc, x_n) &= 0
\vdots & ~
f_n(x_1, \dotsc, x_n) &=0. \end{align
}

Newton’s method

Variants