Gibbs sampling

I gathered intuitions and worked examples of the Gibbs sampling concept from several places on the web and combined them here.

The flow that originally led to this method, along with the formula derivations, is based on Prof. Il-Chul Moon's lectures at KAIST; the details are in my Sampling-based model post, where I wrote up my notebook derivations. Haha.

But for the big picture and the intuition, the write-up here is quite good. Heh.

 

explanation part 1/2

Let p(X1, ..., Xn | e1, ..., em) denote the joint distribution of a set of random variables (X1, ..., Xn) conditioned on a set of evidence variables (e1, ..., em).

Gibbs sampling is an algorithm to generate a sequence of samples from such a joint probability distribution.

The purpose of such a sequence is to approximate the joint distribution (as with a histogram), or to compute an integral (such as an expected value).

Gibbs sampling is applicable when the joint distribution is not known explicitly, but the conditional distribution of each variable is known.

The Gibbs sampling algorithm is used to generate an instance from the distribution of each variable in turn, conditional on the current values of the other variables. (i.e., the values obtained in the previous iteration)

It can be shown that the sequence of samples comprises a Markov chain, and the stationary distribution of that Markov chain is just the sought-after joint distribution. (The converged stationary distribution is exactly the joint we were looking for.)

Gibbs sampling is particularly well-adapted to sampling the posterior distribution of a Bayesian network, since Bayesian networks are typically specified as a collection of conditional distributions. (Because computing the full joint is too expensive, and often only the individual CPTs are known.)
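To make this concrete, here is a minimal made-up illustration (not from the original post): a tiny binary chain network A -> B -> C with C observed, where each variable is resampled from its conditional given the rest, and that conditional is just a product of local CPT entries. All the CPT numbers below are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CPTs for a binary chain A -> B -> C (all values made up)
p_a = np.array([0.7, 0.3])                    # p(A)
p_b_given_a = np.array([[0.9, 0.1],           # p(B | A=0)
                        [0.2, 0.8]])          # p(B | A=1)
p_c_given_b = np.array([[0.8, 0.2],           # p(C | B=0)
                        [0.3, 0.7]])          # p(C | B=1)

c = 1          # evidence: C = 1
a, b = 0, 0    # arbitrary initial state

counts = np.zeros(2)
for j in range(20_000):
    # p(A | B=b) ∝ p(A) * p(B=b | A): only local CPT entries are needed
    w = p_a * p_b_given_a[:, b]
    a = rng.choice(2, p=w / w.sum())
    # p(B | A=a, C=c) ∝ p(B | A=a) * p(C=c | B)
    w = p_b_given_a[a] * p_c_given_b[:, c]
    b = rng.choice(2, p=w / w.sum())
    counts[a] += 1

print(counts / counts.sum())  # approximates the posterior p(A | C=1)
```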

A Gibbs sampler runs a Markov chain on (X1, . . . , Xn). For convenience of notation, we denote the set (X1,…,Xi−1,Xi+1,…,Xn) as X(−i), and e = (e1,…,em). Then, the following method gives one possible way of creating a Gibbs sampler:
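The algorithm figure that followed here in the original did not survive, but a minimal Python sketch of one such sampler looks like this. It cycles through the variables and resamples each from its conditional; `draw_conditional` is a hypothetical placeholder for the problem-specific sampling code.

```python
def gibbs_sampler(draw_conditional, x0, e, n_samples, rng):
    """Run a Gibbs chain on (X1, ..., Xn) given evidence e.

    draw_conditional(i, x_minus_i, e, rng) is a placeholder that must
    return a draw from p(X_i | X_(-i), e).
    """
    x = list(x0)
    samples = []
    for t in range(n_samples):
        for i in range(len(x)):
            x_minus_i = x[:i] + x[i+1:]   # X_(-i): current values of the rest
            x[i] = draw_conditional(i, x_minus_i, e, rng)
        samples.append(tuple(x))
    return samples
```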


explanation part 2/2

 

1_ Sampling from a multivariate distribution (GMM model)

One could also sample from the joint of such a model directly.

But if we can sample from the conditional distribution of each variable given the others, then we can use Gibbs sampling instead (a sketch follows below).
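A minimal sketch (not from the original post) of Gibbs sampling a made-up two-component GMM: alternately draw the component indicator z given x, then x given z. The mixture parameters pi, mu, and sigma below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-component GMM: p(z) = pi[z], p(x | z) = N(mu[z], sigma[z]^2)
pi = np.array([0.3, 0.7])
mu = np.array([-2.0, 3.0])
sigma = np.array([1.0, 0.5])

def sample_z_given_x(x):
    # p(z | x) ∝ pi[z] * N(x; mu[z], sigma[z]^2)  (the "responsibilities")
    w = pi * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / sigma
    return rng.choice(2, p=w / w.sum())

def sample_x_given_z(z):
    # p(x | z) is just the z-th Gaussian component
    return rng.normal(mu[z], sigma[z])

x, samples = 0.0, []
for j in range(10_000):
    z = sample_z_given_x(x)   # condition on the current x
    x = sample_x_given_z(z)   # condition on the z we just drew
    samples.append(x)

# After burn-in, `samples` approximates draws from the GMM marginal p(x).
```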

 

2_ Gibbs sampling

j: iteration number

Sample theta1 conditioned on the value of theta2 from the previous iteration.

Then sample theta2 conditioned on the theta1 value we just obtained.

(It's a Markov chain.)

Eventually, the vector theta^(j) converges to a draw from the joint distribution.
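Written out with j as the iteration index, one sweep of this two-variable sampler is:

```latex
\theta_1^{(j)} \sim p\!\left(\theta_1 \mid \theta_2^{(j-1)}\right), \qquad
\theta_2^{(j)} \sim p\!\left(\theta_2 \mid \theta_1^{(j)}\right)
```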

 

By the central limit theorem, estimates computed from these samples eventually converge.

 

3_ Bivariate normal example

These two conditional distributions are what we need for Gibbs sampling.
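The original figure showing them is missing here, but for a standardized bivariate normal (zero means, unit variances, correlation ρ, which matches the example below) the two conditionals take the well-known form:

```latex
\theta_1 \mid \theta_2 \sim \mathcal{N}\!\left(\rho\,\theta_2,\; 1-\rho^{2}\right), \qquad
\theta_2 \mid \theta_1 \sim \mathcal{N}\!\left(\rho\,\theta_1,\; 1-\rho^{2}\right)
```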

 

4_ Bivariate normal with ρ = 0.9 (correlation)

1) Start out at (theta1, theta2) = (-3, 3).

2) Draw a slice at theta2 = 3, find the conditional distribution of theta1 (a normal distribution), and sample from it: theta1 moves from -3 to 2.3.

3) Then do the same thing: conditioned on theta1 = 2.3, sample theta2 from its normal conditional, giving theta2 = 2.3.

4) Repeat the procedure (a runnable sketch of this loop follows below).
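A minimal Python sketch of this walkthrough, assuming the standardized conditionals given above with ρ = 0.9 and the starting point (-3, 3); the actual values drawn will differ from the 2.3's in the example, since they depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.9                      # correlation of the bivariate normal
theta1, theta2 = -3.0, 3.0     # 1) start out at (-3, 3)
cond_sd = np.sqrt(1 - rho**2)  # std. dev. of each conditional

samples = []
for j in range(5_000):
    # 2) sample theta1 from p(theta1 | theta2) = N(rho*theta2, 1 - rho^2)
    theta1 = rng.normal(rho * theta2, cond_sd)
    # 3) sample theta2 from p(theta2 | theta1) = N(rho*theta1, 1 - rho^2)
    theta2 = rng.normal(rho * theta1, cond_sd)
    samples.append((theta1, theta2))

samples = np.array(samples[500:])    # drop a burn-in period
print(np.corrcoef(samples.T)[0, 1])  # should be close to rho = 0.9
```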

 

5_ k-component Gibbs sampler

 

_ Summary

Each component theta_k is sampled in turn, conditioned on all the components except for k.
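In symbols, iteration j of the K-component sampler updates each theta_k in turn, conditioning on the freshly updated components 1, ..., k-1 and on the previous iteration's values for the rest:

```latex
\theta_k^{(j)} \sim p\!\left(\theta_k \,\middle|\, \theta_1^{(j)}, \ldots, \theta_{k-1}^{(j)},\; \theta_{k+1}^{(j-1)}, \ldots, \theta_K^{(j-1)}\right)
```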
