So, continuing my previous post about Bayesian statistics, let's now look at how to sample the posterior probability in Bayesian inversion (to me, Bayesian inversion looks almost the same as Bayesian inference; can anybody explain whether these two terms are different?). I was thinking that this summary should be divided into two parts.
- The first part should describe Markov chains, Monte Carlo, and then Markov Chain Monte Carlo.
- The second part will talk about how to combine Markov Chain Monte Carlo with Bayesian inversion. But let's see how the next paragraphs turn out; I will just go with whatever I understand.
As with the first post, please correct me and explain more if there are any mistakes in my writing. I am new to this field of mathematics. I have decided that this post will cover only the first part.
Introduction to Monte Carlo Method
Thinking about Monte Carlo means thinking about the gambling district of Monaco, and that is indeed where the pioneers of this mathematical method got their inspiration. The Monte Carlo casino and the Monte Carlo method have two things in common: randomness and the repetitive nature of the process. It is not hard to find references and information about Monte Carlo, since the method is very popular; put the keyword into Google and you will get plenty of material. As a first step, keep two words in mind: randomness and repetition.
Tarantola describes in his book (Inverse Problem Theory, 2005) a classic use of the Monte Carlo method: estimating the value of pi (yes, that's 3.14…, or approximately 22/7). Another explanation of the same example can be found on the site An Introduction to Monte Carlo Methods. The idea is to find pi from the ratio of the area of a circle to the area of the square that encloses it. On that site, the writers throw a dart 100000 times (it doesn't have to be that many :D); as long as we throw the darts randomly and many times, we can count how many fall inside the circle. The fraction of all darts that land inside the circle approaches the area ratio, which is pi/4, so multiplying that fraction by 4 gives an estimate of pi. I have skipped the geometrical explanation; for the full description of the dart experiment, see that site. By the way, Archimedes also worked on the value of pi, but not with random darts: he squeezed the circle between inscribed and circumscribed polygons with more and more sides, which gave him the bounds 223/71 < pi < 22/7. He was later killed by a Roman soldier when Syracuse was captured. [CITATION NEEDED, I'm too lazy to google].
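The dart experiment can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: the function name estimate_pi, the fixed seed, and the choice of the square [-1, 1] x [-1, 1] with the unit circle inside it are mine, not taken from the book or the site. It relies on the fact that the fraction of darts landing inside the circle approaches (circle area) / (square area) = pi/4.

```python
import random


def estimate_pi(n_darts, seed=0):
    """Estimate pi by throwing n_darts uniformly at the square [-1, 1] x [-1, 1].

    The function name and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_darts):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # The dart is inside the unit circle if its distance to the origin is <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # inside / n_darts approximates pi / 4, so multiply by 4.
    return 4.0 * inside / n_darts


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, estimate_pi(n))
```

Running it with more darts should give estimates that wander less and less around 3.14, although any single run can still be a little off.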
A question will arise: how do you justify, validate, or trust the result of your experiment? What we must understand is that if we throw the darts a very large number of times, the samples cover the circle and the region outside it more and more evenly, so the fraction of hits gets closer and closer to the true area ratio (this is the law of large numbers). That is why we can trust the experiment.
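One way to put a number on that trust, as a sketch that assumes the darts are independent and uniform over the square: each dart is a hit with probability $p = \pi/4$, so the estimate built from $N$ darts with $N_{\text{in}}$ hits has a standard error that shrinks like $1/\sqrt{N}$. The symbols $N$, $N_{\text{in}}$ and $\hat{\pi}_N$ are my own notation.

```latex
\hat{\pi}_N = \frac{4 N_{\text{in}}}{N}, \qquad
\operatorname{Var}(\hat{\pi}_N) = \frac{16\,p\,(1-p)}{N} = \frac{\pi(4-\pi)}{N}, \qquad
\operatorname{SE}(\hat{\pi}_N) = \sqrt{\frac{\pi(4-\pi)}{N}} \approx \frac{1.64}{\sqrt{N}}
```

So throwing 100 times more darts buys roughly one more correct digit, which is why the method needs many repetitions.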
In summary, this is my understanding of the Monte Carlo method (please correct me): if you have a specific problem, you can simulate it in a trial-and-error framework. By feeding many random possibilities into your simulation, at some point you will have sampled your problem well enough, and thus you can get a model that fits your situation.
It's not exactly as easy as it seems. For the moment let's leave it at that; someday I will summarize this method in more detail. Have a look at the many better explanations (hehehehe🙂 ) on Google.
Introduction to Markov Chain Method
My first encounter with Markov chains was through Dr. Charlie Wu of Petroselat, who taught me about them in a geostatistics class two semesters back. Did I just mention that I picked up something from my grad school? Oops🙂
He said, “If you understand Markov chains, you can use them for anything.” Geez, is this some kind of magic? Well, it is magic: a magical technique based on probability theory. So if it's a probabilistic method, it's no longer magic? Well, probability is one of the inference tools closest to the way our brain thinks and makes decisions.
ERRRRR… enough with my naive theory of philosophy.
Let's quote a definition of a Markov chain from Wikipedia: “Markov Chain is a stochastic process which has Markov Property”. What is the Markov property? In simple terms, the current state holds all the information that influences the future state. Using the current state, we can define the future states probabilistically. Given the current state, the future states are independent of the past states; they depend only on the current state.
Another explanation from Wikipedia that we must keep in mind is: “At each step the system may change its state from the current state to another state, or remain in the same state, according to a certain probability distribution. The changes of state are called transitions, and the probabilities associated with various state-changes are called transition probabilities.”
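To make the idea of transitions concrete, here is a minimal sketch of a two-state chain in Python. Everything in it is an assumption for illustration: the states "sunny" and "rainy", the transition probabilities, and the names TRANSITIONS, next_state and simulate are invented, not taken from Wikipedia. The point is that next_state only ever looks at the current state.

```python
import random

# Hypothetical transition probabilities, invented for illustration only.
# TRANSITIONS[a][b] is the probability of moving from state a to state b.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Draw the next state using only the current state (the Markov property)."""
    states = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, n_steps, seed=0):
    """Return a chain of n_steps transitions starting from `start`."""
    rng = random.Random(seed)
    chain = [start]
    for _ in range(n_steps):
        # The past of the chain is never consulted, only chain[-1].
        chain.append(next_state(chain[-1], rng))
    return chain


if __name__ == "__main__":
    print(simulate("sunny", 10))
```

With these made-up numbers, a long simulated chain should spend about two thirds of its time in "sunny", whichever state it starts from.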
Thus the name Markov chain: it is just like a chain, where each link connects to the next. The connections are defined by probabilities. An elegant example is snakes and ladders (or Monopoly): where we move next depends only on the square we are on now and the current roll of the dice. We don't care about previous rolls or how we got here. By contrast, a card game like blackjack depends on the past: the chance of reaching 21 when the dealer passes you the next card depends on the cards you hold now and also on the cards that were already dealt, because they are no longer in the deck.
My summary: it looks a bit like Bayes' theorem, which says that the posterior probability depends on the prior probability through a likelihood function. By analogy, in a Markov chain the future state depends on the current state through a transition matrix, whose entries are the probabilities of moving from the current state to each possible next state (is it? CMIIW). This is why, if you know the Markov chain of a series, for example weather, finance, etc., you can probabilistically forecast the future based on the current state.
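Here is a small sketch of what forecasting with a transition matrix could look like. The weather states and the numbers in P are hypothetical, and the helper names step and forecast are my own. Each forecast step multiplies the current probability distribution over states by the transition matrix.

```python
# Hypothetical two-state weather chain; the numbers are invented for illustration.
STATES = ["sunny", "rainy"]
# P[i][j] is the probability of moving from STATES[i] to STATES[j].
P = [
    [0.8, 0.2],
    [0.4, 0.6],
]


def step(dist, matrix):
    """Advance a probability distribution over states by one transition."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def forecast(dist, matrix, n_steps):
    """Distribution over states after n_steps transitions, starting from dist."""
    for _ in range(n_steps):
        dist = step(dist, matrix)
    return dist


if __name__ == "__main__":
    today = [1.0, 0.0]  # we know for certain that today is sunny
    for days in (1, 2, 7):
        probs = forecast(today, P, days)
        print(days, dict(zip(STATES, probs)))
```

Starting from a sunny day, the one-day forecast is just the first row of P (0.8 sunny, 0.2 rainy). Further out, the forecast forgets where it started and settles near 2/3 sunny and 1/3 rainy for these made-up numbers.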
How are we going to use this in our case of Bayesian inversion? Or, will Markov chains help Monte Carlo sample the posterior probability? This is what I am trying to understand in my thesis.
ARGHH!!!! I'll continue with the second part later on. This material is too heavy for me to understand.