# Monte Carlo Method and Pi Day!

**Posted:**19 April, 2015

**Filed under:**Data Science Resources |

**Tags:**Monte Carlo, Number Pi, Ulam, von Neumann Leave a comment

Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other mathematical methods. Monte Carlo methods are mainly used in three distinct problem classes: optimization, numerical integration, and generation of draws from a probability distribution.

In physics-related problems, Monte Carlo methods are quite useful for simulating systems with many coupled degrees of freedom, such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model). Other examples include modeling phenomena with significant uncertainty in inputs such as the calculation of risk in business and, in math, evaluation of multidimensional definite integrals with complicated boundary conditions. In application to space and oil exploration problems, Monte Carlo based predictions of failure, cost overruns and schedule overruns are routinely better than human intuition or alternative “soft” methods.

The modern version of the Monte Carlo method was invented in the late 1940s by Stanislaw Ulam, while he was working on nuclear weapons projects at the Los Alamos National Laboratory. Immediately after Ulam’s breakthrough, John von Neumann understood its importance and programmed the ENIAC computer to carry out Monte Carlo calculations.

Being secret, the work of von Neumann and Ulam required a code name. A colleague of von Neumann and Ulam, Nicholas Metropolis, suggested using the name Monte Carlo, which refers to the **Monte Carlo Casino in Monaco** where Ulam’s uncle would borrow money from relatives to gamble. Using lists of “truly random” random numbers was extremely slow, but von Neumann developed a way to calculate pseudorandom numbers, using the middle-square method. Though this method has been criticized as crude, von Neumann was aware of this: he justified it as being faster than any other method at his disposal, and also noted that when it went awry it did so obviously, unlike methods that could be subtly incorrect.

There is no single Monte Carlo method – any attempt to define one will inevitably leave out valid examples – but many simulations follow this pattern:

- Model a system as a (series of) probability density functions (PDFs);
- repeatedly sample from the PDFs;
- tally/compute the statistics of interest.

**Defining The Model** : In the case of Buffon’s Needle, the model is based on a proof that shows the probability of the needle intersecting a line. To model the system one needs probability density functions for random positions in the lined space and random angles for the needle. It is a very simple simulation. Another paper in this volume, Monte Carlo Simulation Of Emission Tomography And Other Radiation-Based Medical Imaging Techniques, describes a more complicated model used to simulate emission tomography.

Key points to consider in defining a model are:

- what are our desired outputs?
- what will these outputs be used for?
- how accurate/precise must the outputs be?
- how exactly can/must we model?
- how exactly can/must we define the inputs?
- how do we model the underlying processes?

The outputs are key. What questions are we trying to answer? For instance, in a weather simulation we might want to predict high/low temperatures, likelihood/intensity of precipitation, and wind speed and direction for five days. The accuracy/precision needed will vary with use: predicting precipitation for picnic planning requires specifying which days rain is likely; for a farmer it may be more important to know how much rain will fall over the five days.

The final step in defining the model is determining how to process the inputs to generate the outputs. This is done deterministically in some simulations, for instance a weather simulation given the same inputs might always produce the same forecast. However, a Monte Carlo simulation always involves an element of randomness, often at many points in the model. In the weather simulation, one might include some randomness to account for the measurement error in the inputs; and the temperature, wind, and precipitation forecast for a given area might be time series for which each point is a random function of both the measurements and earlier time point forecasts in nearby areas. By running a Monte Carlo forecast multiple times, one could determine the variability of the forecast for the measured inputs.

**Example, using Monte Carlo to estimate the value of Pi:**

A simple Monte Carlo estimate for the value of can be found by generating random points on a square and counting the proportion that lie inside an inscribed circle. The probability of a point landing in the circle is proportional to the relative areas of the circle and square. However this method converges slowly, with many sample points needed to get an accurate approximation.

In detail, if a circle of radius R is inscribed inside a square with side length 2R, then the area of the circle will be pi*R^2 and the area of the square will be (2R)^2. So the ratio of the area of the circle to the area of the square will be pi/4.

This means that, if you pick N points at random inside the square, approximately N*pi/4 of those points should fall inside the circle.

This program picks points at random inside the square. It then checks to see if the point is inside the circle (it knows it’s inside the circle if x^2 + y^2 < R^2, where x and y are the coordinates of the point and R is the radius of the circle). The program keeps track of how many points it’s picked so far (N) and how many of those points fell inside the circle (M).

Pi is then approximated as follows: **pi = 4*M / N**

Although the Monte Carlo Method is often useful for solving problems in physics and mathematics which cannot be solved by analytical means, it is a rather slow method of calculating pi. To calculate each significant digit there will have to be about 10 times as many trials as to calculate the preceding significant digit.

Here you are a graphic explanation in youtube.

In this site you’ll find examples of Monte Carlo application with R included the calculation of Pi:

http://www.r-bloggers.com/probability-and-monte-carlo-methods/

http://blog.revolutionanalytics.com/2015/03/a-pi-day-obsession.html

Or java:

http://www.eveandersson.com/pi/monte-carlo-circle

## Recent Comments