2. Monty Hall

VC2M10AA02: level 10A: Devise and use algorithms and simulations to solve mathematical problems
  • Developing simulations for counter-intuitive problems in probability such as the Monty Hall problem or derangements


2.1. Monty Hall simulation

The Monty Hall problem is a counter-intuitive statistics puzzle.
There are 3 doors, behind which are two goats and a car.
You pick a door (call it door A). You’re hoping for the car of course.
Monty Hall, the game show host, examines the other doors (B & C) and opens one with a goat. (If both doors have goats, he picks randomly.)
Do you stick with door A (original guess) or switch to the unopened door? Does it matter?
Surprisingly, the odds aren’t 50-50. If you switch doors you’ll win 2/3 of the time!
Long term chances are shown below:
../_images/monty_hall_200.png
Python code for the simulation:
 1import random
 2import matplotlib.pyplot as plt
 3from pathlib import Path
 4
 5currfile_dir = Path(__file__).parent
 6
 7
 8def monty_hall():
 9    """
10    Simulates one round of the Monty Hall problem and returns True if switching wins the car, False otherwise.
11
12    Returns:
13        bool: The outcome of switching.
14    """
15    # Create a list of three doors, one with a car and two with goats
16    doors = ["car", "goat", "goat"]
17    # Shuffle the list randomly
18    random.shuffle(doors)
19    # Choose a door at random as the initial choice
20    choice = random.randint(0, 2)
21    # Find the index of the door with the car
22    car = doors.index("car")
23    # Find the index of a door with a goat that is not the initial choice
24    goat = random.choice([i for i in range(3) if i != choice and i != car])
25    # Switch to the other door that is not the initial choice or the revealed goat
26    switch = 3 - choice - goat
27    # Return True if switching wins the car, False otherwise
28    return switch == car
29
30
31def monty_hall_simulation(n, filename):
32    """
33    Simulates n rounds of the Monty Hall problem and plots the outcomes of switching against the number of rounds.
34
35    Args:
36        n (int): The number of rounds to simulate.
37        filename (str): The filename to save the plot as.
38    """
39    # Initialize a list to store the outcomes of switching
40    switch_outcomes = []
41    # Initialize a variable to count the number of wins by switching
42    switch_wins = 0
43    # Loop over n rounds
44    for i in range(n):
45        # Simulate one round and get the outcome
46        outcome = monty_hall()
47        # Update the number of wins by switching
48        switch_wins += outcome
49        # Append the proportion of wins by switching so far to the list
50        switch_outcomes.append(switch_wins / (i + 1))
51    # Call the plot function with the list of outcomes and n
52    plot_outcomes(switch_outcomes, n, filename)
53
54
55def plot_outcomes(switch_outcomes, n, filename):
56    """
57    Plots the outcomes of switching against the number of rounds and saves the plot to a file.
58
59    Args:
60        switch_outcomes (list): The list of outcomes of switching.
61        n (int): The number of rounds.
62        filename (str): The filename to save the plot as.
63    """
64    # Plot the list of outcomes against the number of rounds
65    plt.plot(range(1, n + 1), switch_outcomes)
66    # Add labels and title
67    plt.xlabel("Number of rounds")
68    plt.ylabel("Proportion of wins by switching")
69    plt.title("Monty Hall Simulation")
70    plt.ylim(0, 1)
71    # Add a horizontal line, grey dashed at 0.67
72    plt.axhline(0.67, color="grey", linestyle="--")
73    # Save and show the plot
74    save_plot(plt, filename)
75    plt.show()
76
77
78def save_plot(plot, filename):
79    """
80    Saves the given plot to a file with the given filename within the curr directory.
81
82    Args:
83        plot (matplotlib.pyplot): The plot to save.
84        filename (str): The filename to save the plot as.
85    """
86    filepath = currfile_dir / filename
87    plot.savefig(filepath, dpi=600)
88
89
90monty_hall_simulation(200, "monty_hall_200.png")

The average of several simulations can be graphed:
../_images/monty_hall_av.png
Python code for the simulation:
  1import random
  2import matplotlib.pyplot as plt
  3from pathlib import Path
  4import statistics  # Import the statistics module to use the mean function
  5import webcolors  # Import the webcolors module
  6import colorsys  # Import the colorsys module
  7
  8
  9currfile_dir = Path(__file__).parent
 10
 11
 12def monty_hall():
 13    """
 14    Simulates one round of the Monty Hall problem and returns True if switching wins the car, False otherwise.
 15
 16    Returns:
 17        bool: The outcome of switching.
 18    """
 19    # Create a list of three doors, one with a car and two with goats
 20    doors = ["car", "goat", "goat"]
 21    # Shuffle the list randomly
 22    random.shuffle(doors)
 23    # Choose a door at random as the initial choice
 24    choice = random.randint(0, 2)
 25    # Find the index of the door with the car
 26    car = doors.index("car")
 27    # Find the index of a door with a goat that is not the initial choice
 28    goat = random.choice([i for i in range(3) if i != choice and i != car])
 29    # Switch to the other door that is not the initial choice or the revealed goat
 30    switch = 3 - choice - goat
 31    # Return True if switching wins the car, False otherwise
 32    return switch == car
 33
 34
 35def monty_hall_simulation(n):
 36    """
 37    Simulates n rounds of the Monty Hall problem and returns a list of outcomes of switching.
 38
 39    Args:
 40        n (int): The number of rounds to simulate.
 41
 42    Returns:
 43        list: The list of outcomes of switching.
 44    """
 45    # Initialize a list to store the outcomes of switching
 46    switch_outcomes = []
 47    # Initialize a variable to count the number of wins by switching
 48    switch_wins = 0
 49    # Loop over n rounds
 50    for i in range(n):
 51        # Simulate one round and get the outcome
 52        outcome = monty_hall()
 53        # Update the number of wins by switching
 54        switch_wins += outcome
 55        # Append the proportion of wins by switching so far to the list
 56        switch_outcomes.append(switch_wins / (i + 1))
 57    # Return the list of outcomes
 58    return switch_outcomes
 59
 60
 61def lighten_color_name(color_name, factor):
 62    """
 63    Takes a color name and a factor between 0 and 1 and returns a lighter color in RGB format.
 64
 65    Args:
 66        color_name (str): The name of the color, such as 'red', 'blue', etc.
 67        factor (float): The factor by which to increase the value component of the color. Should be between 0 and 1.
 68
 69    Returns:
 70        tuple: The lighter color in RGB format as a tuple of three numbers between 0 and 1.
 71    """
 72    # Convert the color name to a hex string using the webcolors module
 73    hex_color = webcolors.name_to_hex(color_name)
 74    # Convert the hex string to a tuple of RGB values using the webcolors module
 75    rgb_color = webcolors.hex_to_rgb_percent(hex_color)
 76    # Convert the RGB values to floats between 0 and 1
 77    rgb_color = tuple(float(x.strip("%")) / 100 for x in rgb_color)
 78    # Convert the RGB color to HSV format using the colorsys module
 79    h, s, v = colorsys.rgb_to_hsv(*rgb_color)
 80    # Increase the value component by the factor, but make sure it does not exceed 1
 81    v = min(v + factor, 1)
 82    # Convert the HSV color back to RGB format using the colorsys module
 83    r, g, b = colorsys.hsv_to_rgb(h, s, v)
 84    # Return the lighter color as a tuple
 85    return (r, g, b)
 86
 87
 88def new_colors():
 89    # Define a list of color names to use for each simulation
 90    color_names = [
 91        "red",
 92        "orange",
 93        "plum",
 94        "green",
 95        "blue",
 96        "indigo",
 97        "violet",
 98        "pink",
 99        "brown",
100        "skyblue",
101        "lightgreen",
102        "peachpuff",
103        "yellow",
104    ]
105    # Define a list of factors to use for each color name
106    factors = [0.1, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.4]
107    new_colors = []
108    for i in range(len(color_names)):
109        new_light_color = lighten_color_name(color_names[i], factors[i])
110        new_colors.append(new_light_color)
111    return new_colors
112
113
114def plot_outcomes(switch_outcomes, color, rounds, label, linewidth=1, linestyle="-"):
115    """
116    Plots the outcomes of switching against the first 200 rounds using the given color, label, linewidth and linestyle.
117
118    Args:
119        switch_outcomes (list): The list of outcomes of switching.
120        color (str): The color to use for the plot.
121        rounds (int): The number of trials in the sim
122        label (str): The label to use for the plot.
123        linewidth (int): The linewidth to use for the plot. Default is 1.
124        linestyle (str): The linestyle to use for the plot. Default is '-'.
125    """
126    # Plot the list of outcomes against the rounds using the given color, label, linewidth and linestyle
127    plt.plot(
128        range(1, rounds + 1),
129        switch_outcomes[:rounds],
130        color=color,
131        label=label,
132        linewidth=linewidth,
133        linestyle=linestyle,
134    )
135
136
137def save_plot(plot, filename):
138    """
139    Saves the given plot to a file with the given filename within the curr directory.
140
141    Args:
142        plot (matplotlib.pyplot): The plot to save.
143        filename (str): The filename to save the plot as.
144    """
145    filepath = currfile_dir / filename
146    plot.savefig(filepath, dpi=600)
147
148
149def main(sims, rounds):
150    """
151    Runs sims simulations of the Monty Hall problem with rounds rounds each and plots them on one figure.
152    """
153    # Define a list of colors to use for each simulation
154    colors = new_colors()
155    # Initialize an empty list to store the data from each simulation
156    data = []
157    # Loop over sims simulations with different number of rounds and store the data in the list
158    for i in range(sims):
159        data.append(
160            monty_hall_simulation(rounds)
161        )  # Use append method instead of indexing and pass rounds as argument
162    # Initialize an empty list to store the averages from each simulation
163    average = []
164    # Loop over sims simulations again and calculate and plot each data set with a different color and label
165    for i in range(sims):
166        # Plot each data set with a different color and label
167        color = colors[i]
168        # Use the lighten_color function with a factor of 0.2
169        plot_outcomes(data[i], color, rounds, f"Simulation {i + 1}")
170    # Plot the overall average with grey color and label and double linewidth
171    average = [statistics.mean(data[i][k] for i in range(sims)) for k in range(rounds)]
172    plot_outcomes(average, "black", rounds, "Overall average", 4, ":")
173    # Add labels and title
174    plt.xlabel("Number of rounds")
175    plt.ylabel("Proportion of wins by switching")
176    plt.title("Monty Hall Simulation")
177    # plt.xscale("log")
178    plt.ylim(0, 1)
179    # Add a horizontal line, grey dashed at 0.67
180    plt.axhline(0.67, color="grey", linestyle="--")
181    # Add a legend in the lower right corner
182    plt.legend(loc="lower right")
183    # Save and show the plot
184    save_plot(plt, "monty_hall_av.png")  # Add a closing parenthesis here
185    plt.show()
186
187
188# Call the main function if this file is run as a script
189if __name__ == "__main__":
190    main(8, 100)  # Pass the number of simulations and rounds as arguments

2.2. Monty Hall explanation

The Monty Hall problem is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall.
The problem is as follows:
Suppose you’re on a game show, and you’re given the choice of three doors:
Behind one door is a car; behind the others, goats.
You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat.
He then says to you, “Do you want to pick door No. 2?”
Is it to your advantage to switch your choice?
The surprising answer is that switching is better than staying.
If you switch, you have a 2/3 chance of winning the car, while if you stay, you have only a 1/3 chance.
This is because when you first choose a door, there is a 2/3 chance that the car is behind one of the other doors.
This probability does not change after the host reveals a goat behind one of the unchosen doors.
When the host provides information about the 2 unchosen doors (revealing that one of them does not have the car behind it), the 2/3 chance of the car being behind one of the unchosen doors rests on the unchosen and unrevealed door, as opposed to the 1/3 chance of the car being behind the door you chose initially.
A common misconception is that switching or staying does not matter because there are only two doors left and each has an equal chance of having the car.
However, this ignores the fact that switching and staying are not independent events.
Switching uses the previous information given by the host, while staying does not.
Switching is equivalent to choosing both of the other doors at the beginning, while staying is equivalent to choosing only one door at the beginning.
One way to see why switching is better is to list out all the possible outcomes and count how often you win by switching or staying.
Suppose you choose door 1 initially. Then there are three scenarios:
  • The car is behind door 1, and the host reveals either door 2 or 3 (both have goats). If you switch, you lose; if you stay, you win.

  • The car is behind door 2, and the host reveals door 3 (which has a goat). If you switch to door 2, you win; if you stay at door 1, you lose.

  • The car is behind door 3, and the host reveals door 2 (which has a goat). If you switch to door 3, you win; if you stay at door 1, you lose.

Out of three scenarios, switching wins twice and staying wins once.
Therefore, switching has a 2/3 probability of winning and staying has a 1/3 probability of winning.