2. Monty Hall

VC2M10AA02: level 10A: Devise and use algorithms and simulations to solve mathematical problems

Developing simulations for counter-intuitive problems in probability such as the Monty Hall problem or derangements

2.1. Monty Hall simulation

The Monty Hall problem is a counter-intuitive probability puzzle.

There are 3 doors, behind which are two goats and a car.
You pick a door (call it door A). You’re hoping for the car of course.
Monty Hall, the game show host, examines the other doors (B & C) and opens one with a goat. (If both doors have goats, he picks randomly.)
Do you stick with door A (original guess) or switch to the unopened door? Does it matter?
Surprisingly, the odds aren’t 50-50. If you switch doors you’ll win 2/3 of the time!

The long-run proportion of wins by switching is shown below:

Python code for the simulation:

import random
import matplotlib.pyplot as plt
from pathlib import Path

currfile_dir = Path(__file__).parent


def monty_hall():
    """
    Simulates one round of the Monty Hall problem and returns True if switching wins the car, False otherwise.

    Returns:
        bool: The outcome of switching.
    """
    # Create a list of three doors, one with a car and two with goats
    doors = ["car", "goat", "goat"]
    # Shuffle the list randomly
    random.shuffle(doors)
    # Choose a door at random as the initial choice
    choice = random.randint(0, 2)
    # Find the index of the door with the car
    car = doors.index("car")
    # Find the index of a door with a goat that is not the initial choice
    goat = random.choice([i for i in range(3) if i != choice and i != car])
    # Switch to the other door that is not the initial choice or the revealed goat
    switch = 3 - choice - goat
    # Return True if switching wins the car, False otherwise
    return switch == car


def monty_hall_simulation(n, filename):
    """
    Simulates n rounds of the Monty Hall problem and plots the outcomes of switching against the number of rounds.

    Args:
        n (int): The number of rounds to simulate.
        filename (str): The filename to save the plot as.
    """
    # Initialize a list to store the outcomes of switching
    switch_outcomes = []
    # Initialize a variable to count the number of wins by switching
    switch_wins = 0
    # Loop over n rounds
    for i in range(n):
        # Simulate one round and get the outcome
        outcome = monty_hall()
        # Update the number of wins by switching
        switch_wins += outcome
        # Append the proportion of wins by switching so far to the list
        switch_outcomes.append(switch_wins / (i + 1))
    # Call the plot function with the list of outcomes and n
    plot_outcomes(switch_outcomes, n, filename)


def plot_outcomes(switch_outcomes, n, filename):
    """
    Plots the outcomes of switching against the number of rounds and saves the plot to a file.

    Args:
        switch_outcomes (list): The list of outcomes of switching.
        n (int): The number of rounds.
        filename (str): The filename to save the plot as.
    """
    # Plot the list of outcomes against the number of rounds
    plt.plot(range(1, n + 1), switch_outcomes)
    # Add labels and title
    plt.xlabel("Number of rounds")
    plt.ylabel("Proportion of wins by switching")
    plt.title("Monty Hall Simulation")
    plt.ylim(0, 1)
    # Add a grey dashed horizontal line at the theoretical value 2/3
    plt.axhline(2 / 3, color="grey", linestyle="--")
    # Save and show the plot
    save_plot(plt, filename)
    plt.show()


def save_plot(plot, filename):
    """
    Saves the given plot to a file with the given filename in the same directory as this script.

    Args:
        plot (matplotlib.pyplot): The plot to save.
        filename (str): The filename to save the plot as.
    """
    filepath = currfile_dir / filename
    plot.savefig(filepath, dpi=600)


monty_hall_simulation(200, "monty_hall_200.png")
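
For a quick numerical check without a plot, the two strategies can also be compared directly. The following is a minimal sketch, not part of the original program: the names play_round and compare_strategies are hypothetical, and the seed parameter is an assumption added so that runs can be repeated.

```python
import random


def play_round(rng):
    """Play one round; return (stay_wins, switch_wins) as booleans."""
    car = rng.randrange(3)      # door hiding the car
    choice = rng.randrange(3)   # contestant's first pick
    # Host opens a goat door that is neither the pick nor the car
    opened = rng.choice([d for d in range(3) if d != choice and d != car])
    # The switch door is the one remaining closed door
    switched = next(d for d in range(3) if d != choice and d != opened)
    return choice == car, switched == car


def compare_strategies(n, seed=None):
    """Return (stay_rate, switch_rate) over n rounds."""
    rng = random.Random(seed)
    stay = switch = 0
    for _ in range(n):
        stay_wins, switch_wins = play_round(rng)
        stay += stay_wins
        switch += switch_wins
    return stay / n, switch / n


if __name__ == "__main__":
    stay_rate, switch_rate = compare_strategies(10_000, seed=1)
    print(f"stay: {stay_rate:.3f}  switch: {switch_rate:.3f}")
```

In every round exactly one of the two strategies wins, so the two rates always add up to 1, and with enough rounds they should settle near 1/3 and 2/3.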

The average of several simulations can be graphed:

Python code for the simulation:

import random
import matplotlib.pyplot as plt
from pathlib import Path
import statistics  # Import the statistics module to use the mean function
import webcolors  # Import the webcolors module
import colorsys  # Import the colorsys module


currfile_dir = Path(__file__).parent


def monty_hall():
    """
    Simulates one round of the Monty Hall problem and returns True if switching wins the car, False otherwise.

    Returns:
        bool: The outcome of switching.
    """
    # Create a list of three doors, one with a car and two with goats
    doors = ["car", "goat", "goat"]
    # Shuffle the list randomly
    random.shuffle(doors)
    # Choose a door at random as the initial choice
    choice = random.randint(0, 2)
    # Find the index of the door with the car
    car = doors.index("car")
    # Find the index of a door with a goat that is not the initial choice
    goat = random.choice([i for i in range(3) if i != choice and i != car])
    # Switch to the other door that is not the initial choice or the revealed goat
    switch = 3 - choice - goat
    # Return True if switching wins the car, False otherwise
    return switch == car


def monty_hall_simulation(n):
    """
    Simulates n rounds of the Monty Hall problem and returns a list of outcomes of switching.

    Args:
        n (int): The number of rounds to simulate.

    Returns:
        list: The list of outcomes of switching.
    """
    # Initialize a list to store the outcomes of switching
    switch_outcomes = []
    # Initialize a variable to count the number of wins by switching
    switch_wins = 0
    # Loop over n rounds
    for i in range(n):
        # Simulate one round and get the outcome
        outcome = monty_hall()
        # Update the number of wins by switching
        switch_wins += outcome
        # Append the proportion of wins by switching so far to the list
        switch_outcomes.append(switch_wins / (i + 1))
    # Return the list of outcomes
    return switch_outcomes


def lighten_color_name(color_name, factor):
    """
    Takes a color name and a factor between 0 and 1 and returns a lighter color in RGB format.

    Args:
        color_name (str): The name of the color, such as 'red', 'blue', etc.
        factor (float): The factor by which to increase the value component of the color. Should be between 0 and 1.

    Returns:
        tuple: The lighter color in RGB format as a tuple of three numbers between 0 and 1.
    """
    # Convert the color name to a hex string using the webcolors module
    hex_color = webcolors.name_to_hex(color_name)
    # Convert the hex string to a tuple of RGB values using the webcolors module
    rgb_color = webcolors.hex_to_rgb_percent(hex_color)
    # Convert the RGB values to floats between 0 and 1
    rgb_color = tuple(float(x.strip("%")) / 100 for x in rgb_color)
    # Convert the RGB color to HSV format using the colorsys module
    h, s, v = colorsys.rgb_to_hsv(*rgb_color)
    # Increase the value component by the factor, but make sure it does not exceed 1
    v = min(v + factor, 1)
    # Convert the HSV color back to RGB format using the colorsys module
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    # Return the lighter color as a tuple
    return (r, g, b)


def new_colors():
    # Define a list of color names to use for each simulation
    color_names = [
        "red",
        "orange",
        "plum",
        "green",
        "blue",
        "indigo",
        "violet",
        "pink",
        "brown",
        "skyblue",
        "lightgreen",
        "peachpuff",
        "yellow",
    ]
    # Define the lightening factor paired with each color name
    factors = [0.1, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.4]
    # Lighten each color by its factor (avoids reusing the function name as a variable)
    return [
        lighten_color_name(name, factor)
        for name, factor in zip(color_names, factors)
    ]


def plot_outcomes(switch_outcomes, color, rounds, label, linewidth=1, linestyle="-"):
    """
    Plots the first `rounds` outcomes of switching using the given color, label, linewidth and linestyle.

    Args:
        switch_outcomes (list): The list of outcomes of switching.
        color (str or tuple): The matplotlib color to use for the plot.
        rounds (int): The number of rounds to plot.
        label (str): The label to use for the plot.
        linewidth (int): The linewidth to use for the plot. Default is 1.
        linestyle (str): The linestyle to use for the plot. Default is '-'.
    """
    # Plot the list of outcomes against the rounds using the given color, label, linewidth and linestyle
    plt.plot(
        range(1, rounds + 1),
        switch_outcomes[:rounds],
        color=color,
        label=label,
        linewidth=linewidth,
        linestyle=linestyle,
    )


def save_plot(plot, filename):
    """
    Saves the given plot to a file with the given filename in the same directory as this script.

    Args:
        plot (matplotlib.pyplot): The plot to save.
        filename (str): The filename to save the plot as.
    """
    filepath = currfile_dir / filename
    plot.savefig(filepath, dpi=600)


def main(sims, rounds):
    """
    Runs sims simulations of the Monty Hall problem with rounds rounds each and plots them on one figure.
    """
    # Get the list of colors to use for the simulations
    colors = new_colors()
    # Run sims simulations of rounds rounds each and store each running proportion
    data = [monty_hall_simulation(rounds) for _ in range(sims)]
    # Plot each simulation with its own color and label
    for i in range(sims):
        # Cycle through the colors so that more simulations than colors still works
        color = colors[i % len(colors)]
        plot_outcomes(data[i], color, rounds, f"Simulation {i + 1}")
    # Average the simulations round by round and plot the result as a thick black dotted line
    average = [statistics.mean(data[i][k] for i in range(sims)) for k in range(rounds)]
    plot_outcomes(average, "black", rounds, "Overall average", 4, ":")
    # Add labels and title
    plt.xlabel("Number of rounds")
    plt.ylabel("Proportion of wins by switching")
    plt.title("Monty Hall Simulation")
    # plt.xscale("log")
    plt.ylim(0, 1)
    # Add a grey dashed horizontal line at the theoretical value 2/3
    plt.axhline(2 / 3, color="grey", linestyle="--")
    # Add a legend in the lower right corner
    plt.legend(loc="lower right")
    # Save and show the plot
    save_plot(plt, "monty_hall_av.png")
    plt.show()


# Call the main function if this file is run as a script
if __name__ == "__main__":
    main(8, 100)  # Pass the number of simulations and rounds as arguments

2.2. Monty Hall explanation

The Monty Hall problem is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall.

The problem is as follows:
Suppose you’re on a game show, and you’re given the choice of three doors:
Behind one door is a car; behind the others, goats.
You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat.
He then says to you, “Do you want to pick door No. 2?”
Is it to your advantage to switch your choice?

The surprising answer is that switching is better than staying.
If you switch, you have a 2/3 chance of winning the car, while if you stay, you have only a 1/3 chance.
This is because when you first choose a door, there is a 2/3 chance that the car is behind one of the other doors.
This probability does not change after the host reveals a goat, because he knows where the car is and can always open a goat door among the two you did not choose.
His reveal therefore moves the whole 2/3 chance onto the one unchosen door that is still closed, while the door you chose initially keeps its 1/3 chance.
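
The same conclusion can be written out with Bayes' theorem. This is a sketch under the assumptions of the example above (you pick door 1, the host opens door 3, and he chooses at random when both unchosen doors hide goats). The symbols C_i (the car is behind door i) and H_3 (the host opens door 3) are notation introduced here for illustration.

```latex
\begin{align*}
P(C_1) &= P(C_2) = P(C_3) = \tfrac{1}{3} \\
P(H_3 \mid C_1) &= \tfrac{1}{2}, \qquad
P(H_3 \mid C_2) = 1, \qquad
P(H_3 \mid C_3) = 0 \\
P(H_3) &= \tfrac{1}{2}\cdot\tfrac{1}{3} + 1\cdot\tfrac{1}{3} + 0\cdot\tfrac{1}{3} = \tfrac{1}{2} \\
P(C_2 \mid H_3) &= \frac{P(H_3 \mid C_2)\,P(C_2)}{P(H_3)}
  = \frac{1\cdot\tfrac{1}{3}}{\tfrac{1}{2}} = \tfrac{2}{3} \\
P(C_1 \mid H_3) &= \frac{P(H_3 \mid C_1)\,P(C_1)}{P(H_3)}
  = \frac{\tfrac{1}{2}\cdot\tfrac{1}{3}}{\tfrac{1}{2}} = \tfrac{1}{3}
\end{align*}
```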

A common misconception is that switching or staying does not matter because there are only two doors left and each has an equal chance of having the car.
However, this ignores the fact that the host's choice is not random: he knows where the car is and never opens your door or the door hiding the car.
Switching makes use of the information his choice gives away, while staying ignores it.
Switching is equivalent to choosing both of the other doors at the beginning, while staying is equivalent to choosing only one door at the beginning.
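
The equivalence can be checked round by round. The sketch below is illustrative only: the function name switching_matches_other_two and the seed parameter are hypothetical, and the game is modelled with the same rules as the simulation earlier on this page. It confirms that switching wins in exactly those rounds where the car is behind one of the two doors you did not pick first.

```python
import random


def switching_matches_other_two(n, seed=None):
    """Return True if, in all n rounds, switching wins exactly when
    the car is behind one of the two doors not picked first."""
    rng = random.Random(seed)
    for _ in range(n):
        car = rng.randrange(3)
        choice = rng.randrange(3)
        # Host opens a door that is neither the pick nor the car
        opened = rng.choice([d for d in range(3) if d not in (choice, car)])
        # Door indices 0, 1, 2 sum to 3, so this is the remaining door
        switched = 3 - choice - opened
        switch_wins = switched == car
        car_in_other_two = car != choice
        if switch_wins != car_in_other_two:
            return False
    return True


if __name__ == "__main__":
    print(switching_matches_other_two(10_000, seed=0))
```

Since the first pick misses the car 2/3 of the time, this round-by-round match is another way to see why switching wins 2/3 of the time.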

One way to see why switching is better is to list out all the possible outcomes and count how often you win by switching or staying.

Suppose you choose door 1 initially. Then there are three equally likely scenarios, one for each position of the car:

The car is behind door 1, and the host reveals either door 2 or 3 (both have goats). If you switch, you lose; if you stay, you win.
The car is behind door 2, and the host reveals door 3 (which has a goat). If you switch to door 2, you win; if you stay at door 1, you lose.
The car is behind door 3, and the host reveals door 2 (which has a goat). If you switch to door 3, you win; if you stay at door 1, you lose.

Out of these three equally likely scenarios, switching wins twice and staying wins once.

Therefore, switching has a 2/3 probability of winning and staying has a 1/3 probability of winning.
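
The counting argument can also be carried out exactly by enumerating every case with its weight. The following is a minimal sketch rather than part of the original programs: the function name exact_probabilities is hypothetical, and it assumes the first pick is uniformly random over all three doors (the text above fixes door 1, which gives the same result) and that the host chooses at random when he has two goat doors to pick from.

```python
from fractions import Fraction
from itertools import product


def exact_probabilities():
    """Enumerate every (car, pick, opened) case with its exact weight."""
    stay = switch = Fraction(0)
    for car, pick in product(range(3), repeat=2):
        # Doors the host may open: not the pick and not the car
        options = [d for d in range(3) if d != pick and d != car]
        for opened in options:
            # car and pick are each uniform (1/3 * 1/3 = 1/9);
            # the host chooses uniformly among his options
            weight = Fraction(1, 9) / len(options)
            switched = 3 - pick - opened  # the remaining closed door
            if pick == car:
                stay += weight
            if switched == car:
                switch += weight
    return stay, switch


print(exact_probabilities())  # prints (Fraction(1, 3), Fraction(2, 3))
```

Using Fraction keeps the arithmetic exact, so the result is 1/3 and 2/3 rather than a rounded approximation.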