Probability is at the heart of data science. Simulation is also commonly used in algorithms such as the bootstrap. After completing this exercise, you will have a slightly stronger intuition for probability and for writing your own simulation algorithms.
Most of the problems in this set have an exact analytical solution. That is not the case for all probability problems, but it makes these great for practice, since you can check your simulated result against the exact answer.
To get the most out of the exercises, it pays off to read the instructions carefully and think about what the solution should be before starting to write R code. Often this helps you weed out irrelevant information that can otherwise make your algorithm unnecessarily complicated.
Answers are available here.
In 100 coin tosses, what is the probability of having the same side come up 10 times in a row?
You might want to use some of the following functions to answer this question:
sample(), rbinom(), rle().
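As a starting point, here is one possible sketch of the simulation using sample() and rle(). It reads "10 times in a row" as "at least 10 times in a row", which is an assumption, and the helper name has_run and the number of replications are illustrative choices rather than part of the exercise.

```r
set.seed(1)  # fixed seed so the estimate is reproducible

# TRUE if some side comes up at least `run_len` times in a row
# (has_run is a hypothetical helper name)
has_run <- function(n_tosses = 100, run_len = 10) {
  tosses <- sample(c("H", "T"), n_tosses, replace = TRUE)
  # rle() gives the lengths of consecutive runs of identical values
  max(rle(tosses)$lengths) >= run_len
}

# Share of simulated sequences that contain such a run
p_run <- mean(replicate(10000, has_run()))
p_run
```

The same helper can be reused for other sequence lengths or run lengths by changing its two arguments.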
Six kids are standing in line. What is the probability that they are in alphabetical order by name? Assume no two children have the same exact name.
Remember the kids from the last question? There are three boys and three girls. How likely is it that all the girls come first?
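If you get stuck, one way to sketch this is to shuffle the line many times and count how often the first three places are all girls. The labels "boy" and "girl" and the helper girls_first are illustrative names, and the exact value used for comparison assumes every ordering of the six kids is equally likely.

```r
set.seed(1)

kids <- rep(c("boy", "girl"), each = 3)

# TRUE if the first three positions of a random line-up are all girls
# (girls_first is a hypothetical helper name)
girls_first <- function() {
  line <- sample(kids)  # random permutation of the six kids
  all(line[1:3] == "girl")
}

p_sim <- mean(replicate(20000, girls_first()))

# Exact value: one of the choose(6, 3) equally likely sets of
# positions for the girls is "the first three"
p_exact <- 1 / choose(6, 3)

c(simulated = p_sim, exact = p_exact)
```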
In six coin tosses, what is the probability of having a different side come up with each throw, that is, that you never get two tails or two heads in a row?
A random five-card poker hand is dealt from a standard deck. What is the chance of a flush (all cards are the same suit)?
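Because a flush is rare, a simulation needs many replications to give a stable estimate. The sketch below only tracks suits, since ranks do not matter here, and compares the estimate with the counting argument 4 * choose(13, 5) / choose(52, 5). It assumes that straight flushes count as flushes; the helper name is_flush is illustrative.

```r
set.seed(1)

# Only the suit of each card matters, so the deck is 13 cards of each suit
suits <- rep(c("clubs", "diamonds", "hearts", "spades"), each = 13)

# TRUE if five cards drawn without replacement share one suit
# (is_flush is a hypothetical helper name)
is_flush <- function() {
  length(unique(sample(suits, 5))) == 1
}

p_sim <- mean(replicate(100000, is_flush()))

# Exact value: choose a suit, then 5 of its 13 cards
p_exact <- 4 * choose(13, 5) / choose(52, 5)

c(simulated = p_sim, exact = p_exact)
```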
In a random thirteen-card hand from a standard deck, what is the probability that none of the cards is an ace and none is a heart (♥)?
Four parties are attended by 13, 23, 33, and 53 people, respectively. For each party, how likely is it that at least two individuals share a birthday? Assume there are no leap days, so that every year has 365 days, and that births are uniformly distributed over the year.
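A possible sketch, under the stated assumptions of 365 equally likely birthdays: draw birthdays with sample() and check for duplicates, then compare with the product formula for the probability that all birthdays differ. The helper name shared_birthday and the number of replications are illustrative.

```r
set.seed(1)

# TRUE if at least two of n guests share a birthday
# (shared_birthday is a hypothetical helper name)
shared_birthday <- function(n) {
  birthdays <- sample(365, n, replace = TRUE)
  any(duplicated(birthdays))
}

party_sizes <- c(13, 23, 33, 53)

# Simulated probability for each party size
p_sim <- sapply(party_sizes, function(n) {
  mean(replicate(10000, shared_birthday(n)))
})

# Exact probability: 1 minus the chance that all n birthdays differ
p_exact <- sapply(party_sizes, function(n) {
  1 - prod((365 - 0:(n - 1)) / 365)
})

data.frame(party_size = party_sizes, simulated = p_sim, exact = p_exact)
```

Base R also ships pbirthday() in the stats package, which can serve as a further cross-check of the exact values.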
A famous coin tossing game has the following rules: The player tosses a coin repeatedly until a tail appears or tosses it a maximum of 1000 times if no tail appears. The initial stake starts at 2 dollars and is doubled every time heads appears. The first time tails appears, the game ends and the player wins whatever is in the pot. Thus the player wins 2 dollars if tails appears on the first toss, 4 dollars if heads appears on the first toss and tails on the second, 8 dollars if heads appears on the first two tosses and tails on the third, and so on. Mathematically, the player wins 2^k dollars, where k equals the number of tosses until the first tail. What is the probability of profit if it costs 15 dollars to participate?
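One way to sketch this game without simulating every toss is to draw the number of tosses until the first tail directly. rgeom() returns the number of failures before the first success, so adding 1 gives the toss on which the first tail appears. Treating a game that reaches the 1000-toss cap as paying 2^1000 is an assumption; that case is so unlikely that it does not affect the estimate.

```r
set.seed(1)

n_games <- 100000
entry_fee <- 15

# Toss on which the first tail appears, capped at 1000 tosses
k <- pmin(rgeom(n_games, prob = 0.5) + 1, 1000)

# The pot is 2^k dollars when the game ends
winnings <- 2^k

# Share of games in which the winnings exceed the entry fee
p_profit <- mean(winnings > entry_fee)
p_profit
```

For comparison, a profit requires winning at least 16 dollars, which means the first three tosses must all be heads, an event with probability 1/8.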
Back to coin tossing. What is the probability the pattern heads-heads-tails appears before tails-heads-heads?
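A minimal sketch of one approach: keep tossing until the last three tosses match one of the two patterns, and record which pattern came first. The helper name first_pattern is illustrative, and the coin is assumed to be fair.

```r
set.seed(1)

# Toss a fair coin until HHT or THH appears; return the winning pattern
# (first_pattern is a hypothetical helper name)
first_pattern <- function() {
  tosses <- sample(c("H", "T"), 3, replace = TRUE)
  repeat {
    last3 <- paste(tail(tosses, 3), collapse = "")
    if (last3 == "HHT") return("HHT")
    if (last3 == "THH") return("THH")
    # Neither pattern yet: toss once more
    tosses <- c(tosses, sample(c("H", "T"), 1))
  }
}

p_hht_first <- mean(replicate(10000, first_pattern() == "HHT"))
p_hht_first
```

As a check on the estimate, heads-heads-tails can only win if the first two tosses are both heads, because once a tail has been tossed, the first pair of heads that follows completes tails-heads-heads. That gives an exact probability of 1/4.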
Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car; behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He then says to you, “Do you want to pick door #2?” What is the probability of winning the car if you use the strategy of first picking a random door and then switching doors every time? Note that the host will always open a door you did not pick, and it always reveals a goat.
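The always-switch strategy can be sketched as below. The helper name switch_wins is illustrative, and the host is assumed to choose at random when two goat doors are available. The explicit length check avoids a well-known pitfall: sample(x, 1) with a single number x draws from 1:x rather than returning x.

```r
set.seed(1)

# Play one game with the always-switch strategy; TRUE if the car is won
# (switch_wins is a hypothetical helper name)
switch_wins <- function() {
  doors <- 1:3
  car <- sample(doors, 1)
  pick <- sample(doors, 1)

  # The host opens a door that is neither the pick nor the car
  candidates <- setdiff(doors, c(car, pick))
  opened <- if (length(candidates) == 1) candidates else sample(candidates, 1)

  # Switch to the one remaining closed door
  final <- setdiff(doors, c(pick, opened))
  final == car
}

p_switch <- mean(replicate(10000, switch_wins()))
p_switch
```

Switching wins exactly when the first pick was a goat, so the estimate should be close to 2/3.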