By David Smith
Several recent blog posts have explored the Secret Santa problem and provided solutions in R. This post provides a roundup of various solutions and how they are implemented in R.
If you wanted to set up a “Secret Santa” gift exchange at the office, you could put everyone’s name into a hat and have each participant draw a name at random. The problem is that someone might draw their own name, but if that happens you can just reshuffle all the names back into the hat and start the process over. That’s essentially what the R code below, from a blog post by David Selby, does:
— Mara Averick (@dataandme) November 24, 2017
That’s not an entirely satisfying solution (at least to me), with all of the having to check for self-giving and restarting if so. Thomas Lumley calculates that the chance of requiring a do-over is about 63% (for more than 2 participants, anyway), and on average you’d need about 2.7 tries to get a “valid” gift list. (This is an example of the negative binomial distribution in action: we keep on drawing a set of names from the hat, until we get a set that has no-one giving to themselves. You can generate 100 examples of this playing out in R with
rnbinom(100,1, exp(-1))+1 — the
+1 is for the final successful draw from the hat.)
An easier way might be simply to seat all the participants in random order in a circle and assign them to give a gift to the person on their right. This is easy to do in R, and doing it in code has the benefit of keeping the recipients secret from each other. First, let’s select 10 names from the babynames dataset:
> library(babynames) > santas sample(babynames$name, 10, prob=babynames$prop)
prob=babynames$prop selects names according to their prevalence in US births 1880-2015.) Then, it’s a simple matter of reordering the names at random, and assigning a gift to the next in line:
> p sample(santas) > cbind(santa=p, recipient=c(tail(p,-1),p)) santa recipient [1,] "Sherman" "Shayna" [2,] "Shayna" "Elizabeth" [3,] "Elizabeth" "Mary" [4,] "Mary" "Kathleen" [5,] "Kathleen" "Russell" [6,] "Russell" "James" [7,] "James" "Arlene" [8,] "Arlene" "Ruth" [9,] "Ruth" "Darryl" [10,] "Darryl" "Sherman"
Now, this “sit in a circle” process isn’t quite the same as the “keep on drawing names from the hat” process. In the above example, the 10 names form a cycle, and it will never happen that Arlene gives to Ruth and Ruth gives to Arlene. Nonetheless, I think it’s the simplest and fairest way of generating a Secret Santa list.
Thinking of the gift-giving relationship as a graph, the process above generates a Hamiltonian path through each of the recipients, and never generates more than one clique. Tristan Mahr explores the graph-theory nature of the Secret Santa problem in a blog post. There, he uses the DiagrammeR package to solve variants of the Secret Santa problem by constructing graphs, which can created and visualized quite easily in R.
Finally, Sarah Lotspeich and Lucy D’Agostino McGowan take the whole process one step further and show how to generate emails each participant using the ponyexpress package, notifying them of their Secret Santa recipient.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…
Source:: R News