Per a suggestion, I’m going to try to find a neat data set (probably one from @jsvine) to feature each week, toss up some sample code (99% of the time probably in R) and offer up a vis challenge. Just reply in the comments with a link to a gist/repo/rpub/blog/etc. post containing the code & vis with a brief explanation (or post directly, though inserting code requires some markup that you can ping me about). I’ll gather up everything into a new GitHub organization I made for this contest. You can also submit a PR right to this week’s repo.
Winners get a free digital copy of Data-Driven Security, and if you win more than once I’ll come up with other stuff to give away (either an Amazon gift card, a book or something Captain America related).
Submissions should include a story/angle/question you were trying to answer, any notes or “gotchas” that the code/comments don’t explain and a [beautiful] vis. You can use whatever language or tool (even Excel or ugh Tableau), but for the GUI tools you’ll have to describe what you did step-by-step or record a video, since the main point of this contest is to help folks learn about asking questions, munging data and making visualizations. Excel & Tableau lock that knowledge in and Tableau even locks that data in.
Droning on and on
Today’s data source comes from this week’s Data Is Plural newsletter and is all about drones. @jsvine linked to the main FAA site for drone sightings and there are enough ways to slice the data that it should make for some interesting story angles.
I will remove one of those angles with a simple bar chart of unmanned aircraft (UAS) sightings by week, using an FAA site color for the bars. I wanted to see if there were any overt visual patterns in the time of year or if the registration requirement at the end of 2015 caused any changes (I didn’t crunch the numbers to see if there were any actual patterns that could be found statistically, but that’s something y’all can do). I’m curious as to what caused the “spike” in August/September 2015 and the report text may have that data.
I’ve put this week’s example code & data into the 52 vis repo for this week.
library(ggplot2)
library(ggalt)
library(ggthemes)
library(readxl)
library(dplyr)
library(hrbrmisc) # used here for theme_hrbrmstr()
library(grid)

# get copies of the data locally
URL1 <- "http://www.faa.gov/uas/media/UAS_Sightings_report_21Aug-31Jan.xlsx"
URL2 <- "http://www.faa.gov/uas/media/UASEventsNov2014-Aug2015.xls"

fil1 <- basename(URL1)
fil2 <- basename(URL2)

# only download when there is no local copy yet
if (!file.exists(fil1)) download.file(URL1, fil1)
if (!file.exists(fil2)) download.file(URL2, fil2)

# read it in
xl1 <- read_excel(fil1)
xl2 <- read_excel(fil2)

# munge it a bit so we can play with it by various calendrical options
# (the two files keep timestamp/city/state in different column positions)
drones <- setNames(bind_rows(xl2[,1:3], xl1[,c(1,3,4)]), c("ts", "city", "state"))
drones <- mutate(drones,
                 year=format(ts, "%Y"),
                 year_mon=format(ts, "%Y%m"),
                 ymd=as.Date(ts),
                 yw=format(ts, "%Y%V")) # %V is the ISO 8601 week number

# let's see them by week
by_week <- mutate(count(drones, yw), wk=as.Date(sprintf("%s1", yw), "%Y%U%u")-7)

# this looks like bad data but I didn't investigate it too much
by_week <- arrange(filter(by_week, wk>=as.Date("2014-11-10")), wk)

# plot
gg <- ggplot(by_week, aes(wk, n))
gg <- gg + geom_bar(stat="identity", fill="#937206")
# draw the in-plot y-axis label once, at the first week, rather than once per row
gg <- gg + annotate("text", by_week$wk[1], 49, label="# reports",
                    hjust=0, vjust=1, family="Cabin-Italic", size=3)
gg <- gg + scale_x_date(expand=c(0,0))
gg <- gg + scale_y_continuous(expand=c(0,0))
gg <- gg + labs(y=NULL,
                title="Weekly U.S. UAS (drone) sightings",
                subtitle="As reported to the Federal Aviation Administration",
                caption="Data from: http://www.faa.gov/uas/law_enforcement/uas_sighting_reports/")
gg <- gg + theme_hrbrmstr(grid="Y", axis="X")
gg <- gg + theme(axis.title.x=element_text(margin=margin(t=-6)))
gg
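If you want to redo the weekly bucketing, here is a minimal base R sketch of one alternative: label each sighting with the Monday that starts its week instead of round-tripping through a year+week string. The reason it might help is that `%Y%V` pairs the calendar year with the ISO week number, so a date like 2016-01-01 (which falls in ISO week 53 of 2015) formats as "201653", and that may be part of why some buckets look odd. `week_start()` is a hypothetical helper name and the dates are made-up toy values, not FAA data.

```r
# Toy sighting dates (made up for illustration; NOT FAA data).
# They straddle New Year so the year-boundary behaviour is visible.
sightings <- as.Date(c("2015-12-28", "2015-12-31", "2016-01-01",
                       "2016-01-03", "2016-01-04"))

# week_start() is a hypothetical helper: given a Date vector, it returns
# the Monday that begins each date's week.
week_start <- function(d) {
  # %u is the weekday as a number, 1 = Monday ... 7 = Sunday
  d - (as.integer(format(d, "%u")) - 1L)
}

# Count sightings per week-start date
wk <- week_start(sightings)
by_week <- as.data.frame(table(wk = wk), responseName = "n",
                         stringsAsFactors = FALSE)
by_week$wk <- as.Date(by_week$wk) # table() turned the dates into labels

by_week
```

With the toy dates, everything from Monday 2015-12-28 through Sunday 2016-01-03 lands in a single week, and 2016-01-04 starts the next one. Against the real data the same idea would look something like `count(mutate(drones, wk = week_start(ymd)), wk)`, which yields a `wk` Date column and an `n` count that drop straight into the plotting code.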
I’ll still keep up a weekly vis from the Data Is Plural weekly collection even if this whole contest thing doesn’t take root with folks. You can never have too many examples for budding data folks to review.