Survey of Kagglers finds Python, R to be preferred tools

By David Smith

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Competitive predictive modeling site Kaggle surveyed participants in its prediction competitions, and the 16,000 responses offer some insight into that user community. (Whether those trends generalize to the wider community of all data scientists is unclear, however.) One question of interest asked what tools Kagglers use at work. Python is the most commonly used tool within this community, and R is second. (Respondents could select more than one tool.)

Interestingly, the rankings varied with the respondent's job title. Either R or Python ranked first in every job-title subgroup except one (database administrators, who preferred SQL), dividing as follows:

  • R: Business Analyst, Data Analyst, Data Miner, Operations Researcher, Predictive Modeler, Statistician
  • Python: Computer Scientist, Data Scientist, Engineer, Machine Learning Engineer, Other, Programmer, Researcher, Scientist, Software Developer
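For readers curious how a per-job-title ranking like this can be derived from a multi-select question, here is a minimal Python sketch. It is not the code behind Kaggle's analysis (that was written in R), and the responses, field names and function name below are invented purely for illustration; they are not taken from the survey dataset.

```python
from collections import Counter, defaultdict

# Toy, invented responses (NOT from the Kaggle survey). Each respondent has a
# job title and a multi-select list of tools used at work.
responses = [
    {"job_title": "Statistician", "tools": ["R", "SQL"]},
    {"job_title": "Statistician", "tools": ["R", "Python"]},
    {"job_title": "Data Scientist", "tools": ["Python", "R", "SQL"]},
    {"job_title": "Data Scientist", "tools": ["Python"]},
    {"job_title": "Database Administrator", "tools": ["SQL"]},
    {"job_title": "Database Administrator", "tools": ["SQL", "Python"]},
]


def top_tool_by_job_title(rows):
    """Return {job_title: most-selected tool} for multi-select answers."""
    counts = defaultdict(Counter)
    for row in rows:
        # set() ensures one respondent counts at most once per tool.
        counts[row["job_title"]].update(set(row["tools"]))
    # most_common(1) yields [(tool, count)] for the highest count.
    return {title: c.most_common(1)[0][0] for title, c in counts.items()}


top_tools = top_tool_by_job_title(responses)
for title, tool in sorted(top_tools.items()):
    print(f"{title}: {tool}")
```

Because respondents could pick several tools, each selection is tallied separately, so the counts within a job title can add up to more than the number of respondents. Ties for first place are not handled specially in this sketch.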

You can find summaries of the other questions in the survey at the link below. An anonymized dataset of survey responses is also available, as is the “Kaggle Kernel” (a kind of notebook) of the R code behind the survey analysis.

Kaggle: The State of Data Science and Machine Learning, 2017


