By Rob Grant
Often I use R to handle large datasets, analyze the data and filter out the data I don’t need.
When all this is done, I usually use
write.csv() to print my data off and reopen it in Google Sheets.
My workflow would look something like this:
full_data However, there's an R package that provides a bridge between your Google account and your R environment: googlesheets. Using this package we can read data directly from Google, modify it and create new files within our Google Drive.
Step 1: Install googlesheets
Step 2: Authenticate your Google account
Before we can do anything we need to allow google sheets to access our account.
We can do this by running this:
gs_auth(new_user = TRUE)
Have a browser open (Google Chrome worked for me) and it should open a new tab asking you to connect via an account:
You can rerun this command any time you want to change accounts.
Sometimes if you don’t use the token for a while it will run out and you will have to refresh it, which it will initiate automatically if you run a command that requires you to connect to the Google API (i.e. any of the specialised googlesheets functions).
Step 3: See what’s in your Google Account
Calling the function gs_ls() will show you spreadsheets in your account.
gs_ls() # A tibble: 15 x 10 sheet_title author perm version updated sheet_key 1 for googlesheets rforjournali… rw new 2017-12-11 09:44:54 1Y0WCfTW… 2 Avon and Somerset Septe… rforjournali… rw new 2017-11-19 12:46:55 1TfC5Fs6… 3 Mid year 2015 UK popula… rforjournali… rw new 2017-10-25 21:19:52 1Vqg560s… 4 Cleveland 2016-09 rforjournali… rw new 2016-11-26 10:18:16 19xBr8nU… 5 Rankings of US presiden… rforjournali… rw new 2016-11-08 04:39:55 11PZxq7y… 6 Tennis #1s rforjournali… rw new 2016-11-06 20:04:42 1Riz8GRs… 7 Young persons railcard rob.grant rw new 2016-11-06 13:05:55 1XZsjJxu… 8 Copy of Young persons r… rforjournali… rw new 2016-11-05 18:14:38 1oUpRS-D… 9 defective rforjournali… rw new 2016-11-05 11:40:30 1jWZBILC… 10 Asylum rforjournali… rw new 2016-10-27 19:04:05 1CRMl2_1… 11 Buses rforjournali… rw new 2016-10-24 20:07:41 1qy9Z-sn… 12 Untitled spreadsheet rforjournali… rw new 2016-10-24 19:22:42 1_f_FI5n… 13 Population rforjournali… rw new 2016-10-24 18:29:17 1rrOQuV5… 14 Drugs rforjournali… rw new 2016-10-18 21:37:29 1UTsnGM6… 15 Food rforjournali… rw new 2016-10-15 13:24:30 1aWEAPR4…
Step 4: Read a spreadsheet
I am going to select the first spreadsheet ‘for googlesheets’ by its title. It’s a selection of 50 random numbers between 0 and 1 (you can recreate this function with
runif() in R).
You can also locate the sheet by the key (the letters, numbers and characters after the /d/ in the URL) for the same result
This gives us a list, which we can turn into a data frame using the
for_gs_sheet Classes ‘tbl_df', ‘tbl' and 'data.frame': 50 obs. of 2 variables: $ Number.x: num 0.4696 0.1587 0.0949 0.1823 0.0885 ... $ Number.y: num 0.67551 0.7041 0.00167 0.51302 0.20114 ... - attr(*, "spec")=List of 2 ..$ cols :List of 2 .. ..$ Number.x: list() .. .. ..- attr(*, "class")= chr "collector_double" "collector" .. ..$ Number.y: list() .. .. ..- attr(*, "class")= chr "collector_double" "collector" ..$ default: list() .. ..- attr(*, "class")= chr "collector_guess" "collector" ..- attr(*, "class")= chr "col_spec"
Step 5: Modify the spreadsheet
Next up, we modify our spreadsheet using the
This function has several arguments that we need to employ to edit our spreadsheet properly.
gs_edit_cells(for_gs, ws = "Sheet1", anchor = "A2", input = c(1,2), byrow = TRUE)
ws argument refers to the sheet name in the spreadsheet. The
anchor argument refers to the cell from which the modification will begin. In my example, I am editing two cells, where the first one will be the anchor cell A2. The
byrow argument indicates that the modification will apply horizontally (change to FALSE for vertical editing).
Note that this won’t change our data frame
for_gs_sheet that is based on this spreadsheet; just the spreadsheet itself.
Cell A2 now has a value of 1. A3 is 2.
Step 6: Create a Google Sheets file using R
We can create new spreadsheets using this package using
We’ll use the
mtcars dataset as a test:
gs_new(title = "mtcars", ws_title = "first_sheet", input = mtcars)
It worked, except it didn’t include the rownames, which contains the cars.
That doesn’t matter, we can add them using
gs_edit_cells(), changing the
byrow argument to FALSE this time.
#register the new mtcars sheet in R gs_new(title = "mtcars", ws_title = "first_sheet", input = mtcars) #insert the rownames vertically in column L gs_edit_cells(mtcars_sheet, ws = "first_sheet", anchor = "L2", input = rownames(mtcars), byrow = FALSE)
That was a quick overview of the most basic functions of the google sheets package.
This is a really useful package. A lot of my work involves reading data in Google Sheets either before or after using R.
Googlesheets means I won’t have to bother with
write.csv() as much in the future, saving me time.
So thanks to Jenny Bryan for creating it!
- DataFrames Vs RDDs in Spark – Part 1
- Parallel Operations
- R Markdown: How to number and reference tables
- RDBL – manipulate data in-database with R code only
- R Markdown: How to format tables and figures in .docx files
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…
Source:: R News