A Config File to Save Them: A Workflow to Use Microsoft Sharepoint and R

If your working for a company that uses Microsoft Sharepoint as the central file sharing platform and are as impatient as I am, you will be annoyed sooner rather than later. Saving and reading xlsx or csv files is so much more inefficient compared to other cloud storing services like Google Drive/Googlesheets. Unfortunately, there is no package like googlesheets4 for Microsoft products that works just out of the box.

Of course, there are similar packages to communicate with Microsoft Sharepoint or Microsoft OneDrive via an API. But some of them like Microsoft365R work only well with OneDrive and Sharepoint online, but not for instance an on premise version of a Sharepoint. Or one needs special rights which are very hard to be granted by IT departments if the main reason is just to make saving and reading files easier for a lazy R programmer. In short: It is unnecessarily tedious.

But there is rescue! I developed a simple workflow without any APIs or special access rights. After all, Sharepoint and OneDrive are not just cloud storages, they are also integrated into the file system of every computer and are synchronized from there. I can take advantage of that.

The Requirements

  • I want to share Excel (or any other files) from R with other people over a Microsoft Sharepoint
  • The scripts are stored in Git repos and the Excel spreadsheets in a Sharepoint
  • Paths must be accessible regardless of OS and user

The Workflow

So I need a workflow that targets paths individually per user and at the same time allows generic path references in scripts.

To do so, I use the config package. It helps to store and read paths, parameters or even R code in a yaml config file that each person interacting with the script has to create on his or her computer.

There are two places that are suitable for this purpose:

  1. in the root directory of each user (~/); this is useful for paths or other configuration information that is used in multiple repos
  2. in the root folder of a repo/working directory of an .RProj

The docs say:

By default configuration data is read from a file named config.yml within the current working directory (or parent directories if no config file is found in the initially specified directory).

To make the example a tiny bit more complex, I call it my-test-config.yml and add two folders in yaml syntax: the Sharepoint root folder and a subfolder.

default:
  sharepoint: "~/My Sharepoint Root Folder/"
  sharepoint_special_project: "Folder for Special Project/"

After that I load the config file by calling the get() function. The result is a list called config, which elements are accessible with $:

# install.packages("config")
library(config)
#> 
#> Attaching package: 'config'
#> The following objects are masked from 'package:base':
#> 
#>     get, merge
config <- config::get(file = "my-test-config.yml")

# get the sharepoint root folder
config$sharepoint
#> [1] "~/My Sharepoint Folder/"
# get the sharepoint subfolder folder
config$sharepoint_special_project
#> [1] "Folder for Special Project/"

# build a full path that combines the root and the subfolder
paste0(config$sharepoint, config$sharepoint_special_project)
#> [1] "~/My Sharepoint Folder/Folder for Special Project/"

A simple call of paste0 combines the root folder and subfolder and it’s straightforward to include also a file name and extension. If you work on a Sharepoint, you most likely deal with xlsx-files:

file <- paste0(config$sharepoint, config$sharepoint_special_project, "my-file.xlsx")

Assuming I want to save the PlantGrowth data that comes as a default dataset (just like mtcars), I can use the  file variable to save an xlsx-file with the writexl package. To check if it works, I re-import the saved file with the readxl package.

# save als xlsx with the writexlsx package
writexl::write_xlsx(PlantGrowth, file = file)

# and import as dataframe with readxl package
PlantGrowth <- readxl::read_excel(file = file)

This allows different persons to save a file in a predefined Sharepoint folder or import it from there. The working directory used in R is thus completely independent of the location of the Sharepoint. Never again absolute paths like C:/User/... in a R script that needs to be edited if someone else works with it …

Sharing Files in a Team

An Excel file that just sits sadly in a folder without being looked at is no good. Circulating it in a team is the next natural step. The most convenient way for me has proven to be to open the Excel file once, and use the Share-Button.

Assuming that the file in question is buried somewhere deep in a project’s subfolder on a Sharepoint, I need to find my way in my machine’s file directory until finally I would be able double-click on the desired file. Of course, there’s a workaround!

Luckily, there’s still the direct path to the file as a file variable in the environment. This is handy, because the fs package let’s me open files directly with a function call. And then I can click on that share button in Excel.

fs::file_show(file)

Even more convenient: Use a Snippet

Snippets offer auto-complete function that insert any lines of code defined in RStudio options. Any snippet can be added under: Tools -> Global Options -> Code -> Edit Snippets.

There’s one caveat: In snippets one can’t store variables with $ directly. This means I need to include spaces as a mini hack. So each time I use the snippet, I have to remove these spaces. But it’s still much less effort than without the snippet:

snippet config_special_project
  config <- config::get(file = "conf/config.yml")
  file <- paste0(config $sharepoint, config $sharepoint_special_project, "")

So next time I start typing configconfig_special_project will appear and with a tap on tab I add the two lines.

Summary

These few lines of code save me so much time:

library(config)
library(readxl)
library(writexl)
library(fs)
# load config
config <- config::get(file = "my-test-config.yml")
# determine file path incl. name
file <- paste0(config$sharepoint, config$sharepoint_special_project, "my-file.xlsx")
# save als xlsx with the writexlsx package
writexl::write_excel(PlantGrowth, file = file)
# open saved file in Excel standalone Application
fs::file_show(file)
# and import as dataframe
PlantGrowth <- readxl::read_excel(file = file)

This is the first blog post I wrote as an Rmd file in RStudio and published it via the WordPress API using the goodpress package.