I am a big fan of Excel. It has been one of the biggest productivity boosting technologies invented in recent times. But it has its limitations.
Learning to code, like learning any language, is both challenging and rewarding. It takes time, effort and perseverance.
Learning R is no different. The initial learning phase can be daunting and there are many moments when, deadlines looming, you abandon ship and return to the safety of Excel. But once you build up confidence, you’ll find the benefits are huge. Tasks can be done much faster. New types of analysis are possible. Repetitive tasks are automatable and workflows can be shared with colleagues.
Here are 6 (of many) reasons to take the leap from Excel to R. (PS. it’s also free!!)
1. Cleaning data
Transforming messy raw data into usable clean processed data is generally the first data analysis task of a new project. There are many subtasks - merging different datasets together, checking for missing or erroneous data, reshaping tables, reformatting and transforming variables.
Most of these are just about possible in Excel. Merging datasets generally involves getting very friendly with the VLOOKUP() function. Searching for missing values involves a lot of filtering and sorting. Reshaping data is difficult, usually involving a lot of playing about with pivot tables.
In R these tasks are all straightforward. Packages from tidyverse contain clean well written functions. Each of these tasks can be neatly stacked into a logical workflow.
2. Automating tasks
If you received the same messy raw data periodically, you might want to automate the data cleaning process. Coding in general allows many repetitive tasks to be automated. At best this can be achieved with VBA in the Excel environment, but it’s not ideal.
At DS Analytics we have a whole host of R scripts we recycle, that clean, model and visualise data. Once written they can be executed with a single run command.
3. Sharing workflows
Most analysts at one time or other will have been on the receiving end of a bad project handover. A myriad of miscellaneous .xls and .csv files. The former filled with hideous long formulae and conditional formatting. Generally it is not possible to trace back the analysis stages.