The draw of the spreadsheet is strong, and Excel is an almost default choice for anyone performing data tasks. With its visually appealing way of entering data and a tabular format that makes it easy to rearrange rows and add values to columns, Excel is often the first program anyone dealing with data learns. Eventually, shortcuts and programmatic logic find their way into the Excel warrior’s repertoire.
Microsoft Excel could be the single most popular piece of software in the business community. Released over thirty years ago, it is still used every day in countries across the globe to store, manipulate and analyse data. However, spreadsheets can be quite limiting: much analysis is time consuming and requires a lot of data organisation in order to execute correctly. Because so much of the work in a spreadsheet is done through mouse clicks, there is no easy way to track what changes have been made, whether during a single session or in the production of a chart.
Excel may seem inviting to a beginner, and learning how to program does take time. It is an investment worth making, however, for several reasons that could save you both time and effort.
What is R?
R is an open-source programming language that can be freely downloaded and used. A freely available application called RStudio provides a number of features that help you write R code. To give you an analogy: if Excel is a calculator that helps you perform simple functions, R is a calculator that you build and control to analyse data and perform functions the way you want.
Here are a few reasons you should consider learning R and going from data analyst to data wizard:
Deal with Large Datasets
Excel is limited in that each worksheet holds only so many rows and columns (1,048,576 rows by 16,384 columns in current versions). When you run out, you’re forced to move to a new tab or a new file. Most everyday tasks never need that many rows or columns, but datasets that grow over time can eventually outgrow a single Excel spreadsheet. R has no such fixed limit: the amount of data you can work with is bounded mainly by your computer’s memory.
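To make this concrete, here is a minimal sketch in base R that builds a made-up dataset of two million rows, roughly twice what one Excel worksheet can hold, and summarises it. The column names (sensor, value) and the numbers are invented purely for illustration.

```r
# A synthetic dataset with two million rows, more than a single Excel
# worksheet can hold. The column names are made up for this sketch.
set.seed(42)                      # make the random numbers repeatable
n <- 2e6
readings <- data.frame(
  sensor = sample(c("A", "B", "C"), n, replace = TRUE),
  value  = rnorm(n, mean = 20, sd = 5)
)

nrow(readings)                    # number of rows in the data frame

# Mean value per sensor, computed over all two million rows
sensor_means <- aggregate(value ~ sensor, data = readings, FUN = mean)
print(sensor_means)
```

On an ordinary laptop this should take a few seconds at most, and the whole dataset stays in one object rather than being split across tabs or files.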
Time-Saving Automation
R makes it much quicker to automate and repeat calculations than Excel. Excel’s GUI (Graphical User Interface) lets you click buttons rather than write code, which makes it more approachable, but it can be a large hindrance when you’re trying to automate a process or run the same analysis multiple times. Using a programming language like R saves you the effort of executing analysis tasks manually. For example, if you needed to run the same analysis on a new set of sales data each week, doing this in Excel would typically mean opening a different file each week and re-entering the formulas and other elements the analysis needs. In R, you could write a simple script that imports the new data, runs the same analysis and outputs the results in whatever format you’d like.
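A weekly sales script of that kind might look something like the sketch below. It is only an illustration: the function name summarise_sales, the column names region and amount, and the sample figures are all assumptions, and a temporary file stands in for the real weekly export so the example runs on its own.

```r
# Hypothetical helper: total one week's sales by region.
# Assumes the CSV has columns named "region" and "amount".
summarise_sales <- function(path) {
  sales <- read.csv(path, stringsAsFactors = FALSE)
  totals <- aggregate(amount ~ region, data = sales, FUN = sum)
  totals[order(-totals$amount), ]   # largest total first
}

# Stand-in for this week's export, written to a temporary file.
# In practice you would point the function at the real file instead.
week_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    region = c("North", "South", "North", "East"),
    amount = c(120, 80, 60, 50)
  ),
  week_file,
  row.names = FALSE
)

weekly_totals <- summarise_sales(week_file)
print(weekly_totals)

# Save the result in whatever format you like, here another CSV
report_file <- tempfile(fileext = ".csv")
write.csv(weekly_totals, report_file, row.names = FALSE)
```

Next week, the only thing that changes is the file path passed to summarise_sales(); the analysis itself is written once and never re-entered.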
Reproducible & Community Code
R code can be reused across different datasets, whereas Excel formulas are usually tied to the cells of a particular workbook. Ready-made statistical routines are available that can be applied to any suitable dataset. R also keeps the data and the analysis separate, which lets you view your data more clearly, correct errors and see how your analysis has progressed. As R grows in popularity, a large community of users continues to contribute packages with new functions that can be applied to all kinds of data. The result is a community of R users who share their knowledge with others who need a similar solution for their own data.
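As a small illustration of reusable code, the sketch below defines one function and applies it unchanged to two unrelated datasets. The function name describe is made up for this example; mtcars and iris are sample datasets that ship with R.

```r
# Hypothetical helper: mean, standard deviation and count of a numeric vector
describe <- function(x) {
  x <- x[!is.na(x)]               # drop missing values first
  c(mean = mean(x), sd = sd(x), n = length(x))
}

# The same function applied to columns from two unrelated datasets
mpg_summary   <- describe(mtcars$mpg)
sepal_summary <- describe(iris$Sepal.Length)

print(round(mpg_summary, 2))
print(round(sepal_summary, 2))
```

Because the logic lives in the function rather than in individual cells, a correction made to describe() applies everywhere it is used.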
Excel can easily produce several basic graphs once you have selected the exact data you want to analyse. With R, it is often simpler to produce graphs without all the pre-graph data assembly, and there are more types of graph to choose from, which makes visual representations of complex data more elegant.
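For a sense of what this looks like, here is a minimal sketch using R’s built-in plotting functions and the mtcars sample dataset that ships with R. It draws two charts side by side and saves them to a temporary PDF file; the choice of charts, labels and output file is an assumption made for illustration.

```r
# Draw to a temporary PDF file so the example also runs without a screen.
# In an interactive session you can skip pdf() and dev.off().
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 8, height = 4)

par(mfrow = c(1, 2))              # two plots side by side

# Scatter plot: car weight against fuel economy
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Weight vs. fuel economy")

# Box plot: fuel economy grouped by number of cylinders
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon",
        main = "Fuel economy by cylinders")

invisible(dev.off())              # close the file so it is written to disk
```

Neither chart needs the data to be selected or rearranged first: each call names the columns it wants and R does the rest.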
R is Free & Cross-Platform
R is open source and free to use, and anyone can download it, anywhere, for Windows, Mac or Linux. You can examine the R code behind any function or computation you perform, and you can even modify and improve key functions by changing that code. The same script will generally run on any of these operating systems, whereas Excel has no native Linux version.
Excel is still a powerful tool for smaller datasets, basic data entry and simple functions. However, when working with data, a broad understanding of the different programs that can be used to organise and analyse your own specific type of data is a good skill to have. Don’t be afraid to branch out of your Excel comfort zone and go from data analyst to data wizard with programming and R.