Moving to R as the Primary Statistical Software Increases Equity in Teaching Quantitative Methods

By Eric R. Schuler, Ph.D.

Senior Quantitative/Computational Research Methodologist,

Center for Teaching, Research, and Learning, American University

Repeat after me, “R is for Revolution.”

This was the title of Culpepper and Aguinis’ 2011 article. R was revolutionary at that time as it was one of the few open-source statistical programs comparable to commercial software like SPSS, Stata, SAS, and Minitab. The revolution has been here for a while. If anything, the versatility and functionality of R have grown by leaps and bounds. Many departments across large R1 universities have ditched SPSS and have become “R departments”. There are several reasons for this:

  1. There is no cost to download and use R.
  2. When learning/using R, there is a very inclusive and helpful community of users to help you.
  3. R is more flexible, which makes it more useful in a variety of contexts.
  4. If you learn R, learning another program like SAS, Stata, or SPSS can be easier.

These reasons have been around since the 2011 article. There are now even more resources and a huge community to help!

I am one of a growing number of individuals who feel that if we are training students to use commercial statistical software, then we are doing them a disservice. If we teach students to use SPSS or Stata, then unless they purchase a copy for themselves after they leave AU, there is no way for them to continue to hone their skills. This is quite a barrier, as not everyone can afford a non-student license. Additionally, why should anyone have to pay for software when there are free options? If we teach them with an open-source tool, we are setting them on a path to continue learning on their own.

Let’s be very honest with ourselves: if you don’t practice the coding skills you have learned, you lose them. Do I remember how to use Minitab from so many years ago? I wouldn’t expect someone who hasn’t used a software program in a couple of years to retain their proficiency.

Another downside of commercial software is that it often requires additional add-ons. As an example, let’s say I wanted to teach a class on latent variable modeling (structural equation modeling and item response theory). SPSS has no built-in options for that. Instead, I would need to purchase a separate add-on for structural equation modeling and use an entirely different program for item response theory. R can do both, for free.

Supporting the R learning curve

I work as the Senior Quantitative/Computational Research Methodologist at CTRL. I have taught quantitative methods courses, at both the undergraduate and graduate level, at several universities. Too often I hear comments about the learning curve of R. Yes, there is a learning curve, but there are innumerable free resources out there to learn it, and some R users have created packages within R itself to help people learn the program (e.g., swirl, learnr, Rcmdr). There is also a great community of R users who help others learn R and provide great walkthroughs that go through the process step by step.

Updates to open-source software, like R, are much more rapid, as the users of the programs are the ones who are writing the packages and have a vested interest in them. With some commercial software, by contrast, it can be a nightmare just to get a response from the company when troubleshooting an issue. I have posted questions to an R users’ group for a package that I use in my research and heard back from the authors of the package, with recommendations and example code, within a matter of hours. This was amazing considering they were in Europe and I posted at the end of the day.

Progressing the Disciplinary Standard

Another comment that I hear a lot is “Software [SPSS, SAS, Stata, insert commercial software here] is the most used program in the discipline.” Times are changing, and that information is often quite dated. The methods and analyses taught 10 years ago have evolved, and commercial software is struggling to keep up. In fact, a search of LinkedIn for data analyst positions shows that many job postings list proficiency with R and/or Python. And once students learn R, it is easier for them to pick up another coding language, like Julia.

SPSS usage in research has been on the decline for years (Lindeløv, 2019). It is not coming back. More and more faculty are making the switch to open-source tools like R or Python for the very reasons that I mentioned earlier. Some faculty don’t want to spend the time to learn a new software language or don’t know where to start.

It can be daunting to learn a new language, but there are ways to go about it that make it less burdensome.

I have built out an entire learning module series on getting started with the program. I am happy to share the learning modules, walk-throughs, and code to get you up and running. The key things: have patience and be willing to Google search errors. Yes, it will take you longer at first, but it will get easier the more you work at it. Wasn’t that also true when you were first learning SPSS/Stata/SAS?

Repeat after me, R is for revolution!

Citations

Culpepper, S. A., & Aguinis, H. (2011). R is for revolution: A cutting-edge, free, open-source statistical package. Organizational Research Methods, 14(4), 735-740.

Lindeløv, J. K. (2019, March 13). SPSS is dying. It’s time to change. Neuroscience, stats, and coding: A blog. https://lindeloev.net/spss-is-dying/

Author Profile

Eric R. Schuler is an experimental psychologist by training and conducts research in Bayesian psychometric methods. He works in the Center for Teaching, Research & Learning as the Senior Quantitative/Computational Research Methodologist.