Top Data Analytic Tools For PhDs


Graduate and doctoral students, as well as academics, should demonstrate competency in using data analysis software to conduct quantitative or qualitative studies. The use of data analytics software and platforms by students and academics to perform predictive analytics functions has also become more common. This blog post looks at some of the best data analytics software for students and academics.

This post was written by Rachel Simmons, BA (freelance writer) and Dr. Stephanie A. Bosco-Ruggiero (PhD in Social Work) on behalf of Dr. Dave Maslach for the R3ciprocity project (check out the YouTube Channel or the writing feedback software). R3ciprocity helps students, faculty, and researchers by providing an authentic look into PhD and academic life and how to be a successful researcher. For over four years the project has been offering advice, community, and encouragement to students and researchers around the world.

When deciding which analytics software to use for analyzing quantitative or qualitative research data, you may become overwhelmed by all of the options available. Which options make sense will depend on the type of research you are doing and the data you are collecting. For even more tips and recommendations about software, apps, and other helpful programs for academics and students, check out this recent blog at blog.r3ciprocity.com.

What is data analysis software?

Data analysis software and tools enable researchers to enter data from surveys or other data gathering tools, clean the data, analyze the data, create charts and graphics, and more. Data analysis software can be geared toward analyzing quantitative data or qualitative data. Software can be downloaded to your computer or, with some programs, used in the cloud. 

Some data analytics tools are more like apps or smaller software systems. One example is G*Power, which can help you figure out what sample size you need to attain a certain level of statistical power for different statistical tests. Do some research and you will find some handy apps to use in conjunction with software programs, but probably not in lieu of them.
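To give a feel for what a sample size calculator does, here is a minimal Python sketch. It is not how any particular tool works internally; it assumes a two-sided comparison of two group means and uses the textbook normal approximation, n per group = 2 × ((z for alpha + z for power) / d)², where d is the standardized effect size (Cohen's d). The function name is made up for this illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return ceil(n)  # round up: you cannot recruit a fraction of a participant


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Dedicated calculators typically work with the t distribution rather than this approximation, so they may report slightly larger numbers. The pattern is the same, though: the smaller the effect you want to detect, the more participants you need.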

Do students use data analysis software? 

Most social science, business, life sciences, and other academic programs require students to take a research course sequence that includes an introduction to research and analysis, and more advanced use of data analytics platforms. In fact, a good graduate or doctoral program will require you to take at least two research courses and require you to learn a data analysis software program or two.

This is important because understanding research and knowing how to use one or more of these platforms will make you much more marketable. This is especially true in fields where some students and professionals shy away from research and research platforms. We all have a colleague or classmate, or we are this person, who dreads taking the research courses, performing statistical analyses, or even using common software. Let go of your fear, learn a platform, and break out of the pack.

It is important for all students to overcome their fears of data analytics and data analysis platforms. If you are a doctoral student, you should be confident in your ability to use a variety of tools and platforms to analyze your data, especially quantitative data. Quantitative analysis tools help us look at data that is measured in numbers and can be analyzed using statistics. Qualitative platforms help us analyze data that comes from interviews, focus groups, or other content composed of words – such as the pages of books. 

What are predictive analytics?

In the world of business and beyond, the need for and use of predictive analytics tools has grown. Predictive analytics can be used in business forecasting, for example to examine large stores of data and predict trends in markets. It can also be used in business to facilitate organizational learning and help businesses plan for their future. Predictive analytics is also used in product development, marketing, finance, and equities trading. It employs algorithms, machine learning, and statistics to predict future occurrences and trends.

In this vlog Dave talks about business intelligence and analytics, its origins, and what it is: 

What Is Business Intelligence And Analytics? What Are Some Advantages & Disadvantages Of BI?

In the social sciences, predictive analytics can be used to predict the likelihood of certain events occurring in society or to families or individuals. For example, social work researchers or child welfare quality assurance professionals may use data analytics to predict which children and families are most at risk for welfare system involvement in the future. Data analytics may also be used in government to predict migration or in cyber security. 

Predictive analytics involves mapping out and examining data to make decisions at different branches of a decision process. Before you use data analytics tools, it is important to form a hypothesis, figure out what the components of your model will be, and decide exactly what you are looking for. Check out this vlog where Dave talks more about data analytics, its limitations, and its use in business forecasting.

Predictive Analytics: Limitations & Importance Of Business Forecasting – Small Business Saturdays

Which software should I use for predictive analytics? 

There are several software products on the market that can be used by students and academics to perform predictive analytics operations. For example, SAS offers predictive analytics software that includes features such as “optimization and simulation, text analytics, forecasting and econometrics, high-performance data mining, a model manager, and a scoring accelerator.” One of the most interesting capabilities of this software is the data mining tool, which enables you to build predictive models based on large quantities of data.

Additionally, the text analytics feature allows students and researchers to incorporate qualitative data into models and graphics. This is arguably one of the best platforms for analyzing data for mixed methods research projects. Utilizing SAS helps students, educators, and researchers ask important questions and make predictions about impactful or even world-changing events and phenomena.

Is it difficult to learn and use data analysis software? 

The difficulty can vary, but the short answer is no: it is not too difficult for any student who has been admitted to a graduate or doctoral program to learn a data analysis program. You may be intimidated and convince yourself you can’t do the “math” or will not understand why you get the results that you do, but you will learn with the help of a good professor and/or lab instructor. Some of the programs are easier than others to learn because they use a point and click menu rather than depending mainly on code, but everyone learns differently and will prefer one program over another based on a number of factors.

One thing to keep in mind is that these programs are continually updated, and once you’ve been trained that training may become outdated in a few years. Check out our blog post about how to become a more productive student researcher. Trust us, use the technology you have access to and your life as a researcher will be easier and more efficient. 

What are some of the most commonly used quantitative data analysis software programs? 

There are many quantitative data analysis tools, too many to list here, but here are a few that are commonly used in courses, in universities, and by researchers.

IBM SPSS Statistics

SPSS employs point and click functionality, as well as syntax that allows the user to enter code, to conduct statistical analyses. Its versatile and advanced packages have many statistical analyses and operations to choose from. It is expensive, however. Often universities or departments purchase a license to allow their students to use the program for free on site at the university.

If students want to use the software at home they can purchase a short-term license for a few months to a few years (see studentdiscounts.com). There are Base versions of the software and more advanced versions, and IBM continually updates the software so there is a new version almost every year. Here is a list of the statistical tests and operations available at different levels (and costs) of the program: 

  • Base edition contains analytics capabilities such as advanced data preparation, descriptive statistics, linear regression, visual graphing and reporting
    • Key features:
      • Basic hypothesis testing
      • Bootstrapping
      • Cluster analysis
      • Data access and management 
      • Nonparametric tests
      • One-way ANOVA
      • Output management
      • Programmability extension
      • ROC analysis
      • Support for R/Python
  • Add-on 1: Custom tables & advanced statistics. Capabilities: predicting categorical outcomes, applying non-linear/logistic regression, performing multivariate modeling, and summarizing findings through custom tables
    • Key Features:
      • 2-stage least squares regression
      • Bayesian statistics
      • Generalized linear mixed models (GLMM)
      • Generalized linear modeling (GLM)
      • Loglinear analysis
      • Multivariate analysis
      • Nested tables
      • Probit response analysis
      • Quantile regression
      • Repeated measures analysis
      • Survival analysis
      • Weighted least squares regression
  • Add-on 2: Complex sampling & testing. Capabilities: work with complex sample designs, uncover missing data, apply categorical regression procedures, understand consumer preferences, work more accurately with small samples
    • Key Features:
      • Categorical principal components analysis 
      • Conjoint analysis 
      • Decision trees
      • Exact tests
      • Missing values
      • Multidimensional scaling and unfolding
      • Multiple correspondence analysis
      • Neural networks
      • Regressions with optimal scaling including lasso and elastic net
      • Time-series analysis
  • Add-on 3: Forecasting & decision trees. Capabilities: predict trends using time-series data, uncover relationships using classification, decision trees and neural networks
    • Key Features:
      • ARIMA modeling
      • C&RT
      • CHAID & Exhaustive CHAID
      • Direct marketing analysis
      • Multilayer perceptron
      • Neural networks
      • QUEST
      • Radial basis function
      • Seasonal decomposition
      • Spectral analysis
      • Temporal causal modeling

R Analytics

R has been gaining popularity because it is a free and open source system that can be used on Mac, Windows, and Linux. It has many of the same capabilities as SPSS but no point and click menu. The user has to learn some code to use R and learn about what are called packages, which enable the user to perform different sets of statistical tests.

R can be downloaded to your computer or you can use a cloud-based version. In Stephanie’s opinion the downloadable version is a bit easier to use. Here are just a few of the operations that can be performed using R:

  • Data transformations
    • Transposing data
    • Pivoting/melting tables
    • Creating new variables
    • Recoding variables
    • Computing
  • Statistical analysis
    • Bivariate linear regression
    • Multivariate linear regression
    • T tests (dependent/independent)
    • Statistical significance 
  • Data visualization
    • Pie charts
    • Box plots
    • Tables 

XLSTAT

Excel itself has become more helpful and user friendly in terms of performing statistical analyses and creating graphics, but check out XLSTAT for more advanced operations. XLSTAT is an Excel add-on that can be used on Macs or PCs. It employs a point and click menu and is integrated into Excel. XLSTAT covers many of the same analyses as R and SPSS, including bivariate and multivariate statistics. It also offers machine learning, predictive analytics, and risk analysis. There are different packages geared toward different fields as well, such as finance or life sciences. It is easy to export tables and graphics into Word and other Microsoft Office products using XLSTAT. Like SPSS, XLSTAT comes in different sized packages, with the least expensive costing $295 and the largest costing $1,495 per year for all specialized packages and functionalities.

Stata

Stata is another popular, mainly quantitative, data analytics program. Very popular among those in business and finance, Stata offers some unique features such as integration with the Python and Java programming languages, and community-contributed features. It provides a number of Bayesian analyses that may be more difficult to perform with other analytics software. Stata graphics and tables are very high quality. The most basic annual plan for students costs under $100. The basic packages have some limitations on numbers of variables and cases, but should be sufficient for most students’ needs.

What about qualitative data analysis software?

Qualitative analytics programs enable a user to analyze qualitative data, which includes data from focus groups, interviews, texts, and other content made up of words. You might analyze this content for patterns and trends, to develop a theory, to track changes in a phenomenon, and the list goes on. Here are a few of the best known and most commonly used platforms:

ATLAS.ti

  • Cost:
    • Lease a personalized single-user license for $250 or buy for $750 (teaching and research at an institution/university)
    • Student license price: $99 for two years or $51 for six months
  • Capabilities:
    • Supports collaborative work
    • Import data or surveys from Evernote or Twitter 
    • Offers different media types (audio, photos)
    • Object Managers, the Project Explorer, and the Co-occurrence Explorer let you browse and navigate through your project data
    • Visualize your findings and interpretations in a digital mind map as you go
    • The interactive margin area is a unique work space in ATLAS.ti to digitally transfer data 
    • Data visualization tools

NVivo

  • Cost: Student price is $99 (Windows) or $85 (Mac) for twelve months 
  • Capabilities: 
    • Centralize different formats of data by importing from various sources including social media
    • Collaboration
    • Transcription 
    • Reference management and note taking
    • Coding organization
    • Data visualization (frequency charts, word clouds & comparison diagrams etc.)
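Frequency charts and word clouds are built on a simple idea: counting how often each word appears in your text. The Python sketch below shows that counting step only. It is not how NVivo or ATLAS.ti work internally, and the interview excerpt, stop-word list, and `word_frequencies` function are all made up for illustration.

```python
import re
from collections import Counter

# A small illustrative stop-word list; real projects use much longer ones.
STOP_WORDS = {
    "a", "an", "and", "the", "i", "my", "me", "to", "of",
    "was", "is", "in", "it", "but",
}


def word_frequencies(text, top_n=5):
    """Return the top_n most common non-stop-words in text, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Hypothetical interview excerpt (invented for illustration).
excerpt = (
    "My advisor was supportive, but the funding was stressful. "
    "Funding deadlines and teaching left me stressed. "
    "Support from my advisor helped."
)

if __name__ == "__main__":
    for word, count in word_frequencies(excerpt):
        print(f"{word}: {count}")
```

Notice that "stressful" and "stressed" are counted as different words here. Grouping related words and assigning passages to codes is the interpretive work that qualitative platforms are designed to support.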

The bottom line

There are many other analytics software programs and tools that you can explore as well, such as Tableau, Qlik (for data mining), and Provalis (qualitative). Read up on a bunch of software programs and apps, read the reviews, sign up for a free trial, and see which are best for you. Of course there are tons of books out there to help you become a pro at any of these programs, but nothing can substitute for a great lab instructor. So don’t shy away from taking that statistics class, even if you don’t have to, because these programs handle the calculations for you – math wizardry not needed!
