Recent Publications (Vol. 109)

Vol. 109 (2024)

bizicount: Bivariate Zero-Inflated Count Copula Regression Using R

scikit-fda: A Python Package for Functional Data Analysis
openTSNE: A Modular Python Library for t-SNE Dimensionality Reduction and Embedding
MAGI: A Package for Inference of Dynamic Systems from Noisy and Sparse Data via Manifold-Constrained Gaussian Processes
funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs
Extremes.jl: Extreme Value Analysis in Julia
CPOP: Detecting Changes in Piecewise-Linear Signals
Generalized Plackett-Luce Likelihoods
fHMM: Hidden Markov Models for Financial Time Series in R
Emulation and History Matching using the hmer Package


The IBM® SPSS® software platform offers advanced statistical analysis, a vast library of machine learning algorithms, text analysis, open-source extensibility, integration with big data and seamless deployment into applications.

Its ease of use, flexibility and scalability make SPSS accessible to users of all skill levels. What’s more, it’s suitable for projects of all sizes and levels of complexity, and can help you find new opportunities, improve efficiency and minimize risk.

Within the SPSS software family of products, IBM SPSS Statistics supports a top-down, hypothesis-testing approach to your data, while IBM SPSS Modeler exposes patterns and models hidden in data through a bottom-up, hypothesis-generation approach.

An AI studio that brings together traditional machine learning and new generative AI capabilities powered by foundation models.

SPSS Statistics for Students

Prepare and analyze data with an easy-to-use interface without having to write code.

Choose from purchase options including subscription and traditional licenses.

Empower coders, noncoders and analysts with visual data science tools.

IBM SPSS Modeler helps you tap into data assets and modern applications, with algorithms and models that are ready for immediate use.

IBM SPSS Modeler is available on IBM Cloud Pak for Data. Take advantage of IBM SPSS Modeler on the public cloud.

Manage analytical assets, automate processes and share results more efficiently and securely.

Get descriptive and predictive analytics, data preparation and real-time scoring.

Use structural equation modeling (SEM) to test hypotheses and gain new insights from data.

Create a platform that can make predictive analytics easier for big data.

Find support resources for SPSS Statistics.

Get technical tips and insights from other SPSS users.

Gain new perspective through expert guidance.

Find support resources for IBM SPSS Modeler.

Learn how to use linear regression analysis to predict the value of a variable based on the value of another variable.

Learn how logistic regression estimates the probability of an event occurring, based on a dataset of independent variables.

Learn about new statistical procedures, data visualization tools and other improvements in SPSS Statistics 29.

Discover how you can uncover data insights that solve business and research problems.


SPSS, SAS, R, Stata, JMP? Choosing a Statistical Software Package or Two

by Karen Grace-Martin   50 Comments

In addition to the five listed in this title, there are quite a few other options, so how do you choose which statistical software to use?

The default is to use whatever software was used in your statistics class–at least you know the basics.

And this might turn out pretty well, but chances are it will fail you at some point. Many times the stat package used in a class is chosen for its shallow learning curve, not its ability to handle advanced analyses that are encountered in research.

I think I’ve used at least a dozen different statistics packages since my first stats class. And here are my observations:

1. The first one you learn is the hardest to learn. There are many similarities in the logic and wording they use, even if the interface is different. So once you've learned one, it will be easier to learn the next one.

2. You will have to learn another one. Just accept it. If you have the self-discipline to do it, I suggest learning two software packages at the beginning. This will come in handy for a number of reasons:

– My favorite stat package for a while was BMDP. Until the company was bought up by SPSS. I’m not sure if they stopped producing or updating it, but my university cancelled their site license.

– Many schools offer only a site license for only one package, and it may not be the one you’re used to. When I was at Cornell, they offered site licenses for 5 packages. But when a new stats professor decided to use JMP instead of Minitab, guess what happened to the Minitab site license? Unless you’re sure you’ll never leave your current university, you may have to start over.

– In case you decide to outwit the powers-that-be in IT who control the site licenses and buy your own (or use R, which is free), no software package does every type of analysis. There is huge overlap, to be sure, and the major ones are much more comprehensive than they were even 5 years ago. Even so, the gaps are in the most complicated analyses–some mixed models, GEE, complex sampling, etc. And when you're trying to learn a new, highly complicated statistical method is not the time to learn a new, highly complicated stats package.

For these reasons, I recommend that everyone who plans to do research for the foreseeable future learn two packages.

I know, it's hard enough to find the time to start over and learn one, much less the self-discipline. But if you can, it will save you grief later on. There are many great books, online tutorials, and workshops for learning all the major stats packages.

But I also recommend you choose one as your primary package and learn it really, really well. The defaults and assumptions and wording are not the same across packages. Knowing how yours handles dummy coding or missing data is imperative to doing correct statistics.

Which one? Mainly it depends on the field you’re in. Social scientists should generally learn SPSS as their main package, mainly because that is what their colleagues are using. You can then choose something else as a backup–either SAS, R, or Stata, based on availability and which makes most sense to you logically.


Reader Interactions


October 5, 2021 at 7:37 pm

Rguroo is a newly developed software for teaching statistics that is becoming increasingly popular. This software is web-based and is designed by instructors who have had many years of teaching experience. To get more information, please visit https://rguroo.com


October 19, 2021 at 3:50 pm

Interesting. Thanks for sharing this.

November 3, 2019 at 11:58 pm

Hello, I am an actuarial science student. Kindly help me choose the best software to learn.


July 8, 2019 at 3:41 am

Great stuff, thanks.

March 27, 2019 at 10:44 am

I want to pursue an MSc in ecology. Which is the best statistical package?

March 6, 2018 at 10:30 am

Sir/Madam, I am a graduate in statistics (B.Sc [MSCS]) and a postgraduate in mathematics (M.Sc). I am interested in SAS and would like your suggestion.

January 24, 2018 at 6:20 pm

In my experience SPSS is better than the others by far in terms of flexibility, user friendliness and user interface. It is popular in academia compared to SAS and R. R is OK, but you have to know a lot of things before you feel comfortable with it, and there are too many packages, which is confusing at times. SAS is OK, but I hate its web usage and old-fashioned UI. In addition, SPSS has just added Bayesian statistics, which is a huge plus. Stop using stingy SAS! Check it out and see whether what I have just added here makes sense! SPSS, SAS, R, Stata, Minitab, OriginPro, NCSS and PASS (good for sample size estimates), and forget the others!

April 8, 2017 at 5:22 am

This is Prasad. I am pursuing an MSc (statistics with computer applications). Could you please suggest which software is better?

June 30, 2017 at 12:22 am

I have also done an MSc in statistics, but now I have a problem: I do not know the statistics software such as SPSS, SAS, Stata, etc. Can you give me any suggestion?

January 23, 2017 at 9:27 am

Hi folks. I am from a statistics background; I completed my graduation in statistics and am now pursuing my postgraduation in statistics too. I am going to learn SAS. Is that okay for me, or is there anything else you would suggest? Please advise.

September 28, 2016 at 11:33 am

Hi friends! I'm in an agricultural institution, and most of my colleagues are using SAS and SPSS in their research. For a change, I'm planning to use another statistical tool for my research, JMP. Could there be any difference? Thanks

July 25, 2016 at 10:14 am

I have done an MA in economics and am looking for a supporting statistical package for career betterment. Please suggest a statistical package that can complement my economics degree.

August 4, 2016 at 6:47 am

Dear Ganesh, given that you're from an economics background, I suggest going ahead with Stata or SPSS. Stata has seen a rapid rise in the last few years; in India, international organisations such as the UN office, the World Bank regional office, the Institute of Economic Growth and NCAER have all swapped from SPSS to Stata.


August 5, 2016 at 8:21 am

I would recommend using R not only because it is free but because it has become an industry standard. Below is a link to a very good course that will teach you how to use R. Good luck! -Lisa

https://www.coursera.org/learn/r-programming

August 16, 2016 at 2:57 pm

Hey Ganesh and Lisa, thanks for posting. We have a great collection of resources for R, too, including 2 free webinars. Check them out here .

September 21, 2016 at 7:40 am

I did my first degree in Quantitative Economics at Makerere University, and I have always felt at home using both Stata and SPSS, but preferably Stata. If you are in the field of economic research, agricultural research or public health (epidemiology), I would advise you to use Stata instead of SPSS.

In contrast, I think SPSS has better procedures when it comes to using graphs.

In general, I prefer STATA to SPSS

June 30, 2017 at 12:25 am

January 1, 2018 at 2:15 pm

How you did your MSc in statistics without knowledge of SPSS and Stata, I can't imagine.

September 29, 2017 at 5:06 pm

I know exactly where Makerere University is; I lived in Kampala for a year. I can't remember what we used in my econ degree (I think it was Minitab), but then in epidemiology we learned both SPSS and SAS at the university.

January 3, 2021 at 10:51 am

Use Stata. It is richer than SPSS for policy research, which is your area as an economist.

May 25, 2015 at 9:03 am

Could anyone suggest a site that has some good projects (beginner to intermediate level) that use Stata as a tool?

August 5, 2016 at 8:18 am

Here is a link to a book that I purchased that I thought was very helpful. I do not use Stata, however, I learned a lot by reading this book. (The workflow concepts carry over to anything.) Hope this helps!

http://www.indiana.edu/~jslsoc/web_workflow/wf_home.htm

https://www.amazon.com/Workflow-Data-Analysis-Using-Stata-ebook/dp/B01GQJSGGI/ref=sr_1_1?ie=UTF8&qid=1470399079&sr=8-1&keywords=analysis+of+workflow+stata#nav-subnav

August 16, 2016 at 3:18 pm

Hey Paul and Lisa,

You can also check out our list of Stata resources , which includes 2 free webinars.

December 22, 2014 at 10:19 am

I definitely prefer NCSS, though it is not mentioned in the article. The newest version has even been pre-released in the cloud. https://www.apponfly.com/en/application/ncss10

April 20, 2017 at 8:22 am

Hi Brandon. I am using AppOnFly as well, but I have chosen SPSS for my analysis. They also provide a free 30-day trial of SPSS: http://www.apponfly.com/en/ibm-spss-statistics-standard?EZE


August 27, 2014 at 1:53 am

For a comparison of SPSS, SAS, R, Stata and Matlab for each type of statistical analysis, see

http://www.stanfordphd.com/Statistical_Software.html


May 31, 2014 at 9:57 pm

I like to use Java since it has good graphics. Therefore, my choice is SCaVis ( http://jwork.org/scavis ). It integrates Java and Python with superb graphics.

December 7, 2013 at 12:08 pm

I am used to SPSS and Stata for my data analysis; however, today I tried adding "Analyse-it" to my Excel package. It really worked for me. Can I really go ahead with it?

December 9, 2013 at 10:48 am

I don't know much about the Excel plug-ins (or whatever the correct software term is). As a general rule, I avoid Excel for data analysis, but this add-on may be just fine.

December 1, 2018 at 1:44 pm

Thanks for the mention Kule, and yes Karen, Analyse-it is completely legitimate. We've developed it for over 20 years now, validate and test it thoroughly (see our NIST StRD results at https://analyse-it.com/support/NIST-StRD ), and most importantly we do not use any of the unreliable Excel statistical functions.


September 7, 2016 at 11:46 am

You could also try the XLSTAT software statistical add-on. It holds more than 200 statistical features including multivariate data analysis, modeling, machine learning, stat tests and field-oriented features https://www.xlstat.com/en/


August 2, 2013 at 1:58 pm

Hi Karen, nice suggestions backed with arguments!

On a different note, I wish to hear your opinion on free software… Have you, for example, had an experience with EasyReg? It seems to have much of the econometrics methods covered — by far more than I would ever imagine to use –, it’s easy to operate and is supported with PDF-files about relevant theory. What do you think? (I have currently no access to commercial software, unfortunately.)

August 7, 2013 at 3:28 pm

Thanks! I haven’t used that software before, but I can tell you there are many good stat software packages out there. If you like using it and you’re confident that it’s accurate, go with it.

May 26, 2013 at 9:47 am

I often use R, and sometimes work with SPSS and Excel, but overall I prefer R because I love programming and R is a wonderful language. Also, R isn't limited! My goal is to create packages that cover the shortcomings of other software and link different packages together. Indeed, I like to ferret around in software. So my first software is R, but I haven't decided on a primary package yet. I have been researching statistical software and have decided to use Stata from inside R!

June 6, 2013 at 5:18 pm

Hi Morteza, I agree: R is awesome if you love programming. But do check out Stata too. 🙂

September 19, 2014 at 6:29 am

Hi Karen, do you really think that R is more efficient than Stata? I think you are right, because most of my fellows use R rather than Stata for programming. So I agree with you. 😀

November 26, 2018 at 10:59 am

Hi, is there any book that will help me learn Stata and R? If there is, please tell me how I can get it. I'm in Nigeria.


May 23, 2013 at 8:34 pm

I use R and Stata regularly. Dollar for dollar, I personally think that Stata is the most comprehensive stats package you can buy. Excellent documentation and a great user community. R is excellent as well, but suffers from absolutely terrible online documentation, which (for me) requires third party sources (read: books).

If somebody is buying you a license, then you don’t care what it costs. If someone like me has to buy a license, then to me, Stata is a no-brainer, given all the stats you can do with it.

My college eliminated both SAS and SPSS for that reason and use R for most classes. Rumor has it SAS is offering a new “college” licensing fee, but I’m not privy to that information.

Small sidebar: SAS started on the mainframe and it annoys me that it still "looks" that way. JMP is probably better (and again, expensive) but doesn't have anywhere near the capabilities of Base SAS, the last time I looked.

Just my opinions.

May 24, 2013 at 1:54 pm

I actually agree with you about Stata. If I were to start over, that’s what I would use, especially, as you’ve said, if you’re buying your own license.

And Stata has the *best* manuals, IMHO.

February 5, 2013 at 5:43 pm

Depends on which social scientists you are talking about. I doubt you will find many economists, for example, who do most (if any) of their analyses in SPSS. If you absolutely must have a GUI, JMP is clearly the superior platform, since its scripting language can interface with R, and you can do whatever you please. Try searching for quantile regression in the SPSS documentation; it says the math is too hard, and SPSS cannot compute it.

February 6, 2013 at 2:25 pm

Agreed, most economists I’ve talked to use either Stata or Eviews.

SPSS also interfaces with R.

Sure, there are examples of specific analyses that can’t be done in any software. That’s one reason why it’s good to be able to use at least two.

November 15, 2012 at 12:30 pm

I am an SPSS and R lover. In my university they use JMP software. How should I convince them that SPSS is better than JMP? Or, first of all, can I convince them?

November 16, 2012 at 11:53 am

Well, I’m sure they’ll cite budget issues. But there are some statistical options in SPSS that are not available in JMP. I don’t know of any where the reverse is true, although that may just be my lack of knowledge of JMP. For example, to the best of my knowledge, JMP doesn’t have a Linear Mixed Model procedure.

December 3, 2012 at 1:00 pm

When you add random effects to a linear model in JMP, the default is REML. In fact, the manual goes so far as to say REML for repeated measures data is the modern default, and JMP provides EMS solutions for univariate RM ANOVA only for historical reasons. JMP doesn't do multilevel models (more than one level of random effects), and I don't believe it does generalized linear mixed effects models (count or binary outcomes). I usually use Stata and R, but I keep an eye on JMP because it is a fun program sometimes. I have used it for repeated measures data via a mixed model when a colleague wanted help doing it himself, where the post hoc tests were flexible and accessible compared to his version of Stata or to R.

December 3, 2012 at 5:08 pm

Thanks, Dave. That’s great to know. The last time I used JMP (which was a few years ago), REML wasn’t an option.

Yes, I agree. JMP is very straightforward and for 95% of analyses that most researchers use, entirely sufficient.


October 11, 2011 at 11:58 pm

Good advice, all around. But… if you choose SPSS as your primary package, SAS has little to offer you, and vice versa. The overlap is just too great to make either a good complement to the other.

A factor to consider in choosing between the Big Two is your preferred user interface. If you don't want to program (much) and you adore point-and-shoot interfaces, go with SPSS. If you don't mind programming explicitly, and despise point-and-shoot interfaces, SAS will make you happier.

Another factor in choosing among the Big Two is your use of structural equation models (SEMs). If you don’t use them it’s a non-issue. If you use them extensively, you should choose between EQS-like syntax (in SAS PROC CALIS) and SPSS’s AMOS. SEMs are confusing enough without worrying about converting from your preferred expression of the models into the expression your software wants.

Much better choices as a complement to one of the Big Two are Stata and some dialect of S (R, S, S-plus). Stata users say it has some very slick programming facilities. (I’m not among them, so I can’t say from experience.) The S dialects are killers for simulation studies. I benchmarked R against SAS/IML (in version 9.1) and found R was an order of magnitude faster. R is built entirely around an object-oriented programming interface. Language extensions are a snap. In my opinion bootstrap estimation is easier in R than in other languages. High resolution graphics are native to R, and (despite a lot of improvement from versions 6 to 7 to 9.1 and 9.2) not native to SAS.


March 10, 2015 at 1:47 pm

I think SAS becomes an asset over SPSS when the focus is on data preparation: Merging multiple tables, accessing SQL databases, using API functions, creating canned reports, etc..

January 29, 2010 at 10:08 am

Hi friends, I am new to R. I would like to know about R-PLUS. Does anyone know where I can get free training for R-PLUS?

Regards, Peng.


May 23, 2013 at 12:59 pm

I believe the above comment is spam. I am not aware of the existence of R-plus; googling revealed a word for word comment on another site: http://www.talkstats.com/showthread.php/10761-free-training-for-R-PLUS .

Apologies to the commenter if this is a genuine enquiry.

May 23, 2013 at 2:56 pm

Thanks, Dave.

I suspect it was real, only because there was no link back to another site (you wouldn’t believe the strange links I get). I figured it was a language difficulty, and they meant S-Plus, on which R was based.


Quantitative Analysis Guide: Which Statistical Software to Use?


NYU Data Services, NYU Libraries & Information Technology


Statistical Software Comparison


Software Access

  • The first version of SPSS was developed by Norman H. Nie, Dale H. Bent and C. Hadlai Hull and released in 1968 as the Statistical Package for the Social Sciences.
  • In July 2009, IBM acquired SPSS.
  • Social sciences
  • Health sciences

Data Format and Compatibility

  • .sav file to save data
  • Optional syntax files (.sps)
  • Easily export .sav file from Qualtrics
  • Import Excel files (.xls, .xlsx), Text files (.csv, .txt, .dat), SAS (.sas7bdat), Stata (.dta)
  • Export Excel files (.xls, .xlsx), Text files (.csv, .dat), SAS (.sas7bdat), Stata (.dta)
  • SPSS Chart Types
  • Chart Builder: Drag and drop graphics
  • Easy and intuitive user interface; menus and dialog boxes
  • Similar feel to Excel
  • SEMs through SPSS Amos
  • Easily exclude data and handle missing data

Limitations

  • Absence of some robust methods (e.g., Least Absolute Deviation regression, quantile regression, ...)
  • Unable to perform complex many-to-many merges

Sample Data

  • Developed by SAS 
  • Created in the 1980s by John Sall to take advantage of the graphical user interface introduced by Macintosh
  • Originally stood for 'John's Macintosh Program'
  • Five products: JMP, JMP Pro, JMP Clinical, JMP Genomics, JMP Graph Builder App
  • Engineering: Six Sigma, Quality Control, Scientific Research, Design of Experiments
  • Healthcare/Pharmaceutical
  • .jmp file to save data
  • Optional syntax files (.jsl)
  • Import Excel files (.xls, .xlsx), Text files (.csv, .txt, .dat), SAS (.sas7bdat), Stata (.dta), SPSS (.sav)
  • Export Excel files (.xls, .xlsx), Text files (.csv, .dat), SAS (.sas7bdat)
  • Gallery of JMP Graphs
  • Drag and Drop Graph Editor will try to guess what chart is correct for your data
  • Dynamic interface can be used to zoom and change view
  • Ability to lasso outliers on a graph and regraph without the outliers
  • Interactive Graphics
  • Scripting Language (JSL)
  • SAS, R and MATLAB can be executed using JSL
  • Interface for using R from within JMP, and an add-in for Excel
  • Great interface for easily managing output
  • Graphs and data tables are dynamically linked
  • Great set of online resources!
  • Absence of some robust methods (regression: 2SLS, LAD, Quantile)

  • Stata was first released in January 1985 as a regression and data management package with 44 commands, written by Bill Gould and Sean Becketti. 
  • The name Stata is a syllabic abbreviation of the words  statistics and data.
  • The graphical user interface (menus and dialog boxes) was released in 2003.
  • Political Science
  • Public Health
  • Data Science
  • Who uses Stata?

Data Format and Compatibility

  • .dta file to save dataset
  • .do syntax file, where commands can be written and saved
  • Import Excel files (.xls, .xlsx), Text files (.txt, .csv, .dat), SAS (.XPT), Other (.XML), and various ODBC data sources
  • Export Excel files (.xls, .xlsx), Text files (.txt, .csv, .dat), SAS (.XPT), Other (.XML), and various ODBC data sources
  • Newer versions of Stata can read datasets, commands, graphs, etc., from older versions, and in doing so, reproduce results
  • Older versions of Stata cannot read newer versions of Stata datasets, but newer versions can save in the format of older versions
  • Stata Graph Gallery
  • UCLA - Stata Graph Gallery
  • Syntax mainly used, but menus are an option as well
  • Some user written programs are available to install
  • Offers matrix programming in Mata
  • Works well with panel, survey, and time-series data
  • Data management
  • Can only hold one dataset in memory at a time
  • The specific Stata package (Stata/IC, Stata/SE, and Stata/MP) limits the size of usable datasets. One may have to sacrifice the number of variables for the number of observations, or vice versa, depending on the package.
  • Overall, graphs have limited flexibility. Stata schemes, however, provide some flexibility in changing the style of the graphs.
  • Sample Syntax

* First enter the data manually
input str10 sex test1 test2
"Male" 86 83
"Male" 93 79
"Male" 85 81
"Male" 83 80
"Male" 91 76
"Female" 94 79
"Female" 91 94
"Female" 83 84
"Female" 96 81
"Female" 95 75
end

* Next run a paired t-test
ttest test1 == test2

* Create a scatterplot
twoway (scatter test2 test1 if sex == "Male") (scatter test2 test1 if sex == "Female"), legend(lab(1 "Male") lab(2 "Female"))

  • The development of SAS (Statistical Analysis System) began in 1966 by Anthony Barr of North Carolina State University, later joined by James Goodnight.
  • The National Institutes of Health funded this project with the goal of analyzing agricultural data to improve crop yields.
  • The first release of SAS was in 1972. In 2012, SAS held 36.2% of the market making it the largest market-share holder in 'advanced analytics.'
  • Financial Services
  • Manufacturing
  • Health and Life Sciences
  • Available for Windows only
  • Import Excel files (.xls, .xlsx), Text files (.txt, .dat, .csv), SPSS (.sav), Stata (.dta), JMP (.jmp), Other (.xml)
  • Export Excel files (.xls, .xlsx), Text files (.txt, .dat, .csv), SPSS (.sav), Stata (.dta), JMP (.jmp), Other (.xml)
  • SAS Graphics Samples Output Gallery
  • Can be cumbersome at times to create perfect graphics with syntax
  • ODS Graphics Designer provides a more interactive interface
  • BASE SAS contains the data management facility, programming language, data analysis and reporting tools
  • SAS Libraries collect the SAS datasets you create
  • Multitude of additional  components are available to complement Base SAS which include SAS/GRAPH, SAS/PH (Clinical Trial Analysis), SAS/ETS (Econometrics and Time Series), SAS/Insight (Data Mining) etc...
  • SAS Certification exams
  • Handles extremely large datasets
  • Predominantly used for data management and statistical procedures
  • SAS has two main types of code; DATA steps and  PROC  steps
  • With one procedure, test results, post estimation and plots can be produced
  • Size of datasets analyzed is only limited by the machine

Limitations 

  • Graphics can be cumbersome to manipulate
  • Since SAS is a proprietary software, there may be an extensive lag time for the implementation of new methods
  • Documentation and books tend to be very technical and not necessarily new user friendly

* First enter the data manually;
data example;
  input sex $ test1 test2;
  datalines;
M 86 83
M 93 79
M 85 81
M 83 80
M 91 76
F 94 79
F 91 94
F 83 84
F 96 81
F 95 75
;
run;

* Next run a paired t-test;
proc ttest data = example;
  paired test1*test2;
run;

* Create a scatterplot;
proc sgplot data = example;
  scatter y = test1 x = test2 / group = sex;
run;

  • R first appeared in 1993 and was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. 
  • R is an implementation of the S programming language which was developed at Bell Labs.
  • It is named partly after its first authors and partly as a play on the name of S.
  • R is currently developed by the R Development Core Team. 
  • RStudio, an integrated development environment (IDE) was first released in 2011.
  • Companies Using R
  • Finance and Economics
  • Bioinformatics
  • Import Excel files (.xls, .xlsx), Text files (.txt, .dat, .csv), SPSS (.sav), Stata (.dta), SAS(.sas7bdat), Other (.xml, .json)
  • Export Excel files (.xlsx), Text files (.txt, .csv), SPSS (.sav), Stata (.dta), Other (.json)
  • ggplot2 package, grammar of graphics
  • Graphs available through ggplot2
  • The R Graph Gallery
  • Network analysis (igraph)
  • Flexible esthetics and options
  • Interactive graphics with Shiny
  • Many available packages to create field specific graphics
  • R is a free and open source
  • Over 6000 user contributed packages available through  CRAN
  • Large online community
  • Network Analysis, Text Analysis, Data Mining, Web Scraping 
  • Interacts with other software such as, Python, Bioconductor, WinBUGS, JAGS etc...
  • Scope of functions, flexible, versatile etc..

Limitations​

  • Large online help community but no 'formal' tech support
  • Have to have a good understanding of different data types before real ease of use begins
  • Many user written packages may be hard to sift through

# Manually enter the data into a data frame
dataset <- data.frame(
  sex   = c("Male", "Male", "Male", "Male", "Male",
            "Female", "Female", "Female", "Female", "Female"),
  test1 = c(86, 93, 85, 83, 91, 94, 91, 83, 96, 95),
  test2 = c(83, 79, 81, 80, 76, 79, 94, 84, 81, 75)
)

# Now we will run a paired t-test
t.test(dataset$test1, dataset$test2, paired = TRUE)

# Last, let's simply plot these two test variables
# (sex is converted to a factor so it can index the colour vector)
plot(dataset$test1, dataset$test2, col = c("red", "blue")[factor(dataset$sex)])
legend("topright", fill = c("blue", "red"), c("Male", "Female"))

# Making the same graph using ggplot2
install.packages('ggplot2')
library(ggplot2)
mygraph <- ggplot(data = dataset, aes(x = test1, y = test2, color = sex))
mygraph + geom_point(size = 5) + ggtitle('Test1 versus Test2 Scores')

  • Cleve Moler of the University of New Mexico began development in the late 1970s.
  • With the help of Jack Little, they cofounded MathWorks and released MATLAB (matrix laboratory) in 1984. 
  • Education (linear algebra and numerical analysis)
  • Popular among scientists involved in image processing
  • Engineering
  • .m Syntax file
  • Import Excel files (.xls, .xlsx), Text files (.txt, .dat, .csv), Other (.xml, .json)
  • Export Excel files (.xls, .xlsx), Text files (.txt, .dat, .csv), Other (.xml, .json)
  • MATLAB Plot Gallery
  • Customizable but not point-and-click visualization
  • Optimized for data analysis, matrix manipulation in particular
  • Basic unit is a matrix
  • Vectorized operations are quick
  • Diverse set of available toolboxes (apps) [Statistics, Optimization, Image Processing, Signal Processing, Parallel Computing etc..]
  • Large online community (MATLAB Exchange)
  • Image processing
  • Vast number of pre-defined functions and implemented algorithms
  • Lacks implementation of some advanced statistical methods
  • Integrates easily with some languages such as C, but not others, such as Python
  • Limited GIS capabilities

sex = {'Male','Male','Male','Male','Male','Female','Female','Female','Female','Female'};
t1 = [86,93,85,83,91,94,91,83,96,95];
t2 = [83,79,81,80,76,79,94,84,81,75];

% paired t-test
[h,p,ci,stats] = ttest(t1,t2)

% independent samples t-test
sex = categorical(sex);
[h,p,ci,stats] = ttest2(t1(sex=='Male'), t1(sex=='Female'))

% scatterplot by group
plot(t1,t2,'o')
g = sex=='Male';
plot(t1(g),t2(g),'bx'); hold on; plot(t1(~g),t2(~g),'ro')

Software Features and Capabilities

*The primary interface is bolded in the case of multiple interface types available.

Learning Curve

Cartoon representation of learning difficulty of various quantitative software

Further Reading

  • The Popularity of Data Analysis Software
  • Statistical Software Capability Table
  • The SAS versus R Debate in Industry and Academia
  • Why R has a Steep Learning Curve
  • Comparison of Data Analysis Packages
  • Comparison of Statistical Packages
  • MATLAB commands in Python and R
  • MATLAB and R Side by Side
  • Stata and R Side by Side


  • Last Updated: Jun 3, 2024 3:55 PM
  • URL: https://guides.nyu.edu/quant

A Fresh Way to Do Statistics

Download JASP  


JASP is an open-source project supported by the University of Amsterdam.

JASP has an intuitive interface that was designed with the user in mind.

JASP offers standard analysis procedures in both their classical and Bayesian form.


Main Features

Your choice.

  • Frequentist analyses
  • Bayesian analyses

User-friendly Interface

  • Dynamic update of all results
  • Spreadsheet layout and an intuitive drag-and-drop interface
  • Progressive disclosure for increased understanding
  • Annotated output for communicating your results

Developed for publishing analyses

  • Integrated with The Open Science Framework (OSF)
  • Support for APA format (copy graphs and tables directly into Word)

View complete feature list

Mission Statement

Our main goal is to help statistical practitioners reach maximally informative conclusions with a minimum of fuss. This is why we have developed JASP, a free cross-platform software program with a state-of-the-art graphical user interface.


Your First Steps Using JASP

Getting started.

The introductory video should give you a good idea of how JASP works. You can consult our Getting Started Guide for more information.

How to Use JASP

Take a look at our How to Use JASP page for in-depth explanations of the different features in JASP.

Past Sponsors


University of Bern

Department of Psychology

www.psy.unibe.ch

APS Fund for Teaching and Public Understanding of Psychological Science

www.psychologicalscience.org

European Research Council

www.erc.europa.eu

Nederlandse Organisatie voor Wetenschappelijk Onderzoek

Center for Open Science

Scientific Advisory Board

  • Prof. James O. Berger, Duke University
  • Prof. Jon Forster, University of Southampton
  • Prof. Merlise A. Clyde, Duke University
  • Prof. Ioannis Ntzoufras, Athens University of Economics and Business
  • Prof. Jeffrey N. Rouder, University of California, Irvine
  • Prof. Zoltan Dienes, University of Sussex
  • Prof. Andy Field, University of Sussex
  • Prof. Han L. J. van der Maas, University of Amsterdam
  • Prof. Erin Buchanan, Missouri State University
  • Prof. Casper Albers, University of Groningen
  • Dr. Henrik Singmann, University College London
  • Dr. Felix Schönbrodt, LMU Munich

For more details on the scientific advisory board, click here .


Indian J Anaesth, v.60(9); 2016 Sep

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

1 Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[ 1 ] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[ 2 ]

A variable is a characteristic that varies from one individual member of a population to another.[ 3 ] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[ 3 ] [ Figure 1 ].


Classification of variables

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), the data are called dichotomous (or binary). The various causes of re-intubation in an intensive care unit due to upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment are examples of categorical variables.

Ordinal variables have a clear ordering between the variables. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property. For example, the system of centimetres is an example of a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.
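For illustration only, the short R sketch below (with made-up values) shows how these scale types are commonly represented in statistical software: nominal data as unordered factors, ordinal data as ordered factors, and interval or ratio data as plain numeric vectors.

# Hypothetical values; the scale type determines the representation
cause  <- factor(c("obstruction", "secretions", "hypoxemia"))       # nominal: unordered categories
asa    <- factor(c("I", "III", "II"),
                 levels = c("I", "II", "III"), ordered = TRUE)      # ordinal: ordered categories
temp_f <- c(70, 75, 80, 85)                                         # interval: equal spacing, no true zero
tmd_cm <- c(6, 3)                                                   # ratio: true zero point

levels(asa)            # "I" "II" "III"
tmd_cm[1] / tmd_cm[2]  # 2, ratios are meaningful on a ratio scale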

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[ 4 ] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[ 4 ] use a random sample of data taken from a population to describe and make inferences about the whole population. It is valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1 .

Example of descriptive and inferential statistics


Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[ 6 ] Mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. The mean may be influenced profoundly by extreme values. For example, the average ICU stay of organophosphorus poisoning patients may be influenced by a single patient who stays in the ICU for around 5 months because of septicaemia. Such extreme values are called outliers. The formula for the mean is

x̄ = Σx / n

where x = each observation and n = number of observations. Median[ 6 ] is defined as the middle of a distribution in ranked data (with half of the variables in the sample above and half below the median value), while mode is the most frequently occurring variable in a distribution. Range defines the spread, or variability, of a sample.[ 7 ] It is described by the minimum and maximum values of the variables. If we rank the data and, after ranking, group the observations into percentiles, we can get better information on the pattern of spread of the variables. In percentiles, we rank the observations into 100 equal parts. We can then describe the 25%, 50%, 75% or any other percentile amount. The median is the 50th percentile. The interquartile range is the middle 50% of the observations about the median (25th-75th percentile). Variance[ 7 ] is a measure of how spread out the distribution is. It gives an indication of how closely an individual observation clusters about the mean value. The variance of a population is defined by the following formula:

σ² = Σ(Xᵢ − X̄)² / N

where σ² is the population variance, X̄ is the population mean, Xᵢ is the i-th element from the population and N is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

s² = Σ(xᵢ − x̄)² / (n − 1)

where s² is the sample variance, x̄ is the sample mean, xᵢ is the i-th element from the sample and n is the number of elements in the sample. The formula for the variance of a population has N as the denominator, whereas the sample variance uses n − 1. The expression n − 1 is known as the degrees of freedom: each observation is free to vary, except the last one, which must take a defined value. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of the variance is used. The square root of the variance is the standard deviation (SD).[ 8 ] The SD of a population is defined by the following formula:

σ = √[ Σ(Xᵢ − X̄)² / N ]

where σ is the population SD, X̄ is the population mean, Xᵢ is the i-th element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

s = √[ Σ(xᵢ − x̄)² / (n − 1) ]

where s is the sample SD, x̄ is the sample mean, xᵢ is the i-th element from the sample and n is the number of elements in the sample. An example of the calculation of variance and SD is illustrated in Table 2 .

Example of mean, variance, standard deviation

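As a quick illustration, the following R sketch applies the formulas above to a small set of arbitrary observations; var() and sd() use the sample (n − 1) denominator, and the population versions are computed explicitly.

x <- c(4, 8, 6, 5, 9, 7)           # arbitrary example observations
n <- length(x)

mean(x)                            # arithmetic mean: sum(x) / n
median(x)                          # middle value of the ranked data

var(x)                             # sample variance, denominator n - 1
sd(x)                              # sample SD = sqrt(var(x))

sum((x - mean(x))^2) / n           # population variance, denominator N
sqrt(sum((x - mean(x))^2) / n)     # population SD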

Normal distribution or Gaussian distribution

Most of the biological variables usually cluster around a central value, with symmetrical positive and negative deviations about this point.[ 1 ] The standard normal distribution curve is a symmetrical, bell-shaped curve. In a normal distribution, about 68% of the scores lie within 1 SD of the mean, around 95% within 2 SDs and about 99.7% within 3 SDs [ Figure 2 ].


Normal distribution curve
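The 68%, 95% and 99.7% figures quoted above can be verified directly from the standard normal distribution; a minimal R check:

pnorm(1) - pnorm(-1)   # ~0.683, probability within 1 SD of the mean
pnorm(2) - pnorm(-2)   # ~0.954, within 2 SDs
pnorm(3) - pnorm(-3)   # ~0.997, within 3 SDs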

Skewed distribution

It is a distribution with an asymmetry of the variables about its mean. In a negatively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the right, leading to a longer left tail. In a positively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the left, leading to a longer right tail.


Curves showing negatively skewed and positively skewed distribution

Inferential statistics

In inferential statistics, data are analysed from a sample to make inferences in the larger collection of the population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term ‘null hypothesis’ ( H 0 ‘ H-naught ,’ ‘ H-null ’) denotes that there is no relationship (difference) between the population variables in question.[ 9 ]

Alternative hypothesis ( H 1 and H a ) denotes that a statement between the variables is expected to be true.[ 9 ]

The P value (or the calculated probability) is the probability of the event occurring by chance if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [ Table 3 ].

P values with interpretation


If the P value is less than the arbitrarily chosen value (known as α or the significance level), the null hypothesis (H0) is rejected [ Table 4 ]. However, if the null hypothesis (H0) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding alpha error, beta error and sample size calculation and factors influencing them are dealt with in another section of this issue by Das S et al .[ 12 ]

Illustration for null hypothesis


PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

Two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to the small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are used to analyse ordinal and categorical data.

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t -test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t -test

Student's t -test is used to test the null hypothesis that there is no difference between the means of the two groups. It is used in three circumstances:

  • To test if the sample mean differs significantly from a known population mean (the one-sample t -test). The test statistic is:

t = (X̄ − μ) / SE

where X̄ = sample mean, μ = population mean and SE = standard error of the mean

  • To test if the population means estimated by two independent samples differ significantly (the unpaired or independent-samples t -test). The test statistic is:

t = (X̄₁ − X̄₂) / SE

where X̄₁ − X̄₂ is the difference between the means of the two groups and SE denotes the standard error of that difference.

  • To test if the population means estimated by two dependent samples differ significantly (the paired t -test). A usual setting for paired t -test is when measurements are made on the same subjects before and after a treatment.

The formula for paired t -test is:

t = d̄ / SE

where d̄ is the mean difference and SE denotes the standard error of this difference.

The group variances can be compared using the F -test. The F -test is the ratio of the two variances (var 1/var 2). If F differs significantly from 1.0, it is concluded that the group variances differ significantly.
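A minimal R sketch of the three t-test settings and the F-test for equality of variances, using small hypothetical samples (the values are invented for illustration):

groupA <- c(86, 93, 85, 83, 91)          # hypothetical independent samples
groupB <- c(94, 91, 83, 96, 95)
before <- c(140, 152, 138, 145, 150)     # hypothetical paired measurements
after  <- c(135, 147, 136, 140, 146)

t.test(groupA, mu = 90)                  # one-sample: sample mean vs. a known population mean
t.test(groupA, groupB, var.equal = TRUE) # two independent samples (Student's t-test)
t.test(before, after, paired = TRUE)     # paired samples (before/after on the same subjects)

var.test(groupA, groupB)                 # F-test: ratio of the two group variances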

Analysis of variance

The Student's t -test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

However, the between-group variability (or effect variance) is the result of our treatment. These two estimates of variance are compared using the F-test.

A simplified formula for the F statistic is:

F = MSb / MSw

where MSb is the mean squares between the groups and MSw is the mean squares within the groups.
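A one-way ANOVA comparing three group means can be run in R as below (hypothetical data); the reported F value is the ratio of the between-group to the within-group mean squares.

score <- c(23, 25, 22, 30, 31, 29, 35, 36, 34)    # hypothetical outcome values
group <- factor(rep(c("A", "B", "C"), each = 3))  # three treatment groups

fit <- aov(score ~ group)
summary(fit)   # F value = mean square between groups / mean square within groups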

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, a repeated measure ANOVA is used when all variables of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.
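One common way to fit a repeated measures ANOVA in R is with an Error() term that identifies the subject as the blocking unit, so the repeated measurements are not treated as independent. A minimal sketch with hypothetical long-format data:

# Hypothetical data: 4 subjects, each measured at 3 time points
dat <- data.frame(
  subject = factor(rep(1:4, each = 3)),
  time    = factor(rep(c("t1", "t2", "t3"), times = 4)),
  y       = c(5, 7, 9, 4, 6, 8, 6, 7, 10, 5, 8, 9)
)

# Subject is the error stratum; the time effect is tested within subjects
fit <- aov(y ~ time + Error(subject/time), data = dat)
summary(fit)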

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations as they do not require the normality assumption.[ 15 ] Non-parametric tests may fail to detect a significant difference when compared with a parametric test. That is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

Analogue of parametric and non-parametric tests


Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

This test examines the hypothesis about the median θ0 of a population. It tests the null hypothesis H0: θ = θ0. When the observed value (Xi) is greater than the reference value (θ0), it is marked as a + sign. If the observed value is smaller than the reference value, it is marked as a − sign. If the observed value is equal to the reference value (θ0), it is eliminated from the sample.

If the null hypothesis is true, there will be an equal number of + signs and − signs.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.

Wilcoxon's signed rank test

There is a major limitation of sign test as we lose the quantitative information of the given data and merely use the + or – signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ0 but also takes into consideration the relative sizes, adding more statistical power to the test. As in the sign test, if there is an observed value that is equal to the reference value θ0, this observed value is eliminated from the sample.

Wilcoxon's rank sum test ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.

Mann-Whitney test

It is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

Mann–Whitney test compares all data (xi) belonging to the X group and all data (yi) belonging to the Y group and calculates the probability of xi being greater than yi: P (xi > yi). The null hypothesis states that P (xi > yi) = P (xi < yi) =1/2 while the alternative hypothesis states that P (xi > yi) ≠1/2.
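In R, wilcox.test() covers both the signed rank tests described above and the rank sum (Mann-Whitney) test; a minimal sketch with invented data:

x <- c(12, 16, 9, 20, 18, 11)
y <- c(15, 23, 19, 26, 17, 24)

wilcox.test(x, mu = 13)            # Wilcoxon signed rank test against a reference median of 13
wilcox.test(x, y, paired = TRUE)   # signed rank test for paired samples
wilcox.test(x, y)                  # rank sum (Mann-Whitney) test for two independent samples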

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.
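A quick R illustration of the two-sample KS test on two simulated samples (purely illustrative):

set.seed(1)
a <- rnorm(50, mean = 0,   sd = 1)
b <- rnorm(50, mean = 0.5, sd = 1)

ks.test(a, b)   # D is the maximum distance between the two empirical cumulative curves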

Kruskal-Wallis test

The Kruskal–Wallis test is a non-parametric test to analyse the variance.[ 14 ] It analyses if there is any difference in the median values of three or more independent samples. The data values are ranked in an increasing order, and the rank sums calculated followed by calculation of the test statistic.

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. The Friedman test is an alternative for repeated measures ANOVAs which is used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]
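Minimal R sketches of the Kruskal-Wallis test (independent groups) and the Friedman test (the same subjects measured under several conditions), again with hypothetical data:

# Kruskal-Wallis: three independent groups
score <- c(7, 9, 6, 14, 12, 15, 20, 18, 22)
group <- factor(rep(c("A", "B", "C"), each = 3))
kruskal.test(score ~ group)

# Friedman: 5 subjects, each measured under 3 conditions (rows = subjects)
m <- matrix(c(5, 7, 9,
              4, 6, 8,
              6, 7, 10,
              5, 8, 9,
              4, 5, 7),
            nrow = 5, byrow = TRUE,
            dimnames = list(NULL, c("cond1", "cond2", "cond3")))
friedman.test(m)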

Tests to analyse the categorical data

Chi-square test, Fisher's exact test and McNemar's test are used to analyse categorical or nominal variables. The Chi-square test compares frequencies and tests whether the observed data differ significantly from the data expected under the null hypothesis of no difference between groups. It is calculated as the sum, over all cells, of the squared difference between the observed (O) and expected (E) counts (the deviation, d) divided by the expected count:

χ² = Σ (O − E)² / E = Σ d² / E

A Yates correction factor is used when the sample size is small. Fisher's exact test is used to determine whether there are non-random associations between two categorical variables. It does not assume random sampling, and instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability. McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired (dependent) samples and determines whether the row and column marginal frequencies are equal (that is, whether there is ‘marginal homogeneity’). The null hypothesis is that the paired proportions are equal. The Mantel-Haenszel Chi-square test is a multivariate test in that it analyses multiple grouping variables: it stratifies according to the nominated confounding variables and identifies any that affect the primary outcome variable. If the outcome variable is dichotomous, logistic regression is used instead.
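
The sketch below shows how these categorical tests might be run in Python, assuming SciPy and statsmodels are installed; the 2 × 2 counts are invented purely for illustration:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

# Made-up 2 x 2 table of counts (rows: groups, columns: outcome categories)
table = np.array([[30, 10],
                  [18, 22]])

# Chi-square test of independence (expected counts returned for inspection)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test, preferred when expected counts are small
odds_ratio, p_fisher = stats.fisher_exact(table)

# McNemar's test for paired nominal data (here the table would hold
# before/after classifications of the same subjects)
p_mcnemar = mcnemar(table, exact=True).pvalue

print(p_chi2, p_fisher, p_mcnemar)
```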

SOFTWARE AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are currently available. The commonly used systems are the Statistical Package for the Social Sciences (SPSS, from IBM Corporation), the Statistical Analysis System (SAS, developed by the SAS Institute, North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman and maintained by the R Core Team), Minitab (developed by Minitab Inc.), Stata (developed by StataCorp) and MS Excel (developed by Microsoft).

There are a number of web resources which are related to statistical power analyses. A few are:

  • StatPages.net – provides links to a number of online power calculators
  • G*Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates the power or the sample size needed to attain a given power for one effect in a factorial ANOVA design
  • SamplePower – a program from SPSS that outputs a complete report on screen, which can be cut and pasted into another document

It is important that a researcher knows the concepts of the basic statistical methods used for conduct of a research study. This will help to conduct an appropriately well-designed study leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, an adequate knowledge of statistics and the appropriate use of statistical tests are important. An appropriate knowledge about the basic statistical methods will go a long way in improving the research designs and producing quality medical research which can be utilised for formulating the evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

What is Statistical Analysis? Types, Methods, Software, Examples

Appinio Research · 29.02.2024 · 31min read

Ever wondered how we make sense of vast amounts of data to make informed decisions? Statistical analysis is the answer. In our data-driven world, statistical analysis serves as a powerful tool to uncover patterns, trends, and relationships hidden within data. From predicting sales trends to assessing the effectiveness of new treatments, statistical analysis empowers us to derive meaningful insights and drive evidence-based decision-making across various fields and industries. In this guide, we'll explore the fundamentals of statistical analysis, popular methods, software tools, practical examples, and best practices to help you harness the power of statistics effectively. Whether you're a novice or an experienced analyst, this guide will equip you with the knowledge and skills to navigate the world of statistical analysis with confidence.

What is Statistical Analysis?

Statistical analysis is a methodical process of collecting, analyzing, interpreting, and presenting data to uncover patterns, trends, and relationships. It involves applying statistical techniques and methodologies to make sense of complex data sets and draw meaningful conclusions.

Importance of Statistical Analysis

Statistical analysis plays a crucial role in various fields and industries due to its numerous benefits and applications:

  • Informed Decision Making : Statistical analysis provides valuable insights that inform decision-making processes in business, healthcare, government, and academia. By analyzing data, organizations can identify trends, assess risks, and optimize strategies for better outcomes.
  • Evidence-Based Research : Statistical analysis is fundamental to scientific research, enabling researchers to test hypotheses, draw conclusions, and validate theories using empirical evidence. It helps researchers quantify relationships, assess the significance of findings, and advance knowledge in their respective fields.
  • Quality Improvement : In manufacturing and quality management, statistical analysis helps identify defects, improve processes, and enhance product quality. Techniques such as Six Sigma and Statistical Process Control (SPC) are used to monitor performance, reduce variation, and achieve quality objectives.
  • Risk Assessment : In finance, insurance, and investment, statistical analysis is used for risk assessment and portfolio management. By analyzing historical data and market trends, analysts can quantify risks, forecast outcomes, and make informed decisions to mitigate financial risks.
  • Predictive Modeling : Statistical analysis enables predictive modeling and forecasting in various domains, including sales forecasting, demand planning, and weather prediction. By analyzing historical data patterns, predictive models can anticipate future trends and outcomes with reasonable accuracy.
  • Healthcare Decision Support : In healthcare, statistical analysis is integral to clinical research, epidemiology, and healthcare management. It helps healthcare professionals assess treatment effectiveness, analyze patient outcomes, and optimize resource allocation for improved patient care.

Statistical Analysis Applications

Statistical analysis finds applications across diverse domains and disciplines, including:

  • Business and Economics : Market research, financial analysis, econometrics, and business intelligence.
  • Healthcare and Medicine : Clinical trials, epidemiological studies, healthcare outcomes research, and disease surveillance.
  • Social Sciences : Survey research, demographic analysis, psychology experiments, and public opinion polls.
  • Engineering : Reliability analysis, quality control, process optimization, and product design.
  • Environmental Science : Environmental monitoring, climate modeling, and ecological research.
  • Education : Educational research, assessment, program evaluation, and learning analytics.
  • Government and Public Policy : Policy analysis, program evaluation, census data analysis, and public administration.
  • Technology and Data Science : Machine learning, artificial intelligence, data mining, and predictive analytics.

These applications demonstrate the versatility and significance of statistical analysis in addressing complex problems and informing decision-making across various sectors and disciplines.

Fundamentals of Statistics

Understanding the fundamentals of statistics is crucial for conducting meaningful analyses. Let's delve into some essential concepts that form the foundation of statistical analysis.

Basic Concepts

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions or conclusions. To embark on your statistical journey, familiarize yourself with these fundamental concepts:

  • Population vs. Sample : A population comprises all the individuals or objects of interest in a study, while a sample is a subset of the population selected for analysis. Understanding the distinction between these two entities is vital, as statistical analyses often rely on samples to draw conclusions about populations.
  • Independent Variables : Variables that are manipulated or controlled in an experiment.
  • Dependent Variables : Variables that are observed or measured in response to changes in independent variables.
  • Parameters vs. Statistics : Parameters are numerical measures that describe a population, whereas statistics are numerical measures that describe a sample. For instance, the population mean is denoted by μ (mu), while the sample mean is denoted by x̄ (x-bar).

Descriptive Statistics

Descriptive statistics involve methods for summarizing and describing the features of a dataset. These statistics provide insights into the central tendency, variability, and distribution of the data. Standard measures of descriptive statistics include:

  • Mean : The arithmetic average of a set of values, calculated by summing all values and dividing by the number of observations.
  • Median : The middle value in a sorted list of observations.
  • Mode : The value that appears most frequently in a dataset.
  • Range : The difference between the maximum and minimum values in a dataset.
  • Variance : The average of the squared differences from the mean.
  • Standard Deviation : The square root of the variance, providing a measure of the average distance of data points from the mean.
  • Graphical Techniques : Graphical representations, including histograms, box plots, and scatter plots, offer visual insights into the distribution and relationships within a dataset. These visualizations aid in identifying patterns, outliers, and trends.
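
As a quick illustration, the following Python sketch computes these descriptive measures for a made-up sample (NumPy only; the numbers are arbitrary):

```python
import numpy as np

data = np.array([23, 25, 22, 30, 28, 25, 27, 35, 24, 26])  # made-up observations

mean = data.mean()
median = np.median(data)
values, counts = np.unique(data, return_counts=True)
mode = values[np.argmax(counts)]        # most frequently occurring value
data_range = data.max() - data.min()
variance = data.var(ddof=1)             # sample variance
std_dev = data.std(ddof=1)              # sample standard deviation

print(mean, median, mode, data_range, variance, std_dev)
```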

Inferential Statistics

Inferential statistics enable researchers to draw conclusions or make predictions about populations based on sample data. These methods allow for generalizations beyond the observed data. Fundamental techniques in inferential statistics include:

  • Null Hypothesis (H0) : The hypothesis that there is no significant difference or relationship.
  • Alternative Hypothesis (H1) : The hypothesis that there is a significant difference or relationship.
  • Confidence Intervals : Confidence intervals provide a range of plausible values for a population parameter. They offer insights into the precision of sample estimates and the uncertainty associated with those estimates.
  • Regression Analysis : Regression analysis examines the relationship between one or more independent variables and a dependent variable. It allows for the prediction of the dependent variable based on the values of the independent variables.
  • Sampling Methods : Sampling methods, such as simple random sampling, stratified sampling, and cluster sampling, are employed to ensure that sample data are representative of the population of interest. These methods help mitigate biases and improve the generalizability of results.
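
For example, a 95% confidence interval for a population mean can be sketched as follows, assuming SciPy and a small made-up sample (hypothesis tests and regression are illustrated later in this guide):

```python
import numpy as np
from scipy import stats

sample = np.array([4.2, 5.1, 4.8, 5.6, 4.9, 5.3, 4.7, 5.0])  # made-up measurements

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# t-based 95% confidence interval for the population mean
ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```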

Probability Distributions

Probability distributions describe the likelihood of different outcomes in a statistical experiment. Understanding these distributions is essential for modeling and analyzing random phenomena. Some common probability distributions include:

  • Normal Distribution : The normal distribution, also known as the Gaussian distribution, is characterized by a symmetric, bell-shaped curve. Many natural phenomena follow this distribution, making it widely applicable in statistical analysis.
  • Binomial Distribution : The binomial distribution describes the number of successes in a fixed number of independent Bernoulli trials. It is commonly used to model binary outcomes, such as success or failure, heads or tails.
  • Poisson Distribution : The Poisson distribution models the number of events occurring in a fixed interval of time or space. It is often used to analyze rare or discrete events, such as the number of customer arrivals in a queue within a given time period.
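
A brief sketch of how probabilities from these three distributions can be evaluated with SciPy (the parameter values are arbitrary):

```python
from scipy import stats

# Normal: probability that a value from N(mean=100, sd=15) falls below 120
p_norm = stats.norm.cdf(120, loc=100, scale=15)

# Binomial: probability of exactly 7 successes in 10 trials with p = 0.5
p_binom = stats.binom.pmf(7, n=10, p=0.5)

# Poisson: probability of observing 3 events when the mean rate is 2 per interval
p_pois = stats.poisson.pmf(3, mu=2)

print(p_norm, p_binom, p_pois)
```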

Types of Statistical Analysis

Statistical analysis encompasses a diverse range of methods and approaches, each suited to different types of data and research questions. Understanding the various types of statistical analysis is essential for selecting the most appropriate technique for your analysis. Let's explore some common distinctions in statistical analysis methods.

Parametric vs. Non-parametric Analysis

Parametric and non-parametric analyses represent two broad categories of statistical methods, each with its own assumptions and applications.

  • Parametric Analysis : Parametric methods assume that the data follow a specific probability distribution, often the normal distribution. These methods rely on estimating parameters (e.g., means, variances) from the data. Parametric tests typically provide more statistical power but require stricter assumptions. Examples of parametric tests include t-tests, ANOVA, and linear regression.
  • Non-parametric Analysis : Non-parametric methods make fewer assumptions about the underlying distribution of the data. Instead of estimating parameters, non-parametric tests rely on ranks or other distribution-free techniques. Non-parametric tests are often used when data do not meet the assumptions of parametric tests or when dealing with ordinal or non-normal data. Examples of non-parametric tests include the Wilcoxon rank-sum test, Kruskal-Wallis test, and Spearman correlation.
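
The contrast can be seen by running a parametric and a non-parametric test on the same two simulated groups; this is only a sketch, assuming SciPy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(10, 2, size=30)   # simulated continuous measurements
group_b = rng.normal(11, 2, size=30)

# Parametric: independent-samples t-test (assumes roughly normal data)
t_stat, p_t = stats.ttest_ind(group_a, group_b)

# Non-parametric: Mann-Whitney U / Wilcoxon rank-sum (rank-based, distribution-free)
u_stat, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(p_t, p_u)
```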

Descriptive vs. Inferential Analysis

Descriptive and inferential analyses serve distinct purposes in statistical analysis, focusing on summarizing data and making inferences about populations, respectively.

  • Descriptive Analysis : Descriptive statistics aim to describe and summarize the features of a dataset. These statistics provide insights into the central tendency, variability, and distribution of the data. Descriptive analysis techniques include measures of central tendency (e.g., mean, median, mode), measures of dispersion (e.g., variance, standard deviation), and graphical representations (e.g., histograms, box plots).
  • Inferential Analysis : Inferential statistics involve making inferences or predictions about populations based on sample data. These methods allow researchers to generalize findings from the sample to the larger population. Inferential analysis techniques include hypothesis testing, confidence intervals, regression analysis, and sampling methods. These methods help researchers draw conclusions about population parameters, such as means, proportions, or correlations, based on sample data.

Exploratory vs. Confirmatory Analysis

Exploratory and confirmatory analyses represent two different approaches to data analysis, each serving distinct purposes in the research process.

  • Exploratory Analysis : Exploratory data analysis (EDA) focuses on exploring data to discover patterns, relationships, and trends. EDA techniques involve visualizing data, identifying outliers, and generating hypotheses for further investigation. Exploratory analysis is particularly useful in the early stages of research when the goal is to gain insights and generate hypotheses rather than confirm specific hypotheses.
  • Confirmatory Analysis : Confirmatory data analysis involves testing predefined hypotheses or theories based on prior knowledge or assumptions. Confirmatory analysis follows a structured approach, where hypotheses are tested using appropriate statistical methods. Confirmatory analysis is common in hypothesis-driven research, where the goal is to validate or refute specific hypotheses using empirical evidence. Techniques such as hypothesis testing, regression analysis, and experimental design are often employed in confirmatory analysis.

Methods of Statistical Analysis

Statistical analysis employs various methods to extract insights from data and make informed decisions. Let's explore some of the key methods used in statistical analysis and their applications.

Hypothesis Testing

Hypothesis testing is a fundamental concept in statistics, allowing researchers to make decisions about population parameters based on sample data. The process involves formulating null and alternative hypotheses, selecting an appropriate test statistic, determining the significance level, and interpreting the results. Standard hypothesis tests include:

  • t-tests : Used to compare means between two groups.
  • ANOVA (Analysis of Variance) : Extends the t-test to compare means across multiple groups.
  • Chi-square test : Assessing the association between categorical variables.

Regression Analysis

Regression analysis explores the relationship between one or more independent variables and a dependent variable. It is widely used in predictive modeling and understanding the impact of variables on outcomes. Key types of regression analysis include:

  • Simple Linear Regression : Examines the linear relationship between one independent variable and a dependent variable.
  • Multiple Linear Regression : Extends simple linear regression to analyze the relationship between multiple independent variables and a dependent variable.
  • Logistic Regression : Used for predicting binary outcomes or modeling probabilities.
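
A compact sketch of ordinary least squares and logistic regression with statsmodels, fitted to simulated data (the variable names and coefficients are invented):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Multiple linear regression on made-up data: y from two predictors plus noise
X = rng.normal(size=(100, 2))
y = 3 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=2.0, size=100)
ols_fit = sm.OLS(y, sm.add_constant(X)).fit()
print(ols_fit.params)

# Logistic regression for a binary outcome derived from the same predictors
y_bin = (y > y.mean()).astype(int)
logit_fit = sm.Logit(y_bin, sm.add_constant(X)).fit(disp=0)
print(logit_fit.params)
```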

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups. It partitions the total variability in the data into components attributable to different sources, such as between-group differences and within-group variability. ANOVA is commonly used in experimental design and hypothesis testing scenarios.

Time Series Analysis

Time series analysis deals with analyzing data collected or recorded at successive time intervals. It helps identify patterns, trends, and seasonality in the data. Time series analysis techniques include:

  • Trend Analysis : Identifying long-term trends or patterns in the data.
  • Seasonal Decomposition : Separating the data into seasonal, trend, and residual components.
  • Forecasting : Predicting future values based on historical data.
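
A minimal seasonal-decomposition sketch using statsmodels, applied to a synthetic monthly series (the trend and seasonality are constructed, not real data):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Made-up monthly series with an upward trend and yearly seasonality
idx = pd.date_range("2019-01-01", periods=48, freq="MS")
values = np.arange(48) * 0.5 + 10 * np.sin(2 * np.pi * np.arange(48) / 12) + 100
series = pd.Series(values, index=idx)

# Split the series into trend, seasonal and residual components
decomposition = seasonal_decompose(series, model="additive", period=12)
print(decomposition.trend.dropna().head())
print(decomposition.seasonal.head())
```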

Survival Analysis

Survival analysis is used to analyze time-to-event data, such as time until death, failure, or occurrence of an event of interest. It is widely used in medical research, engineering, and social sciences to analyze survival probabilities and hazard rates over time.
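
A minimal Kaplan–Meier sketch, assuming the third-party lifelines package is installed and using made-up follow-up times and event indicators:

```python
from lifelines import KaplanMeierFitter

# Made-up follow-up times (months) and event indicators (1 = event observed, 0 = censored)
durations = [5, 8, 12, 3, 9, 15, 7, 11, 6, 14]
events    = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=events)

print(kmf.median_survival_time_)      # estimated median survival time
print(kmf.survival_function_.head())  # Kaplan-Meier survival curve estimates
```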

Factor Analysis

Factor analysis is a statistical method used to identify underlying factors or latent variables that explain patterns of correlations among observed variables. It is commonly used in psychology, sociology, and market research to uncover underlying dimensions or constructs.

Cluster Analysis

Cluster analysis is a multivariate technique that groups similar objects or observations into clusters or segments based on their characteristics. It is widely used in market segmentation, image processing, and biological classification.

Principal Component Analysis (PCA)

PCA is a dimensionality reduction technique used to transform high-dimensional data into a lower-dimensional space while preserving most of the variability in the data. It identifies orthogonal axes (principal components) that capture the maximum variance in the data. PCA is useful for data visualization, feature selection, and data compression.
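
A short PCA sketch with scikit-learn on simulated data, standardizing the features before projecting onto two principal components:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # made-up high-dimensional data
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)      # add a correlated feature

# Standardize, then keep the two components that capture the most variance
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(pca.explained_variance_ratio_)   # share of variance captured by each component
print(X_reduced.shape)                 # (200, 2)
```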

How to Choose the Right Statistical Analysis Method?

Selecting the appropriate statistical method is crucial for obtaining accurate and meaningful results from your data analysis.

Understanding Data Types and Distribution

Before choosing a statistical method, it's essential to understand the types of data you're working with and their distribution. Different statistical methods are suitable for different types of data:

  • Continuous vs. Categorical Data : Determine whether your data are continuous (e.g., height, weight) or categorical (e.g., gender, race). Parametric methods such as t-tests and regression are typically used for continuous data, while non-parametric methods like chi-square tests are suitable for categorical data.
  • Normality : Assess whether your data follows a normal distribution. Parametric methods often assume normality, so if your data are not normally distributed, non-parametric methods may be more appropriate.

Assessing Assumptions

Many statistical methods rely on certain assumptions about the data. Before applying a method, it's essential to assess whether these assumptions are met:

  • Independence : Ensure that observations are independent of each other. Violations of independence assumptions can lead to biased results.
  • Homogeneity of Variance : Verify that variances are approximately equal across groups, especially in ANOVA and regression analyses. Levene's test or Bartlett's test can be used to assess homogeneity of variance.
  • Linearity : Check for linear relationships between variables, particularly in regression analysis. Residual plots can help diagnose violations of linearity assumptions.
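
As a small illustration, two of these checks (normality, mentioned earlier, and homogeneity of variance) can be sketched with SciPy on made-up groups:

```python
from scipy import stats

group_a = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7]   # made-up measurements
group_b = [5.9, 6.1, 5.7, 6.4, 6.0, 5.8, 6.2, 6.3]

# Shapiro-Wilk test of normality for each group (small p suggests non-normal data)
print(stats.shapiro(group_a).pvalue, stats.shapiro(group_b).pvalue)

# Levene's test for homogeneity of variance across groups
print(stats.levene(group_a, group_b).pvalue)
```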

Considering Research Objectives

Your research objectives should guide the selection of the appropriate statistical method.

  • What are you trying to achieve with your analysis? : Determine whether you're interested in comparing groups, predicting outcomes, exploring relationships, or identifying patterns.
  • What type of data are you analyzing? : Choose methods that are suitable for your data type and research questions.
  • Are you testing specific hypotheses or exploring data for insights? : Confirmatory analyses involve testing predefined hypotheses, while exploratory analyses focus on discovering patterns or relationships in the data.

Consulting Statistical Experts

If you're unsure about the most appropriate statistical method for your analysis, don't hesitate to seek advice from statistical experts or consultants:

  • Collaborate with Statisticians : Statisticians can provide valuable insights into the strengths and limitations of different statistical methods and help you select the most appropriate approach.
  • Utilize Resources : Take advantage of online resources, forums, and statistical software documentation to learn about different methods and their applications.
  • Peer Review : Consider seeking feedback from colleagues or peers familiar with statistical analysis to validate your approach and ensure rigor in your analysis.

By carefully considering these factors and consulting with experts when needed, you can confidently choose the suitable statistical method to address your research questions and obtain reliable results.

Statistical Analysis Software

Choosing the right software for statistical analysis is crucial for efficiently processing and interpreting your data. In addition to statistical analysis software, it's essential to consider tools for data collection, which lay the foundation for meaningful analysis.

What is Statistical Analysis Software?

Statistical software provides a range of tools and functionalities for data analysis, visualization, and interpretation. These software packages offer user-friendly interfaces and robust analytical capabilities, making them indispensable tools for researchers, analysts, and data scientists.

  • Graphical User Interface (GUI) : Many statistical software packages offer intuitive GUIs that allow users to perform analyses using point-and-click interfaces. This makes statistical analysis accessible to users with varying levels of programming expertise.
  • Scripting and Programming : Advanced users can leverage scripting and programming capabilities within statistical software to automate analyses, customize functions, and extend the software's functionality.
  • Visualization : Statistical software often includes built-in visualization tools for creating charts, graphs, and plots to visualize data distributions, relationships, and trends.
  • Data Management : These software packages provide features for importing, cleaning, and manipulating datasets, ensuring data integrity and consistency throughout the analysis process.

Popular Statistical Analysis Software

Several statistical software packages are widely used in various industries and research domains. Some of the most popular options include:

  • R : R is a free, open-source programming language and software environment for statistical computing and graphics. It offers a vast ecosystem of packages for data manipulation, visualization, and analysis, making it a popular choice among statisticians and data scientists.
  • Python : Python is a versatile programming language with robust libraries like NumPy, SciPy, and pandas for data analysis and scientific computing. Python's simplicity and flexibility make it an attractive option for statistical analysis, particularly for users with programming experience.
  • SPSS : SPSS (Statistical Package for the Social Sciences) is a comprehensive statistical software package widely used in social science research, marketing, and healthcare. It offers a user-friendly interface and a wide range of statistical procedures for data analysis and reporting.
  • SAS : SAS (Statistical Analysis System) is a powerful statistical software suite used for data management, advanced analytics, and predictive modeling. SAS is commonly employed in industries such as healthcare, finance, and government for data-driven decision-making.
  • Stata : Stata is a statistical software package that provides tools for data analysis, manipulation, and visualization. It is popular in academic research, economics, and social sciences for its robust statistical capabilities and ease of use.
  • MATLAB : MATLAB is a high-level programming language and environment for numerical computing and visualization. It offers built-in functions and toolboxes for statistical analysis, machine learning, and signal processing.

Data Collection Software

In addition to statistical analysis software, data collection software plays a crucial role in the research process. These tools facilitate data collection, management, and organization from various sources, ensuring data quality and reliability.

When it comes to data collection, precision and efficiency are paramount. Appinio offers a seamless solution for gathering real-time consumer insights, empowering you to make informed decisions swiftly. With our intuitive platform, you can define your target audience with precision, launch surveys effortlessly, and access valuable data in minutes.   Experience the power of Appinio and elevate your data collection process today. Ready to see it in action? Book a demo now!

How to Choose the Right Statistical Analysis Software?

When selecting software for statistical analysis and data collection, consider the following factors:

  • Compatibility : Ensure the software is compatible with your operating system, hardware, and data formats.
  • Usability : Choose software that aligns with your level of expertise and provides features that meet your analysis and data collection requirements.
  • Integration : Consider whether the software integrates with other tools and platforms in your workflow, such as data visualization software or data storage systems.
  • Cost and Licensing : Evaluate the cost of licensing or subscription fees, as well as any additional costs for training, support, or maintenance.

By carefully evaluating these factors and considering your specific analysis and data collection needs, you can select the right software tools to support your research objectives and drive meaningful insights from your data.

Statistical Analysis Examples

Understanding statistical analysis methods is best achieved through practical examples. Let's explore three examples that demonstrate the application of statistical techniques in real-world scenarios.

Example 1: Linear Regression

Scenario : A marketing analyst wants to understand the relationship between advertising spending and sales revenue for a product.

Data : The analyst collects data on monthly advertising expenditures (in dollars) and corresponding sales revenue (in dollars) over the past year.

Analysis : Using simple linear regression, the analyst fits a regression model to the data, where advertising spending is the independent variable (X) and sales revenue is the dependent variable (Y). The regression analysis estimates the linear relationship between advertising spending and sales revenue, allowing the analyst to predict sales based on advertising expenditures.

Result : The regression analysis reveals a statistically significant positive relationship between advertising spending and sales revenue. For every additional dollar spent on advertising, sales revenue increases by an estimated amount (slope coefficient). The analyst can use this information to optimize advertising budgets and forecast sales performance.
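
A minimal sketch of how such an analysis might look in Python with the statsmodels formula API; the monthly figures below are invented and are not the data described above:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monthly advertising spend and sales revenue (in dollars)
df = pd.DataFrame({
    "ad_spend": [1000, 1500, 1200, 2000, 2500, 1800, 2200, 3000, 2700, 3200, 2900, 3500],
    "sales":    [12000, 15500, 13800, 19000, 23500, 18200, 21000, 28000, 25500, 30500, 27800, 33000],
})

model = smf.ols("sales ~ ad_spend", data=df).fit()
print(model.params["ad_spend"])   # estimated revenue increase per extra dollar of ad spend
print(model.pvalues["ad_spend"])  # significance of the slope
```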

Example 2: Hypothesis Testing

Scenario : A pharmaceutical company develops a new drug intended to lower blood pressure. The company wants to determine whether the new drug is more effective than the existing standard treatment.

Data : The company conducts a randomized controlled trial (RCT) involving two groups of participants: one group receives the new drug, and the other receives the standard treatment. Blood pressure measurements are taken before and after the treatment period.

Analysis : The company uses hypothesis testing, specifically a two-sample t-test, to compare the mean reduction in blood pressure between the two groups. The null hypothesis (H0) states that there is no difference in the mean reduction in blood pressure between the two treatments, while the alternative hypothesis (H1) suggests that the new drug is more effective.

Result : The t-test results indicate a statistically significant difference in the mean reduction in blood pressure between the two groups. The company concludes that the new drug is more effective than the standard treatment in lowering blood pressure, based on the evidence from the RCT.
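
A sketch of the corresponding two-sample t-test with SciPy, using invented blood-pressure reductions rather than the trial's actual data; Welch's variant (equal_var=False) could be used if the group variances differ:

```python
from scipy import stats

# Hypothetical reductions in systolic blood pressure (mmHg) after treatment
new_drug = [12, 15, 14, 18, 16, 13, 17, 15, 14, 16]
standard = [10, 11, 9, 12, 10, 11, 13, 9, 10, 12]

# Two-sample t-test of H0: equal mean reduction in both groups
t_stat, p_value = stats.ttest_ind(new_drug, standard)
print(t_stat, p_value)
```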

Example 3: ANOVA

Scenario : A researcher wants to compare the effectiveness of three different teaching methods on student performance in a mathematics course.

Data : The researcher conducts an experiment where students are randomly assigned to one of three groups: traditional lecture-based instruction, active learning, or flipped classroom. At the end of the semester, students' scores on a standardized math test are recorded.

Analysis : The researcher performs an analysis of variance (ANOVA) to compare the mean test scores across the three teaching methods. ANOVA assesses whether there are statistically significant differences in mean scores between the groups.

Result : The ANOVA results reveal a significant difference in mean test scores between the three teaching methods. Post-hoc tests, such as Tukey's HSD (Honestly Significant Difference), can be conducted to identify which specific teaching methods differ significantly from each other in terms of student performance.
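
A sketch of the one-way ANOVA and Tukey's HSD follow-up in Python, assuming SciPy and statsmodels and using invented test scores:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical standardized test scores for three teaching methods
lecture = [72, 75, 70, 68, 74, 71, 73]
active  = [80, 82, 78, 85, 79, 81, 83]
flipped = [76, 79, 74, 77, 80, 75, 78]

# One-way ANOVA across the three groups
f_stat, p_value = stats.f_oneway(lecture, active, flipped)
print(f_stat, p_value)

# Tukey's HSD post-hoc pairwise comparisons
scores = np.concatenate([lecture, active, flipped])
groups = ["lecture"] * 7 + ["active"] * 7 + ["flipped"] * 7
print(pairwise_tukeyhsd(scores, groups))
```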

These examples illustrate how statistical analysis techniques can be applied to address various research questions and make data-driven decisions in different fields. By understanding and applying these methods effectively, researchers and analysts can derive valuable insights from their data to inform decision-making and drive positive outcomes.

Statistical Analysis Best Practices

Statistical analysis is a powerful tool for extracting insights from data, but it's essential to follow best practices to ensure the validity, reliability, and interpretability of your results.

  • Clearly Define Research Questions : Before conducting any analysis, clearly define your research questions or objectives. This ensures that your analysis is focused and aligned with the goals of your study.
  • Choose Appropriate Methods : Select statistical methods suitable for your data type, research design, and objectives. Consider factors such as data distribution, sample size, and assumptions of the chosen method.
  • Preprocess Data : Clean and preprocess your data to remove errors, outliers, and missing values. Data preprocessing steps may include data cleaning, normalization, and transformation to ensure data quality and consistency.
  • Check Assumptions : Verify that the assumptions of the chosen statistical methods are met. Assumptions may include normality, homogeneity of variance, independence, and linearity. Conduct diagnostic tests or exploratory data analysis to assess assumptions.
  • Transparent Reporting : Document your analysis procedures, including data preprocessing steps, statistical methods used, and any assumptions made. Transparent reporting enhances reproducibility and allows others to evaluate the validity of your findings.
  • Consider Sample Size : Ensure that your sample size is sufficient to detect meaningful effects or relationships. Power analysis can help determine the minimum sample size required to achieve adequate statistical power.
  • Interpret Results Cautiously : Interpret statistical results with caution and consider the broader context of your research. Be mindful of effect sizes, confidence intervals, and practical significance when interpreting findings.
  • Validate Findings : Validate your findings through robustness checks, sensitivity analyses, or replication studies. Cross-validation and bootstrapping techniques can help assess the stability and generalizability of your results.
  • Avoid P-Hacking and Data Dredging : Guard against p-hacking and data dredging by pre-registering hypotheses, conducting planned analyses, and avoiding selective reporting of results. Maintain transparency and integrity in your analysis process.

By following these best practices, you can conduct rigorous and reliable statistical analyses that yield meaningful insights and contribute to evidence-based decision-making in your field.

Conclusion for Statistical Analysis

Statistical analysis is a vital tool for making sense of data and guiding decision-making across diverse fields. By understanding the fundamentals of statistical analysis, including concepts like hypothesis testing, regression analysis, and data visualization, you gain the ability to extract valuable insights from complex datasets. Moreover, selecting the appropriate statistical methods, choosing the right software, and following best practices ensure the validity and reliability of your analyses. In today's data-driven world, the ability to conduct rigorous statistical analysis is a valuable skill that empowers individuals and organizations to make informed decisions and drive positive outcomes. Whether you're a researcher, analyst, or decision-maker, mastering statistical analysis opens doors to new opportunities for understanding the world around us and unlocking the potential of data to solve real-world problems.

How to Collect Data for Statistical Analysis in Minutes?

Introducing Appinio, your gateway to effortless data collection for statistical analysis. As a real-time market research platform, Appinio specializes in delivering instant consumer insights, empowering businesses to make swift, data-driven decisions.

With Appinio, conducting your own market research is not only feasible but also exhilarating. Here's why:

  • Obtain insights in minutes, not days:  From posing questions to uncovering insights, Appinio accelerates the entire research process, ensuring rapid access to valuable data.
  • User-friendly interface:  No advanced degrees required! Our platform is designed to be intuitive and accessible to anyone, allowing you to dive into market research with confidence.
  • Targeted surveys, global reach:  Define your target audience with precision using our extensive array of demographic and psychographic characteristics, and reach respondents in over 90 countries effortlessly.

Statistical Software Popularity in 40,582 Research Papers

I analyzed a random sample of 76,147 full-text research papers, uploaded to PubMed Central between the years 2016 and 2021, in order to check the popularity of statistical software among medical researchers. (I used the BioC API to download the articles — see the References section below).

Out of these 76,147 research papers, only 40,582 (53.3%) mentioned the use of at least one statistical software package.

Here’s a summary of the key findings:

1- SPSS was the most used statistical software overall , mentioned in 40.48% of research papers, followed by R (20.52%) and Prism (17.38%).

2- The 6-year trend showed that SPSS had the largest decline (-1.43%) followed by SAS (-0.48%). However, R and Prism had the largest upward trends (+1.29% and +0.82% respectively).

3- The data also suggest that SPSS is more popular among beginners . This is in contrast with R and Prism, which were mentioned more commonly in papers published in high impact journals.

Top statistical packages over the years

The graph below shows that SPSS has ranked number 1 for each of the past 6 years, with R in second position and Prism third. Perhaps the most noticeable trend is the decline of SAS in 2021, with Stata and Microsoft Excel taking its place.

Rankings of top statistical packages over the years

Most popular statistical packages overall

The table below can be read in the following way: For instance, SPSS was reported to be used in 16,616 out of 40,582 research papers (= 40.48%), and showed a decreasing trend of 1.43% over the past 6 years.

⚠ How was the trend calculated? The 6-year trend is the linear regression coefficient (reported in percent) obtained by regressing “the percent of articles that mention a particular software package each year” onto the “years” variable. The trend was calculated only for statistical packages with more than 100 mentions over the past 6 years; otherwise the number would reflect noise more than any real trend.
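
As a rough illustration of that calculation, here is a Python sketch using SciPy's linregress on made-up yearly percentages (not the article's actual data):

```python
from scipy import stats

years = [2016, 2017, 2018, 2019, 2020, 2021]
# Hypothetical percentages of articles mentioning a given package each year
percent_mentions = [42.1, 41.8, 41.0, 40.5, 39.9, 39.2]

# The slope (in percentage points per year) is the reported "trend"
result = stats.linregress(years, percent_mentions)
print(f"6-year trend: {result.slope:+.2f}% per year")
```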

Do beginners and professional scientists use the same statistical packages?

In order to answer this question, I compared the type of statistical software used in articles published in low versus high impact journals.

I collected the journal impact factor for 32,144 of the articles and divided the dataset into 2 parts:

  • Research papers published in low impact journals (impact factor ≤ 3): This subset consisted of 16,337 articles.
  • Research papers published in high impact journals (impact factor > 3): This subset consisted of 15,807 articles.

I chose the threshold of 3 for no particular reason other than that it seemed a reasonable limit, and it also separates the dataset into 2 approximately equal subsets in terms of the number of articles.

The results were as follows:

Looking at this bar chart, we can conclude that SPSS, SAS and Stata tend to be more popular among beginners, while the inverse is true for R and Prism.

Statistical packages frequently used together

Out of a total of 40,582 articles in our dataset, 7,454 (18.4%) reported the use of more than 1 statistical software.

Here’s a table that shows the top 10 pairs of statistical packages frequently used together:

I don’t think there is much to interpret here, as the top 3 most used statistical packages are also the ones frequently used together.

So which statistical software should you choose?

A statistical package can influence the way you approach a statistics problem.

“It is not only the violin that shapes the violinist, we are all shaped by the tools we train ourselves to use.” – Edsger Dijkstra

When selecting a statistical tool, it is important to consider the following 4 points:

  • Popularity: The popularity of a piece of software affects how actively it will be updated, and is therefore a good predictor of how relevant it will remain in the future.
  • Cost: When it comes to statistical software, “free” does not always mean lower quality; some free and open-source options match, and sometimes exceed, the quality of their paid alternatives.
  • Interface: Although many statistical packages are marketed as simple, easy to use and capable of fully automated statistical analysis, if you want to work professionally with real-world data you will have to learn how to code, and you will need a tool that allows you to do so. That being said, you don’t need to be an expert programmer for most tasks; just knowing the basics will set you apart from your peers.
  • Flexibility: Stay away from tools that have an esoteric syntax and always prefer those that have much in common with others as you may at some point find yourself out of your comfort zone, working with different kinds of tools. For instance, although I prefer using R for everything related to statistics, I chose Python for analyzing data for this article, and JavaScript for creating visualizations. The key takeaway is to get yourself to become flexible enough to choose the best tool for the job instead of being forced to modify your projects just for the sake of working with your preferred tool.
References

  • Comeau DC, Wei CH, Islamaj Doğan R, and Lu Z. PMC text mining subset in BioC: about 3 million full text articles and growing, Bioinformatics, btz070, 2019.

Further reading

  • How to Write & Publish a Research Paper: Step-by-Step Guide
  • Checking the Popularity of 125 Statistical Tests and Models
  • How Long Should a Research Paper Be? Data from 61,519 Examples
  • Programming Languages Popularity in 12,086 Research Papers
  • How Many References to Cite? Based on 96,685 Research Papers
  • How Old Should References Be? Based on 3,823,919 Examples

Software supply chain attacks have increased financial and reputational impacts on companies globally, new BlackBerry research reveals.

BlackBerry study reveals more than 75 percent of software supply chains were exposed to cyberattacks in the last twelve months.

WATERLOO, ON , June 6, 2024 /CNW/ -- BlackBerry Limited (NYSE: BB; TSX: BB) today released the results of a global survey of 1,000 senior IT decision makers and cybersecurity leaders conducted in April 2024 by Coleman Parkes on the security of the global software supply chain. The BlackBerry study sought to identify the procedures companies currently use to manage and lower the risk of security breaches from their software supply chain, drawing comparisons to previous research conducted in October 2022.

Recovery After an Attack and Impact on the Business

After an attack, a little more than half of companies (51 percent) were able to recover from a breach within a week, down slightly from 53 percent two years ago, while nearly 40 percent took a month, up slightly from 37 percent. Slightly less than three quarters of attacks (74 percent) came through members of the software supply chain that companies were either not aware of or not monitoring before the breach. This was despite companies insisting on data encryption (52 percent), security awareness training for staff (48 percent), and multi-factor authentication (44 percent).

"How a company monitors and manages cybersecurity in their software supply chain has to rely on more than just trust," explains Christine Gadsby , Vice President, Product Security, BlackBerry. "IT leaders must tackle the lack of visibility as a priority."

And that risk comes with a real price -- in financial loss (64 percent), data loss (59 percent), reputational damage (58 percent), and operational impact (55 percent).

Confidence Buoyed by Monitoring

More than two thirds of respondents (68 percent) were "very confident" that suppliers can identify and prevent a vulnerability. A slightly smaller percentage (63 percent) were "very confident" supply chain partners have adequate cybersecurity regulatory and compliance practices. That confidence stems from regular monitoring.

When asked how often they inventory their supply chain partners for cybersecurity compliance, 41 percent asked for proof every quarter. These compliance requests include showing a software bill of materials (SBOM) or a Vulnerability Exploitability eXchange (VEX) artifact. The biggest barriers to regular software inventories are lack of technical understanding (51 percent), lack of visibility (46 percent) and lack of effective tools (41 percent).

Telling the Consumer

With over 75 percent of software supply chains attacked in the last 12 months, what about the consumer/end user? Seventy-eight percent of companies are tracking the impact, but only 65 percent are informing their customers. When asked why not, the top two responses were concern about the negative impact on corporate reputation (51 percent) and a lack of staff resources (45 percent).

"There is a risk that companies will be afraid of reporting attacks for fear of public shaming and damage to their corporate reputation," Gadsby notes. "Our research comes at a time of increased regulatory and legislative interest in addressing software supply chain security vulnerabilities."

Other Notable Statistics

  • Vulnerable components having the biggest impact on the organization: operating system – 27 percent; web browser – 21 percent
  • Expected time taken to be notified in the event of a supplier suffering a cyber breach: within four hours – 34 percent; within 24 hours – 46 percent; within 1-3 days – 18 percent
  • Comparability of suppliers' cybersecurity policies: of comparable strength – 66 percent; stronger – 30 percent

Notes to editor: Research conducted in April 2024 by Coleman Parkes on behalf of BlackBerry, with 1,000 IT decision-makers and cybersecurity professionals across North America (USA and Canada), the United Kingdom, France, Germany, Malaysia, and Japan.

About BlackBerry

BlackBerry (NYSE: BB; TSX: BB) provides intelligent security software and services to enterprises and governments worldwide.  The company's software powers over 235M vehicles. Based in Waterloo, Ontario , the company leverages AI and machine learning to deliver innovative solutions in the areas of cybersecurity, safety, and data privacy solutions and is a leader in the areas of endpoint management, endpoint security, encryption, and embedded systems.  BlackBerry's vision is clear - to secure a connected future you can trust.

For more information, visit BlackBerry.com and follow @BlackBerry.

Trademarks, including but not limited to BLACKBERRY and EMBLEM Design, are the trademarks or registered trademarks of BlackBerry Limited, and the exclusive rights to such trademarks are expressly reserved.   All other trademarks are the property of their respective owners.  BlackBerry is not responsible for any third-party products or services.

Media Contacts: BlackBerry Media Relations +1 (519) 597-7273 [email protected]

View original content to download multimedia: https://www.prnewswire.com/news-releases/software-supply-chain-attacks-have-increased-financial-and-reputational-impacts-on-companies-globally-new-blackberry-research-reveals-302165423.html

SOURCE BlackBerry Limited

View original content to download multimedia: http://www.newswire.ca/en/releases/archive/June2024/06/c1922.html

TRIADS software engineering team custom builds research tools for WashU faculty

Washington University faculty engage in research so innovative that it often demands specialized tools that don’t yet exist.

As part of its mission to foster groundbreaking, data-driven research, the Transdisciplinary Institute in Applied Data Sciences (TRIADS) has formed its own software engineering team to help connect faculty with custom solutions to meet their research needs.

WashU Professor of Political Science Dino P. Christenson approached the team in search of a sophisticated tool to help draw connections between people’s web browsing behavior and political opinions, while providing ironclad privacy protection for users. The resulting Online Privacy-Protected Synthesizer is a simple plugin that users can install in their web browser, which will then deliver a treasure trove of browsing data to Christenson and his research partners at Boston University.

“Our team had the great pleasure of working with software engineer Jessie Walker, who helped us turn the basics of an app into a polished plugin for multiple browsers,” Christenson said. “Thanks to TRIADS and Jessie’s quick and excellent work—and additional funding from the Weidenbaum Center—we’re ready for preliminary data collection and a major grant proposal in the near future.”

Walker and the TRIADS software team’s work requires a great deal of nimble thinking, building collaborations with faculty from multiple fields of study and developing solutions to supercharge their research.

“We view our role not just as technical support providers but as strategic partners who help faculty achieve their research goals,” Walker said. “By offering personalized guidance and solutions, we aim to facilitate breakthroughs in their research and contribute to their success.”

TRIADS software engineers don’t need to possess the expertise level of their faculty collaborators to help drive innovative research. Instead, the role demands the ability to ask the right questions and listen to the resulting answers critically. 

Nan Lin, a professor in WashU’s new Department of Statistics and Data Science, connected with TRIADS software engineer Greg Porter to help implement a new algorithm for quantile regression analysis. While Porter isn’t an expert in quantile regression, he knows how to code. And when you’re crunching numbers at the level of Lin and his team, you need a tool built to handle the strain.

“Greg’s expertise and guidance were instrumental in achieving our goals, and we are truly grateful for his contributions,” Lin said. “Throughout the project, his insights and recommendations significantly enhanced our understanding, particularly in identifying areas for parallelization and optimizing code performance.”

The TRIADS software engineering team also offers consultations and technical guidance for WashU faculty. To learn more and schedule a meeting, visit the TRIADS website.

Remote Work Statistics And Trends In 2024

Katherine Haan

Updated: Jun 12, 2023, 5:29am

Table of Contents

  • Key Remote Work Statistics
  • Remote Work by Industry and Occupation
  • Remote Work by Demographics
  • Remote Work Preferences (Surveys, Sentiment, etc.)
  • Benefits and Challenges of Remote Work
  • Remote Work Trends

The paradigm of traditional workspaces has undergone a seismic shift thanks to the Covid pandemic. As a result, remote work has emerged as a dominant trend, requiring human resources departments to pivot faster than ever before. In this comprehensive analysis, we present the most recent remote work statistics that are shaping the professional world and working environments across the nation.

As we navigate through the ever-evolving world of post-pandemic work in 2023, several key remote work statistics stand out. They not only offer insight into the current state of remote work but also provide a glimpse into its future.

As of 2023, 12.7% of full-time employees work from home, while 28.2% work a hybrid model

Currently, 12.7% of full-time employees work from home, illustrating the rapid normalization of remote work environments. Simultaneously, a significant 28.2% of employees have adapted to a hybrid work model. This model combines both home and in-office working, offering flexibility and maintaining a level of physical presence at the workplace [1] .

Despite the steady rise in remote work, the majority of the workforce (59.1%) still work in-office [1] . This percentage underscores the fact that while remote work is on an upswing, traditional in-office work is far from obsolete.

By 2025, 32.6 million Americans will work remotely

Looking ahead, the future of remote work seems promising. According to Upwork, by 2025, an estimated 32.6 million Americans will be working remotely, which equates to about 22% of the workforce [2] . This projection suggests a continuous, yet gradual, shift towards remote work arrangements.

98% of workers want to work remotely at least some of the time

Interestingly, workers’ preference for remote work aligns with this trend. A staggering 98% of workers expressed the desire to work remotely, at least part of the time [3] . This overwhelming figure reflects the workforce’s growing affinity towards the flexibility, autonomy and work-life balance that remote work offers.

93% of employers plan to continue conducting job interviews remotely

From the employers’ perspective, the acceptance of remote work is evident as well. A remarkable 93% of employers plan to continue conducting job interviews remotely [4] . This indicates a willingness to adapt to virtual methods and signals the recognition of remote work as a sustainable option.

16% of companies operate fully remote

About 16% of companies are already fully remote, operating without a physical office [5] . These companies are pioneers in the remote work paradigm, highlighting the feasibility of such models and paving the way for others to follow.

It’s evident that some industries and job roles are more geared towards remote work than others. Understanding these trends helps us predict the direction remote work will take in the future.

The computer and IT sector leads as the top industry for remote work in 2023 [6] . This aligns with the fact that tasks in this sector are often digital in nature, requiring only a reliable internet connection.

Other industries aren’t far behind. Marketing, accounting and finance, and project management have embraced remote work, using digital tools and platforms to ensure work continuity. The medical and health industry has also seen a shift towards remote work, primarily driven by the rise of telehealth services and the digitization of health records.

Even sectors such as HR and recruiting and customer service, traditionally reliant on physical offices, are experiencing the benefits of remote work. Virtual collaboration tools have enabled these industries to operate effectively, irrespective of location.

Shifting the lens to the most sought-after remote job roles, accountant tops the list in 2022. This showcases how traditional office functions, such as accounting, can successfully adapt to a remote format.

Other prominent remote job postings include executive assistant, customer service representative and senior financial analyst. These roles, although diverse, can all be performed effectively with the right technology, without the need for a physical office.

Recruiters, project managers, technical writers, product marketing managers, customer success managers and graphic designers also feature prominently on the list of remote roles. The wide variety of these roles signifies the expanding scope of remote work across different fields.

These industry and occupation-specific statistics highlight the widespread acceptance of remote work. With the evolution of digital tools and changing work norms, remote work is no longer a niche concept but a growing trend spanning various fields.

The top industry for remote workers in 2024 is computer and IT

  • Computer and IT
  • Accounting and Finance
  • Project Management
  • Medical and Health
  • HR and Recruiting
  • Customer Service

An accountant was the most common remote job posting in 2022

  • Executive Assistant
  • Customer Service Representative
  • Senior Financial Analyst
  • Project Manager
  • Technical Writer
  • Product Marketing Manager
  • Customer Success Manager
  • Graphic Designer

A closer look at the demographics of remote work in 2023 offers fascinating insights into who is embracing this work model and how it’s affecting their livelihoods.

The highest percentage of remote workers are aged 24 to 35

The age group most likely to work remotely is those aged 24 to 35 [7]. Within this demographic, 39% work remotely full time and 25% do so part time. This suggests that the younger workforce values the flexibility and autonomy offered by remote work, which could have implications for businesses looking to attract and retain this talent group.

Education also plays a significant role in remote work accessibility. Those with higher levels of education have a better chance at remote work. This is likely because roles that require postgraduate qualifications usually involve cognitive work that can be done from anywhere.

Workers with more education are more likely to have remote work options

A higher percentage of men work remote than women.

In terms of gender, there is a higher percentage of men who work from home than women. Specifically, 38% of men work remotely full time, and 23% part time. Comparatively, 30% of women work remotely full time, and 22% part time. These figures suggest a gender gap in remote work, highlighting the need for more inclusive remote work policies to ensure equal opportunities.

Remote workers on average earn $19,000 more than in-office workers

Remote work also seems to have a positive impact on earnings. Remote workers earn an average of $19,000 more than those in the office [1]. Remote workers make an average of $74,000, while in-office workers typically have an average salary of $55,000.

Those who opt for a hybrid work model report the highest average salary at $80,000. This may be attributed to the flexibility and balance that hybrid work offers, enabling workers to maximize their productivity and potentially take on more responsibilities.

These demographic insights serve as a snapshot of the current remote work landscape. Understanding these patterns can help employers design remote work policies that cater to their workforce's needs and preferences, while also bridging any gaps in accessibility and pay.

As remote work becomes more prevalent, it’s important to understand workers’ sentiments towards this evolving model. Surveys and studies offer revealing insights into workers’ preferences and how remote work impacts their lives.

57% of workers would look for a new job if their current company didn’t allow remote work

One of the most compelling statistics indicates that 57% of workers would consider leaving their current job if their employer stopped allowing remote work [6] . This figure underscores the value that workers place on the flexibility and autonomy associated with remote work.

35% of remote employees feel more productive when working fully remote

Productivity is another significant factor that influences workers’ remote work preferences. Thirty-five percent of remote employees feel more productive when working fully remotely [8] . This could be due to reduced commute times, fewer in-person distractions or the ability to design a work environment that suits their needs.

65% report wanting to work remote all of the time

Sixty-five percent of workers desire to work remotely all the time, highlighting the popularity of this work model [6] . At the same time, 32% prefer a hybrid schedule, which combines the best of both worlds—flexibility from remote work and collaboration opportunities from in-office work.

71% of remote workers said remote work helps balance their work and personal life

When it comes to work-life balance, a crucial aspect of employee well-being, remote work seems to be making a positive impact. Seventy-one percent of remote workers stated that remote work helps balance their work and personal life [9] . However, it’s important to acknowledge that 12% reported that it hurts their work-life balance, indicating that remote work may not suit everyone.

Understanding these preferences is vital for organizations as they design their remote work policies. The goal should be to harness the benefits of remote work—such as increased productivity and improved work-life balance—while addressing potential drawbacks to ensure a positive remote work experience for all employees.

Embracing remote work comes with its own set of benefits and challenges, impacting both employees and employers in various ways. Understanding these aspects can help in creating effective strategies for managing remote work.

Remote workers say that flexible hours are the top benefit of working remotely

69% of remote workers report increased burnout from digital communication tools.

However, the transition to remote work is not without its challenges. Sixty-nine percent of remote workers experience increased burnout from digital communication tools [10] . The constant stream of digital communication can lead to mental fatigue, underscoring the need for proper work boundaries and digital wellness strategies.

53% of remote workers say it’s harder to feel connected to their coworkers

Another challenge associated with remote work is the lack of face-to-face interaction. Fifty-three percent of remote workers report finding it harder to feel connected to their coworkers [9]. Yet, 37% feel that remote work neither hurts nor helps with connection to coworkers. This highlights the need for effective communication and team-building strategies in a remote setting.

Research shows that employers can save $11,000 per employee when switching to remote work

While the challenges are noteworthy, remote work also offers significant financial benefits for employers. Research shows that employers can save $11,000 per employee when switching to remote work [11] . These savings come from reduced costs associated with office space, utilities and other resources.

In essence, while remote work offers tangible benefits including flexible hours and cost savings, it also presents challenges such as digital burnout and reduced social connection. Employers and employees need to work together to maximize the benefits while effectively addressing the challenges to create a healthy and productive remote work environment.

The shift towards remote work has brought several notable trends to the forefront, shaping how companies and employees approach this model of work.

60% of companies use monitoring software to track remote employees

The use of monitoring software is one trend that’s gained traction. As many as 60% of companies now rely on such tools to track remote employees [12]. While these tools can aid productivity and accountability, they also raise privacy concerns, highlighting the need for transparency and consent in their use.

73% of executives believe remote workers pose a greater security risk

Cybersecurity has also become a major concern for businesses. A significant 73% of executives perceive remote workers as a greater security risk [13] . This concern stresses the need for robust security protocols and employee education about safe digital practices in a remote work setting.

32% of hybrid workers report they would take a pay cut to work remotely full time

Another trend that showcases the preference for remote work is the willingness of employees to accept financial trade-offs. A surprising 32% of hybrid workers state they would consider a pay cut to work remotely full time [14] . This reflects the high value workers place on the flexibility and autonomy remote work provides and could potentially impact how companies structure compensation in the future.

Each of these trends provides valuable insights into the evolving dynamics of remote work. As we continue to adapt to this new work landscape, understanding these trends will be crucial in shaping effective remote work policies and practices.

  • WFHResearch
  • ApolloTechnical
  • PewResearch
  • Forbes Advisor

  • Systematic Review
  • Open access
  • Published: 24 May 2024

Turnover intention and its associated factors among nurses in Ethiopia: a systematic review and meta-analysis

  • Eshetu Elfios 1 ,
  • Israel Asale 1 ,
  • Merid Merkine 1 ,
  • Temesgen Geta 1 ,
  • Kidist Ashager 1 ,
  • Getachew Nigussie 1 ,
  • Ayele Agena 1 ,
  • Bizuayehu Atinafu 1 ,
  • Eskindir Israel 2 &
  • Teketel Tesfaye 3  

BMC Health Services Research, volume 24, Article number: 662 (2024)

Nurses’ turnover intention, representing the extent to which nurses express a desire to leave their current positions, is a critical global public health challenge. This issue significantly affects the healthcare workforce, contributing to disruptions in healthcare delivery and organizational stability. In Ethiopia, a country facing its own unique set of healthcare challenges, understanding and mitigating nursing turnover are of paramount importance. Hence, the objectives of this systematic review and meta-analysis were to determine the pooled proportion of turnover intention among nurses and to identify factors associated with it in Ethiopia.

A comprehensive search for full-text studies written in English was carried out through an electronic web-based search strategy across databases including PubMed, CINAHL, Cochrane Library, Embase, Google Scholar and the Ethiopian University Repository online. The Joanna Briggs Institute (JBI) checklist was used to assess study quality. STATA version 17 software was used for statistical analyses. The meta-analysis was done using a random-effects method. Heterogeneity between the primary studies was assessed with Cochran’s Q and I² tests. Subgroup and sensitivity analyses were carried out to clarify the source of heterogeneity.

This systematic review and meta-analysis incorporated 8 articles, involving 3033 nurses in the analysis. The pooled proportion of turnover intention among nurses in Ethiopia was 53.35% (95% CI: 41.64, 65.05%), with significant heterogeneity between studies (I² = 97.9%, P = 0.001). A significant association with turnover intention among nurses was found for autonomous decision-making (OR: 0.28, CI: 0.14, 0.70) and promotion/development (OR: 0.67, CI: 0.46, 0.89).

Conclusion and recommendation

Our meta-analysis on turnover intention among Ethiopian nurses highlights a significant challenge, with a pooled proportion of 53.35%. Regional variations, such as the highest turnover in Addis Ababa and the lowest in Sidama, underscore the need for tailored interventions. The findings reveal a strong link between turnover intention and factors like autonomous decision-making and promotion/development. Recommendations for stakeholders and concerned bodies involve formulating targeted retention strategies, addressing regional variations, collaborating to advocate for nurse welfare, prioritizing career advancement, and reviewing policies to improve nurse retention.

Turnover intention pertaining to employment, often referred to as the intention to leave, is characterized by an employee’s contemplation of voluntarily transitioning to a different job or company [ 1 ]. Nurse turnover intention, representing the extent to which nurses express a desire to leave their current positions, is a critical global public health challenge. This issue significantly affects the healthcare workforce, contributing to disruptions in healthcare delivery and organizational stability [ 2 ].

The global shortage of healthcare professionals, including nurses, is an ongoing challenge that significantly impacts the capacity of healthcare systems to provide quality services [ 3 ]. Nurses, as frontline healthcare providers, play a central role in patient care, making their retention crucial for maintaining the functionality and effectiveness of healthcare delivery. However, the phenomenon of turnover intention, reflecting a nurse’s contemplation of leaving their profession, poses a serious threat to workforce stability [ 4 ].

Studies conducted globally show high turnover rates among nurses in several regions, with notable figures reported in Alexandria (68%), China (63.88%), and Jordan (60.9%) [5, 6, 7]. In contrast, Israel has a remarkably low turnover rate of 9% [8], while Brazil reports 21.1% [9], and Saudi hospitals 26% [10]. These diverse turnover rates highlight the global nature of the nurse turnover phenomenon, indicating varying degrees of workforce mobility in different regions.

The magnitude and severity of turnover intention among nurses worldwide underscore the urgency of addressing this issue. High turnover rates not only disrupt healthcare services but also result in a loss of valuable skills and expertise within the nursing workforce. This, in turn, compromises the continuity and quality of patient care, with potential implications for patient outcomes and overall health service delivery [11]. Extensive research conducted worldwide has identified a range of factors contributing to turnover intention among nurses [11, 12, 13, 14, 15, 16, 17]. These factors encompass both individual and organizational aspects, such as high workload, inadequate support, limited career advancement opportunities, job satisfaction, conflict, payment or reward, burnout, and sense of belongingness to the work environment. The complex interplay of these factors makes addressing turnover intention a multifaceted challenge that requires targeted interventions.

In Ethiopia, a country facing its own unique set of healthcare challenges, understanding and mitigating nursing turnover are of paramount importance. The healthcare system in Ethiopia grapples with issues like resource constraints, infrastructural limitations, and disparities in healthcare access [ 18 ]. Consequently, the factors influencing nursing turnover in Ethiopia may differ from those in other regions. Previous studies conducted in the Ethiopian context have started to unravel some of these factors, emphasizing the need for a more comprehensive examination [ 18 , 19 ].

Although many cross-sectional studies have been conducted on turnover intention among nurses in Ethiopia, the results exhibit variations. The reported turnover intention rates range from a minimum of 30.6% to a maximum of 80.6%. In light of these disparities, this systematic review and meta-analysis was undertaken to ascertain the aggregated prevalence of turnover intention among nurses in Ethiopia. By systematically analyzing findings from various studies, we aimed to provide a nuanced understanding of the factors influencing turnover intention specific to the Ethiopian healthcare context. Therefore, this systematic review and meta-analysis aimed to answer the following research questions.

What is the pooled prevalence of turnover intention among nurses in Ethiopia?

What are the factors associated with turnover intention among nurses in Ethiopia?

The primary objective of this review was to assess the pooled proportion of turnover intention among nurses in Ethiopia. The secondary objective was to identify the factors associated with turnover intention among nurses in Ethiopia.

Study design and search strategy

A comprehensive systematic review and meta-analysis was conducted, examining observational studies on turnover intention among nurses in Ethiopia. The procedure for this systematic review and meta-analysis was developed in accordance with the Preferred Reporting Items for Systematic review and Meta-analysis Protocols (PRISMA-P) statement [20]. The PRISMA 2015 statement was used to report the findings [21, 22]. This systematic review and meta-analysis was registered on PROSPERO with the registration number CRD42024499119.

We conducted a systematic and extensive search across multiple databases, including PubMed, CINAHL, Cochrane Library, Embase, Google Scholar and the Ethiopian University Repository online, to identify studies reporting turnover intention among nurses in Ethiopia. We reviewed the database available at http://www.library.ucsf.edu and the Cochrane Library to ensure that the intended task had not been previously undertaken, preventing any duplication. Furthermore, we screened the reference lists to retrieve relevant articles. The process involved utilizing EndNote (version X8) software for downloading, organizing, reviewing, and citing articles. Additionally, a manual search for cross-references was performed to discover any relevant studies not captured through the initial database search. The search employed a comprehensive set of the following search terms: “prevalence”, “turnover intention”, “intention to leave”, “attrition”, “employee attrition”, “nursing staff turnover”, “Ethiopian nurses”, “nurses”, and “Ethiopia”. These terms were combined using Boolean operators (AND, OR) to conduct a thorough and systematic search across the specified databases.
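
As an illustration of how such terms can be combined with Boolean operators, the following minimal Python sketch builds one possible query string. The grouping of synonyms into OR blocks is an assumption for illustration only; the review does not report the exact query used for each database.

# Illustrative only: one possible way to combine the listed search terms
# with Boolean operators. The grouping of synonyms below is an assumption,
# not the authors' exact query.
outcome_terms = ['"turnover intention"', '"intention to leave"', '"attrition"',
                 '"employee attrition"', '"nursing staff turnover"']
population_terms = ['"nurses"', '"Ethiopian nurses"']
setting_terms = ['"Ethiopia"']

def or_block(terms):
    # Join synonyms with OR and wrap the block in parentheses
    return "(" + " OR ".join(terms) + ")"

query = " AND ".join(or_block(t) for t in (outcome_terms, population_terms, setting_terms))
print(query)
# ("turnover intention" OR "intention to leave" OR ...) AND ("nurses" OR "Ethiopian nurses") AND ("Ethiopia")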

Eligibility criteria

Inclusion criteria.

The following inclusion criteria were established to guide the selection of articles for this systematic review and meta-analysis.

Population: Nurses working in Ethiopia.

Study period: studies conducted or published until 23 November 2023.

Study design: All observational study designs, such as cross-sectional, longitudinal, and cohort studies, were considered.

Setting: Only studies conducted in Ethiopia were included.

Outcome: turnover intention.

Study: All studies, whether published or unpublished, in the form of journal articles, master’s theses, and dissertations, were included up to the final date of data analysis.

Language: This study exclusively considered studies in the English language.

Exclusion criteria

Studies lacking full text, or with a Newcastle–Ottawa Quality Assessment Scale (NOS) score of 6 or less, were excluded. Studies failing to provide information on turnover intention among nurses, or for which the necessary details could not be obtained, were also excluded. Three authors (E.E., T.G., K.A.) independently assessed the eligibility of retrieved studies, and two other authors (E.I. and M.M.) were consulted to reach consensus on potential inclusion or exclusion.

Quality assessment and data extraction

Three authors (E.E., A.A., G.N.) independently conducted a critical appraisal of the included studies. The Joanna Briggs Institute (JBI) checklist for prevalence studies was used to assess the quality of the studies. Studies with a Newcastle–Ottawa Quality Assessment Scale (NOS) score of seven or more were considered acceptable [23]. The tool has nine parameters, which have yes, no, unclear, and not applicable options [24]. Two reviewers (I.A., B.A.) were involved when necessary during the critical appraisal process. Accordingly, all studies were included in our review (Table 1). The questions used to evaluate the methodological quality of studies on turnover intention among nurses and its associated factors in Ethiopia are the following:

Q1. Was the sample frame appropriate to address the target population?

Q2. Were study participants sampled appropriately?

Q3. Was the sample size adequate?

Q4. Were the study subjects and the setting described in detail?

Q5. Was the data analysis conducted with sufficient coverage of the identified sample?

Q6. Were valid methods used for the identification of the condition?

Q7. Was the condition measured in a standard, reliable way for all participants?

Q8. Was there appropriate statistical analysis?

Q9. Was the response rate adequate, and if not, was the low response rate managed appropriately?

Data were extracted and recorded in Microsoft Excel, guided by the Joanna Briggs Institute (JBI) data extraction form for observational studies. Three authors (E.E, M.G, T.T) independently conducted data extraction. Recorded data included the first author’s last name, publication year, study setting or country, region, study design, study period, sample size, response rate, population, type of management, proportion of turnover intention, and associated factors. Discrepancies in data extraction were resolved through discussion between extractors.

Data processing and analysis

Data analysis procedures involved importing the extracted data into STATA 14 statistical software to compute the pooled proportion of turnover intention among nurses. To evaluate potential publication bias and small study effects, both funnel plots and Egger’s test were employed [25, 26]. We used statistical tests such as the I² statistic to quantify heterogeneity and explore potential sources of variability. Additionally, subgroup analyses were conducted to investigate the impact of specific study characteristics on the overall results. I² values of 0%, 25%, 50%, and 75% were interpreted as indicating no, low, medium, and high heterogeneity, respectively [27].
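
To make the pooling step concrete, the sketch below shows a DerSimonian-Laird random-effects pooling of study proportions with Cochran’s Q and I², written in Python for illustration. The proportions and sample sizes are hypothetical placeholders, not the data of the eight included studies, and the authors’ actual computation was done in STATA.

import numpy as np

# Illustrative sketch (not the authors' STATA code) of DerSimonian-Laird
# random-effects pooling of proportions, with Cochran's Q and I-squared.
# The proportions and sample sizes below are hypothetical placeholders.
p = np.array([0.31, 0.45, 0.52, 0.58, 0.61, 0.66, 0.72, 0.81])  # study proportions
n = np.array([310, 350, 400, 380, 420, 360, 413, 400])          # study sample sizes

v = p * (1 - p) / n                              # within-study variances
w = 1 / v                                        # inverse-variance (fixed-effect) weights
theta_fixed = np.sum(w * p) / np.sum(w)

q = np.sum(w * (p - theta_fixed) ** 2)           # Cochran's Q
df = len(p) - 1
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - df) / c)                    # between-study variance (DL estimator)
i2 = max(0.0, (q - df) / q) * 100                # I-squared heterogeneity, in percent

w_re = 1 / (v + tau2)                            # random-effects weights
theta_re = np.sum(w_re * p) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))
ci_low, ci_high = theta_re - 1.96 * se_re, theta_re + 1.96 * se_re

print(f"Pooled proportion: {theta_re:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f}), I2 = {i2:.1f}%")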

To assess publication bias, we employed several methods, including funnel plots and Egger’s test. These techniques allowed us to visually inspect asymmetry in the distribution of study results and statistically evaluate the presence of publication bias. Furthermore, we conducted sensitivity analyses to assess the robustness of our findings to potential publication bias and other sources of bias.
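
A minimal sketch of Egger’s regression test follows, using the common formulation in which the standardized effect (estimate divided by its standard error) is regressed on precision (one over the standard error) and the intercept is tested against zero. The estimates and standard errors below are hypothetical, not values from the included studies.

import numpy as np
from scipy import stats

# Illustrative sketch of Egger's regression test for funnel-plot asymmetry.
# Regress the standardized effect (estimate / SE) on precision (1 / SE);
# an intercept far from zero suggests small-study effects. Data are hypothetical.
est = np.array([0.31, 0.45, 0.52, 0.58, 0.61, 0.66, 0.72, 0.81])         # study estimates
se = np.array([0.026, 0.027, 0.025, 0.025, 0.024, 0.025, 0.022, 0.020])  # standard errors

y = est / se                      # standardized effects
x = 1 / se                        # precisions
k = len(est)

slope, intercept = np.polyfit(x, y, 1)            # ordinary least squares fit
resid = y - (intercept + slope * x)
s2 = np.sum(resid ** 2) / (k - 2)                 # residual variance
se_intercept = np.sqrt(s2 * np.sum(x ** 2) / (k * np.sum((x - x.mean()) ** 2)))
t_stat = intercept / se_intercept
p_value = 2 * stats.t.sf(abs(t_stat), df=k - 2)   # two-sided test of the intercept

print(f"Egger intercept = {intercept:.2f}, p = {p_value:.2f}")  # large p: no clear asymmetry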

A random-effects meta-analysis was performed to assess turnover intention among nurses, with this method chosen to account for the observed variability [28]. Subgroup analyses were conducted to compare the pooled magnitude of turnover intention among nurses and associated factors across different regions. The results of the pooled prevalence were visually presented in a forest plot format with a 95% confidence interval.

Study selection

After conducting the initial comprehensive search concerning turnover intention among nurses through Medline, Cochrane Library, Web of Science, Embase, AJOL, Google Scholar, and other sources, a total of 1343 articles were retrieved. Of these, 575 were removed due to duplication. Five hundred ninety-three articles were then removed from the remaining 768 articles on the basis of title and abstract. Following this, 44 articles that could not be retrieved were removed. Finally, from the remaining 131 articles, 8 articles with a total of 3033 nurses were included in the systematic review and meta-analysis (Fig. 1).
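
For clarity, the screening arithmetic reported above works out as $1343 - 575 = 768$, $768 - 593 = 175$, and $175 - 44 = 131$; of the remaining 131 full-text articles, 8 met the inclusion criteria.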

Figure 1. PRISMA flow diagram of the selection process of studies on turnover intention among nurses in Ethiopia, 2024

Study characteristics

All 8 included studies had a cross-sectional design; of these, 2 were from the Tigray region, 2 from Addis Ababa (the capital), 1 from the South region, 1 from the Amhara region, 1 from the Sidama region, and 1 was multiregional and nationwide. The prevalence of turnover intention among nurses ranged from 30.6% to 80.6% (Table 2).

Pooled prevalence of turnover intention among nurses in Ethiopia

Our comprehensive meta-analysis revealed a notable turnover intention rate of 53.35% (95% CI: 41.64, 65.05%) among Ethiopian nurses, accompanied by substantial heterogeneity between studies (I² = 97.9%, P < 0.001), as depicted in Fig. 2. Given the observed variability, we employed a random-effects model to analyze the data, ensuring a robust adjustment for the significant heterogeneity across the included studies.

Figure 2. Forest plot showing the pooled proportion of turnover intention among nurses in Ethiopia, 2024

Subgroup analysis of turnover intention among nurses in Ethiopia

To address the observed heterogeneity, we conducted a subgroup analysis based on regions. The results of the subgroup analysis highlighted considerable variations, with the highest level of turnover intention identified in Addis Ababa at 69.10% (95% CI: 46.47, 91.74%) and substantial heterogeneity (I² = 98.1%). Conversely, the Sidama region exhibited the lowest level of turnover intention among nurses at 30.6% (95% CI: 25.18, 36.02%), accompanied by considerable heterogeneity (I² = 100.0%) (Fig. 3).

Figure 3. Subgroup analysis of the systematic review and meta-analysis by region of turnover intention among nurses in Ethiopia, 2024

Publication bias of turnover intention among nurses in Ethiopia

Egger’s test result (p = 0.64) is not statistically significant, indicating no evidence of publication bias in the meta-analysis (Table 3). Additionally, the symmetrical distribution of the included studies in the funnel plot (Fig. 4) supports the absence of publication bias across studies.

Figure 4. Funnel plot of the systematic review and meta-analysis on turnover intention among nurses in Ethiopia, 2024

Sensitivity analysis

The leave-one-out sensitivity analysis evaluated the influence of individual studies on the pooled prevalence of turnover intention among Ethiopian nurses. Each study was excluded from the analysis one at a time. The results showed that excluding any single study did not lead to a statistically significant change in the overall pooled estimate of turnover intention among nurses in Ethiopia. The findings are presented in Fig. 5, illustrating the stability and robustness of the overall pooled estimate even when specific studies are removed from the analysis.
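
A minimal, self-contained sketch of the leave-one-out procedure is shown below. For brevity it uses a simple inverse-variance pooled proportion rather than the full random-effects estimator, and the proportions and sample sizes are hypothetical placeholders.

import numpy as np

# Leave-one-out sensitivity sketch: re-pool the proportions with each study
# removed in turn and check how far the pooled estimate moves.
# Hypothetical data; a full analysis would reuse the random-effects pooling.
p = np.array([0.31, 0.45, 0.52, 0.58, 0.61, 0.66, 0.72, 0.81])
n = np.array([310, 350, 400, 380, 420, 360, 413, 400])

def pooled(p, n):
    # Simple inverse-variance pooled proportion (fixed-effect, for brevity)
    v = p * (1 - p) / n
    w = 1 / v
    return np.sum(w * p) / np.sum(w)

full = pooled(p, n)
for i in range(len(p)):
    keep = np.arange(len(p)) != i
    loo = pooled(p[keep], n[keep])
    print(f"Omitting study {i + 1}: pooled = {loo:.3f} (all studies: {full:.3f})")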

Figure 5. Sensitivity analysis of the pooled prevalence with each study removed one at a time, for the systematic review and meta-analysis of turnover intention among nurses in Ethiopia

Factors associated with turnover intention among nurses in Ethiopia

In our meta-analysis, we comprehensively reviewed and analyzed the determinants of turnover intention among nurses in Ethiopia across eight relevant studies [6, 29, 30, 31, 32, 33, 34, 35]. We identified a significant association between turnover intention and both autonomous decision-making (OR: 0.28, CI: 0.14, 0.70) (Fig. 6) and promotion/development (OR: 0.67, CI: 0.46, 0.89) (Fig. 7). In both instances, the odds ratios indicate a negative association, signifying that higher levels of autonomous decision-making and promotion/development were linked to reduced odds of turnover intention.

Figure 6. Forest plot of the association between autonomous decision-making and turnover intention among nurses in Ethiopia, 2024

Figure 7. Forest plot of the association between promotion/development and turnover intention among nurses in Ethiopia, 2024

In our comprehensive meta-analysis exploring turnover intention among nurses in Ethiopia, our findings revealed a pooled proportion of turnover intention of 53.35%. This significant proportion warrants a comparative analysis with turnover rates reported in other global regions. Distinct variations emerge when compared with turnover rates in Alexandria (68%), China (63.88%), and Jordan (60.9%) [5, 6, 7]. This comparison highlights the multifaceted nature of turnover intention, influenced by diverse contextual, cultural, and organizational factors. Conversely, Ethiopia’s turnover rate among nurses contrasts with the substantially lower figures reported in Israel (9%) [8], Brazil (21.1%) [9], and Saudi hospitals (26%) [10]. Challenges such as work overload, economic constraints, limited promotional opportunities, lack of recognition, and low job rewards are more prevalent among nurses in Ethiopia, contributing to higher turnover intention compared to their counterparts [7, 29, 36].

The highest turnover intention was observed in Addis Ababa, while the Sidama region displayed the lowest turnover intention among nurses. These differences highlight the complexity of turnover intention among Ethiopian nurses and show the importance of region-specific interventions to address unique factors and improve nurse retention.

Our systematic review and meta-analysis in the Ethiopian nursing context revealed a significant inverse association between turnover intention and autonomous decision-making. The odds of turnover intention are reduced by approximately 72% among nurses with autonomous decision-making compared to those without it. This finding was supported by other similar studies conducted in South Africa, Tanzania, Kenya, and Turkey [37, 38, 39, 40].

The significant association of turnover intention with promotion/development in our study underscores the crucial role of career advancement opportunities in alleviating turnover intention among nurses. Specifically, our analysis revealed that individuals with promotion/development opportunities had approximately 33% lower odds of turnover intention compared to those without such opportunities. These results emphasize the pivotal influence of organizational support in shaping the professional environment for nurses, providing substantive insights for the formulation of evidence-based strategies targeted at enhancing workforce retention. This finding is in line with previous research conducted in Taiwan, the Philippines and Italy [41, 42, 43].
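
The percentage interpretations above follow directly from the reported odds ratios: $(1 - 0.28) \times 100\% = 72\%$ for autonomous decision-making and $(1 - 0.67) \times 100\% = 33\%$ for promotion/development, i.e., the odds of turnover intention are roughly 72% and 33% lower, respectively, when these factors are present.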

Our meta-analysis on turnover intention among Ethiopian nurses reveals a considerable challenge, with a pooled proportion of 53.35%. Regional variations highlight the necessity for region-specific strategies, with Addis Ababa displaying the highest turnover intention and the Sidama region the lowest. A significant inverse association was found between turnover intention and both autonomous decision-making and promotion/development. These insights support the formulation of evidence-based strategies and policies to enhance nurse retention, contributing to the overall stability of the Ethiopian healthcare system.

Recommendations

Federal Ministry of Health (FMoH)

The FMoH should consider the regional variations in turnover intention and formulate targeted retention strategies. Investment in professional development opportunities and initiatives to enhance autonomy can be integral components of these strategies.

Ethiopian Nurses Association (ENA)

ENA plays a pivotal role in advocating for the welfare of nurses. The association is encouraged to collaborate with healthcare institutions to promote autonomy, create mentorship programs, and advocate for improved working conditions to mitigate turnover intention.

Healthcare institutions

Hospitals and healthcare facilities should prioritize the provision of career advancement opportunities and recognize the value of professional autonomy in retaining nursing staff. Tailored interventions based on regional variations should be considered.

Policy makers

Policymakers should review existing healthcare policies to identify areas for improvement in nurse retention. Policy changes that address challenges such as work overload, limited promotional opportunities, and economic constraints can positively impact turnover rates.

Future research initiatives

Further research exploring the specific factors contributing to turnover intention in different regions of Ethiopia is recommended. Understanding the nuanced challenges faced by nurses in various settings will inform the development of more targeted interventions.

Strength and limitations

Our systematic review and meta-analysis on nurse turnover intention in Ethiopia present several strengths. The comprehensive inclusion of diverse studies provides a holistic view of the issue, enhancing the generalizability of our findings. The use of a random-effects model accounts for potential heterogeneity, ensuring a more robust and reliable synthesis of data.

However, limitations should be acknowledged. The heterogeneity observed across studies, despite the use of a random-effects model, may impact the precision of the pooled estimate. These considerations should be taken into account when interpreting and applying the results of our analysis.

Data availability

The dataset used in this analysis is available from the corresponding author upon reasonable request.

Abbreviations

ENA: Ethiopian Nurses Association

FMoH: Federal Ministry of Health

JBI: Joanna Briggs Institute

PRISMA-P: Preferred Reporting Items for Systematic review and Meta-analysis Protocols

Kanchana L, Jayathilaka R. Factors impacting employee turnover intentions among professionals in Sri Lankan startups. PLoS ONE. 2023;18(2):e0281729.

Boateng AB, et al. Factors influencing turnover intention among nurses and midwives in Ghana. Nurs Res Pract. 2022;2022:4299702.

World Health Organization. WHO guideline on health workforce development, attraction, recruitment and retention in rural and remote areas. 2021. pp. 1–104.

Hayes LJ, et al. Nurse turnover: a literature review. Int J Nurs Stud. 2006;43(2):237–63.

Yang H, et al. Validation of work pressure and associated factors influencing hospital nurse turnover: a cross-sectional investigation in Shaanxi Province, China. BMC Health Serv Res. 2017;17:1–11.

Ayalew E et al. Nurses’ intention to leave their job in sub-Saharan Africa: A systematic review and meta-analysis. Heliyon, 2021. 7(6).

Al Momani M. Factors influencing public hospital nurses’ intentions to leave their current employment in Jordan. Int J Community Med Public Health. 2017;4(6):1847–53.

DeKeyser Ganz F, Toren O. Israeli nurse practice environment characteristics, retention, and job satisfaction. Isr J Health Policy Res. 2014;3(1):1–8.

de Oliveira DR, et al. Intention to leave profession, psychosocial environment and self-rated health among registered nurses from large hospitals in Brazil: a cross-sectional study. BMC Health Serv Res. 2017;17(1):21.

Dall’Ora C, et al. Association of 12 h shifts and nurses’ job satisfaction, burnout and intention to leave: findings from a cross-sectional study of 12 European countries. BMJ Open. 2015;5(9):e008331.

Lu H, Zhao Y, While A. Job satisfaction among hospital nurses: a literature review. Int J Nurs Stud. 2019;94:21–31.

Ramoo V, Abdullah KL, Piaw CY. The relationship between job satisfaction and intention to leave current employment among registered nurses in a teaching hospital. J Clin Nurs. 2013;22(21–22):3141–52.

Al Sabei SD, et al. Nursing work environment, turnover intention, Job Burnout, and Quality of Care: the moderating role of job satisfaction. J Nurs Scholarsh. 2020;52(1):95–104.

Wang H, Chen H, Chen J. Correlation study on payment satisfaction, psychological reward satisfaction and turnover intention of nurses. Chin Hosp Manag. 2018;38(03):64–6.

Loes CN, Tobin MB. Interpersonal conflict and organizational commitment among licensed practical nurses. Health Care Manag (Frederick). 2018;37(2):175–82.

Wei H, et al. The state of the science of nurse work environments in the United States: a systematic review. Int J Nurs Sci. 2018;5(3):287–300.

Nantsupawat A, et al. Effects of nurse work environment on job dissatisfaction, burnout, intention to leave. Int Nurs Rev. 2017;64(1):91–8.

Ayalew F, et al. Factors affecting turnover intention among nurses in Ethiopia. World Health Popul. 2015;16(2):62–74.

Debie A, Khatri RB, Assefa Y. Contributions and challenges of healthcare financing towards universal health coverage in Ethiopia: a narrative evidence synthesis. BMC Health Serv Res. 2022;22(1):866.

Moher D, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Reviews. 2015;4(1):1–9.

Moher D, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9.

Moher D, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. 2015.

Joanna Briggs Institute. Checklist for prevalence studies [Internet]. 2016;7.

Sakonidou S, et al. Interventions to improve quantitative measures of parent satisfaction in neonatal care: a systematic review. BMJ Paediatr Open. 2020;4(1):e000613.

Egger M, Smith GD. Meta-analysis: potentials and promise. BMJ. 1997;315(7119):1371.

Tura G, Fantahun M, Worku A. The effect of health facility delivery on neonatal mortality: systematic review and meta-analysis. BMC Pregnancy Childbirth. 2013;13:18.

Lin L. Comparison of four heterogeneity measures for meta-analysis. J Eval Clin Pract. 2020;26(1):376–84.

McFarland LV. Meta-analysis of probiotics for the prevention of antibiotic associated diarrhea and the treatment of Clostridium difficile disease. Am J Gastroenterol. 2006;101(4):812–22.

Asegid A, Belachew T, Yimam E. Factors influencing job satisfaction and anticipated turnover among nurses in Sidama zone public health facilities, South Ethiopia. Nursing Research and Practice. 2014;2014.

Wubetie A, Taye B, Girma B. Magnitude of turnover intention and associated factors among nurses working in emergency departments of governmental hospitals in Addis Ababa, Ethiopia: a cross-sectional institutional based study. BMC Nurs. 2020;19:97.

Getie GA, Betre ET, Hareri HA. Assessment of factors affecting turnover intention among nurses working at governmental health care institutions in east Gojjam, Amhara region, Ethiopia, 2013. Am J Nurs Sci. 2015;4(3):107–12.

Gebregziabher D, et al. The relationship between job satisfaction and turnover intention among nurses in Axum comprehensive and specialized hospital Tigray, Ethiopia. BMC Nurs. 2020;19(1):79.

Negarandeh R, et al. Magnitude of nurses’ intention to leave their jobs and its associated factors among nurses working in Tigray regional state, North Ethiopia: cross-sectional study. 2020.

Nigussie Bolado G, et al. The magnitude of turnover intention and Associated factors among nurses working at Governmental Hospitals in Southern Ethiopia: a mixed-method study. Nursing: Research and Reviews; 2023. pp. 13–29.

Woldekiros AN, Getye E, Abdo ZA. Magnitude of job satisfaction and intention to leave their present job among nurses in selected federal hospitals in Addis Ababa, Ethiopia. PLoS ONE. 2022;17(6):e0269540.

Rhoades L, Eisenberger R. Perceived organizational support: a review of the literature. J Appl Psychol. 2002;87(4):698.

Lewis M. Causal factors that influence turnover intent in a manufacturing organisation. University of Pretoria (South Africa); 2008.

Kuria S, Alice O, Wanderi PM. Assessment of causes of labour turnover in three and five star-rated hotels in Kenya International journal of business and social science, 2012. 3(15).

Blaauw D, et al. Comparing the job satisfaction and intention to leave of different categories of health workers in Tanzania, Malawi, and South Africa. Global Health Action. 2013;6(1):19287.

Masum AKM, et al. Job satisfaction and intention to quit: an empirical analysis of nurses in Turkey. PeerJ. 2016;4:e1896.

Song L. A study of factors influencing turnover intention of King Power Group at Downtown Area in Bangkok, Thailand. Volume 2. International Review of Research in Emerging Markets & the Global Economy; 2016. 3.

Karanikola MN, et al. Moral distress, autonomy and nurse-physician collaboration among intensive care unit nurses in Italy. J Nurs Manag. 2014;22(4):472–84.

Labrague LJ, McEnroe-Petitte DM, Tsaras K. Predictors and outcomes of nurse professional autonomy: a cross-sectional study. Int J Nurs Pract. 2019;25(1):e12711.

Funding

No funding was received.

Author information

Authors and affiliations.

School of Nursing, College of Health Science and Medicine, Wolaita Sodo University, Wolaita Sodo, Ethiopia

Eshetu Elfios, Israel Asale, Merid Merkine, Temesgen Geta, Kidist Ashager, Getachew Nigussie, Ayele Agena & Bizuayehu Atinafu

Department of Midwifery, College of Health Science and Medicine, Wolaita Sodo University, Wolaita Sodo, Ethiopia

Eskindir Israel

Department of Midwifery, College of Health Science and Medicine, Wachamo University, Hossana, Ethiopia

Teketel Tesfaye

Contributions

E.E. conceptualized the study, designed the research, performed statistical analysis, and led the manuscript writing. I.A, T.G, M.M contributed to the study design and provided critical revisions. K.A., G.N, B.A., E.I., and T.T. participated in data extraction and quality assessment. M.M. and T.G. K.A. and G.N. contributed to the literature review. I.A, A.A. and B.A. assisted in data interpretation. E.I. and T.T. provided critical revisions to the manuscript. All authors read and approved the final version.

Corresponding author

Correspondence to Eshetu Elfios .

Ethics declarations

Ethical approval.

Ethical approval and informed consent are not required, as this study is a systematic review and meta-analysis that only involved the use of previously published data.

Ethical guidelines

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Elfios, E., Asale, I., Merkine, M. et al. Turnover intention and its associated factors among nurses in Ethiopia: a systematic review and meta-analysis. BMC Health Serv Res 24 , 662 (2024). https://doi.org/10.1186/s12913-024-11122-9

Download citation

Received : 20 January 2024

Accepted : 20 May 2024

Published : 24 May 2024

DOI : https://doi.org/10.1186/s12913-024-11122-9

  • Turnover intention
  • Systematic review
  • Meta-analysis

research on statistical software

IMAGES

  1. Top 48 Statistical Software in 2022

  2. Statistical Analysis Softwares

  3. Significance of statistical software in data analysis: SPSS & STATA

  4. Top 48 Free Statistical Software in 2022

  5. Standard statistical tools in research and data analysis

  6. Statistical Analysis Software

VIDEO

  1. Basics in statistical analysis and software used in research

  2. Developing Statistical Math Software

  3. SPSS Part 5

  4. Day 3: Statistical Data Analysis Using R Programming for Staff and Students of Makerere University

  5. Week 7

  6. Statistical Thinking for Industrial Problem Solving

COMMENTS

  1. The Best Statistical Software Tools Of 2024

    Statistical software consists of systems with these techniques that support research and business analytics. Best Statistical Software. After diligent research, our analysts curated a list of the best statistics software on the market. ... But, open-source statistical software fills a gap. For instance, JASP offers students a no-cost ...

  2. Best Statistical Analysis Software: User Reviews from June 2024

    IBM SPSS Statistics. (875) 4.2 out of 5. Optimized for quick response. 1st Easiest To Use in Statistical Analysis software. Save to My Lists. Deals Special offer! 10% off: $1069.20/year. Claim Offer. Overview.

  3. Journal of Statistical Software

    Established in 1996, the Journal of Statistical Software publishes articles on statistical software along with the source code of the software itself and replication code for all empirical results. Furthermore, shorter code snippets are published as well as book reviews and software reviews. All contents are freely available online under open ...

  4. IBM SPSS Statistics

    The IBM® SPSS® Statistics software puts the power of advanced statistical analysis at your fingertips. Whether you are a beginner, an experienced analyst, a statistician or a business professional it offers a comprehensive suite of advanced capabilities, flexibility and usability that are not available in traditional statistical software.

  5. Trends in the Usage of Statistical Software and Their Associated Study

    The percentages of the statistical software used in these articles are shown in Figure 2. SPSS was the most commonly used statistical software for data analysis with 3,368 (52.1%) articles, followed by SAS 833 (12.9%), and Stata 815 (12.6%). WinBugs was the least used statistical software with only 40 (0.6%) of the total articles.

  6. SPSS Software

    The IBM® SPSS® software platform offers advanced statistical analysis, a vast library of machine learning algorithms, text analysis, open-source extensibility, integration with big data and seamless deployment into applications. Its ease of use, flexibility and scalability make SPSS accessible to users of all skill levels.

  7. List of Top Statistical Analysis Software 2024

    Pricing Information. Statistical analysis software pricing models can vary widely depending on the vendor, features, and software edition. Basic solutions start between $20 and $140, while more advanced plans can rise into the hundreds or even thousands of dollars. Freelancers typically charge between $400 and $1,000 or more per project.

  8. Top 48 Statistical Software

    Statistical software uses different data analysis techniques such as regression analysis, sampling, multivariate analysis, cluster analysis, and Bayesian analysis. Multi-platform Support: The best statistical software products can run on popular operating systems such as Windows, Linux, UNIX, and Macos. They also support multiple programming ...

  9. Leading Statistical Analysis Software, SAS/STAT

    SAS/STAT statistical software includes exact techniques for small data sets, high-performance statistical modeling tools for large data tasks and modern methods for analyzing data with missing values. And because the software is updated regularly, you'll benefit from using the newest methods in the rapidly expanding field of statistics. ...

  10. Best Statistical Analysis Software 2024

    Find the top Statistical Analysis software of 2024 on Capterra. Based on millions of verified user reviews - compare and filter for what's important to you to find the best tools for your needs.

  11. 11 Best Data Analysis Software for Research [2023]

    1. Microsoft Excel. Microsoft Excel is a widely available spreadsheet software often used for basic data analysis and visualization. It is user-friendly and suitable for researchers working with small datasets. Excel is readily accessible and frequently used for preliminary data exploration and simple calculations.

  12. SPSS, SAS, R, Stata, JMP? Choosing a Statistical Software Package or

    Problem is that i did not know about the Statistics software such as SPSS,SAS,STAT,etc, Can you give me any suggestion. Reply. Rajkumar says. January 23, 2017 at 9:27 am. ... so,i research about statistical softwares and decide to use STATA inside R! Reply. Karen says. June 6, 2013 at 5:18 pm.

  13. Quantitative Analysis Guide: Which Statistical Software to Use?

    The development of SAS (Statistical Analysis System) began in 1966 by Anthony Bar of North Carolina State University and later joined by James Goodnight. The National Institute of Health funded this project with a goal of analyzing agricultural data to improve crop yields.

  14. List of statistical software

    Regression Analysis of Time Series (RATS) - comprehensive econometric analysis package. Rguroo Statistical Software - An online statistical software designed for teaching and analyzing data. S-PLUS - general statistics package. SAS (software) - comprehensive statistical package.

  15. JASP

    JASP is an open-source statistics program that is free, friendly, and flexible. ... This is why we have developed JASP, a free cross-platform software program with a state-of-the-art graphical user interface. Read More. Your First Steps Using JASP ... European Research Council. www.erc.europa.eu. Nederlandse Organisatie voor Wetenschappelijk ...

  16. Basic statistical tools in research and data analysis

    Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if ...

  17. What is Statistical Analysis? Types, Software, Examples

    Popular Statistical Analysis Software. Several statistical software packages are widely used in various industries and research domains. Some of the most popular options include: R: R is a free, open-source programming language and software environment for statistical computing and graphics. It offers a vast ecosystem of packages for data ...

  18. Best 17 Free Statistical Analysis Software Picks in 2024

    Best free Statistical Analysis Software across 17 Statistical Analysis Software products. See reviews of IBM SPSS Statistics, Posit, JMP and compare free or paid products easily. Get the G2 on the right Statistical Analysis Software for you. ... Q Research Software by Displayr features and usability ratings that predict user satisfaction. 0.0.

  19. Statistical Software Popularity in 40,582 Research Papers

    Out of these 76,147 research papers, only 40,582 (53.3%) mentioned the use of at least 1 statistical software. Here's a summary of the key findings. 1- SPSS was the most used statistical software overall, mentioned in 40.48% of research papers, followed by R (20.52%) and Prism (17.38%).

  20. 7 Data Analysis Software Applications You Need to Know

    1. Excel. Microsoft Excel is one of the most common software used for data analysis. In addition to offering spreadsheet functions capable of managing and organizing large data sets, Excel also includes graphing tools and computing capabilities like automated summation or "AutoSum.". Excel also includes Analysis ToolPak, which features data ...

  21. The Beginner's Guide to Statistical Analysis

    Table of contents. Step 1: Write your hypotheses and plan your research design. Step 2: Collect data from a sample. Step 3: Summarize your data with descriptive statistics. Step 4: Test hypotheses or make estimates with inferential statistics.

  22. Choosing Statistical Software: A Researcher's Guide

    Choosing the right statistical software is crucial for the success of your research. It's like selecting the perfect tool for a job; the right choice can make your work efficient and insightful ...

  23. What Is Statistical Analysis? Definition, Types, and Jobs

    Statistical software is beneficial for both descriptive and inferential statistics. You can use it to generate charts and graphs or perform computations to draw conclusions and inferences from the data. While the type of statistical software you will use will depend on your employer, common software used include: ... Operational research ...

  24. IBM® SPSS® Statistics

    IBM SPSS Statistics is the world's leading statistical software used to solve business and research problems by employing ad hoc analysis, hypothesis testing, and predictive analytics. Organizations utilize IBM SPSS Statistics to understand data, analyze trends, forecast, and plan to validate assumptions and drive accurate conclusions.

  25. Software Supply Chain Attacks Have Increased Financial and Reputational

    Software Supply Chain Attacks Have Increased Financial and Reputational Impacts on Companies Globally, New BlackBerry Research Reveals CNW Group Thu, Jun 6, 2024, 6:00 AM 3 min read

  26. New AI-powered statistics method has potential to improve tissue and

    New AI-powered statistics method has potential to improve tissue and disease research; ... Research team hopeful that the method, called IRIS, can provide more detailed information for precision health treatment plans and health outcomes. June 6, 2024. Researchers at the University of Michigan and Brown University have developed a new ...

  27. TRIADS software engineering team custom builds research tools for WashU

    Washington University faculty engage in research so innovative that it often demands specialized tools that don't yet exist. As part of its mission to foster groundbreaking, data-driven research, the Transdisciplinary Institute in Applied Data Sciences (TRIADS) has formed its own software engineering team to help connect faculty with custom solutions to meet their research needs. WashU ...

  28. Remote Work Statistics & Trends In (2024)

    The experts at Forbes Advisor analyze and present the most recent remote work statistics of 2023 that are shaping the professional world. ... Research shows that employers can save $11,000 per ...

  29. FDA and CluePoints Sign New 3 Year Cooperative Research and Development

    As a result of the original CRADA, the CluePoints software was deployed in the FDA high performance computing environment, new statistical tests were developed to detect anomalous sites, the site ...

  30. Turnover intention and its associated factors among nurses in Ethiopia

    Data analysis procedures involved importing the extracted data into STATA 14 statistical software for conducting a pooled proportion of turnover intention among nurses. To evaluate potential publication bias and small study effects, both funnel plots and Egger's test were employed [25, 26]. We used statistical tests such as the I statistic to ...