
Effective Experiment Design and Data Analysis in Transportation Research (2012)

Chapter 3: Examples of Effective Experiment Design and Data Analysis in Transportation Research


About this Chapter

This chapter provides a wide variety of examples of research questions. The examples demonstrate varying levels of detail with regard to experiment designs and the statistical analyses required. The number and types of examples were selected after consulting with many practitioners. The attempt was made to provide a couple of detailed examples in each of several areas of transportation practice. For each type of problem or analysis, some comments also appear about research topics in other areas that might be addressed using the same approach. Questions that were briefly introduced in Chapter 2 are addressed in considerably more depth in the context of these examples.

All the examples are organized and presented using the outline below. Where applicable, references to the two-volume primer produced under NCHRP Project 20-45 have been provided to encourage the reader to obtain more detail about calculation techniques and more technical discussion of issues.

Basic Outline for Examples

The numbered outline below is the model for the structure of all of the examples that follow.

1. Research Question/Problem Statement: A simple statement of the research question is given. For example, in the maintenance category, does crack sealant A perform better than crack sealant B?
2. Identification and Description of Variables: The dependent and independent variables are identified and described. The latter includes an indication of whether, for example, the variables are discrete or continuous.
3. Data Collection: A hypothetical scenario is presented to describe how, where, and when data should be collected. As appropriate, reference is made to conventions or requirements for some types of data (e.g., if delay times at an intersection are being calculated before and after some treatment, the data collected need to be consistent with the requirements in the Highway Capacity Manual). Typical problems are addressed, such as sample size, the need for control groups, and so forth.
4. Specification of Analysis Technique and Data Analysis: The links between successfully framing the research question, fully describing the variables that need to be considered, and the specification of the appropriate analysis technique are highlighted in each example. References to NCHRP Project 20-45 are provided for additional detail. The appropriate types of statistical test(s) are described for the specific example.
5. Interpreting the Results: In each example, results that can be expected from the analysis are discussed in terms of what they mean from a statistical perspective (e.g., the t-test result from a comparison of means indicates whether the mean values of two distributions can be considered to be equal with a specified degree of confidence) as well as an operational perspective (e.g., judging whether the difference is large enough to make an operational difference). In each example, the typical results and their limitations are discussed.

6. Conclusion and Discussion: This section recaps how the early steps in the process lead directly to the later ones. Comments are made regarding how changes in the early steps can affect not only the results of the analysis but also the appropriateness of the approach.
7. Applications in Other Areas of Transportation Research: Each example includes a short list of typical applications in other areas of transportation research for which the approach or analysis technique would be appropriate.

Techniques Covered in the Examples

The determination of what kinds of statistical techniques to include in the examples was made after consulting with a variety of professionals and examining responses to a survey of research-oriented practitioners. The examples are not exhaustive insofar as not every type of statistical analysis is covered. However, the attempt has been made to cover a representative sample of techniques that the practitioner is most likely to encounter in undertaking or supervising research-oriented projects. The following techniques are introduced in one or more examples:

• Descriptive statistics
• Fitting distributions/goodness of fit (used in one example)
• Simple one- and two-sample comparison of means
• Simple comparisons of multiple means using analysis of variance (ANOVA)
• Factorial designs (also ANOVA)
• Simple comparisons of means before and after some treatment
• Complex before-and-after comparisons involving control groups
• Trend analysis
• Regression
• Logit analysis (used in one example)
• Survey design and analysis
• Simulation
• Non-parametric methods (used in one example)

Although the attempt has been made to make the examples as readable as possible, some technical terms may be unfamiliar to some readers. Detailed definitions for most applicable statistical terms are available in the glossary in NCHRP Project 20-45, Volume 2, Appendix A. Most definitions used here are consistent with those contained in NCHRP Project 20-45, which contains useful information for everyone from the beginning researcher to the most accomplished statistician.

Some variations appear in the notations used in the examples. For example, in statistical analysis an alternate hypothesis may be represented by Ha or by H1, and readers will find both notations used in this report. The examples were developed by several authors with differing backgrounds, and latitude was deliberately given to the authors to use the notations with which they are most familiar. The variations have been included purposefully to acquaint readers with the fact that the same concepts (e.g., something as simple as a mean value) may be noted in various ways by different authors or analysts.

Finally, the more widely used techniques, such as analysis of variance (ANOVA), are applied in more than one example.
Readers interested in ANOVA are encouraged to read all the ANOVA examples as each example presents different aspects of or perspectives on the approach, and computational techniques presented in one example may not be repeated in later examples (although a citation typically is provided).

Areas Covered in the Examples

Transportation research is very broad, encompassing many fields. Based on consultation with many research-oriented professionals and a survey of practitioners, key areas of research were identified. Although these areas overlap considerably, explicit examples in the following areas are included:

• Construction
• Environment
• Lab testing and instrumentation
• Maintenance
• Materials
• Pavements
• Public transportation
• Structures/bridges
• Traffic operations
• Traffic safety
• Transportation planning
• Work zones

The 21 examples provided on the following pages begin with the most straightforward analytical approaches (i.e., descriptive statistics) and progress to more sophisticated approaches. Table 1 lists the examples along with the area of research and method of analysis for each example.

Example 1: Structures/Bridges; Descriptive Statistics

Area: Structures/bridges
Method of Analysis: Descriptive statistics (exploring and presenting data to describe existing conditions and develop a basis for further analysis)

1. Research Question/Problem Statement: An engineer for a state agency wants to determine the functional and structural condition of a select number of highway bridges located across the state. Data are obtained for 100 bridges scheduled for routine inspection. The data will be used to develop bridge rehabilitation and/or replacement programs. The objective of this analysis is to provide an overview of the bridge conditions, and to present various methods to display the data in a concise and meaningful manner.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this case, bridge inspection data from the state are to be studied and summarized.

2. Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables that include location information, traffic data, structural elements' type and condition, and functional characteristics. In this example, the variables are: bridge condition ratings of the deck, superstructure, and substructure; and overall condition of the bridge. Based on the severity of deterioration and the extent of spread through a bridge component, a condition rating is assigned on a discrete scale from 0 (failed) to 9 (excellent). These ratings (in addition to several other factors) are used in categorization of a bridge in one of three overall conditions: not deficient; structurally deficient; or functionally obsolete.

Table 1. Examples provided in this report.

Example  Area                                       Method of Analysis
1        Structures/bridges                         Descriptive statistics (exploring and presenting data to describe existing conditions)
2        Public transport                           Descriptive statistics (organizing and presenting data to describe a system or component)
3        Environment                                Descriptive statistics (organizing and presenting data to explain current conditions)
4        Traffic operations                         Goodness of fit (chi-square test; determining if observed/collected data fit a certain distribution)
5        Construction                               Simple comparisons to specified values (t-test to compare the mean value of a small sample to a standard or other requirement)
6        Maintenance                                Simple two-sample comparison (t-test for paired comparisons; comparing the mean values of two sets of matched data)
7        Materials                                  Simple two-sample comparisons (t-test for paired comparisons and the F-test for comparing variances)
8        Laboratory testing and/or instrumentation  Simple ANOVA (comparing the mean values of more than two samples using the F-test)
9        Materials                                  Simple ANOVA (comparing more than two mean values and the F-test for equality of means)
10       Pavements                                  Simple ANOVA (comparing the mean values of more than two samples using the F-test)
11       Pavements                                  Factorial design (an ANOVA approach exploring the effects of varying more than one independent variable)
12       Work zones                                 Simple before-and-after comparisons (exploring the effect of some treatment before it is applied versus after it is applied)
13       Traffic safety                             Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors)
14       Work zones                                 Trend analysis (examining, describing, and modeling how something changes over time)
15       Structures/bridges                         Trend analysis (examining a trend over time)
16       Transportation planning                    Multiple regression analysis (developing and testing proposed linear models with more than one independent variable)
17       Traffic operations                         Regression analysis (developing a model to predict the values that a dependent variable can take as a function of one or more independent variables)
18       Transportation planning                    Logit and related analysis (developing predictive models when the dependent variable is dichotomous)
19       Public transit                             Survey design and analysis (organizing survey data for statistical analysis)
20       Traffic operations                         Simulation (using field data to simulate or model operations or outcomes)
21       Traffic safety                             Non-parametric methods (methods to be used when data do not follow assumed or conventional distributions)

3. Data Collection: Data are collected at 100 scheduled locations by bridge inspectors. It is important to note that the bridge condition rating scale is based on subjective categories, and there may be inherent variability among inspectors in their assignment of ratings to bridge components. A sample of data is compiled to document the bridge condition rating of the three primary structural components and the overall condition by location and ownership (Table 2). Notice that the overall condition of a bridge is not necessarily based only on the condition rating of its components (e.g., they cannot just be added).

Table 2. Sample bridge inspection data.

Bridge No.   Owner         Location   Deck   Superstructure   Substructure   Overall Condition
1            State         Rural      8      8                8              ND*
7            Local agency  Rural      6      6                6              FO*
39           State         Urban      6      6                2              SD*
69           State park    Rural      7      5                5              SD
92           City          Urban      5      6                6              ND

*ND = not deficient; FO = functionally obsolete; SD = structurally deficient.

4. Specification of Analysis Technique and Data Analysis: The two primary variables of interest are bridge condition rating and overall condition. The overall condition of the bridge is a categorical variable with three possible values: not deficient; structurally deficient; and functionally obsolete. The frequencies of these values in the given data set are calculated and displayed in the pie chart below (Figure 1). A pie chart provides a visualization of the relative proportions of bridges falling into each category that is often easier to communicate to the reader than a table showing the same information.

[Figure 1. Highway bridge conditions: Neither SD/FO, 77%; Structurally Deficient (SD), 13%; Functionally Obsolete (FO), 10%.]

Another way to look at the overall bridge condition variable is by cross-tabulation of the three condition categories with the two location categories (urban and rural), as shown in Table 3. A cross-tabulation provides the joint distribution of two (or more) variables such that each cell represents the frequency of occurrence of a specific combination of possible values. For example, as seen in Table 3, there are 10 structurally deficient bridges in rural areas, which represent 11.4% of all rural area bridges inspected. The numbers in the parentheses are column percentages and add up to 100%. Table 3 also shows that 88 of the bridges inspected were located in rural areas, whereas 12 were located in urban areas.

Table 3. Cross-tabulation of bridge condition by location.

                         Rural         Urban        Total
Structurally deficient   10 (11.4%)    3 (25.0%)    13
Functionally obsolete    6 (6.8%)      4 (33.3%)    10
Not deficient            72 (81.8%)    5 (41.7%)    77
Total                    88 (100%)     12 (100%)    100

The mean values of the bridge condition rating variable for deck, superstructure, and substructure are shown in Table 4. These have been calculated by taking the sum of all the values and then dividing by the total number of cases (100 in this example).

Table 4. Bridge condition ratings.

Rating Category                                                                       Mean Value
Overall average bridge condition rating (deck)                                        6.20
Overall average bridge condition rating (superstructure)                              6.47
Overall average bridge condition rating (substructure)                                6.08
Average bridge condition rating of structurally deficient bridges (deck)              4.92
Average bridge condition rating of structurally deficient bridges (superstructure)    5.30
Average bridge condition rating of structurally deficient bridges (substructure)      4.54

Generally, a condition rating of 4 or below indicates deficiency in a structural component. For the purpose of comparison, the mean bridge condition rating of the 13 structurally deficient bridges also is provided. Notice that while the rating scale for the bridge conditions is discrete with values ranging from 0 (failure) to 9 (excellent), the average bridge condition variable is continuous. Therefore, an average score of 6.47 would indicate the overall condition of all bridges to be between 6 (satisfactory) and 7 (good). The combined bridge condition rating of deck, superstructure, and substructure is not defined; therefore, calculating the mean of the three components' average ratings would make no sense. Also, the average bridge condition rating of functionally obsolete bridges is not calculated because other functional characteristics also accounted for this designation.

The distributions of the bridge condition ratings for deck, superstructure, and substructure are shown in Figure 2. Based on the cut-off point of 4, approximately 7% of all bridge decks, 2% of all superstructures, and 5% of all substructures are deficient.

[Figure 2. Bridge condition ratings: bar chart of the percentage of structures at each condition rating (0 to 9) for deck, superstructure, and substructure.]

5. Interpreting the Results: The results indicate that a majority of bridges (77%) are not structurally or functionally deficient. The inspections were carried out on bridges primarily located in rural areas (88 out of 100). The bridge condition variable may also be cross-tabulated with the ownership variable to determine distribution by jurisdiction. The average condition ratings for the three bridge components for all bridges lie between 6 (satisfactory, some minor problems) and 7 (good, no problems noted).

6. Conclusion and Discussion: This example illustrates how to summarize and present quantitative and qualitative data on bridge conditions. It is important to understand the measurement scale of variables in order to interpret the results correctly. Bridge inspection data collected over time may also be analyzed to determine trends in the condition of bridges in a given area. Trend analysis is addressed in Example 15 (structures).

7. Applications in Other Areas of Transportation Research: Descriptive statistics could be used to present data in other areas of transportation research, such as:
• Transportation Planning—to assess the distribution of travel times between origin-destination pairs in an urban area. Overall averages could also be calculated.
• Traffic Operations—to analyze the average delay per vehicle at a railroad crossing.
• Traffic Operations/Safety—to examine the frequency of turning violations at driveways with various turning restrictions.
• Work Zones, Environment—to assess the average energy consumption during various stages of construction.
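The tabulations used in this example also can be generated programmatically rather than by hand or in a spreadsheet. The short sketch below, written in Python with the pandas library, is offered only as an illustration of the approach: the records are the five sample rows from Table 2 (not the full 100-bridge data set), and the column names are assumptions made for this sketch.

    # Minimal sketch (Python + pandas): cross-tabulation and mean ratings
    # from bridge inspection records, using only the Table 2 sample rows.
    import pandas as pd

    records = [
        (1, "State", "Rural", 8, 8, 8, "ND"),
        (7, "Local agency", "Rural", 6, 6, 6, "FO"),
        (39, "State", "Urban", 6, 6, 2, "SD"),
        (69, "State park", "Rural", 7, 5, 5, "SD"),
        (92, "City", "Urban", 5, 6, 6, "ND"),
    ]
    cols = ["bridge", "owner", "location", "deck",
            "superstructure", "substructure", "overall"]
    df = pd.DataFrame(records, columns=cols)

    # Counts of overall condition by location (compare with Table 3)
    counts = pd.crosstab(df["overall"], df["location"], margins=True)

    # Column percentages (each location column sums to 100%)
    col_pct = pd.crosstab(df["overall"], df["location"], normalize="columns") * 100

    # Mean condition rating for each component (compare with Table 4)
    means = df[["deck", "superstructure", "substructure"]].mean()

    print(counts, col_pct.round(1), means.round(2), sep="\n\n")

With the full inspection file in place of the five sample rows, the same few lines would reproduce Tables 3 and 4 exactly.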

Example 2: Public Transport; Descriptive Statistics

Area: Public transport
Method of Analysis: Descriptive statistics (organizing and presenting data to describe a system or component)

1. Research Question/Problem Statement: The manager of a transit agency would like to present information to the board of commissioners on changes in revenue that resulted from a change in the fare. The transit system provides three basic types of service: local bus routes, express bus routes, and demand-responsive bus service. There are 15 local bus routes, 10 express routes, and 1 demand-responsive system.

Question/Issue: Use data to describe some change over time. In this instance, data from 2008 and 2009 are used to describe the change in revenue on each route/part of a transit system when the fare structure was changed from variable (per mile) to fixed fares.

2. Identification and Description of Variables: Revenue data are available for each route on the local and express bus system and the demand-responsive system as a whole for the years 2008 and 2009.

3. Data Collection: Revenue data were collected on each route for both 2008 and 2009. The annual revenue for the demand-responsive system was also collected. These data are shown in Table 5.

4. Specification of Analysis Technique and Data Analysis: The objective of this analysis is to present the impact of changing the fare system in a series of graphs. The presentation is intended to show the impact on each component of the transit system as well as the impact on overall system revenue. The impact of the fare change on the overall revenue is best shown with a bar graph (Figure 3). The variation in the impact across system components can be illustrated in a similar graph (Figure 4). A pie chart also can be used to illustrate the relative impact on each system component (Figure 5).
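As an aside, charts like Figures 3 and 5 can be produced directly from such revenue data. The sketch below, using Python's matplotlib purely as an illustration, assumes the system totals and the 2009 revenue shares have already been computed from Table 5 (shown next); it is not part of the original example.

    # Illustrative sketch: bar chart of total revenue by year and pie chart of
    # the 2009 revenue shares (values assumed to be pre-computed from Table 5).
    import matplotlib.pyplot as plt

    totals = {"2008": 6.02, "2009": 6.17}                  # million $
    shares_2009 = {"Local Buses": 76.3, "Express Buses": 15.2,
                   "Demand Responsive": 8.5}               # percent

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

    ax1.bar(list(totals.keys()), list(totals.values()))
    ax1.set_ylabel("Revenue (Million $)")
    ax1.set_title("Total System Revenue")

    ax2.pie(list(shares_2009.values()), labels=list(shares_2009.keys()),
            autopct="%.1f%%")
    ax2.set_title("2009 Revenue by Component")

    plt.tight_layout()
    plt.show()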

Table 5. Revenue by route or type of service and year.

Bus Route                  2008 Revenue   2009 Revenue
Local Route 1              $350,500       $365,700
Local Route 2              $263,000       $271,500
Local Route 3              $450,800       $460,700
Local Route 4              $294,300       $306,400
Local Route 5              $173,900       $184,600
Local Route 6              $367,800       $375,100
Local Route 7              $415,800       $430,300
Local Route 8              $145,600       $149,100
Local Route 9              $248,200       $260,800
Local Route 10             $310,400       $318,300
Local Route 11             $444,300       $459,200
Local Route 12             $208,400       $205,600
Local Route 13             $407,600       $412,400
Local Route 14             $161,500       $169,300
Local Route 15             $325,100       $340,200
Express Route 1            $85,400        $83,600
Express Route 2            $110,300       $109,200
Express Route 3            $65,800        $66,200
Express Route 4            $125,300       $127,600
Express Route 5            $90,800        $90,400
Express Route 6            $125,800       $123,400
Express Route 7            $87,200        $86,900
Express Route 8            $68,300        $67,200
Express Route 9            $110,100       $112,300
Express Route 10           $73,200        $72,100
Demand-Responsive System   $510,100       $521,300

[Figure 3. Impact of fare change on overall revenue: bar chart showing total system revenue of $6.02 million in 2008 and $6.17 million in 2009.]
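The summary figures used in the remainder of this example (total revenue of about $6.02 million in 2008 and $6.17 million in 2009, and the component percentages reported under Interpreting the Results) can be verified by aggregating Table 5 directly. The sketch below does this in plain Python; it simply restates the Table 5 values and is included only as an illustration.

    # Sketch: aggregate the Table 5 revenues by system component and year to
    # reproduce the totals and percentage changes discussed in the text.
    local_2008 = [350500, 263000, 450800, 294300, 173900, 367800, 415800,
                  145600, 248200, 310400, 444300, 208400, 407600, 161500, 325100]
    local_2009 = [365700, 271500, 460700, 306400, 184600, 375100, 430300,
                  149100, 260800, 318300, 459200, 205600, 412400, 169300, 340200]
    express_2008 = [85400, 110300, 65800, 125300, 90800, 125800, 87200,
                    68300, 110100, 73200]
    express_2009 = [83600, 109200, 66200, 127600, 90400, 123400, 86900,
                    67200, 112300, 72100]
    demand_2008, demand_2009 = 510100, 521300

    total_2008 = sum(local_2008) + sum(express_2008) + demand_2008
    total_2009 = sum(local_2009) + sum(express_2009) + demand_2009
    print(f"Total 2008: ${total_2008:,}   Total 2009: ${total_2009:,}")

    components = [("Local", sum(local_2008), sum(local_2009)),
                  ("Express", sum(express_2008), sum(express_2009)),
                  ("Demand-responsive", demand_2008, demand_2009)]
    for name, r08, r09 in components:
        change = 100 * (r09 - r08) / r08          # percent change, 2008 to 2009
        share = 100 * r09 / total_2009            # share of 2009 revenue
        print(f"{name}: change {change:+.1f}%, share of 2009 revenue {share:.1f}%")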

[Figure 4. Variation in impact of fare change across system components: revenue (million $) by component and year, with Local Buses at 4.57 (2008) and 4.71 (2009), Express Buses at 0.94 and 0.94, and Demand Responsive at 0.51 and 0.52.]

[Figure 5. Pie charts illustrating percent of revenue from each component of a transit system. 2008: Local Buses 75.8%, Express Buses 15.7%, Demand Responsive 8.5%. 2009: Local Buses 76.3%, Express Buses 15.2%, Demand Responsive 8.5%.]

If it is important to display the variability in the impact within the various bus routes in the local bus or express bus operations, this also can be illustrated (Figure 6). This type of diagram shows the maximum value, minimum value, and mean value of the percent increase in revenue across the 15 local bus routes and the 10 express bus routes.

5. Interpreting the Results: These results indicate that changing from a variable fare based on trip length (2008) to a fixed fare (2009) on both the local bus routes and the express bus routes had little effect on revenue. On the local bus routes, there was an average increase in revenue of 3.1%. On the express bus routes, there was an average decrease in revenue of 0.4%. These changes altered the percentage of the total system revenue attributed to the local bus routes and the express bus routes. The local bus routes generated 76.3% of the revenue in 2009, compared to 75.8% in 2008. The percentage of revenue generated by the express bus routes dropped from 15.7% to 15.2%, and the demand-responsive system generated 8.5% in both 2008 and 2009.

6. Conclusion and Discussion: The total revenue increased from $6.02 million to $6.17 million. The cost of operating a variable fare system is greater than that of operating a fixed fare system—hence, net income probably increased even more (more revenue, lower cost for fare collection), and the decision to modify the fare system seems reasonable. Notice that the entire discussion also is based on the assumption that no other factors changed between 2008 and 2009 that might have affected total revenues.

One of the implicit assumptions is that the number of riders remained relatively constant from 1 year to the next. If the ridership had changed, the statistics reported would have to be changed. Using the measure revenue/rider, for example, would help control (or normalize) for the variation in ridership.

[Figure 6. Graph showing variation in revenue increase by type of bus route: percent increase in revenue with minimum, mean, and maximum of -1.3%, 3.1%, and 6.2% for the local bus routes and -2.1%, -0.4%, and 2.0% for the express bus routes.]

7. Applications in Other Areas in Transportation Research: Descriptive statistics are widely used and can convey a great deal of information to a reader. They also can be used to present data in many areas of transportation research, including:
• Transportation Planning—to display public response frequency or percentage to various alternative designs.
• Traffic Operations—to display the frequency or percentage of crashes by route type or by the type of traffic control devices present at an intersection.
• Airport Engineering—to display the arrival pattern of passengers or flights by hour or other time period.
• Public Transit—to display the average load factor on buses by time of day.

Example 3: Environment; Descriptive Statistics

Area: Environment
Method of Analysis: Descriptive statistics (organizing and presenting data to explain current conditions)

1. Research Question/Problem Statement: The planning and programming director in Environmental City wants to determine the current ozone concentration in the city. These data will be compared to data collected after the projects included in the Transportation Improvement Program (TIP) have been completed to determine the effects of these projects on the environment. Because the terrain, the presence of hills or tall buildings, the prevailing wind direction, and the sample station location relative to high-volume roads or industrial sites all affect the ozone level, multiple samples are required to determine the ozone concentration level in a city. For this example, air samples are obtained each weekday in the month of July (21 days) at 14 air-sampling stations in the city: 7 in the central city and 7 in the outlying areas of the city. The objective of the analysis is to determine the ozone concentration in the central city, the outlying areas of the city, and the city as a whole.

Question/Issue: Use collected data to describe existing conditions and prepare for future analysis. In this example, air pollution levels in the central city, the outlying areas, and the overall city are to be described.

2. Identification and Description of Variables: The variable to be analyzed is the 8-hour average ozone concentration in parts per million (ppm) at each of the 14 air-sampling stations. The 8-hour average concentration is the basis for the EPA standard, and July is selected because ozone levels are temperature sensitive and increase with a rise in the temperature.

3. Data Collection: Ozone concentrations in ppm are recorded for each hour of the day at each of the 14 air-sampling stations. The highest average concentration for any 8-hour period during the day is recorded and tabulated. This results in 294 concentration observations (14 stations for 21 days). Table 6 and Table 7 show the data for the seven central city locations and the seven outlying area locations.

4. Specification of Analysis Technique and Data Analysis: Much of the data used in analyzing transportation issues has year-to-year, month-to-month, day-to-day, and even hour-to-hour variations. For this reason, making only one observation, or even a few observations, may not accurately describe the phenomenon being observed. Thus, standard practice is to obtain several observations and report the mean value of all observations. In this example, the phenomenon being observed is the daily ozone concentration at a series of air-sampling locations. The statistic to be estimated is the mean value of this variable over the test period selected.

Table 6. Central city 8-hour ozone concentration samples (ppm).

Day   Station 1   Station 2   Station 3   Station 4   Station 5   Station 6   Station 7   Sum
1     0.079       0.084       0.081       0.083       0.088       0.086       0.089       0.590
2     0.082       0.087       0.088       0.086       0.086       0.087       0.081       0.597
3     0.080       0.081       0.077       0.072       0.084       0.083       0.081       0.558
4     0.083       0.086       0.082       0.079       0.086       0.087       0.089       0.592
5     0.082       0.087       0.080       0.075       0.090       0.089       0.085       0.588
6     0.075       0.084       0.079       0.076       0.080       0.083       0.081       0.558
7     0.078       0.079       0.080       0.074       0.078       0.080       0.075       0.544
8     0.081       0.077       0.082       0.081       0.076       0.079       0.074       0.540
9     0.088       0.084       0.083       0.085       0.083       0.083       0.088       0.594
10    0.085       0.087       0.086       0.089       0.088       0.087       0.090       0.612
11    0.079       0.082       0.082       0.089       0.091       0.089       0.090       0.602
12    0.078       0.080       0.081       0.086       0.088       0.089       0.089       0.591
13    0.081       0.079       0.077       0.083       0.084       0.085       0.087       0.576
14    0.083       0.080       0.079       0.081       0.080       0.082       0.083       0.568
15    0.084       0.083       0.080       0.085       0.082       0.086       0.085       0.585
16    0.086       0.087       0.085       0.087       0.089       0.090       0.089       0.613
17    0.082       0.085       0.083       0.090       0.087       0.088       0.089       0.604
18    0.080       0.081       0.080       0.087       0.085       0.086       0.088       0.587
19    0.080       0.083       0.077       0.083       0.085       0.084       0.087       0.579
20    0.081       0.084       0.079       0.082       0.081       0.083       0.088       0.578
21    0.082       0.084       0.080       0.081       0.082       0.083       0.085       0.577
Sum   1.709       1.744       1.701       1.734       1.773       1.789       1.793       12.243

The mean value of any data set (x̄) equals the sum of all observations in the set divided by the total number of observations in the set (n):

x̄ = (Σ xᵢ) / n

The variables of interest stated in the research question are the average ozone concentration for the central city, the outlying areas, and the total city. Thus, there are three data sets: the first table, the second table, and the sum of the two tables. The first data set has a sample size of 147; the second data set also has a sample size of 147, and the third data set contains 294 observations.

Using the formula just shown, the mean value of the ozone concentration in the central city is calculated as follows:

x̄ = 12.243 / 147 = 0.083 ppm

The mean value of the ozone concentration in the outlying areas of the city is:

x̄ = 10.553 / 147 = 0.072 ppm

The mean value of the ozone concentration for the entire city is:

x̄ = 22.796 / 294 = 0.078 ppm

Table 7. Outlying area 8-hour ozone concentration samples (ppm).

Day   Station 8   Station 9   Station 10   Station 11   Station 12   Station 13   Station 14   Sum
1     0.072       0.074       0.073        0.071        0.079        0.070        0.074        0.513
2     0.074       0.075       0.077        0.075        0.081        0.075        0.077        0.534
3     0.070       0.072       0.074        0.074        0.083        0.078        0.080        0.531
4     0.067       0.070       0.071        0.077        0.080        0.077        0.081        0.523
5     0.064       0.067       0.068        0.072        0.079        0.078        0.079        0.507
6     0.069       0.068       0.066        0.070        0.075        0.079        0.082        0.509
7     0.071       0.069       0.070        0.071        0.074        0.071        0.077        0.503
8     0.073       0.072       0.074        0.072        0.076        0.073        0.078        0.518
9     0.072       0.075       0.077        0.074        0.078        0.074        0.080        0.530
10    0.074       0.077       0.079        0.077        0.080        0.076        0.079        0.542
11    0.070       0.072       0.075        0.074        0.079        0.074        0.078        0.522
12    0.068       0.067       0.068        0.070        0.074        0.070        0.075        0.492
13    0.065       0.063       0.067        0.068        0.072        0.067        0.071        0.473
14    0.063       0.062       0.067        0.069        0.073        0.068        0.073        0.475
15    0.064       0.064       0.066        0.067        0.070        0.066        0.070        0.467
16    0.061       0.059       0.062        0.062        0.067        0.064        0.069        0.434
17    0.065       0.061       0.060        0.064        0.069        0.066        0.073        0.458
18    0.067       0.063       0.065        0.068        0.073        0.069        0.076        0.499
19    0.069       0.067       0.068        0.072        0.077        0.071        0.078        0.502
20    0.071       0.069       0.070        0.074        0.080        0.074        0.077        0.515
21    0.070       0.065       0.072        0.076        0.079        0.073        0.079        0.514
Sum   1.439       1.431       1.409        1.497        1.598        1.513        1.606        10.553
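These calculations also can be scripted instead of being done by hand or in a spreadsheet. The sketch below, in Python with pandas, assumes the Table 6 and Table 7 readings have been saved to two CSV files (21 rows of days by 7 columns of stations each, with a one-line header and no day-number column); the file names are placeholders invented for this illustration, not part of the original example. The station-by-station and day-by-day means it produces are discussed next in the text.

    # Sketch assuming the ozone readings from Tables 6 and 7 are stored in
    # "central_city.csv" and "outlying_area.csv" (21 rows x 7 columns, ppm).
    import pandas as pd

    central = pd.read_csv("central_city.csv")    # hypothetical file name
    outlying = pd.read_csv("outlying_area.csv")  # hypothetical file name

    # Overall means (147 observations each; 294 combined)
    mean_central = central.values.mean()
    mean_outlying = outlying.values.mean()
    mean_city = pd.concat([central, outlying], axis=1).values.mean()

    # Mean by station (column) and by day (row), and the extreme readings
    station_means = central.mean(axis=0)
    day_means = central.mean(axis=1)
    highest = max(central.values.max(), outlying.values.max())
    lowest = min(central.values.min(), outlying.values.min())

    print(f"Central city mean:  {mean_central:.3f} ppm")
    print(f"Outlying area mean: {mean_outlying:.3f} ppm")
    print(f"Entire city mean:   {mean_city:.3f} ppm")
    print(f"Highest reading: {highest:.3f} ppm, lowest reading: {lowest:.3f} ppm")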

Using the same equation, the mean value for each air-sampling location can be found by summing the value of the ozone concentration in the column representing that location and dividing by the 21 observations at that location. For example, considering Sample Station 1, the mean value of the ozone concentration is 1.709/21 = 0.081 ppm.

Similarly, the mean value of the ozone concentrations for any specific day can be found by summing the ozone concentration values in the row representing that day and dividing by the number of stations. For example, for Day 1, the mean value of the ozone concentration in the central city is 0.590/7 = 0.084. In the outlying areas of the city, it is 0.513/7 = 0.073, and for the entire city it is 1.103/14 = 0.079.

The highest and lowest values of the ozone concentration can be obtained by searching the two tables. The highest ozone concentration (0.091 ppm) is logged as having occurred at Station 5 on Day 11. The lowest ozone concentration (0.059 ppm) occurred at Station 9 on Day 16.

The variation by sample location can be illustrated in the form of a frequency diagram. A graph can be used to show the variation in the average ozone concentration for the seven sample stations in the central city (Figure 7).

[Figure 7. Average ozone concentration for seven central city sampling stations (ppm): station means of 0.081, 0.083, 0.081, 0.083, 0.084, 0.085, and 0.085 ppm for Stations 1 through 7, respectively.]

Notice that all of these calculations (and more) can be done very easily if all the data are put in a spreadsheet and various statistical functions used. Graphs and other displays also can be made within the spreadsheet.

5. Interpreting the Results: In this example, the data are not tested to determine whether they fit a known distribution or whether one average value is significantly higher or lower than another. It can only be reported that, as recorded in July, the mean ozone concentration in the central city was greater than the concentration in the outlying areas of the city. (For testing whether the data fit a known distribution, see Example 4 on fitting distributions and goodness of fit. For comparing mean values, see Examples 5 through 7.)

It is known that ozone concentration varies by day and by location of the air-sampling equipment. If there is some threshold value of importance, such as the ozone concentration level considered acceptable by the EPA, these data could be used to determine the number of days that this level was exceeded, or the number of stations that recorded an ozone concentration above this threshold. This is done by comparing each day or each station with the threshold value.

It must be noted that, as presented, this example is not a statistical comparison per se (i.e., there has been no significance testing or formal statistical comparison).

6. Conclusion and Discussion: This example illustrates how to determine and present quantitative information about a data set containing values of a varying parameter. If a similar set of data were captured each month, the variation in ozone concentration could be analyzed to describe the variation over the year. Similarly, if data were captured at these same locations in July of every year, the trend in ozone concentration over time could be determined.

7. Applications in Other Areas in Transportation: These descriptive statistics techniques can be used to present data in other areas of transportation research, such as:
• Traffic Operations/Safety and Transportation Planning
  – to analyze the average speed of vehicles on streets with a speed limit of 45 miles per hour (mph) in residential, commercial, and industrial areas by sampling a number of streets in each of these area types.
  – to examine the average emergency vehicle response time to various areas of the city or county, by analyzing dispatch and arrival times for emergency calls to each area of interest.
• Pavement Engineering—to analyze the average number of potholes per mile on pavement as a function of the age of pavement, by sampling a number of streets where the pavement age falls in discrete categories (0 to 5 years, 5 to 10 years, 10 to 15 years, and greater than 15 years).
• Traffic Safety—to evaluate the average number of crashes per month at intersections with two-way STOP control versus four-way STOP control by sampling a number of intersections in each category over time.

Example 4: Traffic Operations; Goodness of Fit

Area: Traffic operations
Method of Analysis: Goodness of fit (chi-square test; determining if observed distributions of data fit hypothesized standard distributions)

1. Research Question/Problem Statement: A research team is developing a model to estimate travel times of various types of personal travel (modes) on a path shared by bicyclists, in-line skaters, and others. One version of the model relies on the assertion that the distribution of speeds for each mode conforms to the normal distribution. (For a helpful definition of this and other statistical terms, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.) Based on a literature review, the researchers are sure that bicycle speeds are normally distributed. However, the shapes of the speed distributions for other users are unknown. Thus, the objective is to determine if skater speeds are normally distributed in this instance.

Question/Issue: Do collected data fit a specific type of probability distribution? In this example, do the speeds of in-line skaters on a shared-use path follow a normal distribution (are they normally distributed)?

2. Identification and Description of Variables: The only variable collected is the speed of in-line skaters passing through short sections of the shared-use path.

3. Data Collection: The team collects speeds using a video camera placed where most path users would not notice it. The speed of each free-flowing skater (i.e., each skater who is not closely following another path user) is calculated from the times that the skater passes two benchmarks on the path visible in the camera frame.
Several days of data collection allow a large sample of 219 skaters to be measured. (An implicit assumption is made that there is no variation in the data by day.)

The data have a familiar bell shape; that is, when graphed, they look like they are normally distributed (Figure 8). Each bar in the figure shows the number of observations per 1.00-mph-wide speed bin. There are 10 observations between 6.00 mph and 6.99 mph.

[Figure 8. Distribution of observed in-line skater speeds: histogram of the number of observations in 1-mph speed bins spanning roughly 1 to 23 mph.]

4. Specification of Analysis Technique and Data Analysis: This analysis involves several preliminary steps followed by two major steps. In the preliminaries, the team calculates the mean and standard deviation from the data sample as 10.17 mph and 2.79 mph, respectively, using standard formulas described in NCHRP Project 20-45, Volume 2, Chapter 6, Section C under the heading "Frequency Distributions, Variance, Standard Deviation, Histograms, and Boxplots." Then the team forms bins of observations of sufficient size to conduct the analysis. For this analysis, the team forms bins containing at least four observations each, which means forming a bin for speeds of 5 mph and lower and a bin for speeds of 17 mph or higher. There is some argument regarding the minimum allowable cell size. Some analysts argue that the minimum is five; others argue that the cell size can be smaller. Smaller numbers of observations in a bin may distort the results. When in doubt, the analysis can be done with different assumptions regarding the cell size. The left two columns in Table 8 show the data ready for analysis.

The first major step of the analysis is to generate the theoretical normal distribution to compare to the field data. To do this, the team calculates a value of Z, the standard normal variable, for each bin i, using the following equation:

Z = (x − µ) / σ

where x is the speed in miles per hour (mph) corresponding to the bin, µ is the mean speed, and σ is the standard deviation of all of the observations in the speed sample in mph. For example (and with reference to the data in Table 8), for a speed of 5 mph the value of Z will be (5 − 10.17)/2.79 = −1.85, and for a speed of 6 mph the value of Z will be (6 − 10.17)/2.79 = −1.50.

The team then consults a table of standard normal values (i.e., NCHRP Project 20-45, Volume 2, Appendix C, Table C-1) to convert these Z values into A values representing the area under the standard normal distribution curve. The A value for a Z of −1.85 is 0.468, while the A value for a Z of −1.50 is 0.432. The difference between these two A values, representing the area under the standard normal probability curve corresponding to the speed of 6 mph, is 0.036 (calculated 0.468 − 0.432 = 0.036). The team multiplies 0.036 by the total sample size (219) to estimate that there should be 7.78 skaters with a speed of 6 mph if the speeds follow the standard normal distribution. The team follows a similar procedure for all speeds.

Notice that the areas under the curve can also be calculated in a simple Excel spreadsheet using the "NORMDIST" function for a given x value and the average speed of 10.17 and standard deviation of 2.79. The values shown in Table 8 have been estimated using the Excel function.

The second major step of the analysis is to use the chi-square test (as described in NCHRP Project 20-45, Volume 2, Chapter 6, Section F) to determine if the theoretical normal distribution is significantly different from the actual data distribution. The team computes a chi-square value for each bin i using the formula:

χᵢ² = (Oᵢ − Eᵢ)² / Eᵢ

where Oᵢ is the number of actual observations in bin i and Eᵢ is the expected number of observations in bin i estimated by using the theoretical distribution. For the bin of 6 mph speeds, O = 10 (from the table), E = 7.78 (calculated), and the χ² contribution for that cell is 0.637. The sum of the χ² values for all bins is 19.519.

The degrees of freedom (df) used for this application of the chi-square test are the number of bins minus 1 minus the number of parameters in the distribution of interest. Given that the normal distribution has two parameters (see May, Traffic Flow Fundamentals, 1990, p. 40), in this example the degrees of freedom equal 9 (calculated 12 − 1 − 2 = 9). From a standard table of chi-square values (NCHRP Project 20-45, Volume 2, Appendix C, Table C-2), the team finds that the critical value at the 95% confidence level for this case (with df = 9) is 16.9. The calculated value of the statistic is ~19.5, more than the tabular value. The results of all of these observations and calculations are shown in Table 8.

Table 8. Observations, theoretical predictions, and chi-square values for each bin.

Speed (mph)      Number of Observations   Number Predicted by Normal Distribution   Chi-Square Value
Under 5.99       6                        6.98                                      0.137
6.00 to 6.99     10                       7.78                                      0.637
7.00 to 7.99     18                       13.21                                     1.734
8.00 to 8.99     24                       19.78                                     0.902
9.00 to 9.99     37                       26.07                                     4.585
10.00 to 10.99   38                       30.26                                     1.980
11.00 to 11.99   24                       30.93                                     1.554
12.00 to 12.99   21                       27.85                                     1.685
13.00 to 13.99   15                       22.08                                     2.271
14.00 to 14.99   13                       15.42                                     0.379
15.00 to 15.99   4                        9.48                                      3.169
16.00 to 16.99   4                        5.13                                      0.251
17.00 and over   5                        4.03                                      0.234
Total            219                      219                                       19.519
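The two major steps just described can be reproduced with a few lines of code. The sketch below uses Python with scipy purely as an illustration (it is not part of the original example): the observed counts are taken from Table 8, the normal-curve areas are computed at the same cut points used in the text, and the degrees of freedom and 95% critical value follow the example.

    # Sketch of the goodness-of-fit calculation for Example 4.
    import numpy as np
    from scipy import stats

    mu, sigma, n = 10.17, 2.79, 219
    observed = np.array([6, 10, 18, 24, 37, 38, 24, 21, 15, 13, 4, 4, 5])

    # Cut points as used in the text: the lowest bin is "under 5.99" (area
    # below Z at 5 mph) and the highest is "17.00 and over" (area above 16 mph).
    cuts = np.arange(5, 17)                        # 5, 6, ..., 16 mph
    areas = stats.norm.cdf((cuts - mu) / sigma)    # cumulative areas at each cut
    probs = np.diff(np.concatenate(([0.0], areas, [1.0])))
    expected = n * probs                           # expected count per bin

    chi_sq = np.sum((observed - expected) ** 2 / expected)
    df = 9                                         # degrees of freedom used in the example
    critical = stats.chi2.ppf(0.95, df)            # about 16.9

    print(np.round(expected, 2))
    print(f"chi-square = {chi_sq:.2f}, critical value = {critical:.2f}")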

5. Interpreting the Results: The calculated chi-square value of ~19.5 is greater than the critical chi-square value of 16.9. The team concludes, therefore, that the normal distribution is significantly different from the distribution of the speed sample at the 95% level (i.e., that the in-line skater speed data do not appear to be normally distributed). Larger variations between the observed and expected distributions lead to higher values of the statistic and would be interpreted as it being less likely that the data are distributed according to the hypothesized distribution. Conversely, smaller variations between observed and expected distributions result in lower values of the statistic, which would suggest that it is more likely that the data are normally distributed because the observed values would fit better with the expected values.

6. Conclusion and Discussion: In this case, the results suggest that the normal distribution is not a good fit to free-flow speeds of in-line skaters on shared-use paths. Interestingly, if the 23 mph observation is considered to be an outlier and discarded, the results of the analysis yield a different conclusion (that the data are normally distributed). Some researchers use a simple rule that an outlier exists if the observation is more than three standard deviations from the mean value. (In this example, the 23 mph observation is, indeed, more than three standard deviations from the mean.) If there is concern with discarding the observation as an outlier, it would be easy enough in this example to repeat the data collection exercise.

Looking at the data plotted above, it is reasonably apparent that the well-known normal distribution should be a good fit (at least without the value of 23). However, the results from the statistical test could not confirm the suspicion. In other cases, the type of distribution may not be so obvious, the distributions in question may be obscure, or some distribution parameters may need to be calibrated for a good fit. In these cases, the statistical test is much more valuable.

The chi-square test also can be used simply to compare two observed distributions to see if they are the same, independent of any underlying probability distribution. For example, if it is desired to know whether the distribution of traffic volume by vehicle type (e.g., automobiles, light trucks, and so on) is the same at two different freeway locations, the two distributions can be compared to see if they are similar.

The consequences of an error in the procedure outlined here can be severe. This is because the distributions chosen as a result of the procedure often become the heart of predictive models used by many other engineers and planners. A poorly chosen distribution will often provide erroneous predictions for many years to come.

7. Applications in Other Areas of Transportation Research: Fitting distributions to data samples is important in several areas of transportation research, such as:
• Traffic Operations—to analyze shapes of vehicle headway distributions, which are of great interest, especially as a precursor to calibrating and using simulation models.
• Traffic Safety—to analyze collision frequency data. Analysts often assume that the Poisson distribution is a good fit for collision frequency data and must use the method described here to validate the claim.
• Pavement Engineering—to form models of pavement wear or otherwise compare results obtained using different designs, as it is often required to check the distributions of the parameters used (e.g., roughness).

Example 5: Construction; Simple Comparisons to Specified Values

Area: Construction
Method of Analysis: Simple comparisons to specified values—using Student's t-test to compare the mean value of a small sample to a standard or other requirement (i.e., to a population with a known mean and unknown standard deviation or variance)
1. Research Question/Problem Statement: A contractor wants to determine if a specified soil compaction can be achieved on a segment of the road under construction by using an on-site roller or if a new roller must be brought in.

The cost of obtaining samples for many construction materials and practices is quite high. As a result, decisions often must be made based on a small number of samples. The appropriate statistical technique for comparing the mean value of a small sample with a standard or requirement is Student's t-test. Formally, the working, or null, hypothesis (Ho) and the alternative hypothesis (Ha) can be stated as follows:

Ho: The soil compaction achieved using the on-site roller (CA) is less than a specified value (CS); that is, CA < CS.
Ha: The soil compaction achieved using the on-site roller (CA) is greater than or equal to the specified value (CS); that is, CA ≥ CS.

Question/Issue: Determine whether a sample mean exceeds a specified value. Alternatively, determine the probability of obtaining a sample mean (x̄) from a sample of size n, if the universe being sampled has a true mean less than or equal to a population mean with an unknown variance. In this example, is an observed mean of soil compaction samples equal to or greater than a specified value?

2. Identification and Description of Variables: The variable to be used is the soil density results of nuclear densometer tests. These values will be used to determine whether the use of the on-site roller is adequate to meet the contract-specified soil density obtained in the laboratory (Proctor density) of 95%.

3. Data Collection: A 125-foot section of road is constructed and compacted with the on-site roller, and four samples of the soil density are obtained (25 feet, 50 feet, 75 feet, and 100 feet from the beginning of the test section).

4. Specification of Analysis Technique and Data Analysis: For small samples (n < 30) where the population mean is known but the population standard deviation is unknown, it is not appropriate to describe the distribution of the sample mean with a normal distribution. The appropriate distribution is called Student's distribution (t-distribution or t-statistic). The equation for Student's t-statistic is:

t = (x̄ − x̄′) / (S / √n)

where x̄ is the sample mean, x̄′ is the population mean (or specified standard), S is the sample standard deviation, and n is the sample size.

The four nuclear densometer readings were 98%, 97%, 93%, and 99%. Then, showing some simple sample calculations:

x̄ = (Σ Xᵢ) / n = (98 + 97 + 93 + 99) / 4 = 387 / 4 = 96.75%

S = √[Σ (Xᵢ − x̄)² / (n − 1)] = √(20.74 / 3) = 2.63%

Using the equation for t above:

t = (96.75 − 95.00) / (2.63 / √4) = 1.75 / 1.32 = 1.33

The calculated value of the t-statistic (1.33) is most typically compared to the tabularized values of the t-statistic (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4) for a given significance level (typically called t critical or tcrit). For a sample size of n = 4 having 3 (n − 1) degrees of freedom (df), the values for tcrit are: 1.638 for α = 0.10 and 2.353 for α = 0.05 (two common values of α for testing, the latter being most common).

Important: The specification of the significance level (α level) for testing should be done before actual testing and interpretation of results are done. In many instances, the appropriate level is defined by the agency doing the testing, a specified testing standard, or simply common practice. Generally speaking, selection of a smaller value for α (e.g., α = 0.05 versus α = 0.10) sets a more stringent standard.

In this example, because the calculated value of t (1.33) is less than the critical value (2.353, given α = 0.05), the null hypothesis is accepted. That is, the engineer cannot be confident that the mean value from the densometer tests (96.75%) is greater than the required specification (95%). If a lower confidence level is chosen (e.g., α = 0.15), the value for tcrit would change to 1.250, which means the null hypothesis would be rejected. A lower confidence level can have serious implications. For example, there is an approximately 15% chance that the standard will not be met. That level of risk may or may not be acceptable to the contractor or the agency. Notice that in many standards the required significance level is stated (typically α = 0.05).

It should be emphasized that the confidence level should be chosen before calculations and testing are done. It is not generally permissible to change the confidence level after calculations have been performed. Doing this would be akin to arguing that standards can be relaxed if a test gives an answer that the analyst doesn't like.

The results of small sample tests often are sensitive to the number of samples that can be obtained at a reasonable cost. (The mean value may change considerably as more data are added.) In this example, if it were possible to obtain nine independent samples (as opposed to four) and the mean value and sample standard deviation were the same as with the four samples, the calculation of the t-statistic would be:

t = (96.75 − 95.00) / (2.63 / √9) = 1.99

Comparing the value of t (with a larger sample size) to the appropriate tcrit (for n − 1 = 8 df and α = 0.05) of 1.860 changes the outcome. That is, the calculated value of the t-statistic is now larger than the tabularized value of tcrit, and the null hypothesis is rejected. Thus, it is accepted that the mean of the densometer readings meets or exceeds the standard. It should be noted, however, that the inclusion of additional tests may yield a different mean value and standard deviation, in which case the results could be different.
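The calculations above are easy to reproduce in code. The sketch below uses Python with scipy as an illustration only; the four readings and the 95% specification are those given in the example.

    # Sketch: one-sample t-test of the four densometer readings against the
    # 95% specification, reproducing the Section 4 calculations.
    import numpy as np
    from scipy import stats

    readings = np.array([98.0, 97.0, 93.0, 99.0])   # percent of Proctor density
    spec = 95.0                                      # specified standard

    n = len(readings)
    mean = readings.mean()
    s = readings.std(ddof=1)                         # sample standard deviation
    t_calc = (mean - spec) / (s / np.sqrt(n))
    t_crit = stats.t.ppf(1 - 0.05, df=n - 1)         # one-sided, alpha = 0.05

    print(f"mean = {mean:.2f}%, s = {s:.2f}%, t = {t_calc:.2f}, t_crit = {t_crit:.3f}")

    # scipy can also perform the one-sided test directly:
    t_stat, p_value = stats.ttest_1samp(readings, spec, alternative="greater")
    print(f"t = {t_stat:.2f}, one-sided p-value = {p_value:.3f}")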
5. Interpreting the Results: By themselves, the results of the statistical analysis are insufficient to answer the question as to whether a new roller should be brought to the project site. These results only provide information the contractor can use to make this decision. The ultimate decision should be based on these probabilities and knowledge of the cost of each option. What is the cost of bringing in a new roller now? What is the cost of starting the project, determining that the current roller is not adequate, and then bringing in a new roller? Will this decision result in a delay in project completion—and does the contract include an incentive for early completion and/or a penalty for missing the completion date? If it is possible to conduct additional independent densometer tests, what is the cost of conducting them?

If there is a severe penalty for missing the deadline (or a significant reward for finishing early), the contractor may be willing to incur the cost of bringing in a new roller rather than accepting a 15% probability of being delayed.

6. Conclusion and Discussion: In some cases the decision about which alternative is preferable can be expressed in the form of a probability (or level of confidence) required to make a decision. The decision criterion is then expressed in a hypothesis and the probability of rejecting that hypothesis. In this example, if the hypothesis to be tested is "Using the on-site roller will provide an average soil density of 95% or higher" and the level of confidence is set at 95%, given a sample of four tests the decision will be to bring in a new roller. However, if nine independent tests could be conducted, the results in this example would lead to a decision to use the on-site roller.

7. Applications in Other Areas in Transportation Research: Simple comparisons to specified values can be used in a variety of areas of transportation research. Some examples include:
• Traffic Operations—to compare the average annual number of crashes at intersections with roundabouts with the average annual number of crashes at signalized intersections.
• Pavement Engineering—to test the compressive strength of concrete slabs.
• Maintenance—to test the results of a proposed new deicer compound.

Example 6: Maintenance; Simple Two-Sample Comparisons

Area: Maintenance
Method of Analysis: Simple two-sample comparisons (t-test for paired comparisons; comparing the mean values of two sets of matched data)

1. Research Question/Problem Statement: As a part of a quality control and quality assurance (QC/QA) program for highway maintenance and construction, an agency engineer wants to compare and identify discrepancies in the contractor's testing procedures or equipment in making measurements on materials being used. Specifically, compacted air voids in asphalt mixtures are being measured. In this instance, the agency's test results need to be compared, one-to-one, with the contractor's test results. Samples are drawn or made and then literally split and tested—one by the contractor, one by the agency. Then the pairs of measurements are analyzed. A paired t-test will be used to make the comparison. (For another type of two-sample comparison, see Example 7.)

Question/Issue: Use collected data to test if two sets of results are similar. Specifically, do two testing procedures to determine air voids produce the same results? Stated in formal terms, the null and alternative hypotheses are:

Ho: There is no mean difference in air voids between agency and contractor test results; that is, X̄d = 0.
Ha: There is a mean difference in air voids between agency and contractor test results; that is, X̄d ≠ 0.

(For definitions and more discussion about the formulation of formal hypotheses for testing, see NCHRP Project 20-45, Volume 2, Appendix A and Volume 1, Chapter 2, "Hypothesis.")

2. Identification and Description of Variables: The testing procedure for laboratory-compacted air voids in the asphalt mixture needs to be verified. The split-sample test results for laboratory-compacted air voids are shown in Table 9.

compacted air voids are shown in Table 9. Twenty samples are prepared using the same asphalt mixture. Half of the samples are prepared in the agency's laboratory and the other half in the contractor's laboratory. Given this arrangement, there are basically two variables of concern: who did the testing and the air void determination.

3. Data Collection: A sufficient quantity of asphalt mix to make 10 lots is produced in an asphalt plant located on a highway project. Each of the 10 lots is collected, split into two samples, and labeled. A sample from each lot, 4 inches in diameter and 2 inches in height, is prepared in the contractor's laboratory to determine the air voids in the compacted samples. A matched set of samples is prepared in the agency's laboratory and a similar volumetric procedure is used to determine the agency's lab-compacted air voids. The lab-compacted air void contents in the asphalt mixture for both the contractor and agency are shown in Table 9.

4. Specification of Analysis Technique and Data Analysis: A paired (two-sided) t-test will be used to determine whether a difference exists between the contractor and agency results. As noted above, in a paired t-test the null hypothesis is that the mean of the differences between each pair of two tests is 0 (there is no difference between the means). The null hypothesis can be expressed as follows:

$H_o: \bar{X}_d = 0$

The alternate hypothesis, that the two means are not equal, can be expressed as follows:

$H_a: \bar{X}_d \neq 0$

The t-statistic for the paired measurements (i.e., the difference between the split-sample test results) is calculated using the following equation:

$t = \dfrac{\bar{X}_d - 0}{s_d / \sqrt{n}}$

Using the actual data, the value of the t-statistic is calculated as follows:

$t = \dfrac{0.88 - 0}{0.7 / \sqrt{10}} = 4$

Table 9. Laboratory-compacted air voids in split samples.

Sample       Air Voids (%), Contractor    Air Voids (%), Agency    Difference
1            4.37                         4.15                      0.21
2            3.76                         5.39                     -1.63
3            4.10                         4.47                     -0.37
4            4.39                         4.52                     -0.13
5            4.06                         5.36                     -1.29
6            4.14                         5.01                     -0.87
7            3.92                         5.23                     -1.30
8            3.38                         4.97                     -1.60
9            4.12                         4.37                     -0.25
10           3.68                         5.29                     -1.61
Mean         3.99                         4.88                     Mean difference = -0.88
Std. dev.    0.31                         0.46                     Std. dev. of differences = 0.70
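As a quick numerical check of the calculation above, the paired t-test can be run with a few lines of code. The following is a minimal sketch, assuming Python with the SciPy library and using the Table 9 data; the variable names are illustrative only.

```python
# Minimal check of the paired t-test, assuming Python with SciPy.
# Data are the split-sample air-void results from Table 9.
from scipy import stats

contractor = [4.37, 3.76, 4.10, 4.39, 4.06, 4.14, 3.92, 3.38, 4.12, 3.68]
agency = [4.15, 5.39, 4.47, 4.52, 5.36, 5.01, 5.23, 4.97, 4.37, 5.29]

# ttest_rel works with the signed mean difference (contractor minus agency),
# so the statistic is reported as roughly -4 rather than +4.
t_stat, p_value = stats.ttest_rel(contractor, agency)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.4f}")
```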

For n − 1 (10 − 1 = 9) degrees of freedom and α = 0.05, the tcrit value can be looked up using a t-table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-4):

$t_{0.025,9} = 2.262$

For a more detailed description of the t-statistic, see the glossary in NCHRP Project 20-45, Volume 2, Appendix A.

5. Interpreting the Results: Given that t = 4 > $t_{0.025,9}$ = 2.262, the engineer would reject the null hypothesis and conclude that the results of the paired tests are different. This means that the contractor and agency test results from paired measurements indicate that the test method, technicians, and/or test equipment are not providing similar results. Notice that the engineer cannot conclude anything about the material or production variation or what has caused the differences to occur.

6. Conclusion and Discussion: The results of the test indicate that a statistically significant difference exists between the test results from the two groups. When making such comparisons, it is important that random sampling be used when obtaining the samples. Also, because sources of variability influence the population parameters, the two sets of test results must have been sampled over the same time period, and the same sampling and testing procedures must have been used. It is best if one sample is drawn and then literally split in two, then another sample drawn, and so on. The identification of a difference is just that: notice that a difference exists. The reason for the difference must still be determined.
A common misinterpretation is that the result of the t-test provides the probability of the null hypothesis being true. Another way to look at the t-test result in this example is to conclude that some alternative hypothesis provides a better description of the data. The result does not, however, indicate that the alternative hypothesis is true. To ensure practical significance, it is necessary to assess the magnitude of the difference being tested. This can be done by computing confidence intervals, which are used to quantify the range of effect size and are often more useful than simple hypothesis testing.
Failure to reject a hypothesis also provides important information. Possible explanations include: occurrence of a type-II error (erroneous acceptance of the null hypothesis); small sample size; difference too small to detect; expected difference did not occur in data; there is no difference/effect. Proper experiment design and data collection can minimize the impact of some of these issues. (For a more comprehensive discussion of this topic, see NCHRP Project 20-45, Volume 2, Chapter 1.)

7. Applications in Other Areas of Transportation Research: The application of the t-test to compare two mean values in other areas of transportation research may include:
• Traffic Operations—to evaluate average delay in bus arrivals at various bus stops.
• Traffic Operations/Safety—to determine the effect of two enforcement methods on reduction in a particular traffic violation.
• Pavement Engineering—to investigate average performance of two pavement sections.
• Environment—to compare average vehicular emissions at two locations in a city.

Example 7: Materials; Simple Two-Sample Comparisons

Area: Materials

Method of Analysis: Simple two-sample comparisons (using the t-test to compare the mean values of two samples and the F-test for comparing variances)

1.
Research Question/Problem Statement: As a part of dispute resolution during quality control and quality assurance, a highway agency engineer wants to validate a contractor’s test results concerning asphalt content. In this example, the engineer wants to compare the results

of two sets of tests: one from the contractor and one from the agency. Formally, the (null) hypothesis to be tested, Ho, is that the contractor's tests and the agency's tests are from the same population. In other words, the null hypothesis is that the means of the two data sets will be equal, as will the standard deviations. Notice that in the latter instance the variances are actually being compared.
Test results were also compared in Example 6. In that example, the comparison was based on split samples. The same test specimens were tested by two different analysts using different equipment to see if the same results could be obtained by both. The major difference between Example 6 and Example 7 is that, in this example, the two samples are randomly selected from the same pavement section.

Question/Issue
Use collected data to test if two measured mean values are the same. In this instance, are two mean values of asphalt content the same? Stated in formal terms, the null and alternative hypotheses can be expressed as follows:
Ho: There is no difference in asphalt content between agency and contractor test results: $H_o: (\mu_c - \mu_a) = 0$
Ha: There is a difference in asphalt content between agency and contractor test results: $H_a: (\mu_c - \mu_a) \neq 0$

2. Identification and Description of Variables: The contractor runs 12 asphalt content tests and the agency engineer runs 6 asphalt content tests over the same period of time, using the same random sampling and testing procedures. The question is whether it is likely that the tests have come from the same population based on their variability.

3. Data Collection: If the agency's objective is simply to identify discrepancies in the testing procedures or equipment, then verification testing should be done on split samples (as in Example 6). Using split samples, the difference in the measured variable can more easily be attributed to testing procedures. A paired t-test should be used. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.") A split sample occurs when a physical sample (of whatever is being tested) is drawn and then literally split into two testable samples.
On the other hand, if the agency's objective is to identify discrepancies in the overall material, process, sampling, and testing processes, then validation testing should be done on independent samples. Notice the use of these terms. It is important to distinguish between testing to verify only the testing process (verification) versus testing to compare the overall production, sampling, and testing processes (validation). If independent samples are used, the agency test results still can be compared with contractor test results (using a simple t-test for comparing two means). If the test results are consistent, then the agency and contractor tests can be combined for contract compliance determination.

4. Specification of Analysis Technique and Data Analysis: When comparing the two data sets, it is important to compare both the means and the variances because the t-test used here assumes equal variances for the two groups. A different test is used in each instance. The F-test provides a method for comparing the variances (the standard deviation squared) of two sets of data. Differences in means are assessed by the t-test. Generally, construction processes and material properties are assumed to follow a normal distribution.

In this example, a normal distribution is assumed. (The assumption of normality also can be tested, as in Example 4.) The ratios of variances follow an F-distribution, while the means of relatively small samples follow a t-distribution. Using these distributions, hypothesis tests can be conducted using the same concepts that have been discussed in prior examples. (For more information about the F-test, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Compute the F-ratio Test Statistic." For more information about the t-distribution, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.)
For samples from the same normal population, the statistic F (the ratio of the two sample variances) has a sampling distribution called the F-distribution. For validation and verification testing, the F-test is based on the ratio of the sample variance of the contractor's test results ($s_c^2$) and the sample variance of the agency's test results ($s_a^2$). Similarly, the t-test can be used to test whether the sample mean of the contractor's tests, $\bar{X}_c$, and the agency's tests, $\bar{X}_a$, came from populations with the same mean.
Consider the asphalt content test results from the contractor samples and agency samples (Table 10). In this instance, the F-test is used to determine whether the variance observed for the contractor's tests differs from the variance observed for the agency's tests.

Table 10. Asphalt content test results from independent samples.

Contractor samples (% asphalt content): 6.4, 6.2, 6.0, 6.6, 6.1, 6.0, 6.3, 6.1, 5.9, 5.8, 6.0, 5.7
Descriptive statistics: $\bar{X}_c$ = 6.1, $s_c^2$ = 0.064, $s_c$ = 0.25, $n_c$ = 12

Agency samples (% asphalt content): 5.4, 5.8, 6.2, 5.4, 5.6, 5.8
Descriptive statistics: $\bar{X}_a$ = 5.7, $s_a^2$ = 0.092, $s_a$ = 0.30, $n_a$ = 6

Using the F-test

Step 1. Compute the variance ($s^2$) for each set of tests: $s_c^2$ = 0.064 and $s_a^2$ = 0.092. As an example, $s_c^2$ can be calculated as:

$s_c^2 = \dfrac{\sum_i (x_i - \bar{X}_c)^2}{n - 1} = \dfrac{(6.4 - 6.1)^2 + (6.2 - 6.1)^2 + \ldots + (6.0 - 6.1)^2 + (5.7 - 6.1)^2}{12 - 1} = 0.0645$

Step 2. Compute

$F_{calc} = \dfrac{s_a^2}{s_c^2} = \dfrac{0.092}{0.064} = 1.43$

Step 3. Determine Fcrit from the F-distribution table, making sure to use the correct degrees of freedom (df) for the numerator (the number of observations minus 1, or na − 1 = 6 − 1 = 5) and the denominator (nc − 1 = 12 − 1 = 11). For α = 0.01, Fcrit = 5.32. The critical F-value can be found from tables (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-5). Read the F-value for 1 − α = 0.99, numerator and denominator degrees of freedom 5 and 11, respectively. Interpolation can be used if exact degrees of freedom are not available in the table. Alternatively, a statistical function in Microsoft Excel™ can be used to determine the F-value.

Step 4. Compare the two values to determine if Fcalc < Fcrit. If Fcalc < Fcrit is true, then the variances are equal; if not, they are unequal. In this example, Fcalc (1.43) is, in fact, less than Fcrit (5.32) and, thus, there is no evidence of unequal variances. Given this result, the t-test for the case of equal variances is used to determine whether to declare that the mean of the contractor's tests differs from the mean of the agency's tests.

Using the t-test

Step 1. Compute the sample means ($\bar{X}$) for each set of tests: $\bar{X}_c$ = 6.1 and $\bar{X}_a$ = 5.7.

Step 2. Compute the pooled variance $s_p^2$ from the individual sample variances:

$s_p^2 = \dfrac{s_c^2 (n_c - 1) + s_a^2 (n_a - 1)}{n_c + n_a - 2} = \dfrac{0.064(12 - 1) + 0.092(6 - 1)}{12 + 6 - 2} = 0.0731$

Step 3. Compute the t-statistic using the following equation for equal variance:

$t = \dfrac{\bar{X}_c - \bar{X}_a}{\sqrt{\dfrac{s_p^2}{n_c} + \dfrac{s_p^2}{n_a}}} = \dfrac{6.1 - 5.7}{\sqrt{\dfrac{0.0731}{12} + \dfrac{0.0731}{6}}} = 2.9$

$t_{0.005,16} = 2.921$

(For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4 for A = 1 − α/2 and ν = 16.)

5. Interpreting the Results: Given that F < Fcrit (i.e., 1.43 < 5.32), there is no reason to believe that the two sets of data have different variances. That is, they could have come from the same population. Therefore, the t-test can be used to compare the means using equal variance. Because t < tcrit (i.e., 2.9 < 2.921), the engineer does not reject the null hypothesis and, thus, assumes that the sample means are equal. The final conclusion is that it is likely that the contractor and agency test results represent the same process. In other words, with a 99% confidence level, it can be said that the agency's test results are not different from the contractor's and therefore validate the contractor tests.

6. Conclusion and Discussion: The simple t-test can be used to validate the contractor's test results by conducting independent sampling from the same pavement at the same time. Before conducting a formal t-test to compare the sample means, the assumption of equal variances needs to be evaluated. This can be accomplished by comparing sample variances using the F-test. The interpretation of results will be misleading if the equal variance assumption is not validated. If the variances of two populations being compared for their means are different, the mean comparison will reflect the difference between two separate populations. Finally, based on the comparison of means, one can conclude that the construction materials have consistent properties as validated by two independent sources (contractor and agency). This sort of comparison is developed further in Example 8, which illustrates tests for the equality of more than two mean values.
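The F-ratio and pooled t-test above can also be verified numerically. The following is a minimal sketch, assuming Python with NumPy and SciPy and using the Table 10 data; the variable names are illustrative only.

```python
# Minimal sketch of the variance check (F-ratio) and equal-variance t-test,
# assuming Python with NumPy and SciPy, using the Table 10 data.
import numpy as np
from scipy import stats

contractor = np.array([6.4, 6.2, 6.0, 6.6, 6.1, 6.0, 6.3, 6.1, 5.9, 5.8, 6.0, 5.7])
agency = np.array([5.4, 5.8, 6.2, 5.4, 5.6, 5.8])

# Ratio of sample variances (agency over contractor) and the critical F at alpha = 0.01.
f_calc = agency.var(ddof=1) / contractor.var(ddof=1)
f_crit = stats.f.ppf(0.99, dfn=len(agency) - 1, dfd=len(contractor) - 1)
print(f"F_calc = {f_calc:.2f}, F_crit = {f_crit:.2f}")  # about 1.43 versus 5.32

# Pooled (equal-variance) two-sample t-test; t is about 2.9 and the two-sided
# p-value is just above 0.01, so the null hypothesis is not rejected at alpha = 0.01.
t_stat, p_value = stats.ttest_ind(contractor, agency, equal_var=True)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.4f}")
```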

7. Applications in Other Areas of Transportation Research: The simple t-test can be used to compare means of two independent samples. Applications for this method in other areas of transportation research may include:
• Traffic Operations
– to compare average speeds at two locations along a route.
– to evaluate average delay times at two intersections in an urban area.
• Pavement Engineering—to investigate the difference in average performance of two pavement sections.
• Maintenance—to determine the effects of two maintenance treatments on average life extension of two pavement sections.

Example 8: Laboratory Testing/Instrumentation; Simple Analysis of Variance (ANOVA)

Area: Laboratory testing and/or instrumentation

Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: An engineer wants to test and compare the compressive strength of five different concrete mix designs that vary in coarse aggregate type, gradation, and water/cement ratio. An experiment is conducted in a laboratory where five different concrete mixes are produced based on given specifications, and tested for compressive strength using the ASTM International standard procedures. In this example, the comparison involves inference on parameters from more than two populations. The purpose of the analysis, in other words, is to test whether all mix designs are similar to each other in mean compressive strength or whether some differences actually exist. ANOVA is the statistical procedure used to test the basic hypothesis illustrated in this example.

Question/Issue
Compare the means of more than two samples. In this instance, compare the compressive strengths of five concrete mix designs with different combinations of aggregates, gradation, and water/cement ratio. More formally, test the following hypotheses:
Ho: There is no difference in mean compressive strength for the various (five) concrete mix types.
Ha: At least one of the concrete mix types has a different compressive strength.

2. Identification and Description of Variables: In this experiment, the factor of interest (independent variable) is the concrete mix design, which has five levels based on different coarse aggregate types, gradation, and water/cement ratios (denoted by t and labeled A through E in Table 11). Compressive strength is a continuous response (dependent) variable, measured in pounds per square inch (psi) for each specimen. Because only one factor is of interest in this experiment, the statistical method illustrated is often called a one-way ANOVA or simple ANOVA.

3. Data Collection: For each of the five mix designs, three replicates each of cylinders 4 inches in diameter and 8 inches in height are made and cured for 28 days. After 28 days, all 15 specimens are tested for compressive strength using the standard ASTM International test. The compressive strength data and summary statistics are provided for each mix design in Table 11. In this example, resource constraints have limited the number of replicates for each mix design to

three. (For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.")

Table 11. Concrete compressive strength (psi) after 28 days.

Replicate                          Mix A     Mix B     Mix C     Mix D     Mix E
1                                  5416      5292      4097      5056      4165
2                                  5125      4779      3695      5216      3849
3                                  4847      4824      4109      5235      4089
Mean ($\bar{y}_{i.}$)              5129      4965      3967      5169      4034
Standard deviation ($s_i$)         284.52    284.08    235.64    98.32     164.94
Overall mean ($\bar{y}_{..}$) = 4653

4. Specification of Analysis Technique and Data Analysis: To perform a one-way ANOVA, preliminary calculations are carried out to compute the overall mean ($\bar{y}_{..}$), the sample means ($\bar{y}_{i.}$), and the sample variances ($s_i^2$) given the total sample size ($n_T$ = 15) as shown in Table 11. The basic strategy for ANOVA is to compare the variance between levels or groups—specifically, the variation between sample means—to the variance within levels. This comparison is used to determine if the levels explain a significant portion of the variance. (Details for performing a one-way ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")
ANOVA is based on partitioning of the total sum of squares (TSS, a measure of overall variability) into within-level and between-levels components. The TSS is defined as the sum of the squares of the differences of each observation ($y_{ij}$) from the overall mean ($\bar{y}_{..}$). The TSS, between-levels sum of squares (SSB), and within-level sum of squares (SSE) are computed as follows.

$TSS = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = 4{,}839{,}620.90$

$SSB = \sum_{i,j} (\bar{y}_{i.} - \bar{y}_{..})^2 = 4{,}331{,}513.60$

$SSE = \sum_{i,j} (y_{ij} - \bar{y}_{i.})^2 = 508{,}107.30$

The next step is to compute the between-levels mean square (MSB) and within-levels mean square (MSE) based on respective degrees of freedom (df). The total degrees of freedom ($df_T$), between-levels degrees of freedom ($df_B$), and within-levels degrees of freedom ($df_E$) for one-way ANOVA are computed as follows:

$df_T = n_T - 1 = 15 - 1 = 14$

$df_B = t - 1 = 5 - 1 = 4$

$df_E = n_T - t = 15 - 5 = 10$

where $n_T$ = the total sample size and t = the total number of levels or groups.
The next step of the ANOVA procedure is to compute the F-statistic. The F-statistic is the ratio of two variances: the variance due to differences between the levels and the variance due to differences within the levels. Under the null hypothesis, the between-levels mean square (MSB) and within-levels mean square (MSE) provide two independent estimates of the variance. If the means for different levels of mix design are truly different from each other, the MSB will tend

to be larger than the MSE, such that it will be more likely to reject the null hypothesis. For this example, the calculations for MSB, MSE, and F are as follows:

$MSB = \dfrac{SSB}{df_B} = 1{,}082{,}878.40$

$MSE = \dfrac{SSE}{df_E} = 50{,}810.70$

$F = \dfrac{MSB}{MSE} = 21.31$

If there are no effects due to level, the F-statistic will tend to be smaller. If there are effects due to level, the F-statistic will tend to be larger, as is the case in this example. ANOVA computations usually are summarized in the form of a table. Table 12 summarizes the computations for this example.

Table 12. ANOVA results.

Source     Sum of Squares (SS)    Degrees of Freedom (df)    Mean Square (MS)    F        Probability > F (Significance)
Between    4,331,513.60           4                          1,082,878.40        21.31    0.0000698408
Within     508,107.30             10                         50,810.70
Total      4,839,620.90           14

The final step is to determine Fcrit from the F-distribution table (e.g., NCHRP Project 20-45, Volume 2, Appendix C, Table C-5) with t − 1 (5 − 1 = 4) degrees of freedom for the numerator and nT − t (15 − 5 = 10) degrees of freedom for the denominator. For a significance level of α = 0.01, Fcrit is found (in Table C-5) to be 5.99. Given that F > Fcrit (21.31 > 5.99), the null hypothesis that all mix designs have equal compressive strength is rejected, supporting the conclusion that at least two mix designs are different from each other in their mean effect. Table 12 also shows the p-value calculated using a computer program. The p-value is the probability of obtaining a test statistic value at least as extreme as the one observed if the null hypothesis were true. The p-value of 0.0000698408 is well below the chosen significance level of 0.01.

5. Interpreting the Results: The ANOVA results in rejection of the null hypothesis at α = 0.01. That is, the mean values are judged to be statistically different. However, the ANOVA result does not indicate where the difference lies. For example, does the compressive strength of mix design A differ from that of mix design C or D? To carry out such multiple mean comparisons, the analyst must control the experiment-wise error rate (EER) by employing more conservative methods such as Tukey's test, Bonferroni's test, or Scheffe's test, as appropriate. (Details for ANOVA are given in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")
The coefficient of determination ($R^2$) provides a rough indication of how well the statistical model fits the data. For this example, $R^2$ is calculated as follows:

$R^2 = \dfrac{SSB}{TSS} = \dfrac{4{,}331{,}513.60}{4{,}839{,}620.90} = 0.90$

For this example, $R^2$ indicates that the one-way ANOVA classification model accounts for 90% of the total variation in the data. In the controlled laboratory experiment demonstrated in this example, $R^2$ = 0.90 indicates a fairly acceptable fit of the statistical model to the data.
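As a check on the hand calculations, the same one-way ANOVA can be run with a few lines of code. The following is a minimal sketch, assuming Python with the SciPy library and using the Table 11 replicates.

```python
# Minimal check of the one-way ANOVA, assuming Python with SciPy,
# using the compressive-strength replicates (psi) from Table 11.
from scipy import stats

mix_a = [5416, 5125, 4847]
mix_b = [5292, 4779, 4824]
mix_c = [4097, 3695, 4109]
mix_d = [5056, 5216, 5235]
mix_e = [4165, 3849, 4089]

f_stat, p_value = stats.f_oneway(mix_a, mix_b, mix_c, mix_d, mix_e)
print(f"F = {f_stat:.2f}, p = {p_value:.10f}")  # about F = 21.31 and p = 0.00007
```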

6. Conclusion and Discussion: This example illustrates a simple one-way ANOVA where inference regarding parameters (mean values) from more than two populations or treatments was desired. The focus of computations was the construction of the ANOVA table. Before proceeding with ANOVA, however, an analyst must verify that the assumptions of common variance and data normality are satisfied within each group/level. The results do not establish the cause of difference in compressive strength between mix designs in any way.
The experimental setup and analytical procedure shown in this example may be used to test other properties of mix designs such as flexure strength. If another factor (for example, water/cement ratio with levels low or high) is added to the analysis, the classification will become a two-way ANOVA. (In this report, two-way ANOVA is demonstrated in Example 11.) Notice that the equations shown in Example 8 may only be used for one-way ANOVA for balanced designs, meaning that in this experiment there are equal numbers of replicates for each level within a factor. (For a discussion of computations on unbalanced designs and multifactor designs, see NCHRP Project 20-45.)

7. Applications in Other Areas of Transportation Research: Examples of applications of one-way ANOVA in other areas of transportation research include:
• Traffic Operations—to determine the effect of various traffic calming devices on average speeds in residential areas.
• Traffic Operations/Safety—to study the effect of weather conditions on accidents in a given time period.
• Work Zones—to compare the effect of different placements of work zone signs on reduction in highway speeds at some downstream point.
• Materials—to investigate the effect of recycled aggregates on compressive and flexural strength of concrete.

Example 9: Materials; Simple Analysis of Variance (ANOVA)

Area: Materials

Method of Analysis: Simple analysis of variance (ANOVA) comparing more than two mean values and using the F-test for equality of means

1. Research Question/Problem Statement: To illustrate how increasingly detailed analysis may be appropriate, Example 9 is an extension of the two-sample comparison presented in Example 7. As a part of dispute resolution during quality control and quality assurance, let's say the highway agency engineer from Example 7 decides to reconfirm the contractor's test results for asphalt content. The agency hires an independent consultant to verify both the contractor- and agency-measured asphalt contents. It now becomes necessary to compare more than two mean values. A simple one-way analysis of variance (ANOVA) can be used to analyze the asphalt contents measured by three different parties.

Question/Issue
Extend a comparison of two mean values to compare three (or more) mean values. Specifically, use data collected by several (>2) different parties to see if the results (mean values) are the same. Formally, test the following null (Ho) and alternative (Ha) hypotheses, which can be stated as follows:
Ho: There is no difference in asphalt content among three different parties: $H_o: \mu_{contractor} = \mu_{agency} = \mu_{consultant}$
Ha: At least one of the parties has a different measured asphalt content.

2. Identification and Description of Variables: The independent consultant runs 12 additional asphalt content tests by taking independent samples from the same pavement section as the agency and contractor. The question is whether it is likely that the tests came from the same population, based on their variability.

3. Data Collection: The descriptive statistics (mean, standard deviation, and sample size) for the asphalt content data collected by the three parties are shown in Table 13. Notice that 12 measurements each have been taken by the contractor and the independent consultant, while the agency has only taken six measurements. The data for the contractor and the agency are the same as presented in Example 7. For brevity, the consultant's raw observations are not listed here. The mean value and standard deviation for the consultant's data are calculated using the same formulas and equations that were used in Example 7.

Table 13. Asphalt content data summary.

Party         Asphalt Content (%)
Contractor    $\bar{X}_1$ = 6.1, $s_1$ = 0.254, $n_1$ = 12
Agency        $\bar{X}_2$ = 5.7, $s_2$ = 0.303, $n_2$ = 6
Consultant    $\bar{X}_3$ = 5.12, $s_3$ = 0.186, $n_3$ = 12

4. Specification of Analysis Technique and Data Analysis: The agency engineer can use one-way ANOVA to resolve this question. (Details for one-way ANOVA are available in NCHRP Project 20-45, Volume 2, Chapter 4, Section A, "Analysis of Variance Methodology.")
The objective of the ANOVA is to determine whether the variance observed in the dependent variable (in this case, asphalt content) is due to the differences among the samples (different from one party to another) or due to the differences within the samples. ANOVA is basically an extension of two-sample comparisons to cases when three or more samples are being compared. More formally, the engineer is testing to see whether the between-sample variability is large relative to the within-sample variability, as stated in the formal hypothesis. This type of comparison also may be referred to as between-groups versus within-groups variance.
Rejection of the null hypothesis (that the mean values are the same) gives the engineer some information concerning differences among the population means; however, it does not indicate which means actually differ from each other. Rejection of the null hypothesis tells the engineer that differences exist, but it does not specify that $\bar{X}_1$ differs from $\bar{X}_2$ or from $\bar{X}_3$. To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The F-statistic calculated for determining the effect of who (agency, contractor, or consultant) measured

the asphalt content is given in Table 14. (See Example 8 for a more detailed discussion of the calculations necessary to create Table 14.)

Table 14. ANOVA results.

Source            Sum of Squares (SS)    Degrees of Freedom (df)    Mean Square (MS)    F       Significance
Between groups    5.6                    2                          2.8                 49.1    0.000
Within groups     1.5                    27                         0.06
Total             7.2                    29

Although the ANOVA results reveal whether there are overall differences, it is always good practice to visually examine the data. For example, Figure 9 shows the mean and associated 95% confidence intervals (CI) of the mean asphalt content measured by each of the three parties involved in the testing.

Figure 9. Mean and confidence intervals for asphalt content data.

5. Interpreting the Results: A simple one-way ANOVA is conducted to determine whether there is a difference in mean asphalt content as measured by the three different parties. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are significantly different from each other. The engineer can use Tukey's procedure for comparisons of multiple means, or he or she can observe the plotted 95% confidence intervals to figure out which means are actually (and significantly) different from each other (see Figure 9). Because the confidence intervals for the contractor and the agency overlap, the asphalt contents measured by these two parties are not significantly different from each other, even though they differ somewhat. (These same conclusions were obtained in Example 7.) However, the mean asphalt content obtained by the consultant is significantly different from (and lower than) that obtained by both of the other parties. This is evident because the confidence interval for the consultant doesn't overlap with the confidence interval of either of the other two parties.
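The confidence intervals plotted in Figure 9 can be reproduced directly from the Table 13 summary statistics. The following is a minimal sketch, assuming Python with the SciPy library; the grouping and variable names are illustrative only.

```python
# Minimal sketch of the 95% confidence intervals plotted in Figure 9,
# assuming Python with SciPy and using the summary statistics from Table 13.
from scipy import stats

parties = {
    "Contractor": (6.10, 0.254, 12),   # mean, standard deviation, sample size
    "Agency": (5.70, 0.303, 6),
    "Consultant": (5.12, 0.186, 12),
}

for name, (mean, s, n) in parties.items():
    half_width = stats.t.ppf(0.975, df=n - 1) * s / n ** 0.5
    print(f"{name}: {mean:.2f} +/- {half_width:.2f} percent asphalt content")
```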

6. Conclusion and Discussion: This example uses a simple one-way ANOVA to compare the mean values of three sets of results using data drawn from the same test section. The error bar plots for data from the three different parties visually illustrate the statistical differences in the multiple means. However, the F-test for multiple means should be used to formally test the hypothesis of the equality of means. The interpretation of results will be misleading if the variances of populations being compared for their mean difference are not equal. Based on the comparison of the three means, it can be concluded that the construction material in this example may not have consistent properties, as indicated by the results from the independent consultant.

7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is often used when more than two means must be compared. Examples of applications in other areas of transportation research include:
• Traffic Safety/Operations—to evaluate the effect of intersection type on the average number of accidents per month. Three or more types of intersections (e.g., signalized, non-signalized, and rotary) could be selected for study in an urban area having similar traffic volumes and vehicle mix.
• Pavement Engineering
– to investigate the effect of hot-mix asphalt (HMA) layer thickness on fatigue cracking after 20 years of service life. Three HMA layer thicknesses (5 inches, 6 inches, and 7 inches) are to be involved in this study, and other factors (i.e., traffic, climate, and subbase/base thicknesses and subgrade types) need to be similar.
– to determine the effect of climatic conditions on rutting performance of flexible pavements. Three or more climatic conditions (e.g., wet-freeze, wet-no-freeze, dry-freeze, and dry-no-freeze) need to be considered while other factors (i.e., traffic, HMA, and subbase/base thicknesses and subgrade types) need to be similar.

Example 10: Pavements; Simple Analysis of Variance (ANOVA)

Area: Pavements

Method of Analysis: Simple analysis of variance (ANOVA) comparing the mean values of more than two samples and using the F-test

1. Research Question/Problem Statement: The aggregate coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting thermal behavior of PCC slabs in concrete pavements. In addition, the interaction between slab curling (caused by the thermal gradient) and axle loads is assumed to be a critical factor for concrete pavement performance in terms of cracking. To verify the effect of aggregate CTE on slab cracking, a pavement engineer wants to conduct a simple observational study by collecting field pavement performance data on three different types of pavement. For this example, three types of aggregate (limestone, dolomite, and gravel) are being used in concrete pavement construction and yield the following CTEs:
• 4 in./in. per °F
• 5 in./in. per °F
• 6.5 in./in. per °F
It is necessary to compare more than two mean values. A simple one-way ANOVA is used to analyze the observed slab cracking performance by the three different concrete mixes with different aggregate types based on geology (limestone, dolomite, and gravel). All other factors that might cause variation in cracking are assumed to be held constant.

Question/Issue
Compare the means of more than two samples. Specifically, is the cracking performance of concrete pavements designed using more than two different types of aggregates the same? Stated a bit differently, is the performance of three different types of concrete pavement statistically different (are the mean performance measures different)?

2. Identification and Description of Variables: The engineer identifies 1-mile sections of uniform pavement within the state highway network with similar attributes (aggregate type, slab thickness, joint spacing, traffic, and climate). Field performance, in terms of the observed percentage of slab cracked ("% slab cracked," i.e., how cracked is each slab) for each pavement section after about 20 years of service, is considered in the analysis. The available pavement data are grouped (stratified) based on the aggregate type (CTE value). The % slab cracked after 20 years is the dependent variable, while CTE of aggregates is the independent variable. The question is whether pavement sections having different types of aggregate (CTE values) exhibit similar performance based on their variability.

3. Data Collection: From the data stratified by CTE, the engineer randomly selects nine pavement sections within each CTE category (i.e., 4, 5, and 6.5 in./in. per °F). The sample size is based on the statistical power (1 − β) requirements. (For a discussion on sample size determination based on statistical power requirements, see NCHRP Project 20-45, Volume 2, Chapter 1, "Sample Size Determination.") The descriptive statistics for the data, organized by three CTE categories, are shown in Table 15. The engineer considers pavement performance data for 9 pavement sections in each CTE category.

Table 15. Pavement performance data.

CTE (in./in. per °F)    % Slab Cracked After 20 Years
4                       $\bar{X}_1$ = 37.0, $s_1$ = 4.8, $n_1$ = 9
5                       $\bar{X}_2$ = 53.7, $s_2$ = 6.1, $n_2$ = 9
6.5                     $\bar{X}_3$ = 72.5, $s_3$ = 6.3, $n_3$ = 9

4. Specification of Analysis Technique and Data Analysis: Because the engineer is concerned with the comparison of more than two mean values, the easiest way to make the statistical comparison is to perform a one-way ANOVA (see NCHRP Project 20-45, Volume 2, Chapter 4). The comparison will help to determine whether the between-section variability is large relative to the within-section variability. More formally, the following hypotheses are tested:
HO: All mean values are equal (i.e., µ1 = µ2 = µ3).
HA: At least one of the means is different from the rest.
Although rejection of the null hypothesis gives the engineer some information concerning difference among the population means, it doesn't tell the engineer anything about how the means differ from each other. For example, does µ1 differ from µ2 or µ3? To control the experiment-wise error rate (EER) for multiple mean comparisons, a conservative test—Tukey's procedure for unplanned comparisons—can be used. (Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].) The F-statistic calculated for determining the effect of CTE on % slab cracked after 20 years is shown in Table 16.
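Because only summary statistics are given in Table 15, the F-statistic reported in Table 16 can be approximated directly from those summaries. The following is a minimal sketch in plain Python; the small difference from the tabulated value comes from rounding of the means and standard deviations in Table 15.

```python
# Minimal sketch, in plain Python, approximating the F-statistic of Table 16
# from the Table 15 summary statistics alone.
means = [37.0, 53.7, 72.5]   # mean % slabs cracked for CTE = 4, 5, and 6.5
sds = [4.8, 6.1, 6.3]
n = 9                        # pavement sections per CTE group

grand_mean = sum(means) / len(means)
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_within = sum((n - 1) * s ** 2 for s in sds)

df_between = len(means) - 1
df_within = len(means) * (n - 1)
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {f_stat:.1f}")   # close to the value of about 84 reported in Table 16
```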

The data in Table 16 have been produced by considering the original data and following the procedures presented in earlier examples. The emphasis in this example is on understanding what the table of results provides the researcher. Also in this example, the test for homogeneity of variances (Levene test) shows no significant difference among the standard deviations of % slab cracked for different CTE values. Figure 10 presents the mean and associated 95% confidence intervals of the average % slab cracked (also called the mean and error bars) measured for the three CTE categories considered.

Table 16. ANOVA results.

Source            Sum of Squares (SS)    Degrees of Freedom (df)    Mean Square (MS)    F       Significance
Between groups    5652.7                 2                          2826.3              84.1    0.000
Within groups     806.9                  24                         33.6
Total             6459.6                 26

Figure 10. Error bars for % slab cracked with different CTE.

5. Interpreting the Results: A simple one-way ANOVA is conducted to determine if there is a difference among the mean values for % slab cracked for different CTE values. The analysis shows that the F-statistic is significant (p-value < 0.05), meaning that at least two of the means are statistically significantly different from each other. To gain more insight, the engineer can use Tukey's procedure to specifically compare the mean values, or the engineer may simply observe the plotted 95% confidence intervals to ascertain which means are significantly different from each other (see Figure 10). The plotted results show that the mean % slab cracked varies significantly for different CTE values—there is no overlap between the different mean/error bars. Figure 10 also shows that the mean % slab cracked is significantly higher for pavement sections having a higher CTE value. (For more information about Tukey's procedure, see NCHRP Project 20-45, Volume 2, Chapter 4.)

6. Conclusion and Discussion: In this example, simple one-way ANOVA is used to assess the effect of CTE on cracking performance of rigid pavements. The F-test for multiple means is used to formally test the (null) hypothesis of mean equality. The confidence interval plots for data from pavements having three different CTE values visually illustrate the statistical differences in the three means. The interpretation of results will be misleading if the variances of

populations being compared for their mean difference are not equal or if a proper multiple mean comparisons procedure is not adopted. Based on the comparison of the three means in this example, the engineer can conclude that the pavement slabs having aggregates with a higher CTE value will exhibit more cracking than those with lower CTE values, given that all other variables (e.g., climate effects) remain constant.

7. Applications in Other Areas of Transportation Research: Simple one-way ANOVA is widely used and can be employed whenever multiple means within a factor are to be compared with one another. Potential applications in other areas of transportation research include:
• Traffic Operations—to evaluate the effect of commuting time on level of service (LOS) of an urban highway. Mean travel times for three periods (e.g., morning, afternoon, and evening) could be selected for specified highway sections to collect the traffic volume and headway data in all lanes.
• Traffic Safety—to determine the effect of shoulder width on accident rates on rural highways. More than two shoulder widths (e.g., 0 feet, 6 feet, 9 feet, and 12 feet) should be selected in this study.
• Pavement Engineering—to investigate the impact of air void content on flexible pavement fatigue performance. Pavement sections having three or more air void contents (e.g., 3%, 5%, and 7%) in the surface HMA layer could be selected to compare their average fatigue cracking performance after the same period of service (e.g., 15 years).
• Materials—to study the effect of aggregate gradation on the rutting performance of flexible pavements. Three types of aggregate gradations (fine, intermediate, and coarse) could be adopted in the laboratory to make different HMA mix samples. Performance testing could be conducted in the laboratory to measure rut depths for a given number of load cycles.

Example 11: Pavements; Factorial Design (ANOVA Approach)

Area: Pavements

Method of Analysis: Factorial design (an ANOVA approach used to explore the effects of varying more than one independent variable)

1. Research Question/Problem Statement: Extending the information from Example 10 (a simple ANOVA example for pavements), the pavement engineer has verified that the coefficient of thermal expansion (CTE) in Portland cement concrete (PCC) is a critical factor affecting thermal behavior of PCC slabs in concrete pavements and significantly affects concrete pavement performance in terms of cracking. The engineer now wants to investigate the effects of another factor, joint spacing (JS), in addition to CTE. To study the combined effects of PCC CTE and JS on slab cracking, the engineer needs to conduct a factorial design study by collecting field pavement performance data. As before, three CTEs will be considered:
• 4 in./in. per °F,
• 5 in./in. per °F, and
• 6.5 in./in. per °F.
Now, three different joint spacings (12 ft, 16 ft, and 20 ft) also will be considered. For this example, it is necessary to compare multiple means within each factor (main effects) and the interaction between the two factors (interactive effects). The statistical technique involved is called a multifactorial two-way ANOVA.

2. Identification and Description of Variables: The engineer identifies uniform 1-mile pavement sections within the state highway network with similar attributes (e.g., slab thickness, traffic, and climate).
The field performance, in terms of observed percentage of each slab cracked (% slab cracked) after about 20 years of service for each pavement section, is considered the

dependent (or response) variable in the analysis. The available pavement data are stratified based on CTE and JS. CTE and JS are considered the independent variables. The question is whether pavement sections having different CTE and JS exhibit similar performance based on their variability.

Question/Issue
Use collected data to determine the effects of varying more than one independent variable on some measured outcome. In this example, compare the cracking performance of concrete pavements considering two independent variables: (1) coefficients of thermal expansion (CTE) as measured using more than two types of aggregate and (2) differing joint spacing (JS). More formally, the hypotheses can be stated as follows:
Ho: α_i = 0, No difference in % slabs cracked for different CTE values.
Ho: γ_j = 0, No difference in % slabs cracked for different JS values.
Ho: (αγ)_ij = 0, for all i and j, No difference in % slabs cracked for different CTE and JS combinations.

3. Data Collection: The descriptive statistics for % slab cracked data by three CTE and three JS categories are shown in Table 17. From the data stratified by CTE and JS, the engineer has randomly selected three pavement sections within each of the nine combinations of CTE and JS values. (In other words, for each of the nine pavement sections from Example 10, the engineer has selected three JS.)

Table 17. Summary of cracking data (cell means and standard deviations; n = 3 per cell; CTE in in./in. per °F).

JS = 12 ft:   CTE 4: mean = 32.4, s = 0.1;   CTE 5: mean = 46.8, s = 1.8;   CTE 6.5: mean = 65.3, s = 3.2;   marginal: mean = 48.2, s = 14.4
JS = 16 ft:   CTE 4: mean = 36.0, s = 2.4;   CTE 5: mean = 54.0, s = 2.9;   CTE 6.5: mean = 73.0, s = 1.1;   marginal: mean = 54.3, s = 16.1
JS = 20 ft:   CTE 4: mean = 42.7, s = 2.4;   CTE 5: mean = 60.3, s = 0.5;   CTE 6.5: mean = 79.1, s = 2.0;   marginal: mean = 60.7, s = 15.9
Marginal:     CTE 4: mean = 37.0, s = 4.8;   CTE 5: mean = 53.7, s = 6.1;   CTE 6.5: mean = 72.5, s = 6.3;   overall: mean = 54.4, s = 15.8

4. Specification of Analysis Technique and Data Analysis: The engineer can use two-way ANOVA test statistics to determine whether the between-section variability is large relative to the within-section variability for each factor to test the following null hypotheses:
• Ho: α_i = 0
• Ho: γ_j = 0
• Ho: (αγ)_ij = 0
As mentioned before, although rejection of the null hypothesis does give the engineer some information concerning differences among the population means (i.e., there are differences among them), it does not clarify which means differ from each other. For example, does µ1 differ from µ2 or µ3? To control the experiment-wise error rate (EER) for the comparison of multiple means, a conservative test—Tukey's procedure for an unplanned comparison—can be used.
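The two-way ANOVA that follows can be reconstructed from the cell means and standard deviations in Table 17 (n = 3 per cell). The following is a minimal sketch in plain Python; because the tabulated cell values are rounded, the reconstructed sums of squares differ slightly from those reported in Table 18 (most visibly for the very small interaction term), but the conclusions are the same: both main effects are significant and the interaction is not.

```python
# Minimal sketch, in plain Python, of a balanced two-way ANOVA reconstructed
# from the cell means and standard deviations in Table 17.
n = 3                                # replicates per cell
cell_means = [[32.4, 46.8, 65.3],    # rows = JS levels (12, 16, 20 ft)
              [36.0, 54.0, 73.0],    # columns = CTE levels (4, 5, 6.5)
              [42.7, 60.3, 79.1]]
cell_sds = [[0.1, 1.8, 3.2],
            [2.4, 2.9, 1.1],
            [2.4, 0.5, 2.0]]

r = len(cell_means)                  # number of JS levels
c = len(cell_means[0])               # number of CTE levels
grand = sum(sum(row) for row in cell_means) / (r * c)
row_means = [sum(row) / c for row in cell_means]
col_means = [sum(cell_means[i][j] for i in range(r)) / r for j in range(c)]

ss_js = n * c * sum((m - grand) ** 2 for m in row_means)
ss_cte = n * r * sum((m - grand) ** 2 for m in col_means)
ss_cells = n * sum((cell_means[i][j] - grand) ** 2 for i in range(r) for j in range(c))
ss_inter = ss_cells - ss_js - ss_cte
ss_error = sum((n - 1) * cell_sds[i][j] ** 2 for i in range(r) for j in range(c))

mse = ss_error / (r * c * (n - 1))   # 18 error degrees of freedom
print(f"F_CTE = {(ss_cte / (c - 1)) / mse:.1f}")                        # roughly 655
print(f"F_JS = {(ss_js / (r - 1)) / mse:.1f}")                          # roughly 82
print(f"F_interaction = {(ss_inter / ((r - 1) * (c - 1))) / mse:.2f}")  # small, not significant
```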

(Information about two-way ANOVA is available in NCHRP Project 20-45, Volume 2, Chapter 4. Information about Tukey's procedure can be found in almost any good statistics textbook, such as those by Freund and Wilson [2003] and Kutner et al. [2005].)
The results of the two-way ANOVA are shown in Table 18.

Table 18. ANOVA results.

Source            Sum of Squares (SS)    Degrees of Freedom (df)    Mean Square (MS)    F         Significance
CTE               5677.74                2                          2838.87             657.16    0.000
JS                703.26                 2                          351.63              81.40     0.000
CTE × JS          0.12                   4                          0.03                0.007     0.999
Residual/error    77.76                  18                         4.32
Total             6458.88                26

From the table it can be seen that both of the main effects, CTE and JS, are significant in explaining cracking behavior (i.e., both p-values < 0.05). However, the interaction (CTE × JS) is not significant (i.e., the p-value is 0.999, much greater than 0.05). Also, the test for homogeneity of variances (Levene statistic) shows that there is no significant difference among the standard deviations of % slab cracked for different CTE and JS values. Figure 11 illustrates the main and interactive effects of CTE and JS on % slabs cracked.

Figure 11. Main and interaction effects of CTE and JS on slab cracking. (The figure comprises a main effects plot of mean % slabs cracked by CTE and by joint spacing, and an interaction plot of mean % slabs cracked by joint spacing for each CTE level.)

5. Interpreting the Results: A two-way (multifactorial) ANOVA is conducted to determine if differences exist among the mean values for % slab cracked for different CTE and JS values. The analysis shows that the main effects of both CTE and JS are significant, while the interaction effect is insignificant (p-value > 0.05). These results show that CTE and JS each have a significant effect on slab cracking but do not interact with each other. Given these results, the conclusions will be based on the main effects alone without considering interaction effects. (If the interaction effect had been significant, the conclusions would have been based on the interaction instead.) To gain more insight, the engineer can use Tukey's procedure to compare specific multiple means within each factor, or the engineer can simply observe the plotted means in Figure 11 to ascertain which means are significantly different from each other. The plotted results show that the mean % slab cracked varies significantly for different CTE and JS values; in addition, CTE seems to be more influential than JS. All lines are almost parallel to

each other when plotted for both factors together, showing no interactive effects between the levels of the two factors.

6. Conclusion and Discussion: The two-way ANOVA can be used to verify the combined effects of CTE and JS on cracking performance of rigid pavements. The marginal mean plot for cracking at the three different CTE and JS levels visually illustrates the differences in the multiple means. The plot of cell means for cracking within the levels of each factor can indicate the presence of an interactive effect between the two factors (in this example, CTE and JS). However, the F-test for multiple means should be used to formally test the hypothesis of mean equality. Finally, based on the comparison of three means within each factor (CTE and JS), the engineer can conclude that pavement slabs having aggregates with higher CTE values and longer joint spacing will exhibit more cracking than those with lower CTE values and shorter joint spacing. In this example, the effect of CTE on concrete pavement cracking seems to be more critical than that of JS.

7. Applications in Other Areas of Transportation Research: Multifactorial designs can be used when more than one factor is considered in a study. Possible applications of these methods can extend to all transportation-related areas, including:
• Pavement Engineering
– to determine the effects of base type and base thickness on pavement performance of flexible pavements. Two or more levels can be considered within each factor; for example, two base types (aggregate and asphalt-treated bases) and three base thicknesses (8 inches, 12 inches, and 18 inches).
– to investigate the impact of pavement surface conditions and vehicle type on fuel consumption. The researcher can select pavement sections with three levels of ride quality (smooth, rough, and very rough) and three types of vehicles (cars, vans, and trucks). The fuel consumptions can be measured for each vehicle type on all surface conditions to determine their impact.
• Materials
– to study the effects of aggregate gradation and surface on tensile strength of hot-mix asphalt (HMA). The engineer can evaluate two levels of gradation (fine and coarse) and two types of aggregate surfaces (smooth and rough). The samples can be prepared for all the combinations of aggregate gradations and surfaces for determination of tensile strength in the laboratory.
– to compare the impact of curing and cement types on the compressive strength of concrete mixture. The engineer can design concrete mixes in the laboratory utilizing two cement types (Type I and Type III). The concrete samples can be cured in three different ways for 24 hours and 7 days (normal curing, water bath, and room temperature).

Example 12: Work Zones; Simple Before-and-After Comparisons

Area: Work zones

Method of Analysis: Simple before-and-after comparisons (exploring the effect of some treatment before it is applied versus after it is applied)

1. Research Question/Problem Statement: The crash rate in work zones has been found to be higher than the crash rate on the same roads when a work zone is not present. For this reason, the speed limit in construction zones often is set lower than the prevailing non-work-zone speed limit. The state DOT decides to implement photo-radar speed enforcement in a work zone to determine if this speed-enforcement technique reduces the average speed of free-flowing vehicles in the traffic stream.
They measure the speeds of a sample of free-flowing vehicles prior to installing the photo-radar speed-enforcement equipment in a work zone and

then measure the speeds of free-flowing vehicles at the same location after implementing the photo-radar system.

Question/Issue
Use collected data to determine whether a difference exists between results before and after some treatment is applied. For this example, does a photo-radar speed-enforcement system reduce the speed of free-flowing vehicles in a work zone, and, if so, is the reduction statistically significant?

2. Identification and Description of Variables: The variable to be analyzed is the mean speed of vehicles before and after the implementation of a photo-radar speed-enforcement system in a work zone.

3. Data Collection: The speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. before installing the photo-radar system. After the system is installed, the speeds of individual free-flowing vehicles are recorded for 30 minutes on a Tuesday between 10:00 a.m. and 10:30 a.m. The before sample contains 120 observations and the after sample contains 100 observations.

4. Specification of Analysis Technique and Data Analysis: A test of the significance of the difference between two means requires a statement of the hypothesis to be tested (Ho) and a statement of the alternate hypothesis (H1). In this example, these hypotheses can be stated as follows:
Ho: There is no difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is displayed.
H1: There is a difference in the mean speed of free-flowing vehicles before and after the photo-radar speed-enforcement system is displayed.
Because these two samples are independent, a simple t-test is appropriate to test the stated hypotheses. This test requires the following procedure:

Step 1. Compute the mean speed ($\bar{x}$) for the before sample ($\bar{x}_b$) and the after sample ($\bar{x}_a$) using the following equation:

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$, with $n_b = 120$ and $n_a = 100$

Results: $\bar{x}_b$ = 53.1 mph and $\bar{x}_a$ = 50.5 mph.

Step 2. Compute the variance ($S^2$) for each sample using the following equation:

$S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

where $n_a$ = 100; $\bar{x}_a$ = 50.5 mph; $n_b$ = 120; and $\bar{x}_b$ = 53.1 mph.

Results: $S_b^2 = \dfrac{\sum (x_b - \bar{x}_b)^2}{n_b - 1} = 12.06$ and $S_a^2 = \dfrac{\sum (x_a - \bar{x}_a)^2}{n_a - 1} = 12.97$

Step 3. Compute the pooled variance of the two samples using the following equation:

$S_p^2 = \dfrac{\sum (x_a - \bar{x}_a)^2 + \sum (x_b - \bar{x}_b)^2}{n_b + n_a - 2}$

Results: $S_p^2$ = 12.472 and $S_p$ = 3.532.

Step 4. Compute the t-statistic using the following equation:

$t = \dfrac{\bar{x}_b - \bar{x}_a}{S_p \sqrt{\dfrac{n_a + n_b}{n_a n_b}}}$

Result:

$t = \dfrac{53.1 - 50.5}{3.532 \sqrt{\dfrac{100 + 120}{(100)(120)}}} = 5.43$

5. Interpreting the Results: The results of the sample t-test are obtained by comparing the value of the calculated t-statistic (5.43 in this example) with the value of the t-statistic for the level of confidence desired. For a level of confidence of 95%, the t-statistic must be greater than 1.96 to reject the null hypothesis (Ho) that the use of a photo-radar speed-enforcement system does not change the speed of free-flowing vehicles. (For more information, see NCHRP Project 20-45, Volume 2, Appendix C, Table C-4.)

6. Conclusion and Discussion: The sample problem illustrates the use of a statistical test to determine whether the difference in the value of the variable of interest between the before conditions and the after conditions is statistically significant. The before condition is without photo-radar speed enforcement; the after condition is with photo-radar speed enforcement. In this sample problem, the computed t-statistic (5.43) is greater than the critical t-statistic (1.96), so the null hypothesis is rejected. This means the change in the speed of free-flowing vehicles when the photo-radar speed-enforcement system is used is statistically significant. The assumption is made that all other factors that would affect the speed of free-flowing vehicles (e.g., traffic mix, weather, or construction activity) are the same in the before-and-after conditions.
This test is robust if the normality assumption does not hold completely; however, it should be checked using box plots. For significant departures from normality and variance equality assumptions, non-parametric tests must be conducted. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 6, Section C, and also Example 21.)
The reliability of the results in this example could be improved by using a control group. As the example has been constructed, there is an assumption that the only thing that changed at this site was the use of photo-radar speed enforcement; that is, it is assumed that all observed differences are attributable to the use of the photo-radar. If other factors—even something as simple as a general decrease in vehicle speeds in the area—might have impacted speed changes, the effect of the photo-radar speed enforcement would have to be adjusted for those other factors. Measurements taken at a control site (ideally identical to the experiment site) during the same time periods could be used to detect background changes and then to adjust the photo-radar effects. Such a situation is explored in Example 13.

7. Applications in Other Areas of Transportation Research: The before-and-after comparison can be used whenever two independent samples of data are (or can be assumed to be) normally distributed with equal variance. Applications of before-and-after comparison in other areas of transportation research may include:
• Traffic Operations
– to compare the average delay to vehicles approaching a signalized intersection when a fixed-time signal is changed to an actuated signal or a traffic-adaptive signal.
– to compare the average number of vehicles entering and leaving a driveway when access is changed from full access to right-in, right-out only.
• Traffic Safety – to compare the average number of crashes on a section of road before and after the road is resurfaced. – to compare the average number of speeding citations issued per day when a stationary operation is changed to a mobile operation. • Maintenance—to compare the average number of citizen complaints per day when a change is made in the snow plowing policy.
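For readers who want to reproduce the calculations in this example with statistical software, the following is a minimal sketch in Python using SciPy. It works from the summary statistics reported above (means, variances, and sample sizes); the variable names and the choice of SciPy are illustrative assumptions, not part of the original study.

    # Two-sample (pooled-variance) t-test for Example 12, computed from summary statistics.
    from math import sqrt
    from scipy import stats

    mean_before, var_before, n_before = 53.1, 12.06, 120
    mean_after, var_after, n_after = 50.5, 12.97, 100

    t_stat, p_value = stats.ttest_ind_from_stats(
        mean1=mean_before, std1=sqrt(var_before), nobs1=n_before,
        mean2=mean_after, std2=sqrt(var_after), nobs2=n_after,
        equal_var=True)  # pooled-variance t-test, as in Steps 3 and 4

    print(f"t = {t_stat:.2f}, two-sided p = {p_value:.4f}")  # t is approximately 5.43

Because the test statistic far exceeds the critical value, the printed p-value is very small, matching the conclusion reached by hand in Step 5.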

Example 13: Traffic Safety; Complex Before-and-After Comparisons and Controls
Area: Traffic safety
Method of Analysis: Complex before-and-after comparisons using control groups (examining the effect of some treatment or application with consideration of other factors that may also have an effect)

1. Research Question/Problem Statement: A state safety engineer wants to estimate the effectiveness of fluorescent orange warning signs as compared to standard orange signs in work zones on freeways and other multilane highways. Drivers can see fluorescent signs from a longer distance than standard signs, especially in low-visibility conditions, and the extra cost of the fluorescent material is modest. Work-zone safety is a perennial concern, especially on freeways and multilane highways where speeds and traffic volumes are high.

Question/Issue
How can background effects be separated from the effects of a treatment or application? Compared to standard orange signs, do fluorescent orange warning signs increase safety in work zones on freeways and multilane highways?

2. Identification and Description of Variables: The engineer quickly concludes that there is a need to collect and analyze safety surrogate measures (e.g., traffic conflicts and late lane changes) rather than collision data. It would take a long time and require experimentation at many work zones before a large sample of collision data could be ready for analysis on this question. Surrogate measures relate to collisions, but they are much more numerous, and it is easier to collect a large sample of them in a short time. For a study of traffic safety, surrogate measures might include near-collisions (traffic conflicts), vehicle speeds, or locations of lane changes. In this example, the engineer chooses to use the location of the lane-change maneuver made by drivers in a lane to be closed entering a work zone. This particular surrogate safety measure is a measure of effectiveness (MOE). The hypothesis is that the farther downstream a driver makes a lane change out of a lane to be closed—when the highway is still below capacity—the safer the work zone.

3. Data Collection: The engineer establishes site selection criteria and begins examining all active work zones on freeways and multilane highways in the state for possible inclusion in the study. The site selection criteria include items such as an active work zone, a cooperative contractor, no interchanges within the approach area, and the desired lane geometry. Seven work zones meet the criteria and are included in the study.

The engineer decides to use a before-and-after (sometimes designated B/A or b/a) experiment design with randomly selected control sites. The latter are sites in the same population as the treatment sites; that is, they meet the same selection criteria but are untreated (i.e., standard warning signs are employed, not the fluorescent orange signs). This is a strong experiment design because it minimizes three common types of bias in experiments: history, maturation, and regression to the mean. History bias exists when changes (e.g., new laws or large weather events) happen at about the same time as the treatment in an experiment, so that the engineer or analyst cannot separate the effect of the treatment from the effects of the other events.
Maturation bias exists when gradual changes occur throughout an extended experiment period and cannot be separated from the effects of the treatment. Examples of maturation bias might involve changes like the aging of driver populations or new vehicles with more air bags. History and maturation biases are referred to as specification errors and are described in more detail in NCHRP Project 20-45, Volume 2,

Chapter 1, in the section "Quasi-Experiments." Regression-to-the-mean bias exists when sites with the highest MOE levels in the before time period are treated. If the MOE level falls in the after period, the analyst can never be sure how much of the fall was due to the treatment and how much was due to natural fluctuation of the MOE back toward its usual mean value. A before-and-after study with randomly selected control sites minimizes these biases because their effects are expected to apply just as much to the treatment sites as to the control sites.

In this example, the engineer randomly selects four of the seven work zones to receive fluorescent orange signs. The other three randomly selected work zones receive standard orange signs and are the control sites. After the signs have been in place for a few weeks (a common tactic in before-and-after studies to allow regular drivers to get used to the change), the engineer collects data at all seven sites. The location of each vehicle's lane-change maneuver out of the lane to be closed is measured from videotape recorded for several hours at each site. Table 19 shows the lane-change data at the midpoint between the first warning sign and the beginning of the taper. Notice that nearly the same total number of vehicles is observed in the before and after periods.

4. Specification of Analysis Technique and Data Analysis: Depending on their format, data from a before-and-after experiment with control sites may be analyzed several ways. The data in the table lend themselves to analysis with a chi-square test to see whether the distributions between the before and after conditions are the same at both the treatment and control sites. (For more information about chi-square testing, see NCHRP Project 20-45, Volume 2, Chapter 6, Section E, "Chi-Square Test for Independence.")

To perform the chi-square test on the data for Example 13, the engineer first computes the expected value in each cell. For the cell corresponding to the before time period at control sites, this value is computed as the row total (3361) times the column total (2738) divided by the grand total (6714):

\frac{3361 \times 2738}{6714} = 1371 \text{ vehicles}

The engineer next computes the chi-square value for each cell using the following equation:

\chi_i^2 = \frac{(O_i - E_i)^2}{E_i}

where O_i is the number of actual observations in cell i and E_i is the expected number of observations in cell i. For example, the chi-square value in the cell corresponding to the before time period at control sites is (1262 − 1371)^2 / 1371 = 8.6. The engineer then sums the chi-square values from all four cells to get 29.1. That sum is then compared to the critical chi-square value for a significance level of 0.025 with 1 degree of freedom (degrees of freedom = [number of rows − 1] × [number of columns − 1]), which is shown on a standard chi-square distribution table to be 5.02 (see NCHRP Project 20-45, Volume 2, Appendix C, Table C-2). A significance level of 0.025 is not uncommon in such experiments (although 0.05 is a general default value), but it is a standard that is difficult but not impossible to meet.

Table 19. Lane-change data for before-and-after comparison using controls.

Time Period | Number of Vehicles Observed in Lane to be Closed at Midpoint
            | Control | Treatment | Total
Before      | 1,262   | 2,099     | 3,361
After       | 1,476   | 1,877     | 3,353
Total       | 2,738   | 3,976     | 6,714
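The cell-by-cell arithmetic above can be checked with statistical software. The following is a minimal sketch in Python using SciPy; the 2 × 2 table comes from Table 19, and turning off the Yates continuity correction reproduces the hand calculation. The function choice and variable names are illustrative assumptions, not part of the original study.

    # Chi-square test of independence for the before/after x control/treatment table (Table 19).
    import numpy as np
    from scipy.stats import chi2_contingency, chi2

    observed = np.array([[1262, 2099],    # before: control, treatment
                         [1476, 1877]])   # after:  control, treatment

    chi2_stat, p_value, dof, expected = chi2_contingency(observed, correction=False)
    critical = chi2.ppf(1 - 0.025, dof)   # critical value at the 0.025 significance level

    print(f"chi-square = {chi2_stat:.1f}, dof = {dof}, p = {p_value:.2e}")  # about 29.1 with 1 dof
    print(f"critical value (alpha = 0.025) = {critical:.2f}")               # about 5.02
    print("expected counts:\n", np.round(expected, 0))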

5. Interpreting the Results: Because the calculated chi-square value is greater than the critical chi-square value, the engineer concludes that there is a statistically significant difference in the number of vehicles in the lane to be closed at the midpoint between the before and after time periods for the treatment sites, relative to what would be expected based on the control sites. In other words, there is a difference that is due to the treatment.

6. Conclusion and Discussion: The experiment results show that fluorescent orange signs in work zone approaches like those tested would likely have a safety benefit. Although the engineer cannot reasonably estimate the number of collisions that would be avoided by using this treatment, the before-and-after study with control using a safety surrogate measure makes it clear that some collisions will be avoided. The strength of the experiment design with randomly selected control sites means that agencies can have confidence in the results.

The consequences of an error in an analysis like this that results in the wrong conclusion can be devastating. If the error leads an agency to use a safety measure more than it should, precious safety funds will be wasted that could be put to better use. If the error leads an agency to use the safety measure less often than it should, money will be spent on measures that do not prevent as many collisions. With safety funds in such short supply, solid analyses that lead to effective decisions on countermeasure deployment are of great importance.

A before-and-after experiment with control is difficult to arrange in practice. Such an experiment is practically impossible using collision data, because that would mean leaving some higher-collision sites untreated during the experiment. Such experiments are more plausible using surrogate measures like the one described in this example.

7. Applications in Other Areas of Transportation Research: Before-and-after experiments with randomly selected control sites are difficult to arrange in transportation safety and other areas of transportation research. The instinct to apply treatments to the worst sites, rather than randomly—as this method requires—is difficult to overcome. Despite the difficulties, such experiments are sometimes performed in:
• Traffic Operations—to test traffic control strategies at a number of different intersections.
• Pavement Engineering—to compare new pavement designs and maintenance processes to current designs and practice.
• Materials—to compare new materials, mixes, or processes to standard mixtures or processes.

Example 14: Work Zones; Trend Analysis
Area: Work zones
Method of Analysis: Trend analysis (examining, describing, and modeling how something changes over time)

1. Research Question/Problem Statement: Measurements conducted over time often reveal patterns of change called trends. A model may be used to predict some future measurement, or the relative success of a different treatment or policy may be assessed. For example, work/construction zone safety has been a concern for highway officials, engineers, and planners for many years. Is there a pattern of change?

Question/Issue
Can a linear model represent change over time? In this particular example, is there a trend over time for motor vehicle crashes in work zones? The problem is to predict values of crash frequency at specific points in time.
Although the question is simple, the statistical modeling becomes sophisticated very quickly.

2. Identification and Description of Variables: Highway safety, or rather the lack of it, is revealed by the total number of fatalities due to motor vehicle crashes. The percentage of those deaths occurring in work zones reveals a pattern over time (Figure 12). The pattern is modeled using the following equation:

WZP = a + b \cdot YEAR + u

where WZP = work zone percentage of total fatalities, YEAR = calendar year, and u = an error term.

3. Data Collection: The base data are obtained from the Fatality Analysis Reporting System maintained by the National Highway Traffic Safety Administration (NHTSA), as reported at www.workzonesafety.org. The data are state specific as well as national, and cover a period of 26 years, from 1982 through 2007. The numbers of fatalities from motor vehicle crashes in and not in construction/maintenance zones (work zones) are used to compute the percentage of fatalities in work zones for each of the 26 years.

4. Specification of Analysis Techniques and Data Analysis: Ordinary least squares (OLS) regression is used to develop the general model specified above. The discussion in this example focuses on the resulting model and the related statistics. (See also Examples 15, 16, and 17 for details on calculations. For more information about OLS regression, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, "Linear Regression.")

Fitting the model to the data shown in Figure 12 gives:

WZP = -91.523 + 0.047(YEAR)
t-values: (-8.34) and (8.51); p-values: (0.000) and (0.000)
R = 0.867; R2 = 0.751

The trend is significant: the fitted line shows an increase of 0.047 percentage points each year. Generally, this trend shows that work-zone fatalities are increasing as a percentage of total fatalities.

5. Interpreting the Results: The model fits the data well and generally shows that work-zone fatalities were an increasing problem over the period 1982 through 2007. This is a trend that highway officials, engineers, and planners would like to change. The analyst is therefore interested in anticipating the trajectory of the trend. Here the trend suggests that things are getting worse.

Figure 12. Percentage of all motor vehicle fatalities occurring in work zones.

How far might authorities let things go—5%? 10%? 25%? Caution must be exercised when interpreting a trend beyond the limits of the available data. (A short computational sketch illustrating this caution appears at the end of this example.) Technically the slope, or b-coefficient, is the trend of the relationship. The a-term from the regression, also called the intercept, is the value of WZP when the independent variable equals zero. The intercept for the trend in this example would technically indicate that the percentage of motor vehicle fatalities in work zones in the year zero would be -91.5%. This is absurd on many levels: there could be no motor vehicles in year zero, and what is a negative percentage of the total? The absurdity of the intercept in this example reveals that trends are limited concepts, valid only within a relevant time frame.

Figure 12 also suggests that the trend, while valid for the 26 years in aggregate, does not work very well for the last 5 years, during which the percentages are consistently falling, not rising. Something seems to have changed around 2002; perhaps the highway officials, engineers, and planners took action to change the trend, in which case the trend reversal would be considered a policy success.

Finally, some underlying assumptions must be considered. For example, there is an implicit assumption that the types of roads with construction zones are similar from year to year. If this assumption is not correct (e.g., if a greater number of high-speed roads, where fatalities may be more likely, are worked on in some years than in others), then interpreting the trend may not make much sense.

6. Conclusion and Discussion: The computation of the dependent variable (the percentage of motor vehicle fatalities occurring in work zones, WZP) is influenced by changes in the number of work-zone fatalities and the number of non-work-zone fatalities. To some extent, both of these are random variables. Accordingly, it is difficult to distinguish a trend or trend reversal from a short series of possibly random movements in the same direction. Statistically, more observations permit greater confidence in non-randomness.

It is also possible that a data series might contain regular, non-random movements that are unrelated to a trend. Consider the dependent variable above (WZP), but measured using monthly data instead of annual data. Further, imagine looking at such data for a state in the upper Midwest instead of for the nation as a whole. In this new situation, the WZP might fall off or halt altogether each winter (when construction and maintenance work are minimized), only to rise again in the spring (reflecting renewed work-zone activity). This change is not a trend per se, nor is it random. Rather, it is cyclical.

7. Applications in Other Areas of Transportation Research: Applications of trend analysis models in other areas of transportation research include:
• Transportation Safety—to identify trends in traffic crashes (e.g., motor vehicle/deer) over time on some part of the roadway system (e.g., freeways).
• Public Transportation—to determine the trend in rail passenger trips over time (e.g., in response to increasing gas prices).
• Pavement Engineering—to monitor the number of miles of pavement that is below some service-life threshold over time.
• Environment—to monitor the hours of truck idling time in rest areas over time.
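The fitted trend line can be used, and misused, very simply. The following is a minimal sketch in Python that evaluates the reported model WZP = -91.523 + 0.047(YEAR) for years inside and outside the 1982-2007 data range; it is offered only to illustrate the extrapolation caution discussed above, and the function and variable names are illustrative assumptions.

    # Evaluate the reported work-zone trend line for selected years.
    INTERCEPT = -91.523   # value at "year zero"; not meaningful on its own
    SLOPE = 0.047         # change in work-zone share of fatalities per year (percentage points)

    def predicted_wzp(year: int) -> float:
        """Predicted percentage of all motor vehicle fatalities occurring in work zones."""
        return INTERCEPT + SLOPE * year

    for year in (1982, 1995, 2007, 2030, 0):
        note = "" if 1982 <= year <= 2007 else "  <- outside the data range; extrapolation is unreliable"
        print(f"{year}: {predicted_wzp(year):7.2f}%{note}")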
Example 15: Structures/Bridges; Trend Analysis
Area: Structures/bridges
Method of Analysis: Trend analysis (examining a trend over time)

1. Research Question/Problem Statement: A state agency wants to monitor trends in the condition of bridge superstructures in order to perform long-term needs assessment for bridge rehabilitation or replacement. Bridge condition rating data will be analyzed for bridge superstructures that have been inspected over a period of 15 years. The objective of this study is to examine the overall pattern of change in the indicator variable over time.

Question/Issue
Use collected data to determine whether the values of a variable show an increasing or a decreasing trend over time. In this example, determine whether levels of structural deficiency in bridge superstructures have been increasing or decreasing over time, and determine how rapidly the increase or decrease has occurred.

2. Identification and Description of Variables: Bridge inspection generally entails collection of numerous variables, including location information, traffic data, structural elements (type and condition), and functional characteristics. Based on the severity of deterioration and the extent of its spread through a bridge component, a condition rating is assigned on a discrete scale from 0 (failed) to 9 (excellent). Generally, a condition rating of 4 or below indicates deficiency in a structural component. The state agency inspects approximately 300 bridges every year (the denominator). The number of superstructures that receive a rating of 4 or below each year (the number of events, or numerator) also is recorded. The agency is concerned with the change in the overall rate (calculated per 100) of structurally deficient bridge superstructures. This rate, which is simply the ratio of the numerator to the denominator, is the indicator (dependent variable) to be examined for trend over a time period of 15 years. Notice that the unit of analysis is the time period and not the individual bridge superstructure.

3. Data Collection: Data are collected for bridges scheduled for inspection each year. It is important to note that the bridge condition rating scale is based on subjective categories, and therefore there may be inherent variability among inspectors in the ratings they assign to bridge superstructures. Also, it is assumed that during the time period for which the trend analysis is conducted, no major changes are introduced in the bridge inspection methods. Sample data provided in Table 20 show the rate (per 100), the number of bridges per year that received a score of 4 or below, and the total number of bridges inspected per year.

Table 20. Sample bridge inspection data.

No. | Year | Rate (per 100) | Number of Events (Numerator) | Number of Bridges Inspected (Denominator)
1   | 1990 |  8.33 | 25 | 300
2   | 1991 |  8.70 | 26 | 299
5   | 1994 | 10.54 | 31 | 294
11  | 2000 | 13.55 | 42 | 310
15  | 2004 | 14.61 | 45 | 308

4. Specification of Analysis Technique and Data Analysis: The data set consists of 15 observations, one for each year. Figure 13 shows a scatter plot of the rate (dependent variable) versus time in years. The scatter plot does not indicate the presence of any outliers, and it shows a seemingly increasing linear trend in the rate of deficient superstructures over time. No need for data transformation or smoothing is apparent from examination of the scatter plot in Figure 13. To determine whether the apparent linear trend is statistically significant in these data, ordinary least squares (OLS) regression can be employed.

Figure 13. Scatter plot of time versus rate.

The linear regression model takes the following form:

y_i = \beta_0 + \beta_1 x_i + e_i

where
i = 1, 2, . . . , n (n = 15 in this example),
y = dependent variable (rate of structurally deficient bridge superstructures),
x = independent variable (time),
\beta_0 = y-intercept (provides a reference point only),
\beta_1 = slope (change in y for a unit change in x), and
e_i = residual error.

The first step is to estimate \beta_0 and \beta_1 in the regression function. The residual errors (e_i) are assumed to be independently and identically distributed (i.e., they are mutually independent and have the same probability distribution). The estimates can be computed using the following equations:

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = 0.454

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 8.396

where \bar{y} is the overall mean of the dependent variable and \bar{x} is the overall mean of the independent variable. The prediction equation for the rate of structurally deficient bridge superstructures over time can be written using the following equation:

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 8.396 + 0.454x

That is, as time increases by a year, the rate of structurally deficient bridge superstructures increases by 0.454 per 100 bridges. The plot of the regression line is shown in Figure 14. Figure 14 indicates some small variability about the regression line. To conduct hypothesis testing for the regression relationship (Ho: \beta_1 = 0), assessment of this variability and the assumption of normality would be required. (For a discussion of assumptions for residual errors, see NCHRP Project 20-45, Volume 2, Chapter 4.)

Like analysis of variance (ANOVA, described in Examples 8, 9, and 10), statistical inference is initiated by partitioning the total sum of squares (TSS) into the error sum of squares (SSE) and the model sum of squares (SSR). That is, TSS = SSE + SSR.

The TSS is defined as the sum of the squares of the differences between each observation and the overall mean. In other words, deviation of observation from overall mean (TSS) = deviation of observation from prediction (SSE) + deviation of prediction from overall mean (SSR). For this example,

TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2 = 60.892

SSR = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = 57.790

SSE = TSS - SSR = 3.102

Regression analysis computations are usually summarized in a table (see Table 21).

Table 21. Analysis of regression table.

Source     | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square  | F       | Significance
Regression | 57.790              | 1                       | 57.790 (MSR) | 242.143 | 8.769e-10
Error      | 3.102               | 13                      | 0.239 (MSE)  |         |
Total      | 60.892              | 14                      |              |         |

The mean squared errors (MSR, MSE) are computed by dividing the sums of squares by the corresponding model and error degrees of freedom. If the null hypothesis (Ho: \beta_1 = 0) is true, the expected value of MSR equals the expected value of MSE, so that F = MSR/MSE should be a random draw from an F-distribution with 1 and n - 2 degrees of freedom. From the regression shown in Table 21, F is computed to be 242.143, and the probability of obtaining a value larger than the computed F is extremely small. Therefore, the null hypothesis is rejected; that is, the slope is significantly different from zero, and the linearly increasing trend is found to be statistically significant. Notice that a slope of zero would imply that knowing a value of the independent variable provides no insight into the value of the dependent variable. (A short computational sketch reproducing these calculations appears at the end of this example.)

Figure 14. Plot of regression line (y = 8.396 + 0.454x, R2 = 0.949).

5. Interpreting the Results: The linear regression model does not imply any cause-and-effect relationship between the independent and dependent variables. The y-intercept only provides a reference point, and the relationship need not be linear outside the data range. The 95% confidence interval for \beta_1 is computed as [0.391, 0.517]; that is, the analyst is 95% confident that the true mean increase in the rate of structurally deficient bridge superstructures is between 0.391% and 0.517% per year.

(For a discussion of computing confidence intervals, see NCHRP Project 20-45, Volume 2, Chapter 4.)

The coefficient of determination (R2) provides an indication of the model fit. For this example, R2 is calculated using the following equation:

R^2 = \frac{SSR}{TSS} = 1 - \frac{SSE}{TSS} = 0.949

The R2 indicates that the regression model accounts for 94.9% of the total variation in the (hypothetical) data. It should be noted that such a high value of R2 is almost impossible to attain from analysis of real observational data collected over a long time. Also, distributional assumptions must be checked before proceeding with linear regression, as serious violations may indicate the need for data transformation, use of non-linear regression or non-parametric methods, and so on.

6. Conclusion and Discussion: In this example, simple linear regression has been used to determine the trend in the rate of structurally deficient bridge superstructures in a geographic area. In addition to assessing the overall pattern of change, trend analysis may be performed to:
• study the levels of indicators of change (or dependent variables) in different time periods to evaluate the impact of technical advances or policy changes;
• compare different geographic areas or different populations with perhaps varying degrees of exposure in absolute and relative terms; and
• make projections to monitor progress toward an objective.
However, given the dynamic nature of trend data, many of these applications require more sophisticated techniques than simple linear regression.

An important aspect of examining trends over time is the accuracy of numerator and denominator data. For example, bridge structures may be examined more than once during the analysis time period, and retrofit measures may be taken at some deficient bridges. Also, the age of structures is not accounted for in this analysis. For the purpose of this example, it is assumed that these (and other similar) effects are negligible and do not confound the data. In real-life applications, however, if the analysis time period is very long, it becomes extremely important to account for changes in factors that may have affected the dependent variable(s) and their measurement. Examples of the latter include changes in the volume of heavy trucks using the bridges, changes in maintenance policies, or changes in plowing and salting regimes.

7. Applications in Other Areas of Transportation Research: Trend analysis is carried out in many areas of transportation research, such as:
• Transportation Planning/Traffic Operations—to determine the need for capital improvements by examining traffic growth over time.
• Traffic Safety—to study the trends in overall, fatal, and/or injury crash rates over time in a geographic area.
• Pavement Engineering—to assess the long-term performance of pavements under varying loads.
• Environment—to monitor the emission levels from commercial traffic over time with growth of industrial areas.
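The regression, ANOVA partition, and confidence interval in this example can be reproduced with a few lines of code. The following is a minimal sketch in Python using NumPy and SciPy. It is written to run on the five years actually listed in Table 20, so its numerical output will differ somewhat from the full 15-year results reported above; the variable names are illustrative assumptions.

    # Simple linear trend fit (rate of deficient superstructures vs. time) with an ANOVA partition.
    import numpy as np
    from scipy import stats

    # Years (coded 1-15) and rates per 100 from Table 20; only 5 of the 15 annual values are listed there.
    t = np.array([1, 2, 5, 11, 15], dtype=float)
    rate = np.array([8.33, 8.70, 10.54, 13.55, 14.61])

    fit = stats.linregress(t, rate)          # ordinary least squares with one predictor
    n = len(t)
    y_hat = fit.intercept + fit.slope * t

    tss = np.sum((rate - rate.mean()) ** 2)  # total sum of squares
    sse = np.sum((rate - y_hat) ** 2)        # error sum of squares
    ssr = tss - sse                          # model sum of squares
    f_stat = (ssr / 1) / (sse / (n - 2))     # F = MSR / MSE

    half_width = stats.t.ppf(0.975, n - 2) * fit.stderr   # 95% confidence interval for the slope
    print(f"rate = {fit.intercept:.3f} + {fit.slope:.3f} * t,  R^2 = {fit.rvalue**2:.3f}")
    print(f"F = {f_stat:.1f}, slope 95% CI = [{fit.slope - half_width:.3f}, {fit.slope + half_width:.3f}]")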

Example 16: Transportation Planning; Multiple Regression Analysis
Area: Transportation planning
Method of Analysis: Multiple regression analysis (testing proposed linear models with more than one independent variable when all variables are continuous)

1. Research Question/Problem Statement: Transportation planners and engineers often work on variations of the classic four-step transportation planning process for estimating travel demand. The first step, trip generation, generally involves developing a model that can be used to predict the number of trips originating or ending in a zone, which is a geographical subdivision of a corridor, city, or region (also referred to as a traffic analysis zone, or TAZ). The objective is to develop a statistical relationship (a model) that can be used to explain the variation in a dependent variable based on the variation of one or more independent variables. In this example, ordinary least squares (OLS) regression is used to develop a model between trips generated (the dependent variable) and demographic, socio-economic, and employment variables (independent variables) at the household level.

Question/Issue
Can a linear relationship (model) be developed between a dependent variable and one or more independent variables? In this application, the dependent variable is the number of trips produced by households. Independent variables include the numbers of persons, workers, and vehicles in a household, household income, and the average age of persons in the household. The basic question is whether the relationship between the dependent (Y) and independent (X) variables can be represented by a linear model using two coefficients (a and b), expressed as follows:

Y = a + b \cdot X

where a = the intercept and b = the slope of the line. If the relationship being examined involves more than one independent variable, the equation will simply have more terms. In addition, in a more formal presentation, the equation will also include an error term, e, added at the end.

2. Identification and Description of Variables: Data for four-step modeling of travel demand, or for calibration of any specific model (e.g., trip generation or trip origins), come from a variety of sources, ranging from the U.S. Census to mail or telephone surveys. The data that are collected will depend, in part, on the specific purpose of the modeling effort. Data appropriate for a trip-generation model typically are collected from some sort of household survey.

For the dependent variable in a trip-generation model, data must be collected on trip-making characteristics. These characteristics could include something as simple as the total trips made by a household in a day, or they could involve more complicated breakdowns by trip purpose (e.g., work-related trips versus shopping trips) and time of day (e.g., trips made during peak and non-peak hours). The basic issue that must be addressed is the purpose of the proposed model: What is to be estimated or predicted? Weekday work trips normally are associated with peak congestion and are often the focus of these models.

For the independent variable(s), the analyst must first give some thought to the likely causes for household trip-making to vary. For example, it makes sense intuitively that household size might be pertinent (i.e., it seems reasonable that more persons in the household would lead to a higher number of household trips). Household members could be divided into workers and non-workers, two variables instead of one. Likewise, other socio-economic characteristics, such as income-related variables, might also make sense as candidate variables for the model.
Data are collected on a range of candidate variables, and

the analysis process is used to sort through these variables to determine which combination leads to the best model. To be used in ordinary regression modeling, variables need to be continuous; that is, measured on a ratio or interval scale. Nominal data may be incorporated through the use of indicator (dummy) variables. (For more information on continuous variables, see NCHRP Project 20-45, Volume 2, Chapter 1; for more information on dummy variables, see NCHRP Project 20-45, Volume 2, Chapter 4.)

3. Data Collection: As noted, data for modeling travel demand often come from surveys designed especially for the modeling effort. Data also may be available from centralized sources such as a state DOT or local metropolitan planning organization (MPO).

4. Specification of Analysis Techniques and Data Analysis: In this example, data for 178 households in a small city in the Midwest have been provided by the state DOT. The data are obtained from surveys of about 15,000 households all across the state. This example uses only a tiny portion of the data set (see Table 22). Based on the data, a fairly obvious relationship is initially hypothesized: more persons in a household (PERS) should produce more person-trips (TRIPS).

In its simplest form, the regression model has one dependent variable and one independent variable. The underlying assumption is that variation in the independent variable causes the variation in the dependent variable. For example, the dependent variable might be TRIPS (the count of total trips made by a household on a typical weekday), and the independent variable might be PERS (the total number of persons, or occupants, in the household). Expressing the relationship between TRIPS and PERS for the ith household in a sample of households results in the following hypothesized model:

TRIPS_i = a + b \cdot PERS_i + \varepsilon_i

where a and b are coefficients to be determined by ordinary least squares (OLS) regression analysis and \varepsilon_i is the error term. The difference between the value of TRIPS predicted by the fitted equation for any household and the actual observed value of TRIPS for that same household is called the residual. The resulting model is the equation of the best-fit straight line (for the given data), where a is the intercept and b is the slope of the line. (For more information about fitted regression and measures of fit, see NCHRP Project 20-45, Volume 2, Chapter 4.)

In Table 22, R is the multiple R, which in the case of the simplest linear regression involving one independent variable (also called univariate regression) is the correlation coefficient. The R2 (coefficient of determination) may be interpreted as the proportion of the variance of the dependent variable explained by the fitted regression model. The adjusted R2 corrects for the number of independent variables in the equation. A "perfect" R2 of 1.0 could be obtained by including enough independent variables (e.g., one for each observation), but doing so would hardly be useful.

Table 22. Regression model statistics.

Coefficients | t-values (statistics) | p-values | Measures of Fit
a = 3.347    | 4.626                 | 0.000    | R = 0.510
b = 2.001    | 7.515                 | 0.000    | R2 = 0.260; adjusted R2 = 0.255

Restating the now-calibrated model:

TRIPS = 3.347 + 2.001 \cdot PERS

The statistical significance of each coefficient estimate is evaluated with the p-values of the calculated t-statistics, provided the errors are normally distributed. The p-values (also known as probability values) generally indicate whether the coefficients are significantly different from zero (which they need to be in order for the model to be useful). More formally stated, a p-value is the probability of a Type I error. In this example, the t- and p-values shown in Table 22 indicate that both a and b are significantly different from zero at well beyond the 99.9% confidence level. P-values generally are reported as two-tail (two-sided hypothesis testing) values in the output of most computer packages; one-tail (one-sided) values sometimes may be obtained by dividing the printed p-values by two. (For more information about one-sided versus two-sided hypothesis testing, see NCHRP Project 20-45, Volume 2, Chapter 4.)

The R2 may be tested with an F-statistic; in this example, the F was calculated as 56.469 (degrees of freedom = 1, 176). (See NCHRP Project 20-45, Volume 2, Chapter 4.) This means that the model explains a significant amount of the variation in the dependent variable. A plot of the estimated model (line) and the actual data is shown in Figure 15.

A strict interpretation of this model suggests that a household with zero occupants (PERS = 0) will produce 3.347 trips per day. Clearly, this is not feasible because there cannot be a household of zero persons, which illustrates the kind of problem encountered when a model is extrapolated beyond the range of the data used for the calibration. In other words, a formal test of the intercept (the a) is not always meaningful or appropriate.

Extension of the Model to Multivariate Regression: When the list of potential independent variables is considered, the researcher or analyst might determine that more than one cause for variation in the dependent variable may exist. In the current example, the question of whether there is more than one cause for variation in the number of trips can be considered.

Figure 15. Plot of the line for the estimated model (TRIPS versus PERS).

The model just discussed, with one independent variable, is called a univariate model. Should the final model for this example be multivariate? Before determining the final model, the analyst may want to consider whether a variable or variables exist that further clarify what has already been modeled (e.g., more persons cause more trips). The variable PERS is a crude measure, made up of workers and non-workers. Most households have one or two workers. It can be shown that a measure of the non-workers in the household is more effective in explaining trips than is total persons, so a new variable, persons minus workers (DEP), is calculated.

Next, variables may exist that address entirely different causal relationships. It might be hypothesized that as the number of registered motor vehicles available in the household (VEH) increases, the number of trips will increase. It may also be argued that as household income (INC, measured in thousands of dollars) increases, the number of trips will increase. Finally, it may be argued that as the average age of household occupants (AVEAGE) increases, the number of trips will decrease because retired people generally make fewer trips. Each of these statements is based upon a logical argument (hypothesis). Given these arguments, the hypothesized multivariate model takes the following form:

TRIPS = a + b \cdot DEP + c \cdot VEH + d \cdot INC + e \cdot AVEAGE + \varepsilon

The results from fitting the multivariate model are given in Table 23. Results of the analysis of variance (ANOVA) for the overall model are shown in Table 24.

Table 23. Results from fitting the multivariate model.

Coefficients   | t-values (statistics) | p-values  | Measures of Fit
a = 8.564      | 6.274                 | 3.57E-09* | R = 0.589
b = 0.899      | 2.832                 | 0.005     | R2 = 0.347
c = 1.067      | 3.360                 | 0.001     | adjusted R2 = 0.330
d = 1.907E-05* | 1.927                 | 0.056     |
e = -0.098     | -4.808                | 3.68E-06  |
*See the note about scientific notation in Section 5, Interpreting the Results.

Table 24. ANOVA results for the overall model.

ANOVA      | Sum of Squares (SS) | Degrees of Freedom (df) | F-ratio | p-value
Regression | 1487.5              | 4                       | 19.952  | 3.4E-13
Residual   | 2795.7              | 150                     |         |
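A model of this form is usually estimated with a statistics package rather than by hand. The following is a minimal sketch in Python using statsmodels. The small data frame shown here is made-up, illustrative data (the household survey records used in this example are not reproduced in the report), so the fitted coefficients will not match Tables 23 and 24; substitute the actual survey data to reproduce them.

    # Multivariate OLS trip-generation model: TRIPS = a + b*DEP + c*VEH + d*INC + e*AVEAGE.
    # The records below are hypothetical placeholders, not data from the study.
    import pandas as pd
    import statsmodels.api as sm

    data = pd.DataFrame({
        "TRIPS":  [4, 7, 9, 5, 11, 6, 8, 3],
        "DEP":    [1, 2, 3, 1, 3, 2, 2, 0],          # persons minus workers
        "VEH":    [1, 2, 2, 1, 3, 2, 2, 1],          # registered vehicles in the household
        "INC":    [35, 52, 61, 40, 75, 48, 58, 30],  # household income, thousands of dollars
        "AVEAGE": [44, 35, 30, 50, 28, 38, 33, 67],  # average age of household occupants
    })

    X = sm.add_constant(data[["DEP", "VEH", "INC", "AVEAGE"]])  # adds the intercept term a
    model = sm.OLS(data["TRIPS"], X).fit()

    print(model.params)    # estimates of a, b, c, d, e
    print(model.pvalues)   # p-values for each coefficient
    print(model.rsquared, model.rsquared_adj, model.fvalue, model.f_pvalue)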

5. Interpreting the Results: It is common for regression packages to provide some values in scientific notation, as shown for the p-values in Table 23. The coefficient d, showing the relationship of TRIPS with INC, is printed as 1.907E-05, which is read as 1.907 × 10^-5, or 0.00001907. All coefficients are of the expected sign and significantly different from 0 (at the 0.05 level) except for d. However, testing the intercept makes little sense. (The intercept value would be the number of trips for a household with 0 vehicles, 0 income, 0 average age, and 0 dependents, a most unlikely household.) The overall model is significant as shown by the F-ratio and its p-value, meaning that the model explains a significant amount of the variation in the dependent variable. This model should reliably explain 33% of the variance of household trip generation. Caution should be exercised when interpreting the significance of the R2 and the overall model because it is not uncommon to have a significant F-statistic when some of the coefficients in the equation are not significant. The analyst may want to consider recalibrating the model without the income variable because the coefficient d was not significant.

6. Conclusion and Discussion: Regression, particularly OLS regression, relies on several assumptions about the data, the nature of the relationships, and the results. Data are assumed to be interval or ratio scale. Independent variables generally are assumed to be measured without error, so all error is attributed to the model fit. Furthermore, independent variables should be independent of one another. This is a serious concern because the presence in the model of related independent variables, called multicollinearity, compromises the t-tests and confuses the interpretation of coefficients. Tests of this problem are available in most statistical software packages that include regression. Look for Variance-Inflation Factor (VIF) and/or Tolerance tests; most packages will have one or the other, and some will have both. (A brief sketch of such a check appears at the end of this discussion.)

In the example above, where PERS is divided into DEP and workers, knowing any two of the three variables allows calculation of the third. Including all three variables in the model would be a case of extreme multicollinearity and, logically, would make no sense. In this instance, because one variable is a linear combination of the other two, the calculations required (within the analysis program) to calibrate the model would actually fail. If the independent variables are simply highly correlated, the regression coefficients (at a minimum) may not have intuitive meaning. In general, equations or models with highly correlated independent variables are to be avoided; alternative models that examine one variable or the other, but not both, should be analyzed.

It is also important to analyze the error distributions. Several assumptions relate to the errors and their distributions (normality, constant variance, uncorrelated, etc.). In transportation planning, spatial variables and associations might become important; they require more elaborate constructs and often different estimation processes (e.g., Bayesian, maximum likelihood). (For more information about errors and error distributions, see NCHRP Project 20-45, Volume 2, Chapter 4.)

Other logical considerations also exist. For example, given the measurement units of the different variables, does the magnitude of the result of multiplying the coefficient and the measured variable make sense and/or have a reasonable effect on the predicted magnitude of the dependent variable? Perhaps more importantly, do the independent variables make sense? In this example, does it make sense that changes in the number of vehicles in the household would cause an increase or decrease in the number of trips? These are measures of operational significance that go beyond consideration of statistical significance, but are no less important.
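As an illustration of the multicollinearity screening mentioned above, the following is a minimal sketch in Python using the variance inflation factor from statsmodels. The data frame X is assumed to hold the candidate independent variables (the column names follow this example, but the data themselves are not reproduced in the report); this is a sketch of one possible check, not the procedure used by the original authors.

    # Variance-Inflation Factor (VIF) screen for candidate independent variables.
    # VIF near 1 indicates little collinearity; values above roughly 5 to 10 are usually a concern.
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    def vif_table(predictors: pd.DataFrame) -> pd.Series:
        """Return the VIF for each column of a data frame of candidate independent variables."""
        exog = sm.add_constant(predictors)  # include an intercept, as in the fitted model
        vifs = {col: variance_inflation_factor(exog.values, i)
                for i, col in enumerate(exog.columns) if col != "const"}
        return pd.Series(vifs)

    # Example call (X would be the household survey data with columns DEP, VEH, INC, AVEAGE):
    # print(vif_table(X[["DEP", "VEH", "INC", "AVEAGE"]]))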
7. Applications in Other Areas of Transportation Research: Regression is a very important technique across many areas of transportation research, including:
• Transportation Planning
– to include the other half of trip generation, e.g., predicting trip destinations as a function of employment levels by various types (factory, commercial), square footage of shopping center space, and so forth.
– to investigate the trip distribution stage of the four-step model (log transformation of the gravity model).
• Public Transportation—to predict loss/liability on subsidized freight rail lines (as a function of segment ton-miles, maintenance budgets and/or standards, operating speeds, etc.) for self-insurance computations.
• Pavement Engineering—to model pavement deterioration (or performance) as a function of easily monitored predictor variables.

Example 17: Traffic Operations; Regression Analysis
Area: Traffic operations
Method of Analysis: Regression analysis (developing a model to predict the values that some variable can take as a function of one or more other variables, when not all variables are assumed to be continuous)

1. Research Question/Problem Statement: An engineer is concerned about false capacity at intersections being designed in a specified district. False capacity occurs where a lane is dropped just beyond a signalized intersection. Drivers approaching the intersection who know that the lane is going to be dropped shortly afterward avoid the lane. However, engineers estimating the capacity and level of service of the intersection during design have no reliable way to estimate the percentage of traffic that will avoid the lane (the lane distribution).

Question/Issue
Develop a model that can be used to predict the values that a dependent variable can take as a function of changes in the values of the independent variables. In this particular instance, how can engineers make a good estimate of the lane distribution of traffic volume in the case of a lane drop just beyond an intersection? Can a linear model be developed that can be used to predict this distribution based on other variables?

The basic question is whether a linear relationship exists between the dependent variable (Y; in this case, the lane distribution percentage) and some independent variable(s) (X). The relationship can be expressed using the following equation:

Y = a + b \cdot X

where a is the intercept and b is the slope of the line (see NCHRP Project 20-45, Volume 2, Chapter 4, Section B).

2. Identification and Description of Variables: The dependent variable of interest in this example is the volume of traffic in each lane on the approach to a signalized intersection with a lane drop just beyond it. The traffic volumes by lane are converted into lane utilization factors (fLU) to be consistent with standard highway capacity techniques. The Highway Capacity Manual defines fLU using the following equation:

f_{LU} = \frac{v_g}{v_{g1} N}

where v_g is the flow rate in a lane group in vehicles per hour, v_{g1} is the flow rate in the lane with the highest flow rate of any in the group in vehicles per hour, and N is the number of lanes in the lane group.

The engineer thinks that lane utilization might be explained by one or more of 15 different factors, including the type of lane drop, the distance from the intersection to the lane drop, the taper length, and the heavy-vehicle percentage. All of the variables are continuous except the type of lane drop. The type of lane drop is used to categorize the sites.

3. Data Collection: The engineer locates 46 lane-drop sites in the area and collects data at these sites by means of video recording, recording for up to 3 hours at each site. The data are summarized in 15-minute periods, again to be consistent with standard highway capacity practice. For one type of lane-drop geometry, with two through lanes and an exclusive right-turn lane on the approach to the signalized intersection, the engineer ends up with 88 valid data points (some sites have provided more than one data point), covering 15 minutes each, to use in equation (model) development.

4. Specification of Analysis Technique and Data Analysis: Multiple (or multivariate) regression is a standard statistical technique for developing predictive equations. (More information on this topic is given in NCHRP Project 20-45, Volume 2, Chapter 4, Section B.) The engineer performs five steps to develop the predictive equation.

Step 1. The engineer examines plots of each of the 15 candidate variables versus fLU to see whether there is a relationship and to see what forms the relationships might take.

Step 2. The engineer screens all 15 candidate variables for multicollinearity. (Multicollinearity occurs when two variables are related to each other and essentially contribute the same information to the prediction.) Multicollinearity can lead to models with poor predicting power and other problems. The engineer examines the variables for multicollinearity by
• looking at plots of each of the 15 candidate variables against every other candidate variable;
• calculating the correlation coefficient for each of the 15 candidate independent variables against every other candidate variable; and
• using more sophisticated tests (such as the variance inflation factor) that are available in statistical software.

Step 3. The engineer reduces the set of candidate variables to eight. Next, the engineer uses statistical software to select variables and estimate the coefficients for each selected variable, assuming that the regression equation has a linear form. To select variables, the engineer employs forward selection (adding variables one at a time until the equation fit ceases to improve significantly) and backward elimination (starting with all candidate variables in the equation and removing them one by one until the equation fit starts to deteriorate). The equation fit is measured by R2 (for more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, under the heading "Descriptive Measures of Association Between X and Y"), which shows how well the equation fits the data on a scale from 0 to 1, and by other factors provided by statistical software. In this case, forward selection and backward elimination result in an equation with five variables:
• Drop: lane drop type, 0 or 1 depending on the type;
• Left: left-turn status, 0 or 1 depending on the types of left turns allowed;
• Length: the distance from the intersection to the lane drop, in feet ÷ 1,000;
• Volume: the average lane volume, in vehicles per hour per lane ÷ 1,000; and
• Sign: the number of signs warning of the lane drop.
Notice that the first two variables are discrete variables and had to assume a zero-or-one format to work within the regression model. Each of the five variables has a coefficient that is significantly different from zero at the 95% confidence level, as measured by a t-test. (For more information, see NCHRP Project 20-45, Volume 2, Chapter 4, Section B, "How Are t-statistics Interpreted?")

Step 4. Once an initial model has been developed, the engineer plots the residuals for the tentative equation to see whether the assumed linear form is correct. A residual is the difference, for each observation, between the prediction the equation makes for fLU and the actual value of fLU.
In this example, a plot of the predicted value versus the residual for each of the 88 data points shows a fan-like shape, which indicates that the linear form is not appropriate. (NCHRP Project 20-45, Volume 2, Chapter 4, Section B, Figure 6 provides examples of residual plots that are and are not desirable.) The engineer experiments with several other model forms, including non-linear equations that involve transformations of variables, before settling on a lognormal form that provides a good R2 value of 0.73 and a desirable shape for the residual plot.

Step 5. Finally, the engineer examines the candidate equation for logic and practicality, asking whether the variables make sense, whether the signs of the coefficients make sense, and whether the variables can be collected easily by design engineers. Satisfied that the answers to these questions are "yes," the final equation (model) can be expressed as follows:

f_{LU} = \exp(-0.539 - 0.218 \cdot Drop + 0.148 \cdot Left + 0.178 \cdot Length + 0.627 \cdot Volume - 0.105 \cdot Sign)

5. Interpreting the Results: The process described in this example results in a useful equation for estimating the lane utilization in a lane to be dropped, thereby avoiding the estimation of false capacity. The equation has five terms and is non-linear, which will make its use a bit challenging. However, the database is large, the equation fits the data well, and the equation is logical, all of which should boost the confidence of potential users. If potential users apply the equation within the ranges of the data used for the calibration, the equation should provide good predictions. Applying any model outside the range of the data on which it was calibrated increases the likelihood of an inaccurate prediction. (A short sketch showing how the final equation can be applied appears at the end of this example.)

6. Conclusion and Discussion: Regression is a powerful statistical technique that provides models engineers can use to make predictions in the absence of direct observation. Engineers tempted to use regression techniques should notice from this and other examples that the effort is substantial. Engineers using regression techniques should not skip any of the steps described above, as doing so may result in equations that provide poor predictions to users.

Analysts considering developing a regression model to help make needed predictions should not be intimidated by the process. Although there are many pitfalls in developing a regression model, analysts considering making the effort should also consider the alternative: how the prediction will be made in the absence of a model. In the absence of a model, predictions of important factors like lane utilization would be made using tradition, opinion, or simple heuristics. With guidance from NCHRP Project 20-45 and other texts, and with good software available to make the calculations, credible regression models often can be developed that perform better than the traditional prediction methods.

Because regression models developed by transportation engineers are often reused in later studies by others, the stakes are high. The consequences of a model that makes poor predictions can be severe in terms of suboptimal decisions. Lane utilization models often are employed in traffic studies conducted to analyze new development proposals. A model that under-predicts utilization in a lane to be dropped may mean that the development is turned down due to the anticipated traffic impacts or that the developer has to pay for additional and unnecessary traffic mitigation measures. On the other hand, a model that over-predicts utilization in a lane to be dropped may mean that the development is approved with insufficient traffic mitigation measures in place, resulting in traffic delays, collisions, and the need for later intervention by a public agency.

7. Applications in Other Areas of Transportation Research: Regression is used in almost all areas of transportation research, including:
• Transportation Planning—to create equations to predict trip generation and mode split.
• Traffic Safety—to create equations to predict the number of collisions expected on a particular section of road.
• Pavement Engineering/Materials—to predict long-term wear and condition of pavements.
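To show how the final model from Step 5 would be used in practice, the following is a minimal sketch in Python that evaluates the fitted equation for a hypothetical lane-drop approach. The input values are illustrative assumptions, not data from the study.

    # Apply the fitted lane-utilization model from Step 5 to a hypothetical lane-drop approach.
    import math

    def lane_utilization(drop: int, left: int, length_ft: float, volume_vphpl: float, signs: int) -> float:
        """Predicted lane utilization factor f_LU for a lane dropped just beyond the intersection."""
        length = length_ft / 1000.0       # the model uses distance in feet divided by 1,000
        volume = volume_vphpl / 1000.0    # the model uses average lane volume divided by 1,000
        return math.exp(-0.539 - 0.218 * drop + 0.148 * left
                        + 0.178 * length + 0.627 * volume - 0.105 * signs)

    # Hypothetical site: drop type 1, permitted left turns (1), lane drop 1,500 ft downstream,
    # average lane volume of 450 veh/h/ln, and 2 warning signs.
    print(round(lane_utilization(drop=1, left=1, length_ft=1500, volume_vphpl=450, signs=2), 3))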

Example 18: Transportation Planning; Logit and Related Analysis
Area: Transportation planning
Method of Analysis: Logit and related analysis (developing predictive models when the dependent variable is dichotomous—e.g., 0 or 1)

1. Research Question/Problem Statement: Transportation planners often utilize variations of the classic four-step transportation planning process for predicting travel demand. Trip generation, trip distribution, mode split, and trip assignment are used to predict traffic flows under a variety of forecasted changes in networks, population, land use, and controls. Mode split, deciding which mode of transportation a traveler will take, requires predicting mutually exclusive outcomes. For example, will a traveler utilize public transit or drive his or her own car?

Question/Issue
Can a linear model be developed that can be used to predict the probability that one of two choices will be made? In this example, the question is whether a household will use public transit (or not). Rather than being continuous (as in linear regression), the dependent variable is reduced to two categories, a dichotomous variable (e.g., yes or no, 0 or 1). Although the question is simple, the statistical modeling becomes sophisticated very quickly.

2. Identification and Description of Variables: Considering a typical, traditional urban area in the United States, it is reasonable to argue that the likelihood of taking public transit to work (Y) will be a function of income (X). Generally, more income means less likelihood of taking public transit. This can be modeled using the following equation:

Y_i = \beta_1 + \beta_2 X_i + u_i

where X_i = family income, Y = 1 if the family uses public transit, and Y = 0 if the family does not use public transit.

3. Data Collection: These data normally are obtained from travel surveys conducted at the local level (e.g., by a metropolitan area or specific city), although the agency that collects the data often is a state DOT.

4. Specification of Analysis Techniques and Data Analysis: In this example the dependent variable is dichotomous and is modeled as a linear function of an explanatory variable. Consider the equation E(Y_i | X_i) = \beta_1 + \beta_2 X_i. Notice that if P_i = the probability that Y = 1 (the household utilizes transit), then (1 − P_i) = the probability that Y = 0 (the household does not utilize transit). This has been called a linear probability model. Note that within this expression, i refers to a household. Thus, Y has the distribution shown in Table 25.

Table 25. Distribution of Y.

Values that Y Takes | Probability | Meaning/Interpretation
1                   | Pi          | Household uses transit
0                   | 1 − Pi      | Household does not use transit
Total               | 1.0         |

Any attempt to estimate this relationship with standard (OLS) regression is saddled with many problems (e.g., non-normality of errors, heteroscedasticity, and the possibility that the predicted Y will fall outside the range 0 to 1, to say nothing of very poor R2 values).

An alternative formulation for estimating Pi, the cumulative logistic distribution, is expressed by the following equation:

Pi = 1 / (1 + e^-(β1 + β2·Xi))

This function can be plotted as a lazy Z-curve: at low values of X (low household income) the probability is near 1, and it falls toward 0 as income increases (Figure 16). Notice that, even at 0 income, not all households use transit. The curve is said to be asymptotic to 1 and 0. The value of Pi varies between 1 and 0 in relation to income, X.

Manipulating the definition of the cumulative logistic distribution from above,

Pi·(1 + e^-(β1 + β2·Xi)) = 1
Pi·e^-(β1 + β2·Xi) = 1 - Pi
e^-(β1 + β2·Xi) = (1 - Pi) / Pi
and
e^(β1 + β2·Xi) = Pi / (1 - Pi)

The final expression is the ratio of the probability of utilizing public transit divided by the probability of not utilizing public transit. It is called the odds ratio. Next, taking the natural log of both sides (and reversing) results in the following equation:

Li = ln[Pi / (1 - Pi)] = β1 + β2·Xi

L is called the logit, and this is called a logit model. The left side is the natural log of the odds ratio. Unfortunately, this odds ratio is meaningless for individual households, where the probability is either 0 or 1 (utilize or not utilize).

Figure 16. Plot of cumulative logistic distribution showing a lazy Z-curve.
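To make the algebra above concrete, the short sketch below evaluates the cumulative logistic function, the odds ratio, and the logit for a few income values. The coefficient values and income levels are purely illustrative assumptions, and Python with NumPy is assumed; this is a minimal sketch, not part of the original example.

```python
import numpy as np

# Illustrative coefficients only (not estimated from any real survey)
b1, b2 = 1.0, -0.00004

def transit_probability(income):
    """Cumulative logistic function: P = 1 / (1 + exp(-(b1 + b2*X)))."""
    return 1.0 / (1.0 + np.exp(-(b1 + b2 * income)))

for x in [6_000, 20_000, 40_000, 75_000]:   # illustrative household incomes ($)
    p = transit_probability(x)
    odds = p / (1.0 - p)                    # odds of using transit
    logit = np.log(odds)                    # equals b1 + b2*x by construction
    print(f"X=${x:>6,}  P={p:.3f}  odds={odds:.3f}  logit={logit:.3f}")
```

Running the loop shows the lazy Z-curve behavior described above: the probability is near 1 at low incomes and falls toward 0 as income rises.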

If the analyst uses standard OLS regression on this equation, with data for individual households, there is a problem: when Pi happens to equal either 0 or 1 (which is all the time!), the odds ratio will, as a result, equal either 0 or infinity (and the logarithm will be undefined) for all observations. However, by using groups of households the problem can be mitigated.

Table 26 presents data based on a survey of 701 households, more than half of which (380) use transit. The income data are recorded for intervals; here, interval mid-points (Xj) are shown. The number of households in each income category is tallied (Nj), as is the number of households in each income category that utilizes public transit (nj). It is important to note that while there are more than 700 households (i), the number of observations (categories, j) is only 13.

Using these data, for each income bracket, the probability of taking transit can be estimated as follows:

Pj = nj / Nj

This equation is an expression of relative frequency (i.e., it expresses the proportion in income bracket "j" using transit). An examination of Table 26 shows clearly that there is a progression of these relative frequencies, with higher income brackets showing lower relative frequencies, just as was hypothesized. We can calculate the odds ratio for each income bracket listed in Table 26 and estimate the following logit function with OLS regression:

Lj = ln[(nj/Nj) / (1 - nj/Nj)] = β1 + β2·Xj

The results of this regression are shown in Table 27. The results also can be expressed as an equation:

Log odds ratio = 1.037 - 0.00003863·X

5. Interpreting the Results: This model provides a very good fit. The estimates of the coefficients can be inserted in the original cumulative logistic function to directly estimate the probability of using transit for any given X (income level). Indeed, the logistic graph in Figure 16 is produced with the estimated function.

Table 26. Data examined by groups of households.

Xj ($)   | Nj (Households) | nj (Utilizing Transit) | Pj (= nj/Nj)
$6,000   | 40              | 30                     | 0.750
$8,000   | 55              | 39                     | 0.709
$10,000  | 65              | 43                     | 0.662
$13,000  | 88              | 58                     | 0.659
$15,000  | 118             | 69                     | 0.585
$20,000  | 81              | 44                     | 0.543
$25,000  | 70              | 33                     | 0.471
$30,000  | 62              | 25                     | 0.403
$35,000  | 40              | 16                     | 0.400
$40,000  | 30              | 11                     | 0.367
$50,000  | 22              | 6                      | 0.273
$60,000  | 18              | 4                      | 0.222
$75,000  | 12              | 2                      | 0.167
Total    | 701             | 380                    |
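The grouped-data estimation just described can be reproduced in a few lines of code. The sketch below computes the relative frequencies and empirical log odds for each income bracket in Table 26 and fits Lj = β1 + β2·Xj by ordinary least squares. Python with NumPy is assumed, and the unweighted fit is a simplification; weighting each bracket by Nj would be a common refinement.

```python
import numpy as np

# Income bracket mid-points (Xj), households (Nj), and transit users (nj) from Table 26
X = np.array([6000, 8000, 10000, 13000, 15000, 20000, 25000,
              30000, 35000, 40000, 50000, 60000, 75000], dtype=float)
N = np.array([40, 55, 65, 88, 118, 81, 70, 62, 40, 30, 22, 18, 12], dtype=float)
n = np.array([30, 39, 43, 58, 69, 44, 33, 25, 16, 11, 6, 4, 2], dtype=float)

P = n / N                              # relative frequency of transit use in each bracket
L = np.log(P / (1.0 - P))              # empirical log odds (logit) in each bracket

# Ordinary least squares fit of L = b1 + b2*X (polyfit returns [slope, intercept])
b2, b1 = np.polyfit(X, L, deg=1)
print(f"b1 (intercept) = {b1:.3f}")
print(f"b2 (slope)     = {b2:.8f}")

# Predicted probability of transit use at an illustrative income level
x0 = 25_000
p0 = 1.0 / (1.0 + np.exp(-(b1 + b2 * x0)))
print(f"Predicted P(transit) at ${x0:,}: {p0:.3f}")
```

The estimated intercept and slope from this sketch can then be compared with the values reported in Table 27.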

Table 27. Results of OLS regression.

Coefficients       | t-values (statistics) | p-values | Measures of "Fit"
β1 = 1.037         | 12.156                | 0.000    | R = 0.980
β2 = -0.00003863   | -16.407               | 0.000    | R² = 0.961; adjusted R² = 0.957

6. Conclusion and Discussion: This approach to estimation is not without further problems. For example, the N within each income bracket needs to be sufficiently large that the relative frequency (and therefore the resulting odds ratio) is accurately estimated. Many statisticians would say that a minimum of 25 is reasonable.

This approach also is limited by the fact that only one independent variable is used (income). Common sense suggests that the right-hand side of the function could logically be expanded to include more than one predictor variable (more Xs). For example, it could be argued that educational level might act, along with income, to account for the probability of using transit. However, combining predictor variables severely impinges on the categories (the j) used in this OLS regression formulation. To illustrate, assume that five educational categories are used in addition to the 13 income brackets (e.g., Grade 8 or less, Grade 9 to high school graduate, some college, BA or BS degree, and graduate degree). For such an OLS regression analysis to work, data would be needed for 5 × 13, or 65, categories.

Ideally, other travel modes should also be considered. In the example developed here, only transit and not-transit are considered. In some locations it is entirely reasonable to examine private auto versus bus versus bicycle versus subway versus light rail (involving five modes, not just two). This notion of a polychotomous logistic regression is possible. However, five modes cannot be estimated with the OLS regression technique employed above. The logit above is a variant of the binomial distribution, and the polychotomous logistic model is a variant of the multinomial distribution (see NCHRP Project 20-45, Volume 2, Chapter 5). Estimation of these more advanced models requires maximum likelihood methods (as described in NCHRP Project 20-45, Volume 2, Chapter 5).

Other model variants are based upon other cumulative probability distributions. For example, there is the probit model, in which the normal cumulative distribution function is used. The probit model is very similar to the logit model, but it is more difficult to estimate.

7. Applications in Other Areas of Transportation Research: Applications of logit and related models abound within transportation studies. In any situation in which human behavior is relegated to discrete choices, this category of models may be applied. Examples in other areas of transportation research include:
• Transportation Planning—to model any "choice" issue, such as shopping destination choices.
• Traffic Safety—to model dichotomous responses (e.g., did a motorist slow down or not) in response to traffic control devices.
• Highway Design—to model public reactions to proposed design solutions (e.g., support or not support proposed road diets, installation of roundabouts, or use of traffic calming techniques).

Example 19: Public Transit; Survey Design and Analysis

Area: Public transit
Method of Analysis: Survey design and analysis (organizing survey data for statistical analysis)

examples of effective experiment Design and Data analysis in transportation research 71 2. Identification and Description of Variables: Two types of variables are needed for this analysis. The first is data on the characteristics of the riders, such as gender, age, and access to an automobile. These data are discrete variables. The second is data on the riders’ stated responses to proposed changes in the fare or service characteristics. These data also are treated as discrete variables. Although some, like the fare, could theoretically be continuous, they are normally expressed in discrete increments (e.g., $1.00, $1.25, $1.50). 3. Data Collection: These data are normally collected by agencies conducting a survey of the transit users. The initial step in the experiment design is to choose the variables to be collected for each of these two data sets. The second step is to determine how to categorize the data. Both steps are generally based on past experience and common sense. Some of the variables used to describe the characteristics of the transit user are dichotomous, such as gender (male or female) and access to an automobile (yes or no). Other variables, such as age, are grouped into discrete categories within which the transit riding characteristics are similar. For example, one would not expect there to be a difference between the transit trip needs of a 14-year-old student and a 15-year-old student. Thus, the survey responses of these two age groups would be assigned to the same age category. However, experience (and common sense) leads one to differentiate a 19-year-old transit user from a 65-year-old transit user, because their purposes for taking trips and their perspectives on the relative value of the fare and the service components are both likely to be different. Obtaining user responses to changes in the fare or service is generally done in one of two ways. The first is to make a statement and ask the responder to mark one of several choices: strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree. The number of statements used in the survey depends on how many parameter changes are being contemplated. Typical statements include: 1. I would increase the number of trips I make each month if the fare were reduced by $0.xx. 2. I would increase the number of trips I make each month if I could purchase a monthly pass. 3. I would increase the number of trips I make each month if the waiting time at the stop were reduced by 10 minutes. 4. I would increase the number of trips I make each month if express services were available from my origin to my destination. The second format is to propose a change and provide multiple choices for the responder. Typical questions for this format are: 1. If the fare were increased by $0.xx per trip I would: a) not change the number of trips per month b) reduce the non-commute trips c) reduce both the commute and non-commute trips d) switch modes 2. If express service were offered for an additional $0.xx per trip I would: a) not change the number of trips per month on this local service b) make additional trips each month c) shift from the local service to the express service Question/Issue Use and analysis of data collected in a survey. Results from a survey of transit users are used to estimate the change in ridership that would result from a change in the service or fare. 1. 
Research Question/Problem Statement: The transit director is considering changes to the fare structure and the service characteristics of the transit system. To assist in determining which changes would be most effective or efficient, a survey of the current transit riders is developed.

These surveys generally are administered by handing a survey form to people as they enter the transit vehicle and collecting the forms as people depart the vehicle. The surveys also can be administered by mail, by telephone, or in a face-to-face interview.

In constructing the questions, care should be taken to use terms with which the respondents will be familiar. For example, if the system does not currently offer "express" service, this term will need to be defined in the survey. Other technical terms should be avoided. Similarly, the word "mode" is often used by transportation professionals but is not commonly used by the public at large. The length of a survey is almost always an issue as well. To avoid asking too many questions, each question needs to be reviewed to see if it is really necessary and will produce useful data (as opposed to just being something that would be nice to know).

4. Specification of Analysis Technique and Data Analysis: The results of these surveys often are displayed in tables or in frequency distribution diagrams (see also Example 1 and Example 2). Table 28 lists responses to a sample question posed in the form of a statement, and Figure 17 shows the frequency diagram for these data. Similar presentations can be made for any of the groupings included in the first type of variables discussed above. For example, if gender is included as a Type 1 question, the results might appear as shown in Table 29, with the corresponding frequency diagram in Figure 18. Presentations of the data can be made for any combination of the discrete variable groups included in the survey. For example, to display responses of female users over 65 years old, all of the survey forms on which these two characteristics (female and over 65 years old) are checked could be extracted and recorded in a table and shown in a frequency diagram.

Table 28. Responses to the sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."

                | Strongly Agree | Agree | Neither Agree nor Disagree | Disagree | Strongly Disagree
Total responses | 450            | 600   | 300                        | 400      | 100

Figure 17. Frequency diagram for total responses to the sample statement.
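As a simple illustration of how such response counts can be organized, the sketch below tabulates the Table 28 responses and converts them to percentages, the same quantities one would plot in a frequency diagram such as Figure 17. Plain Python is assumed; this is an illustrative sketch only.

```python
# Responses to the sample statement (Table 28)
responses = {
    "Strongly agree": 450,
    "Agree": 600,
    "Neither agree nor disagree": 300,
    "Disagree": 400,
    "Strongly disagree": 100,
}

total = sum(responses.values())             # 1,850 respondents in all
print(f"Total responses: {total}")
for category, count in responses.items():
    share = 100.0 * count / total           # percentage of all respondents
    print(f"{category:<28s} {count:>5d}  ({share:4.1f}%)")
```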

5. Interpreting the Results: Survey data can be used to compare the responses of different groups of transit users to fare or service changes. This flexibility can be important in determining which changes would impact various segments of transit users. The information can be used to evaluate various fare and service options being considered and allows the transit agency to design promotions to obtain the greatest increase in ridership. For example, by creating frequency diagrams to display the responses to statements 2, 3, and 4 listed in Section 3, the engineer can compare the impact of changing the fare versus changing the headway or providing express services in the corridor.

Organizing response data according to different characteristics of the user produces contingency tables like the one illustrated for males and females (Table 29). This table format can be used to conduct chi-square analysis to determine whether there is any statistically significant difference among the various groups. (Chi-square analysis is described in more detail in Example 4.)

6. Conclusions and Discussion: This example illustrates how to obtain and present quantitative information using surveys. Although survey results provide reasonably good estimates of the relative importance users place on different transit attributes (fare, waiting time, hours of service, etc.), the magnitude of users' responses about how often they would use the system often is overstated. Experience shows that what users say they would do (their stated preference) generally is different from what they actually do (their revealed preference).

Table 29. Contingency table showing responses by gender to the sample statement, "I would increase the number of trips I make each month if the fare were reduced by $0.xx."

                | Strongly Agree | Agree | Neither Agree nor Disagree | Disagree | Strongly Disagree
Male            | 200            | 275   | 200                        | 200      | 70
Female          | 250            | 325   | 100                        | 200      | 30
Total responses | 450            | 600   | 300                        | 400      | 100

Figure 18. Frequency diagram showing responses by gender to the sample statement.
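The contingency table just shown (Table 29) lends itself directly to the chi-square test of independence mentioned above. The sketch below, assuming Python with SciPy, tests whether the distribution of responses differs by gender; it is a minimal illustration, not part of the original example.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: Male, Female; columns: Strongly agree ... Strongly disagree (Table 29)
table = np.array([
    [200, 275, 200, 200, 70],
    [250, 325, 100, 200, 30],
])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Response pattern differs by gender at the 95% confidence level.")
else:
    print("No statistically significant difference between male and female responses.")
```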

74 effective experiment Design and Data analysis in transportation research In this example, 1,050 of the 1,850 respondents (57%) have responded that they would use the bus service more frequently if the fare were decreased by $0.xx. Five hundred respondents (27%) have indicated that they would not use the bus service more frequently, and 300 respondents (16%) have indicated that they are not sure if they would change their bus use frequency. These percentages show the stated preferences of the users. The engineer does not yet know the revealed preferences of the users, but experience suggests that it is unlikely that 57% of the riders would actually increase the number of trips they make. 7. Applications in Other Area in Transportation: Survey design and analysis techniques can be used to collect and present data in many areas of transportation research, including: • Transportation Planning—to assess public response to a proposal to enact a local motor fuel tax to improve road maintenance in a city or county. • Traffic Operations—to assess public response to implementing road diets (e.g., 4-lane to 3-lane conversions) on different corridors in a city. • Highway Design—to assess public response to proposed alternative cross-section designs, such as a boulevard design versus an undivided multilane design in a corridor. Example 20: Traffic Operations; Simulation Area: Traffic operations Method of Analysis: Simulation (using field data to simulate, or model, operations or outcomes) 1. Research Question/Problem Statement: A team of engineers wants to determine whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes. There is no way to collect field data to compare alternative intersection designs at a particular site. Macroscopic traffic operations models like those in the Highway Capacity Manual do a good job of estimating delay at specific points but are unable to provide travel time estimates for unconventional designs that consist of several smaller intersections and road segments. Microscopic simulation models measure the behaviors of individual vehicles as they traverse the highway network. Such simulation models are therefore very flexible in the types of networks and measures that can be examined. The team in this example turns to a simulation model to determine how other intersection designs might work. Question/Issue Developing and using a computer simulation model to examine operations in a computer environment. In this example, a traffic operations simulation model is used to show whether one or more unconventional intersection designs will produce lower travel times than a conventional design at typical intersections for a given number of lanes. 2. Identification and Description of Variables: The engineering team simulates seven different intersections to provide the needed scope for their findings. At each intersection, the team examines three different sets of traffic volumes: volumes from the evening (p.m.) peak hour, a typical midday off-peak hour, and a volume that is 15% greater than the p.m. peak hour to represent future conditions. At each intersection, the team models the current conventional intersection geometry and seven unconventional designs: the quadrant roadway, median U-turn, superstreet, bowtie, jughandle, split intersection, and continuous flow intersection. 
Traffic simulation models break the roadway network into nodes (intersections) and links (segments between intersections). Therefore, the engineering team has to design each of the

examples of effective experiment Design and Data analysis in transportation research 75 alternatives at each test site in terms of numbers of lanes, lane lengths, and such, and then faithfully translate that geometry into links and nodes that the simulation model can use. For each combination of traffic volume and intersection design, the team uses software to find the optimum signal timing and uses that during the simulation. To avoid bias, the team keeps all other factors (e.g., network size, numbers of lanes, turn lane lengths, truck percentages, average vehicle speeds) constant in all simulation runs. 3. Data Collection: The field data collection necessary in this effort consists of noting the current intersection geometries at the seven test intersections and counting the turning movements in the time periods described above. In many simulation efforts, it is also necessary to collect field data to calibrate and validate the simulation model. Calibration is the process by which simulation output is compared to actual measurements for some key measure(s) such as travel time. If a difference is found between the simulation output and the actual measurement, the simulation inputs are changed until the difference disappears. Validation is a test of the calibrated simulation model, comparing simulation output to a previously unused sample of actual field measurements. In this example, however, the team determines that it is unnecessary to collect calibration and validation data because a recent project has successfully calibrated and validated very similar models of most of these same unconventional designs. The engineer team uses the CORSIM traffic operations simulation model. Well known and widely used, CORSIM models the movement of each vehicle through a specified network in small time increments. CORSIM is a good choice for this example because it was originally designed for problems of this type, has produced appropriate results, has excellent animation and other debugging features, runs quickly in these kinds of cases, and is well-supported by the software developers. The team makes two CORSIM runs with different random number seeds for each combina- tion of volume and design at each intersection, or 48 runs for each intersection altogether. It is necessary to make more than one run (or replication) of each simulation combination with different random number seeds because of the randomness built into simulation models. The experiment design in this case allows the team to reduce the number of replications to two; typical practice in simulations when one is making simple comparisons between two variables is to make at least 5 to 10 replications. Each run lasts 30 simulated minutes. Table 30 shows the simulation data for one of the seven intersections. The lowest travel time produced in each case is bolded. Notice that Table 30 does not show data for the bowtie design. That design became congested (gridlocked) and produced essentially infinite travel times for this intersection. Handling overly congested networks is a difficult problem in many efforts and with several different simulation software packages. The best current advice is for analysts to not push their networks too hard and to scan often for gridlock. 4. Specification of Analysis Technique and Data Analysis: The experiment assembled in this example uses a factorial design. (Factorial design also is discussed in Example 11.) The team analyzes the data from this factorial experiment using analysis of variance (ANOVA). 
Table 30. Simulation results for different designs and time of day (total travel time in vehicle-hours, average of two simulation runs).

Time of Day | Conventional | Quadrant | Median U | Superstreet | Jughandle | Split | Continuous
Midday      | 67           | 64       | 61       | 74          | 63        | 59*   | 75
P.M. peak   | 121          | 95*      | 119      | 179         | 139       | 114   | 106
Peak + 15%  | 170          | 135*     | 145      | 245         | 164       | 180   | 142

*Lowest total travel time.

Because

76 effective experiment Design and Data analysis in transportation research the experimenter has complete control in a simulation, it is common to use efficient designs like factorials and efficient analysis methods like ANOVA to squeeze all possible information out of the effort. Statistical tests comparing the individual mean values of key results by factor are common ways to follow up on ANOVA results. Although ANOVA will reveal which factors make a significant contribution to the overall variance in the dependent variable, means tests will show which levels of a significant factor differ from the other levels. In this example, the team uses Tukey’s means test, which is available as part of the battery of standard tests accom- panying ANOVA in statistical software. (For more information about ANOVA, see NCHRP Project 20-45, Volume 2, Chapter 4, Section A.) 5. Interpreting the Results: For the data shown in Table 30, the ANOVA reveals that the volume and design factors are statistically significant at the 99.99% confidence level. Furthermore, the interaction between the volume and design factors also is statistically significant at the 99.99% level. The means tests on the design factors show that the quadrant roadway is significantly different from (has a lower overall travel time than) the other designs at the 95% level. The next- best designs overall are the median U-turn and the continuous flow intersection; these are not statistically different from each other at the 95% level. The third tier of designs consists of the conventional and the split, which are statistically different from all others at the 95% level but not from each other. Finally, the jughandle and the superstreet designs are statistically different from each other and from all other designs at the 95% level according to the means test. Through the simulation, the team learns that several designs appear to be more efficient than the conventional design, especially at higher volume levels. From the results at all seven intersections, the team sees that the quadrant roadway and median U-turn designs generally lead to the lowest travel times, especially with the higher volume levels. 6. Conclusion and Discussion: Simulation is an effective tool to analyze traffic operations, as at the seven intersections of interest in this example. No other tool would allow such a robust comparison of many different designs and provide the results for travel times in a larger net- work rather than delays at a single spot. The simulation conducted in this example also allows the team to conduct an efficient factorial design, which maximizes the information provided from the effort. Simulation is a useful tool in research for traffic operations because it • affords the ability to conduct randomized experiments, • allows the examination of details that other methods cannot provide, and • allows the analysis of large and complex networks. In practice, simulation also is popular because of the vivid and realistic animation output provided by common software packages. The superb animations allow analysts to spot and treat flaws in the design or model and provide agencies an effective tool by which to share designs with politicians and the public. Although simulation results can sometimes be surprising, more often they confirm what the analysts already suspect based on simpler analyses. 
In the example described here, the analysts suspected that the quadrant roadway and median U-turn designs would perform well because these designs had performed well in prior Highway Capacity Manual calculations. In many studies, simulations provide rich detail and vivid animation but no big surprises. 7. Applications in Other Areas of Transportation Research: Simulations are critical analysis methods in several areas of transportation research. Besides traffic operations, simulations are used in research related to: • Maintenance—to model the lifetime performance of traffic signs. • Traffic Safety – to examine vehicle performance and driver behaviors or performance. – to predict the number of collisions from a new roadway design (potentially, given the recent development of the FHWA SSAM program).
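Before moving to the next example, here is a minimal sketch of the kind of factor comparison described in Section 4 of this example, applied to the Table 30 travel times. It assumes Python with SciPy and statsmodels, and it treats the three volume levels as replicate observations for each design, which is a simplification of the full two-factor (volume by design) ANOVA the team actually used; the numbers printed are illustrative of the workflow, not of the study's findings.

```python
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Total travel times (vehicle-hours) from Table 30: rows are midday, p.m. peak, peak + 15%
designs = ["Conventional", "Quadrant", "Median U", "Superstreet",
           "Jughandle", "Split", "Continuous"]
times = np.array([
    [67, 64, 61, 74, 63, 59, 75],
    [121, 95, 119, 179, 139, 114, 106],
    [170, 135, 145, 245, 164, 180, 142],
], dtype=float)

# Simplified one-way ANOVA: do mean travel times differ among designs?
groups = [times[:, j] for j in range(times.shape[1])]
f_stat, p_value = f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")

# Tukey's means test comparing designs pairwise on the same observations
values = times.T.ravel()                          # observations grouped by design
labels = np.repeat(designs, times.shape[0])       # design label for each observation
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```

In practice the team's factorial ANOVA also includes the volume factor and the volume-by-design interaction, so this one-way sketch should be read only as a template for the software calls involved.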

examples of effective experiment Design and Data analysis in transportation research 77 Example 21: Traffic Safety; Non-parametric Methods Area: Traffic safety Method of Analysis: Non-parametric methods (methods used when data do not follow assumed or conventional distributions, such as when comparing median values) 1. Research Question/Problem Statement: A city traffic engineer has been receiving many citizen complaints about the perceived lack of safety at unsignalized midblock crosswalks. Apparently, some motorists seem surprised by pedestrians in the crosswalks and do not yield to the pedestrians. The engineer believes that larger and brighter warning signs may be an inexpensive way to enhance safety at these locations. Question/Issue Determine whether some treatment has an effect when data to be tested do not follow known distributions. In this example, a nonparametric method is used to determine whether larger and brighter warning signs improve pedestrian safety at unsignalized midblock crosswalks. The null hypothesis and alternative hypothesis are stated as follows: Ho: There is no difference in the median values of the number of conflicts before and after a treatment. Ha: There is a difference in the median values. 2. Identification and Description of Variables: The engineer would like to collect collision data at crosswalks with improved signs, but it would take a long time at a large sample of crosswalks to collect a reasonable sample size of collisions to answer the question. Instead, the engineer collects data for conflicts, which are near-collisions when one or both of the involved entities brakes or swerves within 2 seconds of a collision to avoid the collision. Research literature has shown that conflicts are related to collisions, and because conflicts are much more numerous than collisions, it is much quicker to collect a good sample size. Conflict data are not nearly as widely used as collision data, however, and the underlying distribution of conflict data is not clear. Thus, the use of non-parametric methods seems appropriate. 3. Data Collection: The engineer identifies seven test crosswalks in the city based on large pedes- trian volumes and the presence of convenient vantage points for observing conflicts. The engi- neering staff collects data on traffic conflicts for 2 full days at each of the seven crosswalks with standard warning signs. The engineer then has larger and brighter warning signs installed at the seven sites. After waiting at least 1 month at each site after sign installation, the staff again collects traffic conflicts for 2 full days, making sure that weather, light, and as many other conditions as possible are similar between the before-and-after data collection periods at each site. 4. Specification of Analysis Technique and Data Analysis: A nonparametric statistical test is an efficient way to analyze data when the underlying distribution is unclear (as in this example using conflict data) and when the sample size is small (as in this example with its small number of sites). Several such tests, such as the sign test and the Wilcoxon signed-rank (Wilcoxon rank-sum) test are plausible in this example. 
(For more information about nonparametric tests, see NCHRP Project 20-45, Volume 2, Chapter 6, Section D, “Hypothesis About Population Medians for Independent Samples.” ) The decision is made to use the Wilcoxon signed-rank test because it is a more powerful test for paired numerical measurements than other tests, and this example uses paired (before-and-after) measurements. The sign test is a popular nonparametric test for paired data but loses information contained in numerical measurements by reducing the data to a series of positive or negative signs.
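As an illustration of how the chosen test can be run in software, the sketch below applies the Wilcoxon signed-rank test to the before-and-after conflict counts tabulated in Table 31 below. Python with SciPy is assumed; because the differences contain ties, SciPy falls back to a normal approximation, so its p-value may differ slightly from a table-based hand calculation.

```python
from scipy.stats import wilcoxon

# Conflict counts at the seven crosswalks (Table 31)
standard = [170, 39, 35, 32, 32, 19, 45]      # standard warning signs (before)
larger   = [155, 26, 33, 29, 25, 31, 61]      # larger and brighter signs (after)

# One-sided test: do larger signs reduce conflicts (before greater than after)?
stat, p_value = wilcoxon(standard, larger, alternative="greater")
print(f"Wilcoxon statistic = {stat}, p-value = {p_value:.3f}")
if p_value < 0.05:
    print("Conflicts are significantly lower with the larger and brighter signs.")
else:
    print("No statistically significant reduction in conflicts was found.")
```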

78 effective experiment Design and Data analysis in transportation research Having decided on the Wilcoxon signed-rank test, the engineer arranges the data (see Table 31). The third row of the table is the difference between the frequencies of the two conflict measurements at each site. The last row shows the rank order of the sites from lowest to highest based on the absolute value of the difference. Site 3 has the least difference (35 - 33 = 2) while Site 7 has the greatest difference (54 - 61 = -16). The Wilcoxon signed-rank test ranks the differences from low to high in terms of absolute values. In this case, that would be 2, 3, 7, 7, 12, 15, and 16. The test statistic, x, is the sum of the ranks that have positive differences. In this example, x = 1 + 2 + 3.5 + 3.5 + 6 = 16. Notice that all but the sixth and seventh ranked sites had positive differences. Notice also that the tied differences were assigned ranks equal to the average of the ranks they would have received if they were just slightly different from each other. The engineer then consults a table for the Wilcoxon signed-rank test to get a critical value against which to compare. (Such a table appears in NCHRP Project 20-45, Volume 2, Appendix C, Table C-8.) The standard table for a sample size of seven shows that the critical value for a one-tailed test (testing whether there is an improvement) with a confidence level of 95% is x = 24. 5. Interpreting the Results: Because the calculated value (x = 16) is less than the critical value (x = 24), the engineer concludes that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number of conflicts recorded with larger and brighter signs. 6. Conclusion and Discussion: Nonparametric tests do not require the engineer to make restric- tive assumptions about an underlying distribution and are therefore good choices in cases like this, in which the sample size is small and the data collected do not have a familiar underlying distribution. Many nonparametric tests are available, so analysts should do some reading and searching before settling on the best one for any particular case. Once a nonparametric test is determined, it is usually easy to apply. This example also illustrates one of the potential pitfalls of statistical testing. The engineer’s conclusion is that there is not a statistically significant difference between the number of conflicts recorded with standard signs and the number of conflicts recorded with larger and brighter signs. That conclusion does not necessarily mean that larger and brighter signs are a bad idea at sites similar to those tested. Notice that in this experiment, larger and brighter signs produced lower conflict frequencies at five of the seven sites, and the average number of conflicts per site was lower with the larger and brighter signs. Given that signs are relatively inexpensive, they may be a good idea at sites like those tested. A statistical test can provide useful information, especially about the quality of the experiment, but analysts must be careful not to interpret the results of a statistical test too strictly. In this example, the greatest danger to the validity of the test result lies not in the statistical test but in the underlying before-and-after test setup. 
Table 31. Number of conflicts recorded during each (equal) time period at each site.

                            | Site 1 | Site 2 | Site 3 | Site 4 | Site 5 | Site 6 | Site 7
Standard signs              | 170    | 39     | 35     | 32     | 32     | 19     | 45
Larger and brighter signs   | 155    | 26     | 33     | 29     | 25     | 31     | 61
Difference                  | 15     | 7      | 2      | 3      | 7      | -12    | -16
Rank of absolute difference | 6      | 3.5    | 1      | 2      | 3.5    | 5      | 7

For the results to be valid, it is necessary that the only important change that affects conflicts at the test sites during data collection be

examples of effective experiment Design and Data analysis in transportation research 79 the new signs. The engineer has kept the duration short between the before-and-after data collection periods, which helps minimize the chances of other important changes. However, if there is any reason to suspect other important changes, these test results should be viewed skeptically and a more sophisticated test strategy should be employed. 7. Applications in Other Areas of Transportation Research: Nonparametric tests are helpful when researchers are working with small sample sizes or sample data wherein the underlying distribution is unknown. Examples of other areas of transportation research in which non- parametric tests may be applied include: • Transportation Planning, Public Transportation—to analyze data from surveys and questionnaires when the scale of the response calls into question the underlying distribution. Such data are often analyzed in transportation planning and public transportation. • Traffic Operations—to analyze small samples of speed or volume data. • Structures, Pavements—to analyze quality ratings of pavements, bridges, and other trans- portation assets. Such ratings also use scales. Resources The examples used in this report have included references to the following resources. Researchers are encouraged to consult these resources for more information about statistical procedures. Freund, R. J. and W. J. Wilson (2003). Statistical Methods. 2d ed. Burlington, MA: Academic Press. See page 256 for a discussion of Tukey’s procedure. Kutner, M. et al. (2005). Applied Linear Statistical Models. 5th ed. Boston: McGraw-Hill. See page 746 for a discussion of Tukey’s procedure. NCHRP CD-22: Scientific Approaches to Transportation Research, Vol. 1 and 2. 2002. Transpor- tation Research Board of the National Academies, Washington, D.C. This two-volume electronic manual developed under NCHRP Project 20-45 provides a comprehensive source of information on the conduct of research. The manual includes state-of-the-art techniques for problem state- ment development; literature searching; development of the research work plan; execution of the experiment; data collection, management, quality control, and reporting of results; and evaluation of the effectiveness of the research, as well as the requirements for the systematic, pro- fessional, and ethical conduct of transportation research. For readers’ convenience, the references to NCHRP Project 20-45 from the various examples contained in this report are summarized here by topic and location in NCHRP CD-22. More information about NCHRP CD-22 is available at http://www.trb.org/Main/Blurbs/152122.aspx. • Analysis of Variance (one-way ANOVA and two-way ANOVA): See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 113, 119–31). • Assumptions for residual errors: See Volume 2, Chapter 4. • Box plots; Q-Q plots: See Volume 2, Chapter 6, Section C. • Chi-square test: See Volume 2, Chapter 6, Sections E (Chi-Square Test for Independence) and F. • Chi-square values: See Volume 2, Appendix C, Table C-2. • Computations on unbalanced designs and multi-factorial designs: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31). • Confidence intervals: See Volume 2, Chapter 4. • Correlation coefficient: See Volume 2, Appendix A, Glossary, Correlation Coefficient. • Critical F-value: See Volume 2, Appendix C, Table C-5. 
• Desirable and undesirable residual plots (scatter plots): See Volume 2, Chapter 4, Section B, Figure 6.

80 effective experiment Design and Data analysis in transportation research • Equation fit: See Volume 2, Chapter 4, Glossary, Descriptive Measures of Association Between X and Y. • Error distributions (normality, constant variance, uncorrelated, etc.): See Volume 2, Chapter 4 (pp. 146–55). • Experiment design and data collection: See Volume 2, Chapter 1. • Fcrit and F-distribution table: See Volume 2, Appendix C, Table C-5. • F-test (or F-test): See Volume 2, Chapter 4, Section A, Compute the F-ratio Test Statistic (p. 124). • Formulation of formal hypotheses for testing: See Volume 1, Chapter 2, Hypothesis; Volume 2, Appendix A, Glossary. • History and maturation biases (specification errors): See Volume 2, Chapter 1, Quasi- Experiments. • Indicator (dummy) variables: See Volume 2, Chapter 4 (pp. 142–45). • Intercept and slope: See Volume 2, Chapter 4 (pp. 140–42). • Maximum likelihood methods: See Volume 2, Chapter 5 (pp. 208–11). • Mean and standard deviation formulas: See Volume 2, Chapter 6, Table C, Frequency Distribu- tions, Variance, Standard Deviation, Histograms, and Boxplots. • Measured ratio or interval scale: See Volume 2, Chapter 1 (p. 83). • Multinomial distribution and polychotomous logistical model: See Volume 2, Chapter 5 (pp. 211–18). • Multiple (multivariate) regression: See Volume 2, Chapter 4, Section B. • Non-parametric tests: See Volume 2, Chapter 6, Section D. • Normal distribution: See Volume 2, Appendix A, Glossary, Normal Distribution. • One- and two-sided hypothesis testing (one- and two-tail test values): See Volume 2, Chapter 4 (pp. 161 and 164–5). • Ordinary least squares (OLS) regression: See Volume 2, Chapter 4, Section B, Linear Regression. • Sample size and confidence: See Volume 2, Chapter 1, Sample Size Determination. • Sample size determination based on statistical power requirements: See Volume 2, Chapter 1, Sample Size Determination (p. 94). • Sign test and the Wilcoxon signed-rank (Wilcoxon rank-sum) test: See Volume 2, Chapter 6, Section D, and Appendix C, Table C-8, Hypothesis About Population Medians for Independent Samples. • Split samples: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31). • Standard chi-square distribution table: See Volume 2, Appendix C, Table C-2. • Standard normal values: See Volume 2, Appendix C, Table C-1. • tcrit values: See Volume 2, Appendix C, Table C-4. • t-statistic: See Volume 2, Appendix A, Glossary. • t-statistic using equation for equal variance: See Volume 2, Appendix C, Table C-4. • t-test: See Volume 2, Chapter 4, Section B, How are t-statistics Interpreted? • Tabularized values of t-statistic: See Volume 2, Appendix C, Table C-4. • Tukey’s test, Bonferroni’s test, Scheffe’s test: See Volume 2, Chapter 4, Section A, Analysis of Variance Methodology (pp. 119–31). • Types of data and implications for selection of analysis techniques: See Volume 2, Chapter 1, Identification of Empirical Setting.

Abbreviations and acronyms used without definitions in TRB publications: AAAE American Association of Airport Executives AASHO American Association of State Highway Officials AASHTO American Association of State Highway and Transportation Officials ACI–NA Airports Council International–North America ACRP Airport Cooperative Research Program ADA Americans with Disabilities Act APTA American Public Transportation Association ASCE American Society of Civil Engineers ASME American Society of Mechanical Engineers ASTM American Society for Testing and Materials ATA American Trucking Associations CTAA Community Transportation Association of America CTBSSP Commercial Truck and Bus Safety Synthesis Program DHS Department of Homeland Security DOE Department of Energy EPA Environmental Protection Agency FAA Federal Aviation Administration FHWA Federal Highway Administration FMCSA Federal Motor Carrier Safety Administration FRA Federal Railroad Administration FTA Federal Transit Administration HMCRP Hazardous Materials Cooperative Research Program IEEE Institute of Electrical and Electronics Engineers ISTEA Intermodal Surface Transportation Efficiency Act of 1991 ITE Institute of Transportation Engineers NASA National Aeronautics and Space Administration NASAO National Association of State Aviation Officials NCFRP National Cooperative Freight Research Program NCHRP National Cooperative Highway Research Program NHTSA National Highway Traffic Safety Administration NTSB National Transportation Safety Board PHMSA Pipeline and Hazardous Materials Safety Administration RITA Research and Innovative Technology Administration SAE Society of Automotive Engineers SAFETEA-LU Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (2005) TCRP Transit Cooperative Research Program TEA-21 Transportation Equity Act for the 21st Century (1998) TRB Transportation Research Board TSA Transportation Security Administration U.S.DOT United States Department of Transportation

TRB’s National Cooperative Highway Research Program (NCHRP) Report 727: Effective Experiment Design and Data Analysis in Transportation Research describes the factors that may be considered in designing experiments and presents 21 typical transportation examples illustrating the experiment design process, including selection of appropriate statistical tests.

The report is a companion to NCHRP CD-22, Scientific Approaches to Transportation Research, Volumes 1 and 2 , which present detailed information on statistical methods.



Data Analysis Techniques in Research – Methods, Tools & Examples




Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives.

Data Analysis Techniques in Research: While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence. Data analysis involves refining, transforming, and interpreting raw data to derive actionable insights that guide informed decision-making for businesses.


A straightforward illustration of data analysis emerges when we make everyday decisions, basing our choices on past experiences or predictions of potential outcomes.



What is Data Analysis?

Data analysis is the systematic process of inspecting, cleaning, transforming, and interpreting data with the objective of discovering valuable insights and drawing meaningful conclusions. This process involves several steps:

  • Inspecting : Initial examination of data to understand its structure, quality, and completeness.
  • Cleaning : Removing errors, inconsistencies, or irrelevant information to ensure accurate analysis.
  • Transforming : Converting data into a format suitable for analysis, such as normalization or aggregation.
  • Interpreting : Analyzing the transformed data to identify patterns, trends, and relationships.

Types of Data Analysis Techniques in Research

Data analysis techniques in research are categorized into qualitative and quantitative methods, each with its specific approaches and tools. These techniques are instrumental in extracting meaningful insights, patterns, and relationships from data to support informed decision-making, validate hypotheses, and derive actionable recommendations. Below is an in-depth exploration of the various types of data analysis techniques commonly employed in research:

1) Qualitative Analysis:

Definition: Qualitative analysis focuses on understanding non-numerical data, such as opinions, concepts, or experiences, to derive insights into human behavior, attitudes, and perceptions.

  • Content Analysis: Examines textual data, such as interview transcripts, articles, or open-ended survey responses, to identify themes, patterns, or trends.
  • Narrative Analysis: Analyzes personal stories or narratives to understand individuals’ experiences, emotions, or perspectives.
  • Ethnographic Studies: Involves observing and analyzing cultural practices, behaviors, and norms within specific communities or settings.

2) Quantitative Analysis:

Quantitative analysis emphasizes numerical data and employs statistical methods to explore relationships, patterns, and trends. It encompasses several approaches:

Descriptive Analysis:

  • Frequency Distribution: Represents the number of occurrences of distinct values within a dataset.
  • Central Tendency: Measures such as mean, median, and mode provide insights into the central values of a dataset.
  • Dispersion: Techniques like variance and standard deviation indicate the spread or variability of data.
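To make these descriptive measures concrete, here is a minimal sketch using Python's built-in statistics module on a small, made-up set of scores; the data values are purely illustrative.

```python
import statistics

scores = [72, 85, 90, 68, 85, 77, 93, 85, 60, 74]   # illustrative data only

print("Mean:              ", statistics.mean(scores))
print("Median:            ", statistics.median(scores))
print("Mode:              ", statistics.mode(scores))
print("Range:             ", max(scores) - min(scores))
print("Variance (sample): ", statistics.variance(scores))
print("Std. dev. (sample):", statistics.stdev(scores))
```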

Diagnostic Analysis:

  • Regression Analysis: Assesses the relationship between dependent and independent variables, enabling prediction or understanding causality.
  • ANOVA (Analysis of Variance): Examines differences between groups to identify significant variations or effects.

Predictive Analysis:

  • Time Series Forecasting: Uses historical data points to predict future trends or outcomes.
  • Machine Learning Algorithms: Techniques like decision trees, random forests, and neural networks predict outcomes based on patterns in data.

Prescriptive Analysis:

  • Optimization Models: Utilizes linear programming, integer programming, or other optimization techniques to identify the best solutions or strategies.
  • Simulation: Mimics real-world scenarios to evaluate various strategies or decisions and determine optimal outcomes.

Specific Techniques:

  • Monte Carlo Simulation: Models probabilistic outcomes to assess risk and uncertainty.
  • Factor Analysis: Reduces the dimensionality of data by identifying underlying factors or components.
  • Cohort Analysis: Studies specific groups or cohorts over time to understand trends, behaviors, or patterns within these groups.
  • Cluster Analysis: Classifies objects or individuals into homogeneous groups or clusters based on similarities or attributes.
  • Sentiment Analysis: Uses natural language processing and machine learning techniques to determine sentiment, emotions, or opinions from textual data.


Data Analysis Techniques in Research Examples

To provide a clearer understanding of how data analysis techniques are applied in research, let’s consider a hypothetical research study focused on evaluating the impact of online learning platforms on students’ academic performance.

Research Objective:

Determine if students using online learning platforms achieve higher academic performance compared to those relying solely on traditional classroom instruction.

Data Collection:

  • Quantitative Data: Academic scores (grades) of students using online platforms and those using traditional classroom methods.
  • Qualitative Data: Feedback from students regarding their learning experiences, challenges faced, and preferences.

Data Analysis Techniques Applied:

1) Descriptive Analysis:

  • Calculate the mean, median, and mode of academic scores for both groups.
  • Create frequency distributions to represent the distribution of grades in each group.

2) Diagnostic Analysis:

  • Conduct an Analysis of Variance (ANOVA) to determine if there’s a statistically significant difference in academic scores between the two groups.
  • Perform Regression Analysis to assess the relationship between the time spent on online platforms and academic performance.

3) Predictive Analysis:

  • Utilize Time Series Forecasting to predict future academic performance trends based on historical data.
  • Implement Machine Learning algorithms to develop a predictive model that identifies factors contributing to academic success on online platforms.

4) Prescriptive Analysis:

  • Apply Optimization Models to identify the optimal combination of online learning resources (e.g., video lectures, interactive quizzes) that maximize academic performance.
  • Use Simulation Techniques to evaluate different scenarios, such as varying student engagement levels with online resources, to determine the most effective strategies for improving learning outcomes.

5) Specific Techniques:

  • Conduct Factor Analysis on qualitative feedback to identify common themes or factors influencing students’ perceptions and experiences with online learning.
  • Perform Cluster Analysis to segment students based on their engagement levels, preferences, or academic outcomes, enabling targeted interventions or personalized learning strategies.
  • Apply Sentiment Analysis on textual feedback to categorize students’ sentiments as positive, negative, or neutral regarding online learning experiences.

By applying a combination of qualitative and quantitative data analysis techniques, this research example aims to provide comprehensive insights into the effectiveness of online learning platforms.
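To show how the diagnostic step of this hypothetical study might look in code, the sketch below compares the academic scores of the two groups with a one-way ANOVA (equivalent to a two-sample t-test when there are only two groups) and regresses scores on hours spent on the platform. All data values, variable names, and the use of SciPy are illustrative assumptions, not results from an actual study.

```python
import numpy as np
from scipy.stats import f_oneway, linregress

# Illustrative (made-up) academic scores for the two groups
online_scores    = np.array([78, 85, 92, 74, 88, 81, 95, 69, 84, 90], dtype=float)
classroom_scores = np.array([72, 80, 75, 68, 83, 77, 70, 79, 74, 82], dtype=float)

# Diagnostic step 1: one-way ANOVA comparing the two group means
f_stat, p_value = f_oneway(online_scores, classroom_scores)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Diagnostic step 2: regression of score on weekly hours spent on the platform
hours = np.array([2, 5, 8, 1, 6, 4, 9, 1, 5, 7], dtype=float)   # illustrative only
fit = linregress(hours, online_scores)
print(f"Regression: score = {fit.intercept:.1f} + {fit.slope:.2f} * hours "
      f"(R^2 = {fit.rvalue**2:.2f}, p = {fit.pvalue:.4f})")
```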


Data Analysis Techniques in Quantitative Research

Quantitative research involves collecting numerical data to examine relationships, test hypotheses, and make predictions. Various data analysis techniques are employed to interpret and draw conclusions from quantitative data. Here are some key data analysis techniques commonly used in quantitative research:

1) Descriptive Statistics:

  • Description: Descriptive statistics are used to summarize and describe the main aspects of a dataset, such as central tendency (mean, median, mode), variability (range, variance, standard deviation), and distribution (skewness, kurtosis).
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. This technique includes hypothesis testing, confidence intervals, t-tests, chi-square tests, analysis of variance (ANOVA), regression analysis, and correlation analysis.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.

3) Regression Analysis:

  • Description: Regression analysis is a statistical technique used to model and examine the relationship between a dependent variable and one or more independent variables. Linear regression, multiple regression, logistic regression, and nonlinear regression are common types of regression analysis .
  • Applications: Predicting outcomes, identifying relationships between variables, and understanding the impact of independent variables on the dependent variable.

4) Correlation Analysis:

  • Description: Correlation analysis is used to measure and assess the strength and direction of the relationship between two or more variables. The Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall’s tau are commonly used measures of correlation.
  • Applications: Identifying associations between variables and assessing the degree and nature of the relationship.
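The sketch below computes the three correlation measures named above for a small, made-up pair of variables; SciPy is assumed and the values are illustrative only.

```python
from scipy.stats import pearsonr, spearmanr, kendalltau

x = [1, 2, 3, 4, 5, 6, 7, 8]            # illustrative data only
y = [2, 1, 4, 3, 7, 8, 6, 9]

r, p_r = pearsonr(x, y)                  # linear association
rho, p_rho = spearmanr(x, y)             # rank-based (monotonic) association
tau, p_tau = kendalltau(x, y)            # rank-based concordance

print(f"Pearson r    = {r:.3f} (p = {p_r:.3f})")
print(f"Spearman rho = {rho:.3f} (p = {p_rho:.3f})")
print(f"Kendall tau  = {tau:.3f} (p = {p_tau:.3f})")
```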

5) Factor Analysis:

  • Description: Factor analysis is a multivariate statistical technique used to identify and analyze underlying relationships or factors among a set of observed variables. It helps in reducing the dimensionality of data and identifying latent variables or constructs.
  • Applications: Identifying underlying factors or constructs, simplifying data structures, and understanding the underlying relationships among variables.

6) Time Series Analysis:

  • Description: Time series analysis involves analyzing data collected or recorded over a specific period at regular intervals to identify patterns, trends, and seasonality. Techniques such as moving averages, exponential smoothing, autoregressive integrated moving average (ARIMA), and Fourier analysis are used.
  • Applications: Forecasting future trends, analyzing seasonal patterns, and understanding time-dependent relationships in data.
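As a simple illustration of one of these techniques, the sketch below computes a 3-period moving average and uses it as a naive one-step-ahead forecast. The monthly values are made up and NumPy is assumed; full ARIMA modeling would typically rely on a dedicated library such as statsmodels.

```python
import numpy as np

demand = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118],
                  dtype=float)                      # illustrative monthly values

window = 3
moving_avg = np.convolve(demand, np.ones(window) / window, mode="valid")
print("3-period moving averages:", np.round(moving_avg, 1))

# Naive forecast for the next period: the most recent moving average
print("Forecast for next period:", round(moving_avg[-1], 1))
```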

7) ANOVA (Analysis of Variance):

  • Description: Analysis of variance (ANOVA) is a statistical technique used to analyze and compare the means of two or more groups or treatments to determine if they are statistically different from each other. One-way ANOVA, two-way ANOVA, and MANOVA (Multivariate Analysis of Variance) are common types of ANOVA.
  • Applications: Comparing group means, testing hypotheses, and determining the effects of categorical independent variables on a continuous dependent variable.

8) Chi-Square Tests:

  • Description: Chi-square tests are non-parametric statistical tests used to assess the association between categorical variables in a contingency table. The Chi-square test of independence, goodness-of-fit test, and test of homogeneity are common chi-square tests.
  • Applications: Testing relationships between categorical variables, assessing goodness-of-fit, and evaluating independence.
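
A short sketch of a chi-square test of independence with SciPy, using a hypothetical 2 x 3 contingency table:

```python
import numpy as np
from scipy import stats

# Hypothetical contingency table: group (rows) x category preference (columns)
observed = np.array([[30, 10, 20],
                     [25, 15, 20]])

# Chi-square test of independence between the two categorical variables
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, dof = {dof}")
print("Expected counts under independence:\n", np.round(expected, 1))
```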

These quantitative data analysis techniques provide researchers with valuable tools and methods to analyze, interpret, and derive meaningful insights from numerical data. The selection of a specific technique often depends on the research objectives, the nature of the data, and the underlying assumptions of the statistical methods being used.


Data Analysis Methods

Data analysis methods refer to the techniques and procedures used to analyze, interpret, and draw conclusions from data. These methods are essential for transforming raw data into meaningful insights, facilitating decision-making processes, and driving strategies across various fields. Here are some common data analysis methods:

1) Descriptive Statistics:

  • Description: Descriptive statistics summarize and organize data to provide a clear and concise overview of the dataset. Measures such as mean, median, mode, range, variance, and standard deviation are commonly used.
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. Techniques such as hypothesis testing, confidence intervals, and regression analysis are used.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.

3) Exploratory Data Analysis (EDA):

  • Description: EDA techniques involve visually exploring and analyzing data to discover patterns, relationships, anomalies, and insights. Methods such as scatter plots, histograms, box plots, and correlation matrices are utilized.
  • Applications: Identifying trends, patterns, outliers, and relationships within the dataset.
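
As a hedged illustration, the sketch below produces a quick numerical summary, a correlation matrix, and three common EDA plots with pandas and matplotlib; the age and income values are simulated.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical dataset with two related variables
rng = np.random.default_rng(42)
df = pd.DataFrame({"age": rng.integers(20, 60, size=200)})
df["income"] = 1000 + 50 * df["age"] + rng.normal(scale=300, size=200)

print(df.describe())   # quick numerical summary
print(df.corr())       # correlation matrix

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(df["income"], bins=20)            # distribution of one variable
axes[1].scatter(df["age"], df["income"], s=8)  # relationship between two variables
axes[2].boxplot(df["income"])                  # spread and potential outliers
plt.tight_layout()
plt.show()
```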

4) Predictive Analytics:

  • Description: Predictive analytics use statistical algorithms and machine learning techniques to analyze historical data and make predictions about future events or outcomes. Techniques such as regression analysis, time series forecasting, and machine learning algorithms (e.g., decision trees, random forests, neural networks) are employed.
  • Applications: Forecasting future trends, predicting outcomes, and identifying potential risks or opportunities.

5) Prescriptive Analytics:

  • Description: Prescriptive analytics involve analyzing data to recommend actions or strategies that optimize specific objectives or outcomes. Optimization techniques, simulation models, and decision-making algorithms are utilized.
  • Applications: Recommending optimal strategies, decision-making support, and resource allocation.

6) Qualitative Data Analysis:

  • Description: Qualitative data analysis involves analyzing non-numerical data, such as text, images, videos, or audio, to identify themes, patterns, and insights. Methods such as content analysis, thematic analysis, and narrative analysis are used.
  • Applications: Understanding human behavior, attitudes, perceptions, and experiences.

7) Big Data Analytics:

  • Description: Big data analytics methods are designed to analyze large volumes of structured and unstructured data to extract valuable insights. Technologies such as Hadoop, Spark, and NoSQL databases are used to process and analyze big data.
  • Applications: Analyzing large datasets, identifying trends, patterns, and insights from big data sources.

8) Text Analytics:

  • Description: Text analytics methods involve analyzing textual data, such as customer reviews, social media posts, emails, and documents, to extract meaningful information and insights. Techniques such as sentiment analysis, text mining, and natural language processing (NLP) are used.
  • Applications: Analyzing customer feedback, monitoring brand reputation, and extracting insights from textual data sources.

These data analysis methods are instrumental in transforming data into actionable insights, informing decision-making processes, and driving organizational success across various sectors, including business, healthcare, finance, marketing, and research. The selection of a specific method often depends on the nature of the data, the research objectives, and the analytical requirements of the project or organization.


Data Analysis Tools

Data analysis tools are essential instruments that facilitate the process of examining, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and drive strategies. Here are some prominent data analysis tools widely used across various industries:

1) Microsoft Excel:

  • Description: A spreadsheet software that offers basic to advanced data analysis features, including pivot tables, data visualization tools, and statistical functions.
  • Applications: Data cleaning, basic statistical analysis, visualization, and reporting.

2) R Programming Language:

  • Description: An open-source programming language specifically designed for statistical computing and data visualization.
  • Applications: Advanced statistical analysis, data manipulation, visualization, and machine learning.

3) Python (with Libraries like Pandas, NumPy, Matplotlib, and Seaborn):

  • Description: A versatile programming language with libraries that support data manipulation, analysis, and visualization.
  • Applications: Data cleaning, statistical analysis, machine learning, and data visualization.

4) SPSS (Statistical Package for the Social Sciences):

  • Description: A comprehensive statistical software suite used for data analysis, data mining, and predictive analytics.
  • Applications: Descriptive statistics, hypothesis testing, regression analysis, and advanced analytics.

5) SAS (Statistical Analysis System):

  • Description: A software suite used for advanced analytics, multivariate analysis, and predictive modeling.
  • Applications: Data management, statistical analysis, predictive modeling, and business intelligence.

6) Tableau:

  • Description: A data visualization tool that allows users to create interactive and shareable dashboards and reports.
  • Applications: Data visualization, business intelligence, and interactive dashboard creation.

7) Power BI:

  • Description: A business analytics tool developed by Microsoft that provides interactive visualizations and business intelligence capabilities.
  • Applications: Data visualization, business intelligence, reporting, and dashboard creation.

8) SQL (Structured Query Language) Databases (e.g., MySQL, PostgreSQL, Microsoft SQL Server):

  • Description: Database management systems that support data storage, retrieval, and manipulation using SQL queries.
  • Applications: Data retrieval, data cleaning, data transformation, and database management.

9) Apache Spark:

  • Description: A fast and general-purpose distributed computing system designed for big data processing and analytics.
  • Applications: Big data processing, machine learning, data streaming, and real-time analytics.

10) IBM SPSS Modeler:

  • Description: A data mining software application used for building predictive models and conducting advanced analytics.
  • Applications: Predictive modeling, data mining, statistical analysis, and decision optimization.

These tools serve various purposes and cater to different data analysis needs, from basic statistical analysis and data visualization to advanced analytics, machine learning, and big data processing. The choice of a specific tool often depends on the nature of the data, the complexity of the analysis, and the specific requirements of the project or organization.


Importance of Data Analysis in Research

The importance of data analysis in research cannot be overstated; it serves as the backbone of any scientific investigation or study. Here are several key reasons why data analysis is crucial in the research process:

  • Data analysis helps ensure that the results obtained are valid and reliable. By systematically examining the data, researchers can identify any inconsistencies or anomalies that may affect the credibility of the findings.
  • Effective data analysis provides researchers with the necessary information to make informed decisions. By interpreting the collected data, researchers can draw conclusions, make predictions, or formulate recommendations based on evidence rather than intuition or guesswork.
  • Data analysis allows researchers to identify patterns, trends, and relationships within the data. This can lead to a deeper understanding of the research topic, enabling researchers to uncover insights that may not be immediately apparent.
  • In empirical research, data analysis plays a critical role in testing hypotheses. Researchers collect data to either support or refute their hypotheses, and data analysis provides the tools and techniques to evaluate these hypotheses rigorously.
  • Transparent and well-executed data analysis enhances the credibility of research findings. By clearly documenting the data analysis methods and procedures, researchers allow others to replicate the study, thereby contributing to the reproducibility of research findings.
  • In fields such as business or healthcare, data analysis helps organizations allocate resources more efficiently. By analyzing data on consumer behavior, market trends, or patient outcomes, organizations can make strategic decisions about resource allocation, budgeting, and planning.
  • In public policy and social sciences, data analysis is instrumental in developing and evaluating policies and interventions. By analyzing data on social, economic, or environmental factors, policymakers can assess the effectiveness of existing policies and inform the development of new ones.
  • Data analysis allows for continuous improvement in research methods and practices. By analyzing past research projects, identifying areas for improvement, and implementing changes based on data-driven insights, researchers can refine their approaches and enhance the quality of future research endeavors.

However, it is important to remember that mastering these techniques requires practice and continuous learning, ideally through hands-on work with tools such as Excel, Python, or Tableau.


Data Analysis Techniques in Research FAQs

What are the 5 techniques for data analysis?

The five techniques for data analysis include: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, Prescriptive Analysis, and Qualitative Analysis.

What are techniques of data analysis in research?

Techniques of data analysis in research encompass both qualitative and quantitative methods. These techniques involve processes like summarizing raw data, investigating causes of events, forecasting future outcomes, offering recommendations based on predictions, and examining non-numerical data to understand concepts or experiences.

What are the 3 methods of data analysis?

The three primary methods of data analysis are: Qualitative Analysis, Quantitative Analysis, and Mixed-Methods Analysis.

What are the four types of data analysis techniques?

The four types of data analysis techniques are: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, and Prescriptive Analysis.



Quantitative Data Analysis 101

The lingo, methods and techniques, explained simply.

By: Derek Jansen (MBA)  and Kerryn Warren (PhD) | December 2020

Quantitative data analysis is one of those things that often strikes fear in students. It's totally understandable – quantitative analysis is a complex topic, full of daunting lingo, like medians, modes, correlation and regression. Suddenly we're all wishing we'd paid a little more attention in math class…

The good news is that while quantitative data analysis is a mammoth topic, gaining a working understanding of the basics isn't that hard, even for those of us who avoid numbers and math. In this post, we'll break quantitative analysis down into simple, bite-sized chunks so you can approach your research with confidence.

Quantitative data analysis methods and techniques 101

Overview: Quantitative Data Analysis 101

  • What (exactly) is quantitative data analysis?
  • When to use quantitative analysis
  • How quantitative analysis works

The two “branches” of quantitative analysis

  • Descriptive statistics 101
  • Inferential statistics 101
  • How to choose the right quantitative methods
  • Recap & summary

What is quantitative data analysis?

Despite being a mouthful, quantitative data analysis simply means analysing data that is numbers-based – or data that can be easily “converted” into numbers without losing any meaning.

For example, category-based variables like gender, ethnicity, or native language could all be “converted” into numbers without losing meaning – for example, English could equal 1, French 2, etc.

This contrasts against qualitative data analysis, where the focus is on words, phrases and expressions that can’t be reduced to numbers. If you’re interested in learning about qualitative analysis, check out our post and video here .

What is quantitative analysis used for?

Quantitative analysis is generally used for three purposes.

  • Firstly, it’s used to measure differences between groups. For example, the popularity of different clothing colours or brands.
  • Secondly, it’s used to assess relationships between variables. For example, the relationship between weather temperature and voter turnout.
  • And third, it’s used to test hypotheses in a scientifically rigorous way. For example, a hypothesis about the impact of a certain vaccine.

Again, this contrasts with qualitative analysis , which can be used to analyse people’s perceptions and feelings about an event or situation. In other words, things that can’t be reduced to numbers.

How does quantitative analysis work?

Well, since quantitative data analysis is all about analysing numbers , it’s no surprise that it involves statistics . Statistical analysis methods form the engine that powers quantitative analysis, and these methods can vary from pretty basic calculations (for example, averages and medians) to more sophisticated analyses (for example, correlations and regressions).

Sounds like gibberish? Don’t worry. We’ll explain all of that in this post. Importantly, you don’t need to be a statistician or math wiz to pull off a good quantitative analysis. We’ll break down all the technical mumbo jumbo in this post.


As I mentioned, quantitative analysis is powered by statistical analysis methods . There are two main “branches” of statistical methods that are used – descriptive statistics and inferential statistics . In your research, you might only use descriptive statistics, or you might use a mix of both , depending on what you’re trying to figure out. In other words, depending on your research questions, aims and objectives . I’ll explain how to choose your methods later.

So, what are descriptive and inferential statistics?

Well, before I can explain that, we need to take a quick detour to explain some lingo. To understand the difference between these two branches of statistics, you need to understand two important words. These words are population and sample .

First up, population . In statistics, the population is the entire group of people (or animals or organisations or whatever) that you’re interested in researching. For example, if you were interested in researching Tesla owners in the US, then the population would be all Tesla owners in the US.

However, it’s extremely unlikely that you’re going to be able to interview or survey every single Tesla owner in the US. Realistically, you’ll likely only get access to a few hundred, or maybe a few thousand owners using an online survey. This smaller group of accessible people whose data you actually collect is called your sample .

So, to recap – the population is the entire group of people you’re interested in, and the sample is the subset of the population that you can actually get access to. In other words, the population is the full chocolate cake , whereas the sample is a slice of that cake.

So, why is this sample-population thing important?

Well, descriptive statistics focus on describing the sample , while inferential statistics aim to make predictions about the population, based on the findings within the sample. In other words, we use one group of statistical methods – descriptive statistics – to investigate the slice of cake, and another group of methods – inferential statistics – to draw conclusions about the entire cake. There I go with the cake analogy again…

With that out the way, let’s take a closer look at each of these branches in more detail.

Descriptive statistics vs inferential statistics

Branch 1: Descriptive Statistics

Descriptive statistics serve a simple but critically important role in your research – to describe your data set – hence the name. In other words, they help you understand the details of your sample . Unlike inferential statistics (which we’ll get to soon), descriptive statistics don’t aim to make inferences or predictions about the entire population – they’re purely interested in the details of your specific sample .

When you’re writing up your analysis, descriptive statistics are the first set of stats you’ll cover, before moving on to inferential statistics. But, that said, depending on your research objectives and research questions , they may be the only type of statistics you use. We’ll explore that a little later.

So, what kind of statistics are usually covered in this section?

Some common statistical tests used in this branch include the following:

  • Mean – this is simply the mathematical average of a range of numbers.
  • Median – this is the midpoint in a range of numbers when the numbers are arranged in numerical order. If the data set contains an odd number of values, the median is the value right in the middle of the ordered set; if it contains an even number of values, the median is the midpoint between the two middle values.
  • Mode – this is simply the most commonly occurring number in the data set.
  • Standard deviation – this measures how spread out the numbers are around the mean. In cases where most of the numbers are quite close to the average, the standard deviation will be relatively low. Conversely, in cases where the numbers are scattered all over the place, the standard deviation will be relatively high.
  • Skewness . As the name suggests, skewness indicates how symmetrical a range of numbers is. In other words, do they tend to cluster into a smooth bell curve shape in the middle of the graph, or do they skew to the left or right?

Feeling a bit confused? Let’s look at a practical example using a small data set.

Example data: body weights (kg) for a sample of 10 people, with their descriptive statistics.

The example data set details the body weight of a sample of 10 people, together with the descriptive statistics calculated from those weights. Let’s take a look at each of them.

First, we can see that the mean weight is 72.4 kilograms. In other words, the average weight across the sample is 72.4 kilograms. Straightforward.

Next, we can see that the median is very similar to the mean (the average). This suggests that this data set has a reasonably symmetrical distribution (in other words, a relatively smooth, centred distribution of weights, clustered towards the centre).

In terms of the mode , there is no mode in this data set. This is because each number is present only once and so there cannot be a “most common number”. If there were two people who were both 65 kilograms, for example, then the mode would be 65.

Next up is the standard deviation. A value of 10.6 indicates that there’s quite a wide spread of numbers. We can see this quite easily by looking at the numbers themselves, which range from 55 to 90 – quite a stretch from the mean of 72.4.

And lastly, the skewness of -0.2 tells us that the data is very slightly negatively skewed. This makes sense since the mean and the median are slightly different.

As you can see, these descriptive statistics give us some useful insight into the data set. Of course, this is a very small data set (only 10 records), so we can’t read into these statistics too much. Also, keep in mind that this is not a list of all possible descriptive statistics – just the most common ones.
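
Because the original data table is not reproduced here, the sketch below uses ten illustrative weights chosen to behave similarly to the example (a mean of roughly 72.4 kg and values between 55 and 90); it simply shows how such descriptive statistics could be computed, for instance in Python with pandas and SciPy.

```python
import pandas as pd
from scipy import stats

# Illustrative weights (kg) for 10 people; these are NOT the original example's
# values, just numbers chosen to give a similar mean and spread
weights = pd.Series([55, 60, 65, 68, 70, 75, 77, 80, 84, 90])

print("Mean:", weights.mean())            # 72.4
print("Median:", weights.median())        # 72.5, close to the mean
print("Mode:", weights.mode().tolist())   # every value occurs once, so no single mode
print("Std. deviation:", round(weights.std(), 1))
print("Skewness:", round(stats.skew(weights), 3))  # close to zero: roughly symmetrical
```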

But why do all of these numbers matter?

While these descriptive statistics are all fairly basic, they’re important for a few reasons:

  • Firstly, they help you get both a macro and micro-level view of your data. In other words, they help you understand both the big picture and the finer details.
  • Secondly, they help you spot potential errors in the data – for example, if an average is way higher than you’d expect, or responses to a question are highly varied, this can act as a warning sign that you need to double-check the data.
  • And lastly, these descriptive statistics help inform which inferential statistical techniques you can use, as those techniques depend on the skewness (in other words, the symmetry and normality) of the data.

Simply put, descriptive statistics are really important , even though the statistical techniques used are fairly basic. All too often at Grad Coach, we see students skimming over the descriptives in their eagerness to get to the more exciting inferential methods, and then landing up with some very flawed results.

Don’t be a sucker – give your descriptive statistics the love and attention they deserve!


Branch 2: Inferential Statistics

As I mentioned, while descriptive statistics are all about the details of your specific data set – your sample – inferential statistics aim to make inferences about the population . In other words, you’ll use inferential statistics to make predictions about what you’d expect to find in the full population.

What kind of predictions, you ask? Well, there are two common types of predictions that researchers try to make using inferential stats:

  • Firstly, predictions about differences between groups – for example, height differences between children grouped by their favourite meal or gender.
  • And secondly, relationships between variables – for example, the relationship between body weight and the number of hours a week a person does yoga.

In other words, inferential statistics (when done correctly), allow you to connect the dots and make predictions about what you expect to see in the real world population, based on what you observe in your sample data. For this reason, inferential statistics are used for hypothesis testing – in other words, to test hypotheses that predict changes or differences.

Inferential statistics are used to make predictions about what you’d expect to find in the full population, based on the sample.

Of course, when you’re working with inferential statistics, the composition of your sample is really important. In other words, if your sample doesn’t accurately represent the population you’re researching, then your findings won’t necessarily be very useful.

For example, if your population of interest is a mix of 50% male and 50% female , but your sample is 80% male , you can’t make inferences about the population based on your sample, since it’s not representative. This area of statistics is called sampling, but we won’t go down that rabbit hole here (it’s a deep one!) – we’ll save that for another post .

What statistics are usually used in this branch?

There are many, many different statistical analysis methods within the inferential branch and it’d be impossible for us to discuss them all here. So we’ll just take a look at some of the most common inferential statistical methods so that you have a solid starting point.

First up are t-tests. T-tests compare the means (the averages) of two groups of data to assess whether they’re statistically significantly different. In other words, is the difference between the two group means larger than could reasonably be explained by chance?

This type of testing is very useful for understanding just how similar or different two groups of data are. For example, you might want to compare the mean blood pressure between two groups of people – one that has taken a new medication and one that hasn’t – to assess whether they are significantly different.

Kicking things up a level, we have ANOVA, which stands for “analysis of variance”. This test is similar to a t-test in that it compares the means of various groups, but ANOVA allows you to analyse multiple groups, not just two. So it’s basically a t-test on steroids…

Next, we have correlation analysis. This type of analysis assesses the relationship between two variables. In other words, if one variable increases, does the other variable also increase, decrease or stay the same? For example, if the average temperature goes up, do average ice cream sales increase too? We’d expect some sort of relationship between these two variables intuitively, but correlation analysis allows us to measure that relationship scientifically.

Lastly, we have regression analysis – this is quite similar to correlation in that it assesses the relationship between variables, but it goes a step further by modelling how one or more independent variables explain or predict a dependent variable. Importantly, regression on its own does not establish cause and effect: just because two variables correlate (or one predicts the other) doesn’t necessarily mean that one causes the other, as they may simply move together thanks to some other force.
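
To make these four methods a little more tangible, here is a compact sketch of each using SciPy; the blood-pressure, temperature and sales figures are simulated and purely illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# T-test: mean blood pressure for a treated vs. an untreated group (hypothetical)
treated = rng.normal(120, 10, size=40)
control = rng.normal(128, 10, size=40)
t_stat, p_t = stats.ttest_ind(treated, control)

# ANOVA: comparing three groups instead of two
third_group = rng.normal(124, 10, size=40)
f_stat, p_f = stats.f_oneway(treated, control, third_group)

# Correlation: temperature vs. ice cream sales (hypothetical)
temperature = rng.uniform(15, 35, size=50)
sales = 20 + 3 * temperature + rng.normal(scale=10, size=50)
r, p_r = stats.pearsonr(temperature, sales)

# Regression: how well does temperature predict sales?
reg = stats.linregress(temperature, sales)

print(f"t-test: t = {t_stat:.2f}, p = {p_t:.4f}")
print(f"ANOVA:  F = {f_stat:.2f}, p = {p_f:.4f}")
print(f"Correlation: r = {r:.2f}, p = {p_r:.4f}")
print(f"Regression: sales = {reg.intercept:.1f} + {reg.slope:.2f} * temperature (fitted)")
```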

Stats overload…

I hear you. To make this all a little more tangible, let’s take a look at an example of a correlation in action.

Imagine a scatter plot showing the correlation (relationship) between weight and height. Intuitively, we’d expect there to be some relationship between these two variables, and that is typically what such a plot shows: the points tend to cluster together in a diagonal line from bottom left to top right.


As I mentioned, these are just a handful of inferential techniques – there are many, many more. Importantly, each statistical method has its own assumptions and limitations.

For example, some methods only work with normally distributed (parametric) data, while other methods are designed specifically for non-parametric data. And that’s exactly why descriptive statistics are so important – they’re the first step to knowing which inferential techniques you can and can’t use.

Remember that every statistical method has its own assumptions and limitations,  so you need to be aware of these.

How to choose the right analysis method

To choose the right statistical methods, you need to think about two important factors :

  • The type of quantitative data you have (specifically, level of measurement and the shape of the data). And,
  • Your research questions and hypotheses

Let’s take a closer look at each of these.

Factor 1 – Data type

The first thing you need to consider is the type of data you’ve collected (or the type of data you will collect). By data types, I’m referring to the four levels of measurement – namely, nominal, ordinal, interval and ratio.

Why does this matter?

Well, because different statistical methods and techniques require different types of data. This is one of the “assumptions” I mentioned earlier – every method has its assumptions regarding the type of data.

For example, some techniques work with categorical data (for example, yes/no type questions, or gender or ethnicity), while others work with continuous numerical data (for example, age, weight or income) – and, of course, some work with multiple data types.

If you try to use a statistical method that doesn’t support the data type you have, your results will be largely meaningless. So, make sure that you have a clear understanding of what types of data you’ve collected (or will collect). Once you have this, you can then check which statistical methods would support your data types.

If you haven’t collected your data yet, you can work in reverse and look at which statistical method would give you the most useful insights, and then design your data collection strategy to collect the correct data types.

Another important factor to consider is the shape of your data . Specifically, does it have a normal distribution (in other words, is it a bell-shaped curve, centred in the middle) or is it very skewed to the left or the right? Again, different statistical techniques work for different shapes of data – some are designed for symmetrical data while others are designed for skewed data.

This is another reminder of why descriptive statistics are so important – they tell you all about the shape of your data.

Factor 2: Your research questions

The next thing you need to consider is your specific research questions, as well as your hypotheses (if you have some). The nature of your research questions and research hypotheses will heavily influence which statistical methods and techniques you should use.

If you’re just interested in understanding the attributes of your sample (as opposed to the entire population), then descriptive statistics are probably all you need. For example, if you just want to assess the means (averages) and medians (centre points) of variables in a group of people.

On the other hand, if you aim to understand differences between groups or relationships between variables and to infer or predict outcomes in the population, then you’ll likely need both descriptive statistics and inferential statistics.

So, it’s really important to get very clear about your research aims and research questions, as well as your hypotheses, before you start looking at which statistical techniques to use.

Never shoehorn a specific statistical technique into your research just because you like it or have some experience with it. Your choice of methods must align with all the factors we’ve covered here.

Time to recap…

You’re still with me? That’s impressive. We’ve covered a lot of ground here, so let’s recap on the key points:

  • Quantitative data analysis is all about analysing number-based data (which includes categorical and numerical data) using various statistical techniques.
  • The two main branches of statistics are descriptive statistics and inferential statistics. Descriptives describe your sample, whereas inferentials make predictions about what you’ll find in the population.
  • Common descriptive statistical methods include the mean (average), median, standard deviation and skewness.
  • Common inferential statistical methods include t-tests, ANOVA, correlation and regression analysis.
  • To choose the right statistical methods and techniques, you need to consider the type of data you’re working with, as well as your research questions and hypotheses.


How to conduct a meta-analysis in eight steps: a practical guide

  • Open access
  • Published: 30 November 2021
  • Volume 72, pages 1–19 (2022)


Christopher Hansen, Holger Steinmetz & Jörn Block


1 Introduction

“Scientists have known for centuries that a single study will not resolve a major issue. Indeed, a small sample study will not even resolve a minor issue. Thus, the foundation of science is the cumulation of knowledge from the results of many studies.” (Hunter et al. 1982 , p. 10)

Meta-analysis is a central method for knowledge accumulation in many scientific fields (Aguinis et al. 2011c ; Kepes et al. 2013 ). Similar to a narrative review, it serves as a synopsis of a research question or field. However, going beyond a narrative summary of key findings, a meta-analysis adds value in providing a quantitative assessment of the relationship between two target variables or the effectiveness of an intervention (Gurevitch et al. 2018 ). Also, it can be used to test competing theoretical assumptions against each other or to identify important moderators where the results of different primary studies differ from each other (Aguinis et al. 2011b ; Bergh et al. 2016 ). Rooted in the synthesis of the effectiveness of medical and psychological interventions in the 1970s (Glass 2015 ; Gurevitch et al. 2018 ), meta-analysis is nowadays also an established method in management research and related fields.

The increasing importance of meta-analysis in management research has resulted in the publication of guidelines in recent years that discuss the merits and best practices in various fields, such as general management (Bergh et al. 2016 ; Combs et al. 2019 ; Gonzalez-Mulé and Aguinis 2018 ), international business (Steel et al. 2021 ), economics and finance (Geyer-Klingeberg et al. 2020 ; Havranek et al. 2020 ), marketing (Eisend 2017 ; Grewal et al. 2018 ), and organizational studies (DeSimone et al. 2020 ; Rudolph et al. 2020 ). These articles discuss existing and trending methods and propose solutions for often experienced problems. This editorial briefly summarizes the insights of these papers; provides a workflow of the essential steps in conducting a meta-analysis; suggests state-of-the art methodological procedures; and points to other articles for in-depth investigation. Thus, this article has two goals: (1) based on the findings of previous editorials and methodological articles, it defines methodological recommendations for meta-analyses submitted to Management Review Quarterly (MRQ); and (2) it serves as a practical guide for researchers who have little experience with meta-analysis as a method but plan to conduct one in the future.

2 Eight steps in conducting a meta-analysis

2.1 Step 1: defining the research question

The first step in conducting a meta-analysis, as with any other empirical study, is the definition of the research question. Most importantly, the research question determines the realm of constructs to be considered or the type of interventions whose effects shall be analyzed. When defining the research question, two hurdles might develop. First, when defining an adequate study scope, researchers must consider that the number of publications has grown exponentially in many fields of research in recent decades (Fortunato et al. 2018 ). On the one hand, a larger number of studies increases the potentially relevant literature basis and enables researchers to conduct meta-analyses. Conversely, scanning a large amount of studies that could be potentially relevant for the meta-analysis results in a perhaps unmanageable workload. Thus, Steel et al. ( 2021 ) highlight the importance of balancing manageability and relevance when defining the research question. Second, similar to the number of primary studies also the number of meta-analyses in management research has grown strongly in recent years (Geyer-Klingeberg et al. 2020 ; Rauch 2020 ; Schwab 2015 ). Therefore, it is likely that one or several meta-analyses for many topics of high scholarly interest already exist. However, this should not deter researchers from investigating their research questions. One possibility is to consider moderators or mediators of a relationship that have previously been ignored. For example, a meta-analysis about startup performance could investigate the impact of different ways to measure the performance construct (e.g., growth vs. profitability vs. survival time) or certain characteristics of the founders as moderators. Another possibility is to replicate previous meta-analyses and test whether their findings can be confirmed with an updated sample of primary studies or newly developed methods. Frequent replications and updates of meta-analyses are important contributions to cumulative science and are increasingly called for by the research community (Anderson & Kichkha 2017 ; Steel et al. 2021 ). Consistent with its focus on replication studies (Block and Kuckertz 2018 ), MRQ therefore also invites authors to submit replication meta-analyses.

2.2 Step 2: literature search

2.2.1 Search strategies

Similar to conducting a literature review, the search process of a meta-analysis should be systematic, reproducible, and transparent, resulting in a sample that includes all relevant studies (Fisch and Block 2018; Gusenbauer and Haddaway 2020). There are several identification strategies for relevant primary studies when compiling meta-analytical datasets (Harari et al. 2020). First, previous meta-analyses on the same or a related topic may provide lists of included studies that offer a good starting point to identify and become familiar with the relevant literature. This practice is also applicable to topic-related literature reviews, which often summarize the central findings of the reviewed articles in systematic tables. Both article types likely include the most prominent studies of a research field. The most common and important search strategy, however, is a keyword search in electronic databases (Harari et al. 2020). This strategy will probably yield the largest number of relevant studies, particularly so-called ‘grey literature’, which may not be considered by literature reviews. Gusenbauer and Haddaway (2020) provide a detailed overview of 34 scientific databases, of which 18 are multidisciplinary or have a focus on management sciences, along with their suitability for literature synthesis. To prevent biased results due to the scope or journal coverage of one database, researchers should use at least two different databases (DeSimone et al. 2020; Martín-Martín et al. 2021; Mongeon & Paul-Hus 2016). However, a database search can easily lead to an overload of potentially relevant studies. For example, key term searches in Google Scholar for “entrepreneurial intention” and “firm diversification” resulted in more than 660,000 and 810,000 hits, respectively. Therefore, a precise research question and precise search terms using Boolean operators are advisable (Gusenbauer and Haddaway 2020). Addressing the challenge of identifying relevant articles in the growing number of database publications, (semi)automated approaches using text mining and machine learning (Bosco et al. 2017; O’Mara-Eves et al. 2015; Ouzzani et al. 2016; Thomas et al. 2017) can also be promising and time-saving search tools in the future. Also, some electronic databases offer the possibility to track forward citations of influential studies and thereby identify further relevant articles. Finally, collecting unpublished or undetected studies through conferences, personal contact with (leading) scholars, or listservs can be strategies to increase the study sample size (Grewal et al. 2018; Harari et al. 2020; Pigott and Polanin 2020).

2.2.2 Study inclusion criteria and sample composition

Next, researchers must decide which studies to include in the meta-analysis. Some guidelines for literature reviews recommend limiting the sample to studies published in renowned academic journals to ensure the quality of findings (e.g., Kraus et al. 2020 ). For meta-analysis, however, Steel et al. ( 2021 ) advocate for the inclusion of all available studies, including grey literature, to prevent selection biases based on availability, cost, familiarity, and language (Rothstein et al. 2005 ), or the “Matthew effect”, which denotes the phenomenon that highly cited articles are found faster than less cited articles (Merton 1968 ). Harrison et al. ( 2017 ) find that the effects of published studies in management are inflated on average by 30% compared to unpublished studies. This so-called publication bias or “file drawer problem” (Rosenthal 1979 ) results from the preference of academia to publish more statistically significant and less statistically insignificant study results. Owen and Li ( 2020 ) showed that publication bias is particularly severe when variables of interest are used as key variables rather than control variables. To consider the true effect size of a target variable or relationship, the inclusion of all types of research outputs is therefore recommended (Polanin et al. 2016 ). Different test procedures to identify publication bias are discussed subsequently in Step 7.

In addition to the decision of whether to include certain study types (i.e., published vs. unpublished studies), there can be other reasons to exclude studies that are identified in the search process. These reasons can be manifold and are primarily related to the specific research question and methodological peculiarities. For example, studies identified by keyword search might not qualify thematically after all, may use unsuitable variable measurements, or may not report usable effect sizes. Furthermore, there might be multiple studies by the same authors using similar datasets. If they do not differ sufficiently in terms of their sample characteristics or variables used, only one of these studies should be included to prevent bias from duplicates (Wood 2008 ; see this article for a detection heuristic).

In general, the screening process should be conducted stepwise, beginning with a removal of duplicate citations from different databases, followed by abstract screening to exclude clearly unsuitable studies and a final full-text screening of the remaining articles (Pigott and Polanin 2020 ). A graphical tool to systematically document the sample selection process is the PRISMA flow diagram (Moher et al. 2009 ). Page et al. ( 2021 ) recently presented an updated version of the PRISMA statement, including an extended item checklist and flow diagram to report the study process and findings.

2.3 Step 3: choice of the effect size measure

2.3.1 Types of effect sizes

The two most common meta-analytical effect size measures in management studies are (z-transformed) correlation coefficients and standardized mean differences (Aguinis et al. 2011a ; Geyskens et al. 2009 ). However, meta-analyses in management science and related fields may not be limited to those two effect size measures but rather depend on the subfield of investigation (Borenstein 2009 ; Stanley and Doucouliagos 2012 ). In economics and finance, researchers are more interested in the examination of elasticities and marginal effects extracted from regression models than in pure bivariate correlations (Stanley and Doucouliagos 2012 ). Regression coefficients can also be converted to partial correlation coefficients based on their t-statistics to make regression results comparable across studies (Stanley and Doucouliagos 2012 ). Although some meta-analyses in management research have combined bivariate and partial correlations in their study samples, Aloe ( 2015 ) and Combs et al. ( 2019 ) advise researchers not to use this practice. Most importantly, they argue that the effect size strength of partial correlations depends on the other variables included in the regression model and is therefore incomparable to bivariate correlations (Schmidt and Hunter 2015 ), resulting in a possible bias of the meta-analytic results (Roth et al. 2018 ). We endorse this opinion. If at all, we recommend separate analyses for each measure. In addition to these measures, survival rates, risk ratios or odds ratios, which are common measures in medical research (Borenstein 2009 ), can be suitable effect sizes for specific management research questions, such as understanding the determinants of the survival of startup companies. To summarize, the choice of a suitable effect size is often taken away from the researcher because it is typically dependent on the investigated research question as well as the conventions of the specific research field (Cheung and Vijayakumar 2016 ).

2.3.2 Conversion of effect sizes to a common measure

After having defined the primary effect size measure for the meta-analysis, it might become necessary in the later coding process to convert study findings that are reported in effect sizes that are different from the chosen primary effect size. For example, a study might report only descriptive statistics for two study groups but no correlation coefficient, which is used as the primary effect size measure in the meta-analysis. Different effect size measures can be harmonized using conversion formulae, which are provided by standard method books such as Borenstein et al. (2009) or Lipsey and Wilson (2001). There also exist online effect size calculators for meta-analysis.
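
As a small illustration of such conversion formulae, the sketch below implements the widely used approximations relating a standardized mean difference (Cohen's d) to a correlation r, assuming roughly equal group sizes; for real analyses, the formulae in Borenstein et al. (2009) or an established calculator should be used.

```python
import math

def d_to_r(d):
    """Convert a standardized mean difference (Cohen's d) to a correlation r,
    assuming roughly equal group sizes."""
    return d / math.sqrt(d ** 2 + 4)

def r_to_d(r):
    """Convert a correlation r to a standardized mean difference d."""
    return 2 * r / math.sqrt(1 - r ** 2)

print(round(d_to_r(0.5), 3))   # d = 0.5 corresponds to r of about 0.243
print(round(r_to_d(0.3), 3))   # r = 0.3 corresponds to d of about 0.629
```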

2.4 Step 4: choice of the analytical method used

Choosing which meta-analytical method to use is directly connected to the research question of the meta-analysis. Research questions in meta-analyses can address a relationship between constructs or an effect of an intervention in a general manner, or they can focus on moderating or mediating effects. There are four meta-analytical methods that are primarily used in contemporary management research (Combs et al. 2019 ; Geyer-Klingeberg et al. 2020 ), which allow the investigation of these different types of research questions: traditional univariate meta-analysis, meta-regression, meta-analytic structural equation modeling, and qualitative meta-analysis (Hoon 2013 ). While the first three are quantitative, the latter summarizes qualitative findings. Table 1 summarizes the key characteristics of the three quantitative methods.

2.4.1 Univariate meta-analysis

In its traditional form, a meta-analysis reports a weighted mean effect size for the relationship or intervention of investigation and provides information on the magnitude of variance among primary studies (Aguinis et al. 2011c ; Borenstein et al. 2009 ). Accordingly, it serves as a quantitative synthesis of a research field (Borenstein et al. 2009 ; Geyskens et al. 2009 ). Prominent traditional approaches have been developed, for example, by Hedges and Olkin ( 1985 ) or Hunter and Schmidt ( 1990 , 2004 ). However, going beyond its simple summary function, the traditional approach has limitations in explaining the observed variance among findings (Gonzalez-Mulé and Aguinis 2018 ). To identify moderators (or boundary conditions) of the relationship of interest, meta-analysts can create subgroups and investigate differences between those groups (Borenstein and Higgins 2013 ; Hunter and Schmidt 2004 ). Potential moderators can be study characteristics (e.g., whether a study is published vs. unpublished), sample characteristics (e.g., study country, industry focus, or type of survey/experiment participants), or measurement artifacts (e.g., different types of variable measurements). The univariate approach is thus suitable to identify the overall direction of a relationship and can serve as a good starting point for additional analyses. However, due to its limitations in examining boundary conditions and developing theory, the univariate approach on its own is currently oftentimes viewed as not sufficient (Rauch 2020 ; Shaw and Ertug 2017 ).
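
The following sketch illustrates the basic mechanics of an inverse-variance random-effects pooling of correlations (Fisher z-transformation, a DerSimonian-Laird estimate of between-study variance, and back-transformation). It is a simplified illustration with hypothetical effect sizes, not a substitute for the Hedges and Olkin (1985) or Hunter and Schmidt (2004) procedures as implemented in dedicated meta-analysis packages.

```python
import numpy as np

# Hypothetical primary-study correlations and sample sizes
r = np.array([0.30, 0.25, 0.40, 0.10, 0.35])
n = np.array([120, 85, 200, 60, 150])

# Fisher z-transform and sampling variances
z = np.arctanh(r)
v = 1.0 / (n - 3)

# Fixed-effect (inverse-variance) pooling
w = 1.0 / v
z_fixed = np.sum(w * z) / np.sum(w)

# DerSimonian-Laird estimate of between-study variance (tau^2)
q = np.sum(w * (z - z_fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (len(r) - 1)) / c)

# Random-effects pooling and back-transformation to the r metric
w_star = 1.0 / (v + tau2)
z_random = np.sum(w_star * z) / np.sum(w_star)
se_random = np.sqrt(1.0 / np.sum(w_star))
ci = np.tanh([z_random - 1.96 * se_random, z_random + 1.96 * se_random])

print("Pooled r (random effects):", round(np.tanh(z_random), 3))
print("95% CI:", np.round(ci, 3))
print("tau^2:", round(tau2, 4))
```

In practice, researchers would also report heterogeneity statistics and would normally rely on an established package rather than hand-rolled code.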

2.4.2 Meta-regression analysis

Meta-regression analysis (Hedges and Olkin 1985 ; Lipsey and Wilson 2001 ; Stanley and Jarrell 1989 ) aims to investigate the heterogeneity among observed effect sizes by testing multiple potential moderators simultaneously. In meta-regression, the coded effect size is used as the dependent variable and is regressed on a list of moderator variables. These moderator variables can be categorical variables as described previously in the traditional univariate approach or (semi)continuous variables such as country scores that are merged with the meta-analytical data. Thus, meta-regression analysis overcomes the disadvantages of the traditional approach, which only allows us to investigate moderators singularly using dichotomized subgroups (Combs et al. 2019 ; Gonzalez-Mulé and Aguinis 2018 ). These possibilities allow a more fine-grained analysis of research questions that are related to moderating effects. However, Schmidt ( 2017 ) critically notes that the number of effect sizes in the meta-analytical sample must be sufficiently large to produce reliable results when investigating multiple moderators simultaneously in a meta-regression. For further reading, Tipton et al. ( 2019 ) outline the technical, conceptual, and practical developments of meta-regression over the last decades. Gonzalez-Mulé and Aguinis ( 2018 ) provide an overview of methodological choices and develop evidence-based best practices for future meta-analyses in management using meta-regression.
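
To illustrate the basic idea of meta-regression (regressing coded effect sizes on moderators with inverse-variance weights), the hedged sketch below uses the Python statsmodels library and hypothetical data. Dedicated meta-regression routines, such as those in the R package metafor discussed in Step 5, estimate the between-study variance component and standard errors properly and should be preferred for actual analyses.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical effect sizes (Fisher z), their variances, and one moderator
z = np.array([0.31, 0.26, 0.42, 0.10, 0.37, 0.22])
v = np.array([0.0085, 0.0122, 0.0051, 0.0175, 0.0068, 0.0100])
published = np.array([1, 1, 1, 0, 1, 0])   # moderator: published vs. unpublished

# Weighted least squares with inverse-variance weights
# (a fixed-effect style meta-regression; random-effects versions add tau^2 to v)
X = sm.add_constant(published)
model = sm.WLS(z, X, weights=1.0 / v).fit()
print(model.summary())
```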

2.4.3 Meta-analytic structural equation modeling (MASEM)

MASEM combines meta-analysis and structural equation modeling and allows researchers to simultaneously investigate the relationships among several constructs in a path model. Researchers can use MASEM to test several competing theoretical models against each other or to identify mediation mechanisms in a chain of relationships (Bergh et al. 2016). This method is typically performed in two steps (Cheung and Chan 2005): in Step 1, a pooled correlation matrix is derived, which includes the meta-analytical mean effect sizes for all variable combinations; Step 2 then uses this matrix to fit the path model. While MASEM relied primarily on traditional univariate meta-analysis to derive the pooled correlation matrix in its early years (Viswesvaran and Ones 1995), more advanced methods, such as the GLS approach (Becker 1992, 1995) or the TSSEM approach (Cheung and Chan 2005), have subsequently been developed. Cheung (2015a) and Jak (2015) provide overviews of these approaches in their books, with exemplary code. For datasets with more complex structures, Wilson et al. (2016) also developed a multilevel approach that is related to the TSSEM approach in the second step. Bergh et al. (2016) discuss nine decision points and develop best practices for MASEM studies.
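A minimal sketch of the two-stage TSSEM workflow with the R package metaSEM (Cheung 2015b) is given below. Here, `cor_list` (a list of primary-study correlation matrices) and `n_vec` (the corresponding sample sizes) are hypothetical objects, and the second stage is only indicated in comments because its model matrices depend on the specific path model being hypothesized.

library(metaSEM)

# Stage 1: pool the correlation matrices under a random-effects model
stage1 <- tssem1(cor_list, n_vec, method = "REM", RE.type = "Diag")
summary(stage1)

# Stage 2: fit the hypothesized path model on the pooled matrix. The RAM
# matrices (Amatrix = regression paths, Smatrix = variances/covariances) are
# typically built with create.mxMatrix(); see Cheung (2015a) or Jak (2015)
# for complete worked examples.
# stage2 <- tssem2(stage1, Amatrix = A, Smatrix = S)
# summary(stage2)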

2.4.4 Qualitative meta-analysis

While the approaches explained above focus on quantitative outcomes of empirical studies, qualitative meta-analysis aims to synthesize qualitative findings from case studies (Hoon 2013 ; Rauch et al. 2014 ). The distinctive feature of qualitative case studies is their potential to provide in-depth information about specific contextual factors or to shed light on reasons for certain phenomena that cannot usually be investigated by quantitative studies (Rauch 2020 ; Rauch et al. 2014 ). In a qualitative meta-analysis, the identified case studies are systematically coded in a meta-synthesis protocol, which is then used to identify influential variables or patterns and to derive a meta-causal network (Hoon 2013 ). Thus, the insights of contextualized and typically nongeneralizable single studies are aggregated to a larger, more generalizable picture (Habersang et al. 2019 ). Although still the exception, this method can thus provide important contributions for academics in terms of theory development (Combs et al., 2019 ; Hoon 2013 ) and for practitioners in terms of evidence-based management or entrepreneurship (Rauch et al. 2014 ). Levitt ( 2018 ) provides a guide and discusses conceptual issues for conducting qualitative meta-analysis in psychology, which is also useful for management researchers.

2.5 Step 5: choice of software

Software solutions to perform meta-analyses range from built-in functions or additional packages of statistical software to software purely focused on meta-analyses and from commercial to open-source solutions. However, in addition to personal preferences, the choice of the most suitable software depends on the complexity of the methods used and the dataset itself (Cheung and Vijayakumar 2016 ). Meta-analysts therefore must carefully check if their preferred software is capable of performing the intended analysis.

Among commercial software providers, Stata (from version 16 onward) offers built-in functions to perform various meta-analytical analyses and to produce various plots (Palmer and Sterne 2016). For SPSS and SAS, several macros for meta-analyses are provided by scholars such as David B. Wilson or Andy P. Field and Raphael Gillett (Field and Gillett 2010). For researchers using the open-source software R (R Core Team 2021), Polanin et al. (2017) provide an overview of 63 meta-analysis packages and their functionalities. For new users, they recommend the package metafor (Viechtbauer 2010), which includes most necessary functions and for which the author Wolfgang Viechtbauer provides tutorials on his project website. In addition to packages and macros for statistical software, templates for Microsoft Excel have also been developed to conduct simple meta-analyses, such as Meta-Essentials by Suurmond et al. (2017). Finally, programs dedicated purely to meta-analysis also exist, such as Comprehensive Meta-Analysis (Borenstein et al. 2013) or RevMan by The Cochrane Collaboration (2020).

2.6 Step 6: coding of effect sizes

2.6.1 Coding sheet

The first step in the coding process is the design of the coding sheet. A universal template does not exist because the design of the coding sheet depends on the methods used, the respective software, and the complexity of the research design. For univariate meta-analysis or meta-regression, data are typically coded in wide format. In its simplest form, when investigating a correlational relationship between two variables using the univariate approach, the coding sheet would contain a column for the study name or identifier, the effect size coded from the primary study, and the study sample size. However, such simple relationships are unlikely in management research because the included studies are typically not identical but differ in several respects. With more complex data structures or moderator variables being investigated, additional columns are added to the coding sheet to reflect the data characteristics. These variables can be coded as dummy, factor, or (semi)continuous variables and later used to perform a subgroup analysis or meta-regression. For MASEM, the required data input format can deviate depending on the method used (e.g., TSSEM requires a list of correlation matrices as data input). For qualitative meta-analysis, the coding scheme typically summarizes the key qualitative findings and important contextual and conceptual information (see Hoon (2013) for a coding scheme for qualitative meta-analysis). Figure 1 shows an exemplary coding scheme for a quantitative meta-analysis on the correlational relationship between top-management team diversity and profitability. In addition to effect and sample sizes, information about the study country, firm type, and variable operationalizations is coded. The list could be extended by further study and sample characteristics.

Figure 1. Exemplary coding sheet for a meta-analysis on the relationship (correlation) between top-management team diversity and profitability
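In R, such a wide-format coding sheet is simply a data frame with one row per effect size. The entries below are hypothetical and only mirror the kind of columns described above.

coding_sheet <- data.frame(
  study_id          = c("Author2015", "Author2018", "Author2020"),
  r                 = c(0.12, 0.25, -0.03),           # coded correlation
  n                 = c(240, 415, 132),               # study sample size
  country           = c("USA", "Germany", "Japan"),
  firm_type         = c("listed", "private", "listed"),
  diversity_measure = c("Blau index", "ratio", "Blau index"),
  profit_measure    = c("ROA", "ROE", "ROA")
)
coding_sheet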

2.6.2 Inclusion of moderator or control variables

It is generally important to consider the intended research model and relevant nontarget variables before coding a meta-analytic dataset. For example, study characteristics can be important moderators or function as control variables in a meta-regression model. Similarly, control variables may be relevant in a MASEM approach to reduce confounding bias. Coding additional variables or constructs at a later stage can be arduous if the sample of primary studies is large. However, the decision to include respective moderator or control variables, as in any empirical analysis, should always be based on strong (theoretical) rationales about how these variables can impact the investigated effect (Bernerth and Aguinis 2016; Bernerth et al. 2018; Thompson and Higgins 2002). While substantive moderators refer to theoretical constructs that act as buffers or enhancers of a supposed causal process, methodological moderators are features of the respective research designs that denote the methodological context of the observations and are important to control for systematic statistical particularities (Rudolph et al. 2020). Havranek et al. (2020) provide a list of recommended variables to code as potential moderators. While researchers may have clear expectations about the effects of some of these moderators, expectations for other moderators may be more tentative, and moderator analysis may be approached in a rather exploratory fashion. Thus, we argue that researchers should make full use of the meta-analytical design to obtain insights about potential context dependence that a primary study cannot achieve.

2.6.3 Treatment of multiple effect sizes in a study

A long-debated issue in conducting meta-analyses is whether to use only one or all available effect sizes for the same construct within a single primary study. For meta-analyses in management research, this question is fundamental because many empirical studies, particularly those relying on company databases, use multiple variables for the same construct to perform sensitivity analyses, resulting in multiple relevant effect sizes. In this case, researchers can either (randomly) select a single value, calculate a study average, or use the complete set of effect sizes (Bijmolt and Pieters 2001; López-López et al. 2018). Multiple effect sizes from the same study enrich the meta-analytic dataset and allow us to investigate the heterogeneity of the relationship of interest, such as different variable operationalizations (López-López et al. 2018; Moeyaert et al. 2017). However, including more than one effect size from the same study violates the independence assumption of observations (Cheung 2019; López-López et al. 2018), which can lead to biased results and erroneous conclusions (Gooty et al. 2021). We follow the recommendation of current best-practice guides to take advantage of all available effect size observations but to carefully account for their interdependencies using appropriate methods such as multilevel models, panel regression models, or robust variance estimation (Cheung 2019; Geyer-Klingeberg et al. 2020; Gooty et al. 2021; López-López et al. 2018; Moeyaert et al. 2017).
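Two of the approaches mentioned above can be sketched in metafor. In the sketch below, `dat_multi` is a hypothetical dataset with columns yi and vi (effect sizes and sampling variances), a study identifier (study), and an effect-size identifier (es_id).

library(metafor)

# Option 1: three-level model with effect sizes nested within studies
res_ml <- rma.mv(yi, vi, random = ~ 1 | study/es_id, data = dat_multi)
summary(res_ml)

# Option 2: cluster-robust variance estimation on top of the fitted model
res_rve <- robust(res_ml, cluster = dat_multi$study)
res_rve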

2.7 Step 7: analysis

2.7.1 Outlier analysis and tests for publication bias

Before conducting the primary analysis, some preliminary sensitivity analyses may be necessary to ensure the robustness of the meta-analytical findings (Rudolph et al. 2020). First, influential outlier observations could potentially bias the observed results, particularly if the number of total effect sizes is small. Several statistical methods can be used to identify outliers in meta-analytical datasets (Aguinis et al. 2013; Viechtbauer and Cheung 2010). However, there is a debate about whether to keep or omit these observations. In any case, the studies concerned should be closely inspected to find an explanation for their deviating results. As in any other primary study, outliers can be valid observations, albeit ones representing a different population, measure, construct, design, or procedure. Thus, inferences about outliers can provide a basis for inferring potential moderators (Aguinis et al. 2013; Steel et al. 2021). On the other hand, outliers can indicate invalid research, for instance, when unrealistically strong correlations are due to construct overlap (i.e., a lack of clear demarcation between independent and dependent variables), invalid measures, or simply typing errors when coding effect sizes. An advisable step is therefore to compare the results both with and without outliers and to decide whether to exclude outlier observations only after careful consideration (Geyskens et al. 2009; Grewal et al. 2018; Kepes et al. 2013). Moreover, instead of focusing simply on the size of an outlier, its leverage should be considered as well. Accordingly, Viechtbauer and Cheung (2010) propose considering a combination of standardized deviation and a study's leverage.
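As a minimal sketch of such diagnostics in metafor (in the spirit of Viechtbauer and Cheung 2010), assuming `res` and `dat` are the hypothetical model and coding sheet from the earlier sketches:

library(metafor)

inf <- influence(res)   # studentized residuals, Cook's distances, hat values, etc.
print(inf)
plot(inf)               # flags potentially influential effect sizes

# Leave-one-out sensitivity check: how the pooled estimate shifts when each
# study is removed in turn
leave1out(res)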

Second, as mentioned in the context of the literature search, potential publication bias may be an issue. Publication bias can be examined in multiple ways (Rothstein et al. 2005). First, the funnel plot is a simple graphical tool that can provide an overview of the effect size distribution and help to detect publication bias (Stanley and Doucouliagos 2010). A funnel plot can also help identify potential outliers. As mentioned above, a graphical display of deviation (e.g., studentized residuals) and leverage (Cook's distance) can help detect the presence of outliers and evaluate their influence (Viechtbauer and Cheung 2010). Moreover, several statistical procedures can be used to test for publication bias (Harrison et al. 2017; Kepes et al. 2012), including subgroup comparisons between published and unpublished studies, Begg and Mazumdar's (1994) rank correlation test, cumulative meta-analysis (Borenstein et al. 2009), the trim-and-fill method (Duval and Tweedie 2000a, b), Egger et al.'s (1997) regression test, failsafe N (Rosenthal 1979), and selection models (Hedges and Vevea 2005; Vevea and Woods 2005). In examining potential publication bias, Kepes et al. (2012) and Harrison et al. (2017) both recommend not relying on a single test but rather using multiple conceptually different test procedures (the so-called triangulation approach).
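Several of the publication-bias checks listed above are implemented in metafor; the sketch below again assumes the hypothetical model `res` and coding sheet `dat` from the earlier sketches.

library(metafor)

funnel(res)              # funnel plot of effect sizes against standard errors
regtest(res)             # Egger-type regression test for funnel-plot asymmetry
ranktest(res)            # Begg and Mazumdar rank correlation test
trimfill(res)            # Duval and Tweedie trim-and-fill adjustment
fsn(yi, vi, data = dat)  # Rosenthal's failsafe N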

2.7.2 Model choice

After checking and correcting for the potential presence of impactful outliers or publication bias, the next step in a meta-analysis is the primary analysis, where meta-analysts must decide between two types of models that rest on different assumptions: fixed-effects and random-effects models (Borenstein et al. 2010). Fixed-effects models assume that all observations share a common mean effect size, which means that differences are due only to sampling error, while random-effects models assume heterogeneity and allow the true effect sizes to vary across studies (Borenstein et al. 2010; Cheung and Vijayakumar 2016; Hunter and Schmidt 2004). Both models are explained in detail in standard textbooks (e.g., Borenstein et al. 2009; Hunter and Schmidt 2004; Lipsey and Wilson 2001).

In general, the presence of heterogeneity is likely in management meta-analyses because most studies do not have identical empirical settings, which can yield different effect size strengths or directions for the same investigated phenomenon. For example, the identified studies have been conducted in different countries with different institutional settings, or the type of study participants varies (e.g., students vs. employees, blue-collar vs. white-collar workers, or manufacturing vs. service firms). Thus, the vast majority of meta-analyses in management research and related fields use random-effects models (Aguinis et al. 2011a ). In a meta-regression, the random-effects model turns into a so-called mixed-effects model because moderator variables are added as fixed effects to explain the impact of observed study characteristics on effect size variations (Raudenbush 2009 ).
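The model choice translates into a single argument in metafor. The sketch below contrasts the two model types and the mixed-effects extension, again using the hypothetical coding sheet `dat` from the earlier sketches.

library(metafor)

res_fe <- rma(yi, vi, data = dat, method = "FE")    # fixed-effects (common-effect) model
res_re <- rma(yi, vi, data = dat, method = "REML")  # random-effects model

# Adding moderators to the random-effects model yields a mixed-effects model
res_me <- rma(yi, vi, mods = ~ published, data = dat, method = "REML")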

2.8 Step 8: reporting results

2.8.1 Reporting in the article

The final step in performing a meta-analysis is reporting its results. Most importantly, all steps and methodological decisions should be comprehensible to the reader. DeSimone et al. ( 2020 ) provide an extensive checklist for journal reviewers of meta-analytical studies. This checklist can also be used by authors when performing their analyses and reporting their results to ensure that all important aspects have been addressed. Alternative checklists are provided, for example, by Appelbaum et al. ( 2018 ) or Page et al. ( 2021 ). Similarly, Levitt et al. ( 2018 ) provide a detailed guide for qualitative meta-analysis reporting standards.

For quantitative meta-analyses, tables reporting results should include all important information and test statistics, including mean effect sizes; standard errors and confidence intervals; the number of observations and study samples included; and heterogeneity measures. If the meta-analytic sample is rather small, a forest plot provides a good overview of the different findings and their precision; however, such a figure becomes impractical for meta-analyses with several hundred effect sizes. Results displayed in tables and figures must also be explained verbally in the results and discussion sections. Most importantly, authors must answer the primary research question, i.e., whether there is a positive, negative, or no relationship between the variables of interest, or whether the examined intervention has a certain effect. These results should be interpreted with regard to their magnitude and significance, both economically and statistically. When discussing meta-analytical results, authors must also convey the complexity of the findings, including the identified heterogeneity and important moderators, future research directions, and theoretical relevance (DeSimone et al. 2019). In particular, the discussion of identified heterogeneity and underlying moderator effects is critical; omitting this information can lead to false conclusions among readers, who may interpret the reported mean effect size as universal for all included primary studies and ignore the variability of findings when citing the meta-analytic results in their own research (Aytug et al. 2012; DeSimone et al. 2019).
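Most of the reporting elements mentioned above can be read directly off a fitted metafor object; `res` is the hypothetical random-effects model from the earlier sketches.

library(metafor)

summary(res)   # k, pooled estimate, standard error, confidence interval, Q, tau^2, I^2
forest(res)    # forest plot; informative when the number of effect sizes is small

# Collect key quantities for a results table
c(k = res$k, estimate = res$b, se = res$se, tau2 = res$tau2, I2 = res$I2)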

2.8.2 Open-science practices

Another increasingly important topic is the public provision of meta-analytical datasets and statistical code via open-source repositories. Open-science practices allow for the validation of results and for the reuse of coded data in subsequent meta-analyses (Polanin et al. 2020), contributing to the development of cumulative science. Steel et al. (2021) refer to open-science meta-analyses as a step towards "living systematic reviews" (Elliott et al. 2017) with continuous updates in real time. MRQ supports this development and encourages authors to make their datasets publicly available. Moreau and Gamble (2020), for example, provide various templates and video tutorials for conducting open-science meta-analyses. Several open-science repositories exist, such as the Open Science Framework (OSF; for a tutorial, see Soderberg 2018), where researchers can preregister their studies and make documents publicly available. Furthermore, several initiatives in the social sciences have been established to develop dynamic meta-analyses, such as metaBUS (Bosco et al. 2015, 2017), MetaLab (Bergmann et al. 2018), or PsychOpen CAMA (Burgard et al. 2021).

3 Conclusion

This editorial provides a comprehensive overview of the essential steps in conducting and reporting a meta-analysis with references to more in-depth methodological articles. It also serves as a guide for meta-analyses submitted to MRQ and other management journals. MRQ welcomes all types of meta-analyses from all subfields and disciplines of management research.

Gusenbauer and Haddaway ( 2020 ), however, point out that Google Scholar is not appropriate as a primary search engine due to a lack of reproducibility of search results.

One effect size calculator by David B. Wilson is accessible via: https://www.campbellcollaboration.org/escalc/html/EffectSizeCalculator-Home.php .

The macros of David B. Wilson can be downloaded from: http://mason.gmu.edu/~dwilsonb/ .

The macros of Field and Gillett ( 2010 ) can be downloaded from: https://www.discoveringstatistics.com/repository/fieldgillett/how_to_do_a_meta_analysis.html .

The tutorials can be found via: https://www.metafor-project.org/doku.php .

metafor does not currently provide functions to conduct MASEM. For MASEM, users can, for instance, use the package metaSEM (Cheung 2015b).

The workbooks can be downloaded from: https://www.erim.eur.nl/research-support/meta-essentials/ .

Aguinis H, Dalton DR, Bosco FA, Pierce CA, Dalton CM (2011a) Meta-analytic choices and judgment calls: Implications for theory building and testing, obtained effect sizes, and scholarly impact. J Manag 37(1):5–38

Aguinis H, Gottfredson RK, Joo H (2013) Best-practice recommendations for defining, identifying, and handling outliers. Organ Res Methods 16(2):270–301

Aguinis H, Gottfredson RK, Wright TA (2011b) Best-practice recommendations for estimating interaction effects using meta-analysis. J Organ Behav 32(8):1033–1043

Aguinis H, Pierce CA, Bosco FA, Dalton DR, Dalton CM (2011c) Debunking myths and urban legends about meta-analysis. Organ Res Methods 14(2):306–331

Aloe AM (2015) Inaccuracy of regression results in replacing bivariate correlations. Res Synth Methods 6(1):21–27

Anderson RG, Kichkha A (2017) Replication, meta-analysis, and research synthesis in economics. Am Econ Rev 107(5):56–59

Appelbaum M, Cooper H, Kline RB, Mayo-Wilson E, Nezu AM, Rao SM (2018) Journal article reporting standards for quantitative research in psychology: the APA publications and communications BOARD task force report. Am Psychol 73(1):3–25

Aytug ZG, Rothstein HR, Zhou W, Kern MC (2012) Revealed or concealed? Transparency of procedures, decisions, and judgment calls in meta-analyses. Organ Res Methods 15(1):103–133

Begg CB, Mazumdar M (1994) Operating characteristics of a rank correlation test for publication bias. Biometrics 50(4):1088–1101. https://doi.org/10.2307/2533446

Bergh DD, Aguinis H, Heavey C, Ketchen DJ, Boyd BK, Su P, Lau CLL, Joo H (2016) Using meta-analytic structural equation modeling to advance strategic management research: Guidelines and an empirical illustration via the strategic leadership-performance relationship. Strateg Manag J 37(3):477–497

Becker BJ (1992) Using results from replicated studies to estimate linear models. J Educ Stat 17(4):341–362

Becker BJ (1995) Corrections to “Using results from replicated studies to estimate linear models.” J Edu Behav Stat 20(1):100–102

Bergmann C, Tsuji S, Piccinini PE, Lewis ML, Braginsky M, Frank MC, Cristia A (2018) Promoting replicability in developmental research through meta-analyses: Insights from language acquisition research. Child Dev 89(6):1996–2009

Bernerth JB, Aguinis H (2016) A critical review and best-practice recommendations for control variable usage. Pers Psychol 69(1):229–283

Bernerth JB, Cole MS, Taylor EC, Walker HJ (2018) Control variables in leadership research: A qualitative and quantitative review. J Manag 44(1):131–160

Bijmolt TH, Pieters RG (2001) Meta-analysis in marketing when studies contain multiple measurements. Mark Lett 12(2):157–169

Block J, Kuckertz A (2018) Seven principles of effective replication studies: Strengthening the evidence base of management research. Manag Rev Quart 68:355–359

Borenstein M (2009) Effect sizes for continuous data. In: Cooper H, Hedges LV, Valentine JC (eds) The handbook of research synthesis and meta-analysis. Russell Sage Foundation, pp 221–235

Borenstein M, Hedges LV, Higgins JPT, Rothstein HR (2009) Introduction to meta-analysis. John Wiley, Chichester

Borenstein M, Hedges LV, Higgins JPT, Rothstein HR (2010) A basic introduction to fixed-effect and random-effects models for meta-analysis. Res Synth Methods 1(2):97–111

Borenstein M, Hedges L, Higgins J, Rothstein H (2013) Comprehensive meta-analysis (version 3). Biostat, Englewood, NJ

Borenstein M, Higgins JP (2013) Meta-analysis and subgroups. Prev Sci 14(2):134–143

Bosco FA, Steel P, Oswald FL, Uggerslev K, Field JG (2015) Cloud-based meta-analysis to bridge science and practice: Welcome to metaBUS. Person Assess Decis 1(1):3–17

Bosco FA, Uggerslev KL, Steel P (2017) MetaBUS as a vehicle for facilitating meta-analysis. Hum Resour Manag Rev 27(1):237–254

Burgard T, Bošnjak M, Studtrucker R (2021) Community-augmented meta-analyses (CAMAs) in psychology: potentials and current systems. Zeitschrift Für Psychologie 229(1):15–23

Cheung MWL (2015a) Meta-analysis: A structural equation modeling approach. John Wiley & Sons, Chichester

Cheung MWL (2015b) metaSEM: An R package for meta-analysis using structural equation modeling. Front Psychol 5:1521

Cheung MWL (2019) A guide to conducting a meta-analysis with non-independent effect sizes. Neuropsychol Rev 29(4):387–396

Cheung MWL, Chan W (2005) Meta-analytic structural equation modeling: a two-stage approach. Psychol Methods 10(1):40–64

Cheung MWL, Vijayakumar R (2016) A guide to conducting a meta-analysis. Neuropsychol Rev 26(2):121–128

Combs JG, Crook TR, Rauch A (2019) Meta-analytic research in management: contemporary approaches unresolved controversies and rising standards. J Manag Stud 56(1):1–18. https://doi.org/10.1111/joms.12427

DeSimone JA, Köhler T, Schoen JL (2019) If it were only that easy: the use of meta-analytic research by organizational scholars. Organ Res Methods 22(4):867–891. https://doi.org/10.1177/1094428118756743

DeSimone JA, Brannick MT, O’Boyle EH, Ryu JW (2020) Recommendations for reviewing meta-analyses in organizational research. Organ Res Methods 56:455–463

Duval S, Tweedie R (2000a) Trim and fill: a simple funnel-plot–based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56(2):455–463

Duval S, Tweedie R (2000b) A nonparametric “trim and fill” method of accounting for publication bias in meta-analysis. J Am Stat Assoc 95(449):89–98

Egger M, Smith GD, Schneider M, Minder C (1997) Bias in meta-analysis detected by a simple, graphical test. BMJ 315(7109):629–634

Eisend M (2017) Meta-Analysis in advertising research. J Advert 46(1):21–35

Elliott JH, Synnot A, Turner T, Simmons M, Akl EA, McDonald S, Salanti G, Meerpohl J, MacLehose H, Hilton J, Tovey D, Shemilt I, Thomas J (2017) Living systematic review: 1. Introduction—the why, what, when, and how. J Clin Epidemiol 91:23–30. https://doi.org/10.1016/j.jclinepi.2017.08.010

Field AP, Gillett R (2010) How to do a meta-analysis. Br J Math Stat Psychol 63(3):665–694

Fisch C, Block J (2018) Six tips for your (systematic) literature review in business and management research. Manag Rev Quart 68:103–106

Fortunato S, Bergstrom CT, Börner K, Evans JA, Helbing D, Milojević S, Petersen AM, Radicchi F, Sinatra R, Uzzi B, Vespignani A (2018) Science of science. Science 359(6379). https://doi.org/10.1126/science.aao0185

Geyer-Klingeberg J, Hang M, Rathgeber A (2020) Meta-analysis in finance research: Opportunities, challenges, and contemporary applications. Int Rev Finan Anal 71:101524

Geyskens I, Krishnan R, Steenkamp JBE, Cunha PV (2009) A review and evaluation of meta-analysis practices in management research. J Manag 35(2):393–419

Glass GV (2015) Meta-analysis at middle age: a personal history. Res Synth Methods 6(3):221–231

Gonzalez-Mulé E, Aguinis H (2018) Advancing theory by assessing boundary conditions with metaregression: a critical review and best-practice recommendations. J Manag 44(6):2246–2273

Gooty J, Banks GC, Loignon AC, Tonidandel S, Williams CE (2021) Meta-analyses as a multi-level model. Organ Res Methods 24(2):389–411. https://doi.org/10.1177/1094428119857471

Grewal D, Puccinelli N, Monroe KB (2018) Meta-analysis: integrating accumulated knowledge. J Acad Mark Sci 46(1):9–30

Gurevitch J, Koricheva J, Nakagawa S, Stewart G (2018) Meta-analysis and the science of research synthesis. Nature 555(7695):175–182

Gusenbauer M, Haddaway NR (2020) Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources. Res Synth Methods 11(2):181–217

Habersang S, Küberling-Jost J, Reihlen M, Seckler C (2019) A process perspective on organizational failure: a qualitative meta-analysis. J Manage Stud 56(1):19–56

Harari MB, Parola HR, Hartwell CJ, Riegelman A (2020) Literature searches in systematic reviews and meta-analyses: A review, evaluation, and recommendations. J Vocat Behav 118:103377

Harrison JS, Banks GC, Pollack JM, O’Boyle EH, Short J (2017) Publication bias in strategic management research. J Manag 43(2):400–425

Havránek T, Stanley TD, Doucouliagos H, Bom P, Geyer-Klingeberg J, Iwasaki I, Reed WR, Rost K, Van Aert RCM (2020) Reporting guidelines for meta-analysis in economics. J Econ Surveys 34(3):469–475

Hedges LV, Olkin I (1985) Statistical methods for meta-analysis. Academic Press, Orlando

Hedges LV, Vevea JL (2005) Selection methods approaches. In: Rothstein HR, Sutton A, Borenstein M (eds) Publication bias in meta-analysis: prevention, assessment, and adjustments. Wiley, Chichester, pp 145–174

Hoon C (2013) Meta-synthesis of qualitative case studies: an approach to theory building. Organ Res Methods 16(4):522–556

Hunter JE, Schmidt FL (1990) Methods of meta-analysis: correcting error and bias in research findings. Sage, Newbury Park

Hunter JE, Schmidt FL (2004) Methods of meta-analysis: correcting error and bias in research findings, 2nd edn. Sage, Thousand Oaks

Hunter JE, Schmidt FL, Jackson GB (1982) Meta-analysis: cumulating research findings across studies. Sage Publications, Beverly Hills

Jak S (2015) Meta-analytic structural equation modelling. Springer, New York, NY

Kepes S, Banks GC, McDaniel M, Whetzel DL (2012) Publication bias in the organizational sciences. Organ Res Methods 15(4):624–662

Kepes S, McDaniel MA, Brannick MT, Banks GC (2013) Meta-analytic reviews in the organizational sciences: Two meta-analytic schools on the way to MARS (the Meta-Analytic Reporting Standards). J Bus Psychol 28(2):123–143

Kraus S, Breier M, Dasí-Rodríguez S (2020) The art of crafting a systematic literature review in entrepreneurship research. Int Entrepreneur Manag J 16(3):1023–1042

Levitt HM (2018) How to conduct a qualitative meta-analysis: tailoring methods to enhance methodological integrity. Psychother Res 28(3):367–378

Levitt HM, Bamberg M, Creswell JW, Frost DM, Josselson R, Suárez-Orozco C (2018) Journal article reporting standards for qualitative primary, qualitative meta-analytic, and mixed methods research in psychology: the APA publications and communications board task force report. Am Psychol 73(1):26

Lipsey MW, Wilson DB (2001) Practical meta-analysis. Sage Publications, Inc.

López-López JA, Page MJ, Lipsey MW, Higgins JP (2018) Dealing with effect size multiplicity in systematic reviews and meta-analyses. Res Synth Methods 9(3):336–351

Martín-Martín A, Thelwall M, Orduna-Malea E, López-Cózar ED (2021) Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations. Scientometrics 126(1):871–906

Merton RK (1968) The Matthew effect in science: the reward and communication systems of science are considered. Science 159(3810):56–63

Moeyaert M, Ugille M, Natasha Beretvas S, Ferron J, Bunuan R, Van den Noortgate W (2017) Methods for dealing with multiple outcomes in meta-analysis: a comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. Int J Soc Res Methodol 20(6):559–572

Moher D, Liberati A, Tetzlaff J, Altman DG, Prisma Group (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS medicine. 6(7):e1000097

Mongeon P, Paul-Hus A (2016) The journal coverage of Web of Science and Scopus: a comparative analysis. Scientometrics 106(1):213–228

Moreau D, Gamble B (2020) Conducting a meta-analysis in the age of open science: Tools, tips, and practical recommendations. Psychol Methods. https://doi.org/10.1037/met0000351

O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S (2015) Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 4(1):1–22

Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A (2016) Rayyan—a web and mobile app for systematic reviews. Syst Rev 5(1):1–10

Owen E, Li Q (2021) The conditional nature of publication bias: a meta-regression analysis. Polit Sci Res Methods 9(4):867–877

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R, Glanville J, Grimshaw JM, Hróbjartsson A, Lalu MM, Li T, Loder EW, Mayo-Wilson E, McDonald S, McGuinness LA, Stewart LA, Thomas J, Tricco AC, Welch VA, Whiting P, Moher D (2021) The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372. https://doi.org/10.1136/bmj.n71

Palmer TM, Sterne JAC (eds) (2016) Meta-analysis in stata: an updated collection from the stata journal, 2nd edn. Stata Press, College Station, TX

Pigott TD, Polanin JR (2020) Methodological guidance paper: High-quality meta-analysis in a systematic review. Rev Educ Res 90(1):24–46

Polanin JR, Tanner-Smith EE, Hennessy EA (2016) Estimating the difference between published and unpublished effect sizes: a meta-review. Rev Educ Res 86(1):207–236

Polanin JR, Hennessy EA, Tanner-Smith EE (2017) A review of meta-analysis packages in R. J Edu Behav Stat 42(2):206–242

Polanin JR, Hennessy EA, Tsuji S (2020) Transparency and reproducibility of meta-analyses in psychology: a meta-review. Perspect Psychol Sci 15(4):1026–1041. https://doi.org/10.1177/17456916209064

R Core Team (2021) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

Rauch A (2020) Opportunities and threats in reviewing entrepreneurship theory and practice. Entrep Theory Pract 44(5):847–860

Rauch A, van Doorn R, Hulsink W (2014) A qualitative approach to evidence–based entrepreneurship: theoretical considerations and an example involving business clusters. Entrep Theory Pract 38(2):333–368

Raudenbush SW (2009) Analyzing effect sizes: Random-effects models. In: Cooper H, Hedges LV, Valentine JC (eds) The handbook of research synthesis and meta-analysis, 2nd edn. Russell Sage Foundation, New York, NY, pp 295–315

Rosenthal R (1979) The file drawer problem and tolerance for null results. Psychol Bull 86(3):638

Rothstein HR, Sutton AJ, Borenstein M (2005) Publication bias in meta-analysis: prevention, assessment and adjustments. Wiley, Chichester

Roth PL, Le H, Oh I-S, Van Iddekinge CH, Bobko P (2018) Using beta coefficients to impute missing correlations in meta-analysis research: Reasons for caution. J Appl Psychol 103(6):644–658. https://doi.org/10.1037/apl0000293

Rudolph CW, Chang CK, Rauvola RS, Zacher H (2020) Meta-analysis in vocational behavior: a systematic review and recommendations for best practices. J Vocat Behav 118:103397

Schmidt FL (2017) Statistical and measurement pitfalls in the use of meta-regression in meta-analysis. Career Dev Int 22(5):469–476

Schmidt FL, Hunter JE (2015) Methods of meta-analysis: correcting error and bias in research findings. Sage, Thousand Oaks

Schwab A (2015) Why all researchers should report effect sizes and their confidence intervals: Paving the way for meta–analysis and evidence–based management practices. Entrepreneurship Theory Pract 39(4):719–725. https://doi.org/10.1111/etap.12158

Shaw JD, Ertug G (2017) The suitability of simulations and meta-analyses for submissions to Academy of Management Journal. Acad Manag J 60(6):2045–2049

Soderberg CK (2018) Using OSF to share data: A step-by-step guide. Adv Methods Pract Psychol Sci 1(1):115–120

Stanley TD, Doucouliagos H (2010) Picture this: a simple graph that reveals much ado about research. J Econ Surveys 24(1):170–191

Stanley TD, Doucouliagos H (2012) Meta-regression analysis in economics and business. Routledge, London

Stanley TD, Jarrell SB (1989) Meta-regression analysis: a quantitative method of literature surveys. J Econ Surveys 3:54–67

Steel P, Beugelsdijk S, Aguinis H (2021) The anatomy of an award-winning meta-analysis: Recommendations for authors, reviewers, and readers of meta-analytic reviews. J Int Bus Stud 52(1):23–44

Suurmond R, van Rhee H, Hak T (2017) Introduction, comparison, and validation of Meta-Essentials: a free and simple tool for meta-analysis. Res Synth Methods 8(4):537–553

The Cochrane Collaboration (2020). Review Manager (RevMan) [Computer program] (Version 5.4).

Thomas J, Noel-Storr A, Marshall I, Wallace B, McDonald S, Mavergames C, Glasziou P, Shemilt I, Synnot A, Turner T, Elliot J (2017) Living systematic reviews: 2. Combining human and machine effort. J Clin Epidemiol 91:31–37

Thompson SG, Higgins JP (2002) How should meta-regression analyses be undertaken and interpreted? Stat Med 21(11):1559–1573

Tipton E, Pustejovsky JE, Ahmadi H (2019) A history of meta-regression: technical, conceptual, and practical developments between 1974 and 2018. Res Synth Methods 10(2):161–179

Vevea JL, Woods CM (2005) Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychol Methods 10(4):428–443

Viechtbauer W (2010) Conducting meta-analyses in R with the metafor package. J Stat Softw 36(3):1–48

Viechtbauer W, Cheung MWL (2010) Outlier and influence diagnostics for meta-analysis. Res Synth Methods 1(2):112–125

Viswesvaran C, Ones DS (1995) Theory testing: combining psychometric meta-analysis and structural equations modeling. Pers Psychol 48(4):865–885

Wilson SJ, Polanin JR, Lipsey MW (2016) Fitting meta-analytic structural equation models with complex datasets. Res Synth Methods 7(2):121–139. https://doi.org/10.1002/jrsm.1199

Wood JA (2008) Methodology for dealing with duplicate study effects in a meta-analysis. Organ Res Methods 11(1):79–95

Open Access funding enabled and organized by Projekt DEAL. No funding was received to assist with the preparation of this manuscript.

Author information

Authors and affiliations

Christopher Hansen, University of Luxembourg, Luxembourg, Luxembourg

Holger Steinmetz, Leibniz Institute for Psychology (ZPID), Trier, Germany

Jörn Block, Trier University, Trier, Germany; Erasmus University Rotterdam, Rotterdam, The Netherlands; Wittener Institut für Familienunternehmen, Universität Witten/Herdecke, Witten, Germany

Corresponding author

Correspondence to Jörn Block.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Hansen, C., Steinmetz, H. & Block, J. How to conduct a meta-analysis in eight steps: a practical guide. Manag Rev Q 72, 1–19 (2022). https://doi.org/10.1007/s11301-021-00247-4

Published: 30 November 2021

Issue Date: February 2022

DOI: https://doi.org/10.1007/s11301-021-00247-4

J Korean Med Sci. 2022 Apr 25;37(16)

A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked; if not overlooked, they are framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought of, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article then aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) supported by evidence-based logical reasoning 10 ; and 6) can be predicted. 11 Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured (descriptive research questions). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable (comparative research questions), 1 , 5 , 14 or elucidate trends and interactions among variables (relationship research questions). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2.

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable (simple hypothesis) or 2) between two or more independent and dependent variables (complex hypothesis). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome (directional hypothesis). 4 On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies (non-directional hypothesis). 4 In addition, hypotheses can 1) define interdependency between variables (associative hypothesis), 4 2) propose an effect on the dependent variable from manipulation of the independent variable (causal hypothesis), 4 3) state the absence of a relationship between two variables (null hypothesis), 4 , 11 , 15 4) replace the working hypothesis if rejected (alternative hypothesis), 15 5) explain the relationship of phenomena to possibly generate a theory (working hypothesis), 11 6) involve quantifiable variables that can be tested statistically (statistical hypothesis), 11 or 7) express a relationship whose interlinks can be verified logically (logical hypothesis). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research, in Table 3.
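As a brief illustration (not drawn from the article), the contrast between non-directional and directional hypotheses corresponds to two-sided versus one-sided statistical tests. The R sketch below uses simulated data with hypothetical group means.

set.seed(1)
treatment <- rnorm(30, mean = 5.5)  # hypothetical outcome scores, treatment group
control   <- rnorm(30, mean = 5.0)  # hypothetical outcome scores, control group

# Non-directional (two-sided) alternative: the group means differ
t.test(treatment, control, alternative = "two.sided")

# Directional (one-sided) alternative: the treatment mean exceeds the control mean
t.test(treatment, control, alternative = "greater")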

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions (contextual research questions); 2) describe a phenomenon (descriptive research questions); 3) assess the effectiveness of existing methods, protocols, theories, or procedures (evaluation research questions); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena (explanatory research questions); or 5) focus on unknown aspects of a particular topic (exploratory research questions). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions (generative research questions) or advance specific ideologies of a position (ideological research questions). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines (ethnographic research questions). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions (phenomenological research questions), may be directed towards generating a theory of some process (grounded theory questions), or may address a description of the case and the emerging themes (qualitative case study questions). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4, and the definition of qualitative hypothesis-generating research in Table 5.

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 The following elements are addressed in these frameworks, PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study; PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if these meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research question and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and how to transform these ambiguous research question(s) and hypothesis(es) into clear and good statements.

a These statements were composed for comparison and illustrative purposes only.

b These statements are direct quotes from Higashihara and Horiuchi. 16

a This statement is a direct quote from Shimoda et al. 17

The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be accessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims . This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 Research questions are also used more frequently in survey projects than hypotheses in experiments in quantitative research to compare variables and their relationships.

Hypotheses are constructed based on the variables identified and as an if-then statement, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Hypotheses construction involves a testable proposition to be deduced from theory, and independent and dependent variables to be separated and measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

Fig. 2. Algorithm for building research questions and hypotheses in quantitative research (image file: jkms-37-e121-g002.jpg).

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes?” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH).” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness.” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response. The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses.” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations.” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout).” 26
  • “Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above. If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics (gender differences in sociodemographic and clinical characteristics of adults with ADHD). Validity is tested by statistical experiment or analysis (chi-square test, Student's t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men. We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • (text omitted) Between-gender comparisons were made using the chi-squared test for categorical variables and Student's t-test for continuous variables…(text omitted). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27
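To make the statistical hypothesis example above concrete, the sketch below runs the three kinds of tests named in the quoted analysis plan (chi-squared test, Student's t-test, and logistic regression) in Python. The data, variable names, and group labels are fabricated for illustration only; the sketch shows how such tests are typically set up, not the analysis of the cited study.

```python
# Minimal sketch of a chi-squared test, Student's t-test, and logistic regression
# on fabricated data (gender, a continuous score, and a binary outcome).
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "gender": rng.choice(["female", "male"], size=n),
    "symptom_score": rng.normal(50, 10, size=n),       # hypothetical continuous variable
    "full_time_employed": rng.integers(0, 2, size=n),  # hypothetical binary outcome
})

# Chi-squared test: is employment status associated with gender?
table = pd.crosstab(df["gender"], df["full_time_employed"])
chi2, p_chi, dof, _ = stats.chi2_contingency(table)

# Student's t-test: do mean symptom scores differ by gender?
female_scores = df.loc[df["gender"] == "female", "symptom_score"]
male_scores = df.loc[df["gender"] == "male", "symptom_score"]
t_stat, p_t = stats.ttest_ind(female_scores, male_scores)

# Logistic regression: independent effect of gender on employment status
X = sm.add_constant((df["gender"] == "female").astype(int).rename("is_female"))
logit = sm.Logit(df["full_time_employed"], X).fit(disp=False)

print(f"chi-squared p = {p_chi:.3f}, t-test p = {p_t:.3f}")
print(logit.summary2().tables[1])
```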

EXAMPLES OF HYPOTHESES AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries…(text omitted). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women…(text omitted). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) (text omitted). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care…(text omitted)” 28
  • “This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses.” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group.” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education.” 30

Research questions and hypotheses are crucial components of any type of research, whether quantitative or qualitative, and should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. Developing research questions and hypotheses is an iterative process based on extensive knowledge of the literature and an insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses, which serve as formal predictions about the research outcomes. Careful construction of both at the planning stage helps avoid unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.

Qualitative Data Analysis: What is it, Methods + Examples

Explore qualitative data analysis with diverse methods and real-world examples. Uncover the nuances of human experiences with this guide.

In a world rich with information and narrative, understanding the deeper layers of human experiences requires a unique vision that goes beyond numbers and figures. This is where the power of qualitative data analysis comes to light.

In this blog, we’ll learn about qualitative data analysis, explore its methods, and provide real-life examples showcasing its power in uncovering insights.

What is Qualitative Data Analysis?

Qualitative data analysis is a systematic process of examining non-numerical data to extract meaning, patterns, and insights.

In contrast to quantitative analysis, which focuses on numbers and statistical metrics, qualitative analysis focuses on non-numerical aspects of data, such as text, images, audio, and video. It seeks to understand human experiences, perceptions, and behaviors by examining the richness of the data.

Companies frequently conduct this analysis on customer feedback. You can collect qualitative data from reviews, complaints, chat messages, interactions with support centers, customer interviews, case notes, or even social media comments. This kind of data holds the key to understanding customer sentiments and preferences in a way that goes beyond mere numbers.

Importance of Qualitative Data Analysis

Qualitative data analysis plays a crucial role in your research and decision-making process across various disciplines. Let’s explore some key reasons that underline the significance of this analysis:

In-Depth Understanding

It enables you to explore complex and nuanced aspects of a phenomenon, delving into the ‘how’ and ‘why’ questions. This method provides you with a deeper understanding of human behavior, experiences, and contexts that quantitative approaches might not capture fully.

Contextual Insight

You can use this analysis to give context to numerical data. It will help you understand the circumstances and conditions that influence participants’ thoughts, feelings, and actions. This contextual insight becomes essential for generating comprehensive explanations.

Theory Development

You can generate or refine hypotheses via qualitative data analysis. As you analyze the data attentively, you can form hypotheses, concepts, and frameworks that will drive your future research and contribute to theoretical advances.

Participant Perspectives

When performing qualitative research, you can highlight participant voices and opinions. This approach is especially useful for understanding marginalized or underrepresented people, as it allows them to communicate their experiences and points of view.

Exploratory Research

The analysis is frequently used at the exploratory stage of your project. It assists you in identifying important variables, developing research questions, and designing quantitative studies that will follow.

Types of Qualitative Data

When conducting qualitative research, you can use several qualitative data collection methods, and you will encounter many types of qualitative data that can provide unique insights into your study topic. Each data type adds a different perspective to your understanding and analysis.

Interviews and Focus Groups

Interviews and focus groups will be among your key methods for gathering qualitative data. Interviews are one-on-one talks in which participants can freely share their thoughts, experiences, and opinions.

Focus groups, on the other hand, are discussions in which members interact with one another, resulting in dynamic exchanges of ideas. Both methods provide rich qualitative data and direct access to participant perspectives.

Observations and Field Notes

Observations and field notes are another useful sort of qualitative data. You can immerse yourself in the research environment through direct observation, carefully documenting behaviors, interactions, and contextual factors.

These observations will be recorded in your field notes, providing a complete picture of the environment and the behaviors you’re researching. This data type is especially important for understanding behavior in its natural setting.

Textual and Visual Data

Textual and visual data include a wide range of resources that can be qualitatively analyzed. Documents, written narratives, and transcripts from various sources, such as interviews or speeches, are examples of textual data.

Photographs, films, and even artwork add a visual layer to your research. These forms of data allow you to investigate not only what is said but also the underlying emotions, details, and symbols expressed through language or images.

When to Choose Qualitative Data Analysis over Quantitative Data Analysis

As you begin your research journey, understanding when qualitative data analysis is the right tool will guide your approach to complex questions. Qualitative analysis provides insights that complement quantitative methodologies, giving you a broader understanding of your study topic.

It is critical to know when to use qualitative analysis rather than quantitative procedures. Prefer qualitative data analysis when:

  • Complexity Reigns: When your research questions involve deep human experiences, motivations, or emotions, qualitative research excels at revealing these complexities.
  • Exploration is Key: Qualitative analysis is ideal for exploratory research. It will assist you in understanding a new or poorly understood topic before formulating quantitative hypotheses.
  • Context Matters: If you want to understand how context affects behaviors or results, qualitative data analysis provides the depth needed to grasp these relationships.
  • Unanticipated Findings: When your study provides surprising new viewpoints or ideas, qualitative analysis helps you to delve deeply into these emerging themes.
  • Subjective Interpretation is Vital: When it comes to understanding people’s subjective experiences and interpretations, qualitative data analysis is the way to go.

You can make informed decisions regarding the right approach for your research objectives if you understand the importance of qualitative analysis and recognize the situations where it shines.

Qualitative Data Analysis Methods and Examples

Exploring various qualitative data analysis methods will provide you with a wide collection for making sense of your research findings. Once the data has been collected, you can choose from several analysis methods based on your research objectives and the data type you’ve collected.

There are five main methods for analyzing qualitative data. Each method takes a distinct approach to identifying patterns, themes, and insights within your qualitative data. They are:

Method 1: Content Analysis

Content analysis is a methodical technique for analyzing textual or visual data in a structured manner. In this method, you categorize qualitative data by splitting it into manageable units and manually coding those units.

As you go, you’ll notice recurring codes and patterns that allow you to draw conclusions about the content. This method is very useful for detecting common ideas, concepts, or themes in your data without losing the context.

Steps to Do Content Analysis

Follow these steps when conducting content analysis:

  • Collect and Immerse: Begin by collecting the necessary textual or visual data. Immerse yourself in this data to fully understand its content, context, and complexities.
  • Assign Codes and Categories: Assign codes to relevant data sections that systematically represent major ideas or themes. Arrange comparable codes into groups that cover the major themes.
  • Analyze and Interpret: Develop a structured framework from the categories and codes. Then, evaluate the data in the context of your research question, investigate relationships between categories, discover patterns, and draw meaning from these connections.

Benefits & Challenges

There are various advantages to using content analysis:

  • Structured Approach: It offers a systematic approach to dealing with large data sets and ensures consistency throughout the research.
  • Objective Insights: This method promotes objectivity, which helps to reduce potential biases in your study.
  • Pattern Discovery: Content analysis can help uncover hidden trends, themes, and patterns that are not always obvious.
  • Versatility: You can apply content analysis to various data formats, including text, internet content, images, etc.

However, keep in mind the challenges that arise:

  • Subjectivity: Even with the best attempts, a certain bias may remain in coding and interpretation.
  • Complexity: Analyzing huge data sets requires time and great attention to detail.
  • Contextual Nuances: Content analysis may not capture all of the contextual richness that qualitative data analysis highlights.

Example of Content Analysis

Suppose you’re conducting market research and looking at customer feedback on a product. As you collect relevant data and analyze feedback, you’ll see repeating codes like “price,” “quality,” “customer service,” and “features.” These codes are organized into categories such as “positive reviews,” “negative reviews,” and “suggestions for improvement.”

According to your findings, themes such as “price” and “customer service” stand out and show that pricing and customer service greatly impact customer satisfaction. This example highlights the power of content analysis for obtaining significant insights from large textual data collections.
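A small sketch of this idea in Python appears below. It applies a hypothetical keyword codebook to a handful of made-up feedback comments and tallies how often each code occurs; in real content analysis the coding is usually done (or at least reviewed) by human coders, so this only illustrates the count-and-categorize step.

```python
# Minimal content-analysis sketch: tag fabricated feedback with codes and count them.
from collections import Counter

feedback = [
    "Great quality, but the price is too high.",
    "Customer service took days to respond.",
    "Love the features, and the price was fair.",
    "Poor quality; I had to ask for a refund.",
]

# Hypothetical codebook: code -> keywords that trigger it
codebook = {
    "price": ["price", "cost", "expensive"],
    "quality": ["quality", "refund", "broken"],
    "customer service": ["service", "support", "respond"],
    "features": ["feature", "features"],
}

counts = Counter()
for comment in feedback:
    text = comment.lower()
    for code, keywords in codebook.items():
        if any(word in text for word in keywords):
            counts[code] += 1

for code, n in counts.most_common():
    print(f"{code}: mentioned in {n} of {len(feedback)} comments")
```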

Method 2: Thematic Analysis

Thematic analysis is a well-structured procedure for identifying and analyzing recurring themes in your data. As you become more engaged in the data, you’ll generate codes or short labels representing key concepts. These codes are then organized into themes, providing a consistent framework for organizing and comprehending the substance of the data.

The analysis allows you to organize complex narratives and perspectives into meaningful categories, helping you identify connections and patterns that may not be visible at first.

Steps to Do Thematic Analysis

Follow these steps when conducting a thematic analysis:

  • Code and Group: Start by immersing yourself in the data and assigning initial codes to notable segments. Then combine related codes to form initial themes.
  • Analyze and Report: Analyze the data within each theme to derive relevant insights. Organize the topics into a consistent structure and explain your findings, along with data extracts that represent each theme.

Thematic analysis has various benefits:

  • Structured Exploration: It is a method for identifying patterns and themes in complex qualitative data.
  • Comprehensive Understanding: Thematic analysis promotes an in-depth understanding of the complexities and meanings of the data.
  • Application Flexibility: This method can be adapted to various research situations and data types.

However, challenges may arise, such as:

  • Interpretive Nature: Thematic analysis relies heavily on interpretation, so it is critical to manage researcher bias.
  • Time-consuming: The study can be time-consuming, especially with large data sets.
  • Subjectivity: The selection of codes and themes can be subjective.

Example of Thematic Analysis

Assume you’re conducting a thematic analysis on job satisfaction interviews. Following your immersion in the data, you assign initial codes such as “work-life balance,” “career growth,” and “colleague relationships.” As you organize these codes, you’ll notice themes develop, such as “Factors Influencing Job Satisfaction” and “Impact on Work Engagement.”

Further investigation reveals the tales and experiences included within these themes and provides insights into how various elements influence job satisfaction. This example demonstrates how thematic analysis can reveal meaningful patterns and insights in qualitative data.
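The sketch below illustrates the grouping step from this example in Python: hypothetical coded interview segments are rolled up into the two themes mentioned above. The codes, themes, and quotes are invented; in practice this grouping is an interpretive, iterative judgment rather than a lookup table.

```python
# Minimal thematic-analysis sketch: group coded segments under broader themes.
coded_segments = [
    ("work-life balance", "I can pick my kids up from school most days."),
    ("career growth", "There is no clear promotion path here."),
    ("colleague relationships", "My team keeps me motivated."),
    ("career growth", "The training budget helped me upskill."),
]

# Hypothetical mapping from initial codes to themes
theme_map = {
    "work-life balance": "Factors Influencing Job Satisfaction",
    "career growth": "Factors Influencing Job Satisfaction",
    "colleague relationships": "Impact on Work Engagement",
}

themes = {}
for code, segment in coded_segments:
    themes.setdefault(theme_map[code], []).append((code, segment))

for theme, items in themes.items():
    print(theme)
    for code, segment in items:
        print(f"  [{code}] {segment}")
```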

Method 3: Narrative Analysis

Narrative analysis focuses on the stories that people share. You’ll examine the narratives in your data, looking at how stories are constructed and the meanings they convey. This method is excellent for learning how people make sense of their experiences through storytelling.

Steps to Do Narrative Analysis

The following steps are involved in narrative analysis:

  • Gather and Analyze: Start by collecting narratives, such as first-person tales, interviews, or written accounts. Analyze the stories, focusing on the plot, feelings, and characters.
  • Find Themes: Look for recurring themes or patterns in various narratives. Think about the similarities and differences between these topics and personal experiences.
  • Interpret and Extract Insights: Contextualize the narratives within their larger context. Accept the subjective nature of each narrative and analyze the narrator’s voice and style. Extract insights from the tales by diving into the emotions, motivations, and implications communicated by the stories.

There are various advantages to narrative analysis:

  • Deep Exploration: It lets you look deeply into people’s personal experiences and perspectives.
  • Human-Centered: This method prioritizes the human perspective, allowing individuals to express themselves.

However, difficulties may arise, such as:

  • Interpretive Complexity: Analyzing narratives requires dealing with the complexities of meaning and interpretation.
  • Time-consuming: Because of the richness and complexities of tales, working with them can be time-consuming.

Example of Narrative Analysis

Assume you’re conducting narrative analysis on refugee interviews. As you read the stories, you’ll notice common themes of resilience, loss, and hope. The narratives provide insight into the obstacles that refugees face, their strengths, and the dreams that guide them.

The analysis can provide a deeper insight into the refugees’ experiences and the broader social context they navigate by examining the narratives’ emotional subtleties and underlying meanings. This example highlights how narrative analysis can reveal important insights into human stories.

Method 4: Grounded Theory Analysis

Grounded theory analysis is an iterative and systematic approach that allows you to create theories directly from data without being limited by pre-existing hypotheses. With an open mind, you collect data and generate early codes and labels that capture essential ideas or concepts within the data.

As you progress, you refine these codes and increasingly connect them, eventually developing a theory based on the data. Grounded theory analysis is a dynamic process for developing new insights and hypotheses based on details in your data.

Steps to Do Grounded Theory Analysis

Grounded theory analysis requires the following steps:

  • Initial Coding: First, immerse yourself in the data, producing initial codes that represent major concepts or patterns.
  • Categorize and Connect: Using axial coding, organize the initial codes into categories and establish relationships and connections between them.
  • Build the Theory: Focus on creating a core category that connects the codes and themes. Regularly refine the theory by comparing and integrating new data, ensuring that it evolves organically from the data.

Grounded theory analysis has various benefits:

  • Theory Generation: It provides a one-of-a-kind opportunity to generate hypotheses straight from data and promotes new insights.
  • In-depth Understanding: The analysis allows you to deeply analyze the data and reveal complex relationships and patterns.
  • Flexible Process: This method is customizable and ongoing, which allows you to enhance your research as you collect additional data.

However, challenges might arise with:

  • Time and Resources: Because grounded theory analysis is a continuous process, it requires a large commitment of time and resources.
  • Theoretical Development: Creating a grounded theory requires a thorough understanding of qualitative data analysis and of theoretical concepts.
  • Interpretation of Complexity: Interpreting and incorporating a newly developed theory into existing literature can be intellectually hard.

Example of Grounded Theory Analysis

Assume you’re performing a grounded theory analysis on workplace collaboration interviews. As you open code the data, you will discover notions such as “communication barriers,” “team dynamics,” and “leadership roles.” Axial coding demonstrates links between these notions, emphasizing the significance of efficient communication in developing collaboration.

You create the core “Integrated Communication Strategies” category through selective coding, which unifies the emerging themes.

This theory-driven category serves as the framework for understanding how numerous aspects contribute to effective team collaboration. This example shows how grounded theory analysis allows you to generate a theory directly from the inherent nature of the data.

Method 5: Discourse Analysis

Discourse analysis focuses on language and communication. You’ll look at how language produces meaning and how it reflects power relations, identities, and cultural influences. This strategy examines not only what is said but how it is said: the words, phrasing, and larger context of communication.

The analysis is particularly valuable when investigating power dynamics, identities, and cultural influences encoded in language. By evaluating the language used in your data, you can identify underlying assumptions, cultural standards, and how individuals negotiate meaning through communication.

Steps to Do Discourse Analysis

Conducting discourse analysis entails the following steps:

  • Select Discourse: For analysis, choose language-based data such as texts, speeches, or media content.
  • Analyze Language: Immerse yourself in the conversation, examining language choices, metaphors, and underlying assumptions.
  • Discover Patterns: Recognize the dialogue’s recurring themes, ideologies, and power dynamics. To fully understand the effects of these patterns, put them in their larger context.

There are various advantages of using discourse analysis:

  • Understanding Language: It provides an extensive understanding of how language builds meaning and influences perceptions.
  • Uncovering Power Dynamics: The analysis reveals how power dynamics appear via language.
  • Cultural Insights: This method identifies cultural norms, beliefs, and ideologies stored in communication.

However, the following challenges may arise:

  • Complexity of Interpretation: Language analysis involves navigating multiple levels of nuance and interpretation.
  • Subjectivity: Interpretation can be subjective, so controlling researcher bias is important.
  • Time-Intensive: Discourse analysis can take a lot of time because it requires careful linguistic study.

Example of Discourse Analysis

Consider doing discourse analysis on media coverage of a political event. You notice repeating linguistic patterns in news articles that depict the event as a conflict between opposing parties. Through deconstruction, you can expose how this framing supports particular ideologies and power relations.

You can illustrate how language choices influence public perceptions and contribute to building the narrative around the event by analyzing the speech within the broader political and social context. This example shows how discourse analysis can reveal hidden power dynamics and cultural influences on communication.

How to Do Qualitative Data Analysis with the QuestionPro Research Suite?

QuestionPro is a popular survey and research platform that offers tools for collecting and analyzing qualitative and quantitative data. Follow these general steps for conducting qualitative data analysis using the QuestionPro Research Suite:

  • Collect Qualitative Data: Set up your survey to capture qualitative responses. It might involve open-ended questions, text boxes, or comment sections where participants can provide detailed responses.
  • Export Qualitative Responses: Export the responses once you’ve collected qualitative data through your survey. QuestionPro typically allows you to export survey data in various formats, such as Excel or CSV.
  • Prepare Data for Analysis: Review the exported data and clean it if necessary. Remove irrelevant or duplicate entries to ensure your data is ready for analysis.
  • Code and Categorize Responses: Segment and label the data, letting patterns emerge naturally, then develop categories through axial coding to structure the analysis (a minimal coding sketch follows this list).
  • Identify Themes: Analyze the coded responses to identify recurring themes, patterns, and insights. Look for similarities and differences in participants’ responses.
  • Generate Reports and Visualizations: Utilize the reporting features of QuestionPro to create visualizations, charts, and graphs that help communicate the themes and findings from your qualitative research.
  • Interpret and Draw Conclusions: Interpret the themes and patterns you’ve identified in the qualitative data. Consider how these findings answer your research questions or provide insights into your study topic.
  • Integrate with Quantitative Data (if applicable): If you’re also conducting quantitative research using QuestionPro, consider integrating your qualitative findings with quantitative results to provide a more comprehensive understanding.
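As a rough illustration of steps 3 to 5, the sketch below prepares a small set of open-ended responses and applies simple keyword-based codes with pandas. The column name, keywords, and responses are hypothetical, and the commented-out read_csv call stands in for whatever export you actually download; none of this reflects a specific QuestionPro file layout.

```python
# Minimal sketch: clean exported open-ended responses, apply keyword codes, count themes.
import pandas as pd

# In practice you would load your export, e.g. pd.read_csv("survey_export.csv");
# a tiny in-memory frame is used here so the sketch runs on its own.
df = pd.DataFrame({
    "open_response": [
        "The dashboard is easy to use.",
        "Support took days to answer my ticket.",
        "Too expensive for what it offers.",
        "Too expensive for what it offers.",  # duplicate entry to be removed
        None,                                 # empty entry to be removed
    ]
})

# Prepare: drop empty and duplicate responses
df = df.dropna(subset=["open_response"]).drop_duplicates(subset=["open_response"])

# Code: tag each response with simple keyword-based codes (hypothetical codebook)
codebook = {
    "usability": ["easy", "intuitive", "confusing"],
    "pricing": ["price", "cost", "expensive"],
    "support": ["support", "help", "ticket"],
}
for code, keywords in codebook.items():
    df[code] = df["open_response"].str.lower().apply(
        lambda text: any(word in text for word in keywords)
    )

# Identify themes: how often does each code appear?
print(df[list(codebook)].sum().sort_values(ascending=False))
```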

Qualitative data analysis is vital in uncovering various human experiences, views, and stories. If you’re ready to transform your research journey and apply the power of qualitative analysis, now is the moment to do it. Book a demo with QuestionPro today and begin your journey of exploration.


Experimental Research: Definition, Types, Design, Examples

Appinio Research · 14.05.2024 · 31min read


Experimental research is a cornerstone of scientific inquiry, providing a systematic approach to understanding cause-and-effect relationships and advancing knowledge in various fields. At its core, experimental research involves manipulating variables, observing outcomes, and drawing conclusions based on empirical evidence. By controlling factors that could influence the outcome, researchers can isolate the effects of specific variables and make reliable inferences about their impact. This guide offers a step-by-step exploration of experimental research, covering key elements such as research design, data collection, analysis, and ethical considerations. Whether you're a novice researcher seeking to understand the basics or an experienced scientist looking to refine your experimental techniques, this guide will equip you with the knowledge and tools needed to conduct rigorous and insightful research.

What is Experimental Research?

Experimental research is a systematic approach to scientific inquiry that aims to investigate cause-and-effect relationships by manipulating independent variables and observing their effects on dependent variables. Experimental research primarily aims to test hypotheses, make predictions, and draw conclusions based on empirical evidence.

By controlling extraneous variables and randomizing participant assignment, researchers can isolate the effects of specific variables and establish causal relationships. Experimental research is characterized by its rigorous methodology, emphasis on objectivity, and reliance on empirical data to support conclusions.

Importance of Experimental Research

  • Establishing Cause-and-Effect Relationships: Experimental research allows researchers to establish causal relationships between variables by systematically manipulating independent variables and observing their effects on dependent variables. This provides valuable insights into the underlying mechanisms driving phenomena and informs theory development.
  • Testing Hypotheses and Making Predictions: Experimental research provides a structured framework for testing hypotheses and predicting the relationship between variables. By systematically manipulating variables and controlling for confounding factors, researchers can empirically test the validity of their hypotheses and refine theoretical models.
  • Informing Evidence-Based Practice: Experimental research generates empirical evidence that informs evidence-based practice in various fields, including healthcare, education, and business. By identifying effective interventions, treatments, and strategies, experimental research contributes to improving outcomes and informing decision-making in real-world settings.
  • Driving Innovation and Advancement: Experimental research drives innovation and advancement by uncovering new insights, challenging existing assumptions, and pushing the boundaries of knowledge. Through rigorous experimentation and empirical validation, researchers can develop novel solutions to complex problems and contribute to the advancement of science and technology.
  • Enhancing Research Rigor and Validity: Experimental research upholds high standards of rigor and validity by employing systematic methods, controlling for confounding variables, and ensuring replicability of findings. By adhering to rigorous methodology and ethical principles, experimental research produces reliable and credible evidence that withstands scrutiny and contributes to the cumulative body of knowledge.

Experimental research plays a pivotal role in advancing scientific understanding, informing evidence-based practice, and driving innovation across various disciplines. By systematically testing hypotheses, establishing causal relationships, and generating empirical evidence, experimental research contributes to the collective pursuit of knowledge and the improvement of society.

Understanding Experimental Design

Experimental design serves as the blueprint for your study, outlining how you'll manipulate variables and control factors to draw valid conclusions.

Experimental Design Components

Experimental design comprises several essential elements:

  • Independent Variable (IV): This is the variable manipulated by the researcher. It's what you change to observe its effect on the dependent variable. For example, in a study testing the impact of different study techniques on exam scores, the independent variable might be the study method (e.g., flashcards, reading, or practice quizzes).
  • Dependent Variable (DV): The dependent variable is what you measure to assess the effect of the independent variable. It's the outcome variable affected by the manipulation of the independent variable. In our study example, the dependent variable would be the exam scores.
  • Control Variables: These factors could influence the outcome but are kept constant or controlled to isolate the effect of the independent variable. Controlling variables helps ensure that any observed changes in the dependent variable can be attributed to manipulating the independent variable rather than other factors.
  • Experimental Group: This group receives the treatment or intervention being tested. It's exposed to the manipulated independent variable. In contrast, the control group does not receive the treatment and serves as a baseline for comparison.

Types of Experimental Designs

Experimental designs can vary based on the research question, the nature of the variables, and the desired level of control. Here are some common types:

  • Between-Subjects Design: In this design, different groups of participants are exposed to varying levels of the independent variable. Each group represents a different experimental condition, and participants are only exposed to one condition. For instance, in a study comparing the effectiveness of two teaching methods, one group of students would use Method A, while another would use Method B.
  • Within-Subjects Design: Also known as repeated measures design, this approach involves exposing the same group of participants to all levels of the independent variable. Participants serve as their own controls, and the order of conditions is typically counterbalanced to control for order effects. For example, participants might be tested on their reaction times under different lighting conditions, with the order of conditions randomized to eliminate any research bias.
  • Mixed Designs: Mixed designs combine elements of both between-subjects and within-subjects designs. This allows researchers to examine both between-group differences and within-group changes over time. Mixed designs help study complex phenomena that involve multiple variables and temporal dynamics.

Factors Influencing Experimental Design Choices

Several factors influence the selection of an appropriate experimental design:

  • Research Question: The nature of your research question will guide your choice of experimental design. Some questions may be better suited to between-subjects designs, while others may require a within-subjects approach.
  • Variables: Consider the number and type of variables involved in your study. A factorial design might be appropriate if you're interested in exploring multiple factors simultaneously. Conversely, if you're focused on investigating the effects of a single variable, a simpler design may suffice.
  • Practical Considerations: Practical constraints such as time, resources, and access to participants can impact your choice of experimental design. Depending on your study's specific requirements, some designs may be more feasible or cost-effective than others.
  • Ethical Considerations: Ethical concerns, such as the potential risks to participants or the need to minimize harm, should also inform your experimental design choices. Ensure that your design adheres to ethical guidelines and safeguards the rights and well-being of participants.

By carefully considering these factors and selecting an appropriate experimental design, you can ensure that your study is well-designed and capable of yielding meaningful insights.

Experimental Research Elements

When conducting experimental research, understanding the key elements is crucial for designing and executing a robust study. Let's explore each of these elements in detail to ensure your experiment is well-planned and executed effectively.

Independent and Dependent Variables

In experimental research, the independent variable (IV) is the factor that the researcher manipulates or controls, while the dependent variable (DV) is the measured outcome or response. The independent variable is what you change in the experiment to observe its effect on the dependent variable.

For example, in a study investigating the effect of different fertilizers on plant growth, the type of fertilizer used would be the independent variable, while the plant growth (height, number of leaves, etc.) would be the dependent variable.

Control Groups and Experimental Groups

Control groups and experimental groups are essential components of experimental design. The control group serves as a baseline for comparison and does not receive the treatment or intervention being studied. Its purpose is to provide a reference point to assess the effects of the independent variable.

In contrast, the experimental group receives the treatment or intervention and is used to measure the impact of the independent variable. For example, in a drug trial, the control group would receive a placebo, while the experimental group would receive the actual medication.

Randomization and Random Sampling

Randomization is the process of randomly assigning participants to different experimental conditions to minimize biases and ensure that each participant has an equal chance of being assigned to any condition. Randomization helps control for extraneous variables and increases the study's internal validity.

Random sampling, on the other hand, involves selecting a representative sample from the population of interest to generalize the findings to the broader population. Random sampling ensures that each member of the population has an equal chance of being included in the sample, reducing the risk of sampling bias.
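A minimal sketch of these two ideas in Python follows, using hypothetical participant IDs: a random sample is drawn from a notional population, and the sampled participants are then randomly assigned to a control or an experimental group. Real studies would also document the seed and the assignment procedure for transparency.

```python
# Minimal sketch: random sampling from a population, then random assignment to groups.
import random

random.seed(42)  # fixed seed so the assignment can be reproduced

population = [f"P{i:03d}" for i in range(1, 501)]  # hypothetical population of 500 people

# Random sampling: draw 60 participants, each with an equal chance of selection
sample = random.sample(population, 60)

# Randomization: shuffle the sample and split it into two equal groups
random.shuffle(sample)
control_group = sample[:30]
experimental_group = sample[30:]

print(f"Control: {len(control_group)} participants, Experimental: {len(experimental_group)}")
```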

Replication and Reliability

Replication involves repeating the experiment to confirm the results and assess the reliability of the findings. It is essential for ensuring the validity of scientific findings and building confidence in the robustness of the results. A study that can be replicated consistently across different settings and by various researchers is considered more reliable. Researchers should strive to design experiments that are easily replicable and transparently report their methods to facilitate replication by others.

Validity: Internal, External, Construct, and Statistical Conclusion Validity

Validity refers to the degree to which an experiment measures what it intends to measure and the extent to which the results can be generalized to other populations or contexts. There are several types of validity that researchers should consider:

  • Internal Validity: Internal validity refers to the extent to which the study accurately assesses the causal relationship between variables. Internal validity is threatened by factors such as confounding variables, selection bias, and experimenter effects. Researchers can enhance internal validity through careful experimental design and control procedures.
  • External Validity: External validity refers to the extent to which the study's findings can be generalized to other populations or settings. External validity is influenced by factors such as the representativeness of the sample and the ecological validity of the experimental conditions. Researchers should consider the relevance and applicability of their findings to real-world situations.
  • Construct Validity: Construct validity refers to the degree to which the study accurately measures the theoretical constructs of interest. Construct validity is concerned with whether the operational definitions of the variables align with the underlying theoretical concepts. Researchers can establish construct validity through careful measurement selection and validation procedures.
  • Statistical Conclusion Validity: Statistical conclusion validity refers to the accuracy of the statistical analyses and conclusions drawn from the data. It ensures that the statistical tests used are appropriate for the data and that the conclusions drawn are warranted. Researchers should use robust statistical methods and report effect sizes and confidence intervals to enhance statistical conclusion validity.

By addressing these elements of experimental research and ensuring the validity and reliability of your study, you can conduct research that contributes meaningfully to the advancement of knowledge in your field.

How to Conduct Experimental Research?

Embarking on an experimental research journey involves a series of well-defined phases, each crucial for the success of your study. Let's explore the pre-experimental, experimental, and post-experimental phases to ensure you're equipped to conduct rigorous and insightful research.

Pre-Experimental Phase

The pre-experimental phase lays the foundation for your study, setting the stage for what's to come. Here's what you need to do:

  • Formulating Research Questions and Hypotheses: Start by clearly defining your research questions and formulating testable hypotheses. Your research questions should be specific, relevant, and aligned with your research objectives. Hypotheses provide a framework for testing the relationships between variables and making predictions about the outcomes of your study.
  • Reviewing Literature and Establishing Theoretical Framework: Dive into existing literature relevant to your research topic and establish a solid theoretical framework. Literature review helps you understand the current state of knowledge, identify research gaps, and build upon existing theories. A well-defined theoretical framework provides a conceptual basis for your study and guides your research design and analysis.

Experimental Phase

The experimental phase is where the magic happens – it's time to put your hypotheses to the test and gather data. Here's what you need to consider:

  • Participant Recruitment and Sampling Techniques: Carefully recruit participants for your study using appropriate sampling techniques. The sample should be representative of the population you're studying to ensure the generalizability of your findings. Consider factors such as sample size, demographics, and inclusion criteria when recruiting participants.
  • Implementing Experimental Procedures: Once you've recruited participants, it's time to implement your experimental procedures. Clearly outline the experimental protocol, including instructions for participants, procedures for administering treatments or interventions, and measures for controlling extraneous variables. Standardize your procedures to ensure consistency across participants and minimize sources of bias.
  • Data Collection and Measurement: Collect data using reliable and valid measurement instruments. Depending on your research questions and variables of interest, data collection methods may include surveys, observations, physiological measurements, or experimental tasks. Ensure that your data collection procedures are ethical, respectful of participants' rights, and designed to minimize errors and biases.

Post-Experimental Phase

In the post-experimental phase, you make sense of your data, draw conclusions, and communicate your findings to the world. Here's what you need to do:

  • Data Analysis Techniques: Analyze your data using appropriate statistical techniques that are aligned with your research design and hypotheses. Standard statistical analyses include descriptive statistics, inferential statistics (e.g., t-tests, ANOVA), regression analysis, and correlation analysis (a minimal sketch of these analyses follows this list). Interpret your findings in the context of your research questions and theoretical framework.
  • Interpreting Results and Drawing Conclusions: Once you've analyzed your data, interpret the results and draw conclusions. Discuss the implications of your findings, including any theoretical, practical, or real-world implications. Consider alternative explanations and limitations of your study and propose avenues for future research. Be transparent about the strengths and weaknesses of your study to enhance the credibility of your conclusions.
  • Reporting Findings: Finally, communicate your findings through research reports, academic papers, or presentations. Follow standard formatting guidelines and adhere to ethical standards for research reporting. Clearly articulate your research objectives, methods, results, and conclusions. Consider your target audience and choose appropriate channels for disseminating your findings to maximize impact and reach.
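The sketch below illustrates the analysis step with fabricated data from the study-methods example used earlier: descriptive statistics per group, a one-way ANOVA across three study methods, and a follow-up t-test between two of them. The group means and sample sizes are invented; the point is only how these standard tests are wired together in Python.

```python
# Minimal sketch: descriptive statistics, one-way ANOVA, and a follow-up t-test
# on fabricated exam scores for three hypothetical study methods.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
flashcards = rng.normal(75, 8, 30)  # hypothetical exam scores, 30 students per group
reading = rng.normal(70, 8, 30)
quizzes = rng.normal(78, 8, 30)

# Descriptive statistics per group
for name, scores in [("flashcards", flashcards), ("reading", reading), ("quizzes", quizzes)]:
    print(f"{name}: mean = {scores.mean():.1f}, sd = {scores.std(ddof=1):.1f}")

# One-way ANOVA: do mean exam scores differ across the three methods?
f_stat, p_anova = stats.f_oneway(flashcards, reading, quizzes)

# Follow-up t-test between two methods of interest
t_stat, p_t = stats.ttest_ind(quizzes, reading)

print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}; t-test: t = {t_stat:.2f}, p = {p_t:.4f}")
```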

By meticulously planning and executing each experimental research phase, you can generate valuable insights, advance knowledge in your field, and contribute to scientific progress.

As you navigate the intricate phases of experimental research, leveraging Appinio can streamline your journey toward actionable insights. With our intuitive platform, you can swiftly gather real-time consumer data, empowering you to make informed decisions with confidence. Say goodbye to the complexities of traditional market research and hello to a seamless, efficient process that puts you in the driver's seat of your research endeavors.

Ready to revolutionize your approach to data-driven decision-making? Book a demo today and discover the power of Appinio in transforming your research experience!


Experimental Research Examples

Understanding how experimental research is applied in various contexts can provide valuable insights into its practical significance and effectiveness. Here are some examples illustrating the application of experimental research in different domains:

Market Research

Experimental studies play a crucial role in market research, where they are used to test hypotheses, evaluate marketing strategies, and understand consumer behavior. For example, a company may conduct an experiment to determine the most effective advertising message for a new product. Participants could be exposed to different versions of an advertisement, each emphasizing different product features or appeals.

By measuring variables such as brand recall, purchase intent, and brand perception, researchers can assess the impact of each advertising message and identify the most persuasive approach.

Software as a Service (SaaS)

In the SaaS industry, experimental research is often used to optimize user interfaces, features, and pricing models to enhance user experience and drive engagement. For instance, a SaaS company may conduct A/B tests to compare two versions of its software interface, each with a different layout or navigation structure.

Researchers can identify design elements that lead to higher user satisfaction and retention by tracking user interactions, conversion rates, and customer feedback. Experimental research also enables SaaS companies to test new product features or pricing strategies before full-scale implementation, minimizing risks and maximizing return on investment.
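
As a rough illustration of how such an A/B test might be analyzed, the sketch below compares conversion rates between two hypothetical interface variants with a two-proportion z-test; the user counts are invented for the example.

    # Hypothetical A/B-test analysis: comparing conversion rates of two interface
    # variants with a two-proportion z-test (counts are made up for illustration).
    from statsmodels.stats.proportion import proportions_ztest

    conversions = [480, 530]      # converted users for variant A and variant B
    exposures = [10_000, 10_000]  # users shown each variant

    z_stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
    print(f"Variant A rate: {conversions[0] / exposures[0]:.2%}")
    print(f"Variant B rate: {conversions[1] / exposures[1]:.2%}")
    print(f"z = {z_stat:.2f}, p = {p_value:.4f}")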

Business Management

Experimental research is increasingly utilized in business management to inform decision-making, improve organizational processes, and drive innovation. For example, a business may conduct an experiment to evaluate the effectiveness of a new training program on employee productivity. Participants could be randomly assigned to either receive the training or serve as a control group.

By measuring performance metrics such as sales revenue, customer satisfaction, and employee turnover, researchers can assess the training program's impact and determine its return on investment. Experimental research in business management provides empirical evidence to support strategic initiatives and optimize resource allocation.

Healthcare

In healthcare, experimental research is instrumental in testing new treatments, interventions, and healthcare delivery models to improve patient outcomes and quality of care. For instance, a clinical trial may be conducted to evaluate the efficacy of a new drug in treating a specific medical condition. Participants are randomly assigned to either receive the experimental drug or a placebo, and their health outcomes are monitored over time.

By comparing the effectiveness of the treatment and placebo groups, researchers can determine the drug's efficacy, safety profile, and potential side effects. Experimental research in healthcare informs evidence-based practice and drives advancements in medical science and patient care.

These examples illustrate the versatility and applicability of experimental research across diverse domains, demonstrating its value in generating actionable insights, informing decision-making, and driving innovation. Whether in market research or healthcare, experimental research provides a rigorous and systematic approach to testing hypotheses, evaluating interventions, and advancing knowledge.

Experimental Research Challenges

Even with careful planning and execution, experimental research can present various challenges. Understanding these challenges and implementing effective solutions is crucial for ensuring the validity and reliability of your study. Here are some common challenges and strategies for addressing them.

Sample Size and Statistical Power

Challenge : Inadequate sample size can limit your study's generalizability and statistical power, making it difficult to detect meaningful effects. Small sample sizes increase the risk of Type II errors (false negatives) and reduce the reliability of your findings.

Solution : Increase your sample size to improve statistical power and enhance the robustness of your results. Conduct a power analysis before starting your study to determine the minimum sample size required to detect the effects of interest with sufficient power. Consider factors such as effect size, alpha level, and desired power when calculating sample size requirements. Additionally, consider using techniques such as bootstrapping or resampling to augment small sample sizes and improve the stability of your estimates.

To enhance the reliability of your experimental research findings, you can leverage our Sample Size Calculator. By determining the optimal sample size based on your desired margin of error, confidence level, and standard deviation, you can ensure the representativeness of your survey results. Don't let inadequate sample sizes hinder the validity of your study; unlock the power of precise research planning!
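
For readers who want to see what the a priori power analysis described above can look like in practice, here is a small sketch using the statsmodels library; the effect size, alpha level, and power target are assumed values chosen purely for illustration.

    # Sketch of an a priori power analysis for a two-group comparison.
    # The effect size, alpha, and power below are illustrative assumptions.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    n_per_group = analysis.solve_power(
        effect_size=0.5,          # assumed medium effect (Cohen's d)
        alpha=0.05,               # significance level
        power=0.80,               # desired statistical power
        alternative="two-sided",
    )
    print(f"Required sample size per group: {n_per_group:.0f}")

Under these assumptions the calculation returns roughly 64 participants per group; a smaller expected effect or a stricter alpha level would push that number up.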

Confounding Variables and Bias

Challenge : Confounding variables are extraneous factors that co-vary with the independent variable and can distort the relationship between the independent and dependent variables. Confounding variables threaten the internal validity of your study and can lead to erroneous conclusions.

Solution : Implement control measures to minimize the influence of confounding variables on your results. Random assignment of participants to experimental conditions helps distribute confounding variables evenly across groups, reducing their impact on the dependent variable. Additionally, consider using matching or blocking techniques to ensure that groups are comparable on relevant variables. Conduct sensitivity analyses to assess the robustness of your findings to potential confounders and explore alternative explanations for your results.
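
To illustrate the random assignment step, here is a tiny sketch that shuffles a hypothetical participant list and splits it into two equally sized conditions; the participant IDs and group sizes are made up.

    # Simple random assignment sketch: shuffle a hypothetical participant list
    # and split it into treatment and control conditions of equal size.
    import random

    participants = [f"P{i:03d}" for i in range(1, 41)]  # 40 hypothetical participant IDs
    random.seed(7)                                      # fixed seed for a reproducible assignment
    random.shuffle(participants)

    midpoint = len(participants) // 2
    treatment_group = participants[:midpoint]
    control_group = participants[midpoint:]
    print("Treatment:", treatment_group[:5], "...")
    print("Control:  ", control_group[:5], "...")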

Researcher Effects and Experimenter Bias

Challenge: Researcher effects and experimenter bias occur when the experimenter's expectations or actions inadvertently influence the study's outcomes. This bias can manifest through subtle cues, unintentional behaviors, or unconscious biases, leading to invalid conclusions.

Solution : Implement double-blind procedures whenever possible to mitigate researcher effects and experimenter bias. Double-blind designs conceal information about the experimental conditions from both the participants and the experimenters, minimizing the potential for bias. Standardize experimental procedures and instructions to ensure consistency across conditions and minimize experimenter variability. Additionally, consider using objective outcome measures or automated data collection procedures to reduce the influence of experimenter bias on subjective assessments.

External Validity and Generalizability

Challenge : External validity refers to the extent to which your study's findings can be generalized to other populations, settings, or conditions. Limited external validity restricts the applicability of your results and may hinder their relevance to real-world contexts.

Solution: Enhance external validity by designing studies closely resembling real-world conditions and populations of interest. Consider using diverse samples that represent the target population's demographic, cultural, and ecological variability. Conduct replication studies in different contexts or with different populations to assess the robustness and generalizability of your findings. Additionally, consider conducting meta-analyses or systematic reviews to synthesize evidence from multiple studies and enhance the external validity of your conclusions.

By proactively addressing these challenges and implementing effective solutions, you can strengthen the validity, reliability, and impact of your experimental research. Remember to remain vigilant for potential pitfalls throughout the research process and adapt your strategies as needed to ensure the integrity of your findings.

Advanced Topics in Experimental Research

As you delve deeper into experimental research, you'll encounter advanced topics and methodologies that offer greater complexity and nuance.

Quasi-Experimental Designs

Quasi-experimental designs resemble true experiments but lack random assignment to experimental conditions. They are often used when random assignment is impractical, unethical, or impossible. Quasi-experimental designs allow researchers to investigate cause-and-effect relationships in real-world settings where strict experimental control is challenging. Common examples include:

  • Non-Equivalent Groups Design : This design compares two or more groups that were not created through random assignment. While similar to between-subjects designs, non-equivalent group designs lack the random assignment of participants, increasing the risk of confounding variables.
  • Interrupted Time Series Design : In this design, multiple measurements are taken over time before and after an intervention is introduced. Changes in the dependent variable are assessed over time, allowing researchers to infer the impact of the intervention.
  • Regression Discontinuity Design : This design involves assigning participants to different groups based on a cutoff score on a continuous variable. Participants just above and below the cutoff are treated as if they were randomly assigned to different conditions, allowing researchers to estimate causal effects.

Quasi-experimental designs offer valuable insights into real-world phenomena but require careful consideration of potential confounding variables and limitations inherent to non-random assignment.
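
As one hedged example of how a quasi-experimental analysis can be run, the sketch below fits a simple segmented regression for an interrupted time series design, with level-change and slope-change terms around a hypothetical intervention point; all of the data are simulated.

    # Interrupted time series sketch: segmented regression around a hypothetical
    # intervention at week 12. The weekly outcome values are simulated.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    weeks = np.arange(24)                               # 24 weekly observations
    post = (weeks >= 12).astype(float)                  # 1 after the intervention
    time_since = np.where(weeks >= 12, weeks - 12, 0)   # time elapsed since intervention

    # Simulated outcome: a gentle baseline trend plus a drop after the intervention
    y = 50 + 0.3 * weeks - 6 * post + rng.normal(0, 1.5, size=24)

    X = sm.add_constant(np.column_stack([weeks, post, time_since]))
    model = sm.OLS(y, X).fit()
    print(model.params)  # intercept, baseline slope, level change, slope change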

Factorial Designs

Factorial designs involve manipulating two or more independent variables simultaneously to examine their main effects and interactions. By systematically varying multiple factors, factorial designs allow researchers to explore complex relationships between variables and identify how they interact to influence outcomes. Common types of factorial designs include:

  • 2x2 Factorial Design : This design manipulates two independent variables, each with two levels. It allows researchers to examine the main effects of each variable as well as any interaction between them.
  • Mixed Factorial Design : In this design, one independent variable is manipulated between subjects, while another is manipulated within subjects. Mixed factorial designs enable researchers to investigate both between-subjects and within-subjects effects simultaneously.

Factorial designs provide a comprehensive understanding of how multiple factors contribute to outcomes and offer greater statistical efficiency compared to studying variables in isolation.
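
To show what a 2x2 factorial analysis might look like in code, here is a brief sketch that fits a two-way ANOVA with an interaction term using statsmodels; the factor names and simulated scores are assumptions made only for the example.

    # 2x2 factorial sketch: two-way ANOVA with interaction on simulated data.
    # Factor names and effect sizes are invented for illustration.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(1)
    df = pd.DataFrame({
        "factor_a": np.repeat(["low", "high"], 40),
        "factor_b": np.tile(np.repeat(["old", "new"], 20), 2),
    })
    # Simulated outcome with a main effect of factor_a and a small interaction
    bump = (df["factor_a"] == "high") * 3 + ((df["factor_a"] == "high") & (df["factor_b"] == "new")) * 2
    df["score"] = 50 + bump + rng.normal(0, 2, size=len(df))

    model = ols("score ~ C(factor_a) * C(factor_b)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))  # main effects and interaction term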

Longitudinal and Cross-Sectional Studies

Longitudinal studies involve collecting data from the same participants over an extended period, allowing researchers to observe changes and trajectories over time. Cross-sectional studies, on the other hand, involve collecting data from different participants at a single point in time, providing a snapshot of the population at that moment. Both longitudinal and cross-sectional studies offer unique advantages and challenges:

  • Longitudinal Studies : Longitudinal designs allow researchers to examine developmental processes, track changes over time, and identify causal relationships. However, longitudinal studies require long-term commitment, are susceptible to attrition and dropout, and may be subject to practice effects and cohort effects.
  • Cross-Sectional Studies : Cross-sectional designs are relatively quick and cost-effective, provide a snapshot of population characteristics, and allow for comparisons across different groups. However, cross-sectional studies cannot assess changes over time or establish causal relationships between variables.

Researchers should carefully consider the research question, objectives, and constraints when choosing between longitudinal and cross-sectional designs.

Meta-Analysis and Systematic Reviews

Meta-analysis and systematic reviews are methods for synthesizing findings from multiple studies and drawing robust conclusions. These methods offer several advantages:

  • Meta-Analysis : Meta-analysis combines the results of multiple studies using statistical techniques to estimate overall effect sizes and assess the consistency of findings across studies. Meta-analysis increases statistical power, enhances generalizability, and provides more precise estimates of effect sizes.
  • Systematic Reviews : Systematic reviews involve systematically searching, appraising, and synthesizing existing literature on a specific topic. Systematic reviews provide a comprehensive summary of the evidence, identify gaps and inconsistencies in the literature, and inform future research directions.

Meta-analysis and systematic reviews are valuable tools for evidence-based practice, guiding policy decisions, and advancing scientific knowledge by aggregating and synthesizing empirical evidence from diverse sources.
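
As a simplified illustration of the quantitative core of a meta-analysis, the sketch below pools four hypothetical study effect sizes with fixed-effect inverse-variance weighting; the effect sizes and standard errors are invented, and a real meta-analysis would also examine heterogeneity and publication bias.

    # Fixed-effect meta-analysis sketch: inverse-variance pooling of hypothetical
    # study effect sizes. All numbers are invented for illustration.
    import numpy as np
    from scipy import stats

    effects = np.array([0.30, 0.45, 0.25, 0.40])   # hypothetical study effect sizes
    std_errs = np.array([0.10, 0.15, 0.12, 0.08])  # their standard errors

    weights = 1.0 / std_errs**2                    # inverse-variance weights
    pooled = np.sum(weights * effects) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    p_value = 2 * (1 - stats.norm.cdf(abs(pooled / pooled_se)))

    print(f"Pooled effect: {pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f}), p = {p_value:.4f}")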

By exploring these advanced topics in experimental research, you can expand your methodological toolkit, tackle more complex research questions, and contribute to deeper insights and understanding in your field.

Experimental Research Ethical Considerations

When conducting experimental research, it's imperative to uphold ethical standards and prioritize the well-being and rights of participants. Here are some key ethical considerations to keep in mind throughout the research process:

  • Informed Consent : Obtain informed consent from participants before they participate in your study. Ensure that participants understand the purpose of the study, the procedures involved, any potential risks or benefits, and their right to withdraw from the study at any time without penalty.
  • Protection of Participants' Rights : Respect participants' autonomy, privacy, and confidentiality throughout the research process. Safeguard sensitive information and ensure that participants' identities are protected. Be transparent about how their data will be used and stored.
  • Minimizing Harm and Risks : Take steps to mitigate any potential physical or psychological harm to participants. Conduct a risk assessment before starting your study and implement appropriate measures to reduce risks. Provide support services and resources for participants who may experience distress or adverse effects as a result of their participation.
  • Confidentiality and Data Security : Protect participants' privacy and ensure the security of their data. Use encryption and secure storage methods to prevent unauthorized access to sensitive information. Anonymize data whenever possible to minimize the risk of data breaches or privacy violations.
  • Avoiding Deception : Minimize the use of deception in your research and ensure that any deception is justified by the scientific objectives of the study. If deception is necessary, debrief participants fully at the end of the study and provide them with an opportunity to withdraw their data if they wish.
  • Respecting Diversity and Cultural Sensitivity : Be mindful of participants' diverse backgrounds, cultural norms, and values. Avoid imposing your own cultural biases on participants and ensure that your research is conducted in a culturally sensitive manner. Seek input from diverse stakeholders to ensure your research is inclusive and respectful.
  • Compliance with Ethical Guidelines : Familiarize yourself with relevant ethical guidelines and regulations governing research with human participants, such as those outlined by institutional review boards (IRBs) or ethics committees. Ensure that your research adheres to these guidelines and that any potential ethical concerns are addressed appropriately.
  • Transparency and Openness : Be transparent about your research methods, procedures, and findings. Clearly communicate the purpose of your study, any potential risks or limitations, and how participants' data will be used. Share your research findings openly and responsibly, contributing to the collective body of knowledge in your field.

By prioritizing ethical considerations in your experimental research, you demonstrate integrity, respect, and responsibility as a researcher, fostering trust and credibility in the scientific community.

Conclusion for Experimental Research

Experimental research is a powerful tool for uncovering causal relationships and expanding our understanding of the world around us. By carefully designing experiments, collecting data, and analyzing results, researchers can make meaningful contributions to their fields and address pressing questions.

However, conducting experimental research comes with responsibilities. Ethical considerations are paramount to ensure the well-being and rights of participants, as well as the integrity of the research process. Researchers can build trust and credibility in their work by upholding ethical standards and prioritizing participant safety and autonomy.

Furthermore, as you continue to explore and innovate in experimental research, you must remain open to new ideas and methodologies. Embracing diversity in perspectives and approaches fosters creativity and innovation, leading to breakthrough discoveries and scientific advancements. By promoting collaboration and sharing findings openly, we can collectively push the boundaries of knowledge and tackle some of society's most pressing challenges.

How to Conduct Research in Minutes?

Discover the power of Appinio , the real-time market research platform revolutionizing experimental research. With Appinio, you can access real-time consumer insights to make better data-driven decisions in minutes. Join the thousands of companies worldwide who trust Appinio to deliver fast, reliable consumer insights.

Here's why you should consider using Appinio for your research needs:

  • From questions to insights in minutes: With Appinio, you can conduct your own market research and get actionable insights in record time, allowing you to make fast, informed decisions for your business.
  • Intuitive platform for anyone: You don't need a PhD in research to use Appinio. Our platform is designed to be user-friendly and intuitive so that anyone can easily create and launch surveys.
  • Extensive reach and targeting options: Define your target audience from over 1200 characteristics and survey them in over 90 countries. Our platform ensures you reach the right people for your research needs, no matter where they are.


What is an Annotated Bibliography?

An annotated bibliography is a summary and evaluation of a resource. According to Merriam-Webster, a bibliography is “the works or a list of the works referred to in a text or consulted by the author in its production.” Your references (APA) or Works Cited (MLA) can be considered a bibliography. A bibliography follows a documentation style and usually includes bibliographic information (i.e., the author(s), title, publication date, place of publication, publisher, etc.). An annotation refers to explanatory notes or comments on a source.

An annotated bibliography, therefore, typically consists of:

  • Documentation for each source you have used, following the required documentation style.
  • For each entry, one to three paragraphs that:
    • begin with a summary,
    • evaluate the reliability of the information, and
    • demonstrate how the information relates to previous and future research.

Entries in an annotated bibliography should be in alphabetical order.

Please note: This may vary depending on your professor's requirements.

Why Write an Annotated Bibliography?


Writing an annotated bibliography will help you understand your topics in-depth.

An annotated bibliography is useful for organizing and cataloging resources when developing an argument.

Formatting an Annotated Bibliography


  • Use 1-inch margins all around
  • Indent annotations ½ inch from the left margin.
  • Use double spacing.
  • Entries should be in alphabetical order.

Structure of an Annotated Bibliography

This table provides a high-level outline of the structure of a research article and how each section relates to important information for developing an annotated bibliography.

Annotated Bibliography Sample Outline

Author, S. A. (date of publication). Title of the article. Title of Periodical, vol.(issue), page-page. https://doi.org/XXXXXX

Write one or two paragraphs that focus on the study and its findings.

  • Two or more sentences that outline the thesis, hypothesis, and population of the study.
  • Two or more sentences that discuss the methodology.
  • Two or more sentences that discuss the study findings.  
  • One or more sentences evaluating the study and its relationship to other studies.


15 Data Analysis Examples for Beginners in 2024

Data analysis is a multifaceted process that involves inspecting, cleaning, transforming, and modeling data to uncover valuable insights. It encompasses a wide array of techniques and methodologies, enabling organizations to interpret complex data structures and extract meaningful patterns.


In this article, we will explore 15 data analysis examples for beginners in 2024.

Real-world Examples of Data Analysis for Beginners

The Applications of Data Analysis are vast and far-reaching, permeating various sectors and industries. Let’s explore 15 illuminating examples that highlight the versatility and impact of this powerful discipline.

Sales Trend Analysis

Businesses often leverage data analysis to assess sales data over different periods, identifying trends and patterns that can inform their strategies. For instance, a retail company might monitor quarterly sales data to pinpoint peak buying times, popular products, and emerging customer preferences. By doing so, they can adjust inventory management, marketing efforts, and sales strategies to align with consumer demands and seasonal fluctuations, ultimately enhancing profitability and operational efficiency.
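
For a flavor of what this looks like in code, the sketch below aggregates simulated daily sales into quarterly totals and computes quarter-over-quarter growth with pandas; the figures are synthetic and purely illustrative.

    # Sales trend sketch: simulated daily sales aggregated to quarterly totals,
    # with quarter-over-quarter growth. All figures are synthetic.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    dates = pd.date_range("2023-01-01", "2024-12-31", freq="D")
    sales = pd.Series(1000 + 2 * np.arange(len(dates)) + rng.normal(0, 100, len(dates)),
                      index=dates, name="sales")

    quarterly = sales.groupby(sales.index.to_period("Q")).sum()  # quarterly totals
    growth = quarterly.pct_change() * 100                        # quarter-over-quarter growth (%)
    print(pd.DataFrame({"total_sales": quarterly.round(0), "growth_pct": growth.round(1)}))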

Customer Segmentation

In this data analysis application, companies categorize their customer base into distinct groups based on specific criteria, such as purchasing behavior, demographics, or preferences. An online shopping platform, for example, might segment its customers into categories like frequent buyers, seasonal shoppers, or budget-conscious consumers. This analysis enables businesses to tailor marketing campaigns, product offerings, and customer experiences to appeal to each group’s unique needs, fostering improved engagement and driving business growth.

Social Media Sentiment Analysis

In the digital age, companies harness the power of data analysis to gauge public sentiment towards their products or brands by analyzing social media interactions. By examining comments, likes, shares, and other engagement metrics, they can assess overall customer satisfaction and identify areas for improvement. This type of analysis significantly impacts online reputation management, influencing marketing and public relations strategies.

Forecasting and Predictive Analysis

Data analysis plays a crucial role in predicting future trends or outcomes. An airline company, for instance, might analyze past data on seat bookings, flight timings, and passenger preferences to forecast future travel trends. This predictive analysis allows the airline to optimize flight schedules, plan for peak travel periods, and set competitive ticket prices, ultimately contributing to improved customer satisfaction and increased revenues.

Operational Efficiency Analysis

This form of data analysis focuses on optimizing internal processes within an organization. A manufacturing company might analyze data regarding machine performance, maintenance schedules, and production output to identify bottlenecks or inefficiencies. By addressing these issues, the company can streamline operations, improve productivity, and reduce costs, underscoring the importance of data analysis in achieving operational excellence.

Risk Assessment Analysis

Data analysis helps businesses identify potential risks that could adversely impact their operations or profits. An insurance company, for instance, might analyze customer data and historical claim information to estimate future claim risks. This analysis supports accurate premium setting and proactive risk management, mitigating potential financial hazards and highlighting the role of data analysis in sound risk management practices.

Recruitment and Talent Management Analysis

In this example, human resources departments scrutinize data concerning employee performance, retention rates, and skill sets. A technology firm might conduct an analysis to identify the skills and experience most prevalent among its top-performing employees. This analysis enables companies to attract and retain high-caliber talent, tailor training programs, and improve overall workforce effectiveness.

Supply Chain Optimization Analysis

This form of data analysis aims to enhance the efficiency of a business’s supply chain. A grocery store, for example, might examine sales data, warehouse inventory levels, and supplier delivery times to ensure the right products are in stock at the right time. This analysis can reduce warehousing costs, minimize stockouts or overstocks, and increase customer satisfaction, underscoring the role of data analysis in streamlining supply chains.

Web Analytics

In the digital era, businesses invest in data analysis to optimize their online presence and functionality. An e-commerce business might analyze website traffic data, bounce rates, conversion rates, and user engagement metrics. This analysis can guide website redesign, enhance user experience, and boost conversion rates, reflecting the importance of data analysis in digital marketing and web optimization.

Medical and Healthcare Analysis

Data analysis plays a crucial role in the healthcare sector. A hospital might analyze patient data, disease patterns, treatment outcomes, and more. This analysis can support evidence-based treatment plans, inform research on healthcare trends, and contribute to policy development. It can also enhance patient care by identifying efficient treatment paths and reducing hospitalization time, underscoring the significance of data analysis in the medical field.

Fraud Detection Analysis

In the financial and banking sector, data analysis is paramount in identifying and mitigating fraudulent activities. Banks might analyze transaction data, account activity, and user behavior trends to detect abnormal patterns indicative of fraud. By alerting the concerned authorities about suspicious activity, such analysis can prevent financial losses and protect customer assets, illustrating the importance of data analysis in ensuring financial security.

Energy Consumption Analysis

Utilities and energy companies often utilize data analysis to optimize their energy distribution and consumption. By evaluating data on customer usage patterns, peak demand times, and grid performance, companies can enhance energy efficiency, optimize grid operations, and develop more customer-centric services. This example showcases how data analysis can contribute to more sustainable and efficient resource utilization.

Market Research Analysis

Many businesses rely on data analysis to gauge market dynamics and consumer behaviors. A cosmetic brand, for instance, might analyze sales data, consumer feedback, and competitor information. Such analysis can provide useful insights into consumer preferences, popular trends, and competitive strategies, facilitating the development of products that align with market demands and showcasing how data analysis can drive business innovation.

Quality Control Analysis

Manufacturing industries often employ data analysis in their quality control processes. They may monitor operational data, machine performance, and product fault reports. By identifying causes of defects or inefficiencies, these industries can improve product quality, enhance manufacturing processes, and reduce waste, demonstrating the decisive role of data analysis in maintaining high-quality standards.

Economic and Policy Analysis

Government agencies and think tanks utilize data analysis to inform policy decisions and societal strategies. They might analyze data relating to employment rates, GDP, public health, or educational attainment. These insights can inform policy development, assess the impact of existing policies, and guide strategies for societal improvement, revealing that data analysis is a key tool in managing social and economic progression.

Analysis Techniques and Insights

The examples above highlight the diverse applications of data analysis, but it’s essential to delve deeper into the techniques and methodologies that enable these insights. From exploratory analysis to predictive modeling, each approach serves a unique purpose and provides distinct perspectives.

Exploratory Analysis

  • Exploratory analysis is often the starting point in a data analysis process, allowing researchers to understand the main characteristics of a dataset. This technique involves visual methods such as scatter plots, histograms, and box plots, enabling analysts to summarize the data’s primary aspects, check for missing values, and test assumptions.
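
A minimal sketch of this first pass, assuming a small synthetic dataset with three numeric columns, might look like the following in Python.

    # Exploratory analysis sketch: summary statistics, missing-value counts,
    # and a few basic plots on a synthetic dataset.
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(5)
    df = pd.DataFrame({
        "age": rng.integers(18, 70, size=200),
        "income": rng.normal(50_000, 12_000, size=200),
        "spend": rng.normal(2_000, 600, size=200),
    })

    print(df.describe())        # central tendency, spread, quartiles
    print(df.isna().sum())      # check for missing values

    df.hist(column="income", bins=20)        # distribution of a single variable
    df.plot.scatter(x="income", y="spend")   # relationship between two variables
    df.boxplot(column="spend")               # quick outlier check
    plt.show()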

Regression Analysis

  • Regression analysis is a statistical method used to understand the relationship between a dependent variable and one or more independent variables. It is commonly employed for forecasting, time series modeling, and identifying causal effect relationships between variables. This technique is widely used in areas such as machine learning and business intelligence.
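
The sketch below fits an ordinary least squares regression with statsmodels on simulated data, relating a hypothetical sales outcome to advertising spend and price; the variable names and coefficients are assumptions chosen only for illustration.

    # Linear regression sketch: ordinary least squares on simulated data.
    # Variable names and true coefficients are invented for illustration.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    ad_spend = rng.uniform(1_000, 10_000, size=100)
    price = rng.uniform(5, 15, size=100)
    sales = 200 + 0.05 * ad_spend - 8 * price + rng.normal(0, 30, size=100)

    X = sm.add_constant(np.column_stack([ad_spend, price]))
    model = sm.OLS(sales, X).fit()
    print(model.summary())   # coefficients, standard errors, R-squared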

Factor Analysis

  • Factor analysis is a technique used to reduce a large number of variables into fewer factors, capturing the maximum possible information from the original variables. This approach is often utilized in market research, customer segmentation, and image recognition, enabling analysts to identify underlying patterns and relationships within complex datasets.
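
To make the idea concrete, here is a hedged sketch that uses scikit-learn's FactorAnalysis to recover two latent factors from six simulated, correlated survey items; the loadings and item structure are invented.

    # Factor analysis sketch: recovering two latent factors from six simulated,
    # correlated survey items with scikit-learn. All data are synthetic.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(11)
    n = 300
    latent = rng.normal(size=(n, 2))                    # two hidden factors
    loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                         [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
    items = latent @ loadings.T + rng.normal(0, 0.3, size=(n, 6))  # six observed items

    fa = FactorAnalysis(n_components=2, random_state=0)
    fa.fit(items)
    print(fa.components_.round(2))   # estimated loadings of each item on each factor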

Monte Carlo Simulation

  • Monte Carlo simulation is a technique that uses probability distributions and random sampling to estimate numerical results. It is frequently employed in risk analysis and decision-making scenarios where there is significant uncertainty, providing a powerful tool for exploring potential outcomes and informing strategic decisions.
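
As an illustration, the sketch below runs a Monte Carlo simulation to estimate the probability that a hypothetical project exceeds its budget, given assumed cost distributions for three tasks; every number is an assumption chosen for the example.

    # Monte Carlo sketch: estimating the probability that total project cost
    # exceeds a budget, under assumed cost distributions (all values invented).
    import numpy as np

    rng = np.random.default_rng(13)
    n_trials = 100_000
    task_a = rng.normal(50, 10, n_trials)            # cost in $k, assumed normal
    task_b = rng.normal(30, 8, n_trials)
    task_c = rng.triangular(10, 15, 30, n_trials)    # assumed triangular distribution

    total_cost = task_a + task_b + task_c
    budget = 110
    overrun_prob = np.mean(total_cost > budget)
    print(f"Estimated probability of exceeding the ${budget}k budget: {overrun_prob:.1%}")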

Key Lessons from Implementing Data Analysis in Various Industries

As we delve into the various examples and techniques of data analysis, several valuable lessons emerge:

  • Embrace a Data-Driven Mindset: Successful organizations recognize the value of data-driven decision-making and actively incorporate data analysis into their strategic planning and operations.
  • Foster Cross-Functional Collaboration: Effective data analysis often requires collaboration between different departments and stakeholders, enabling a holistic understanding of the problem at hand and facilitating comprehensive solutions.
  • Invest in Talent and Technology: Developing a skilled workforce proficient in data analysis techniques and leveraging cutting-edge tools and technologies are crucial for extracting meaningful insights from complex datasets.
  • Prioritize Data Quality: The accuracy and reliability of data analysis outcomes are heavily dependent on the quality of the input data. Implementing robust data governance practices and ensuring data integrity is essential.
  • Continuously Adapt and Evolve: The field of data analysis is constantly evolving, with new techniques and methodologies emerging regularly. Embracing a culture of continuous learning and adaptation is vital to staying ahead of the curve.

Best Practices from Real-World Data Analysis

To maximize the benefits of data analysis and ensure its successful implementation, it is essential to adopt best practices. These include:

  • Clearly Define Objectives: Before embarking on a data analysis project, clearly define the objectives, questions, and metrics to be addressed, ensuring alignment with organizational goals.
  • Establish Data Governance Frameworks: Implement robust data governance frameworks to ensure data quality, security, and compliance with relevant regulations and policies.
  • Leverage Automation: Explore opportunities to automate repetitive tasks and processes within the data analysis workflow, improving efficiency and reducing the risk of human error.
  • Encourage Collaboration and Knowledge Sharing: Foster an environment that promotes collaboration, knowledge sharing, and cross-functional communication, enabling a holistic approach to data analysis.
  • Continuously Monitor and Iterate: Regularly monitor and evaluate the effectiveness of data analysis initiatives, iterating and refining processes as needed to ensure ongoing relevance and alignment with evolving business needs.

In the digital era, data analysis has become a vital tool for organizations, enabling them to unleash the full potential of their data. The examples presented demonstrate the wide-ranging impact of data analysis, from operational optimization to driving innovation. Embracing a data-driven approach and staying current with emerging technologies will be key to unlocking future growth and success.

Data Analysis Examples for Beginners in 2024 – FAQs

What are some common examples of data analysis?

Common examples include sales data analysis for identifying trends, customer behavior analysis for marketing, and financial data analysis for predicting market trends.

Can you provide an example of data analysis in business?

Using sales data to identify top-selling products and make informed decisions about inventory and marketing strategies is a common example of data analysis in business.

How is data analysis used in healthcare?

Data analysis in healthcare involves studying patient records to identify disease patterns and treatment outcomes, aiding in improved patient care and resource allocation.

What are some examples of data visualization in data analysis?

Examples of data visualization include bar charts and line graphs for representing trends, and heat maps for spatial data analysis and identifying patterns.


Biodiversity Loss Increases the Risk of Disease Outbreaks, Analysis Suggests

Researchers found that human-caused environmental changes are driving the severity and prevalence of disease, putting people, animals and plants at risk

Christian Thorsberg, Daily Correspondent

A monarch butterfly sips nectar from an orange and red flower.

Human-driven changes to the planet are bringing widespread and sometimes surprising effects—including shifting the Earth’s rotation , hiding meteorites in Antarctic ice and, potentially, supporting locust swarms .

Now, a large-scale analysis of nearly 1,000 scientific studies has shown just how closely human activity is tied to public health. Published last week in the journal Nature, the findings suggest anthropogenic environmental changes are making the risk of infectious disease outbreaks all the more likely.

The biodiversity crisis—which has left some one million plant and animal species at risk of extinction —is a leading driver of disease spread, the researchers found.

“It could mean that by modifying the environment, we increase the risks of future pandemics,” Jason Rohr , a co-author of the study and a biologist at the University of Notre Dame, tells the Washington Post ’s Scott Dance.

An overhead view of a muddy Arctic river, surrounded by green forested areas and permafrost

The analysis centered on earlier studies that investigated at least one of five “global change drivers” affecting wildlife and landscapes on Earth: biodiversity change, climate change, habitat change or loss, chemical pollution and the introduction of non-native species to new areas. Based on the previous studies’ findings, they collected nearly 3,000 data points related to how each of these factors might impact the severity or prevalence of infectious disease outbreaks.

Researchers aimed to avoid a human-centric approach to their analysis, considering also how plants and animals would be at risk from pathogens. Their conclusions showed that four of the examined factors—climate change, chemical pollution, the introduction of non-native species to new areas and biodiversity loss—all increased the likelihood of spreading disease, with the latter having the most significant impact.

Disease and mortality were nearly nine times higher in areas of the world where human activity has decreased biodiversity, compared to the levels expected by Earth’s natural variation in biodiversity, per the Washington Post .

Scientists hypothesize this finding could be explained by the “dilution effect”: the idea that pathogens and parasites evolve to thrive in the most common species, so the loss of rarer creatures makes infection more likely.

“That means that the species that remain are the competent ones, the ones that are really good at transmitting disease,” Rohr tells the New York Times’ Emily Anthes.

For example, white-footed mice, the main carriers of Lyme disease, have become one of the most dominant species in their habitat as other, rarer animals have disappeared—a change that might have played a role, among other factors, in driving rising rates of Lyme disease in the United States.

A close-up of a mosquito

One global change factor, however, actually decreased the likelihood of disease outbreaks: habitat loss and change. But here, context is key. Most habitat loss is linked to creating a single type of environment—urban ecosystems—which generally have good sanitation systems and less wildlife, reducing opportunities for disease spillover.

“In urban areas with lots of concrete, there is a much smaller number of species that can thrive in that environment,” Rohr tells the Guardian ’s Phoebe Weston. “From a human disease perspective, there is often greater sanitation and health infrastructure than in rural environments.”

Deforestation, another type of habitat loss, has been shown to increase the likelihood of disease. The incidence of malaria and Ebola , for example, worsens in such instances.

The new work adds to past research on how human activity can prompt the spread of disease. For instance, climate change-induced permafrost melt may release pathogens from the Arctic , a concern that’s been well-documented in recent years. And both habitat loss and climate change may force some animals to move closer together—and closer to humans — increasing the potential for transmitting disease .

Additionally, the research signals the need for public health officials to remain vigilant as the effects of human-caused climate change play out, experts say.

“It’s a big step forward in the science,” Colin Carlson , a global change biologist at Georgetown University who was not an author of the new analysis, tells the New York Times. “This paper is one of the strongest pieces of evidence that I think has been published that shows how important it is health systems start getting ready to exist in a world with climate change, with biodiversity loss.”


Christian Thorsberg is an environmental writer and photographer from Chicago. His work, which often centers on freshwater issues, climate change and subsistence, has appeared in Circle of Blue, Sierra magazine, Discover magazine and Alaska Sporting Journal.

Artificial intelligence in strategy

Can machines automate strategy development? The short answer is no. However, there are numerous aspects of strategists’ work where AI and advanced analytics tools can already bring enormous value. Yuval Atsmon is a senior partner who leads the new McKinsey Center for Strategy Innovation, which studies ways new technologies can augment the timeless principles of strategy. In this episode of the Inside the Strategy Room podcast, he explains how artificial intelligence is already transforming strategy and what’s on the horizon. This is an edited transcript of the discussion. For more conversations on the strategy issues that matter, follow the series on your preferred podcast platform .

Joanna Pachner: What does artificial intelligence mean in the context of strategy?

Yuval Atsmon: When people talk about artificial intelligence, they include everything to do with analytics, automation, and data analysis. Marvin Minsky, the pioneer of artificial intelligence research in the 1960s, talked about AI as a “suitcase word”—a term into which you can stuff whatever you want—and that still seems to be the case. We are comfortable with that because we think companies should use all the capabilities of more traditional analysis while increasing automation in strategy that can free up management or analyst time and, gradually, introducing tools that can augment human thinking.

Joanna Pachner: AI has been embraced by many business functions, but strategy seems to be largely immune to its charms. Why do you think that is?


Yuval Atsmon: You’re right about the limited adoption. Only 7 percent of respondents to our survey about the use of AI say they use it in strategy or even financial planning, whereas in areas like marketing, supply chain, and service operations, it’s 25 or 30 percent. One reason adoption is lagging is that strategy is one of the most integrative conceptual practices. When executives think about strategy automation, many are looking too far ahead—at AI capabilities that would decide, in place of the business leader, what the right strategy is. They are missing opportunities to use AI in the building blocks of strategy that could significantly improve outcomes.

I like to use the analogy to virtual assistants. Many of us use Alexa or Siri but very few people use these tools to do more than dictate a text message or shut off the lights. We don’t feel comfortable with the technology’s ability to understand the context in more sophisticated applications. AI in strategy is similar: it’s hard for AI to know everything an executive knows, but it can help executives with certain tasks.


Joanna Pachner: What kind of tasks can AI help strategists execute today?

Yuval Atsmon: We talk about six stages of AI development. The earliest is simple analytics, which we refer to as descriptive intelligence. Companies use dashboards for competitive analysis or to study performance in different parts of the business that are automatically updated. Some have interactive capabilities for refinement and testing.

The second level is diagnostic intelligence, which is the ability to look backward at the business and understand root causes and drivers of performance. The level after that is predictive intelligence: being able to anticipate certain scenarios or options and the value of things in the future based on momentum from the past as well as signals picked in the market. Both diagnostics and prediction are areas that AI can greatly improve today. The tools can augment executives’ analysis and become areas where you develop capabilities. For example, on diagnostic intelligence, you can organize your portfolio into segments to understand granularly where performance is coming from and do it in a much more continuous way than analysts could. You can try 20 different ways in an hour versus deploying one hundred analysts to tackle the problem.

Predictive AI is both more difficult and more risky. Executives shouldn’t fully rely on predictive AI, but it provides another systematic viewpoint in the room. Because strategic decisions have significant consequences, a key consideration is to use AI transparently in the sense of understanding why it is making a certain prediction and what extrapolations it is making from which information. You can then assess if you trust the prediction or not. You can even use AI to track the evolution of the assumptions for that prediction.

Those are the levels available today. The next three levels will take time to develop. There are some early examples of AI advising actions for executives’ consideration that would be value-creating based on the analysis. From there, you go to delegating certain decision authority to AI, with constraints and supervision. Eventually, there is the point where fully autonomous AI analyzes and decides with no human interaction.


Joanna Pachner: What kind of businesses or industries could gain the greatest benefits from embracing AI at its current level of sophistication?

Yuval Atsmon: Every business probably has some opportunity to use AI more than it does today. The first thing to look at is the availability of data. Do you have performance data that can be organized in a systematic way? Companies that have deep data on their portfolios down to business line, SKU, inventory, and raw ingredients have the biggest opportunities to use machines to gain granular insights that humans could not.

Companies whose strategies rely on a few big decisions with limited data would get less from AI. Likewise, those facing a lot of volatility and vulnerability to external events would benefit less than companies with controlled and systematic portfolios, although they could deploy AI to better predict those external events and identify what they can and cannot control.

Third, the velocity of decisions matters. Most companies develop strategies every three to five years, which then become annual budgets. If you think about strategy in that way, the role of AI is relatively limited other than potentially accelerating analyses that are inputs into the strategy. However, some companies regularly revisit big decisions they made based on assumptions about the world that may have since changed, affecting the projected ROI of initiatives. Such shifts would affect how you deploy talent and executive time, how you spend money and focus sales efforts, and AI can be valuable in guiding that. The value of AI is even bigger when you can make decisions close to the time of deploying resources, because AI can signal that your previous assumptions have changed from when you made your plan.

Joanna Pachner: Can you provide any examples of companies employing AI to address specific strategic challenges?

Yuval Atsmon: Some of the most innovative users of AI, not coincidentally, are AI- and digital-native companies. Some of these companies have seen massive benefits from AI and have increased its usage in other areas of the business. One mobility player adjusts its financial planning based on pricing patterns it observes in the market. Its business has relatively high flexibility to demand but less so to supply, so the company uses AI to continuously signal back when pricing dynamics are trending in a way that would affect profitability or where demand is rising. This allows the company to quickly react to create more capacity because its profitability is highly sensitive to keeping demand and supply in equilibrium.

Joanna Pachner: Given how quickly things change today, doesn’t AI seem to be more a tactical than a strategic tool, providing time-sensitive input on isolated elements of strategy?

Yuval Atsmon: It’s interesting that you make the distinction between strategic and tactical. Of course, every decision can be broken down into smaller ones, and where AI can be affordably used in strategy today is for building blocks of the strategy. It might feel tactical, but it can make a massive difference. One of the world’s leading investment firms, for example, has started to use AI to scan for certain patterns rather than scanning individual companies directly. AI looks for consumer mobile usage that suggests a company’s technology is catching on quickly, giving the firm an opportunity to invest in that company before others do. That created a significant strategic edge for them, even though the tool itself may be relatively tactical.

Joanna Pachner: McKinsey has written a lot about cognitive biases and social dynamics that can skew decision making. Can AI help with these challenges?

Yuval Atsmon: When we talk to executives about using AI in strategy development, the first reaction we get is, “Those are really big decisions; what if AI gets them wrong?” The first answer is that humans also get them wrong—a lot. [Amos] Tversky, [Daniel] Kahneman, and others have proven that some of those errors are systemic, observable, and predictable. The first thing AI can do is spot situations likely to give rise to biases. For example, imagine that AI is listening in on a strategy session where the CEO proposes something and everyone says “Aye” without debate and discussion. AI could inform the room, “We might have a sunflower bias here,” which could trigger more conversation and remind the CEO that it’s in their own interest to encourage some devil’s advocacy.

We also often see confirmation bias, where people focus their analysis on proving the wisdom of what they already want to do, as opposed to looking for a fact-based reality. Just having AI perform a default analysis that doesn’t aim to satisfy the boss is useful, and the team can then try to understand why that is different than the management hypothesis, triggering a much richer debate.

In terms of social dynamics, agency problems can create conflicts of interest. Every business unit [BU] leader thinks that their BU should get the most resources and will deliver the most value, or at least they feel they should advocate for their business. AI provides a neutral way based on systematic data to manage those debates. It’s also useful for executives with decision authority, since we all know that short-term pressures and the need to make the quarterly and annual numbers lead people to make different decisions on the 31st of December than they do on January 1st or October 1st. Like the story of Ulysses and the sirens, you can use AI to remind you that you wanted something different three months earlier. The CEO still decides; AI can just provide that extra nudge.

Joanna Pachner: It’s like you have Spock next to you, who is dispassionate and purely analytical.

Yuval Atsmon: That is not a bad analogy—for Star Trek fans anyway.

Joanna Pachner: Do you have a favorite application of AI in strategy?

Yuval Atsmon: I have worked a lot on resource allocation, and one of the challenges, which we call the hockey stick phenomenon, is that executives are always overly optimistic about what will happen. They know that resource allocation will inevitably be defined by what you believe about the future, not necessarily by past performance. AI can provide an objective prediction of performance starting from a default momentum case: based on everything that happened in the past and some indicators about the future, what is the forecast of performance if we do nothing? This is before we say, “But I will hire these people and develop this new product and improve my marketing”— things that every executive thinks will help them overdeliver relative to the past. The neutral momentum case, which AI can calculate in a cold, Spock-like manner, can change the dynamics of the resource allocation discussion. It’s a form of predictive intelligence accessible today and while it’s not meant to be definitive, it provides a basis for better decisions.

Joanna Pachner: Do you see access to technology talent as one of the obstacles to the adoption of AI in strategy, especially at large companies?

Yuval Atsmon: I would make a distinction. If you mean machine-learning and data science talent or software engineers who build the digital tools, they are definitely not easy to get. However, companies can increasingly use platforms that provide access to AI tools and require less from individual companies. Also, this domain of strategy is exciting—it’s cutting-edge, so it’s probably easier to get technology talent for that than it might be for manufacturing work.

The bigger challenge, ironically, is finding strategists or people with business expertise to contribute to the effort. You will not solve strategy problems with AI without the involvement of people who understand the customer experience and what you are trying to achieve. Those who know best, like senior executives, don’t have time to be product managers for the AI team. An even bigger constraint is that, in some cases, you are asking people to get involved in an initiative that may make their jobs less important. There could be plenty of opportunities for incorpo­rating AI into existing jobs, but it’s something companies need to reflect on. The best approach may be to create a digital factory where a different team tests and builds AI applications, with oversight from senior stakeholders.


Joanna Pachner: Do you think this worry about job security and the potential that AI will automate strategy is realistic?

Yuval Atsmon: The question of whether AI will replace human judgment and put humanity out of its job is a big one that I would leave for other experts.

The pertinent question is shorter-term automation. Because of its complexity, strategy would be one of the later domains to be affected by automation, but we are seeing it in many other domains. However, the trend for more than two hundred years has been that automation creates new jobs, although ones requiring different skills. That doesn’t take away the fear some people have of a machine exposing their mistakes or doing their job better than they do it.

Joanna Pachner: We recently published an article about strategic courage in an age of volatility that talked about three types of edge that business leaders need to develop. One of them is an edge in insights. Do you think AI has a role to play in furnishing a proprietary insight edge?

Yuval Atsmon: One of the challenges most strategists face is the overwhelming complexity of the world we operate in—the number of unknowns, the information overload. At one level, it may seem that AI will provide another layer of complexity. In reality, it can be a sharp knife that cuts through some of the clutter. The question to ask is, Can AI simplify my life by giving me sharper, more timely insights more easily?

Joanna Pachner: You have been working in strategy for a long time. What sparked your interest in exploring this intersection of strategy and new technology?

Yuval Atsmon: I have always been intrigued by things at the boundaries of what seems possible. Science fiction writer Arthur C. Clarke’s second law is that to discover the limits of the possible, you have to venture a little past them into the impossible, and I find that particularly alluring in this arena.

AI in strategy is at a very nascent stage but could be very consequential for companies and for the profession. For a top executive, strategic decisions are the biggest way to influence the business, other than maybe building the top team, and it is amazing how little technology is leveraged in that process today. It’s conceivable that competitive advantage will increasingly rest in having executives who know how to apply AI well. In some domains, like investment, that is already happening, and the difference in returns can be staggering. I find helping companies be part of that evolution very exciting.
