# A Quick Guide for Choosing the Appropriate Statistical Test

The most important step in choosing the appropriate statistical procedure is to know what the variables of your study are. What are the independent and dependent variables of your study? How is each variable measured? Once you have a better grasp of your variables, you can more easily choose the statistical procedure that will best answer your study's questions.

## An Example

Five students are asked to design a study that will assess the relationship between using the Wii Fit and weight loss in a group of 150 overweight pre-teens during a month-long period. The weight of the participants is taken at the beginning and at the end of the study. The students come up with five different proposals:

- Student A proposes to randomly assign half of the group to the Wii intervention. This group will be instructed to do Wii Fit Aerobics training for 30 minutes five times a week. The other half would serve as the control and will not be told to do anything. Student A then wants to determine whether using the Wii Fit would lead to weight loss. Accordingly, she suggests that the participants be classified into one of two groups: no weight loss and weight loss.
- Student B also proposes to randomly assign half of the group to the Wii intervention. This group will be instructed to do Wii Fit Aerobics training for 30 minutes five times a week. The other half would serve as the control and will not be told to do anything. Student B then wants to determine whether using the Wii Fit would lead to weight loss. Thus, he suggests that weight loss should be defined in terms of the difference in weight prior to and immediately after the study.
- Student C proposes to randomly assign 30 pre-teens to each of four types of Wii interventions: Aerobics, Yoga, Strength Training, and Balance Games. These four groups will be instructed to do the specific Wii Fit activity for 30 minutes five times a week. The remaining 30 pre-teens would serve as the control and will not be told to do anything. Student C then wants to determine whether using the Wii Fit would lead to weight loss. Thus, she suggests that weight loss should be defined in terms of the difference in weight prior to and immediately after the study.
- Student D proposes to simply ask the 150 overweight pre-teens to record the number of minutes per day they spend using the Wii Fit. He then suggests that the participants be classified into one of two groups: no weight loss and weight loss.
- Student E also proposes to simply ask the 150 overweight pre-teens to record the number of minutes per day they spend using the Wii Fit. She then suggests that weight loss should be defined in terms of the difference in weight prior to and immediately after the study.

### First Point: Variables of the Study

What are the variables of the study? Using the Wii Fit would be the independent variable of the study while weight loss would be the dependent variable of the study.

### Second Point: Definition or Measurement of the Variables

From the example above it is obvious that there are several ways to define or measure the independent and dependent variables of a study. But there are two main questions to consider:

- Is the independent variable measured categorically or continuously?
- Is the dependent variable measured categorically or continuously?

**Student A**. Student A defined Wii Fit use in terms of using the Wii Fit or not using the Wii Fit. Accordingly, Student A defined Wii Fit use in terms of categories. She defined weight loss in terms of no weight loss or weight loss. Thus, her definition of weight loss was categorical.

**Student B**. Student B also defined Wii Fit use in terms of using the Wii Fit or not using the Wii Fit. Thus, he defined Wii Fit use in terms of categories. But Student B defined weight loss in terms of the difference between weight prior to the study and weight immediately after the study. Weight loss, therefore, was defined continuously.

**Student C**. Student C defined Wii Fit use in terms of the type of Wii Fit activity. Thus, she defined Wii Fit use in terms of five categories. She also defined weight loss in terms of the difference between weight prior to the study and weight immediately after the study. Weight loss, therefore, was defined continuously.

**Student D**. Student D defined Wii Fit use in terms of the number of minutes per day spent using the Wii Fit. As such, Wii Fit use was defined continuously. He defined weight loss in terms of no weight loss or weight loss. Thus, his definition of weight loss was categorical.

**Student E**. Student E defined Wii Fit use in terms of the number of minutes per day spent using the Wii Fit. As such, Wii Fit use was defined continuously. Student E also defined weight loss in terms of the difference between weight prior to the study and weight immediately after the study. Weight loss, therefore, was defined continuously.

### Third Point: Choosing the Appropriate Statistical Procedure

Given that independent and dependent variables can be classified as categorical or continuous, the grid below can be used to classify the more common statistical procedures.

| | Dependent variable: categorical | Dependent variable: continuous |
|---|---|---|
| **Independent variable: categorical** | Cross-tabulation | t-test |
| **Independent variable: continuous** | Logistic regression | Correlation |
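The grid can also be read as a small lookup. The sketch below is only an illustration: the function name `choose_procedure` and its `iv_levels` parameter are hypothetical, and the one refinement it adds (a one-way ANOVA instead of a t-test when the categorical independent variable has three or more categories) is the rule described for Student C later in this guide.

```python
def choose_procedure(iv_type, dv_type, iv_levels=2):
    """Return the procedure the grid suggests for one IV and one DV.

    iv_type and dv_type are "categorical" or "continuous"; iv_levels is the
    number of categories of a categorical independent variable.
    """
    grid = {
        ("categorical", "categorical"): "cross-tabulation",
        ("categorical", "continuous"): "t-test",
        ("continuous", "categorical"): "logistic regression",
        ("continuous", "continuous"): "correlation",
    }
    procedure = grid[(iv_type, dv_type)]
    # A t-test handles exactly two categories; three or more categories
    # call for a one-way ANOVA.
    if procedure == "t-test" and iv_levels >= 3:
        procedure = "one-way ANOVA"
    return procedure


# The five proposals, classified as in the Second Point above.
designs = {
    "A": ("categorical", "categorical", 2),
    "B": ("categorical", "continuous", 2),
    "C": ("categorical", "continuous", 5),
    "D": ("continuous", "categorical", 2),
    "E": ("continuous", "continuous", 2),
}
for student, args in designs.items():
    print(f"Student {student}: {choose_procedure(*args)}")
```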

**Student A**. Student A could thus choose to perform either a cross-tabulation analysis or a logistic regression procedure. These tests are useful when the independent and dependent variables are measured categorically.
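A cross-tabulation is commonly paired with a chi-square test of independence, which this guide does not name explicitly. As a minimal sketch of that pairing, the code below computes the chi-square statistic for a 2 x 2 table using only the Python standard library. The counts are made up for illustration and are not results from any study; in practice a statistics package would also report the p-value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts. Rows: Wii group, control group.
# Columns: weight loss, no weight loss.
observed = [[45, 30],
            [30, 45]]
stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {stat:.2f}, df = {df}")
```

With one degree of freedom, the statistic is compared against the critical value of 3.84 at the .05 level.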

**Student B**. Student B would need to conduct an independent t-test procedure since his independent variable would be defined in terms of categories and his dependent variable would be measured continuously. An independent t-test procedure is used only when the independent variable has two categories.
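To make the procedure concrete, here is a minimal sketch of the pooled-variance (equal variances assumed) independent t statistic, written with the Python standard library. The weight-loss values are invented for illustration, and the function name `independent_t` is hypothetical; a statistics package would normally be used and would also supply the p-value.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical weight loss (weight before minus weight after), in pounds.
wii_loss = [1.0, 2.0, 3.0, 4.0, 5.0]
control_loss = [0.0, 1.0, 2.0, 3.0, 4.0]

t = independent_t(wii_loss, control_loss)
df = len(wii_loss) + len(control_loss) - 2
print(f"t = {t:.2f}, df = {df}")
```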

**Student C**. Student C would need to conduct a one-way ANOVA since her independent variable would be defined in terms of categories and her dependent variable would be measured continuously. One-way ANOVAs are used when the independent variable has three or more categories.
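The one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic by hand for five small groups, using only the Python standard library. The values are invented for illustration and the function name `one_way_anova_f` is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of participants
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical weight loss in pounds, three participants per group.
groups = [
    [2.0, 3.0, 4.0],  # Aerobics
    [1.0, 2.0, 3.0],  # Yoga
    [2.0, 3.0, 4.0],  # Strength Training
    [1.0, 2.0, 3.0],  # Balance Games
    [0.0, 1.0, 2.0],  # Control
]
f_stat = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}")
```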

**Student D**. Student D would use a logistic regression procedure to analyze his data since his independent variable would be measured continuously and his dependent variable would be measured categorically. If Student D defined his dependent variable in terms of three or more categories that could be ranked (e.g., weight gain, no change, weight loss), then he would use an ordinal regression procedure.
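As a rough illustration of what a logistic regression estimates, the sketch below fits a one-predictor model by plain gradient descent using only the Python standard library. This is a teaching sketch under stated assumptions, not how statistical packages fit the model (they use more robust maximum-likelihood routines and report standard errors). The data, the function names, and the learning-rate and step settings are all hypothetical.

```python
from math import exp


def predict(b0, b1, x):
    """Predicted probability of weight loss for predictor value x."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * x)))


def fit_logistic(x, y, learning_rate=0.5, steps=5000):
    """Estimate intercept b0 and slope b1 by gradient descent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for xi, yi in zip(x, y):
            error = predict(b0, b1, xi) - yi
            grad0 += error
            grad1 += error * xi
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


# Hypothetical data: minutes per day on the Wii Fit, and whether the
# participant lost weight (1) or not (0).
minutes = [0, 10, 20, 30, 40, 50, 60, 70]
lost_weight = [0, 0, 1, 0, 1, 0, 1, 1]
hours = [m / 60 for m in minutes]  # rescaled to keep the fit stable

b0, b1 = fit_logistic(hours, lost_weight)
print(f"intercept = {b0:.2f}, slope per hour = {b1:.2f}")
print(f"P(weight loss | 60 min/day) = {predict(b0, b1, 1.0):.2f}")
```

A positive slope means that the predicted probability of weight loss rises with daily minutes of use.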

**Student E**. Student E could choose to perform either a Pearson correlation procedure or a linear regression procedure since both of her variables would be defined continuously. Usually, a correlation test is conducted when there is only one independent variable and one dependent variable. If Student E wanted to study the relationship between several independent variables (e.g., number of hours spent sleeping, number of calories consumed per day) and weight loss, then she would use a linear regression procedure.
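The sketch below computes a Pearson correlation and a simple (one-predictor) linear regression by hand with the Python standard library. It covers only the single-predictor case; a regression with several independent variables would need a statistics package. The data are invented for illustration and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simple_linear_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: minutes per day on the Wii Fit and weight loss in pounds.
minutes = [10, 20, 30, 40, 50]
weight_loss = [0.5, 1.0, 2.0, 2.0, 3.0]

r = pearson_r(minutes, weight_loss)
slope, intercept = simple_linear_regression(minutes, weight_loss)
print(f"r = {r:.3f}")
print(f"predicted loss = {intercept:.2f} + {slope:.2f} * minutes")
```

The correlation describes the strength and direction of the linear relationship, while the regression slope estimates how much weight loss changes with each additional minute of daily use.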

## About the Author

## Victoria Briones, PhD

### Organizational Psychology

**DR. BRIONES**, a graduate from Columbia University and former fellow at the Harvard Kennedy School of Government, taught Applied Regression Analysis and Research Methods to graduate students. In the last eight years, Victoria has worked as a dissertation and statistics consultant, helping graduate students in psychology, education, nursing, biology, and business formulate/hone their study hypotheses, arrive at better operational definitions for their study variables, improve procedures to increase the internal and the external validity of their studies, test their study hypotheses, analyze their resulting data, and understand their results.

In addition to having a strong grasp of research methods, Victoria's primary areas of expertise are in testing moderation and mediation hypotheses (using SPSS) and in testing path, measurement (via a confirmatory factor analysis or CFA), and structural models using the AMOS, LISREL, EQS, and MPlus programs. She is also able to explain relatively complex procedures and results to clients who have minimal knowledge of research methods and statistics.

Victoria is familiar with non-parametric procedures such as Mann-Whitney, Kruskal-Wallis, and chi-square tests. But her strengths lie in conducting basic (e.g., Pearson correlations, t-tests, and ANOVAs) and complex parametric procedures such as exploratory factor analysis (EFA), regression (i.e., linear, logistic, and multinomial), mixed-ANOVA, MANOVA, and discriminant analysis. Moreover, Victoria provides her clients with concise and coherent summaries of the results of these procedures.

Scope: research methods, reliability analyses, t-tests, ANOVA, repeated-measures ANOVA, ANCOVA, exploratory and confirmatory factor analyses, multiple linear regression, logistic regression, MANOVA, structural equation modeling (AMOS, LISREL, and EQS).

### Feedback

"She is the best stat consultant. I could not finish my dissertation without her. She is available almost 24 hours and responded to all of my questions. She returned my work so quickly. Her explanation was always clear and accurate even though I am not good at stat. She is so patient for explaining every detail stat question. She is amazing!"

(Contact information available on request)