Which section of a research article allows the reader to review the descriptive and inferential statistics for each hypothesis or research question?

Fam Med Community Health. 2019; 7[2]: e000067.

Abstract

The purpose of this article is to provide an accessible introduction to foundational statistical procedures and present the steps of data analysis to address research questions and meet standards for scientific rigour. It is aimed at individuals new to research with less familiarity with statistics, or anyone interested in reviewing basic statistics. After examining a brief overview of foundational statistical techniques, for example, differences between descriptive and inferential statistics, the article illustrates 10 steps in conducting statistical analysis with examples of each. The following are the general steps for statistical analysis: [1] formulate a hypothesis, [2] select an appropriate statistical test, [3] conduct a power analysis, [4] prepare data for analysis, [5] start with descriptive statistics, [6] check assumptions of tests, [7] run the analysis, [8] examine the statistical model, [9] report the results and [10] evaluate threats to validity of the statistical analysis. Researchers in family medicine and community health can follow specific steps to ensure a systematic and rigorous analysis.

Keywords: Family Medicine, Community Health Services, Methodology, Statistics

Investigators in family medicine and community health often employ quantitative research to address aims that examine trends, relationships among variables or comparisons of groups [Fetters, 2019, this issue]. Quantitative research involves collecting structured or closed-ended data, typically in the form of numbers, and analysing that numeric data to address research questions and test hypotheses. Research hypotheses provide a proposition about the expected outcome of research that may be assessed using a variety of methodologies, while statistical hypotheses are specific statements about propositions that can only be tested statistically. Statistical analysis requires a series of steps beginning with formulating hypotheses and selecting appropriate statistical tests. After preparing data for analysis, researchers then proceed with the actual statistical analysis and finally report and interpret the results.

Family medicine and community health researchers often limit their analyses to descriptive statistics—reporting frequencies, means and standard deviation [SD]. While sometimes an appropriate stopping point, researchers may be missing opportunities for more advanced analyses. For example, knowing that patients have favourable attitudes about a treatment may be important and can be addressed with descriptive statistics. On the other hand, finding that attitudes are different [or not] between men and women and that difference is statistically significant may give even more actionable information to healthcare professionals. The latter question, about differences, can be addressed through inferential statistical tests. The purpose of this article is to provide an accessible introduction to foundational statistical procedures and present the steps of data analysis to address research questions and meet standards for scientific rigour. It is aimed at individuals new to research with less familiarity with statistics and may be helpful information when reading research or conducting peer review.

Foundational statistical techniques

Statistical analysis is a method of aggregating numeric data and drawing inferences about variables. Statistical procedures may be broadly classified into [1] statistics that describe data—descriptive statistics; and [2] statistics that make inferences about more general situations beyond the actual data set—inferential statistics.

Descriptive statistics

Descriptive statistics aggregate data that are grouped into variables to examine typical values and the spread of values for each variable in a data set. Statistics summarising typical values are referred to as measures of central tendency and include the mean, median and mode. The spread of values is represented through measures of variability, including the variance, SD and range. Together, descriptive statistics provide indicators of the distribution of data, or the frequency of values through the data set as in a histogram plot. Table 1 summarises commonly used descriptive statistics. For consistency, I use the terms independent variable and dependent variable, but in some fields and types of research such as correlational studies the preferred terms may be predictor and outcome variable. An independent variable influences, affects or predicts a dependent variable.

Table 1

Descriptive statistics

| Category | Statistic | Description of calculation | Intent |
|---|---|---|---|
| Measures of central tendency | Mean | Total of values divided by the number of values. | Describe all responses with the average value. |
| Measures of central tendency | Median | Arrange all values in order and determine the halfway point. | Determine the middle value among all values, which is important when dealing with extreme outliers. |
| Measures of central tendency | Mode | Examine all values and determine which one appears most frequently. | Describe the most common value. |
| Measures of variability | Variance | Calculate the difference of each value from the mean, square this difference score, sum all of the squared difference scores and divide by the number of values minus 1. | Provide an indicator of spread. |
| Measures of variability | Standard deviation | Square root of variance. | Give an indicator of spread by reporting on average how much values differ from the mean. |
| Measures of variability | Range | The difference between the maximum and minimum value. | Give a very general indicator of spread. |
| Frequencies | Frequencies | Count the number of occurrences of each value. | Provide a distribution of how many times each value occurs. |
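The calculations in Table 1 can be sketched with Python's standard library. This is a minimal illustration, not part of the original article: the `scores` list is hypothetical data invented for the example, and the variance and SD use the sample formulas (dividing by the number of values minus 1), matching the table's description.

```python
import statistics
from collections import Counter

# Hypothetical attitude scores (illustrative data only, not from the article)
scores = [2, 4, 4, 5, 7, 8]

# Measures of central tendency
mean = statistics.mean(scores)      # total of values divided by the number of values
median = statistics.median(scores)  # halfway point of the ordered values
mode = statistics.mode(scores)      # value that appears most frequently

# Measures of variability
variance = statistics.variance(scores)   # sum of squared deviations / (n - 1)
sd = statistics.stdev(scores)            # square root of the sample variance
value_range = max(scores) - min(scores)  # maximum minus minimum

# Frequencies: how many times each value occurs
frequencies = Counter(scores)

print(mean, median, mode, variance, round(sd, 2), value_range)
print(sorted(frequencies.items()))
```

With a single extreme value added to `scores`, the mean would shift noticeably while the median would move little, which is why the table recommends the median when outliers are present.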

Inferential statistics: comparing groups with t tests and ANOVA

Inferential statistics are another broad category of techniques that go beyond describing a data set. Inferential statistics can help researchers draw conclusions from a sample to a population.1 We can use inferential statistics to examine differences among groups and the relationships among variables. Table 2 presents a menu of common, fundamental inferential tests. Remember that even more complex statistics rely on these as a foundation.

Table 2

Inferential statistics

| Statistic | Intent |
|---|---|
| t tests | Compare groups to examine whether the difference in means between two groups is statistically significant. |
| Analysis of variance | Compare groups to examine whether differences in means among two or more groups are statistically significant. |
| Correlation | Examine whether there is a relationship or association between two or more variables. |
| Regression | Examine how one or more variables predict another variable. |

The t test is used to compare two group means by determining whether group differences are likely to have occurred randomly by chance or systematically indicating a real difference. Two common forms are the independent samples t test, which compares means of two unrelated groups, such as means for a treatment group relative to a control group, and the paired samples t test, which compares means of related groups, such as the pretest and post-test scores for the same individuals before and after a treatment. A t test is essentially determining whether the difference in means between groups is larger than the variability within the groups themselves.
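To make the idea of "difference in means relative to variability within groups" concrete, here is a minimal sketch of both t statistics using only the standard library. The function names and all data are hypothetical, and the independent samples version assumes equal variances in the two groups (the pooled, Student form). It returns the t statistic and degrees of freedom only; converting these to a p value requires the t distribution, for example from a statistical package.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t for two unrelated groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: within-group variability combined across both groups
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, n_a + n_b - 2  # (t statistic, degrees of freedom)

def paired_t(before, after):
    """Paired t for related scores, e.g. pretest and post-test of the same people."""
    diffs = [post - pre for pre, post in zip(before, after)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_error, n - 1

# Hypothetical data for illustration only
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
t_ind, df_ind = independent_t(treatment, control)

pretest = [10, 12, 9, 11, 13]
posttest = [12, 13, 11, 14, 13]
t_pair, df_pair = paired_t(pretest, posttest)

print(round(t_ind, 3), df_ind)
print(round(t_pair, 3), df_pair)
```

In both functions the numerator is a difference in means and the denominator is a standard error built from within-group (or within-pair) variability, so a larger t indicates a difference that is large relative to that variability.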

Another fundamental set of inferential statistics falls under the general linear model and includes analysis of variance [ANOVA], correlation and regression. To determine whether group means are different, use the t test or the ANOVA. Note that the t test is limited to two groups, but the ANOVA is applicable to two or more groups. For example, an ANOVA could examine whether a primary outcome measure—dependent variable—is significantly different for groups assigned to one of three different interventions. The ANOVA result comes in an F statistic along with a p value or confidence interval [CI], which tells whether there is some significant difference among groups. We then need to use other statistics [eg, planned comparisons or a Bonferroni comparison, to give two possibilities] to determine which of those groups are significantly different from one another. Planned comparisons are established before conducting the analysis to contrast the groups, while other tests like the Bonferroni comparison are conducted post-hoc [ie, after analysis].
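The F statistic for the three-intervention example can be sketched as the ratio of between-group to within-group variability. This is an illustrative one-way ANOVA with hypothetical outcome scores and a hypothetical function name; it does not compute the p value, which requires the F distribution. The last lines show one common form of the Bonferroni adjustment, dividing the significance level by the number of pairwise comparisons, as an assumption about how the post-hoc step might be set up.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each value is from its own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical outcome scores for three interventions (illustration only)
interventions = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(interventions)

# Bonferroni adjustment for post-hoc pairwise comparisons (assumed alpha of 0.05)
k = len(interventions)
n_comparisons = k * (k - 1) // 2
bonferroni_alpha = 0.05 / n_comparisons

print(f_stat, df_between, df_within)
print(n_comparisons, round(bonferroni_alpha, 4))
```

A significant F only indicates that some difference exists among the groups; the pairwise comparisons, each judged against the adjusted significance level, indicate which groups differ.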

Examining relationships using correlation and regression

The general linear model contains two other major methods of analysis, correlation and regression. Correlation reveals whether values between two variables tend to systematically change together. Correlation analysis has three general outcomes: [1] the two variables rise and fall together; [2] as values in one variable rise, the other falls; and [3] the two variables do not appear to be systematically related. To make those determinations, we use the correlation coefficient [r] and related p value or CI. First, use the p value or CI, as compared with established significance criteria [eg, p<0.05], to determine whether the relationship is statistically significant.

[...]

For a t test, if the result is not statistically significant [eg, p>0.05 or a CI crossing 0], we can conclude no significant difference between groups. The communication assessment exemplar reports significance of the t tests along with measures such as equality of variance.

For an ANOVA, if the F statistic is not statistically significant [eg, p>0.05 or a CI crossing 0], we can conclude no significant difference between groups and stop because there is no point in further examining what groups may be different. If the F statistic is significant in an ANOVA, we can then use contrasts or post-hoc tests to examine what is different. For a correlation test, if the r value is not statistically significant [eg, p>0.05 or a CI crossing 0], we can stop because there is no point in looking at the magnitude or direction of the coefficient. If it is significant, we can proceed to interpret the r. Finally, for a regression, we can examine the F statistic as an omnibus test and its significance. If it is not significant, we can stop. If it is significant, then examine the p value of each independent variable and residuals.
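The "check significance first, then interpret" sequence for a correlation can be sketched as follows. This is a minimal illustration with hypothetical data and function names; the p value is passed in as an input (it would normally come from statistical software), and the 0.05 threshold is an assumed significance criterion.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)

def interpret_correlation(r, p_value, alpha=0.05):
    """Interpret r only if the test is significant (here, p < alpha)."""
    if p_value >= alpha:
        # Not significant: no point in examining magnitude or direction
        return "not significant: stop"
    direction = "positive" if r > 0 else "negative"
    return f"significant {direction} relationship, r = {r:.2f}"

# Hypothetical paired measurements (illustration only)
visits = [1, 2, 3, 4, 5]
satisfaction = [2, 4, 5, 4, 5]
r = pearson_r(visits, satisfaction)

print(round(r, 3))
print(interpret_correlation(r, p_value=0.20))   # hypothetical p value
print(interpret_correlation(-0.5, p_value=0.01))  # hypothetical r and p value
```

A positive r corresponds to the two variables rising and falling together, and a negative r to one rising as the other falls.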

Step 9. Report the results of statistical analysis

When writing statistical results, always start with descriptive statistics and note whether assumptions for tests were met. When reporting inferential statistical tests, give the statistic itself [eg, an F statistic], the measure of significance [p value or CI], the effect size and a brief written interpretation of the statistical test. The interpretation, for example, could note that an intervention was not significantly different from the control or that it was associated with improvement that was statistically significant. For example, the exemplar study gives the pre–post means along with the standard error, t statistic, p value and an interpretation that postseminar means were lower, along with a reminder to the reader that lower is better.6

When writing for a journal, follow the journal’s style. Many styles italicise non-Greek statistics [eg, the p value], but follow the particular instructions given. Remember a p value can never be 0 even though some statistical programs round the p to 0. In that case, most styles prefer to report as p<0.001.
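A small helper can guard against reporting a p value of 0 when software rounds it down. This is a sketch only: the function name is hypothetical, and the 0.001 floor and three decimal places are assumed conventions that a given journal's instructions may override.

```python
def format_p(p_value, floor=0.001):
    """Format a p value for reporting, never printing it as 0."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p value must lie between 0 and 1")
    if p_value < floor:
        # Very small (or rounded-to-zero) values are reported as below the floor
        return f"p<{floor}"
    return f"p={p_value:.3f}"

print(format_p(0.0))     # software output rounded to 0
print(format_p(0.0312))
```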
