How to Analyze a Data Set?

Analytical Robustness

Aczel et al. (2026)

Aczel, B., Szaszi, B., Clelland, H. T., Kovacs, M., Holzmeister, F., Ravenzwaaij, D. van, et al. (2026). Investigating the analytical robustness of the social and behavioural sciences. Nature 652, 135–142. doi: 10.1038/s41586-025-09844-9

Analytical Robustness (2)

An Example Data Set

Table 1: Data provided by Andrew Gelman (Gelman and Hill, 2007).

	mom_iq	kid_score
1	121.1	65
2	89.4	98
3	115.4	85
4	99.4	83
5	92.7	115
6..429
430	84.9	94
431	93.0	76
432	94.9	50
433	96.9	88
434	91.3	70

The Research Question

Is there a relation between the mom's IQ and the kid's test score?

What would be the best method to answer this question?

The Answer Is:

It depends…

…on what you REALLY want to find out
…on your (alternative) hypothesis
…on the intended audience

…on the expected effect size
…on the possible sample size
…on the analyst’s knowledge & tendencies

The Methods Toolbox

One Grouping Variable

Location

Test the scores of kids with low-IQ moms against the scores of kids with high-IQ moms

t-test (power loss if sample sizes and variances are both unequal)
Welch’s test
One-way ANOVA (power loss if sample sizes are unequal)

Distribution

Test the scores of kids with low-IQ moms against the scores of kids with high-IQ moms

Mann-Whitney U test (a.k.a. Wilcoxon rank sum test)

Continuous Data

Dependence

Correlation
Regression

Categorical Data

Independence

Test whether the mom's IQ group (high vs low) is independent of the kid's score group (high vs low)

Pearson’s \(\chi^2\)-test

Relation Between Mom’s IQ and Child’s Test Score?


Location Difference Between Low and High IQ

‘Make’ a new variable, high_iq, set to 1 for moms with IQ >= 100, 0 otherwise
Assess your test’s assumptions (distribution, variance, sample sizes!)


    Welch Two Sample t-test

data:  kid_score by high_iq
t = -9, df = 431, p-value <2e-16
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -19.3 -12.2
sample estimates:
mean in group 0 mean in group 1 
           79.7            95.5
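The dichotomize-then-test steps above can be sketched in R. This is a minimal sketch on simulated stand-in data, not the original kids data: the simulation parameters (mean 100, SD 15, slope 0.6, residual SD 18) are assumptions chosen only so the code runs on its own.

```r
# Simulated stand-in for the kids data (NOT the original values).
set.seed(1)
n <- 434
mom_iq <- rnorm(n, mean = 100, sd = 15)
kid_score <- 87 + 0.6 * (mom_iq - 100) + rnorm(n, mean = 0, sd = 18)
kids <- data.frame(mom_iq, kid_score)

# Dichotomize: 1 for moms with IQ >= 100, 0 otherwise.
kids$high_iq <- as.integer(kids$mom_iq >= 100)

# Check group sizes and variances before choosing a test.
group_n <- table(kids$high_iq)
group_var <- tapply(kids$kid_score, kids$high_iq, var)
print(group_n)
print(group_var)

# t.test() uses var.equal = FALSE by default, which is Welch's test.
welch <- t.test(kid_score ~ high_iq, data = kids)
print(welch)
```

Passing var.equal = TRUE instead would give Student's t-test, which is the variant that suffers when group sizes and variances both differ.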

Distribution Difference between Low and High IQ


    Wilcoxon rank sum test with continuity correction

data:  kid_score by high_iq
W = 12716, p-value = 4e-16
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -19 -12
sample estimates:
difference in location 
                   -15

Always plot the distributions of both groups: the Mann-Whitney test can reject the null even when the medians are identical, as long as the two distributions differ in shape or spread. It can be read as a test of a location shift only if both groups have the same shape.
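A small simulation illustrates this caveat. It is a sketch under assumed distributions (a right-skewed and a mirrored left-skewed exponential, both shifted so that the population median is 0); none of these values come from the kids data.

```r
set.seed(2)
n <- 1000

# The median of an Exp(1) variable is log(2), so subtracting it gives median 0.
x <- rexp(n, rate = 1) - log(2)      # right-skewed, population median 0
y <- -(rexp(n, rate = 1) - log(2))   # left-skewed mirror image, population median 0

# The sample medians are nearly identical ...
sample_medians <- c(x = median(x), y = median(y))
print(sample_medians)

# ... yet the rank sum test rejects, because P(X < Y) is not 0.5.
mw <- wilcox.test(x, y)
print(mw$p.value)
```

Plotting both samples, for example with hist() or boxplot(), would show the mirrored shapes that drive the rejection.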

Dependence

Correlation

Continuous data
How are mom_iq and kid_score associated?
Pearson’s product-moment correlation coefficient \(r\)

corr_coeff <- cor(
  kids$mom_iq, 
  kids$kid_score,
  method = "pearson"
)

0.448

Dependence

Regression

Continuous data
Centered (or standardized) \(X\) variable
Can mom_iq explain/predict kid_score?

\(Y_i = f(X_i, \beta) + \varepsilon_i\)
Ordinary least squares (OLS) regression
\(f(X_i, \beta) = \beta_0 + \beta_1 X_i\)

    (Intercept) standard_mom_iq 
          86.80            0.61
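The effect of centering on the OLS coefficients can be sketched as follows. The data are simulated stand-ins (not the kids data), and the name centered_mom_iq is hypothetical; the point is only that centering \(X\) leaves the slope unchanged and turns the intercept into the mean of \(Y\).

```r
# Simulated stand-in data (NOT the original values).
set.seed(3)
n <- 434
mom_iq <- rnorm(n, mean = 100, sd = 15)
kid_score <- 87 + 0.6 * (mom_iq - 100) + rnorm(n, mean = 0, sd = 18)

# Center the predictor by subtracting its mean.
centered_mom_iq <- mom_iq - mean(mom_iq)

fit_raw <- lm(kid_score ~ mom_iq)
fit_centered <- lm(kid_score ~ centered_mom_iq)

# Same slope in both fits; the centered intercept equals mean(kid_score).
print(coef(fit_raw))
print(coef(fit_centered))
print(mean(kid_score))
```

With a centered predictor the intercept is the expected score at an average mom IQ, which is usually easier to interpret than the expected score at an IQ of 0.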

Independence

By dichotomizing both mom_iq and kid_score we can test independence:

       high_score
high_iq   0   1
      0 138 101
      1  48 147


    Pearson's Chi-squared test with Yates' continuity correction

data:  tabyl(kids, high_iq, high_score)
X-squared = 47, df = 1, p-value = 8e-12
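The test can be reproduced from the cell counts alone. The sketch below enters the 2x2 table shown above by hand as a matrix, which is an assumption about how to rebuild it; the slides build it from the data with tabyl().

```r
# Cell counts copied from the contingency table above (filled column by column).
tab <- matrix(
  c(138, 48, 101, 147),
  nrow = 2,
  dimnames = list(high_iq = c("0", "1"), high_score = c("0", "1"))
)
print(tab)

# For 2x2 tables chisq.test() applies Yates' continuity correction by default.
chi <- chisq.test(tab)
print(chi)
```

Setting correct = FALSE would drop the continuity correction and give a slightly larger test statistic.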

Summary

id	Test	(adjusted) Cohen's d	SE
1	Welch Two Sample t-test	0.834	0.102
2	Wilcoxon rank sum test with continuity correction	0.546	NA
3	Correlation	1.003	0.002
4	OLS regression	1.003	0.000
5	Pearson's Chi-squared test with Yates' continuity correction	2.307	NA

Calculation of (adjusted) Cohen’s d after Borenstein (2009)
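As one example of such a conversion, Borenstein (2009) gives \(d = 2r / \sqrt{1 - r^2}\) for turning a correlation into Cohen's d. The sketch below applies it to the rounded \(r = 0.448\) reported earlier; that the table's correlation row was computed exactly this way is an assumption, and the helper name r_to_d is hypothetical.

```r
# Convert a correlation coefficient r into Cohen's d.
r_to_d <- function(r) {
  stopifnot(abs(r) < 1)
  2 * r / sqrt(1 - r^2)
}

d_from_r <- r_to_d(0.448)
print(round(d_from_r, 3))  # 1.002
```

The small gap to the 1.003 in the table is consistent with the correlation having been rounded to three decimals before conversion.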

Key Takeaways

Because different methods give different results:

What is the research question, the hypothesis?
Consider power aspects
Who is the audience?
When reading a paper: ask yourself the same questions!
Multiverse analyses as good practice
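A very small multiverse analysis can be sketched by repeating the group comparison over several defensible analysis choices. The sketch below varies the IQ cutoff and the test on simulated stand-in data; the cutoffs and simulation parameters are assumptions for illustration, not choices made in these slides.

```r
# Simulated stand-in data (NOT the original values).
set.seed(4)
n <- 434
mom_iq <- rnorm(n, mean = 100, sd = 15)
kid_score <- 87 + 0.6 * (mom_iq - 100) + rnorm(n, mean = 0, sd = 18)

# Analysis choices to vary: where to split moms into low and high IQ.
cutoffs <- c(90, 95, 100, 105, 110)

# One row per cutoff, one column per test.
multiverse <- do.call(rbind, lapply(cutoffs, function(cutoff) {
  high_iq <- as.integer(mom_iq >= cutoff)
  data.frame(
    cutoff = cutoff,
    welch_p = t.test(kid_score ~ high_iq)$p.value,
    wilcoxon_p = wilcox.test(kid_score ~ high_iq)$p.value
  )
}))
print(multiverse)
```

Reporting the whole table, rather than a single hand-picked row, shows how much the conclusion depends on the analyst's choices.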

References

Aczel, B., Szaszi, B., Clelland, H. T., Kovacs, M., Holzmeister, F., Ravenzwaaij, D. van, et al. (2026). Investigating the analytical robustness of the social and behavioural sciences. Nature 652, 135–142. doi: 10.1038/s41586-025-09844-9

Borenstein, M. (2009). Introduction to meta-analysis. Chichester, UK: John Wiley & Sons.

Gelman, A., and Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. 1st Edn, eds. R. M. Alvarez, N. L. Beck, and L. L. Wu. Cambridge University Press.