Abstract

Well, maybe it was because of the hot weather here in Germany, but really, I do not understand what an effect of d = 0.09 between the groups means. Therefore, a bit sad and grumpy, I asked my colleague and friend Dennis Anheyer to join me in this editorial to help me out of this mess.
The first thing we discovered: possibly my intellectual misconception and bad mood actually do depend on the hot weather conditions. According to an analysis of the National Longitudinal Survey of Youth, temperatures beyond 26°C led to decreases in children's cognitive performance in math, 1 which was also underpinned by similar results of a more recent study in students in grades 3 to 8. 2 Moreover, a meta-analysis by Taylor et al. 3 found that heat stress mainly affects more complex cognitive tasks.
OK, probably the hot weather in combination with the complex task of interpreting a between-group effect size was the problem. Well, this does not alter my mood. However, Dennis helped me out again: according to another meta-analysis by Liu et al., 4 heat affects mood, in particular in elderly people >65 years. Luckily, I am not that old, but might my sentiment nevertheless be influenced by the weather? And yes, a recent analysis of expressed sentiment in Facebook and Twitter channels found an increase in negative sentiment above a temperature of 26°C (Fig. 1). 5

Fig. 1. Change in sentiment in social media depending on the temperature, adapted from Baylis et al. 5 Color images are available online.
So, after checking the evidence perhaps weather should be introduced as a covariate in studies dealing with mental outcomes?
However, we now turn back to the main problem: what does an effect of d = 0.09 between the groups mean? There must be a more intuitive way to describe effects between groups. So we had a look at how it all started.
Back in 1962, Cohen, 6 in his well-cited article, asked the question “How large an effect […] in the population do I expect actually exists, or want to be able to detect?” To answer this question he first introduced the concept of a dimensionless effect size d and then surveyed 70 articles from the 1960/61 issues of the Journal of Abnormal and Social Psychology, and found that “the average power […] over the 70 research studies was .18 for small effects, .48 for medium effects, and .83 for large effects.” In a way this was a kind of very first meta-analysis with respect to power, but this, except for the fact that d is dimensionless, did not help us. Some 7 years later, in his book Statistical Power Analysis for the Behavioral Sciences, 7 he explained the interpretation of d in more detail as follows: “If the A population has a mean of 280 and the B population a mean of 270, the question ‘How large is the effect?’ […] d provides an answer to such questions by expressing score distances in units of variability.”
He then illustrated this idea using a within-population standard deviation of σ = 100 scale points, which yields

$$d = \frac{280 - 270}{100} = 0.1$$

and he states: “the means differ by a tenth of a standard deviation.” This was another step in the explanation of d, which from that time on was frequently used in research in the life sciences.
It took another 23 years until, in 1992, the idea of a “common language effect size (CLES)” was introduced. 8 The idea behind CLES is to provide a standardized, yet easily understood, measure based on probabilities of success that can be used in different study areas and with different types of data and, most importantly, can be understood by nonstatisticians. 9 Mathematically, the CLES is the probability that the standard normal distribution assigns to the z-value corresponding to the standardized mean difference. Dunlap 10 extended the CLES to bivariate correlations and some years later provided a computer program and an overview with corresponding tables. 11
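For readers who like to check such things themselves, the conversion can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: two independent, normally distributed groups with equal standard deviations, so that the z-value is d divided by √2. The function names `cohens_d`, `normal_cdf`, and `cles_from_d` are our own illustrative choices and do not come from any of the cited sources or programs.

```python
import math


def cohens_d(mean_a: float, mean_b: float, sd: float) -> float:
    """Standardized mean difference: the distance between two means in SD units."""
    return (mean_a - mean_b) / sd


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cles_from_d(d: float) -> float:
    """Common language effect size for a given Cohen's d.

    Assumption (ours): both groups are normal with equal variance, so the
    difference between two randomly drawn scores has a standard deviation
    of sqrt(2) * sd, and the z-value is d / sqrt(2).
    """
    return normal_cdf(d / math.sqrt(2.0))


# Cohen's textbook example: means of 280 and 270, standard deviation of 100.
d_example = cohens_d(280, 270, 100)
print(f"d = {d_example:.2f}, CLES = {cles_from_d(d_example):.3f}")
```

With d = 0 the function returns exactly 0.5, that is, a coin flip, which is a handy sanity check: the further the CLES moves away from 50%, the more often a randomly chosen member of one group outscores a randomly chosen member of the other.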
So, Dennis and I decided to turn back to the headline and the weather conditions to explain the idea, as I was still a bit worried about my cognitive performance. After a while we found a study on the effects of environmental heat exposure on the cognitive performance of older adults. 12 In this study, the authors observed “a trend for worse performance at 32°C when compared to 24°C” and calculated a between-group effect size of d = 0.09.
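A Cohen's d of 0.09, after dividing by √2, gives a z-value of about 0.064, which corresponds to a probability of about 52.5%. In common language: a randomly chosen person tested at 24°C has roughly a 52.5% chance of performing better than a randomly chosen person tested at 32°C.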
This statement can be interpreted much better, which has already been demonstrated in Brooks et al. 14 Entering the effect size into the online calculator of Magnusson 13 also leads to the following very intuitive graphical representation (Fig. 2):

Fig. 2. Translating Cohen's d into common language effect sizes. Color images are available online.
Well, finally, what is there to take home?
First: Have a closer look at heat stress when you are in a bad mood, although Dennis objects here that this probably applies only to people of my age or older. Second: Try to explain what is meant by effect sizes in common language. And finally: From a methodological point of view, there is a lot to do in this area (i.e., when conducting meta-analyses) and it seems to be fun! Let us go for it!
Actually, now that this editorial is done, it would be time for a cold beer. But here too, caution is advised. According to Flores-Salamanca and Aragón-Vargas, 15 rehydration with beer after strenuous activity results in a higher quantity of urine excretion and slower reaction time. Cheers!
