Using a Pseudoscience Activity to Teach Critical Thinking

Abstract

In two studies, we assessed the effectiveness of a classroom activity designed to increase students’ ability to think critically. This activity involved watching and discussing an infomercial that contained pseudoscientific claims, thus incorporating course material on good research design and critical thinking. In Study 1, we used a pretest–posttest design. After the activity, students were significantly more likely to correctly identify flaws in a series of claims. In Study 2, we compared the effectiveness of this activity to a traditional lecture. Participation in the activity was more effective at increasing students’ ability to critically evaluate claims than the lecture. These results suggest that short-term interventions to increase critical thinking can be successful and can be made interesting for students.

Keywords

critical thinking, pseudoscience, activity, active learning

Critical thinking is defined as engaged, skillful, judgmental assessment of one’s own beliefs or those of others (for a review, see Behar-Horenstein & Niu, 2011). An important aspect of critical thinking is the reflection involved in evaluating claims or arguments so that logical conclusions are drawn about those claims (Bensley, 1998; Halpern, 1998). This is an important skill not only in psychology but also in daily life. Because of the importance of teaching critical thinking skills, many researchers have conducted interventions aimed at improving students’ critical thinking skills, with varying success (Abrami et al., 2008).

In general, psychology as a discipline seems to be effective at increasing critical thinking skills. For example, senior psychology majors identified more problems with a set of claims (such as reliance on anecdotal evidence, confusion between correlation and causation, and the presence of confounding variables) when compared to either natural science majors or students in an introductory psychology course (Lawson, 1999). Similarly, the number of psychology classes taken by students relates to students’ ability to analyze arguments (Bensley, Crowe, Bernhardt, Buckner, & Allman, 2010).

Some researchers have used techniques similar to those of Lawson (1999) to assess the effectiveness of entire courses on improving critical thinking. Compared with students in classes that did not emphasize critical thinking skills, students who participated in courses designed to improve those skills were better able to evaluate arguments (Bensley et al., 2010; Blessing & Blessing, 2010), identify flaws in evidence (McLean & Miller, 2010; Penningroth, Despain, & Gray, 2007), and explain flaws in statements and provide alternate explanations (Wesp & Montgomery, 1998). They were also less likely to hold beliefs in the paranormal (McLean & Miller, 2010).

Overall, recent research demonstrates that longer term interventions (entire courses or long-term class projects) that emphasize critical thinking in psychology have generally been successful in increasing students’ abilities to evaluate claims. These interventions, however, require a significant amount of time in the classroom that instructors may not have, and it may be difficult to conceptualize activities that encourage critical thinking. The current article describes the use of a short active-learning activity involving the discussion of pseudoscience in a television commercial to help students critically evaluate claims and identify important scientific flaws. Active-learning techniques, in which students self-generate knowledge, can help students’ test performance compared to traditional lectures or readings (Yoder & Hochevar, 2005). We designed the activity to be both interesting for students and easily replicated by other instructors. This article describes two quasi experiments examining the effectiveness of the activity, one using a pre–post only design and the other comparing the activity to a traditional lecture covering the same information to see which approach was more effective.

Study 1

In the first study, participants read four flawed claims, indicated the extent of their agreement with claims, and evaluated the evidence presented in each. We hypothesized that after the pseudoscience activity participants would (a) indicate significantly less agreement with the claims and (b) correctly identify more flaws in the claims.

Method

Participants

Sixty-six students enrolled in one of two introductory psychology classes taught by the authors participated in this study. The majority of the sample was Caucasian, and 61% of the participants were female and 39% were male.

Measures

We based our measures on Lawson’s (1999) study. We measured critical thinking using a pre- and posttest, in which participants evaluated a set of four flawed claims developed by the researchers. The source of the claims included a parent, a scientist, an advertisement, and a teacher, in an effort to encourage students to transfer critical thinking skills outside of the classroom to everyday occurrences (Halpern, 1998). The claims are included in the Appendix. Participants read each claim and indicated their level of agreement with the claim on a scale from 1 (strongly disagree) to 7 (strongly agree). Participants then wrote as many reasons as they could for why the evidence in the claim may not have convinced them of the validity of the claim. We wrote the claims to include one or more of the following flaws: reliance on anecdotal evidence, the role of chance, reliance on very small samples, no control group, the confusion of correlation and causation, participant group differences, placebo effect, experimenter bias, order effects, and practice effects.

We independently coded the number of correct flaws identified by each participant and were blind to whether the data were from a pre- or posttest. We practiced coding on data from a third class and carefully discussed the results prior to coding the responses from the two classes in this study. We calculated interrater reliability as the correlation between our separately coded participant scores for both the pre- and posttest. Reliability was r = .87 for the pretest and r = .90 for the posttest. We then averaged the number of flaws correctly identified by each participant across coders.
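The reliability check and score averaging described above can be sketched in a few lines. This is only an illustration: the flaw counts below are hypothetical, the function names are invented for the example, and a Pearson correlation is assumed because the article reports reliability as r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_across_coders(x, y):
    """Final score per participant: the mean of the two coders' counts."""
    return [(a + b) / 2 for a, b in zip(x, y)]


# Hypothetical flaw counts for six participants (NOT the study's data).
coder_a = [4, 2, 6, 3, 5, 1]
coder_b = [4, 3, 5, 3, 6, 1]

reliability = pearson_r(coder_a, coder_b)
final_scores = average_across_coders(coder_a, coder_b)

print(f"interrater r = {reliability:.2f}")
print("averaged scores:", final_scores)
```

A correlation indexes how consistently two coders rank participants, not whether they assign identical counts, so averaging the two sets of scores is a reasonable way to settle the remaining disagreements.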

Procedure

One 75-min class period was set aside for the pseudoscience activity. We conducted the activity in the first 2 weeks of the semester prior to covering any related material in the course. We did not specifically ask students to prepare in any way.

The university’s institutional review board approved all procedures and materials. Participants received an information sheet and then completed the pretest.

After we collected the pretests, the class viewed a 2-min long infomercial for iRenew bracelets found on YouTube. This infomercial is for a bracelet that purports to help wearers gain energy, strength, and balance using the body’s natural frequencies. The commercial included consumer anecdotes, “expert” testimonials, and mall demonstrations of balance improvements when wearing the bracelet. The demonstrations consisted of having customers lock their hands behind their backs. An experimenter applied downward pressure to the customers’ hands, causing them to lose balance. This was repeated with the customer wearing the bracelet. Customers reported that they had more balance while wearing the bracelet. Students viewed the commercial twice, and we asked them to think about the claims in the commercial and the evidence presented.

We assigned participants to groups and asked them to (a) identify the claims of the infomercial, (b) identify any evidence supporting these claims, and (c) evaluate that evidence. Finally, groups discussed how they could test for themselves whether the claims were valid. After about 10 min of group discussion, the groups shared their ideas and discussed them as a class. As a whole, the class identified and discussed the following flaws (often apparent in pseudoscience): order effects, the placebo effect, overuse of anecdotal evidence, reliance on extremely small samples, disregard for alternative explanations, and experimenter bias. Participants also evaluated the source of the evidence and the claims. We facilitated the discussion by providing correct terminology for their observations and ensuring that both classes successfully identified the same list of flaws.

After the class discussion, participants completed the posttest individually. The posttest consisted of the four scenarios previously evaluated, and we asked participants to reassess the claims in those scenarios. Participants then evaluated the pseudoscience activity.

Results

We conducted four paired-samples t-tests to analyze changes in agreement with each claim. All four tests were significant and in the predicted direction (see Table 1). Participants were significantly less likely to agree with each of the flawed claims after the activity.

Table 1.

Change in Agreement With Claims in Study 1.

	Pretest		Posttest
Scenario	M	SD	M	SD	t(df)	p	d
1	3.42	1.55	2.57	1.63	5.12(64)	<.001	0.64
2	3.59	1.53	2.25	1.48	7.18(64)	<.001	0.89
3	2.02	1.23	1.59	1.10	2.52(64)	.014	0.31
4	2.80	1.83	2.34	1.65	2.41(60)	.019	0.34

Note. SD = standard deviation.

Using a paired-samples t-test, we compared the total number of correct flaws identified across the four claims in the pretest and posttest (see Table 2 for the change in the number of flaws identified for each scenario). As hypothesized, participants detected a significantly greater number of flaws after the class activity, M = 5.72, SD = 2.73, compared to before the pseudoscience activity, M = 4.00, SD = 2.23, t(65) = 6.86, p < .001, d = 0.84.
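The article does not state how Cohen's d was computed. One common convention for paired designs, d = |t| / sqrt(n) with n = df + 1 pairs, appears to reproduce the reported value for the total flaw count. The sketch below assumes that convention; the function name is illustrative.

```python
from math import sqrt


def paired_d_from_t(t_value, df):
    """Cohen's d for a paired-samples t-test, assuming d = |t| / sqrt(n).

    n is the number of pairs, i.e. df + 1. This convention is an assumption;
    the article does not say which formula was used.
    """
    n_pairs = df + 1
    return abs(t_value) / sqrt(n_pairs)


# Total flaws detected, pretest vs. posttest: t(65) = 6.86 as reported.
d_total = paired_d_from_t(6.86, 65)
print(f"d = {d_total:.2f}")
```

Under this assumption the result rounds to the reported d = 0.84. Other definitions of d for repeated measures, such as dividing the mean difference by the pretest SD, would give somewhat different values.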

Table 2.

Change in Number of Flaws Detected in Study 1.

	Pretest		Posttest
Scenario	M	SD	M	SD	t(df)	p	d
1	1.13	0.72	1.45	0.86	−3.48(65)	.001	0.43
2	0.73	0.89	1.31	0.91	−5.06(65)	<.001	0.62
3	1.17	0.71	1.63	0.93	−4.39(65)	<.001	0.55
4	1.05	0.72	1.37	0.74	−3.74(62)	<.001	0.47

Note. SD = standard deviation.

After the posttest, participants evaluated how enjoyable they found the activity, how much they thought that it helped their critical thinking skills, and how suited the activity was to the course. We measured responses on 5-point Likert scales. Almost two thirds of the participants (65%) found the pseudoscience activity to be enjoyable (rated it a 4 or 5). Similarly, 72% of participants thought the activity was helpful in increasing their critical thinking skills.

Discussion

Overall, after the pseudoscience activity, students were better able to correctly identify flaws in claims and were less likely to agree with flawed claims, both of which are important aspects of critical thinking. This study suggests that shorter interventions, like longer term interventions, can increase students’ critical thinking abilities. However, a limitation of this study was its simple pretest–posttest design. Although we demonstrated that students successfully identified more flaws in presented claims during the posttest, we did not know whether this active-learning approach would be more effective than other teaching methods. We conducted Study 2 to address this issue.

Study 2

In Study 2, we conducted a 2 (condition: pseudoscience activity vs. lecture) × 2 (time: pretest vs. posttest) quasi experiment that compared improvements in critical thinking using the pseudoscience activity to lecture-only instruction. We again measured agreement with claims and the number of flaws correctly identified in the claims. We hypothesized that there would be a significant main effect of time on agreement, such that participants would be less likely to agree with faulty claims at posttest. We also hypothesized a significant interaction effect between time and condition, such that participants in the activity group would show a greater decrease in agreement with claims. We also hypothesized that there would be a significant main effect of time and a significant interaction between time and condition on the number of flaws identified, such that participants in the activity group would show a greater increase in the number of flaws identified.

Method

Participants

Students enrolled in one of four introductory psychology classes participated in this study. Each author was the instructor for one class; a third instructor taught the remaining two classes. Participants were between the ages of 18 and 43 (M = 20.41, SD = 4.32). About half (47%) of participants were male and half were female (53%).

Measures and Procedure

We used the same measures and procedure as in Study 1, with a few exceptions. In Study 2, we used a quasi-experimental design in which we gave two of the classes the pseudoscience activity and covered the same material in the other two classes in a lecture format. Again, we conducted the activity in the first 2 weeks of the semester prior to covering any related material in the course. We each visited one introductory psychology class taught by another instructor and conducted the pseudoscience activity. Our own introductory psychology classes served as the lecture groups. We each gave a lecture we routinely give in introductory classes on the scientific method and basic experimental design. We covered the importance of using the scientific method to answer questions rather than simply relying on personal experience or anecdotal evidence. Using examples different from those in the pseudoscience activity, we covered the basic components of experiments, potential confounds and alternative explanations (e.g., order effects, placebo effect, experimenter bias, existing differences between groups), and how to account for these confounds. This lecture defined and described the same flaws included in the pseudoscience activity. The major difference between the two approaches was that in the activity groups students self-generated examples and descriptions of flaws through discussion. In Study 2, the participants completed the pretest and posttest on different days, because participants in Study 1 seemed to experience fatigue and stopped answering questions toward the end. We administered the pretest to all groups during the first week of the semester and conducted the activity or lecture and the posttest the next week. Because of the high level of coding agreement between the authors in Study 1, we each coded half of the measures and were blind to participant condition.

Results

Fifty-eight students in the lecture condition completed both pre- and posttest measures, and 89 students in the activity condition completed both sets of measures. Agreement scores and paired comparisons for the pretest and posttest are given in Table 3.

Table 3.

Change in Agreement With Claims in Study 2.

		Pretest		Posttest
Scenario	Condition	M	SD	M	SD	t(df)	p	d
1	Activity	3.62	1.56	2.64	1.45	6.92(88)	<.001	0.78
1	Lecture	3.57	1.59	2.73	1.56	4.29(58)	<.001	0.58
2	Activity	3.71	1.62	2.63	1.58	5.76(88)	<.001	0.59
2	Lecture	4.03	1.62	2.97	1.68	5.19(57)	<.001	0.63
3	Activity	2.21	1.50	1.52	0.77	4.33(88)	<.001	0.53
3	Lecture	2.51	1.58	1.92	1.13	2.58(58)	.012	0.36
4	Activity	3.86	2.13	2.69	1.89	6.58(88)	<.001	0.67
4	Lecture	3.70	1.91	3.03	1.94	2.81(58)	.007	0.38

Note. SD = standard deviation.

We also calculated the average agreement score for all four scenarios. There was no difference in average pretest agreement scores for those who completed both the pre- and posttest, N = 148, M = 3.39, SD = 1.13, and those who only took the pretest, N = 34, M = 3.33, SD = 1.02, t(180) = −0.30, p = .767. Then we conducted a 2 (activity vs. lecture) × 2 (pretest vs. posttest) mixed analysis of variance (ANOVA) to see if agreement scores decreased more in the activity group compared to the lecture group. We found the expected significant main effect of time on participants’ average agreement with the flawed claims, F(1, 176) = 675.63, p < .001, partial η² = .79. All participants were less likely to agree with flawed claims over time. There was no significant main effect of condition on agreement scores, F(1, 176) = 1.21, p = .274, partial η² = .007, nor was there an interaction between condition and time on agreement scores, F(1, 176) = 0.04, p = .837, partial η² = .00. Contrary to expectations, students in the activity condition did not agree less over time with the flawed claims compared to students in the lecture condition.
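The partial η² values can be recovered from each reported F ratio and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three agreement effects; the function and variable names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios for average agreement as reported, each with df = (1, 176).
reported_f = {
    "time": 675.63,
    "condition": 1.21,
    "condition x time": 0.04,
}

effect_sizes = {
    effect: partial_eta_squared(f_value, 1, 176)
    for effect, f_value in reported_f.items()
}

for effect, eta in effect_sizes.items():
    print(f"{effect}: partial eta squared = {eta:.3f}")
```

The results round to the values given in the text (.79, .007, and .00), which shows how little of the variance in agreement the condition and interaction terms account for.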

We also looked at the number of flaws detected for each scenario across time and condition (see Table 4). We summed the number of flaws correctly detected by participants for the four scenarios, creating a total pretest score and a total posttest score for each participant. There was no difference in the number of flaws detected at pretest between participants who completed both the pre- and posttest, M = 2.48, SD = 1.65, and participants who only took the pretest, M = 2.64, SD = 1.71, t(181) = 0.51, p = .614.

Table 4.

Change in Number of Flaws Detected in Study 2.

		Pretest		Posttest
Scenario	Condition	M	SD	M	SD	t(df)	p	d
1	Activity	0.76	0.62	1.06	0.83	−3.72(88)	<.001	0.42
1	Lecture	0.81	0.78	0.78	0.70	0.32(57)	.749	0.04
2	Activity	0.55	0.62	1.04	0.88	−5.08(88)	<.001	0.55
2	Lecture	0.65	0.79	0.84	0.70	−1.85(56)	.07	0.24
3	Activity	0.48	0.62	1.16	0.72	−7.26(88)	<.001	0.78
3	Lecture	0.74	0.58	0.72	0.64	0.17(57)	.867	0.03
4	Activity	0.55	0.60	0.76	0.69	−2.82(88)	<.01	0.30
4	Lecture	0.53	0.50	0.79	0.61	−3.59(57)	.001	0.48

Note. SD = standard deviation.

The number of flaws correctly detected increased for both the lecture group, pretest M = 2.72, SD = 1.85; posttest M = 3.12, SD = 1.88, and the activity group, pretest M = 2.33, SD = 1.49; posttest M = 4.02, SD = 2.17. We conducted a 2 (activity vs. lecture) × 2 (pretest vs. posttest) mixed ANOVA to examine the influence of these factors on the number of flaws correctly detected. Results of the ANOVA revealed a significant main effect of time on the number of flaws detected, F(1, 145) = 48.22, p < .001, partial η² = .25. There was no significant main effect of condition on the number of flaws detected, F(1, 145) = 0.83, p = .363, partial η² = .006, but there was a significant interaction effect between condition and time on the number of flaws detected, F(1, 145) = 18.60, p < .001, partial η² = .11. Planned comparisons showed that pretest scores did not differ between the pseudoscience activity group, M = 2.33, SD = 1.49, and the lecture group, M = 2.72, SD = 1.85, t(181) = 1.16, p = .249, d = 0.18. Posttest scores, however, were significantly higher for the activity group, M = 4.02, SD = 2.17, compared to the lecture group, M = 3.12, SD = 1.88, t(145) = 2.59, p = .011, d = 0.44.
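The posttest comparison between groups can be approximated from the summary statistics alone. The sketch below assumes a pooled-variance independent-samples t-test and a Cohen's d based on the pooled SD, with group sizes of 89 (activity) and 58 (lecture) taken from the completion counts reported earlier. The article does not state these formulas, so this is a plausibility check and not a reproduction of the authors' analysis.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def independent_t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Student's t (equal variances assumed), Cohen's d, and df from summary stats."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    mean_diff = m1 - m2
    d_value = mean_diff / sp
    t_value = mean_diff / (sp * sqrt(1 / n1 + 1 / n2))
    return t_value, d_value, n1 + n2 - 2


# Posttest flaws detected: activity group vs. lecture group, as reported.
t_post, d_post, df_post = independent_t_and_d(4.02, 2.17, 89, 3.12, 1.88, 58)
print(f"t({df_post}) = {t_post:.2f}, d = {d_post:.2f}")
```

Under these assumptions the output matches the reported t(145) = 2.59 and d = 0.44.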

In addition, the same attitudinal questions were asked as in Study 1, and answers were measured on scales ranging from 1 (strongly disagree) to 7 (strongly agree). Fifty-nine percent of students found the pseudoscience activity to be enjoyable (rated it a 5, 6, or 7), and 60% saw the activity as helpful in increasing their critical thinking skills (rated the activity a 5, 6, or 7).

General Discussion

Several studies have found that longer term interventions designed to increase critical thinking abilities can be successful (e.g., McLean & Miller, 2010; Penningroth et al., 2007; Wesp & Montgomery, 1998). This article describes a short, easily replicable classroom activity that instructors can use to increase critical thinking skills. Most participants enjoyed the activity, which required more active involvement than a simple classroom lecture. We suspect that students may have underreported their enjoyment of the activity because some may have thought that we were referring to the posttest itself, rather than the video and discussion, as the activity. Regardless, in both studies, students showed significant increases in their ability to detect flaws in claims, an important component of critical thinking, after completing the activity. We also demonstrated that this active-learning task was more effective than a traditional lecture covering the same material at increasing the number of flaws students detected, although not at reducing agreement with flawed claims. YouTube is an excellent source of videos containing pseudoscientific claims and is an easily available resource, which makes it easy to incorporate this activity into introductory psychology classes.

Although we improved the design of this research in Study 2, we did not randomly assign students to groups, which could be problematic. However, results of the pretest were not significantly different between classes. In addition, a meta-analysis of empirical research on the effectiveness of critical thinking instruction found no difference in effect sizes for pretest–posttest, quasi-experimental, or experimental designs (Abrami et al., 2008). This suggests that results would have been similar if we had randomly assigned participants to groups.

One limitation of both studies discussed in this article was that the participants were likely more aware at posttest that the scenarios might contain flaws. This awareness may have made participants less likely to agree with the claims without influencing critical thinking. However, in Study 2, the number of flaws identified in the posttest increased more for the activity group than the lecture group, which suggests that simply knowing the terms is not what helped them. Another limitation is our emphasis on short-term effects. To better address both these issues, future researchers could use a different set of claims in the posttest or use a posttest-only between-groups design and assess critical thinking skills at the end of the semester.

Appendix

Authors’ Note

We thank Robert Lipinski, Brittany Sizemore, Hannah Hatton, and Alicia Barnickle for their assistance with this project.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

Abrami, P. C., Bernard, R. M., Borokhovski, Wade, Surkes, M. A., Tamim, & Zhang (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78, 1102–1134. doi:10.3102/0034654308326084

Behar-Horenstein, L. S., & Niu (2011). Teaching critical thinking skills in higher education: A review of the literature. Journal of College Teaching and Learning, 8, 25–42.

Bensley, D. A. (1998). Critical thinking in psychology: A unified skills approach. Pacific Grove, CA: Brooks/Cole.

Bensley, D. A., Crowe, D. S., Bernhardt, Buckner, & Allman, A. L. (2010). Teaching and assessing critical thinking skills for argument analysis in psychology. Teaching of Psychology, 37, 91–96. doi:10.1080/00986281003626656

Blessing, S. B., & Blessing, J. S. (2010). PsychBusters: A means of fostering critical thinking in the introductory course. Teaching of Psychology, 37, 178–182. doi:10.1080/00986283.2010.488540

Halpern, D. F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American Psychologist, 53, 449–455. doi:10.1037/0003-066X.53.4.449

Lawson, T. J. (1999). Assessing psychological critical thinking as a learning outcome for psychology majors. Teaching of Psychology, 26, 207–209. doi:10.1207/S15328023TOP260311

McLean, C. P., & Miller, N. A. (2010). Changes in critical thinking skills following a course on science and pseudoscience: A quasi-experimental study. Teaching of Psychology, 37, 85–90. doi:10.1080/00986281003626714

Penningroth, S. L., Despain, L. H., & Gray, M. J. (2007). A course designed to improve psychological critical thinking. Teaching of Psychology, 34, 153–157. doi:10.1080/00986280701498509

Wesp & Montgomery (1998). Developing critical thinking through the study of paranormal phenomenon. Teaching of Psychology, 25, 27–278. doi:10.1080/00986289809709714

Yoder, J. D., & Hochevar, C. M. (2005). Encouraging active learning can improve students’ performance on examinations. Teaching of Psychology, 32, 91–95. doi:10.1207/s15328023top3202_2