Abstract
In this article, the impact of assessment tasks on examination results (measured by examination grades) is investigated. Although many authors describe the advantages of electronic assessment tasks, few studies have compared a traditional approach using a classical examination with a new approach using assessment tasks. The main hypothesis of this study is that assessment tasks have a positive impact on results: they should motivate students to avoid procrastination and to adopt a deep learning approach. We test this research hypothesis in a case study with undergraduate students in their second year at university. Study results of students before and after the implementation of the assessment tasks are compared using t-tests. A regression analysis is performed to investigate the impact of the assessment tasks on the examination result. Empirical evidence on whether assessment tasks have a positive effect on the examination result is presented. The way in which assessment tasks affect students with differing levels of performance is examined, and implications for educators are offered.
Assessment tasks, learning and performance
Assessment tasks are tasks that are set and completed during the course and which partly determine the final result of the student. It is generally believed and shown in the literature that assessment has a significant effect on teaching and learning (Gibbs, 1999; Scouller, 1998). Assessment shows students what they should be learning during educational processes (Biggs, 1998), and the central role of assessment in teaching and learning is increasingly recognized in higher education (Ramsden, 2003; Stobart, 2008). There is much literature on assessment, not least on the nature of assessment itself, including formative and summative assessment and the impact or otherwise that feedback has on future performance (see, for example, Black and Wiliam, 2009; Brown and Race, 2012; Gardner, 2006; Pokorny and Pickford, 2010; Race, 2010, 2011; Rust, 2002; Sadler, 2010; Taras and Davies, 2013). The ‘division’ between formative and summative assessment is not as clear as we might think, and we, as educators, may not fully understand what these are, say Taras and Davies (2013), who go on to discuss the theoretical discourses of assessment, citing the work of leading researchers in the field such as Sadler (1998). In the literature, an assessment task is suggested as a way to improve the efficiency of learning because it can help to motivate deep learning (Davidson, 2002; Trigwell and Prosser, 1991) and to avoid procrastination (Ackerman and Gross, 2005).
Concerning deep learning, Davidson (2002) shows that there is a significant relationship between study performance and a deep study approach. A deeper approach to learning is more likely to be adopted when students participate more actively (Davidson, 2002). Results of studies (Kuh, 2003; Marriot and Lau, 2008; Oliver, 1998; Potter and Johnston, 2006) suggest that frequent and timely testing and feedback increase motivation. The more students practise and receive feedback, the more they learn and the more engaged and motivated they are. The active engagement of students can result in higher classroom participation and a better understanding of the course (Kuh, 2003; Oliver, 1998). In order for feedback to have a positive impact, it should be specific, timely, accurate and realistic in terms of what is achievable. It should motivate students to reflect on their own performance. According to Boud (1995a, 1995b) and Brown and Pendleburry (1992) (quoted by Potter and Johnston (2006)), only then will feedback lead to improved results.
Assessment tasks might also help students to avoid procrastination (Ackerman and Gross, 2005). In theory, student performance should depend only on the capacities and talent of the student and not on how close to the deadline the work is performed. Rotenstein et al. (2009) found, however, that, controlling for student quality, procrastination is associated with lower performance. Rothblum et al. (1986) also found evidence of a significant negative correlation between self-reported procrastination and academic performance.
In their study, Marriot and Lau (2008) found that assessments play a significant role in the teaching and learning process. The students in their sample sensed a positive effect on learning motivation and engagement. When the assessments are electronic, this can even result in extra advantages because students might be motivated by the use of information and communication technology (ICT), say Marriot and Lau (2008), and the possibility of receiving feedback very quickly after the examination (Nicol, 2006; Van Berkel and Bax, 2006). Potter and Johnston (2006) found that the use of ICT improves academic performance. The benefits of the use of ICT should not be exaggerated, however. Adrangi (1989) concludes that the use of ICT in assessments appears to have no significant impact on student performance. Kennely et al. (2011) found little evidence that the online completion of assignments has a positive impact on the exam performance. Kennely et al. (2011) found reasonable evidence that students are more likely to complete online assignments compared to paper assignments. The use of ICT increases student effort and engagement and results in higher completion rates.
We can conclude from the literature that assessment tasks might have a positive impact on the examination result. Assessment tasks are important and powerful assessment practices that support high-quality learning and teaching (Marriot and Lau, 2008). Many empirical studies also point to these conclusions. From the empirical work, we can conclude that students prefer assessment tasks to examinations, that students have better grades when assessment tasks are used instead of only a classical examination and, moreover, that assessments are better at predicting long-term learning and student performance and result in a higher quality of learning (Gibbs and Simpson, 2004). For example, it has been indicated that students believe that assessments measure a larger range of abilities than examinations and are therefore fairer. Further, Bridges et al. (2002) found that assessment grades were higher than examination grades for six subjects at four different universities. Gibbs and Lucas (1997) compared different methods of evaluation. They report that evaluation that consists of assessments only results in 3.5% higher marks than evaluation that consists of examinations only. Moreover, three times as many students failed when there was only an examination and no assessments. Gijbels et al. (2005) also conclude that assessment tasks result in higher examination scores. Concerning long-term learning, Conway et al. (1992) found that assessment tasks undertaken in the past resulted in long-term recall of content for psychology students while examinations did not. After a thorough literature review, Baird (1985) concluded that there is only a limited relationship between adult achievement and examination results. Tynjala (1998) showed that assessments result in a higher quality of learning compared to examinations. The study compared the results of two student groups. The first group had conventional lectures and an examination; the second group had assessments and also an examination. Students of the second group proved to place more emphasis on thinking and revealed more comparisons, evaluations and sophisticated structures.
Despite these empirical studies, Watson et al. (2007) conclude from their literature review that there are still not enough studies which have investigated the impact of assessment tasks. Watson et al. (2007) therefore call for empirical evidence about the effectiveness of these kinds of assessments. In this study, we provide such evidence by comparing the results of students with a traditional examination without assessment tasks with those of students who had assessment tasks on top of an examination. In order to investigate this, the following research question is formulated. Does the introduction of assessment tasks have a positive impact on the performance of students in their examination (performance as measured by examination marks/grades)? To answer this, a set of hypotheses was introduced:
The group of students who undertake assessment tasks prior to taking the examination perform significantly better than students who do not undertake any assessment tasks prior to taking the examination.
The marks/grades achieved on assessment tasks (as a whole) have a significantly positive impact on examination results.
Students who achieve good (high) marks/grades do so on/across all their courses undertaken within the same academic year.
Prior knowledge has a significantly positive impact on examination results.
Students who take an examination for a second time have a significant advantage and perform significantly better than those who do not.
To address these, two sets of results (marks/grades) achieved by students are compared. Both groups of students undertook a traditional examination at the end of their course. One group had undertaken assessment tasks before the examination and the other had not.
Methodology
Course and participants
The learning activities are part of the ‘Financial Reporting and Analyses’ course in the academic year 2010–2011, which is part of the programme for the second year students in Applied Economics. The participants consist of the 401 students enrolled on this course. These students came from different programmes of study: Applied Economics Business Administration, Economic Policy and Business Engineering/Business Engineering in Management Information Systems.
Assessment tasks
Implementation
Before 2010, students were evaluated by a group assignment and a classical written examination, without assessment tasks. To counter procrastination and improve examination results, teaching staff decided to implement assessment tasks in 2010 on top of the existing evaluation methods. Computer-aided tools were developed to design, submit and evaluate electronic assessment tasks rapidly. Students had to undertake three assessment tasks, each corresponding with a main course topic.
Characteristics
After attending theoretical lessons and exercise sessions about a course topic, students received an individual assessment task within 1 week. All the assessment tasks related to a specific topic consisted of the same application-based questions, built from a problem description and data for a fictitious company. In general, students had to calculate financial ratios, construct group statements and calculate the value of assets. A predefined spreadsheet was distributed among the students in which to fill in their answers. Students had to submit their solutions after 4 days. The assessment tasks were evaluated on the correctness of the calculated ratios, statements and values. Within 2 days of submission, students received an individual report with feedback showing an indicator of correctness for each of their answers.
Group assignment
Besides the assessment tasks, students had to carry out a group assignment. The group assignment (groups of five or six students) took the form of a case study in which they had to calculate financial ratios and comment on the financial position and performance of an existing company. This group assignment took place during the course. The group assignment, graded on 20 points, was evaluated on correctness of calculations and comments. The group assignment was not part of this study.
Examination
The final examination took place at the end of the course. The examination was divided into four parts. The first part was based on open knowledge-based questions where students had to explain several key words or definitions. The other three parts were application-based questions each in relation to one of the main course topics. Each part of the final examination was graded on 20 points setting the total examination result score at 80 points.
Course final grade
The final grade was calculated in two steps. In the first step, the scores of the group assignment and final examination were summed to obtain a total score on 100 points (group assignment weight = 20% and final examination weight = 80%). Dividing by 5, this score was converted to a score on 20 points. In the second step, the results of the assessment tasks were used to determine whether the grade was adjusted by +1, 0 or −1 point. Students who handed in their assessment tasks on time and passed them (two tasks with minimum score ≥ 50% and one task with minimum score ≥ 25%) gained one extra point (+1) on their final grade for the course. Students who handed in the assessment tasks on time but failed gained or lost nothing (0). Students who did not hand in one or more of the assessment tasks on time lost one point on their final course grade (−1). This correction of −1 could be remitted if the student passed the examination questions related to the failed assessment topic(s). The final course grade was not part of this study.
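The two-step calculation can be sketched as follows. This is an illustrative sketch rather than the course's actual grading procedure: the function and parameter names are hypothetical, the pass rule is read as "the two best tasks at ≥ 50% and the remaining task at ≥ 25%", and the remission of the −1 correction is passed in as a flag because the pass mark for an examination part is not stated.

```python
def course_final_grade(group_score, exam_score, task_pcts, remitted=False):
    """Return the final course grade on 20 points (no rounding applied).

    group_score: group assignment score on 20 points
    exam_score:  final examination score on 80 points
    task_pcts:   three assessment-task scores in percent;
                 None marks a task that was not handed in on time
    remitted:    True if the -1 correction is waived (assumed to be
                 decided elsewhere from the related examination parts)
    """
    # Step 1: total on 100 points, converted to 20 points.
    base = (group_score + exam_score) / 5

    # Step 2: adjustment of +1, 0 or -1 based on the assessment tasks.
    if any(p is None for p in task_pcts):
        adjustment = 0 if remitted else -1
    else:
        ranked = sorted(task_pcts, reverse=True)
        # Assumed reading of the pass rule: two tasks >= 50%, one >= 25%.
        passed = ranked[1] >= 50 and ranked[2] >= 25
        adjustment = 1 if passed else 0
    return base + adjustment
```

Under these assumptions, `course_final_grade(15, 50, [80, 60, 30])` gives 13 points from step 1 plus the +1 bonus, that is, 14.0.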
Announcement of study
This study was not announced separately to students because the introduction of the assessment tasks was not an experiment but a decision made by the teaching staff. The use of the assessment tasks and their implementation were documented in the course description, which serves as the official announcement of course characteristics. The decision to start this research came after the completion of the 2010 course in which this kind of assessment was used. Nevertheless, all researchers committed to the premise that no data of an individual student could be displayed or communicated. Data and information that would lead to the identification of students were omitted. The Department for Innovation and Quality Control in Teaching (CIKO department), which collects student results for study purposes at the faculty, audited the input and output of the research to ensure that these terms were met.
Data and methods
The main measures under investigation are the results of the assessment tasks and the examination marks/grades. The data were collected in July 2011 after the first examination period and after feedback had been provided to students on their assessment tasks, which both took place in June 2011. Data from the second examination period were not used to avoid bias from short-term relearning effects. In total, 366 students took the exam in June 2011. Two methods were used to investigate the impact of the assessment tasks on the examination marks/grades.
First, to measure the impact of the assessment tasks on the student group, the average examination results (on 80 points) of the students in 2009 (n = 420, before the introduction of the assessment tasks) are compared with those in 2010 (n = 366, after the introduction of the assessment tasks) by using independent samples t-tests.
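As an illustration of this comparison, the sketch below computes the independent-samples t statistic in its pooled-variance (equal variances) form. Whether the original analysis assumed equal variances is not stated, so that choice is an assumption; the function name is hypothetical and any scores passed to it here are toy values, not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Uses the pooled-variance form of Student's t-test, which assumes
    that both groups share the same population variance.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df
```

With the earlier cohort passed as `group_a`, a negative t indicates that the later cohort scored higher on average.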
Second, to explore whether the assessment tasks have a significant impact on examination marks/grades, several linear regression models were estimated using data at student level. According to Bacdayan (1997), the use of individual marks/grades is a reliable method for analysis of examination results. Focusing on data from 2010, 355 of the 366 observations were used (11 observations were omitted because of missing data). All models were estimated with ordinary least squares (OLS) and took the examination result on a total of 80 points (EXAM) as the dependent variable. The main independent variable (ASSIGNTOT) represents the total score on the three assessment tasks. For students who did not deliver their task, the variable is set to zero. Several characteristics of students were introduced as independent control variables. The students’ qualitative characteristics are measured by the grade point average (GPA). GPA is a percentage variable representing the end result of the total first examination period containing all courses of the second year of Applied Economics. GPA is calculated as a weighted average taking into account the grades and study load of each course based on the European Credit Transfer System (ECTS). It is important to include this variable to capture qualitative features of students that affect their performance (Adrangi, 1989). The level of specific prior knowledge is captured by the variable PRIOR, which measures the grade (on 20 points) of the course in Accounting from the previous year. Accounting knowledge is assumed to be a prerequisite for following the course of Financial Reporting and Analysis. Previous research showed that prior knowledge has a substantial impact on examination results (Dowling et al., 2003; Potter and Johnston, 2006).
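A minimal sketch of how two of these variables could be constructed is given below. The function names and input layout are hypothetical, and two readings are assumed: the GPA weights are the ECTS credits of each course, and a task that was not delivered contributes zero to ASSIGNTOT.

```python
def weighted_gpa(courses):
    """Credit-weighted average grade.

    courses: list of (grade_percent, ects_credits) pairs, one per course.
    """
    total_credits = sum(credits for _, credits in courses)
    weighted_sum = sum(grade * credits for grade, credits in courses)
    return weighted_sum / total_credits


def assigntot(task_scores):
    """Total score over the assessment tasks.

    task_scores: one score per task; None marks a task that was not
    delivered and is counted as zero (assumed reading of the rule).
    """
    return sum(0 if score is None else score for score in task_scores)
```

For example, a 6-credit course graded 60% and a 3-credit course graded 70% give `weighted_gpa([(60, 6), (70, 3)])`, about 63.3, rather than the unweighted mean of 65.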
Although we used only data from the first examination period to avoid bias from short-term relearning effects, we must take into account the fact that students may have been taking the examination for the second time after failing it in the previous academic year. Therefore, RETAKE is introduced as a dummy variable. RETAKE is 1 if the student has already taken the examination in a previous academic year (n = 46). Previous studies modelling this variable were not found, but any excess performance from taking the same examination twice must be captured. To incorporate gender differences, GENDER is inserted as a dummy variable which is 1 when the student is female (n = 129). Including this variable seems to be of added value since earlier studies did not come to the same conclusion about the impact of gender (Arbaugh, 2000; Barret and Lally, 1999).
Three models were estimated. Model 1 uses prior knowledge, GPA and total score on assessment tasks to explain exam results. Model 1 was evaluated for quasi-multicollinearity by calculating a cross-correlation matrix, tolerances and variance inflation factor (VIF) coefficients. At first sight, a strong correlation between prior knowledge and GPA was observed (r = 0.58), but no quasi-multicollinearity was detected (all tolerance values > 0.2). Heteroscedasticity was encountered. Further analysis using a Park test shows that this is mainly due to correlation between the variance of examination results and the level of a student’s GPA (p-value = 0.009). For those who score low on GPA, a large variance in examination results is observed, whereas for those who score high on GPA, there is only a small variance of (higher) examination marks/grades. This means that the examination results of ‘low-performance students’ (those who score low on GPA) are more widely dispersed. Instead of correcting for this methodologically, we translate this observation into an extension of the previous hypotheses, with the assumption that low-performance students (LPS) have more to gain from completing the assessment tasks. Figure 1 shows the relationship between GPA and exam results.
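To make the multicollinearity check on Model 1 concrete, the sketch below converts an R² value into a tolerance and a VIF. The function name is hypothetical. Using the pairwise correlation r = 0.58 as input is a simplification: in Model 1 each predictor's R² comes from regressing it on both other predictors, so the pairwise value only gives an upper bound on the tolerance.

```python
def tolerance_and_vif(r_squared):
    """Return (tolerance, VIF) for a predictor.

    r_squared: R^2 from regressing that predictor on the other predictors.
    """
    tolerance = 1.0 - r_squared
    vif = 1.0 / tolerance
    return tolerance, vif


# Pairwise correlation between prior knowledge and GPA reported in the text.
tolerance, vif = tolerance_and_vif(0.58 ** 2)
print(f"tolerance = {tolerance:.3f}, VIF = {vif:.3f}")
```

With r = 0.58 this pairwise tolerance is about 0.66, well above the 0.2 threshold mentioned in the text.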

Figure 1. Relation between examination marks and GPAs.
Model 2 addresses the heteroscedasticity by dividing the data set into high-performance students (HPS), that is, those who score high on GPA, and LPS, that is, those who score low on GPA, using a split value of GPA = 60. A Chow test showed that the split data set model has significantly higher explanatory power than a model without subdivision into HPS and LPS. Repeating the Chow test for a range of different split values showed that a split value of 60 delivers the best fit. Model 2 also contains the variable RETAKE to capture the excess performance generated by retaking the examination. Prior knowledge is omitted from this model because its contribution was not significant (model 1; p-value = 0.11). Model 3 was set up to incorporate the difference in performance according to the gender of the students. However, gender was not significant for either HPS (p-value = 0.99) or LPS (p-value = 0.29). To evaluate whether the contribution of an independent variable is significant, we used a significance level α of 0.05.
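The split and the Chow test can be sketched as follows. The Chow test compares the residual sum of squares (RSS) of one pooled regression with the combined RSS of separate regressions on the two subgroups. The code shows the textbook F statistic under that definition; the function names and the dictionary key `"gpa"` are hypothetical, and the regressions that produce the RSS values are assumed to be fitted elsewhere.

```python
def split_by_gpa(students, cutoff=60):
    """Split student records into HPS (GPA >= cutoff) and LPS (GPA < cutoff)."""
    hps = [s for s in students if s["gpa"] >= cutoff]
    lps = [s for s in students if s["gpa"] < cutoff]
    return hps, lps


def chow_f_statistic(rss_pooled, rss_hps, rss_lps, n_hps, n_lps, k):
    """Chow F statistic for a structural break between two groups.

    rss_pooled: RSS of the single regression on all observations
    rss_hps, rss_lps: RSS of the separate group regressions
    n_hps, n_lps: number of observations in each group
    k: number of estimated parameters per regression (incl. intercept)
    The statistic is compared with F(k, n_hps + n_lps - 2k).
    """
    rss_split = rss_hps + rss_lps
    numerator = (rss_pooled - rss_split) / k
    denominator = rss_split / (n_hps + n_lps - 2 * k)
    return numerator / denominator
```

Repeating the calculation over a grid of cutoffs and comparing the resulting statistics is one way to carry out the search over split values described above.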
Results
Descriptive statistics
Table 1 displays the descriptive statistics of the results of the examination, comparing the years 2009 and 2010.
Table 1. Overview of descriptive statistics for year 2009–2010 and 2010–2011.
A large number of students took part in the examinations during the first examination session (420 in 2009 and 366 in 2010). With an average result of 10.35, students in 2010 seem to score better than students in 2009 (average of 9.70). There is less deviation around the average in 2010. The success rate in 2010 is higher because 55% of the students passed the examination in 2010, whereas in 2009, only 51% passed the examination. As a first observation, this set of descriptive statistics reveals that students in 2010 performed better than their peers in 2009.
Table 2 shows the descriptive statistics for the results of the assessment tasks.
Table 2. Results from assessment tasks 1, 2 and 3 (academic year 2010).
A very high participation rate is observed: for the first assessment, only 5 students out of 366 did not complete the task. For the second and third assessments, the number of non-participants rose slightly to 9 and 7 students, respectively, out of 366. Averages, medians and standard deviations show a very positive result for most of the students. For the first assessment, only 16 of the participating 361 students did not pass. Much the same observations can be made for assessments 2 and 3. Assessment 2 has a larger standard deviation than the others: every year, the topic of ‘Group Statements’, which students experience as difficult, shows more dispersed scores than the other topics.
Comparison of average exam result 2009 versus 2010
The average results of 2009 and 2010 are compared with an independent samples t-test. Results are shown in the overview in Table 3.
Table 3. Comparison of examination result (EXAM) averages between 2009 and 2010.
The difference between the result of 2010 and the result of 2009 is significant (t = −2.45; p-value = 0.008), meaning that 2010 exceeds 2009 in examination performance. Students generally performed better when assessment tasks were used. Still, this comparison alone cannot isolate the contribution of the assessment tasks. A regression analysis was therefore conducted; its results are presented in the next section.
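To relate a t value of this size to a p-value, the sketch below uses the standard normal distribution as a large-sample approximation to the t distribution. That approximation is an assumption, reasonable here because each group contains several hundred students, and the function name is hypothetical.

```python
from math import erfc, sqrt


def normal_lower_tail(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * erfc(-z / sqrt(2))


t_value = -2.45  # t statistic reported in the text
one_sided = normal_lower_tail(t_value)
two_sided = 2 * one_sided
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

The approximation gives roughly 0.007 one-sided and 0.014 two-sided, so the reported 0.008 is closer to a one-sided figure; the text does not state which form of the test was used.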
Regression analyses
All estimates are shown in Table 4.
Table 4. Results of regression analysis with examination result (EXAM) as dependent variable.
HPS: high-performance students (GPA ≥ 60%); LPS: low-performance students (GPA < 60%).
The effect of assessment tasks on examination results
The results in model 2 show that the effect of the assessment tasks depends on the GPA of a student. For HPS, there is no significant contribution from assessment tasks to the examination results (p-value = 0.27). For the LPS, however, there is a significantly positive contribution from assessment tasks to the examination results (p-value = 0.02). The estimated coefficient is 0.05, meaning that for every point scored on an assessment task, an LPS can on average expect 0.05 extra points on the examination score. LPS benefit from assessment tasks whereas HPS do not need assessment tasks to get satisfactory examination marks/grades.
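The interpretation of the LPS coefficient amounts to a simple multiplication, sketched below. The constant and function names are hypothetical, and the calculation holds the other variables in the model fixed; it describes an average association, not a guaranteed gain for an individual student.

```python
# Model 2 coefficient of ASSIGNTOT for LPS, as reported in the text.
LPS_ASSIGNTOT_COEF = 0.05


def expected_exam_gain(extra_task_points, coef=LPS_ASSIGNTOT_COEF):
    """Expected change in the examination score (on 80 points) for a given
    change in total assessment-task points, other variables held fixed."""
    return coef * extra_task_points
```

For instance, 20 additional assessment-task points correspond to an expected gain of about one point on the 80-point examination.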
The effect of student’s qualitative characteristics (GPA) on examination results
In model 2, GPA significantly influences the examination mark/grade for both HPS and LPS (p-value = 0.00). The impact of GPA is greater for HPS (0.86) than for LPS (0.36). For HPS, an extra GPA point results in (almost) an extra mark on the examination. For LPS, an extra GPA point accounts for roughly one-third of a mark on the examination. Not surprisingly, being a better student in general leads to higher examination marks/grades.
The effect of retaking the examination
The influence of retaking the examination for LPS is clear. For those students, there is no significant contribution from additional effort on examination mark/grade (p-value = 0.33). For HPS, the situation is somewhat ambiguous. Based on a significance level of 5%, one must come to the same conclusion for HPS as for LPS. On the basis of a 10% significance level, however, there is a significant and positive effect from taking the examination twice.
The effect of prior knowledge and gender on examination results
There is no significant effect of prior knowledge on examination results (p-value = 0.11). There is also no significant effect of gender on examination results (HPS: p = 0.99; LPS: p = 0.29).
Conclusion and discussion
Results reveal a significant difference in examination performance from one academic year to the next; students generally performed better when assessment tasks were used. The effect of the assessment tasks on examination results depends on the GPA of a student. For those students who score high on GPA, the assessment tasks make no significant contribution to their subsequent examination marks/grades. For those students who score low on GPA, the assessment tasks make a significantly positive contribution to their subsequent examination marks/grades. For every additional mark obtained on an assessment task, a student who scores low on GPA can on average expect 0.05 extra marks on the examination which follows. In brief, a student who scores low on GPA benefits from assessment tasks whereas a student who scores high on GPA does not need assessment tasks in order to get satisfactory examination marks/grades. Not surprisingly, perhaps, being a better student in general leads to higher examination marks/grades. Neither prior knowledge nor gender has a significant effect on examination marks/grades.
In their literature review, Watson et al. (2007) call for empirical evidence on the effectiveness of assessment tasks and this article provides such empirical evidence. Results from the study described in this article support those of Potter and Johnston (2006) and Adrangi (1989). Empirical evidence is also found in studies by Bridges et al. (2002), Gibbs and Lucas (1997), Marriot and Lau (2008) and Gijbels et al. (2005). Bridges et al. (2002) and Gibbs and Lucas (1997) both find that assessments lead to significantly better results. Marriot and Lau (2008) investigated the impact of online assessment tasks in an undergraduate course. Results of students were compared before and after the implementation of computer-aided assessment tasks. Marriot and Lau (2008) found that student performance had improved after employing an assessment task. The assessment tasks provided the students with timely and regular feedback which enabled the students and the teacher to identify weaknesses early enough to take remedial action. The students believed this would have a positive impact on their examination mark/grade. The students reported more confidence with the subject which improved their participation in class and had a positive impact on their motivation.
In their study, Gijbels et al. (2005) compared examination scores of students who performed assessment tasks and students who did not. They also corrected for previous knowledge of the students. Gijbels et al. (2005) found that students who completed the assessments performed better on the examination, even on examination questions that were not related to the assessment topics. They therefore conclude that introducing assessment tasks ‘helped students to address more appropriate student learning activities, going beyond the assessment tasks and content’ (p. 84). The students in their sample also reported that they studied more and also more critically and systematically due to the assessment tasks.
However, not all results in the study described in this article concur with previous research. Concerning prior knowledge and its effect on examination results, results from the study described in this article suggest that there is no significant effect of prior knowledge on examination marks/grades, despite research that suggests otherwise (Dowling et al., 2003; Potter and Johnston, 2006). However, the result of the study described in this article might perhaps be explained by the fact that the course is not highly dependent on the Accountancy course that the students had taken earlier and which we assumed to be responsible for this prior knowledge.
The regression model delivers new information about the impact of assessment tasks on different student groups. Students who score low on GPA across all courses within their degree programme seem to benefit significantly from the assessments while students who score high on GPA across all courses within their degree programme do not need the assessments in order to get high marks on their examination.
Several limitations of this study should however be reported. In order to examine the effect of the assessment tasks, the results of students of the year 2009–2010 (no assessment task) were compared to those of students of the year 2010–2011. It is noted that student characteristics may vary from one year to the next and that the method of assessment changed. However, since the focus was mainly on the regression results (2010–2011 students only), this limitation does not play a big role. A second limitation of this study is that only the examination marks/grades of the students were compared. There is no examination of the true impact of the assessments on deeper learning, procrastination and motivation. A third limitation is the very specific sample of students in the setting of a specific course, namely, second year undergraduate students within the discipline of accounting.
Results of this study suggest the following implications for educators and all those involved in supporting students in their learning. It is particularly important to take these results into account for those who teach students in large classes and who therefore have students within such classes whose GPA varies greatly. Such educators might like to consider the possibility of splitting the large class into two or more groups, with the group comprising those who score low on GPA across all courses being provided with supplementary hours of teaching and extra support.
Further research could be conducted to capture the impact of assessment tasks on deeper learning, procrastination and motivation, instead of looking only at examination marks/grades. This research could measure the effect of each of these three factors on examination marks/grades by interviewing the students and linking their answers to the assessment results. It would be useful to study the impact of a new assessment technique by setting up an experiment in which half of the students have the ‘old’ evaluation method and the other half the ‘new’ method including assessment tasks. It may also be useful to investigate the impact of participation versus no participation in the assessment tasks instead of taking the total marks/grades across all assessment tasks. Another possibility is to investigate the impact of a topic-specific assessment task on the corresponding question of the examination. Finally, it would be useful to look at the issues explored here in different disciplines and countries and with different student populations, for example, postgraduates or those either at the start or at the end of their undergraduate programme, as different results may be obtained.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
