Abstract
In a wide range of professional service firms, individuals perform a variety of tasks which are highly cognitive and knowledge intensive yet repetitive in nature, providing significant opportunities for learning. In addition, individuals in such environments tend to enjoy considerable discretion in managing when and how they perform their tasks. In light of these observations, we investigate task allocation and timing strategies that may enhance or inhibit learning and productivity for professional service workers. Specifically, we focus on the role of task variety. We use a detailed dataset of 3273 coronary artery bypass surgeries in a private European hospital over 7 years to examine the effect of concurrent and non‐concurrent exposure to task variety on learning and productivity on a focal task. We find that while concurrent exposure to variety has a positive impact on focal productivity, non‐concurrent exposure to variety has a negative impact on it. Our results also suggest that short‐term exposure to variety amplifies these relationships.
Introduction
Professional service firms globally generate annual sales over $3 trillion and represent 7%–8% of total service sector revenue in advanced economies (Chui et al. 2012). This percentage is even higher in service‐based economies such as Britain, where 15% of the GDP and 14% of employment comes from professional services firms (PwC 2012). Following Von Nordenflycht's (2010) characterization, the professional service industry broadly includes accounting, advertising and marketing, management consulting, architecture, legal services, scientific research services, and physician practices. While these firms have distinct characteristics at a high level, including knowledge intensity, low capital intensity, and a professionalized workforce (Von Nordenflycht 2010), their operations are also distinguished by several key features that relate to how their employees perform their work.
First, the majority of the work that professional service workers perform is quite repetitive in nature. For example, management consultants follow similar steps in their engagement with clients, from initiation to contracting and final deliverables; legal professionals draft legal documents by engaging in a similar set of activities; and surgeons follow a certain set of procedures in the operating room. While most activities and tasks may be quite similar from one job to another, workers still engage in high levels of cognitive activity, presumably due to variation in work content across tasks (e.g., differences between consulting projects, surgeries, or legal cases). Hopp et al. (2009) consider this key observation in their classification of white‐collar work: they call, for instance, consulting and legal services intellectual and routine work. As a result of the repetitive nature of the work and the significant opportunities for learning that these settings offer, individuals inevitably transfer their past experience and knowledge when they work on subsequent tasks (Gick and Holyoak 1987, Tversky 1977, Zollo and Reuer 2010).
A second important feature of professional service work is that individuals tend to have a relatively high degree of discretion in managing when and how they perform their tasks (Hopp et al. 2007, Ibanez et al. 2017).
Considering these two key features of professional service work—first, its repetitive nature and many learning opportunities across tasks and, second, workers’ potential discretion in managing when and how to perform various tasks—raises an important operational question: How should professional service workers perform their various tasks to achieve greater learning and productivity over time? More specifically, in performing the variety of tasks that a professional service worker is supposed to carry out, are there certain task‐timing configurations that enhance or inhibit learning and productivity?
Our study addresses these questions by focusing on how workers perform various tasks. We distinguish between concurrent and non‐concurrent exposure to variety and examine productivity implications of these two approaches to organizing work. Concurrent variety refers to performing another task concurrently with the focal one, whereas non‐concurrent variety refers to performing another task independently (i.e., at a different time) from the focal one. Because many professional service workers inevitably perform a variety of tasks, we seek to understand when and how exposure to variety helps, and when and how it hurts individuals’ focal productivity.
Exposure to variety through successful knowledge transfer to the focal task may have a positive impact on performance (Boh et al. 2007, Schilling et al. 2003, Staats and Gino 2012), but too much exposure to variety may be detrimental to productivity (Narayanan et al. 2009). Also, task variety could be confusing for individuals (Allport et al. 1994) and therefore decrease their subsequent focal productivity due to switching costs and warm‐up periods (Cellier and Eyrolle 1992, Monsell 2003). Like the studies that have produced these findings, our study explores the productivity implications of exposure to task variety, but with an important distinction. We propose that the influence of task variety on productivity critically depends on when other tasks are performed in relation to the focal task. We suggest that knowledge transfer and learning mechanisms are different when other tasks are performed concurrently vs. non‐concurrently with the focal task, which leads to differentiated and, in fact, contrasting effects on productivity.
We develop and test four hypotheses regarding professional service workers’ concurrent and non‐concurrent exposure to variety by examining their effect on productivity in subsequent focal tasks. We find that concurrent exposure to variety enhances productivity, whereas non‐concurrent exposure to variety is detrimental to productivity. Non‐concurrent variety reduces productivity by generating inaccurate transfers and mapping processes between tasks performed at different times that are similar on the surface but structurally different (Gick and Holyoak 1987). Surface similarities may include similar contexts (Craik and Tulving 1975), goals, and processes (Bransford and Franks 1976, Tulving and Thomson 1973). In contrast, structural similarities are deeper resemblances between the elements and underlying causal structure of tasks (Gick and Holyoak 1987); structural similarities enable workers to derive causal inferences from one task to apply to another task (Holyoak and Koh 1987). With concurrent variety, surface similarities do not mask structural differences, because the worker is performing the tasks at the same time. In fact, concurrent performance is highly conducive to learning: it enables “implicit learning” (Reber 1989, Wulf and Schmidt 1997) and facilitates cognitive skill acquisition through the discrimination process (Anderson 1982), all of which leads to better comprehension of the focal task and increased productivity.
In addition, our results suggest that short‐term exposure to variety amplifies the influence of long‐term exposure on subsequent focal productivity. That is, recent (short‐term) concurrent variety increases the positive influence of concurrent variety on focal productivity, whereas recent non‐concurrent exposure to variety strengthens the negative impact of non‐concurrent variety on focal productivity.
As we test our hypotheses and explore how professional service workers can achieve higher productivity by better organizing the variety of tasks they perform, we want to emphasize an important empirical consideration. Because many professional service workers have significant discretion over when and how to perform their tasks, there may be inherent endogeneity in many professional service settings, which could pose problems in empirical identification. The ideal test of our hypotheses requires a professional service setting where decisions about which tasks to perform, when to perform different tasks, and whether to perform them concurrently or non‐concurrently are not up to individuals’ discretion but are instead determined exogenously. We test our hypotheses using a detailed dataset of 3273 coronary artery bypass graft (CABG) operations from the cardiac unit of a private European hospital over 7 years. Because a patient's need is the primary driver of the type, nature, and timing of an operation and these are not up to the discretion of the surgeon, we believe that this is an ideal setting in which to investigate the effects of task variety on productivity.
Our study offers a number of significant contributions to the service operations management literature. First, by focusing on an under‐studied service sector, namely operations of professional services firms (Lewis and Brown 2012, Roth and Menor 2003), our study examines how different task allocation and composition strategies may influence learning and subsequent focal productivity of professional service workers. This, we believe, is also a step toward answering Argote et al.'s (2003) call for more research to identify “mechanisms and conditions under which experience is beneficial (or harmful) for learning outcomes” and a step toward answering their question about whether different types of experience may provide better understanding of the task (p. 579). Second, our study contributes to the growing literature on task variety and productivity by introducing a new dimension of variety which has not been considered before: concurrent or non‐concurrent exposure. In a similar study, Staats and Gino (2012) focused on a single current task, studied when different tasks took place (on the same day or in the past) with respect to that current task, and examined how these timing differences affected current task performance. We, however, concentrate on a common focal task performed over time (i.e., CABG), study how exposure to variety takes place with respect to past focal tasks—in conjunction with it (concurrently) or independently from it (non‐concurrently)—and examine subsequent performance implications. Third, we make an important distinction between short‐term and long‐term learning dynamics. We introduce short‐term exposure to variety as a factor that moderates how long‐term exposure to variety (both concurrent and non‐concurrent) affects individuals’ focal productivity.
In the next section, we describe our setting before proceeding to motivate and develop our hypotheses.
Setting
Our setting is the cardiac unit of a private hospital in Europe, and our dataset consists of 3273 coronary artery bypass graft (CABG) surgeries that were conducted in the hospital over a period of 7 years and 3 months. In addition to CABG surgeries, which we identify as the focal task (see section 4 for a discussion), we also have information regarding all other types of cardiac surgeries that were conducted in the hospital during the same time period (Avgerinos and Gokpinar 2017), which allows us to observe surgeons’ exposure to other types of tasks (i.e., task variety).
Our setting is a very suitable context in which to investigate the effects of exposure to variety on individual learning and productivity, because surgeons perform CABG surgeries as well as a variety of other types of cardiac surgery and can therefore potentially transfer knowledge from one task to another. In addition, learning is an integral part of hospital operations (Tucker et al. 2007) and surgeons’ practices (KC and Staats 2012). Furthermore, CABG surgeries are highly critical and complex yet common and frequent tasks for surgeons (Clark and Huckman 2012, Pisano et al. 2001), making them an ideal setting in which to study how exposure to task variety may influence learning and the resulting productivity of professional service workers. Finally, because our dataset covers a span of more than 7 years, we are able to examine both long‐term and short‐term effects of task variety on learning and subsequent focal productivity.
As mentioned, one major advantage of our setting is that the nature, type, and timing of the tasks (surgeries) are not endogenously determined by the worker but are driven by outside factors (patients’ needs). Consequently, a surgeon may perform a single CABG surgery if that is all the patient needs; she may perform a single valve replacement if the patient requires only that; or she may perform multiple surgeries during the same operation (valve replacement and CABG) if the patient needs both. Likewise, surgeons’ assignments to tasks and the similarity of their skill sets are other important considerations in avoiding endogeneity; we investigate these in detail in the robustness checks section.
We identify CABG as the focal surgery in this setting and examine the impact of surgeons’ exposure to other types of surgeries on subsequent focal task (CABG) productivity. Because surgeons perform other types of surgeries, too, both concurrently (e.g., performing valve replacement and CABG together) and non‐concurrently (e.g., performing a single valve replacement) depending on medical requirements, we are able to observe knowledge transfer and learning between both concurrent and non‐concurrent tasks.
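The distinction between concurrent and non‐concurrent exposure can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the variable construction used in the study: the `Operation` record, the `exposure_type` labels, and the `prior_exposure` helper are hypothetical names introduced here, and each operation is assumed to be stored as the set of surgery types it contains.

```python
from dataclasses import dataclass

FOCAL = "CABG"  # focal task, as identified in the text


@dataclass(frozen=True)
class Operation:
    """Hypothetical record: one operation and the surgery types it contains."""
    surgeon: str
    procedures: frozenset


def exposure_type(op: Operation) -> str:
    """Label an operation relative to the focal task (labels are assumptions)."""
    others = op.procedures - {FOCAL}
    if FOCAL in op.procedures and others:
        return "concurrent"      # another surgery performed together with CABG
    if FOCAL in op.procedures:
        return "focal_only"      # CABG performed on its own
    return "non_concurrent"      # another surgery performed without CABG


def prior_exposure(history, surgeon):
    """Count a surgeon's past operations by exposure type."""
    counts = {"concurrent": 0, "non_concurrent": 0, "focal_only": 0}
    for op in history:
        if op.surgeon == surgeon:
            counts[exposure_type(op)] += 1
    return counts


history = [
    Operation("A", frozenset({"CABG"})),
    Operation("A", frozenset({"CABG", "valve replacement"})),
    Operation("A", frozenset({"valve replacement"})),
    Operation("B", frozenset({"valve replacement"})),
]
counts_a = prior_exposure(history, "A")
```

In this toy history, surgeon A has one operation of each kind, so a subsequent CABG‐only surgery by A would be preceded by one concurrent and one non‐concurrent exposure to variety.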
Literature Review and Theory Development
Exposure to Variety, Learning, and Productivity
Previous research on the effect of task variety on individuals’ performance has yielded mixed results. A significant body of literature argues that task variety enhances learning through successful transfer of knowledge between the focal and related activity (Tversky 1977, Zollo and Reuer 2010). In addition, task variety can increase workers’ commitment and motivation (Hackman and Oldham 1976, Langer 1989), resulting in improved productivity. Schilling et al. (2003) showed that related variety can improve the learning rate of students playing different versions of a game. Staats and Gino (2012) showed that variety promotes workers’ productivity in the long run.
On the other hand, researchers have also argued that task variety could be distracting and eventually detrimental to individual learning and productivity (Allport et al. 1994, Monsell 2003). Prior learning in one task may have a negative effect on performance of a related task due to “negative transfer effect” (Gick and Holyoak 1987, Reed 1989, Zollo and Reuer 2010). Others have observed more nuanced relationships. For example, Narayanan et al. (2009) observed that variety had an inverted U‐shaped relationship with individual productivity in an offshore software support services company, and KC and Staats (2012) found that focal subtask variety has an inverted U‐shaped relationship with performance, whereas related subtask variety has a U‐shaped relationship with the performance of cardiac surgeons.
Concurrent and Non‐Concurrent Exposure to Variety
Our study contributes to this debate by suggesting and testing a new mechanism by which task variety may influence individual learning and focal productivity. We reconcile the two views outlined above and suggest that whether exposure to variety occurs independently (non‐concurrently) or concurrently determines the direction and level of knowledge transfer from the related task to the focal task. That is, non‐concurrent task variety could result in negative knowledge transfer and hence lead to lower productivity the next time the focal task is performed. On the other hand, concurrent variety could enhance knowledge transfer from related tasks to the focal task and hence improve individual productivity for subsequent focal tasks. For example, during an operation, a surgeon might perform only a valve replacement or might perform a valve replacement combined with a CABG. In the former operation, she will be exposed to variety (i.e., valve replacement) non‐concurrently with the focal task (i.e., CABG), whereas in the latter case her exposure to variety will occur concurrently with the focal task.
Non‐Concurrent Exposure to Variety
The basic mechanism through which past‐related tasks affect subsequent performance of a focal task is knowledge transfer. Negative transfer effects between tasks have been suggested in the cognitive psychology literature (see Gick and Holyoak 1987 for a review): prior learning and experience in one task may produce negative transfers and hence may be misleading in executing another task.
The most well‐known mechanism for knowledge transfer is structure‐mapping (Gentner 1983, Gentner and Markman 1997, Reed 1987, Ross 1987): When faced with a task, individuals will compare it with past related tasks and retrieve past schema based on the similarities between the current task and past ones (Loewenstein et al. 1999). In explaining the mapping process, researchers have highlighted a critical distinction between surface similarity (i.e., similarity of context, goals, and processes) and structural similarity (i.e., deeper association in underlying causal structure). Even though the specific elements involved in tasks may resemble each other (surface similarity), the underlying causal structure and ordering of elements could be quite different (Novick 1988), and therefore, deriving causal inferences based on surface similarity can be misleading (Holyoak and Koh 1987).
Surface similarity has been found to have greater influence than structural similarity on schema retrieval from previous tasks (Gentner 1989, Gentner and Colhoun 2010, Holyoak and Koh 1987, Loewenstein et al. 1999). Indeed, individuals tend to focus on superficial (surface) details during the retrieval process, which is an important cognitive challenge of applying learning from one case to another (Loewenstein et al. 2003). Therefore, when individuals perform two tasks non‐concurrently, they can be misled by the previous task's surface similarity, retrieve incorrect schema, and make inaccurate transfers if the tasks’ surface similarity does not correlate well with their structural similarity (Ben‐Zeev and Star 2001, Lee and Simon 2004, Reed 1987, Ross 1987). These incorrect transfers in turn lower the individual's productivity (Gick and Holyoak 1987).
Similarly, in our setting, while different types of cardiac operations are similar on the surface, there are important structural differences between them, which we argue would lead to a negative transfer. Specifically, in addition to CABG, our setting involves other types of operations, from tumor removal to heart transplant. Such operations share surface similarities with CABG, including context, aim, and process, but they are carried out in distinct steps with significant structural differences from CABG (Reznick and MacRae 2006). During a CABG operation, a vein from elsewhere in the patient's body is stitched to the aorta and a coronary artery, whereas during a tumor removal, for instance, the tumor is removed from the heart or the valve along with the tissue around it.
Hence, despite the surface similarities, we argue that different cardiac surgery types do not have enough structural similarity to allow for accurate causal inferences from one another (Holyoak and Koh 1987) when they are performed at different times (non‐concurrently). Specifically, non‐concurrent variety will not enable the surgeons to directly compare (Loewenstein et al. 1999) and successfully comprehend the alignable differences (i.e., differences that are connected to the common context of the two tasks) (Gentner and Markman 1997) between the two operation types. When variety is non‐concurrent (i.e., the other type of task is performed at an earlier time), surgeons are likely to retrieve schema that are not exactly appropriate for the present task, leading to inaccurate transfer and therefore reduced productivity. Such inaccuracy is due to the dominance of surface similarity in schema retrieval from long‐term memory (Gentner and Colhoun 2010, Loewenstein et al. 1999).
In addition to negative transfer effects, the development of implicit memory (i.e., priming) due to non‐concurrent variety can impair decision making in a subsequent focal task (Allport and Wylie 1999). Medical researchers have found that surgical operations require 25% technical and 75% decision‐making skills (Grierson et al. 2011, Spencer 1978) and that doctors and medical staff do respond unconsciously based on past stimuli (Bargh and Williams 2007, Loewenstein and Lerner 2002). These findings suggest that priming from related variety can lead surgeons to make suboptimal decisions and take suboptimal actions in a subsequent focal surgery (i.e., CABG), thereby decreasing their focal productivity. For these reasons, we predict that: Non‐concurrent exposure to task variety has a negative impact on subsequent focal task productivity.
Concurrent Exposure to Variety
We next examine the productivity implications of performing other tasks in conjunction with the focal task. When another task is performed concurrently with the focal one, surface similarities between tasks will not mask their structural differences, because individuals will be able to compare and comprehend alignable differences between different tasks (Gentner and Markman 1997). This comparison entails a mapping process that highlights both the differences and similarities between the two tasks (Gentner and Markman 1997, Holyoak and Thagard 1995, Medin et al. 1993) and results in the abstraction of schema that are useful for the focal task (Lassaline and Murphy 1998, Markman and Gentner 1993).
Individuals also better comprehend, learn about, and identify the intricacies of the focal task as a result of the greater implicit learning (Reber 1989, Wulf and Schmidt 1997) and discrimination (Anderson 1982) that come from concurrent variety. Indeed, concurrent variety provides a new, varied context for the focal task which promotes implicit learning. As a result, individuals develop critical but highly complex and abstract knowledge about the focal task and its associations (Maskarinec and Thompson 1976). In addition, concurrent variety facilitates the discrimination process necessary for cognitive learning (Anderson 1982) by producing multiple variants of the conditions of the same action. Performing two tasks concurrently activates cognitive processes (Koechlin et al. 1999) which are linked with the ability to integrate and combine subtasks and ideas (Koechlin and Hyafil 2007). Individuals can then identify the most appropriate course of action by learning about and comparing differences between the focal task and the concurrent task (Anderson 2013).
For example, the common reference book for cardiac surgery, Kirklin/Barratt‐Boyes (Kouchoukos et al. 2012, p. 529), identifies two primary considerations when CABG is accompanied by mitral valve repair/replacement: “1. Planning the operation to limit the duration of CPD and global myocardial ischemia (aortic clamping), 2. Reducing the need to tilt the heart up after the valve is inserted or repaired.” These specifications suggest that concurrent variety can facilitate processes of implicit learning and discrimination by giving the focal CABG operation a different context and differing conditions in the form of additional coordination and planning considerations and new constraints. Considering all the above arguments, we predict that: Concurrent exposure to task variety has a positive impact on subsequent focal task productivity.
The Moderating Role of Short‐Term Non‐concurrent Variety
We next ask how recent (short‐term) non‐concurrent variety interacts with long‐term non‐concurrent variety. We argue that recent non‐concurrent exposure to variety aggravates the negative effects of non‐concurrent variety, in particular the negative knowledge‐transfer effect (Gick and Holyoak 1987). This is because individuals tend to retrieve schemas that they have used recently, even when more plausible and reasonable alternatives exist (Reder 1982). A recent non‐concurrent exposure to variety will therefore lead surgeons to adopt its schema, which may not be appropriate for the current task.
In addition, as discussed in section 3.3, non‐concurrent exposure to variety may decrease surgeons’ productivity due to their response to similar stimuli, which may prove detrimental over time. Researchers have shown that priming is more likely to happen when exposure to the related task is more recent (Bargh and Williams 2006, Lerner et al. 2004). Compared with more distant experiences, the carryover effects of recent experiences are more likely to cause automatic responses to subsequent similar experiences (Bargh and Williams 2006), since individuals tend to respond unconsciously with their most recent relevant behaviors (Bargh et al. 2012). We therefore expect that: Recent non‐concurrent exposure to variety amplifies the negative effect of non‐concurrent exposure to variety on subsequent focal task productivity.
The Moderating Role of Short‐Term Concurrent Variety
Finally, we consider the moderating effect of recent (short‐term) concurrent exposure to variety on the relationship between concurrent exposure to task variety and productivity. We expect that the positive knowledge transfer from all past concurrent exposures to variety will be higher after a recent exposure to concurrent variety.
As discussed in section 3.4, an important way that concurrent variety improves subsequent focal task productivity is implicit learning. Because this process is essentially about identifying and recalling associations of the focal task (Maskarinec and Thompson 1976), which implicitly involves time, exposure to recent concurrent variety will further improve implicit learning.
In addition, individuals tend to forget (Egelman et al. 2016, Shtub et al. 1993), which can have a significant impact on their productivity in procedural cognitive tasks (Bailey 1989). This forgetting effect becomes less rapid as the complexity of the performed task increases (Lance et al. 1998, Nembhard 2000) and more rapid as the time interval between two consecutive tasks increases (Bailey 1989, Globerson et al. 1989), even in a procedural cognitive task (Nembhard and Uzumeri 2000) such as a cardiac operation. Researchers have also shown that surgeons’ learned skills (both technical and cognitive) tend to deteriorate over time (Ramdas et al. 2017) and have highlighted the need for periodic remediation of any necessary skills (Kahol et al. 2010). Moreover, Kahol et al. (2010) have suggested that surgeons can improve their retention of skills by performing multiple surgeries concurrently. Hence, we believe that a recent concurrent exposure will significantly reduce the forgetting effect for a surgeon. With reduced forgetting, long‐term learning from exposure to variety will increase. Therefore, we expect that: Recent concurrent exposure to variety amplifies the positive effect of concurrent exposure to variety on subsequent focal task productivity.
Data and Variables
The organization that we use for our study is the cardiac unit of a private hospital in Europe that is the property of an American non‐profit organization. The hospital admits more than 2000 patients annually and performs around 850 cardiac operations each year. We test our hypotheses using an archival dataset of all 3275 CABG operations performed in the hospital during the period from 01/01/2004 to 31/03/2011. After removing two operations with missing data, we are left with 3273 operations for our study.
Each surgery team consists of one lead surgeon and zero to four assistant surgeons. Our sample includes 44 surgeons, 19 of whom started working after the beginning of our dataset, and 13 of whom do not appear during the last year of our dataset. Apart from the surgeons, each team also typically has one anesthesiologist, one perfusionist, and zero to three scrub nurses. We provide further information on surgeons in Appendix S1. Like other studies that examine the effect of experience on performance in surgical settings (see, e.g., KC and Staats 2012), our study concentrates on surgeons (lead and assistant surgeons, 44 in total) and their exposure to various tasks. There are two reasons for this. First, surgeons perform different sets of steps and activities during different kinds of surgery, which gives them many opportunities to learn. The rest of the team members (e.g., the anesthesiologist preparing the patient, nurses providing the equipment, etc.) perform tasks that are more trivial and very similar across different kinds of surgery. Since our primary agenda in this study is to identify the effect of task variety on learning and productivity, we focus on surgeons, who perform somewhat different activities and tasks in different kinds of surgery. Second, interviews with the medical staff at the hospital confirmed our intuition that we should consider only surgeons in studying the effect of task variety on productivity in this setting.
Our dataset contains information about the type of operation performed, the members of the surgical team, and the operation's duration, including exact start and end times. Our sample also includes information regarding the patient's age, sex, and condition before the operation. Specifically, the hospital labels each patient's case either “severe,” “medium” or “mild.” Finally, our dataset includes limited information about in‐hospital mortality for patients who have had an operation in the hospital.
To test our hypotheses, we use CABG as the focal surgery type and examine the impact of exposure to other cardiac surgery types (i.e., concurrently vs. non‐concurrently) on the duration of subsequent CABG‐only surgeries. We use CABG as our focal task because it is the most common cardiac surgery type (Clark and Huckman 2012), and indeed it appears more than any other surgery type in our dataset. In addition, since our goal is to examine the different effects of concurrent and non‐concurrent task variety on subsequent productivity in the focal task, we need a task which could be performed both concurrently with another task and on its own. Finally, CABG surgeries have received significant attention in the recent operations management literature (Huckman 2003, Huckman and Pisano 2006, KC and Staats 2012, Pisano et al. 2001), which could help us consider our results’ validity and generalizability. Consequently, CABG is an ideal choice of focal task for our study.
A surgeon can perform multiple types of surgery on the same patient during the same operation, as required by medical conditions. For this analysis, in addition to the focal task (i.e., CABG), we have information regarding other types of cardiac surgeries performed during the same time interval in the hospital. Our dataset includes 1324 valve repair/replacements, 86 congenital surgeries, 70 heart failure procedures, 20 tumor removals, 185 routine cardiac surgeries, and 78 other normal surgeries (all other surgeries that do not fit into any of the previous categories). In addition, our dataset includes 951 complex operations in which a CABG surgery and one of the other types of surgeries were performed concurrently. The dataset also includes 170 very complex operations in which a CABG and two of the other types of surgeries were performed concurrently. Because all our hypotheses address focal task productivity, we use 3273 CABG‐only surgeries as our observations to test the hypotheses. However, when calculating our independent variables, we make use of all 6171 surgeries (CABG only, other type only, and concurrent surgeries); details appear in section 4.4. It is worth noting that, apart from the 951 complex operations which include a CABG, 14 additional operations include two surgeries other than CABG performed concurrently.
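As a quick consistency check, the operation counts reported above can be tallied to confirm that they add up to the stated total of 6171. The sketch below assumes that the categories are mutually exclusive and that the 14 operations combining two non‐CABG surgeries form a separate category; the variable names are illustrative only.

```python
# Operation counts as reported in the text.
cabg_only = 3273
other_only = {
    "valve repair/replacement": 1324,
    "congenital": 86,
    "heart failure": 70,
    "tumor removal": 20,
    "routine cardiac": 185,
    "other normal": 78,
}
cabg_plus_one = 951   # "complex": CABG and one other surgery type
cabg_plus_two = 170   # "very complex": CABG and two other surgery types
two_non_cabg = 14     # two non-CABG surgeries performed together (assumed separate)

total = (
    cabg_only
    + sum(other_only.values())
    + cabg_plus_one
    + cabg_plus_two
    + two_non_cabg
)
print(total)  # 6171
```

Under these assumptions, the tally matches the 6171 surgeries used to calculate the independent variables.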
Finally, we also conducted a limited number of interviews with several hospital staff members. These interviews enabled us to better comprehend how CABG and other surgeries are performed and helped us understand hospital policies and management practices.
Variables
Dependent Variable
We use the duration of CABG operations as our dependent variable. Extensive research in psychology has shown that a decrease in the completion time of a task is an indicator of learning and increased productivity (Graham and Gagne 1940, Thurstone 1919). In examining the performance implications of experience and learning‐related issues, operation completion time is a commonly employed dependent variable. For example, in their investigation of the effect of team familiarity, organizational experience, and role experience on team productivity, Reagans et al. (2005) employed surgery duration as the dependent variable. Similarly, other learning‐related studies such as Pisano et al. (2001) and Edmondson et al. (2003) used procedure completion time for cardiac surgery as their dependent variable. In line with these studies, we contend that lower completion time reflects learning and increased productivity for the surgeons and so use it as our dependent variable. In addition, previous literature has found that shorter completion times decrease the probability of post‐surgical infections for cardiac surgeries (Gaynes et al. 2001, Gibbons et al. 2011); consistent with this, staff members at our hospital confirmed in our semiformal interviews that lower completion times usually reflect better clinical outcomes.
While our dataset also included in‐hospital mortalities, unlike in large‐scale multi‐hospital studies, death events were quite infrequent in our setting of only one unit of a single hospital. Consequently, like Pisano et al. (2001), we were not able to detect any significant variables that explain variation in mortality rates other than the clinical condition of the patient (i.e., severe, medium, mild). In fact, all mortalities came from the severe group. Therefore, in line with our theoretical development, we have decided to keep the scope of our study on surgical productivity implications of concurrent and non‐concurrent task variety and leave it to future work to explore quality (e.g., mortality) implications of task variety.
Independent Variables
Non‐Concurrent Variety
This variable captures all prior days’ non‐concurrent exposure to variety for the surgeons in a team. For each surgeon in a team, we first count the total number of times since the beginning of our dataset, up to the current CABG operation, that she has performed a surgery that does not include CABG (that is, other type only). Please also see Table 1 for an illustration of how we calculate this variable for an individual surgeon.
Table 1. Surgeries Performed by an Individual Surgeon over Time (t = t_1, t_2, t_3, t_4)
Concurrent Variety
This variable captures all prior days’ concurrent exposure to variety for the surgeons in a team. For each surgeon, we first count the total number of concurrent operations (that is, operations that involve a CABG surgery and one of the other types of surgeries) she has performed since the beginning of our dataset, up to the current (focal) CABG surgery. Please also see Table 1 for an illustration of how we calculate this variable for an individual surgeon.
Non‐Concurrent Variety × Recent Non‐Concurrent Variety
We first calculate Recent Non‐concurrent Variety. To capture Recent Non‐concurrent Variety, we calculate for each surgeon the number of surgeries that did not include a CABG operation (that is, other type only) that she has performed during the week before the current CABG surgery. We then multiply this new variable by the Non‐concurrent Variety variable to create the interaction term.
Given our setting, we use 1 week to capture a short time period (i.e., recent). Clearly, what constitutes a short time period may differ depending on the operational context. Given operational dynamics in our context and the fact that CABG operations last around 3–6 hours, we think that 1 week is an appropriate choice. (See also Staats and Gino (2012, p. 143) for a similar discussion, which suggests that a week may be an appropriate choice for characterizing a short time period with tasks that last 5 hours.) Consequently, in our study, any exposure to variety that takes place within the week prior to the current operation is considered recent. We also performed robustness checks for our choice of 1 week by considering slightly longer and shorter time periods, and our results remained the same.
Concurrent Variety × Recent Concurrent Variety
We first calculate Recent Concurrent Variety. To capture Recent Concurrent Variety, we calculate for each surgeon the number of concurrent operations (a CABG and another type concurrently) she has performed during the week before the current CABG surgery. We then multiply this new variable by the Concurrent Variety variable to create the interaction term.
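The construction of a recent‐variety count and its interaction term can be illustrated with a minimal sketch. It assumes that "the week before" means the 7 calendar days preceding the focal surgery (focal day excluded), that the cumulative count excludes the recent count (as noted later in the variable calculation discussion), and that raw counts are multiplied; the function names and the dates are hypothetical and are not drawn from the hospital data.

```python
from datetime import date, timedelta

def recent_count(event_dates, focal_date, window_days=7):
    """Count prior operations in the window_days before focal_date (focal day excluded)."""
    start = focal_date - timedelta(days=window_days)
    return sum(1 for d in event_dates if start <= d < focal_date)

def variety_terms(event_dates, focal_date, window_days=7):
    """Return (cumulative, recent, interaction) for one surgeon and one focal surgery.

    Assumption: the cumulative count leaves out the recent operations so that
    the two variables do not overlap.
    """
    recent = recent_count(event_dates, focal_date, window_days)
    prior = sum(1 for d in event_dates if d < focal_date)
    cumulative = prior - recent
    return cumulative, recent, cumulative * recent

# Hypothetical dates of one surgeon's concurrent (or non-concurrent) operations.
event_dates = [date(2010, 1, 4), date(2010, 2, 1), date(2010, 3, 1), date(2010, 3, 3)]
terms = variety_terms(event_dates, focal_date=date(2010, 3, 5))
print(terms)
```

The same function applies to either variety measure; only the list of qualifying operations changes. Changing `window_days` mirrors the robustness checks with slightly shorter and longer windows.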
Control Variables
Focal Experience
For each surgeon, we calculate the number of CABG surgeries (either CABG‐only or combined with another surgery type) she has conducted up to the current CABG surgery (excluding the current one) since the beginning of our dataset. This way, we capture each surgeon's total experience with the focal surgery type. Please also see Table 1 for an illustration of how we calculate this variable for an individual surgeon.
Team Size
We control for the size of the surgical team (counting all team members). Larger teams might have more access to experience and resources (Reagans et al. 2005), but smaller work‐group size is associated with increased team productivity (Gladstein 1984), since larger teams sometimes face coordination challenges that decrease their productivity (Hackman 2002).
Time Fixed Effect
To control for potential environmental changes in our setting (such as hospital policy or technological advances) and also organizational experience that may influence surgery durations, we include dummy variables indicating the year, month, and day of the week that the operation took place.
Individual Average Experience
We control for the average experience—measured in number of operations—of the team members other than the lead surgeon. For each team member (excluding the lead surgeon), we calculate the number of times she appears in any operation prior to the current one (not including the current one) since the beginning of our dataset. We then take the sum and divide by the number of team members (excluding the lead surgeon).
Indicators for Severity of the Case
As mentioned, the hospital labels each patient's case mild, medium, or severe. The “medium” category appears most often in our sample, so we use it as the baseline and include two dummy variables in all models: “severe” and “mild,” each equal to one if the hospital has labeled the patient as such and zero otherwise. We expect “severe” cases to generally last longer than the other two categories, as indicated by Table A1 in Appendix S1.
Age
For every patient we control for age since operations may last longer for older patients.
Male
Finally, we also control for the gender of each patient by including a dummy variable equal to one if the patient is male and zero otherwise.
Calculation of Key Independent and Control Variables
Table 1 shows how we calculate Focal Experience, Concurrent Variety, and Non‐concurrent Variety for each surgeon in our sample. As mentioned, one operation may include more than one surgery type. That is, while the most typical case is to perform only one type of surgery (e.g., only a CABG or only a valve replacement), a good number of operations include more than one surgery type (e.g., a CABG plus a valve replacement), depending on the patient's medical requirements. It is also worth noting that when we calculate Non‐concurrent and Concurrent Variety, we remove the corresponding recent varieties to avoid confounding our variables.
At time t_1, the surgeon performs an operation that includes a CABG and a valve repair. Then in our first observation at time t_2 (our observations are the operations that include only a CABG surgery), her score for Focal Experience will be equal to 1, since she has conducted one CABG surgery prior to t_2. Similarly, her score for Concurrent Variety will be equal to 1, because she has conducted a valve repair and CABG concurrently prior to t_2. Finally, her score for Non‐concurrent Variety will be equal to 0, since she has not yet conducted any other type of surgery individually (non‐concurrently). Then at time t_3 she performs an operation that includes only a valve repair. In our next observation for our study at time t_4, her score for Focal Experience will become 2, because she has conducted two CABG surgeries (at times t_1 and t_2) prior to t_4. Her score for Concurrent Variety will remain 1, and her score for Non‐concurrent Variety will now become 1 because she has conducted an individual valve repair surgery on a patient prior to t_4 (i.e., at time t_3).
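The four‐operation example above can be reproduced with a short sketch. It is an illustration only: the `Operation` class and `experience_scores` function are hypothetical names, operations are assumed to be supplied in chronological order, and the removal of recent varieties from the cumulative counts is deliberately left out to keep the example aligned with the walkthrough.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    label: str                 # time label, e.g. "t1"
    surgery_types: frozenset   # surgery types performed in this operation

def experience_scores(history):
    """Return {label: (focal, concurrent, non_concurrent)} for each CABG-only operation.

    Every score counts only operations performed strictly before the observation.
    """
    focal = concurrent = non_concurrent = 0
    scores = {}
    for op in history:  # assumed chronological
        if op.surgery_types == frozenset({"CABG"}):
            # CABG-only operations are the observations; record scores first.
            scores[op.label] = (focal, concurrent, non_concurrent)
        if "CABG" in op.surgery_types:
            focal += 1                      # any operation that includes a CABG
            if len(op.surgery_types) > 1:
                concurrent += 1             # CABG together with another type
        else:
            non_concurrent += 1             # other type only
    return scores

history = [
    Operation("t1", frozenset({"CABG", "valve repair"})),
    Operation("t2", frozenset({"CABG"})),
    Operation("t3", frozenset({"valve repair"})),
    Operation("t4", frozenset({"CABG"})),
]
scores = experience_scores(history)
print(scores)
```

The scores at the two observations match the walkthrough: (1, 1, 0) at the first CABG‐only operation and (2, 1, 1) at the second.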
Empirical Approach and Results
We test our hypotheses using a fixed‐effects panel regression with AR(1) disturbance based on Baltagi and Wu (1999). That is, we examine the effect of concurrent and non‐concurrent variety for lead surgeon i on the duration of CABG operation j. Our panel data structure, in which surgeons perform operations over time, poses several challenges. First, there may be unobserved heterogeneity between surgeons in areas such as ability, education, or experience, and this heterogeneity may bias our results. Second, there is potential serial correlation between operations that were performed by the same lead surgeon close in time. In fact, when we use the Durbin‐Watson test (Durbin 1954), we find that autocorrelation in our sample is likely to be first‐order (i.e., AR(1)). Third, we have an unbalanced panel structure with unequally spaced observations over time. Our fixed‐effects regression with AR(1) disturbance based on Baltagi and Wu (1999) (i.e., the xtregar, fe command in Stata) explicitly takes into account all these issues and eliminates all unobserved time‐invariant individual effects of lead surgeons.
Also, our analyses of the distribution of dependent and continuous independent variables (ladder and gladder commands in Stata) reveal skewed distributions. Consequently, and in line with our theoretical arguments and previous studies of the use of conventional learning‐curve models to examine completion times (Argote 1999, Reagans et al. 2005), we take the logarithm of all these variables (apart from age). We also confirm the normality of residuals and check for heteroscedasticity by using the Breusch–Pagan test (1979), which does not reject the null hypothesis, thereby confirming that heteroscedasticity does not pose a threat to our analyses.
Our model is the following:
In the above model, α_i represents unobserved lead surgeon characteristics such as education and background. Our fixed‐effects estimation eliminates α_i by de‐meaning the variables using the within transformation. Also, in the model, t_j represents the time effect, indicating the year, month, and day of the week that the operation takes place. In addition, u_ij is the serially correlated error term, with |ρ| < 1 being the first‐order autocorrelation coefficient, and e_ij is independent and identically distributed with zero mean and constant variance.
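For reference, one way to write a specification consistent with the definitions in this section is sketched below. The coefficient labels (β, γ), the abbreviations NCV, CV, RNCV, and RCV for Non‐concurrent Variety, Concurrent Variety, and their recent counterparts, the control vector X, and the exact functional form of the interaction terms are notational assumptions rather than details taken from the original display.

```latex
\begin{align*}
\ln(\mathit{Duration}_{ij}) &= \alpha_i
  + \beta_1 \ln(\mathit{NCV}_{ij})
  + \beta_2 \ln(\mathit{CV}_{ij})
  + \beta_3 \ln(\mathit{RNCV}_{ij})
  + \beta_4 \ln(\mathit{NCV}_{ij}) \times \ln(\mathit{RNCV}_{ij}) \\
&\quad + \beta_5 \ln(\mathit{RCV}_{ij})
  + \beta_6 \ln(\mathit{CV}_{ij}) \times \ln(\mathit{RCV}_{ij})
  + \boldsymbol{\gamma}^{\top} \mathbf{X}_{ij} + t_j + u_{ij}, \\
u_{ij} &= \rho\, u_{i,j-1} + e_{ij}, \qquad |\rho| < 1 .
\end{align*}
```

Here X collects the control variables (Focal Experience, Team Size, Individual Average Experience, the severity indicators, Age, and Male). Under this sketch, Hypotheses 1 and 2 correspond to β_1 > 0 and β_2 < 0, and Hypotheses 3 and 4 to β_4 > 0 and β_6 < 0.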
Table 2 shows descriptive statistics and correlations among the dependent, independent, and control variables for the lead surgeons. Table 3 shows the results for all our hypotheses for the lead surgeons. Due to the AR(1) covariance structure we employ, the number of observations in the table is equal to 3261. In Model 3.1 we include only our control variables. As expected, focal experience and average individual experience have negative and significant coefficients, suggesting that they reduce completion times, whereas team size increases duration. In addition, compared with the baseline group of medium, severe operations take longer and mild operations are shorter in duration.
Table 2. Descriptive Statistics for Lead Surgeons
Notes: +, * and ** denote significance at 10%, 5%, and 1% levels, respectively. Logged values of all variables except Severe, Mild, Age and Male.
Table 3. Regression of Task Variety on Surgery Duration for Lead Surgeons
Note: +, * and ** denote significance at 10%, 5%, and 1% levels, respectively.
In Model 3.2 (Table 3), we add our first variable of interest: Non‐concurrent Variety. The adjusted R² increases by 3.45%, and an F‐test shows that Model 3.2 is superior to Model 3.1 (p < 0.01). Non‐concurrent Variety has a positive and significant coefficient (p < 0.01), providing support for our first hypothesis. In Model 3.3 (Table 3) we add Concurrent Variety. The adjusted R² increases by a further 8.00%, and an F‐test shows that Model 3.3 is superior to Model 3.2 (p < 0.01). Concurrent Variety has a negative and significant coefficient (p < 0.01), supporting Hypothesis 2.
In Model 3.4 (Table 3) we add the interaction term Non‐concurrent Variety × Recent Non‐concurrent Variety and the variable Recent Non‐concurrent Variety. The adjusted R² increases by a further 1.85%, and an F‐test indicates that Model 3.4 is superior to Model 3.3 (p < 0.05). We also see that Non‐concurrent Variety × Recent Non‐concurrent Variety is significant (p < 0.01) and positive, providing support for Hypothesis 3. Finally, in Model 3.5 (Table 3) we include the interaction term Concurrent Variety × Recent Concurrent Variety and the variable Recent Concurrent Variety. The adjusted R² increases by a further 1.82%, and an F‐test indicates that Model 3.5 is superior to Model 3.4 (p < 0.05). Concurrent Variety × Recent Concurrent Variety is significant (p < 0.05) and negative, providing support for Hypothesis 4.
According to Model 3.3 (Table 3), an increase of 20% in Non‐concurrent Variety multiplies expected duration by e^(0.312 × ln(1.2)) = 1.0585; that is, duration increases by 5.85%. The average duration in our sample is 297.77 minutes. Hence, this increase translates into 5.85% × 297.77 = 17.420 minutes. Similarly, a 20% increase in Concurrent Variety decreases duration by 6.03% (17.956 minutes). On average, this 20% increase corresponds to around 32 (= 159.91 × 0.2) additional non‐concurrent operations and 27 (= 136.88 × 0.2) additional concurrent operations for each lead surgeon. In our sample, on average, each lead surgeon performs 0.74 non‐concurrent operations and 0.57 concurrent operations per week. Hence, based on existing surgery frequencies in our sample, it takes a lead surgeon roughly 32/0.74 = 43 weeks and 27/0.57 = 47 weeks to achieve a 20% increase in non‐concurrent and concurrent variety, respectively. These numbers imply that lead surgeons can indeed achieve meaningful productivity improvements within less than a year.
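The arithmetic behind these effect sizes can be checked with a brief sketch. It uses only figures reported in the text (the 0.312 coefficient, the 297.77‐minute mean duration, the mean variety counts, and the weekly frequencies); the function and variable names are illustrative, and the rounding to whole operations before dividing follows the calculation as presented above.

```python
import math

def duration_multiplier(elasticity, pct_increase):
    """Multiplier on expected duration in a log-log model for a given relative increase."""
    return math.exp(elasticity * math.log(1.0 + pct_increase))

mean_duration = 297.77                      # minutes, sample average
mult = duration_multiplier(0.312, 0.20)     # Non-concurrent Variety, +20%
extra_minutes = (mult - 1.0) * mean_duration

def weeks_to_increase(mean_count, per_week, pct_increase=0.20):
    """Weeks needed to add pct_increase of the mean count at the observed weekly rate."""
    operations_needed = round(pct_increase * mean_count)  # whole operations, as in the text
    return operations_needed / per_week

weeks_non_concurrent = weeks_to_increase(159.91, 0.74)
weeks_concurrent = weeks_to_increase(136.88, 0.57)

print(f"multiplier: {mult:.4f}, extra minutes: {extra_minutes:.2f}")
print(f"weeks: {weeks_non_concurrent:.0f} (non-concurrent), {weeks_concurrent:.0f} (concurrent)")
```

Because the model is log‐log, the coefficient acts as an elasticity, which is why the 20% increase enters through ln(1.2) rather than being multiplied by the coefficient directly.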
One limitation of our fixed‐effects AR(1) regression based on Baltagi and Wu (1999) is that it is not possible to report robust standard errors. The Breusch–Pagan test (1979) reveals no heteroscedasticity, but, nevertheless, we additionally run fixed‐effects regression with robust standard errors (xtreg fe with robust option) and test all our hypotheses. Tables A8 and A9 in Appendix S1 show the results for our hypotheses using this alternative model. The results for all our hypotheses remain the same in terms of significance and support and are close in terms of coefficients.
Robustness Checks
We perform several additional analyses to examine the robustness of our results and to rule out potential alternative explanations.
Data and Variables’ Operationalization
As in any empirical study, our dataset is limited. A potential concern is that we have no data for surgeries or surgeons prior to the beginning of our dataset, which may influence our results. To address this, we repeat our analysis after excluding different time intervals from the beginning of our dataset. That is, we remove the first 12 months and 24 months, and we calculate our dependent variable and all independent variables using the remaining data (e.g., when we remove 24 months, we calculate surgery durations using only those surgeries that took place between months 24 and 87, and when calculating concurrent exposure to variety, for instance, we similarly use all operations between months 24 and 87). We then repeat our analysis. The results for all our hypotheses remain the same qualitatively (i.e., the signs of the coefficients for our variables of interest remain the same, and the coefficients remain significant at either the 1% or 5% level). Finally, Lapré and Tsikriktsis (2006) suggest that studies examining a learning effect a long time after the beginning of the learning curve can eliminate potential bias by using a log‐linear model. Consequently, we repeat our analysis for H1 and H2 by using a log‐linear model and find full support for both hypotheses. We therefore believe that missing experience (i.e., missing data prior to the beginning of our dataset) does not pose a threat to our main findings.
We also test the sensitivity of our results by repeating our analysis after removing the first 12 months of our dataset only for the dependent variable (about 14% of our initial observations). That is, while we include our entire dataset when calculating the independent variables, in this case the set of observations (and hence the dependent variable) starts after month 12. This way, we investigate the robustness of our results when there is missing data for our main independent variables at the beginning. While the magnitudes of the effects change as expected, our primary findings remain qualitatively the same.
We also replace Average Individual Experience of the rest of the team members with a variable called Average Individual Direct Experience, which captures the number of times each non‐surgeon team member has conducted the focal type of operation (CABG) since the beginning of our dataset. Our results remain the same.
One may also be concerned that our observed effects may only work on single focal operations (CABG‐only) and may not hold for non‐focused operations that involve more than just CABG. In order to examine this, we repeat our analysis after changing the way we define our variables. Specifically, we define as focal experience all CABG and valve repair/replacement surgeries (the second most frequent operation type in our sample) and calculate Focal Experience, Non‐concurrent Variety, and Concurrent Variety accordingly. Specifically, we use all operations that include a CABG or a valve repair/replacement when calculating Focal Experience, all operations that include neither CABG nor valve repair/replacement when calculating Non‐concurrent Variety, and all operations that include both CABG and valve repair/replacement and another type when calculating Concurrent Variety. We repeat our analysis and find support for all our hypotheses. This confirms that the effects of non‐concurrent and concurrent exposure to variety hold in operations that include more than a CABG.
Moreover, we repeat our analysis using all operations (the whole sample) as observations and not just the CABG operations. We do so because an opposite effect of Non‐concurrent Variety in the whole sample would imply very different managerial implications for organizing non‐concurrent variety. In this new analysis on the whole sample, we define and calculate our variables of interest according to the operation currently being performed. If, for example, the observation is a single valve operation, then we define all valve operations in the past as focal experience, all non‐valve operations in the past as non‐concurrent variety, and all operations that include a valve operation and a second type as concurrent variety. Then we perform a single analysis on our whole dataset and find support for our hypotheses. Specifically, our results remain similar to those we obtained with the CABG‐only sample, though with smaller effect sizes for both Non‐concurrent Variety and Concurrent Variety. Hence, we believe that our results do not depend on the specific choice of focal operation.
Furthermore, one may argue that some non‐CABG operations are more similar to CABG than others and that this similarity might drive the results regarding the effects of concurrent and non‐concurrent variety. In order to address this, we group non‐CABG operations into two broad categories: (i) those that are more related to CABG and (ii) those that are less related to CABG, based on previous medical and management literature. In the former category, we include only valve operations, and in the latter category, we include the rest of the non‐CABG operations. Medical literature has pointed out the overlap between CABG and valve operations in terms of the surgery process (Ch'ng et al. 2015, Kouchoukos et al. 2012), an assessment echoed by management studies that highlight the similar set of moves during both surgeries (Pisano et al. 2001). On the other hand, other types of surgeries such as tumour removal or congenital surgery have been suggested to be quite different from CABG in terms of the steps and processes involved (Kouchoukos et al. 2012). For our analysis, we break down non‐concurrent variety into more‐related non‐concurrent variety and less‐related non‐concurrent variety, and similarly concurrent variety into more‐related concurrent variety and less‐related concurrent variety. Next, we repeat our analysis for the lead surgeons using only more‐related non‐concurrent variety and more‐related concurrent variety in one model and only less‐related non‐concurrent variety and less‐related concurrent variety in another model. We report corresponding results in Tables A10 and A11 in Appendix S1. Our results are consistent in both models, and they remain similar to the original results (i.e., without breaking down non‐CABG operations), providing further support for our hypotheses on the effects of concurrent and non‐concurrent variety.
That is, regardless of being more related or less related to CABG, concurrent non‐CABG operations are associated with decreased duration, whereas non‐concurrent non‐CABG operations are associated with increased duration.
Finally, while our variables focal experience, concurrent variety, and non‐concurrent variety capture the experience of a surgeon during the time interval of our dataset, a potential concern could be that we do not specifically account for potential tenure/experience differences of surgeons and associated unobserved factors (e.g., familiarity with the hospital environment), as we do not have detailed personnel data. In order to address this, we conduct additional analyses in which we only consider the subsample of operations which only includes the lead surgeons who have been at the hospital from the beginning of our dataset. This way, we are able to minimize and control for any potential experience‐based differences across surgeons, as these surgeons all have been at the hospital for more than 7 years, suggesting they are all highly familiar with the hospital environment. Our results with this more conservative analysis using the subsample provide full support for all our hypotheses (Table 4).
Table 4. Regression of Task Variety on Surgery Duration for Lead Surgeons with Subsample
Note: +, * and ** denote significance at 10%, 5%, and 1% levels, respectively.
Surgery Assignments
Two types of potential selection issues may bias our results. The first is associated with patients’ selection of specific doctors, and the second is linked to doctors’ selection or avoidance of certain patients or surgery types. One major advantage of our setting, a private European hospital, is that patients by and large are referred to the hospital and not to a specific surgeon in the hospital. The hospital then assigns patients to surgeons. This mechanism, in which the hospital plays a major role in assignment, considerably alleviates patient‐driven or surgeon‐driven selection concerns, as patient/surgeon assignments are plausibly random.
However, this mechanism may arguably give rise to another potential selection concern in that the hospital may assign certain cases to certain doctors (e.g., more severe cases to more skilled, expert surgeons or easier cases to less experienced surgeons). If this is the case, then it could bias our results. This is unlikely to be the case in our setting, because all surgeons at this highly reputed private hospital are expected to be highly skilled and experienced and are expected to readily perform all types of surgeries. Our interviews with hospital administration and staff confirmed this. We also examine this concern empirically. Specifically, we first investigate the distribution of severe cases among surgeons and do not observe any patterns. Second, we conduct a chi‐square test to examine whether the mild, medium, and severe cases are evenly spread across lead surgeons. The results (Table A5 in Appendix S1) indicate that there is no significant difference across lead surgeons in terms of severity of assignments. Next, we repeat our analysis after dropping the most critical cases from our sample, which include patients who died in the hospital after the operation, and find qualitatively the same results. We also investigate the spread of deaths among surgeons and examine any potential correlations of these deaths with our key variables. We find that these in‐hospital deaths are spread evenly across surgeons and show no correlation with our key variables other than the “severe” patient condition. These analyses suggest that there is no systematic difference across surgeons in terms of severity of the cases being assigned to them.
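As an illustration of the kind of chi‐square test of independence described above, the following minimal sketch computes the test statistic and degrees of freedom for a surgeon‐by‐severity contingency table. The counts are hypothetical and are not the hospital's data; the function name is illustrative, and the statistic is computed by hand with the standard Pearson formula rather than with any particular statistics package.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table: list of rows, each a list of observed counts (one row per surgeon,
    one column per severity category).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Hypothetical counts: rows are three lead surgeons, columns are mild, medium, severe.
observed = [
    [30, 50, 20],
    [28, 54, 18],
    [32, 46, 22],
]
statistic, dof = chi_square_independence(observed)
# With 4 degrees of freedom, the 5% critical value of the chi-square distribution is about 9.488.
evenly_spread = statistic < 9.488
print(f"chi-square = {statistic:.3f}, dof = {dof}, fail to reject at 5%: {evenly_spread}")
```

A statistic below the critical value means the test does not reject the hypothesis that severity is distributed independently of the lead surgeon, which is the pattern the text reports for the actual data.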
A severe case can be more critical than another severe case, and, similarly, one mild case might be easier than another. Because the hospital does not make these kinds of distinctions within severe cases and within mild cases, we further examine this issue: We repeat our analysis after excluding the severe cases (using only the mild and medium ones). Then we repeat our analysis after excluding only the mild cases from our initial sample (using only the medium and severe ones). Finally, we also repeat our analysis using only the medium cases. In all three cases, while the magnitudes of the effects change considerably as expected, our results remain the same (please refer to Tables A12–A17 in Appendix S1 for corresponding results).
Finally, we further investigate the relationship between in‐hospital mortality and duration. Specifically, we create two groups of observations. The first group includes all the operations that resulted in the death of the patient after the completion of the operation (i.e., post‐operation in‐hospital mortality; this is the only post‐operation mortality information available to us). All these operations were characterized as severe. The second group includes all the severe operations that did not result in the death of the patient. In the former group, we do not include the operations that resulted in the death of the patient during the operation, since surgery duration would be biased in these cases. We conduct a t‐test to compare the average duration of these two groups. Our results indicate that there is no significant difference in the duration of these two groups (p = 0.16). Next, we create a more granular matched sample using the Coarsened Exact Matching algorithm described by Iacus et al. (2011), implemented in the Stata command cem. This enables us to compare the duration of the operations in a more closely matched sample. Specifically, we focus only on the severe cases, exclude the cases in which the patient died during the operation, and use as our treatment variable a dummy equal to 1 if the operation resulted in the death of the patient after the operation and 0 otherwise. We also use the covariates of age, gender, lead surgeon, and surgery type for our matching process. Our results indicate that 79% of our observations are matched and that the L1 statistic, a widely accepted measure of global imbalance (Iacus et al. 2011), is reduced by 19.6% (from 0.449 to 0.361), indicating a considerable decrease in imbalance after our matching process.
The t‐test after the matching process indicates no significant difference in duration between severe operations with mortality and severe operations with no mortality (p > 0.10). While this analysis is helpful, we should acknowledge that its extent is limited because our data on patient characteristics are limited and because all mortalities were observed in the severe group.
Instrumental Variables Approach
As discussed in the previous section, we believe that our particular empirical setting (where surgeon/patient assignment is plausibly random) and additional robustness checks address most selection concerns. In addition, our fixed‐effects regressions account for all observed and unobserved time‐invariant heterogeneity across lead surgeons through a demeaning process. We also include a set of important control variables, such as the lead surgeon's focal (CABG) experience, other team members’ average experience, team size, time effects, and indicators of the severity of the case, as well as patient age and gender, which may affect the duration of an operation. These, we believe, address most plausible endogeneity concerns with regard to our findings about the effects of concurrent and non‐concurrent variety on surgery duration.
However, if there are unobserved time‐varying individual factors that affect both lead surgeons’ task variety and their surgery duration at the same time, then this may be a concern. In order to address this potential concern, we conduct an additional instrumental variables analysis with a 2SLS specification. Specifically, we identify and use two plausible instruments for surgeons’ concurrent and non‐concurrent variety based on surgeons’ vacations in our context. We define a vacation as a period of 1 week or more during which the lead surgeon does not appear in our sample and after which she reappears (which means that she is still working at the hospital); we also consider shorter and longer durations and obtain similar results. Our surgeons work solely for our client hospital, so their absence from our sample does not indicate that they may be working in another hospital or in private practice.
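The vacation definition can be made concrete with a minimal sketch. It assumes that a vacation is identified from the gap, in calendar days, between two consecutive operation dates of the same lead surgeon, and that the length of the vacation is taken to be that gap; both are assumptions about details the text does not spell out. The function name and the dates are hypothetical.

```python
from datetime import date

def vacation_spells(operation_dates, min_gap_days=7):
    """Return (last_seen, reappears, gap_days) for each absence of at least min_gap_days.

    A spell is recorded only when the surgeon reappears, so a trailing absence
    at the end of the data is never counted.
    """
    dates = sorted(set(operation_dates))
    spells = []
    for previous, following in zip(dates, dates[1:]):
        gap_days = (following - previous).days
        if gap_days >= min_gap_days:
            spells.append((previous, following, gap_days))
    return spells

# Hypothetical operation dates for one lead surgeon.
operation_dates = [date(2010, 1, 4), date(2010, 1, 6), date(2010, 1, 20), date(2010, 1, 22)]
spells = vacation_spells(operation_dates)
print(spells)
```

Varying `min_gap_days` corresponds to the shorter and longer vacation thresholds mentioned above.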
The first instrument we use is Others’ vacation. For the focal lead surgeon i and operation j, this is calculated as the total number of vacation days of all other lead surgeons in the hospital from the time lead surgeon i joined the hospital until the current operation j. The idea is that the higher the number of days other surgeons are on vacation, the greater the expected number of concurrent and non‐concurrent tasks performed by the focal surgeon (e.g., in others’ absence, there is a plausible increase in operations performed by the focal surgeon, as absent surgeons’ operations will be distributed among present ones). We can empirically verify this by checking the first‐stage regression. However, other surgeons’ previous days on vacation should not have any direct influence on the focal surgeon's current surgery duration, except through increasing her concurrent and non‐concurrent task variety. That is, we expect this instrument to satisfy the relevance and exclusion conditions and have a significant positive effect on concurrent and non‐concurrent task variety in the first stage.
The second instrument we use is the focal surgeon's vacation days in the past year, excluding the most recent 3 months. For ease of reference, we call this variable Days of vacation. For the focal lead surgeon i and operation j, this is calculated as the total number of vacation days of the focal surgeon i during the 1 year before operation j, except the most recent 3 months. We expect that the higher the surgeon's number of days spent on holiday in the past year, the lower her concurrent and non‐concurrent task variety, as the surgeon cannot perform operations during these periods. Also, in order to eliminate any possible direct effect of a surgeon's holiday on her surgery duration (e.g., warming up, adjustment periods), we exclude the most recent 3 months (which is the mean inter‐holiday time of surgeons in our sample) from our calculation of the instrument. That is, while this instrument helps explain variation in a surgeon's concurrent and non‐concurrent variety, it should not have any direct influence on her current surgery duration, therefore satisfying the exclusion condition.
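A minimal sketch of how the Days of vacation window could be computed is given below. It assumes that the past year is measured as 365 days, that the most recent 3 months are measured as 90 days, and that vacation spells are given as half‐open (start, end) date pairs; these choices, the function name, and the dates are illustrative assumptions rather than details from the study.

```python
from datetime import date, timedelta

def days_of_vacation(spells, operation_date, lookback_days=365, exclude_recent_days=90):
    """Vacation days falling in the past year, excluding the most recent months.

    spells: iterable of (start, end) dates, treated as half-open intervals.
    Only the part of each spell inside the window
    [operation_date - lookback_days, operation_date - exclude_recent_days) is counted.
    """
    window_start = operation_date - timedelta(days=lookback_days)
    window_end = operation_date - timedelta(days=exclude_recent_days)
    total = 0
    for start, end in spells:
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += (overlap_end - overlap_start).days
    return total

# Hypothetical vacation spells for the focal surgeon.
spells = [
    (date(2009, 12, 25), date(2010, 1, 5)),   # straddles the start of the window
    (date(2010, 7, 1), date(2010, 7, 15)),    # fully inside the window
    (date(2010, 12, 1), date(2010, 12, 10)),  # within the excluded recent period
]
instrument = days_of_vacation(spells, operation_date=date(2011, 1, 1))
print(instrument)
```

In this example only the portion of the first spell that falls inside the window and the whole of the second spell contribute; the third spell is dropped because it lies within the excluded recent period.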
First, we examine the first‐stage regression to verify the relevance of our instruments. The coefficient of Others’ Vacation is significant and positive, whereas that of Days of Vacation is significant and negative in the first‐stage regressions on Concurrent Variety and Non‐Concurrent Variety, with a high F‐statistic and high adjusted R² for the corresponding models. Specifically, Table 5 shows the results of our IV approach, where we use both Others’ Vacation and Days of Vacation as instruments for Non‐Concurrent Variety and Concurrent Variety in a two‐stage estimation. Our instruments are significant in the first‐stage regressions, with F‐statistics well above 10 (equal to 66.03 and 78.87, respectively).
Table 5. Regression of Task Variety on Surgery Duration with Instrumental Variables Approach
Note: +, * and ** denote significance at 10%, 5%, and 1% levels, respectively.
In addition, the values of the Kleibergen‐Paap Wald statistic for weak instruments and of the underidentification test (Kleibergen‐Paap LM statistic) indicate that our instruments indeed have power and satisfy the relevance condition. 13
After presenting the first stage, we next focus on our second‐stage estimation results with the instrumental variables approach. Second‐stage estimation (Model 5.3) in Table 5 provides support for both H1 and H2. Specifically, Non‐concurrent Variety has a positive and significant coefficient (p < 0.01), and Concurrent Variety has a negative and significant coefficient (p < 0.01). These results are similar to our previous findings without the use of instruments. 14
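For readers less familiar with the mechanics, the two‐stage logic can be illustrated with a minimal sketch on synthetic data. This is not our estimation code or our data: the function name, the simulated coefficients, and the data‐generating process are assumptions chosen purely for illustration, and the sketch omits robust standard errors and any additional covariates.

```python
import numpy as np


def two_stage_least_squares(y, endog, instruments):
    """Return 2SLS coefficients ordered as [intercept, endog_1, ..., endog_k]."""
    n = len(y)
    const = np.ones((n, 1))
    Z = np.column_stack([const, instruments])
    # First stage: regress each endogenous regressor on the instruments.
    first_stage, *_ = np.linalg.lstsq(Z, endog, rcond=None)
    endog_hat = Z @ first_stage
    # Second stage: regress the outcome on the first-stage fitted values.
    X_hat = np.column_stack([const, endog_hat])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


# Synthetic data: two instruments, two endogenous "variety" measures, and an
# unobserved confounder u that affects both the regressors and the outcome.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
variety_a = 1.0 * z[:, 0] - 0.5 * z[:, 1] + u + rng.normal(size=n)
variety_b = 0.5 * z[:, 0] - 1.0 * z[:, 1] + u + rng.normal(size=n)
endog = np.column_stack([variety_a, variety_b])
# Simulated "true" effects on the outcome: +0.3 and -0.2.
y = 2.0 + 0.3 * variety_a - 0.2 * variety_b + u + 0.5 * rng.normal(size=n)

beta_iv = two_stage_least_squares(y, endog, z)
X_ols = np.column_stack([np.ones(n), endog])
beta_ols, *_ = np.linalg.lstsq(X_ols, y, rcond=None)
print("2SLS:", beta_iv.round(3))
print("OLS: ", beta_ols.round(3))
```

In this simulated setting, ordinary least squares is biased by the unobserved confounder, whereas the two‐stage estimates land close to the simulated coefficients. The point estimates from this manual procedure are valid, but standard errors computed naively from the second stage would not be; dedicated IV routines correct for this.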
In the above IV analysis, similarly to what we discussed at the end of section 6.1, a potential concern (especially with regard to the instrument Others’ vacation) could be that we do not specifically account for potential tenure/experience differences of surgeons and associated unobserved factors (e.g., familiarity with the hospital environment), as we do not have detailed personnel data. We therefore repeat our IV analysis on the subsample of operations that include only the lead surgeons who have been at the hospital from the beginning of our dataset. Our results with this analysis provide full support for H1 and H2 (Table 6).
Table 6. Regression of Task Variety on Surgery Duration with Instrumental Variables Approach with Subsample
Note: +, * and ** denote significance at 10%, 5%, and 1% levels, respectively.
Discussion and Conclusions
Professional service firms are the epitome of the increasingly knowledge‐based economies of the world. While there is growing interest in the study of professional services in the management literature (Gardner et al. 2008, Greenwood et al. 2005, Hinings and Leblebici 2003, Maister 1993), the study of professional service work from an operations standpoint has been quite limited (Lewis and Brown 2012, Roth and Menor 2003). Despite the clear importance of such white‐collar professions to the economies of developed countries, their operations are much less understood than the operations of blue‐collar work (Hopp et al. 2009).
Learning is a critical component of white‐collar work (Argote and Ingram 2000). From an operations perspective, the essential issue with regard to learning in white‐collar settings is how it affects performance (Hopp et al. 2007). While past experience is generally associated with an increased learning rate at the individual level (KC and Staats 2012, Narayanan et al. 2009, Staats and Gino 2012), some experiences can also have a negative effect on individual productivity (Allport et al. 1994, Argote and Miron‐Spektor 2011, Lapré 2011, Lapré and Nembhard 2010). Our goal in this study was to explore the effects of a specific kind of experience, namely experience from performing related tasks (i.e., related variation), on individual learning and focal productivity when the related tasks are performed concurrently vs. non‐concurrently with the focal task. We find that while concurrently performing another task over time enhances the productivity of the focal task, performing another task non‐concurrently reduces this productivity. We also introduce recent exposure to variety as an important moderator which amplifies the relationship between variety and productivity.
While the extant literature has recognized task variety as a potentially important driver of individuals’ task performance, evidence on the productivity implications of task variety has been mixed. It is only recently that researchers have started to disentangle more nuanced elements of task variety and investigate their effects on individual productivity. In an experimental study, Schilling et al. (2003) make an important distinction between related and unrelated task variation and show that the learning rate under conditions of related variation is significantly greater than under conditions of unrelated variation. In two other recent studies, Staats and Gino (2012) investigate the productivity implications of exposure to variety in the long term vs. the short term, and KC and Staats (2012) introduce subtask variety (within‐task variety) as an important driver of individual performance. Although these studies have considerably improved our understanding of task variety's effects on individual productivity, several elements of task variety, such as the relationship between focal tasks and varied tasks and their impact on productivity, remain much less well understood. Our study sought to contribute to this line of inquiry by focusing on when the varied task has been performed with regard to the focal task. That is, we introduce the way different tasks are performed (concurrently vs. non‐concurrently) as an important dimension to consider in understanding the effect of task variety on productivity.
Our extensive dataset, which covers a time interval of more than 7 years, allowed us to investigate the influence of exposure to variety on individual learning and productivity over time. In addition, we were able to observe a variety of surgeries performed in different configurations (e.g., CABG only, other type only, CABG and other type) which are driven primarily by exogenous (i.e., medical) factors.
Surgeons have been identified as typical 21st‐century knowledge workers (KC and Staats 2012) in that they experience constant learning throughout their careers. Accordingly, our results can provide useful insights for other professional service workers with regard to task allocation and timing strategies. Our findings are particularly relevant to settings that are characterized by high levels of worker discretion and control over how to conduct a variety of tasks. Our results suggest that performing other related tasks concurrently with a focal common task can improve individuals’ learning and productivity over time. Specifically, a 20% increase in concurrent exposure to other related tasks can decrease the time an individual needs to perform the focal task by 3.69% (which, in the case of our surgeons, translates to around 16 CABG operations per year in terms of average time savings). According to the American Heart Association (Roger et al. 2011), a CABG costs on average $117,094. Hence, performing 16 more CABG operations may correspond to up to $1,873,504 per year in financial terms. 15 On the other hand, we observe that performing related tasks in isolation (non‐concurrently) does not provide such learning and productivity‐improvement opportunities. Instead, such non‐concurrent variety could be detrimental to productivity due to negative transfer effects. Our results suggest that a 20% increase in non‐concurrent exposure to variety can increase average surgery duration by 5.07%, which for our surgeons translates to around 23 CABG operations per year in terms of average time lost. Similarly, forgoing 23 CABG operations can translate to $2,693,162 per year.
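As a rough check on the arithmetic behind the two dollar figures in this paragraph, the following sketch multiplies the number of operations by the average CABG cost quoted from Roger et al. (2011). The constant and function names are hypothetical, and the caveat in footnote 15 about translating time into financial terms applies equally here.

```python
# Average cost of a CABG quoted in the text (Roger et al. 2011), in US dollars.
AVERAGE_CABG_COST_USD = 117_094


def annual_financial_equivalent(operations_per_year,
                                cost_per_operation=AVERAGE_CABG_COST_USD):
    """Upper-bound dollar equivalent of a number of CABG operations per year
    (hypothetical helper; simply operations times average cost)."""
    return operations_per_year * cost_per_operation


print(annual_financial_equivalent(16))  # 1873504
print(annual_financial_equivalent(23))  # 2693162
```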
To maximize individual learning and improve productivity in common tasks, therefore, knowledge workers may consider pairing their most common task(s) with other related tasks and try performing them concurrently as much as possible. In addition, our results on the moderating role of recent exposure to variety suggest that short‐term exposure to variety may matter for subsequent task productivity. That is, recent variety amplifies the respective influences of concurrent and non‐concurrent variety on subsequent task productivity. Therefore, when individuals are devising strategies to improve their productivity on tasks they perform frequently, it may help to consider tasks that they have carried out both in the long term and recently.
Like all empirical studies, our study has limitations. First, although we have highly granular longitudinal data on surgery operations in terms of their type, constituents, duration, surgeon team, and other variables, we have limited information about patients’ conditions before the operation. Specifically, our hospital groups patients into mild, medium, and severe categories by taking into account various clinical factors, such as their medical conditions and surgery requirements. Admittedly, this is quite a crude characterization, and there is likely variation within the three groups (e.g., some cases might be more severe than others despite being in the same severe category). Ideally, we would like to have more detailed information about patients’ conditions, such as their EuroScore or Higgins score (KC et al. 2013), but this was not available in our dataset. In examining the role of task variety, future work could combine detailed surgery data with more granular patient characteristics to develop more comprehensive risk and outcome measures for patients.
In addition, we should acknowledge that our dataset greatly limits our ability to examine quality (e.g., mortality) implications of task variety or potential associations between surgery duration and in‐hospital mortality, since our dataset only includes 1‐month in‐hospital mortality information and all mortalities come from the “severe” patient group. Consequently, like other studies (e.g., Edmondson et al. 2003, Pisano et al. 2001, Reagans et al. 2005), we only focused on surgery duration, as we were unable to simultaneously explore other performance dimensions such as mortality. We used our limited in‐hospital mortality data to investigate the relationship between duration and mortality rates. Our analysis, despite its limitations, suggested that shorter durations do not come at the cost of increased mortality rates. This is also consistent with previous medical research, which in fact indicates that shorter durations could be associated with better clinical outcomes as a result of reduced risk of infections (Gaynes et al. 2001, Gibbons et al. 2011). Future research could examine how concurrent and non‐concurrent exposures to variety affect other performance dimensions, such as quality. If certain task configurations provide better learning opportunities and result in higher‐quality outcomes for knowledge workers, identifying and examining these would provide important contributions to the operations management literature.
Also, while we believe that our particular hospital setting and our empirical approach with extensive robustness analyses considerably alleviate surgeon/patient selection concerns and we rule out many alternative explanations of our findings, in the absence of fully randomized assignment of tasks or exogenous shocks to concurrent and non‐concurrent variety, our results should be interpreted with caution. Past research, for example, has highlighted potential motivational benefits from exposure to variety (Hackman and Oldham 1976, Herzberg 1968, Staats and Gino 2012), which in our case may limit the detrimental effects of non‐concurrent variety. In our setting, we do not expect to observe many motivational gains: Cardiac surgery settings are high‐pressure and dynamic environments (Edmondson 1999, Nembhard and Edmondson 2006, Tucker and Edmondson 2003, Wetzel et al. 2006); therefore, there is not much motivational gain to be realized as a result of exposure to variety (KC and Staats 2012). This, of course, might not be the case for settings with more repetitive and routine tasks.
Building on our insights about concurrent and non‐concurrent variety and considering our work's limitations, we next discuss potential avenues for future research.
In our study, we only considered the timing of task variety; however, an important discretionary responsibility of many knowledge workers involves the sequencing of tasks. On this front, recent work examines drivers and implications of deviating from a prescribed sequence (Ibanez et al. 2017). Additional research could explore, for example, whether the negative effects of non‐concurrent variety could be mitigated in certain task‐sequencing configurations.
We believe that the term “concurrent variety,” introduced in our study, is a promising and germane concept for operations management, as it can extend existing theories of learning and knowledge transfer considerably. Many knowledge workers use discretion in deciding whether or not to perform various tasks concurrently. While the boundaries and durations of tasks are quite clear in our surgery setting, they can be less so in many other knowledge‐intensive contexts. For example, a research scientist may or may not prefer working concurrently on multiple tasks or projects, each of which may also be at a different stage (e.g., working on an early‐stage proposal draft, completing a report, revising a study). It would be interesting to explore how learning and productivity implications of such choices may be associated with concurrent variety.
Also, in our setting, the typical duration of a focal task was approximately five hours, and the maximum number of concurrent tasks was three. Future research could consider, for example, implications of concurrent variety when a larger number of concurrent tasks are performed. As the number of concurrent tasks increases, the positive effect in the form of implicit learning from concurrent variety could potentially be reduced as a result of increased coordination issues. Future research could also explore identifying an optimal number of concurrent tasks in various knowledge‐work contexts to maximize learning and productivity.
Moreover, we only focus on timing of different tasks (concurrent vs. non‐concurrent), and, other than a simple robustness check with two categories involving more‐related and less‐related non‐CABG operations, we do not consider the degree of structural similarity or difference between tasks, as we did not have a suitable way to accurately quantify this. Both the main effect of the degree of similarity between tasks on productivity and its interaction with concurrent and non‐concurrent variety could be interesting to explore further. For example, the degree of structural similarity or difference between tasks may moderate the relationship between concurrent/non‐concurrent variety and productivity. Also, interaction with task complexity could have an effect on concurrent learning and individual productivity.
Although surgeons are characterized as a good example of 21st century knowledge workers in the extant literature and cardiac operations provided us with a very suitable setting in which to study the role of task variety in highly cognitive and knowledge intensive yet repetitive tasks, surgery settings could be unique in terms of the nature of the tasks, their life‐and‐death outcomes, and surgeons’ learning mechanisms. So, one has to be cautious in generalizing our findings to other professional service settings. It would be interesting to see future research on this subject and whether and to what extent our contrasting findings about concurrent and non‐concurrent variety would hold in other professional service settings such as law, consulting, and research and development.
Finally, our focus in this study has been the productivity implications of surgeons’ learning from concurrent and non‐concurrent task variety. For this reason, we are interested not in the entire surgical team but rather in individual surgeons and how they learn from task variety that takes place concurrently vs. non‐concurrently. This is because, while different surgery types will lead surgeons to perform a different set of steps and procedures during the surgery, tasks and activities for other team members (e.g., anesthesiologist preparing the patient, nurses providing the equipment, etc.) are more or less the same in various surgery types. So getting exposure to task variety only makes sense for surgeons. However, learning is a complex phenomenon with many individual, team‐level, contextual, and organizational dynamics. Considering the recent research emphasis on the role of team‐level mechanisms, such as the effect of team experience on learning and performance (Avgerinos and Gokpinar 2017, Huckman and Staats 2011, Reagans et al. 2005), future research could explore the interplay between task and team configurations and their learning implications in less hierarchical team settings with less clear task boundaries, such as management consulting or product development teams.
Footnotes
Acknowledgments
We are grateful to the Department Editor Aleda Roth, the Senior Editor, and the reviewers for their constructive and insightful feedback throughout the review process, which has improved the paper substantially.
1
Note that the question we examine is fundamentally different from that of Staats and Gino (2012), who examine the performance implications of “same‐day different task” and “all prior days’ different tasks.” By contrast, we build on the observation that knowledge workers can perform various tasks either separately (i.e., non‐concurrently) or together (i.e., concurrently) and highlight how this differential way of getting exposure to different tasks influences subsequent productivity on a common focal task.
2
We also investigate potential effects of the differences between non‐CABG operations in the robustness checks section.
3
We use valve operation as the illustrative example because previous literature has highlighted the common occurrence of CABG and valve operations (Ch'ng et al. 2015, Kimiyoshi et al. 2009, Saxena et al.). Indeed, valve repair/replacement is the most frequent concurrent operation with a CABG in our sample.
4
We also repeat our analysis without removing corresponding recent variety and get very similar results.
6
While the interaction effects for Hypotheses 3 and 4 are statistically significant in all models, post hoc plots (Figures A3 and A4 in Appendix S1) reveal that the effect of the moderation is quite modest in terms of economic significance (also note the coefficients of the interaction terms in our models).
7
According to Table A3 of Appendix S1, Non‐concurrent Variety and Concurrent Variety have average values of 159.51 and 136.88, respectively.
8
We include summary statistics for our instruments in Appendix S1.
9
We also repeat our analysis by calculating Others’ vacation using the beginning of our dataset for all lead surgeons and get the same results.
10
Our hospital does not hire temporary surgeons, so in planning the absence of a surgeon, her anticipated operations are distributed among her colleagues.
11
Our choice of 1 year for measuring this instrument is based on the fact that this operationalization gives the strongest negative correlation with concurrent and non‐concurrent task variety in the first‐stage regressions. See, for example, Miguel et al., where instrument operationalization choice is similarly based on the strength of the first‐stage regressions. Our analyses with durations shorter and longer than 1 year (e.g., 9 months and 15 months) reveal similar results.
12
Our rationale is that any direct influence of vacation on surgery duration (e.g., warming up, adjustment) should disappear by the time a surgeon takes another holiday. We also consider slightly longer and shorter terms and obtain results similar in significance and effect size.
13
Please note that because we have robust standard errors, we use these two tests instead of the Cragg‐Donald Wald F statistic and the Anderson canonical correlation LM statistic.
14
15
One should be cautious, of course, when translating these time savings (additional operations) into financial terms, since the total cost of a CABG involves multiple elements including physician fees, hospital/room fees, the costs of drugs and medical tests, etc.
