Anatomy of a Newsvendor Decision: Observations from a Verbal Protocol Analysis

Abstract

An exploratory analysis of verbal protocols from a think‐aloud newsvendor experiment provided deeper insights into the decision‐making process, enabling us to formulate a number of questions that are worth answering in future research. In a think‐aloud experiment, subjects verbalize their cognitions while performing a task; responses are then recorded, transcribed, and analyzed. A majority of the subjects struggled with the abstractness of the business setting and were keen to know information on the product type, industry setting, decisions taken in the past, competitor's situation, etc. A large portion of the participants correctly identified the overage and underage costs, but failed to convert that information into the optimal order quantity. Finally, the bias in the order quantity was significantly influenced by the specific type of risk (overage or underage) that was identified closer to the decision, alluding to the presence of a recency effect. As a first application of verbal protocol analysis to inventory decision making, this study gives us an opportunity to highlight the strengths and weaknesses of this research methodology.

Keywords

newsvendor; verbal protocol analysis; behavioral operations; inventory control; decision making

1. Introduction

After many years of focusing on rigorous mathematical analysis, researchers in operations management have only recently started studying the behavioral aspects of the decision‐making process. Amaral and Tsay (2009) is a recent example of this growing stream of research. While it should not have been a big surprise, it was observed that the decisions made by real people were substantially different from the optimal decisions predicted solely based on mathematics. These observations resulted in a significant change in the way people approached research in operations management. Boudreau et al. (2003) and Bendoly et al. (2006) provide detailed reviews of experimental and behavioral research in operations management. The newsvendor problem, which deals with stocking level decisions in the presence of uncertainty and costs associated with overstocking and understocking, has been a popular domain for behavioral research.

Schweitzer and Cachon (2000), studying end‐point decisions over 30 periods, reported that there was significant anchoring around the mean of the demand distribution and that the subjects always chose a stocking level that was between the mean demand and the optimal quantity. Bolton and Katok (2008) confirmed this phenomenon in experiments over a longer time horizon and further demonstrated that experience and feedback could lessen the anchoring effect of the mean demand. We investigate newsvendor decision making further, attempting to find out more about the thought processes in play as these decisions are made. We use audio recordings and protocol analysis of the resulting transcripts to identify reasons for the reported biases in newsvendor decisions. Our primary objective is to illustrate the use of verbal protocol analysis methodology for exploratory theory development around the newsvendor‐ordering anomaly.

Based on our experience in teaching and research, we broadly divide the newsvendor decision‐making process into the following three stages: (i) the information gathering phase, (ii) the analysis phase, and (iii) the final decision phase. Existing experimental research in newsvendor decisions has primarily focused on the results from the final decision phase. Experiments using computers were conducted en masse and not designed to capture and analyze data from the information gathering and analysis phases. Our exploratory study audiotapes one‐on‐one interactions between subjects and the experimenter, followed by verbal protocol analysis to map the process the subjects go through while making their decisions. The observations reported here, while only directional due to the limitations of the current experimental design, enable us to formulate a number of questions worth pursuing in future research.

Before presenting the details of the experiment, the method, and the results, we summarize our main findings. We observed that subjects tended to focus on the basic information relevant to the decision and did not seek some of the advanced information that would have enabled a better decision. The participants had difficulty dealing with the abstractness of the task at hand and were very keen to obtain information that they could possibly use to anchor their decisions. For example, they asked for information related to product and industry settings, decisions from the past, competitors' actions, and the vendor situation. In the absence of that information, they anchored their decisions around the mean demand, the only significant piece of information available to them. Most of the subjects correctly identified the precise overage and underage costs, but failed to convert that information effectively into the optimal order quantity. This suggests that the percentile calculation (as described on p. 522 of Collier and Evans 2007) is not as intuitive as the operations management community perceives it to be. The risk identified closer to the decision played a major role in the bias of the order quantity chosen by the subjects. Close to the decision, if the subject was focused on the risk associated with excess inventories, then his/her order quantity was lower. On the other hand, if the subject was focused on the risk associated with unsatisfied demand, then his/her order quantity was higher. We position these findings purely as exploratory with the intention that future well‐structured and focused studies will ascertain their validity.

The rest of this paper is organized as follows: The next section introduces the Verbal Protocol Analysis (VPA) methodology and details its strengths and weaknesses. We follow that up with a description of our experimental setup. We then present the results from a protocol analysis of the transcripts. The subsequent section describes the limitations of our study and how they can be alleviated in future studies. We conclude the paper with a summary discussion.

2. VPA Methodology

VPA is one of two (see Ford et al. 1989) methods commonly used to map the cognitive processes in decision making. It can be performed either concurrently (during the performance of a task) or retrospectively (i.e., after the task has been completed). Under both approaches, the participants are urged to verbalize the thought processes underlying their decisions and the resulting audio tapes are transcribed, segmented, and encoded to obtain a trace of the decision‐making process. This approach has been widely applied in the fields of psychology, education, and cognitive science (Ericsson and Simon 1993). Estrada et al. (1997) and Isen et al. (1991) applied it in medical decision making while Isen and Means (1983) used it to study the impact of affect on decision‐making strategy. Despite its investigative power and popularity, VPA methodology has been criticized (Nisbett and Wilson 1977) for its apparent incompleteness and interference. It is perceived that no verbalization can capture all the thoughts that a decision maker's mind goes through, eventually skewing the observations made by the researchers. At the same time, the act of verbalization and the associated scrutiny may compel the decision makers to change the way they perform the task, leading to inaccurate conclusions. Ericsson and Simon (1993) proposed verbal protocol collection methods and analysis procedures to overcome these confounding issues.

While it has never been applied to inventory decision making, VPA studies in the operations management context do exist. Crawford et al. (1999) investigated the work of industrial schedulers, van Wezel and Jorna (2009) studied the shunting operations at the Netherlands Railways, and Sanderson (1996) applied it to process control and transportation. Crawford et al. (1999) acknowledge the advantages of concurrent protocols, but also recognize their infeasibility and use retrospective protocols to characterize the relationship between human schedulers, technical systems, and the work environment. Van Wezel and Jorna (2009) describe the advantages of using experienced planners as subjects instead of students who, while being easily available, tend to be novices. Using observations of shunting schedulers, they identify task structures and develop a prototype planning system tailored to meet their needs. Sanderson (1996) demonstrates SHAPA, an interactive verbal protocol analysis tool, through its application in three experiments in process control, transportation, and navigation. Bainbridge and Sanderson (1995) provide a comprehensive description of VPA applications in operations management. With these studies as the background, we adapt verbal protocol analysis to a newsvendor decision‐making experiment.

3. Experiment Design

Our experiment was designed to observe how newsvendor subjects decide order quantities when the demand is random and there are costs associated with ordering too much or too little. To enable verbal protocolling, we designed the experiment so that only the most basic information is initially given to the subjects and they are expected to seek additional pieces of information that could play a role in their decision. This starts the process of verbalization, and they continue to do so during their decision‐making process. Participants were first presented with a sheet that contained the details of the task, the business setting, and the method of the experiment.

3.1. Task Instructions

Task: Your task is to determine the purchasing quantity of a product for the upcoming selling season. The forecasting process placed the expected demand at a value of 10,000 but the actual demand is uncertain. You need to determine an order quantity that maximizes the profit for your company. If you order too much, you will incur costs associated with items left over and if you order too little, you will be foregoing profits that you could have otherwise collected. So you must choose the order quantity carefully.

Business Setting: You are the purchasing manager in a big company that sells many products to many market segments. Among other things, you are responsible for one specific product.

For the upcoming selling season you have to decide the stocking level of this product in order to appropriately meet the demand for that product. Your company is very successful in the market segments it participates in and has invested significant amounts of money in technology in order to remain one of the leaders in the business. As a result of these technological investments, a vast amount of information is available to everyone in the company. You are allowed to ask the experimenter for information that you think could help you, but you need to be specific. Simply ask the experimenter for the information and if that information is available, it will be given to you. If it is not available, you will be informed of that as well.

Experimental Methodology: The methods behind this experiment use what is commonly known as a “think aloud” approach. As you perform your task, please express all the thoughts that enter your mind so that they can be recorded and analyzed. The experiment monitor is only present in the room in order to ensure that (i) the experiment runs smoothly and (ii) to provide any information you request. This person is not an expert in issues related to this business context.

Once the subject has reviewed the instructions, he/she signals the experimenter that the study can be started. The experimenter makes sure that the recording equipment is switched on and presents the subject with a sheet to record his/her decision. To enable the subject to start thinking aloud and also to refresh the subject's memory of the task details, the experimenter reads aloud a shorter version of the task instructions. The subject is then asked to perform the task of determining the order quantity while thinking aloud. As and when questions arise about the information required to make the decision, the subject can ask the experimenter to provide the information. While it is difficult to a priori think of all the information the subjects could ask for, we identified, based on our teaching and research experience, 10 (the subjects were not informed of this number) pieces of information, detailed in Table 1, that could be the most likely candidates. We identified a number of key words and representative questions and trained the experimenter to use them in figuring out which piece of information to provide to the subject. This aspect of the experiment was designed along the lines of Isen et al. (1991).

Table 1

Ten Pieces of Information that We Had Readily Available if the Subject Were to Ask for Them

Information type	Information description
DEMAND RANGE	The demand could be as low as 0 units and as large as 20,000 units
DEMAND DISTRIBUTION	The demand is uniformly distributed between the minimum and maximum. That is, the demand is equally likely to fall anywhere between the minimum and the maximum. There is a 50% chance that the demand is below the mean and there is a 50% chance that demand will be above the mean
SELLING PRICE	The product sells for US$900 a unit
PURCHASING COST	We pay US$300 per unit to purchase this product
SALVAGE VALUE	Leftover inventory can be salvaged for US$100 per unit. That is, we can receive a benefit of $100 for every unsold unit of inventory
LOSS OF GOODWILL	There is no loss of goodwill when we are unable to meet a customer's demand. We will, however, not be able to collect the profit margin we would have collected if we had the necessary inventory
QUALITY PROBLEMS	After inspection, 10% of the units are returned to the supplier. It is not clear whether the products are damaged during shipping or defectively manufactured. We must order more than what we need in order to ensure the quality of products sold to our customers. We obviously will pay only for the good products we keep
QUANTITY DISCOUNTS	The supplier offers a discount of US$50 per unit if the order quantity is at least 20,000 units. This discount is based on the order quantity and will not be affected by the number of defective units we send back to the supplier
DEMAND MANAGEMENT	We can invest US$250,000 in a CRM strategy and ensure that the demand falls between 5000 and 20,000 units
RAINCHECKS	We can give a US$100 off coupon or raincheck and be confident that the customer will come back later for the product

3.2. Participants

Twenty‐one second year MBA students who had expressed specific interest in operations participated in this study. They had been previously exposed to the newsvendor model in the classroom setting, but it is conceivable that they might have forgotten the mathematical details. In addition, having been involved in a course that had a number of factory visits and guest speakers from the industry, these students should have been in a position to ask the right questions. It is worth noting that the number of subjects used in this study is similar to or slightly larger than the number of subjects used in earlier decision‐making studies using verbal protocol analysis. Isenberg (1986) used 15 subjects and Ball et al. (1998) used 20. As in Isen et al. (1991), the incentive for participation was a flat (i.e., not impacted by the subject's performance) US$15 payment.

3.3. Discussion of our Experimental Setting

Our experimental setting differs from the previously conducted newsvendor experiments in many ways. Ours was a one‐time decision‐making situation whereas the earlier experiments studied these decisions in a repeated environment. In spite of this difference, we were pleased to observe the pattern of anchoring and insufficient adjustment reported in earlier literature. Based on the analysis of the order quantities chosen by our subjects, we are very confident in concluding that they behaved similarly to the subjects in earlier studies. We, however, now have information on how these subjects reached these decisions.

Our experiment also differs in the manner in which information was presented. In the previous experiments, all the relevant information (such as costs, demand distribution) was given en masse to the subjects. In our setting, only the most basic information (i.e., demand forecast) was initially told to the subjects. They had to obtain the rest of the information by specifically asking for it. This enabled us to determine which pieces of information they were able to recognize as being pertinent to this decision. As a result of the different subsets of information obtained by the subjects, each one would have a different decision that would be optimal for them. We evaluated the effectiveness of a subject's decision by comparing it with the optimal solution dictated by the set of information he/she possessed.

The previous experiments, with their emphasis on the end decisions, were conducted in a computer laboratory setting. Our experiment, due to its emphasis on interaction and audio recording, was conducted individually in a small room with a one‐on‐one interface between the subject and an experimenter. The experimenter was not an expert on inventory control, but was trained (supported by the use of key words and representative questions) to understand what the subject could be asking for. As a result, there was always a chance that the experimenter misunderstood the information requests made by the subjects. We are glad to report that our protocol analysis of the transcripts did not identify any such errors.

The interaction between the subjects and the experimenter was not designed to be a conversation and the subjects were made aware of this. For any information requested by the subject, the experimenter either: (i) acknowledged the availability of information and provided it, (ii) informed the subject that the information requested was not available, or (iii) reminded the subject that the request has to be specific. The idea behind designing the study in this manner was to simulate the availability of an information database (and not a domain expert) that the subjects could use to obtain any information that they felt could help them in their decision.

4. Results from Protocol Analysis

We transcribed each audio recording onto paper and performed a detailed protocol analysis of these transcripts. We first describe the methodology we used to analyze the verbal protocols. After that, we present the results of our analysis that can be broadly divided into four categories. We start with the information gathering efforts of the subjects. Initially we focus on the information that was available in the experiment and later we describe the information that was extraneous to the experiment. After that, we focus on the mechanics of the inventory control decision and how the subjects approached it. Then we focus on the specific risks to which the subjects paid attention and how that impacted their order quantity decisions.

4.1. Coding Process for the Verbal Protocols

We first parsed each protocol into thought fragments (segments of one or two sentences each) and every one of these fragments was classified into one of the following categories: (i) Task clarification; (ii) Information gathering; (iii) Statement of logic; (iv) Numerical calculation; (v) Statement of feeling such as frustration, excitement; and (vi) Finalizing the decision. Since these categories are clearly distinct, there was no subjectivity associated with them and the thought fragments could be easily and objectively categorized. Our analysis is mainly focused on categories (ii), (iii), (iv), and (vi) to gain new insights into the thought process behind the newsvendor decisions.
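As a rough illustration of the coding scheme just described, the six categories can be held in a small enumeration and tallied per protocol. This is only a sketch: the names `FragmentCode`, `FOCUS_CODES`, and `tally_focus_codes` are hypothetical, and the example fragments are invented for illustration and are not quotations from any transcript.

```python
from collections import Counter
from enum import Enum


class FragmentCode(Enum):
    """The six coding categories listed in Section 4.1."""
    TASK_CLARIFICATION = "i"
    INFORMATION_GATHERING = "ii"
    STATEMENT_OF_LOGIC = "iii"
    NUMERICAL_CALCULATION = "iv"
    STATEMENT_OF_FEELING = "v"
    FINALIZING_DECISION = "vi"


# Categories the analysis concentrates on: (ii), (iii), (iv), and (vi).
FOCUS_CODES = {
    FragmentCode.INFORMATION_GATHERING,
    FragmentCode.STATEMENT_OF_LOGIC,
    FragmentCode.NUMERICAL_CALCULATION,
    FragmentCode.FINALIZING_DECISION,
}


def tally_focus_codes(coded_fragments):
    """Count fragments per category, keeping only focus categories that occur."""
    counts = Counter(code for _, code in coded_fragments)
    return {code: counts[code] for code in FOCUS_CODES if counts[code]}


# Hypothetical coded protocol (invented wording, not taken from a transcript).
example_protocol = [
    ("So I need to pick one order quantity?", FragmentCode.TASK_CLARIFICATION),
    ("What does the product sell for?", FragmentCode.INFORMATION_GATHERING),
    ("Running short costs more than leftovers.", FragmentCode.STATEMENT_OF_LOGIC),
    ("What do we pay per unit?", FragmentCode.INFORMATION_GATHERING),
    ("I will go with this number.", FragmentCode.FINALIZING_DECISION),
]

focus_counts = tally_focus_codes(example_protocol)
```

Keeping each fragment paired with exactly one code mirrors the single-category classification described above; a study that allowed multiple codes per fragment would need a different structure.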

4.2. Seeking Information that was Available

Ten pieces of information were available to the participants. Table 2 shows which information was sought by each subject. An “×” in a cell indicates that the subject sought and acquired that piece of information. Notice that most subjects asked for and obtained pieces 1–5 only. The other five pieces of information were requested by only 2.4 subjects on average (ranging from 1 to 4). Only two pieces of information, namely purchasing cost and selling price, were acquired by all the subjects. The demand range, the demand distribution, and the salvage value information were each sought by roughly two‐thirds of the subjects.

Table 2

Details on which Subjects Sought which Pieces of the Available Information

Subject number	Information available
Subject number	Demand range	Demand distribution	Selling price	Purchasing cost	Salvage value	Loss of goodwill	Quality problems	Quantity discount	Demand management	Rain checks
1	×	×	×	×	×					×
2	×		×	×	×
3	×	×	×	×	×
4	×	×	×	×		×
5	×	×	×	×
6			×	×
7			×	×
8	×	×	×	×	×					×
9		×	×	×	×
10	×		×	×	×	×
11	×	×	×	×	×	×
12	×	×	×	×	×
13	×	×	×	×	×		×	×
14	×	×	×	×	×
15			×	×		×
16	×		×	×			×	×
17	×	×	×	×	×
18	×	×	×	×	×
19		×	×	×	×
20			×	×	×					×
21			×	×					×
Total	14	13	21	21	14	4	2	2	1	3
Percentage	67	62	100	100	67	19	10	10	5	14

This leads us to conclude that most of the subjects were able to recognize the major factors (costs and demand uncertainty) that play a role in the newsvendor inventory decision, but failed to recognize the importance of more advanced (or non‐trivial) information that would have significantly influenced their decision. However, it is also interesting to see that five subjects (almost 25%) did not ask for any demand information, neither the range nor the distribution. In the absence of that information, they had no choice but to anchor their decision to the mean demand.
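The tallies quoted for Table 2 can be re-derived from its rows. The sketch below transcribes each subject's "×" marks as a set of piece numbers (1 to 10, in the column order of Table 2); the variable names are ours and purely illustrative.

```python
# Piece numbers acquired by each subject, transcribed from Table 2.
# 1 demand range, 2 demand distribution, 3 selling price, 4 purchasing cost,
# 5 salvage value, 6 loss of goodwill, 7 quality problems,
# 8 quantity discount, 9 demand management, 10 rain checks.
ACQUIRED = {
    1: {1, 2, 3, 4, 5, 10}, 2: {1, 3, 4, 5}, 3: {1, 2, 3, 4, 5},
    4: {1, 2, 3, 4, 6}, 5: {1, 2, 3, 4}, 6: {3, 4}, 7: {3, 4},
    8: {1, 2, 3, 4, 5, 10}, 9: {2, 3, 4, 5}, 10: {1, 3, 4, 5, 6},
    11: {1, 2, 3, 4, 5, 6}, 12: {1, 2, 3, 4, 5},
    13: {1, 2, 3, 4, 5, 7, 8}, 14: {1, 2, 3, 4, 5}, 15: {3, 4, 6},
    16: {1, 3, 4, 7, 8}, 17: {1, 2, 3, 4, 5}, 18: {1, 2, 3, 4, 5},
    19: {2, 3, 4, 5}, 20: {3, 4, 5, 10}, 21: {3, 4, 9},
}

n_subjects = len(ACQUIRED)

# Column totals and rounded percentages (the last two rows of Table 2).
totals = [sum(piece in s for s in ACQUIRED.values()) for piece in range(1, 11)]
percentages = [round(100 * t / n_subjects) for t in totals]

# Average number of subjects requesting each of the advanced pieces 6-10.
advanced_average = sum(totals[5:]) / 5

# Subjects who asked for neither the demand range nor the distribution.
no_demand_info = sorted(k for k, s in ACQUIRED.items() if not s & {1, 2})

print(totals, percentages, advanced_average, no_demand_info)
```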

4.3. Seeking Information that Was Not Available

Next, we focus on the efforts of our subjects in gathering information that was not available in the experiment. It should not be a surprise that during the study the subjects asked a large number of questions for which the experimental design did not have the information. While it is not clear whether having the information would have changed the way the subjects made their decisions, it is important to analyze those questions as well, since that could provide additional insights into the subjects' thought processes.

Table 3 contains the list of questions asked by the subjects and the frequency with which they were asked. Questions seemed to focus on different aspects of the business setting such as the product characteristics, where its utility came from, what other products the company manufactured, whether the product was perishable, etc. Clearly, the subjects seem interested in creating a business environment for their decision‐making situation. It is striking that more than two‐thirds of the participants asked about the decisions made in the past, decisions made by the competitors, decisions made by the vendors, etc. Perhaps they were apprehensive about making a decision with sole responsibility and were looking for other sources to justify their decision. That is, they may have thought that, if someone else such as the previous decision maker or a competitor made a decision similar to the one they were considering, they would be on firmer ground.

Table 3

A Summary of the Non‐Available Pieces of Information Sought by the Subjects

Information sought	No. of questions
What did we do in the past? Has there been a change this year?	17
Are there any storage and distribution costs or restrictions?	9
What is the Product? What Industry?	8
What are the competitors doing?	8
What is the customer base? Affordability? Market Size?	8
How was the forecast arrived at?	4
Is there any production capacity restriction?	4
Is the product perishable?	4
Can we place multiple orders in the season?	3
What is the length of a season?	3
Do you have the formula or equation?	3
Is the technology good? Is there new technology?	3
Who is the vendor? Where are they located?	2
What is the cost of capital?	1
What is the elasticity of demand?	1
What is number of stores selling this product?	1
Are there any batch sizes in production?	1
What is the correlation between demand and order quantity?	1
Is it sold in the US or abroad?	1

4.4. Order Quantities Chosen

Table 4 contains a summary of the order quantities selected and the time taken by each of the 21 participants in the study. These order quantities covered the whole range of the demand. That is, the lowest chosen value was zero and the highest chosen value was 20,000. Because of the availability of the transcripts, we can explore more about how the participants made their decisions. Order quantities may be classified into four different categories: (i) mean demand used, (ii) extreme value selected, (iii) optimal value chosen, and (iv) miscellaneous values.

Table 4

A Summary of the Order Quantities Chosen by the Twenty‐One Subjects

Subject number	No. of experiment questions	Order quantity	Time (seconds)	No. of other questions	Optimal order quantity	Order quantity differential
1	6	8900	593	3	13,333	−4433
2	4	15,000	621	5	15,000	0
3	5	11,000	538	0	15,000	−4000
4	5	10,000	727	1	13,333	−3333
5	4	15,000	860	2	13,333	1667
6	2	12,000	685	5	10,000	2000
7	2	9000	529	7	10,000	−1000
8	6	0	1219	5	6667	−6667
9	4	10,000	290	1	13,333	−3333
10	5	10,000	676	8	15,000	−5000
11	6	10,000	998	10	15,000	−5000
12	5	15,000	593	1	15,000	0
13	7	20,000	713	2	20,000	0
14	5	12,000	791	3	15,000	−3000
15	3	10,000	897	9	10,000	0
16	5	15,000	676	5	20,000	−5000
17	5	20,000	603	1	15,000	5000
18	5	15,000	948	0	15,000	0
19	4	12,000	676	3	15,000	−3000
20	4	10,000	676	6	10,000	0
21	3	20,000	819	5	15,000	5000

The optimal order quantity reported here is specific to the subset of information that the subject acquired.

4.4.1. Mean Demand Used. Six (#4, 9, 10, 11, 15, 20) out of the 21 subjects identified the mean demand (10,000) as their order quantity. Two (#15 and 20) of these subjects did not ask for the minimum and maximum values of demand or its distribution. As a result, they were not able to figure out the variation in demand and decided to stick with the mean. Two other participants (#4 and 9) were not able to estimate (or even identify) the risks associated with overage and underage and as a result were not able to use the distribution information even though they had it available. Subject #9 stated that he/she wanted to “be risk averse” and decided to go with the mean demand value. Subject #4 stated that he/she wanted to “order close to what I think will sell” and decided to order 10,000 units. The remaining two subjects (#10 and 11) correctly identified the overage and underage costs as US$200 and US$600, respectively. Subject #10 stated that “some operations model can be used” but could not figure out how to do it and decided to “sit at 10,000.” Subject #11 stated that he/she wanted to “go on the higher side of demand” but could not figure out what the extra quantity would be. So he/she decided to stay with the mean demand.

4.4.2. Selecting an Extreme Value. Four of the subjects (#8, 13, 17, and 21) chose an extreme value (0 or 20,000) as their order quantity. Subject #8 chose an order quantity of zero because he/she asked for and received the information about the rainchecks. Recognizing that US$100 per unit is a small sacrifice to battle against the uncertainty in demand, he/she decided to order zero units. Subject #17 compared the profits realized from 0 and 20,000 units and decided that ordering 20,000 units was more profitable and decided to go ahead with that.

Participants 13 and 21 were able to obtain information that helped them to lean toward ordering 20,000. Subject #13 asked for and received information on the price discount for ordering 20,000 units and the presence of a defect rate. He/she put those two pieces of information together and decided to order 20,000 units. Subject #21 asked for and obtained the information on the CRM strategy that guarantees that sales will be at least 5000 units. That encouraged him/her to order the full 20,000 units.

4.4.3. Ordering the Basic Optimal Value. Five subjects (#2, 5, 12, 16, and 18) figured out that the underage cost was three times the overage cost and used that ratio to determine that they should order three‐fourths of 20,000. This resulted in an order quantity of 15,000, which would be the optimal order quantity if only the first five pieces of information were available. Three (#5, 12, and 18) of them knew all the information necessary for them to make this determination. Subject #2, however, did not know the demand distribution but knew that the demand ranged from 0 to 20,000. He/she must have guessed that the demand was uniformly distributed between the minimum and maximum. Subject #16 knew the information on the defect rate and the quantity discounts, but ignored them while determining the order quantity.
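The three‐fourths figure follows from the standard critical‐ratio rule: order up to the demand percentile given by underage cost divided by the sum of underage and overage costs. The sketch below applies that rule to uniformly distributed demand using the Table 1 values; the function names are ours, and the second case (salvage value unknown, so the full purchasing cost is treated as the overage cost) is our assumption about how an information‐specific optimum would be computed.

```python
def critical_ratio(underage_cost, overage_cost):
    """Target probability of covering demand: cu / (cu + co)."""
    return underage_cost / (underage_cost + overage_cost)


def uniform_order_quantity(underage_cost, overage_cost, low, high):
    """Critical-ratio percentile of demand that is uniform on [low, high]."""
    return low + critical_ratio(underage_cost, overage_cost) * (high - low)


# Values from Table 1.
PRICE, COST, SALVAGE = 900, 300, 100
LOW, HIGH = 0, 20_000

underage = PRICE - COST               # lost margin per unit short: 600
overage_with_salvage = COST - SALVAGE  # net loss per leftover unit: 200

# Pieces 1-5 known: ratio 600 / 800 = 0.75, so three-fourths of 20,000.
q_basic = uniform_order_quantity(underage, overage_with_salvage, LOW, HIGH)

# Assumption: salvage value not acquired, so a leftover unit loses the
# full purchasing cost (300); ratio 600 / 900 = 2/3.
q_no_salvage = uniform_order_quantity(underage, COST, LOW, HIGH)

print(q_basic, round(q_no_salvage))
```

Under these assumptions the two cases give 15,000 and roughly 13,333, which match two of the optimal values that appear in Table 4; the sketch does not attempt to reproduce every information‐specific optimum reported there.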

4.4.4. Selecting Miscellaneous Values. The remaining six (#1, 3, 6, 7, 14, and 19) subjects chose order quantities ranging from 8900 to 12,000. Three (#3, 14, and 19) of these subjects correctly computed the overage cost (US$200) and the underage cost (US$600) and had the demand distribution information available to them, but they were not able to put this information together to come up with a good order quantity. Rather, they recognized that they should be above the mean and, not knowing how far above, they chose 11,000 (subject #3) and 12,000 (subjects #14 and 19) as their order quantities. Subject #6 chose 12,000 as his/her order quantity and, upon closer examination, we realized that he/she did not have the demand distribution information. He/she was only aware of the selling price (US$900) and the purchasing cost (US$300). He/she, however, recognized that “it is more costly to be under‐stocked than over‐stocked,” and ordered 2000 (randomly chosen, we believe) above the mean demand.

The other two subjects (#1 and 7) chose 8900 and 9000 as their order quantities. A closer examination showed that they made major mistakes in their analysis. For example, subject #1 somehow figured that the cost of stocking out is US$700 per unit and the cost of overstocking is US$800 per unit. Based on these costs, he/she estimated that the order quantity should be below the mean and decided to order 8900 units. Subject #7 gathered only the information on selling price and purchasing cost, and using that, he/she correctly calculated that he/she would need to sell 3333 units to break even if he/she purchased 10,000 units (10,000 × US$300 / US$900 ≈ 3333). Not knowing where to go from there, he/she said that “to be conservative,” he/she decided to order 9000 units.

4.4.5. Relationship between Order Quantity and the Number of Questions. Figure 1 contains a graph of the number of questions asked by the participants seeking information that was not available in the experiment versus the order quantity selected by them. Notice that those people who ordered close to the mean asked significantly more questions than the rest. As discussed above, the subjects who chose values ranging from 8900 to 12,000 were the ones who seemed to have the most trouble with this task. They demonstrated a lack of confidence in solving this problem and this lack of confidence appears to have manifested itself in the number of questions they posed to the experimenter.

Figure 1

A Plot of the Order Quantity Versus the Number of Questions Seeking Information that Was Not Available in the Experiment
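The pattern behind Figure 1 can be checked roughly against Table 4 by comparing the average number of unavailable‐information questions between subjects who ordered in the 8900 to 12,000 band and everyone else. This is only a descriptive sketch with names of our choosing; it is not the analysis behind the figure and makes no claim about statistical significance.

```python
# (order quantity, no. of other questions) per subject, from Table 4.
TABLE_4 = {
    1: (8900, 3), 2: (15000, 5), 3: (11000, 0), 4: (10000, 1),
    5: (15000, 2), 6: (12000, 5), 7: (9000, 7), 8: (0, 5),
    9: (10000, 1), 10: (10000, 8), 11: (10000, 10), 12: (15000, 1),
    13: (20000, 2), 14: (12000, 3), 15: (10000, 9), 16: (15000, 5),
    17: (20000, 1), 18: (15000, 0), 19: (12000, 3), 20: (10000, 6),
    21: (20000, 5),
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Band around the mean demand used in the text: 8900 to 12,000 units.
near_mean = [q_other for qty, q_other in TABLE_4.values() if 8900 <= qty <= 12000]
the_rest = [q_other for qty, q_other in TABLE_4.values() if not 8900 <= qty <= 12000]

print(len(near_mean), round(mean(near_mean), 2))
print(len(the_rest), round(mean(the_rest), 2))
```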

4.4.6. Summary of Observations on Order Quantities.

Most of the subjects (19 out of 21) had reasonable logic in determining the order quantity. Only the remaining two (about 10%) subjects made analytical or logical mistakes that prevented them from reaching a reasonable decision. However, for most of the subjects, the reasoning was not rigorous enough to lead them to the optimal decision.

Ten out of the 21 subjects used a parameter (minimum, mean, or maximum) of the demand distribution as their order quantity.

Five of the 21 subjects chose 15,000, which would be the optimal order quantity if using only the basic (pieces 1–5) information that was available to them. This subset of people fully understood the newsvendor model, remembered it, and was able to use it.

Four participants chose values of 11,000 or 12,000, recognizing that they should choose a quantity above the mean, but could not figure out how far above the mean they should be.

There were six people who correctly computed the overage and underage costs and were aware of the demand distribution. They were unable to convert that information correctly into the optimal order quantity, however. This indicates that this computation may be the hardest part in the newsvendor calculation.

The people who seemed unsure of their solution (and chose a value close to the mean demand) tended to ask more questions for which the information was not available in the experiment.
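
The conversion step that several subjects could not complete, going from overage and underage costs to an order quantity, can be sketched as follows. This is a minimal illustration only: it assumes demand is uniformly distributed between a minimum and a maximum, and the cost values and demand bounds in the example call are hypothetical, chosen merely so that the result equals the 15,000 units mentioned above. The function names are ours and do not come from the experiment.

```python
def critical_ratio(underage_cost: float, overage_cost: float) -> float:
    """Share of the demand distribution that the order quantity should cover."""
    return underage_cost / (underage_cost + overage_cost)


def optimal_order_uniform(underage_cost: float, overage_cost: float,
                          demand_min: float, demand_max: float) -> float:
    """Newsvendor quantity under the ASSUMPTION of uniform demand.

    For uniform demand on [demand_min, demand_max], the quantile at the
    critical ratio is a linear interpolation between the two bounds.
    """
    ratio = critical_ratio(underage_cost, overage_cost)
    return demand_min + ratio * (demand_max - demand_min)


# Hypothetical inputs (not the experiment's actual parameters).
quantity = optimal_order_uniform(underage_cost=3.0, overage_cost=1.0,
                                 demand_min=0, demand_max=20_000)
print(quantity)
```

With these illustrative inputs the critical ratio is 0.75, so the order quantity sits three quarters of the way between the minimum and the maximum demand rather than at the mean.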

4.5. Time Taken for the Decision

The time taken by the subjects to complete this study varied from 529 to 1219 seconds. Not surprisingly, the subjects who spent more time asked more questions that were included in the experiment and also asked more questions that were not included in the experiment. We investigated whether this increase in the number of questions enabled them to make better decisions. Figure 2 contains a graph of the time taken by the subjects versus the error in their order quantity. There is no clear trend in the graph. We repeated this analysis using the percentage error as the measure of performance and there was no trend in that data either. Thus, we conclude that, while the people who took longer asked more questions, their decisions were no better.

Figure 2. A Plot of the Time Taken Versus the Error in the Order Quantity Decisions of the Subjects

4.6. Risk Identification Sequence and its Impact on Order Quantity

Here we focus on the sequence in which the risks were identified by each subject and how that sequence affected the order quantity decision. From the protocol analysis, we were able to determine, for each subject, whether he/she first focused on the risk of excess inventory or on the risk of unsatisfied demand. We were also able to determine the time at which the subject identified that risk. We attempt to relate that information to the eventual order quantity decision. Table 5 contains the details of that analysis.

Table 5. First Identified Risk and the Bias in the Order Quantity

Subject number	First identified risk	Order quantity	Bias in the order quantity
1	Underage	8900	−4433
2	Overage	15,000	0
3	Overage	11,000	−4000
4	Underage	10,000	−3333
5	Underage	15,000	1667
6	Overage	12,000	2000
7	Overage	9000	−1000
8	Underage	0	−6666
9	Underage	10,000	−3333
10	Overage	10,000	−5000
11	Underage	10,000	−5000
12	Underage	15,000	0
13	Overage	20,000	0
14	Underage	12,000	−3000
15	Overage	10,000	0
16	Overage	15,000	−5000
17	Underage	20,000	5000
18	Overage	15,000	0
19	Underage	12,000	−3000
20	Underage	10,000	0
21	Overage	20,000	5000

Ten subjects first identified the overage risk (the risk associated with excess inventory), while the other 11 subjects first identified the underage risk (the risk associated with unsatisfied demand). For the participants who identified the overage risk first and the underage risk later, the average order quantity was 13,700 units. On the other hand, for those who identified the underage risk first and the overage risk later, the average order quantity was 11,173 units. This trend was also present in the bias (the distance between the subject's decision and his/her optimal solution) of the order quantity. For the subjects who identified the underage risk closer to the decision, the order quantity was on average 800 units below the optimal value. On the other hand, for those who identified the overage risk closer to the decision, the order quantity was on average about 2000 units below the optimal value. This difference, illustrated in Figure 3, indicates the presence of a recency effect, which is known to exist (see Table 1 in Hogarth and Einhorn 1992) in complex tasks with end‐of‐sequence processing of information. The recency effect could be the driver behind the demand chasing behavior that is known (see Bolton and Katok 2008) to exist in newsvendor decisions.
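
The group averages discussed above can be recomputed directly from Table 5. The short sketch below copies the table rows as they are printed and averages the order quantity and the bias by first identified risk; the variable and function names are ours and serve only this illustration.

```python
from statistics import mean

# (subject, first identified risk, order quantity, bias), copied from Table 5
TABLE_5 = [
    (1, "Underage", 8900, -4433), (2, "Overage", 15000, 0),
    (3, "Overage", 11000, -4000), (4, "Underage", 10000, -3333),
    (5, "Underage", 15000, 1667), (6, "Overage", 12000, 2000),
    (7, "Overage", 9000, -1000), (8, "Underage", 0, -6666),
    (9, "Underage", 10000, -3333), (10, "Overage", 10000, -5000),
    (11, "Underage", 10000, -5000), (12, "Underage", 15000, 0),
    (13, "Overage", 20000, 0), (14, "Underage", 12000, -3000),
    (15, "Overage", 10000, 0), (16, "Overage", 15000, -5000),
    (17, "Underage", 20000, 5000), (18, "Overage", 15000, 0),
    (19, "Underage", 12000, -3000), (20, "Underage", 10000, 0),
    (21, "Overage", 20000, 5000),
]


def group_means(rows, first_risk):
    """Return (group size, mean order quantity, mean bias) for one group."""
    orders = [order for _, risk, order, _ in rows if risk == first_risk]
    biases = [bias for _, risk, _, bias in rows if risk == first_risk]
    return len(orders), mean(orders), mean(biases)


for risk in ("Overage", "Underage"):
    n, avg_order, avg_bias = group_means(TABLE_5, risk)
    print(f"{risk} first: n={n}, mean order={avg_order:.0f}, mean bias={avg_bias:.0f}")
```

Run on the table as printed, the overage-first group (underage risk identified closer to the decision) averages 13,700 units with a mean bias of −800, and the underage-first group averages about 11,173 units with a mean bias of about −2009.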

Figure 3. A Plot of the Order Quantities Chosen by the Subjects as a Function of the Sequence in which they Identified the Risks Associated with Overage and Underage

5. Discussion

Based on the results observed from the protocol analysis of the subjects' thought processes while making the newsvendor decision, we gained new insights into how inventory decisions are made. This enabled us to formulate the following questions that could be studied in future research.

What role does the problem abstractness play in preventing the decision makers from choosing the optimal order quantities?

If the subjects are presented with information such as past decisions and competitor decisions, would they still anchor to the mean or would they choose a different anchor?

If the subjects were provided a decision support system (that computed the optimal order quantity given demand distribution, overage, and underage costs) would the subjects be able to reach the optimal decision?

Is the recency effect present in the newsvendor decision and how can it be used to enable the subjects to make better decisions?

Operations management decisions are ubiquitous and critical in the modern economic environment, yet very little is known about the mental models (Senge et al. 1994) underlying these decisions. Verbal protocols and other process mapping techniques can make the thinking processes visible, enabling the development of strategies to effectively battle the biases in the newsvendor, beer game, supply contracts, and other operations management decisions.

6. Limitations of Our Study

In spite of the number of new insights that our study provided, we recognize that it suffered from some experiment design issues, and here we detail how some of these limitations can be addressed in future studies. We used students as subjects to understand a ubiquitous real world decision, while it would have been better to use practitioners as subjects. However, using students enabled us to perform concurrent verbal protocol analysis, which is richer (Crawford et al. 1999) than ex post verbal protocol analysis. We used the VPA method in an adaptive manner, in that the subjects had to ask for and obtain new pieces of information, and that poses its own challenges. What information to make available and how to avoid the subjectivity associated with providing this information are issues that require a closer review. In a real world environment, this problem would never arise, but there information gathering and decision making occur over many days (Crawford et al. 1999), prohibiting detailed observation. Due to the small sample size, we were able to manually analyze the verbal protocols. There is a large volume of existing knowledge (Bainbridge and Sanderson 1995) on coding of verbal protocols that should be used if the sample size is large and/or if the protocols are long. Observations from Figures 1 and 2 could suffer from issues of endogeneity, as a characteristic of the subject (e.g., confidence level, intelligence) could play a role in the reported outcomes. Such endogeneity can be addressed by measuring the subjects' intelligence, confidence level, or other appropriate characteristics and evaluating their role in the experimental results. Our subjects received a flat compensation fee and, in the future, it may be necessary to devise a compensation scheme that pays subjects commensurate with their performance.

7. Conclusion

In this paper, we reported our observations on the newsvendor decision makers' thought processes, obtained via protocol analysis of audio recordings of participants' verbalizations of their thoughts while solving the problem. The transcripts allowed us to understand the information‐gathering efforts of our subjects and further decipher how they used that information to make their inventory decisions. Thus, this study illustrates how protocol analysis may be used to investigate the processes of operational decision making.

We first observed that most of the subjects sought only the most basic information and failed to recognize the possibility of additional, non‐trivial information that should have helped them in their decision‐making process. Further, we observed that subjects had significant trouble with the abstractness of the problem setting and asked a number of questions aimed at removing this abstractness. Once the information was available, most participants were able to compute the overage and underage costs accurately, but failed to couple that with the demand information to determine the optimal inventory level. This led us to believe that computation of the critical ratio (and the subsequent conversion of that to the inventory level) is not as intuitive as commonly perceived by the academic community.

Finally, we noted the sequence in which the two risks (overage and underage) were identified by the decision makers and then observed that this sequence had a significant impact on the order quantity chosen. Those who first identified overage risk ordered a larger amount of inventory, while those with the other sequence (underage risk followed by overage risk) ordered a smaller amount of inventory. In addition to providing several insights into the decision makers' thought processes, this study raised a number of issues that could be pursued in future research.

References

Amaral; Tsay, A. A. 2009. How to win “spend” and influence partners: Lessons in behavioral operations from the outsourcing game. Prod. Oper. Manag. 18 (6): 621–634.

Bainbridge; Sanderson. 1995. Verbal protocol analysis. Wilson, J. R.; Corlett, E. N., eds. Evaluation of Human Work: A Practical Ergonomics Methodology. Taylor & Francis, Philadelphia, PA, pp. 169–201.

Ball, C. T.; Langholtz, H. J.; Auble; Sopchak. 1998. Resource‐allocation strategies: A verbal protocol analysis. Org. Behav. Hum. Dec. Proc. 76 (1): 70–88.

Bendoly; Donohue; Schultz. 2006. Behavior in operations management: Assessing recent findings and revisiting old assumptions. J. Oper. Manage. 24 (6): 737–752.

Bolton, G. E.; Katok. 2008. Learning‐by‐doing in the newsvendor problem: A laboratory investigation of the role of experience and feedback. Manuf. Serv. Oper. Manage. 10 (3): 519–538.

Boudreau; Hopp; McClain, J. O.; Thomas, L. J. 2003. On the interface between operations and human resource management. Manuf. Serv. Oper. Manage. 5 (3): 179–202.

Collier, D. A.; Evans, J. R. 2007. Operations Management: Goods, Services and Value Chains. Thomson South‐Western, Florence, KY.

Crawford; MacCarthy, B. L.; Wilson, J. R.; Vernon. 1999. Investigating the work of industrial scheduler through field study. Cogn. Technol. Work 1 (2): 63–77.

Ericsson, K. A.; Simon, H. A. 1993. Protocol Analysis: Verbal Reports as Data. MIT Press, Cambridge, MA.

Estrada; Isen, A. M.; Young, M. J. 1997. Positive affect facilitates integration of information and decreases anchoring in reasoning among physicians. Org. Behav. Hum. Dec. Proc. 72: 117–135.

Ford, J. K.; Schmitt; Schechtman, S. L.; Hults, B. M.; Doherty, M. L. 1989. Process tracing methods: Contributions, problems, and neglected research questions. Org. Behav. Hum. Dec. Proc. 43 (1): 75–117.

Hogarth, R. M.; Einhorn, H. J. 1992. Order effects in belief updating: The belief-adjustment model. Cogn. Psychol. 24 (1): 1–55.

Isen, A. M.; Means. 1983. The influence of positive affect on decision‐making strategy. Soc. Cogn. 2 (1): 18–31.

Isen, A. M.; Rosenzweig, A. S.; Young, M. J. 1991. The influence of positive affect on clinical problem solving. Med. Dec. Mak. 11: 221–227.

Isenberg, D. J. 1986. Thinking and managing: A verbal protocol analysis of managerial problem solving. Acad. Manage. J. 29 (4): 775–788.

Nisbett, R. E.; Wilson, T. D. 1977. Telling More than We Know: Verbal Reports on Mental Processes. Prentice‐Hall, Englewood Cliffs, NJ.

Sanderson, P. M. 1996. Verbal protocol analysis in three experimental domains using SHAPA. Hum. Factors. Ergon. Soc. Ann. Proc. 40: 1280–1284.

Schweitzer, M. E.; Cachon, G. P. 2000. Decision bias in the newsvendor problem with a known demand distribution: Experimental evidence. Manage. Sci. 46 (3): 404–420.

Senge, P. M.; Kleiner; Roberts; Ross, R. B.; Smith, B. J. 1994. The Fifth Discipline Fieldbook: Strategies and Tools for Building a Learning Organization. Doubleday, New York, NY.

Van Wezel; Jorna. 2009. Cognition, tasks, and planning: Supporting the planning of shunting operations at the Netherlands Railways. Cogn. Technol. Work 11 (2): 165–176.