A Dynamic Model for Digital Advertising: The Effects of Creative Format, Message Content, and Targeting on Engagement

Abstract

The authors study the joint effects of creative format, message content, and targeting on the performance of digital ads over time. Specifically, they present a dynamic model to measure the effects of various sizes of static (GIF) and animated (Flash) display ad formats and consider whether different ad contents, related to the brand or a price offer, are more or less effective for different ad formats and targeted or retargeted customer segments. To this end, the authors obtain six months of data on daily impressions, clicks, targeting, and ad creative content from a major U.S. retailer, and they develop a dynamic zero-inflated count model. Given the sparse, nonlinear, and non-Gaussian nature of the data, the study designs a particle filter/Markov chain Monte Carlo scheme for estimation. Results show that carry-over rates for dynamic formats are greater than those for static formats; however, static formats can still be effective for price ads and retargeting. Most notably, results also show that retargeted ads are effective only if they offer price incentives. The study then considers the import of these results for the retailer's media schedules.

Keywords

online advertising, ad formats (static vs. animated), ad content, dynamic zero-inflated Poisson, particle filtering/smoothing

Advertisers often use multiple creative formats in their digital campaigns to target and retarget consumers with product-based messages and price incentives. These include static formats (e.g., GIF, JPG) that offer neither animation nor interactivity; simple Flash formats (e.g., SWF) that offer animation but no interactivity; and rich-media formats (e.g., HTML, Java) that offer both interactivity and animation, with elements such as sound, video, floating images, and screen take-overs. As a result, advertisers have the nontrivial task of jointly assessing the effects over time of design elements available in the large number of such formats as they decide on budgets, message objectives, and consumer targeting. There is, however, some evidence from industry studies that ad format size, location, and creative elements such as color, interactivity, and animation may all independently influence engagement (e.g., Cole, Spalding, and Fayer 2009; DoubleClick 2009). Yet this evidence raises difficult questions. For example, a retailer may still wonder whether product-based content or price incentives would be more suitable message content for animated and static ads; or which ad formats and message are more effective for retargeting, the canonical tactic of tracking visitors to a firm's site and then serving the firm's ads to them when they visit other sites (Lambrecht and Tucker 2013).

The retailer may also be interested in the temporal effects of online ads, but extant work has been largely cross-sectional and so cannot help to formulate dynamic advertising strategies (Breuer, Brettel, and Engelen 2011). Internet ad–exposure models in marketing (e.g., Danaher 2007; Danaher, Lee, and Kerbache 2010) have, however, explored the performance of ad formats over time. Thus, it is necessary to consider not only when formats work but also how long their effects persist, so that firms can better match formats to ad messages and targeted consumers (e.g., Tellis, Chandy, and Thaivanich 2000). For instance, a large body of work on offline ads has suggested that ads have both instantaneous and long-term, or carry-over, effects (e.g., Sethuraman, Tellis, and Briesch 2011). Yet studies of digital ads have largely ignored carry-over, attributing consumer engagement to recent impressions. Braun and Moe (2013) model carry-over effects but treat them as homogeneous. Given the evidence that carry-over may differ, for example, across e-mail and online channels (Breuer, Brettel, and Engelen 2011), a useful direction to explore would be to model heterogeneous carry-over effects. Furthermore, we know that the effects of ad messages may vary across media and markets (Deighton, Henderson, and Neslin 1994; D'Souza and Rao 1995; MacInnis, Rao, and Weiss 2002), so we might want to consider how such effects differ across online retargeted consumers. Knowing these features of digital ads (the effects of carry-over, format, target, and message) could help managers improve ad engagement. This, in turn, could ultimately help firms better allocate ad resources throughout their digital advertising campaigns (Rust and Leone 1984).

This study attempts to fill some of the central gaps in extant work by developing a dynamic response model to study the joint effects of creative format, message content, and targeting/retargeting on the performance of digital ads over time. Specifically, the model examines the dynamic effects of ad theme (price or product) and creative format (animated and static ads of varied sizes) on the clicking behaviors of targeted versus retargeted consumers. We address the following substantive questions: How do carry-over effects vary across animated and static ads and across targeted consumers? What is the effect of format (size/position) on consumer clicking behavior? What are the effects of price theme versus product theme within different digital formats? (Prior research has posited that ad format effectiveness can vary with ad copy elements; Grass and Wallace 1969; Naik, Mantrala, and Sawyer 1998). Most important, which ad format and copy theme (price vs. product) are most effective for retargeting?

We also innovate methodologically to be able to extend econometric studies of advertising's dynamic and content effects (e.g., Bass et al. 2007; Chandy et al. 2001; Clarke 1976; MacInnis, Rao, and Weiss 2002) to the domain of digital advertising. Digital ad response data, consisting of clicks in this study, are time series of counts, which contain a high frequency of zeros due to nonresponse that result in “zero inflation.” This presents a challenge: failure to account for zero inflation and/or dynamics may result in misleading inference and the detection of spurious associations. To address these two concerns and the substantive questions posed earlier, we propose a dynamic, state space, zero-inflated count model (e.g., Poisson, negative-binomial). The resulting response model is both dynamic and nonlinear, and therefore, we estimate it using a combination of particle filtering and Markov chain Monte Carlo (MCMC) procedures (Doucet, De Freitas, and Gordon 2001; Liu and Chen 1998; Ristic, Arulampalam, and Gordon 2004). Particle filtering, in its many variations, is widely applied in statistics; it is a flexible Bayesian inferential method used to estimate nonlinear/non-normal dynamic systems. In these systems, the posterior distributions of the state space parameters are analytically intractable, and thus, the filter operates by drawing weighted samples from a time-varying proposal distribution (i.e., an importance function). The analytic expression for the optimal form of this importance distribution—optimal in terms of computational efficiency—is available only in special cases (e.g., Doucet, De Freitas, and Gordon 2001). It is possible, however, to obtain a linear/normal approximation of this function at its mode, where the mode arises from an iterative Newton-Raphson step embedded within the particle filter. 
The resulting algorithm provides an approach to estimate any state space model within the exponential family (Doucet, Godsill, and Andrieu 2000), and it is more general than Gaussian filters such as the extended and the unscented Kalman filters (Ristic, Arulampalam, and Gordon 2004, p. 32).

The article, therefore, contributes to an emerging literature on digital ad response models in the following ways. First, we find that animated ads have significantly higher carry-over effects and impact consumer engagement over a longer duration than static ads, in all ad formats and among both targeted and retargeted consumers. Second, within the animated formats, price-themed ads are more effective than product-themed ads. Third, retargeted ads are effective only when they offer price incentives, a finding consistent with Lambrecht and Tucker (2013), who find retargeted ads effective only when consumers have strong preferences such that they have incentives to buy. Fourth, we find that all ads (i.e., by formats and messages) targeted to the female segment are effective; this suggests that, in our example, female shoppers are more willing to engage, perhaps confirming the axiom from brick-and-mortar studies that “women shop; men buy.”¹ Fifth, to answer our questions, we had to introduce new Bayesian methods that respect features of digital ad data and the underlying nonlinear dynamics process that generates them. We believe this approach, which is our main contribution, complements other methods, including bandit problems (sequential experiments) and the Bayesian algorithms (e.g., Thompson sampling) used to study them (Scott 2010); for modeling nonlinear dynamics, data sparsity (e.g., nonresponse), and the effects of multiple exposures (i.e., ad repetition) present challenges to such algorithms (e.g., Agarwal 2010; Schwartz, Bradlow, and Fader 2016). Admittedly, hierarchical modeling (i.e., Bayesian) can help obviate the sparsity problem when Thompson sampling is used, but dynamics and multiple exposures are not straightforward extensions. Finally, we conduct simulations to show the import of our findings; these should be of interest to online retailers and digital media planners.

“‘Men Buy, Women Shop': The Sexes Have Different Priorities When Walking Down the Aisles,” Knowledge@Wharton podcast (Nov 28, 2007), http://knowledge.wharton.upenn.edu/article/men-buy-women-shop-the-sexes-have-different-priorities-when-walking-down-the-aisles/.

To address the questions in the study, we obtained panel data from a major U.S. retailer in an industry that provides products and services for the home. The data set offers a selection of daily ad impressions and their associated clicks, with both clicks and impressions disaggregated by consumer targets, ad format, and message content; ad networks commonly release such data to their clients. Specifically, the data set contains a panel of click counts for 154 days, across six creative formats and four targeted segments: Flash and GIF formats² in each of three size-orientation combinations (160 × 600, 300 × 250, and 728 × 90), and segments classified as retargeted, male, female, and age. A unique feature of the data is that daily impressions (within format and target) cluster into price, product, and control impressions, where price impressions are price promotion ads, product impressions are ads that stress brand benefits other than price; and control impressions are blank impressions used to exclude non-U.S. consumers from viewing specific ads. We (and the retailer) note that these blank ads often artificially inflate engagement measurements because viewers click on them, largely from curiosity (e.g., white objects become visually salient), but also in error (McConnell 2012). Finally, we model impressions as potentially endogenous (Lee, Hosanagar, and Nair 2015) because they may depend on omitted factors such as website content, format type, or clicking history.

Simple Flash ads contain animation frames and multiple click-through buttons but lack interactive elements; GIF ads have only a single click-through button and no animation frames or interactive elements.

The remainder of the article is organized as follows. The next section provides a brief review of the relevant streams in the advertising content and dynamic effects literatures. Subsequent sections develop the empirical model and describe its estimation and the data we employ, in that order. The last two sections describe our estimation results and conduct simulations to summarize their impact on a hypothetical media schedule. The article concludes with an overview of the findings and the limitations of the study.

Literature

We provide a brief review of the academic literature relevant to the effects of ad formats (size and animation), content, targeting, and carry-over rates, all on response metrics such as click-through rate (CTR), attention, and recall. Admittedly, we know much about the effectiveness of traditional ads, but our understanding of the effectiveness of digital ads is rapidly evolving. This review reflects that notion.

The Impact of Ad Size

Although one would expect larger banner ads to be more effective than smaller ads (ceteris paribus), the evidence seems inconclusive. Larger ads could seemingly improve memory for products and are more likely to be seen and remembered relative to smaller ads (Chandon, Chtourou, and Fortin 2003; Cho 1999). They have also been associated with greater attention and response (Baltas 2003), greater intention to spread positive word of mouth (Chtourou and Chandon 2000), higher recall (Chatterjee 2008), and higher CTR (Rettie, Grandcolas, and McNeil 2004; Robinson, Wysocka, and Hand 2007).³ Yet Cho (2003) and Drèze and Hussherr (2003) find no significant effect of ad size on engagement. They suggest that users learn to avoid looking at ads, even though the ads may affect them through their peripheral vision. These differing results perhaps suggest a tension between the ability of large ads to attract attention and their more intrusive nature that leads to avoidance. Thus, the problem needs more study, with a focus on both the research methods and the ad context (e.g., type of products/websites).

Not surprisingly, given its interest in ensuring continued growth of digital advertising, DoubleClick (2009) also reports that CTRs for large ads (300 × 600 and 240 × 400) were three times greater in their study than those for smaller ad formats.

The Impact of Animation

Experiments, in contrast, have confirmed that animation in banner ads can attract users’ attention and increase engagement. For example, Li and Bukovac (1999) find that users are able to quickly identify and better recall animated banner ads than static banner ads. Cho, Lee, and Tharp (2001) show that a higher degree of forced exposure to animated banner ads yields higher CTRs and more favorable attitudes among users. Animation has also been associated with greater clicking behavior in econometric studies (Hong, Thong, and Tam 2007; Lohtia, Donthu, and Hershberger 2003; Tsang and Tse 2005). These studies propose that when consumers have not decided on the items they want, they are more likely to click on animated ads because these ads may lead them to attribute a higher quality to the advertised products or pay greater attention. Similarly, other works have suggested that animation is more likely to be effective when user experience and brand familiarity are low (Dahlen 2001) or when users are searching for fun rather than for specific information (Tuten, Bosnjak, and Bandilla 2000).

Impact of Ad Content

There have been several major studies on the effects of ad creative or content in offline advertising, but there have been no major studies considering these effects in digital ads. For example, MacInnis, Rao, and Weiss (2002) find in a study of TV commercials that emotional content is more likely to increase sales and that ads that use rational appeal are less likely to produce increases. Chandy et al. (2001) study the effects of advertising on sales across multiple creatives. While they find many creative executions to be ineffective in increasing sales, they confirm that emotional ads are more effective in mature markets and argument-based appeals more effective in newer markets. Similarly, Bass et al. (2007) find that rational ads wear out faster than emotional ads: for example, in their data, price advertising has the highest wear-out among all appeals. In the digital space, however, evidence about the role of ad content is still emerging. Chtourou, Chandon, and Zollinger (2002) suggest that banner ads with promotional incentives have higher CTRs than those that lack incentive offers. Xie et al. (2004) also find evidence that incentive offers improve CTRs but that the effect varies by the type of appeal (rational vs. emotional). Thus, for example, in their study, banners with positive emotional appeals and incentive offers generated higher click-through than those with positive appeals and no incentives. Similarly, Hupfer and Grey (2005) show that banner ads that offer a free sample achieved higher click-through than banner ads with information only. Braun and Moe (2013) also find that the effects of creative content in banner ads can differ, even though their data do not ascribe substantive meaning to these contents. Nevertheless, taken together, these studies indicate the importance of ad content for digital media.

Impacts of Ad Targeting

The marketing literature has shown that more precise targeting can increase CTRs for banner ads (Briggs and Hollis 1997; Chandon, Chtourou, and Fortin 2003; Chatterjee, Hoffman, and Novak 2003; Sherman and Deighton 2001). For example, with regard to retail shopping, there is some evidence that women are more invested in the experience and thus more likely to spend more time browsing online; in contrast, men are more goal oriented (Passyn, Diriker, and Settle 2011), and for many product categories, women are the primary purchasers. Moreover, given current and exact technology, once consumers browse a firm's website, an ad network can use their browsing histories to serve that firm's banner ads to them when they visit other sites. Research suggests that such retargeted ads are, on average, surprisingly ineffective unless the consumers’ preferences for products viewed earlier are well defined, that is, unless “they have a detailed view of what product they wish to purchase” (Lambrecht and Tucker 2013, p. 2). This suggests that retargeted ads that offer the consumer incentives to buy should, on average, be more effective than ads that merely provide nonprice information.

Carry-Over Rates for Different Media and Targets

Braun and Moe (2013) evaluate the carry-over effects of banner ads in a model designed to study effectiveness of creative content, where carry-over is the extent to which past impressions affect the contemporaneous effects of banner ads on response behavior (Bass et al. 2007). The study uses data across individuals to obtain a homogeneous estimate of carry-over.

It is, however, well documented that the effects of advertising (and, thus, carry-over) can differ across channels, target markets, and media. For example, Sethuraman, Tellis, and Briesch (2011) report, from a meta-analysis of 56 studies, that television advertising has higher short-term elasticity but lower long-term elasticity than print advertising. Berkowitz, Allaway, and D'Souza (2001) model weekly data from three stores of a large national retailer and find that the carry-over effect of radio is higher than that of billboards. Similarly, Naik and Raman (2003), in a study that considers media synergy, find that carry-over for television is approximately 2.5 times that of print. With regard to markets, the literature (Deighton, Henderson, and Neslin 1994; D'Souza and Rao 1995) has reported that advertising is more effective among consumers who are more loyal, and for experience goods than for search goods (Hoch and Ha 1986). Finally, Breuer, Brettel, and Engelen (2011) find in a study of online channels that e-mail advertising has a longer effect than banner advertising. This raises questions of how carry-over effects may vary across different online formats, animated and static ad types, and different targets. Our study provides some answers to these questions.

Dynamic Model of Digital Advertising

We now present a nonlinear, state-space model to track the effectiveness of online display ads over time, across digital formats and targets. The model adopts an observation equation in which daily clicks follow an event-count distribution (e.g., Poisson), extended however to allow for different forms of nonresponse (e.g., zero clicks), because it can be shown that the presence of zeros in count data may lead to overdispersion, where the variance of the count distribution exceeds its mean (Greene 1994). This and other forms of dispersion violate basic assumptions in the standard event-count models. The state equation, in contrast, assumes a model of advertising goodwill (Nerlove and Arrow 1962) in which goodwill evolves over time as a function of banner size, animation, ad targeting, and different thematic impressions (price and product). The natural thematic variation in the data allows for the identification of the effects of price and product ads. Moreover, we control for the potential endogeneity of targeted impressions because such impressions may covary with unobservable site content, format type, or clicking history.

First, in a dynamic model of display advertising, one has to account for the presence of excess zeros (zero inflation) because of the typically low level of response to digital ads. Here, one may observe zero clicks because online consumers are unaware of advertising impressions, or they are aware but choose not to respond for numerous reasons, many unobservable to the researcher. In the former case, we observe what is often termed “structural zeros,” which are inevitable; in the latter case, we observe “sampling zeros,” which occur at random; the two events emerge from potentially distinct data-generating processes (e.g., Greene 1994; Lambert 1992).

Let f(y_ijkt|λ_ijkt) be the distribution for the random number of clicks y_ijkt during period t, for format i (Flash or GIF), size j (728 × 90, 300 × 250, or 160 × 600), and target k, with mean E(Y_ijkt) = λ_ijkt. In Equation 1, we adopt the familiar Poisson count distribution (Poisson-P); in a later section, we consider a linear model and other count models, including the negative binomial and the zero hurdle models:

f^{p} (y_{ijkt} | λ_{ijkt}) = \frac{\exp (- λ_{ijkt}) λ_{ijkt}^{y_{ijkt}}}{y_{ijkt}!} .

(1)

As just described, we may observe zero clicks under distinct data-generating processes: structural zeros arise when, say, impressions go unnoticed, whereas sampling zeros arise randomly as a count event from Equation 1. Suppose p_ijkt is the probability of observing structural zero clicks for format i, size j, and target k at time t; conversely, suppose 1 – p_ijkt is the probability that click-through occurs at some rate λ_ijkt, with I(y_ijkt = 0) an indicator function; then the distribution of clicks is the following:

π (Y_{ijkt} = y_{ijkt}) = \begin{cases} p_{ijkt} + (1 - p_{ijkt}) f (0 | λ_{ijkt}) & \text{if } y_{ijkt} = 0 \\ (1 - p_{ijkt}) f (y_{ijkt} | λ_{ijkt}) & \text{if } y_{ijkt} > 0 \end{cases},

(2)

or, equivalently,

π (Y_{ijkt} = y_{ijkt}) = p_{ijkt} I (y_{ijkt} = 0) + (1 - p_{ijkt}) f (y_{ijkt} | λ_{ijkt})

(3)

for i = 1, 2, …, I; j = 1, 2, …, J; k = 1, 2, …, K; t = 1, 2, …, T.
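To illustrate how Equations 2 and 3 combine a point mass at zero with an ordinary count distribution, the following minimal Python sketch evaluates the zero-inflated Poisson probabilities for a single format, size, and target on a single day and compares the implied mean and variance. The values of λ and p are hypothetical and chosen only for illustration, and the function names are introduced here rather than taken from the study.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Ordinary Poisson probability of y clicks given mean lam (Equation 1)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zip_pmf(y: int, lam: float, p: float) -> float:
    """Zero-inflated Poisson probability (Equation 3): a point mass at zero
    with weight p plus an ordinary Poisson count with weight 1 - p."""
    return p * (1.0 if y == 0 else 0.0) + (1.0 - p) * poisson_pmf(y, lam)

# Hypothetical values, chosen only for illustration.
lam, p = 2.0, 0.6
support = range(0, 60)  # truncated support; the Poisson tail beyond 60 is negligible here

total = sum(zip_pmf(y, lam, p) for y in support)
mean = sum(y * zip_pmf(y, lam, p) for y in support)
var = sum((y - mean) ** 2 * zip_pmf(y, lam, p) for y in support)

print(f"total probability = {total:.6f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

For these illustrative values, the mean is (1 − p)λ = 0.8 and the variance is (1 − p)λ(1 + pλ) = 1.76, the standard zero-inflated Poisson moments. The variance exceeds the mean whenever p > 0, which is the overdispersion that a plain Poisson model cannot reproduce.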

Note that Equations 2 and 3 can be viewed as a two-component mixture of an ordinary count distribution f(y_ijkt|λ_ijkt) and a degenerate distribution having a point mass at zero; that is, the probability of no response, π(Y_ijkt = 0), is a weighted average of both outcomes described previously. Note also that the standard Poisson count model (obtained when p_ijkt = 0) is fully embedded in Equations 2 and 3. Next, let c_ijkt be a dichotomous variable that indicates whether an observed nonresponse comes from the degenerate component (c_ijkt = 1) or randomly from the ordinary event-count component (c_ijkt = 0). We propose a simple hierarchical model for c_ijkt, where c_ijkt ~ Bernoulli(p_ijk) and the probability p_ijk of the degenerate event, taken to be constant over time, has the logistic transformation p_ijk = [1 + exp(−γ_ijk)]^–1. Therefore, Equations 1–3 constitute a zero-inflated Poisson model (Greene 1994; Lambert 1992), a widely applied technique for addressing overdispersion due to excess zeros in count data. Finally, the log-likelihood contribution from a single format and target in the Poisson case is the following:

{LL}_{ijk} = \sum_{t = 1}^{T} \log \left[ p_{ijk} I (y_{ijkt} = 0) + \frac{(1 - p_{ijk}) \exp (- λ_{ijkt}) λ_{ijkt}^{y_{ijkt}}}{y_{ijkt}!} \right] .

(4)
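Equation 4 can be evaluated directly once the latent means are given. The sketch below is a minimal illustration, assuming a known path of λ_ijkt and a known logit parameter γ_ijk for one format, size, and target; the click series and parameter values are hypothetical toy inputs, not the retailer's data, and the function name `zip_loglik` is introduced here for illustration only.

```python
import math

def zip_loglik(clicks, lam, gamma):
    """Zero-inflated Poisson log-likelihood of Equation 4 for one series.

    clicks: observed daily click counts y_t
    lam:    latent mean clicks lambda_t, same length as clicks
    gamma:  logit of the structural-zero probability p
    """
    p = 1.0 / (1.0 + math.exp(-gamma))  # logistic transformation of gamma
    total = 0.0
    for y, l in zip(clicks, lam):
        # Poisson probability computed on the log scale for numerical stability
        poisson = math.exp(-l + y * math.log(l) - math.lgamma(y + 1))
        zero_mass = p if y == 0 else 0.0  # p * I(y = 0)
        total += math.log(zero_mass + (1.0 - p) * poisson)
    return total

# Hypothetical toy series with many zeros.
clicks = [0, 0, 3, 0, 1, 0, 0, 2]
lam = [1.5] * len(clicks)

print(zip_loglik(clicks, lam, gamma=0.0))    # p = 0.5
print(zip_loglik(clicks, lam, gamma=-30.0))  # p close to 0: nearly the plain Poisson case
```

As γ becomes very negative, p approaches zero and the value approaches the ordinary Poisson log-likelihood, which reflects the nesting noted in the text.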

Given the preceding familiar framework, we can now develop a model to study the dynamic effects of advertising throughout a digital campaign. For instance, one may ask whether online ads exhibit carry-over similar to offline ads, and if they do, whether carry-over varies by digital media format and target. That is, does a consumer's decision to click on an ad at time t depend not only on the current impression but also on past, or carry-over, impressions? One could assess, too, whether some formats are better for different thematic impressions—specifically, whether incentive-based messages are better for retargeting—and assess the impact of size and animation on digital ad effectiveness. To do this, we employ a flexible state-space model, wherein unobservable mean clicks λ_ijkt in Equations 1–4 evolve over time in the following multiplicative way:

\begin{matrix} λ_{ijkt} = λ_{ijkt - 1}^{δ_{ijk}} exp (α_{ik} + σ_{jk} + \sum_{l = 1}^{L} β_{ijkl} \bar{f} (a_{ijklt}) + v_{ijkt}^{g}), \\ v_{ijkt}^{g} \sim N (0, ω_{ijk}^{2}), \end{matrix}

(5)

where goodwill is the log of the latent mean clicks, g_ijkt = log(λ_ijkt), and

g_ijkt

= goodwill of ads in format i (Flash or GIF), size j, target k at time t;

\bar{f} (a_{ijklt})

= a function of ad impression a_ijklt in format i, size j, target k, theme l at time t;⁴

The estimation uses a semilog transformation: $\bar{f} (a_{ijklt})$ = ln(1 + a_ijklt). For the justification of this functional form, see Bass et al. (2007).

β_ijkl

= effectiveness of impression in format i, size j, target k, theme l;

δ_ijk

= carry-over rate in format i, size j, target k;

α_ik

= fixed effect of animation in Flash ads in target k;

σ_jk

= fixed effect of size across Flash and GIF ads in target k; and

v_ijkt^g

= mean-zero, normal error for format i, size j, target k.

Thus, with log-link g_ijkt = log(λ_ijkt), Equation 5 is the familiar discrete-time goodwill model due to Nerlove and Arrow (1962); taking logs of both sides gives $g_{ijkt} = δ_{ijk} g_{ijkt-1} + α_{ik} + σ_{jk} + \sum_{l=1}^{L} β_{ijkl} \bar{f} (a_{ijklt}) + v_{ijkt}^{g}$. That is, one assumes that goodwill g_ijkt decays in proportion to prior goodwill g_ijkt–1 and is sustained here by an additive function of advertising exposures $\bar{f} (a_{ijklt})$. Moreover, although in the digital arena consumers can only click through if they see an ad, the decision to click is attributed to both the current and the cumulative effects, or goodwill, of past ad impressions. Without carry-over, our model would attribute ad response only to current impressions at time t. The fixed-effects parameters (α_ik and σ_jk) control for the possibility that digital media characteristics (i.e., size and animation) influence response; that is, online consumers may respond differently to messages in different ad formats and sizes. To separately identify the effects of size and animation, however, we set α_ik = 0 for the GIF formats. Apart from their simple interpretations in terms of our substantive questions, there are other desirable features of these fixed-effect parameters: first, they help impose correlation among multivariate count data; second, they explicitly account for one source of endogeneity that could arise if they were omitted, because format and size effects could indeed be correlated with ad impressions.
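The role of the carry-over rate δ_ijk can be seen by simulating the log form of Equation 5 for one format, size, and target. The sketch below is illustrative only: it assumes two themes (price and product), a short burst of impressions followed by no advertising, and hypothetical parameter values; the function name `simulate_goodwill` is introduced here and is not part of the study.

```python
import math
import random

def simulate_goodwill(impressions, delta, alpha, size_effect, beta, omega,
                      g0=0.0, seed=0):
    """Simulate goodwill g_t from the log form of Equation 5.

    impressions: per-day tuples (price_impressions, product_impressions)
    delta:       carry-over rate
    alpha:       animation fixed effect (zero for GIF formats)
    size_effect: size fixed effect
    beta:        effectiveness of each theme, same order as the tuples
    omega:       standard deviation of the goodwill shock
    """
    rng = random.Random(seed)
    g, path = g0, []
    for a_t in impressions:
        # Semilog transformation of impressions, f(a) = ln(1 + a)
        drive = sum(b * math.log(1.0 + a) for b, a in zip(beta, a_t))
        g = delta * g + alpha + size_effect + drive + rng.gauss(0.0, omega)
        path.append(g)
    return path

# Hypothetical schedule: five days of advertising, then fifteen days of none.
impressions = [(1000, 500)] * 5 + [(0, 0)] * 15
common = dict(alpha=0.0, size_effect=0.0, beta=(0.05, 0.02), omega=0.0)

high = simulate_goodwill(impressions, delta=0.9, **common)  # slow decay
low = simulate_goodwill(impressions, delta=0.5, **common)   # fast decay

for day in (4, 9, 19):
    print(day + 1, round(high[day], 4), round(low[day], 4))
```

With the shocks and fixed effects switched off, goodwill after the burst shrinks by the factor δ each day, so the implied half-life is log(0.5)/log(δ) days. Mean clicks follow as λ = exp(g).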

Yet more likely sources of endogeneity in ad impressions, $cov (a_{ijklt}, v_{ijkt}^{g}) \neq 0$, may be due to the context consisting of unobservable site features such as the information content of the site. Ad networks are likely to serve more impressions to sites whose context matches the advertised product. Given these reliability concerns, we follow Naik and Tsai (2000) and Sonnier, Rutz, and McAlister (2011) to account for endogeneity:

m_{kt} = θ_{kt} + η_{kt} z_{kt} + v_{kt}^{m};

(6)

θ_{kt} = B_{k} θ_{kt - 1} + v_{kt}^{θ},

(7)

where

m_kt = $[\bar{f} (a_{11k1t}), \dots, \bar{f} (a_{23k1t}), \bar{f} (a_{11k2t}), \dots, \bar{f} (a_{23k2t})]',$

v_kt^m = $[v_{11k1t}^{m}, \dots, v_{23k1t}^{m}, v_{11k2t}^{m}, \dots, v_{23k2t}^{m}]',$

v_kt^θ = $[v_{11k1t}^{θ}, \dots, v_{23k1t}^{θ}, v_{11k2t}^{θ}, \dots, v_{23k2t}^{θ}]',$

$v_{ijkt}^{θ} \sim N (0, Σ_{ijk}), \quad [v_{ijkt}^{g}, v_{ijkt}^{m}] \sim N (0, H_{ijk}),$ and

H_ijk = $\begin{bmatrix} ω_{ijk}^{2} & S_{ijk}' \\ S_{ijk} & Ω_{ijk} \end{bmatrix}.$

That is, Equations 6–7 model impressions (m_kt) across formats, messages, and targets as functions of (1) covariates z_kt, dummy variables for the category of the product embedded in these impressions; (2) a random measurement noise $v_{kt}^{m}$; and (3) a latent time-varying component θ_kt that is governed by a first-order autoregressive process (Naik and Tsai 2000). The category dummies are proxy measures for the web context, aiming to capture endogeneity in impressions due to the matching of website with the retailer's advertised product; the latent measure θ_kt captures variation in impressions due to other unobserved factors. Recall that our data capture the product-related promotions of a (multicategory) retailer. As such, the objectives of these ads are to create awareness of and engagement with the retailer's products among targeted consumers, using targeted price- and product-related messages. Note that many of these products have specific uses and thus are often advertised on sites whose content relates to them; the retailer's targets are likely to visit contextually matched sites, and when they do so, they are likely to engage. This type of matching suggests that our product category dummies are potentially valid instruments, related to website content and traffic.⁵ Note also that our instrumental variable model is a state-space model, with a time-varying intercept θ_kt as defined in Equation 6. This helps control for other time-varying unobservables that could covary with ad impressions. Formally, to control for potential endogeneity (which becomes relevant when elements of $cov (v_{ijkt}^{g}, v_{ijkt}^{m}) = S_{ijk} \neq 0$), we condition the analysis of Equations 1–5 on v_ijkt^m (see, e.g., Rossi, Allenby, and McCulloch 2005).

The population R² values from the regression of log(1 + ad impression) against these instruments range from .63 to .73 across the four consumer targets (see Web Appendix for further assessment of our instruments).

In summary, we propose a model to investigate the effects of digital ads served across multiple formats, messages, and targeted consumer segments over time. The model has three major components: a nonlinear model of ad response that accounts for the presence of zeros in event-count data; a model of ad dynamics that links impressions and targeting decisions to ad response; and a linear measurement model that controls for endogeneity in ad impressions. The model can address several questions about the duration of advertising across digital formats, including whether some formats and retargeting strategies are more effective with price-based incentives and what the impact of size and animation is on digital ad effectiveness. First, however, we must develop an estimation scheme, whose primary challenge will be to recover time-varying vectors that include both linear and nonlinear components.

Estimation and Inference

We adopt a Bayesian approach to estimation because of its versatility and our need to evaluate nonlinear, nonnormal state-space parameters. With few exceptions (e.g., Lopes et al. 2010), the Bayesian approach to such problems relies on conditional independence to iteratively sample a sequence of conditional posteriors (for the fixed and time-varying parameters) rather than sample directly from their intractable joint posterior (Doucet, De Freitas, and Gordon 2001). How, then, does conditional independence help resolve our estimation problem defined by Equations 1–7? First, consider our essential task: to recover a joint, but intractable, posterior p(θ_t, g_t|y_t, m_t, ζ) (intractable because the distribution of y_t is nonlinear/non-Gaussian), where g_t = {g_11t, g_12t, …, g_IJt} (g_t = log(λ_t)) and θ_t = {θ_11t, θ_12t, …, θ_IJt} are the vectors of goodwill and measurement state variables just described; ζ is a collection of all the static parameters; and y_t and m_t are clicks and impressions, respectively, at time t (here we suppress the target subscript, k) (see Equations 1–5). Thus, in our case, conditional on g_t, the clicks y_t provide no further information for estimating the measurement state variable θ_t. In other words, g_t becomes a sufficient statistic for estimating θ_t; thus, Equations 5–6 become the linear observation equations and Equation 7 becomes the linear system equation for the state θ_t. We can therefore apply the basic Kalman filter/smoother algorithm to estimate p(θ_t|g_t, m_t, ζ) and its related fixed parameters in ζ (see, e.g., Bass et al. 2007; Carter and Kohn 1994; Frühwirth-Schnatter 1994).
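The linear step can be illustrated with a textbook Kalman filter applied to a scalar version of Equations 6–7. The sketch below is a deliberate simplification and not the study's estimation scheme: it treats one impression series with a single covariate, assumes all fixed parameters are known, runs only the forward filter (no smoother), and ignores the conditioning on goodwill. All parameter values and simulated data are hypothetical.

```python
import random

def kalman_filter(m, z, eta, B, q, r, theta0=0.0, P0=1.0):
    """Forward Kalman filter for a scalar version of Equations 6-7.

    Observation: m_t = theta_t + eta * z_t + noise with variance r
    State:       theta_t = B * theta_{t-1} + noise with variance q
    Returns the filtered means and variances of theta_t.
    """
    theta, P = theta0, P0
    means, variances = [], []
    for m_t, z_t in zip(m, z):
        # Prediction step from the autoregressive state equation
        theta_pred = B * theta
        P_pred = B * B * P + q
        # Update step using the new observation
        innovation = m_t - (theta_pred + eta * z_t)
        gain = P_pred / (P_pred + r)
        theta = theta_pred + gain * innovation
        P = (1.0 - gain) * P_pred
        means.append(theta)
        variances.append(P)
    return means, variances

# Simulate hypothetical data from the same scalar model.
rng = random.Random(0)
eta, B, q, r = 0.5, 0.9, 0.1, 0.4
state, m, z = 0.0, [], []
for _ in range(50):
    state = B * state + rng.gauss(0.0, q ** 0.5)
    z_t = float(rng.random() < 0.5)  # a single 0/1 category dummy
    z.append(z_t)
    m.append(state + eta * z_t + rng.gauss(0.0, r ** 0.5))

means, variances = kalman_filter(m, z, eta, B, q, r)
print(round(means[-1], 4), round(variances[-1], 4))
```

The filtered variance settles quickly at a value below the measurement variance r, because each update combines the prediction with the new observation.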

In contrast, the conditional posterior p(g_t|y_t, ζ) of goodwill is nonlinear and non-Gaussian (because its observation Equation 2 is zero-inflated Poisson), and so there is no general (or closed-form) expression for its probability density function; we thus approximate it using the particle filter (Bruce 2008; Doucet, De Freitas, and Gordon 2001; Liu and Chen 1998). Particle filtering belongs to a class of sequential Monte Carlo integration methods based on Bayesian inference. It is more flexible than the extended and unscented Kalman filters, methods that work with Gaussian approximations for posterior densities, which make them simpler to implement and faster to execute but preclude them from modeling the higher-order moments of truly non-Gaussian distributions (Ristic, Arulampalam, and Gordon 2004). Particle filtering involves the use of particles, samples drawn from an importance function, and their associated weights to approximate the probability density function. The procedure, based on importance sampling (e.g., Geweke 1989), provides a discrete approximation to the posterior density of the states through a set of N_s support points (or particles) $\{g_{0:t}^{n}\}_{n=1}^{N_{s}}$ and their respective weights, $\{w_{0:t}^{n}\}_{n=1}^{N_{s}}$, where $w_{t}^{n} > 0$ and $\sum_{n=1}^{N_{s}} w_{t}^{n} = 1$. These particles are drawn from an importance function; the choice of this density is one of the most important decisions in constructing particle filter algorithms.
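To make the particle approximation concrete, the following minimal Python sketch runs one filtering step for a single scalar goodwill series. It is an illustration under stated assumptions, not the estimation code used in the study: the transition g_t = δ·g_{t–1} + drift + noise, the parameter values, the click sequence, and the function names `zip_loglik` and `bootstrap_step` are all hypothetical. It draws particles from the transition density (the simplest importance function), weights them by a zero-inflated Poisson likelihood, normalizes the weights so that they sum to one, and resamples.

```python
import math
import random

def zip_loglik(y, g, p_zero):
    """Log-likelihood of count y under a zero-inflated Poisson with log-rate g."""
    lam = math.exp(g)
    if y == 0:
        # A zero arises from the inflation component or from the Poisson itself.
        return math.log(p_zero + (1.0 - p_zero) * math.exp(-lam))
    log_pois = y * g - lam - math.lgamma(y + 1)
    return math.log(1.0 - p_zero) + log_pois

def bootstrap_step(particles, y, delta, drift, sigma, p_zero, rng):
    """One step: propagate through the (assumed) transition, reweight, resample."""
    # Assumed transition: g_t = delta * g_{t-1} + drift + Gaussian noise.
    proposed = [delta * g + drift + rng.gauss(0.0, sigma) for g in particles]
    logw = [zip_loglik(y, g, p_zero) for g in proposed]
    m = max(logw)                       # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    w = [wi / total for wi in w]        # normalized weights sum to one
    ess = 1.0 / sum(wi * wi for wi in w)  # effective sample size
    resampled = rng.choices(proposed, weights=w, k=len(proposed))
    return resampled, w, ess

rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
for y in [0, 0, 3, 1, 0, 2]:            # hypothetical daily click counts
    particles, weights, ess = bootstrap_step(
        particles, y, delta=0.6, drift=0.2, sigma=0.3, p_zero=0.3, rng=rng
    )
```

The effective sample size returned at each step is a common diagnostic: values far below the number of particles indicate that a few particles carry most of the weight.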

Choosing an Importance Function

In many applications of the particle filter, the chosen importance function is the transition density (Equation 5) because it is simple and readily available from the model. Yet we know that particle filter algorithms that use this (prior) importance function often suffer from the degeneracy problem; that is, the variance of the importance weights increases over time. Intuitively, if the data are very informative (i.e., the variance of data distribution is very small), the algorithm would waste many samples and time by exploring regions of low importance. To make the method more effective, Doucet, De Freitas, and Gordon (2001) and Liu and Chen (1998) suggest importance functions of the form p(g_t|g_t–1, y_t), that is, ones that incorporate both the system and observation processes. Indeed, Doucet, De Freitas, and Gordon (2001) show that this importance function p(g_t|g_t–1, y_t) addresses the degeneracy problem by minimizing the variance of the (unnormalized) importance weight $\{w_{0:t}^{n}\}_{n=1}^{N_{s}}$. Nevertheless, it is very difficult to derive optimal importance functions of the form p(g_t|g_t–1, y_t) analytically, outside a few special cases (e.g., Bruce 2008) and certainly not in our case, where the observations are nonlinear and non-Gaussian. Yet we can derive a linear-normal approximation of this optimal function (see the Web Appendix):

\tilde{p}(g_{t} \mid g_{t-1}, y_{t}) = N\left(g_{t}^{*},\; -[l''(g_{t}^{*})]^{-1}\right),

(8)

where l(g_t) = ln p(y_t|g_t)p(g_t|g_t–1), with derivatives of this log distribution, l′ and l″, evaluated at its mode:

\begin{array}{l} l'(g_{t}^{*}) = \left.\frac{\partial l(g_{t})}{\partial g_{t}}\right|_{g_{t} = g_{t}^{*}}, \text{ and} \\ l''(g_{t}^{*}) = \left.\frac{\partial^{2} l(g_{t})}{\partial g_{t}\, \partial g_{t}'}\right|_{g_{t} = g_{t}^{*}} . \end{array}

We can obtain the mode g_t* of l(g_t) by applying an iterative Newton–Raphson procedure, initialized with g_t⁰ = g_t–1 at each step of the filter:

g_{t}^{k+1} = g_{t}^{k} - [l''(g_{t}^{k})]^{-1}\, l'(g_{t}^{k}) .

(9)
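The Newton–Raphson search in Equation 9 and the normal approximation in Equation 8 can be sketched for a scalar state as follows. This is a simplified illustration, not the authors' implementation: it assumes a plain Poisson observation with log-rate g_t (zero inflation is omitted for brevity) and a Gaussian transition with mean `mu` and variance `sigma2`; the function name `laplace_proposal` and all numerical values are hypothetical. Under these assumptions, l(g) = y·g – exp(g) – (g – mu)²/(2·sigma2) up to a constant.

```python
import math

def laplace_proposal(y, mu, sigma2, g0, tol=1e-10, max_iter=50):
    """Return the mode g* of l(g) and the variance -1/l''(g*) of the normal proposal.

    Assumed log target (up to a constant):
        l(g) = y*g - exp(g) - (g - mu)**2 / (2*sigma2)
    """
    g = g0                                         # e.g., start at the previous state
    for _ in range(max_iter):
        grad = y - math.exp(g) - (g - mu) / sigma2  # l'(g)
        hess = -math.exp(g) - 1.0 / sigma2          # l''(g), always negative here
        step = grad / hess
        g -= step                                   # g^{k+1} = g^k - l'(g^k) / l''(g^k)
        if abs(step) < tol:
            break
    hess = -math.exp(g) - 1.0 / sigma2
    return g, -1.0 / hess

mode, var = laplace_proposal(y=3, mu=0.5, sigma2=0.25, g0=0.5)
```

Because this assumed l(g) is strictly concave, the iteration converges to a unique mode, and the resulting variance is positive. The mode lies between the prior mean `mu` and log(y), which reflects how the proposal blends the system and observation information.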

Data and Identification

Recall that our substantive aim is to explore how central features of digital ads affect consumer engagement over time. Thus, to identify the cross-sectional and temporal features of the problem, we acquired panel data from a major U.S. retailer in an industry that provides products and services for the home. The data contain daily ad impressions served via an ad network, as well as the resulting clicks. Both sets of data are disaggregated by target, format, and message and cover a period of T = 154 days, from February 14, 2011, to July 17, 2011. In this campaign, the retailer targeted four broad segments, one behavioral (retargeted) and three demographic (male, female, age); and employed two ad formats, Flash (animated) and GIF (static). Flash ads appear as a sequence of four to eight time-delayed images, with the last image identical to the corresponding static GIF image. Flash ads not only include colorful, attractive animation but also deliver a longer message than GIF ads. There are also three standard size–orientation combinations for ads: 728 × 90 (“leaderboard”), 160 × 600 (“skyscraper”), and 300 × 250 (“box”) (see Figure 1). The retailer categorizes ads as price messages if they mention prices or price discounts and as product messages if they convey product attributes without reference to price. Finally, the ad network serves blank impressions (white spaces) to non-U.S. consumers to preclude them from viewing the ads. The network serves product offer and control impressions exclusively in Flash and GIF formats, respectively.

Figure 1

Digital Ad Formats

Model identification thus draws on a balanced panel of 24 time series (24 = 4 targets × 2 formats [Flash, GIF] × 3 sizes) each of length T = 154. How does this panel help identify our substantive parameters, the effect of ads across formats, messages, and targets? First, we have day-to-day variations in clicks and their associated impressions within each of the 24 target–format time series, with the impressions in each series further classified into price and product messages for Flash ads and price and control messages for GIF ads. Moreover, the correlation between impression pairs (price/product and price/control) in our sample is low (median = .0378); that is, there are daily variations in the number of impressions served for each theme such that comovement is negligible. This natural thematic variation (Schumann and Clemons 1989) allows one to recover the separate effects of price, product, and control impressions on click response in each of the 24 target–format (time series) combinations.

Table 1 provides summary statistics for the data. We see that the average total number of clicks for Flash ads is about 100 times more than clicks for GIF ads, across all ad sizes and targets. Within the Flash ads, the leaderboard ads generate the highest average number of clicks, whereas the box ads have the lowest average. The average total number of clicks within the age segment is considerably higher than clicks within the other three segments. Quick calculations show that the firm serves about 40% product ad impressions and 60% price ad impressions. Nearly 59% of the retailer's ads are served to the age target segment; the remaining impressions are served to the retargeted (17%), male (8%), and female (16%) segments. Figure 2 plots time series of clicks for T = 154 across the formats. Note the spikes in the numbers of clicks around period 10; these occurred during the early spring, when consumers are interested in home improvement projects as winter is ending. Ad messages here offer specific promotions that take advantage of this interest.⁶ Figure 3 summarizes the data across formats and targets.

⁶ We provide evidence in the Web Appendix that the spike, which could induce endogeneity and as a result overstate advertising effects, is not a problem in this sample. We thank the Associate Editor for suggesting this additional analysis and discussion.

Table 1

Data Summary by Ad Targets (Means)

Measure and Format	Retargeted	Male	Female	Age	Total
Clicks
Flash (160 × 600)	155.82	139.43	285.82	1,382.68	2,117.46
Flash (300 × 250)	292.94	70.55	217.36	1,453.74	2,034.59
Flash (728 × 90)	420.68	128.80	274.52	1,704.24	2,528.24
GIF (160 × 600)	1.45	.74	1.79	14.83	18.81
GIF (300 × 250)	1.53	.25	2.21	19.69	23.68
GIF (728 × 90)	1.60	.31	1.69	16.41	20.01
Impressions
Flash (160 × 600)
Product	181,845.58	114,546.73	149,402.92	1,098,548.9	1,544,344.1
Price	402,601.12	239,398.52	520,822.05	1,024,514.6	2,187,336.3
Flash (300 × 250)
Product	188,393.25	85,656.40	140,812.78	1,027,470.3	1,442,332.74
Price	448,608.59	178,961.81	412,859.85	1,207,589.7	2,248,019.91
Flash (728 × 90)
Product	265,517.16	142,603.98	204,317.82	1,475,542.2	2,087,981.16
Price	570,619.16	276,865.37	601,745.19	1,508,415.9	2,957,645.10
GIF (160 × 600)
Price	331.16	64.19	449.08	529.85	1,116.22
International^a	155.09	443.08	814.88	6,584.31	7,997.36
GIF (300 × 250)
Price	311.68	47.95	927.75	535.08	1,580.63
International^a	235.36	297.36	1,203.09	14,454.53	16,190.34
GIF (728 × 90)
Price	259.74	69.18	794.23	584.44	1,527.88
International^a	363.46	460.91	1,215.66	13,104.37	15,144.39

^a “International” refers to blank impressions (white spaces) served in the place of ads to exclude consumers in overseas markets.

Notes: T = 154 days.

Figure 2

Clicks by Ad Formats

Figure 3

Clicks by Ad Formats and Targets

Table 2 compares the CTRs for different formats and ad sizes, across consumer targets. Ads presented to retargeted consumers in the GIF (300 × 250) format have the highest average CTR of .14%, suggesting GIF ads can be effective in some contexts. Nevertheless, these CTRs are very low, as expected. For example, the Flash ads have an average total CTR that ranges from .05% to .057%; CTR values for similar GIF ads range from .013% to .059%, if one ignores clicks from the large percentage of blank impressions served to non-U.S. visitors (McConnell 2012).

Table 2

CTR (%) by Media Formats and Ad Targets

Format	Retargeted	Male	Female	Age	Total
Flash (160 × 600)	.0530	.0394	.0426	.0652	.0567
Flash (300 × 250)	.0460	.0267	.0393	.0651	.0551
Flash (728 × 90)	.0503	.0307	.0340	.0571	.0501
GIF (160 × 600)^a	.1003	.0906	.0043	.0684	.0134
GIF (300 × 250)^a	.1359	.0068	.0716	.0598	.0686
GIF (728 × 90)^a	.0357	.0727	.0525	.0413	.0435

^a CTR for price ads.

Notes: T = 154 days.
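As a consistency check, the Flash entries in the Total column of Table 2 can be recovered from the daily means in Table 1 by dividing mean clicks by the sum of mean product and price impressions. The short Python sketch below does this arithmetic; the dictionary layout and the helper name `ctr_percent` are illustrative choices, and the same calculation is not attempted for GIF ads because the reported GIF CTRs cover price ads only.

```python
# Total-column means from Table 1: (clicks, product impressions, price impressions).
flash = {
    "160x600": (2117.46, 1544344.1, 2187336.3),
    "300x250": (2034.59, 1442332.74, 2248019.91),
    "728x90": (2528.24, 2087981.16, 2957645.10),
}

def ctr_percent(clicks, *impressions):
    """Click-through rate in percent: clicks divided by total impressions."""
    return 100.0 * clicks / sum(impressions)

ctr = {size: round(ctr_percent(*vals), 4) for size, vals in flash.items()}
# ctr reproduces the Flash totals reported in Table 2: .0567, .0551, and .0501.
```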

Estimation Results

Tables 3–7 display the results of our empirical analysis. Tables 3–5 report findings related to robustness checks and to the potential endogeneity of ad impressions. Tables 6 and 7 report estimates of the main parameters of the proposed model (dynamic zero-inflated Poisson; DZIP). Estimates in boldface are those whose 95% highest posterior density interval (HPDI) excludes zero. What follows are, first, reviews of the robustness and endogeneity results; next, reviews of results (Tables 6–7) related to the effects of ad format, carry-over, and message content across the four consumer targets in the study; and finally, a brief summary of the main conclusions.

Table 3

Alternative Models

Model	Description	DIC	Rank
Model 1	Dynamic zero-inflated Poisson (DZIP)	14,256.9	1
Model 2	Dynamic Poisson (DP)	22,989.4	2
Model 3	Dynamic hurdle Poisson (DHP)	23,733.6	3
Model 4	Zero-inflated Poisson (ZIP), no dynamics, δ = 0	25,402.3	4
Model 5	Dynamic zero-inflated negative binomial (DZINB)	31,705.8	5
Model 6	Dynamic negative binomial (DNB)	31,978.5	6
Model 7	Zero-inflated negative binomial (ZINB), δ = 0	32,556.4	7
Model 8	Normal dynamic linear model (NDLM)	33,455.8	8

Table 4

Effect of Product Contextual Variables on Impressions

	Retargeted	Male	Female	Age
Product Ads
Category 1	.0919	.0407	–.0122	–.0094
Category 2	–.0442	–.0284	–.2460	–.1262
Category 3	.5249	.6214	.8791	.6189
Category 4	.7994	.9754	1.1196	.9306
Category 5	.6459	.5762	.8399	.9794
Category 6	.3274	.2446	.2356	.3586
Category 7	–.1851	–.3391	–.4695	–.5563
Category 8	1.2745	1.3370	1.8473	1.3779
Category 9	–.0643	–.1789	–.3717	–.1785
Category 10	.1265	–.1205	–.0904	–.4301
Category 11	–.0387	–.1015	–.0675	.0682
Category 12	–.7680	–.9852	–1.4167	–.8812
Category 13	–.7346	–.8950	–1.3714	–.8087
Category 14	–.1503	.4245	–.1945	–.6860
Price Ads
Category 1	.0640	.0510	.0346	–.1529
Category 2	.0960	–.0135	.3031	.2283
Category 3	.0257	–.0559	.2017	–.0575
Category 4	.1597	.0530	.1768	.1298
Category 5	.3704	.4333	.8596	.8727
Category 6	–.0762	–.0776	.0800	–.0998
Category 7	–.0525	.0486	.0717	–.0942
Category 8	–.1455	–.1300	.1539	–.1491
Category 9	.2914	.3909	.2642	.4060
Category 10	.0729	–.0262	–.1305	.0099
Category 11	–.1270	–.2344	–.2613	–.1664
Category 12	.1459	–.0054	–.2542	.0390
Category 13	.1426	.2781	–.4545	.1064
Category 14	.2922	.2111	.4132	.4688

Notes: Boldface indicates values for which the 95% HPDI excludes zero.

Table 5

Measurement Model: Correlations with Goodwill Error

Format and Message	Retargeted	Male	Female	Age
Flash (160 × 600)
Product	.1801	.2788	.1596	.2224
Price	.1342	.2669	.1810	.3137
Flash (300 × 250)
Product	.2350	.1737	.2074	.2556
Price	.1294	.2492	.2788	.4743
Flash (728 × 90)
Product	.2188	.2660	.2077	.2191
Price	.1810	.2387	.2111	.3029
GIF (160 × 600)
Price	.1168	.0035	.1290	.2438
International^a	.1155	.0877	.0432	–.0592
GIF (300 × 250)
Price	.2051	.0660	.1416	.2316
International^a	.1402	.0255	–.0996	.4237
GIF (728 × 90)
Price	.1016	.0063	.2059	.3756
International^a	.0930	.3027	.0583	.0226

^a “International” refers to blank impressions (white spaces) served in the place of ads to exclude consumers in overseas markets.

Notes: Boldface indicates values for which the 95% HPDI excludes zero.

Table 6

Estimates From Proposed Model by Formats, Messages, and Ad Targets

Parameters	Retargeted	Male	Female	Age
Flash Effect	2.4688	2.3394	2.8292	2.5346
Banner Orientation–Size
Combination
160 × 600, σ₁	–1.5515	–1.8996	–2.5996	–2.1887
300 × 250, σ₂	–2.0080	–2.1199	–2.7617	–1.3978
728 × 90, σ₃	–1.1505	–1.8390	–2.6533	–2.3606
Flash Formats
Flash (160 × 600)
Carry-over rate, δ₁	.5243	.6180	.5582	.6767
Product offer, β₁₁	.0025	.0254	.0372	.0303
Price offer, β₁₂	.1264	.0965	.1472	.1164
Flash (300 × 250)
Carry-over rate, δ₂	.6849	.7004	.5996	.6397
Product offer, β₂₁	.0110	.0308	.0335	.0245
Price offer, β₂₂	.0960	.1051	.1341	.0865
Flash (728 × 90)
Carry-over rate, δ₃	.6734	.6764	.5835	.7547
Product offer, β₃₁	–.0046	.0068	.0325	.0223
Price offer, β₃₂	.0784	.0406	.1330	.0935
GIF Formats
GIF (160 × 600)
Carry-over rate, δ₄	.1190	.1223	.1482	.1116
Price offer, β₄₁	–.0179	.0231	.1062	.0243
International, β₄₂^a	.7783	.3196	.8475	.8196
GIF (300 × 250)
Carry-over rate, δ₅	.1258	.2725	.1916	.1714
Price offer, β₅₁	.0961	–.0483	.1230	–.0050
International, β₅₂^a	.7774	.3661	.8076	.7053
GIF (728 × 90):
Carry-over rate, δ₆	.0869	.1410	.1615	.1288
Price offer, β₆₁	.0722	.0117	.1344	–.0188
International, β₆₂^a	.6507	.1920	.7671	.7942

^a “International” refers to blank impressions (white spaces) served in the place of ads to exclude consumers in overseas markets.

Notes: Boldface indicates values for which the 95% HPDI excludes zero.

Table 7

90% Depreciation and Ad Elasticity

Target and Format	Mean (δ)	D₉₀ (Days)	Elasticity of Product	Elasticity of Price
Retargeted
Flash (160 × 600)	.5243	4.8404	.0043	.2663
Flash (300 × 250)	.6849	7.3075	.0344	.3128
Flash (728 × 90)	.6734	7.0502	–.0157	.2437
GIF (160 × 600)	.1190	2.6136	—	–.0182
GIF (300 × 250)	.1258	2.6639	—	.0983
GIF (728 × 90)	.0869	2.5217	—	.0786
Male
Flash (160 × 600)	.6180	6.0277	.0695	.2540
Flash (300 × 250)	.7004	7.6855	.1073	.3708
Flash (728 × 90)	.6764	7.1155	.0209	.2440
GIF (160 × 600)	.1223	2.6234	—	.0354
GIF (300 × 250)	.2725	3.1651	—	–.0621
GIF (728 × 90)	.1410	2.6805	—	.0140
Female
Flash (160 × 600)	.5582	5.2118	.0885	.3471
Flash (300 × 250)	.5996	5.7505	.0968	.3051
Flash (728 × 90)	.5835	5.5284	.0784	.3210
GIF (160 × 600)	.1482	2.7032	—	.1233
GIF (300 × 250)	.1916	2.8483	—	.1362
GIF (728 × 90)	.1615	2.2761	—	.1595
Age
Flash (160 × 600)	.6767	7.1221	.0956	.3660
Flash (300 × 250)	.6397	6.3907	.0682	.2412
Flash (728 × 90)	.7547	9.3869	.0939	.3933
GIF (160 × 600)	.1116	2.5918	—	—
GIF (300 × 250)	.1714	2.7789	—	—
GIF (728 × 90)	.1288	2.6430	—	—

Notes: Boldface indicates values for which the 95% HPDI excludes zero. Elasticity evaluated at posterior draws; elasticity = (∂λ/∂a)(a/λ) = aβf′(a)/(1–δ).

Model Selection

Table 3 compares the proposed DZIP model to seven alternative models, including the normal dynamic linear model (NDLM) and alternative count models—specifically, variants of the zero hurdle and negative binomial models, respectively:

\pi^{H}(Y_{ijkt} = y_{ijkt}) = \begin{cases} p_{ijk} & \text{if } y_{ijkt} = 0 \\ (1 - p_{ijk})\, f(y_{ijkt} \mid \lambda_{ijkt}, y_{ijkt} > 0) & \text{if } y_{ijkt} > 0 \end{cases}, \text{ and}

(10)

f^{NB}(Y_{ijkt} = y_{ijkt} \mid \lambda_{ijkt}) = \frac{\Gamma(k_{ijk} + y_{ijkt})}{\Gamma(k_{ijk})\, y_{ijkt}!} \left(\frac{k_{ijk}}{k_{ijk} + \lambda_{ijkt}}\right)^{k_{ijk}} \left(\frac{\lambda_{ijkt}}{k_{ijk} + \lambda_{ijkt}}\right)^{y_{ijkt}} .

(11)

Note the proposed DZIP model dominates all alternatives, as indicated by its deviance information criterion (DIC) value. The DIC and similar model selection methods (Akaike information criterion [AIC] and Bayesian information criterion [BIC]) include penalty terms to offset gains in model fit due solely to added complexity, since more complex models with more parameters generally provide better fit. For Bayesian hierarchical models, however, the number of parameters is less clear. Spiegelhalter et al. (2002) propose the DIC to address this uncertain complexity in Bayesian hierarchical models (e.g., Equation 5). With the DIC, then, the worst model (Model 8) is the NDLM; this confirms that here, a normal approximation to the distribution of the data (clicks) is inappropriate. The negative binomial models also perform poorly relative to Poisson models, and among the latter, the DZIP model dominates. Thus, we show that it is important to consider the dynamic effects of digital ads as well as to control for overdispersion. Finally, Figure 4 shows the fit of the DZIP model by plotting its posterior mean (λ_ijkt) against the actual number of daily clicks for different formats and targets. The proposed model fits the data quite well.
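For readers unfamiliar with the DIC, the sketch below shows how it is assembled from posterior draws following the definition in Spiegelhalter et al. (2002): DIC = D̄ + p_D, where D(θ) = –2 log L(θ) is the deviance, D̄ is its posterior mean, and p_D = D̄ – D(θ̄) is the effective number of parameters. The example is a toy illustration only: it assumes a static Poisson model with a single rate, and the counts, draws, and function names are hypothetical rather than taken from the study.

```python
import math

def poisson_loglik(counts, lam):
    """Log-likelihood of independent Poisson counts with common rate lam."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)

def dic(counts, lam_draws):
    """Return (DIC, p_D) from posterior draws of the Poisson rate."""
    deviances = [-2.0 * poisson_loglik(counts, lam) for lam in lam_draws]
    d_bar = sum(deviances) / len(deviances)           # posterior mean deviance
    lam_bar = sum(lam_draws) / len(lam_draws)         # posterior mean of the rate
    d_at_mean = -2.0 * poisson_loglik(counts, lam_bar)
    p_d = d_bar - d_at_mean                           # effective number of parameters
    return d_bar + p_d, p_d

counts = [0, 2, 1, 3, 0, 1]              # hypothetical click counts
lam_draws = [0.9, 1.1, 1.3, 1.0, 1.2]    # hypothetical posterior draws
dic_value, p_d = dic(counts, lam_draws)
```

Lower DIC values indicate a better trade-off between fit and complexity, which is how the ranking in Table 3 should be read.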

Figure 4

Clicks vs. Posterior Means by Ad Formats and Targets

Endogeneity of Impressions

Tables 4 and 5 show the results of an analysis into the potential endogeneity of ad impressions. In implementation, this means controlling for the potential comovement of the measurement (v_ijkt^m) and goodwill (v_ijkt^g) noises (Naik and Tsai 2000; Rossi, Allenby, and McCulloch 2005), after accounting for the unobserved information context of the publisher's site, using product category dummies. Table 4 reports the effects of these dummies on the volume of impressions. The significant parameters show that some category dummies predict product and price ad impressions and thus could account for the unobservable matching features of the publisher sites; therefore, if the correlations between v_ijkt^m and v_ijkt^g are not significantly different from zero, then one can conclude that measurement noise is inconsequential in our sample. Table 5, however, shows that 5–10 of the 12 correlations in each target are still significant. This suggests controlling for endogeneity is essential for determining the effectiveness of digital ads (e.g., Lee, Hosanagar, and Nair 2015).

Animated versus Static Display Ads

Tables 6 and 7 report estimates from the proposed DZIP model. First, Table 6 shows that Flash ads have significantly more average clicks than GIF ads, as seen by the fixed effect of Flash ads, supporting the notion that animation can foster engagement (Li and Bukovac 1999). These results are consistent across all consumer targets. Recall that the dependent parameter in Equation 5 is the log-link, log(λ_ijkt); thus, the effectiveness of Flash ads across retargeted, male, female, and age segments is 11.8, 10.4, 16.9, and 12.6 times (respectively) that of similar GIF ads, ceteris paribus. These results reflect the much greater sparsity of click response to GIF ads (Table 1). The effects of orientation–size combinations (exp(σ_ik)) across segments are significant too, but their relative effects on engagement are mixed, as predicted (e.g., Chandon, Chtourou, and Fortin 2003; Cho 2003). For example, box ads are most effective in the age segment, but leaderboard ads are most effective among retargeted consumers, and all are equally effective among women. The latter result seems to support the prediction that female retail shoppers are more likely to browse (e.g., Passyn, Diriker, and Settle 2011).
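The multipliers quoted above follow directly from exponentiating the Flash fixed effects in Table 6, because the model is specified on the log scale. The snippet below reproduces that arithmetic; only the dictionary layout is an illustrative choice.

```python
import math

# Flash fixed effects on the log scale, taken from Table 6.
flash_effect = {
    "Retargeted": 2.4688,
    "Male": 2.3394,
    "Female": 2.8292,
    "Age": 2.5346,
}

# exp(effect) is the multiplicative lift of Flash over GIF, all else equal.
multipliers = {seg: round(math.exp(b), 1) for seg, b in flash_effect.items()}
# multipliers gives 11.8, 10.4, 16.9, and 12.6, the values reported in the text.
```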

Carry-Over Effects

Carry-over rates for Flash (GIF) ads are significant across the four segments (Table 6), with values ranging from .52 to .75 (.09 to .27). Thus, animated banner ads have significantly higher carry-over rates than GIF ads, across consumer segments and size–orientation combinations. The increase in carry-over rates is roughly three to five times greater when one uses animated ads rather than static ads, across targets and formats. These results seem consistent with Naik and Raman (2003), who find that carry-over for TV (animated) is approximately 2.5 times that of static print. Thus, in our study, animated ads have the potential to engage consumers for longer periods. To make this result more concrete, we computed the 90% duration for each format and target (D₉₀ days), that is, the number of days it takes for an ad to lose 90% of its effect. Thus, in Table 7, the average D₉₀ across the four segments ranges from 4.8 to 9.4 days for Flash ads, while it is approximately 2.2–3.1 days for GIF ads. Similarly, in Table 7, the mean ad elasticity (calculated using the posterior draws) for Flash ads ranges from .2412 to .3708, whereas the range for GIF ads is .0682 to .1595.

Price- versus Product-Based Messages

Consider now the effects of product-based and price incentive–based messages, specifically how these effects vary across creative formats and targeted consumer segments. Table 6 reports the immediate, or short-term, effects β_ijk of ads by themes, across formats and sizes, and among different consumers. From these results, we can see that price ads are more effective than product ads within the Flash format, in all sizes and target markets; this result builds on evidence that price incentives can motivate engagement (Chtourou, Chandon, and Zollinger 2002; Hupfer and Grey 2005; Xie et al. 2004). Product ads, nevertheless, are still effective in the male, female, and age segments across all size–orientation combinations (with one exception: leaderboard ads among men); although these effects differ marginally, they are on average highest among targeted women (.0325, .0335, and .0372, for 728 × 90, 300 × 250, and 160 × 600 Flash ads, respectively; see Table 6), whom retailing studies predict to be more engaged shoppers. Yet product ads are ineffective among retargeted consumers, while, in contrast, price ads in Flash format that are otherwise similar are effective in all segments, even among retargeted consumers. Recall that evidence suggests that retargeted ads are ineffective unless served to consumers who have well-defined preferences, such that they are willing to purchase (Lambrecht and Tucker 2013). Thus, our finding suggests that when retargeting consumers, one should also recognize that price incentives can be useful in making ads more effective by addressing consumer willingness to pay.

The discussion, hitherto, has reviewed the effects of Flash ads; earlier, we reported that, ceteris paribus, Flash ads garner more engagement than GIF ads. Table 6, however, shows that static GIF ads with price offer messages can be effective among retargeted and female shoppers. Furthermore, while price ads are more effective in generating engagement in the Flash than in the GIF format for the male and age segments (Li and Bukovac 1999), price ads are equally effective for GIF and Flash among women and the retargeted. Finally, we note the parameters for international GIF ads (e.g., β₄₂). Recall that because this campaign targets U.S. consumers, the ad server sends blank impressions to non-U.S. consumers. Nevertheless, these consumers may still click on blank images, usually from curiosity (e.g., when blank ads are visually salient; Wedel and Pieters 2008) but also in error (McConnell 2012). As a result, the parameters that capture the effects of these clicks are large and significant. While these measures have no managerial interpretation in terms of ad content, they do show how the tactic of serving blanks can distort naive measures of campaign effectiveness (e.g., CTR).

In summary, Tables 6–7 help reveal the workings of digital ads. For instance, in our sample, animated ads are more effective than static ads and have longer duration. There is also heterogeneity in the performance of banner ads across formats, messages, and targets. For example, within the Flash format, price ads are more effective in generating engagement than product ads in all three size–orientation combinations and all four target markets defined in this study. Product ads, in contrast, are ineffective among retargeted consumers. Thus, retargeted consumers are less likely to engage when ads exclude price incentives. Finally, although Flash ads engage more consumers than GIF ads, GIF ads are still effective for engaging retargeted and female consumers; the latter consumers are seemingly more willing to engage with ads of all formats and messages.

Robustness Check of Results

As a final step, we investigate the robustness of our findings by comparing them with results from five (simpler) variations of the proposed model (see the Web Appendix):

• Model A: a linear state-space model (NDLM) with click data on original scale;
• Model B: a log-linear state-space model with click data log-transformed;
• Model C: a dynamic Poisson model without endogeneity or zero inflation;
• Model D: a static Poisson model without endogeneity or zero inflation;
• Model E: a dynamic negative binomial without endogeneity or zero inflation.

Notably, results from the generalized linear models (GLMs) (Models C, D, and E) are more consistent with the results from the proposed model (Table 3; Tables S1–S5 in the Web Appendix). The NDLM, by contrast, reports mixed findings for Flash and size effects. That is, in some cases, Flash ads, on average, are no more effective at generating clicks than GIF ads, ceteris paribus. Similarly, the effects of some ad sizes are not significant. Although the log-linear Model B produces many results similar to the GLM findings, it too reports mixed results for the fixed effects of size and format. (We also estimated, but did not report, a square root–transformed data model and found conflicting evidence.) In general, we know that GLMs are better suited for count data, more so when they include zero observations; and log transformations are more effective when mean counts are large and overdispersion is small (e.g., O'Hara and Kotze 2010).

Reallocation Analysis

The final task of this study is to conduct a simulation that summarizes the import of the previous results. One approach is to see how these results influence the reallocation of ad impressions across the duration of the campaign. That is, given hyperparameters ζ, we solve a problem that reallocates the total ad impressions (b_t) in each period across ad format (GIF, Flash), sizes, themes, and targets to maximize the total expected clicks (y_t) over T = 154 days. Specifically, with estimates of the state vectors from the particle filter $\{g_{0:T}^{n}, w_{0:T}^{n}\}_{n=1}^{N_{s}}$, we solve the following problem:

\begin{array}{l} \max_{a_{111}, \dots, a_{IJT}} \sum_{t=1}^{T} \sum_{n=1}^{N_{s}} w_{ijt-1}^{n}\, E(y_{ijt} \mid g_{ijt-1}^{n}) \\ \text{such that } \sum_{k=1}^{K} \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{l=1}^{2} a_{ijklt} \leq b_{t}, \quad a_{ijklt} \geq 0, \quad t = 1, \dots, T, \end{array}

(12)

where $a_{ijt} = \{\{a_{ijklt}\}_{l=1}^{2}\}_{k=1}^{K}$ and $\{g_{ijt}^{n}, w_{ijt}^{n}\} = \{g_{ijkt}^{n}, w_{ijkt}^{n}\}_{k=1}^{K}$ are impressions, goodwill, and particle weights across formats, messages, and targets. In addition, $E(y_{ijt} \mid g_{ijt-1}^{n})$ is the one-step-ahead forecast vector at the particle $\{g_{ijt-1}^{n}\}$, and a_ijklt is the impression for ad theme l in format i, size j, and target k at period t.⁷ We select solutions to Equation 12 that give allocations that represent improvements in ad efficiency. To do this, we exploit an advantage of the state space approach to optimize successively for one period (t) given the data at t – 1; this is a more appealing solution because impressions are bought in real time. (We also optimized similarly for two and four periods at a time; we report all results in the Web Appendix.)⁸ Finally, we keep the total number of ad impressions for a given optimization interval the same as the actual number of impressions but allow a reallocation across formats, sizes, targets, and messages.

⁷ Solved in Tomlab/SNOPT.

⁸ We thank the Associate Editor for this suggestion.

Table 8 shows the solution to Equation 12 with all hyperparameters ζ at their mean values. It reports, for target, digital format, and message, the actual number of impressions and the model-based prediction of the number of impressions needed to generate a higher number of clicks. Here, the model-based allocations generate approximately 17% more clicks than the current allocation. The results in Table 8 are largely consistent with findings discussed previously. Overall, the model suggests a 19% decrease in the number of impressions of product ads and a 13% increase in impressions of price ads. Much of this increase is attributable to the shift from product to price ads in the Flash format. Similarly, we observe higher impressions in the retargeted (21%) and female (5%) consumer segments. Consequently, the model recommends increases in static GIF (price) ad impressions, given that these were effective for retargeted and female consumers (Table 6). Finally, as a robustness check, we also solved Equation 12 over 1,000 random draws from the posterior, using a shorter period, given the computational complexity of solving Equation 12. The results, reported in the Web Appendix, are consistent with results in Table 8.

Table 8

Allocations: Actual and Model-Based Impressions

	Actual Impressions	Model-Based Impressions	Percentage Change (t – 1\t)(1)
Format and Message
Flash (160 × 600)
Product	2.3783	2.3053	–3.0676
Price	3.3685	4.5319	34.5372
Flash (300 × 250)
Product	2.2212	1.6987	–23.5230
Price	3.4620	3.4985	1.0540
Flash (728 × 90)
Product	3.2155	2.3561	–26.7275
Price	4.5548	4.7973	5.3249
GIF (160 × 600)
Price	.0017	.0037	115.2477
GIF (300 × 250)
Price	.0024	.0091	279.9628
GIF (728 × 90)
Price	.0024	.0072	201.8751
Targeted Segments
Retargeted	3.17005	3.8480	21.3872
Male	1.59892	1.0788	–32.5273
Female	3.12948	3.2943	5.2674
Age	11.31338	10.9905	–2.8541
Message Content
Product	7.8150	6.3601	–18.6164
Price	11.3920	12.8477	12.7806

Notes: Values for impressions are in hundreds of millions.
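Two features of Table 8 can be checked with simple arithmetic on the Message Content rows: the percentage changes, and the fact that the reallocation leaves the total number of impressions essentially unchanged, as the constraint on the optimization requires. The Python sketch below performs both checks; the variable names are illustrative, and small discrepancies from the printed percentages reflect rounding in the reported figures.

```python
# Message Content rows from Table 8 (impressions in hundreds of millions).
actual = {"Product": 7.8150, "Price": 11.3920}
model = {"Product": 6.3601, "Price": 12.8477}

def pct_change(old, new):
    """Percentage change from the actual to the model-based allocation."""
    return 100.0 * (new - old) / old

changes = {k: round(pct_change(actual[k], model[k]), 1) for k in actual}

# The total budget is held fixed, so the two totals should nearly coincide.
budget_gap = sum(model.values()) - sum(actual.values())
```

Here `changes` shows a decrease of roughly 19% for product ads and an increase of roughly 13% for price ads, and `budget_gap` is negligible relative to the total of about 19.2.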

Conclusion

This study explores how the performance of digital ads is influenced by the joint effects of creative format, message content, and targeting as well as retargeting. Its goal is to reveal how central features of a digital campaign affect consumer engagement over time. The study accomplishes this by constructing a dynamic model and estimating it using a panel data set obtained from a major U.S. retailer. Formally, it proposes a dynamic (state-space) zero-inflated count model (Poisson), given the potential for zero inflation and temporal correlation in count series. The resulting model is both dynamic and nonlinear; therefore, we estimate it using a combination of particle filtering and MCMC. The resulting algorithm provides one approach to estimating any state-space model within the exponential family, and it is more flexible than Gaussian filters such as the extended and the unscented Kalman filters. The estimation also allows for endogeneity in ad impressions, possibly due to unobserved context of the publisher's site.

The study finds a number of substantive results. First, animated ads had significantly higher carry-over effects and thus affected engagement over a longer duration than static ads, in all ad formats and among both targeted and retargeted consumers. Second, among animated formats, price ads were more effective than product ads. Third, retargeted ads were effective only when they offered price incentives, a finding consistent with Lambrecht and Tucker (2013), who find retargeted ads to be effective only when consumers have strong preferences such that they have incentives to buy. Ours is a useful result because it suggests that price sensitivity (perhaps more observable than consumer preferences) could help select retargeted consumers for engagement. Fourth, although Flash ads were more effective at engaging consumers, simpler static GIF ads could also be effective; in our case, they were effective for price ads served to retargeted and female consumers. Finally, note that all the retailer's ads (i.e., all formats and messages) targeted to the female segment were effective; this suggests that female shoppers were largely more willing to engage, perhaps confirming the axiom from brick-and-mortar studies that “women shop; men buy.”

Still, the work has a few potential limitations that could be addressed in the future. First, some of our findings may not generalize to other contexts. For example, the gender effects we note may have arisen because of the match between the retailer's product category and gender. Similarly, our data come from a single (albeit major) retailer, so, given recent findings (e.g., Li and Kannan 2014), we would be reluctant to generalize these results to other industries or smaller firms. Second, we estimate the model with daily but aggregate data, at the level of target, message, and format; our method lacks features such as the exploration and exploitation embedded in sequential experiments (e.g., Thompson sampling). Yet ad networks commonly release such aggregate data to their clients, and it is in these cases that our method is most applicable. Individual-level data could nevertheless obviate some of the endogeneity issues we address statistically, but there are challenges to estimating dynamics at the individual/cookie level. For example, one would be required to estimate a large number of parameters from sparse data; although the data will contain many individuals, many of these will be unique or one-time visitors. To address this sparseness problem, one could perhaps build a hierarchical dynamic model using demographics and retargeting data (Agarwal 2010) to define segment-level distributions from which individual behavior could arise. Notably, in this case, the substantive parameters would again be at the segment level. Finally, although our model fits the data satisfactorily, another potential criticism, given the full Bayesian approach, is that we adopt standard parametric assumptions for all model components, for example, normal random noise in the state equation (Equation 5). To mitigate this criticism, one could model errors as Gaussian mixtures or take a fully Bayesian nonparametric approach in which the distributions of the errors are themselves unknown and treated as objects to be estimated (Hjort et al. 2010; Phadia 2013). Again, a nonparametric approach could be more feasible at the segment level, given data sparseness at the cookie level.

Overview of MCMC Algorithm

This appendix provides an overview of the MCMC algorithm we employ to recover both the time-varying parameters (g_t, θ_t) and the fixed parameters ζ. Recall that our main task is to estimate a joint conditional posterior p(θ_t, g_t|y_t, m_t, ζ) that includes both linear and nonlinear time-varying components. It is easier to obtain this posterior from conditionals p(θ_t|, g, m_t, ζ) and p(g_t|, y_t, ζ). That is, we sample the first conditional with the Kalman filter because its state vectors, the measurement parameters $\theta_{0:T} = \{\theta_{t}\}_{t=1}^{T}$, are linear, and the second with the particle filter because its state vectors are the nonlinear goodwill vectors $g_{0:T} = \{g_{t}\}_{t=1}^{T}$. (Note that for simplicity, we suppress the target index, k.)
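The overview above does not spell out how the linear block is drawn. One standard way to sample a linear Gaussian state path with the Kalman filter is forward filtering, backward sampling (Carter and Kohn 1994; Fruhwirth-Schnatter 1994). The Python sketch below shows that technique for a scalar random-walk state; the model form, the function name `ffbs_local_level`, and the arguments `F`, `V`, `W`, `m0`, and `C0` are assumptions for illustration and are not taken from the paper's equations.

```python
import numpy as np


def ffbs_local_level(z, F, V, W, m0=0.0, C0=10.0, rng=None):
    """Draw one state path from p(theta_{1:T} | z_{1:T}) for the assumed model
    theta_t = theta_{t-1} + N(0, W),  z_t = F_t * theta_t + N(0, V).
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(z)
    m = np.empty(T)   # filtered means
    C = np.empty(T)   # filtered variances
    m_prev, C_prev = m0, C0
    # Forward pass: standard Kalman filter recursions.
    for t in range(T):
        R = C_prev + W                       # prior variance of theta_t
        Q = F[t] ** 2 * R + V                # one-step forecast variance of z_t
        K = R * F[t] / Q                     # Kalman gain
        m[t] = m_prev + K * (z[t] - F[t] * m_prev)
        C[t] = (1.0 - K * F[t]) * R
        m_prev, C_prev = m[t], C[t]
    # Backward pass: sample theta_T, then theta_t given theta_{t+1}.
    theta = np.empty(T)
    theta[-1] = rng.normal(m[-1], np.sqrt(C[-1]))
    for t in range(T - 2, -1, -1):
        B = C[t] / (C[t] + W)                # smoothing gain
        mean = m[t] + B * (theta[t + 1] - m[t])
        var = C[t] * (1.0 - B)
        theta[t] = rng.normal(mean, np.sqrt(var))
    return theta, m, C
```

Within a Gibbs sweep, a draw of this kind for the linear states would alternate with a particle-filter draw of the nonlinear goodwill states, each conditional on the most recent value of the other.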

References

Agarwal, Deepak (2010), “‘A Modern Bayesian Look at the Multi-Armed Bandit’ by Steven Scott: Discussion,” Applied Stochastic Models in Business and Industry, 26 (6), 639–58.

Baltas, George (2003), “Determinants of Internet Advertising Effectiveness: An Empirical Study,” International Journal of Market Research, 45 (4), 505–13.

Bass, Frank, Norris Bruce, Sumit Majumdar, and B.P.S. Murthi (2007), “Wearout Effects of Different Advertising Themes: A Dynamic Bayesian Model of Advertising-Sales Relationship,” Marketing Science, 26 (2), 179–95.

Berkowitz, David, Arthur Allaway, and Giles D'Souza (2001), “Estimating Differential Lag Effects for Multiple Media Across Multiple Stores,” Journal of Advertising, 30 (4), 59–65.

Braun, Michael, and Wendy Moe (2013), “Online Display Advertising: Modeling the Effects of Multiple Creatives and Individual Impression Histories,” Marketing Science, 32 (5), 753–67.

Breuer, Ralph, Malte Brettel, and Andreas Engelen (2011), “Incorporating Long-Term Effects in Determining the Effectiveness of Different Types of Online Advertising,” Marketing Letters, 22 (4), 327–40.

Briggs, Rex, and Nigel Hollis (1997), “Advertising on the Web: Is There Any Response Before Clickthrough?” Journal of Advertising Research, 37 (2), 33–46.

Bruce, Norris (2008), “Pooling and Dynamic Forgetting Effects in Multi-Theme Advertising: Tracking the Ad Sales Relationship with Particle Filters,” Marketing Science, 27 (4), 659–73.

Carter and Kohn (1994), “On Gibbs Sampling for State Space Models,” Biometrika, 81 (3), 541–53.

Chandon, Jean-Louis, Mohamed Saber Chtourou, and David R. Fortin (2003), “Effects of Configuration and Exposure Levels on Responses to Web Advertisements,” Journal of Advertising Research, 34 (3), 217–29.

Chandy, Rajesh K., Gerard J. Tellis, Deborah J. MacInnis, and Pattana Thaivanich (2001), “What to Say When: Advertising Appeals in Evolving Markets,” Journal of Marketing Research, 38 (November), 399–414.

Chatterjee, Patrali (2008), “Are Unclicked Ads Wasted? Enduring Effects of Banner and Pop-Up Ad Exposures on Brand Memory and Attitudes,” Journal of Electronic Commerce Research, 9 (1), 51–61.

Chatterjee, Patrali, Donna L. Hoffman, and Thomas P. Novak (2003), “Modeling the Clickstream: Implications for Web-Based Advertising Efforts,” Marketing Science, 22 (4), 520–41.

Cho, Chang-Hoan (1999), “How Advertising Works on the WWW: Modified Elaboration Likelihood Model,” Journal of Current Research in Advertising, 27 (1), 33–50.

Cho, Chang-Hoan (2003), “The Effectiveness of Banner Advertisements: Involvement and Click-Through,” Journalism & Mass Communication Quarterly, 80 (3), 623–45.

Cho, Chang-Hoan, Jung-Gyo Lee, and Marye Tharp (2001), “Different Forced Exposure Levels to Banner Advertisements,” Journal of Advertising Research, 41 (4), 45–56.

Chtourou, Mohamed Saber, and Jean-Louis Chandon (2000), “Impact of Motion, Picture, and Size on Recall and Word of Mouth for Internet Banners,” INFORMS Internet and Marketing Science Conference (May), University of Southern California.

Chtourou, Mohamed Saber, Jean-Louis Chandon, and Monique Zollinger (2002), “Effect of Price Information and Promotion on Click-Through Rates for Internet Banners,” Journal of Euro-marketing, 11 (2), 23–40.

Clarke, Darell G. (1976), “Econometric Measurement of the Duration of Advertising Effect on Sales,” Journal of Marketing Research, 13 (November), 345–57.

Cole, Sally G., Leah Spalding, and Amy Fayer (2009), “The Brand Value of Rich Media and Video Ads,” DoubleClick research report (accessed April 16, 2013), http://static.googleusercontent.com/media/www.google.com/en/us/doubleclick/pdfs/DoubleClick-06-2009-The-Brand-Value-of-Rich-Media-and-Video-Ads.pdf.

Dahlen, Michael (2001), “Banner Advertisements Through a New Lens,” Journal of Advertising Research, 41 (4), 23–30.

Danaher, Peter J. (2007), “Modeling Page Views Across Multiple Websites with an Application to Internet Reach and Frequency Prediction,” Marketing Science, 26 (3), 422–37.

Danaher, Peter J., Janghyuk Lee, and Laoucine Kerbache (2010), “Optimal Internet Media Selection,” Marketing Science, 29 (2), 336–47.

Deighton, John, Caroline Henderson, and Scott A. Neslin (1994), “The Effects of Advertising on Brand Switching and Repeat Purchasing,” Journal of Marketing Research, 31 (February), 28–43.

DoubleClick (2009), “2009 Year-in-Review Benchmarks,” (accessed April 16, 2013), https://static.googleusercontent.com/media/www.google.com/en//doubleclick/pdfs/DoubleClick-07-2010-DoubleClick-Benchmarks-Report-2009-Year-in-Review-US.pdf.

Doucet, Arnaud, Nando de Freitas, and Neil Gordon, eds. (2001), Sequential Monte Carlo Methods in Practice. New York: Springer.

Doucet, Arnaud, Simon Godsill, and Christophe Andrieu (2000), “On Sequential Monte Carlo Sampling Methods for Bayesian Filtering,” Statistics and Computing, 10 (3), 197–208.

Drèze, Xavier, and François-Xavier Hussherr (2003), “Internet Advertising: Is Anybody Watching?” Journal of Interactive Marketing, 17 (4), 8–23.

D'Souza, Giles, and Ram C. Rao (1995), “Can Repeating an Advertisement More Frequently Than the Competition Affect Brand Preference in a Mature Market?” Journal of Marketing, 59 (April), 32–42.

Fruhwirth-Schnatter, Sylvia (1994), “Data Augmentation and Dynamic Linear Models,” Journal of Time Series Analysis, 15 (2), 183–202.

Gamerman, Dani (1997), Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference. London: Chapman and Hall, 124–32.

Geweke, John (1989), “Bayesian Inference in Econometric Models Using Monte Carlo Integration,” Econometrica, 57 (6), 1317–39.

Godsill, Simon, Arnaud Doucet, and Mike West (2004), “Monte Carlo Smoothing for Nonlinear Time Series,” Journal of the American Statistical Association, 99 (465), 156–68.

Grass, Robert, and W.H. Wallace (1969), “Satiation Effects of TV Commercials,” Journal of Advertising Research, 9 (3), 3–8.

Greene, William H. (1994), “Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models,” Working Paper EC-94-10, Department of Economics, New York University.

Hjort, Nils L., Chris Holmes, Peter Muller, and Stephen E. Walker, eds. (2010), Bayesian Nonparametrics. Cambridge, UK: Cambridge University Press.

Hoch, Stephen J., and Young-Won Ha (1986), “Consumer Learning: Advertising and the Ambiguity of Product Experience,” Journal of Consumer Research, 13 (2), 221–33.

Hong, Weiyin, James Y.L. Thong, and Kar Yan Tam (2007), “How Do Web Users Respond to Non-Banner-Ads Animation? The Effects of Task Type and User Experience,” Journal of the American Society for Information Science and Technology, 58 (10), 1467–82.

Hupfer, Maureen E., and Alex Grey (2005), “Getting Something for Nothing: The Impact of a Sample Offer and User Mode on Banner Ad Response,” Journal of Interactive Advertising, 6 (1), 105–17.

Lambert, Diane (1992), “Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing,” Technometrics, 34 (1), 1–14.

Lambrecht, Anja, and Catherine Tucker (2013), “When Does Retargeting Work? Information Specificity in Online Advertising,” Journal of Marketing Research, 50 (October), 561–76.

Lee, Dokyun, Kartik Hosanagar, and Harikesh S. Nair (2015), “Advertising Content and Consumer Engagement on Social Media: Evidence from Facebook,” working paper, Wharton School of Business, University of Pennsylvania.

Li, Hairong, and Janice L. Bukovac (1999), “Cognitive Impact of Banner Ad Characteristics: An Experimental Study,” Journalism & Mass Communication Quarterly, 76 (Summer), 341–53.

Li, Hongshuang (Alice), and P.K. Kannan (2014), “Attributing Conversions in a Multichannel Online Marketing Environment: An Empirical Model and a Field Experiment,” Journal of Marketing Research, 51 (February), 40–56.

Liu, Jun S., and Rong Chen (1998), “Sequential Monte Carlo Methods for Dynamic Systems,” Journal of the American Statistical Association, 93 (443), 1032–43.

Lohtia, Ritu, Naveen Donthu, and Edmund K. Hershberger (2003), “The Impact of Content and Design Elements on Banner Advertising Click-Through Rates,” Journal of Advertising Research, 43 (4), 410–18.

Lopes, Herbert F., Carlos M. Carvalho, Michael S. Johannes, and Nicholas G. Polson (2010), “Particle Learning for Sequential Bayesian Computation,” in Bayesian Statistics 9, Bernardo, Bayarri, Berger, Dawid, Heckerman, and Smith, eds. New York: Oxford University Press.

MacInnis, Deborah J., Ambar Rao, and Allen Weiss (2002), “Assessing When Increased Media Weight of Real-World Advertisements Helps Sales,” Journal of Marketing Research, 39 (November), 391–407.

McConnell, Ted (2012), “How Blank Display Ads Managed to Tot Up Some Impressive Numbers,” Advertising Age (July 23), http://adage.com/article/digital/incredible-click-rate/236233.

Naik, Prasad A., Murali Mantrala, and Alan Sawyer (1998), “Planning Media Schedules in the Presence of Dynamic Advertising Quality,” Marketing Science, 17 (3), 214–35.

Naik, Prasad A., and Kalyan Raman (2003), “Understanding the Impact of Synergy in Multimedia Communications,” Journal of Marketing Research, 40 (November), 375–88.

Naik, Prasad A., and Chih-Ling Tsai (2000), “Controlling Measurement Errors in Models of Advertising Competition,” Journal of Marketing Research, 37 (February), 113–24.

Nerlove, Marc, and Kenneth Arrow (1962), “Optimal Advertising Policy Under Dynamic Conditions,” Economica, 29 (May), 129–42.

O'Hara and Kotze (2010), “Do Not Log Transform Count Data,” Methods in Ecology and Evolution, 1 (2), 118–22.

Passyn, Kirsten A., Memo Diriker, and Robert B. Settle (2011), “Images of Online Versus Store Shopping: Have the Attitudes of Men and Women, Young and Old Really Changed?” Journal of Business & Economics Research, 9 (1), 99–110.

Phadia, Eswar G. (2013), Prior Processes and Their Applications: Nonparametric Bayesian Inference. New York: Springer.

Rettie, Ruth, Ursula Grandcolas, and Charles McNeil (2004), “Post Impressions: Internet Advertising Without Click-Through,” paper presented at Academy of Marketing Annual Conference 2004, Cheltenham, U.K., http://eprints.kingston.ac.uk/2104/1/Post_Impressions_Internet_Advertising_without_Click-Through.pdf.

Ristic, Branko, Sanjeev Arulampalam, and Neil Gordon (2004), Beyond the Kalman Filter: Particle Filters for Tracking Applications. Boston: Artech House.

Robinson, Helen, Anna Wysocka, and Chris Hand (2007), “Internet Advertising Effectiveness: The Effect of Design on Click-Through Rates for Banner Ads,” International Journal of Advertising, 26 (4), 527–41.

Rossi, Peter, Greg Allenby, and Rob McCulloch (2005), Bayesian Statistics and Marketing. Hoboken, NJ: John Wiley & Sons.

Rust, Roland T., and Robert P. Leone (1984), “The Mixed-Media Dirichlet Multinomial Distribution: A Model for Evaluating Television-Magazine Advertising Schedules,” Journal of Marketing Research, 21 (February), 89–99.

Schumann, David W., and D. Scott Clemons (1989), “The Repetition/Variation Hypothesis: Conceptual and Methodological Issues,” in Advances in Consumer Research, Vol. 16, Thomas K. Srull, ed. Provo, UT: Association for Consumer Research, 529–34.

Schwartz, Eric M., Eric T. Bradlow, and Peter S. Fader (2016), “Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments,” working paper, DOI: http://dx.doi.org/10.2139/ssrn.2368523.

Scott, Steven (2010), “A Modern Bayesian Look at the Multi-Armed Bandit,” Applied Stochastic Models in Business and Industry, 26 (6), 639–58.

Sethuraman, Raj, Gerard J. Tellis, and Richard Briesch (2011), “How Well Does Advertising Work? Generalizations from a Meta-Analysis of Brand Advertising Elasticity,” Journal of Marketing Research, 48 (June), 457–71.

Sherman and Deighton (2001), “Banner Advertising: Measuring Effectiveness and Optimizing Placement,” Journal of Interactive Marketing, 15 (2), 60–64.

Sonnier, Garret P., Oliver Rutz, and Leigh McAlister (2011), “A Dynamic Model of the Effect of Online Communications on Firm Sales,” Marketing Science, 30 (4), 702–16.

Spiegelhalter, David, N.G. Best, B.P. Carlin, and A.V. Linde (2002), “Bayesian Measures of Model Complexity and Fit,” Journal of the Royal Statistical Society B: Methodological, 64 (4), 583–639.

Tellis, Gerard J., Rajesh Chandy, and Pattana Thaivanich (2000), “Decomposing the Effects of Direct Advertising: Which Brand Works, When, Where, and How Long?” Journal of Marketing Research, 37 (February), 32–46.

Tsang, P.M., and Tse (2005), “A Hedonic Model for Effective Web Marketing: An Empirical Examination,” Industrial Management & Data Systems, 105 (8), 1039–52.

Tuten, Tracy L., Michael Bosnjak, and Wolfgang Bandilla (2000), “Banner-Advertised Web Surveys,” Marketing Research, 11 (4), 17–21.

Wedel, Michel, and Rik Pieters (2008), “Informativeness of Eye Movements for Visual Attention: Six Cornerstones,” in Visual Marketing: From Attention to Action, Michel Wedel and Rik Pieters, eds. New York: Lawrence Erlbaum Associates, 43–71.

Xie, Tian (Frank), Naveen Donthu, Ritu Lohtia, and Talai Osmonbekov (2004), “Emotional Appeal and Incentive Offering in Banner Advertisements,” Journal of Interactive Advertising, 4 (2), 30–37.
