Abstract
Technological innovations, including artificial intelligence (AI), are progressively being integrated into marketing and branding. However, the novel nature of these technologies often shrouds their potential benefits and associated risks. This also holds true for the implementation of AI in the domain of brand voices. Brand voices may play a crucial role in fostering human-like brand perception (brand anthropomorphism), potentially offering competitive advantages for companies. However, the integration of AI in shaping brand voices in an advertising context remains a relatively unexplored area. This study addresses this gap by examining the impact of pre-recorded and AI-generated synthetic brand voices on brand anthropomorphism and brand equity. Based on a 3 × 2 online experiment, the study shows that both pre-recorded and synthetic brand voices positively influence brand anthropomorphism. Additionally, brand anthropomorphism emerges as a strong mediator between brand voices and brand equity. Notably, the positive effects persist even when the synthetic brand voice is disclosed as being AI-generated. Theoretical and practical implications of these findings are derived.
Keywords
Introduction
Today’s brand and marketing landscape is increasingly transformed by technological advances (Mustak et al., 2021). To stay relevant, brand managers need to assess the extent to which they can use innovative technologies to evolve their brands and stay in touch with their target audience (Mogaji, 2021). AI-based tools in particular have become increasingly relevant, as they have the potential to accelerate and automate parts of marketing and advertising by, for example, creating content or analysing customer data on a large scale (Kietzmann et al., 2018).
While there is a fast-growing body of research on the implementation of artificial intelligence (AI) in marketing (Mustak et al., 2021), there is still little evidence on the effects of using AI specifically in brand communication and advertising, which involve the strategic use of brand elements to create ‘mental structures and helping consumers organise their knowledge about products and services in a way that clarifies their decision making and, in the process, provide value to the firm’ (Keller, 2003, p. 8).
Besides brand names, logos and slogans, companies are also using vocal brand elements (brand voices) in the context of their advertising activities to influence consumers’ brand attitudes. Numerous brands have adopted the distinctive voice of a spokesperson to serve as their representative, often spanning years and multiple channels (Zoghaib, 2017). For example, Apple adhered to a consistent instructive, neutral male voice-over for its iPhone advertisements until 2014. A significant shift occurred as the company transitioned to the vibrant voice of a youthful woman (Zoghaib, 2017). Another example is the German brand voice of Ikea, spoken by Jonas Bergström. The distinctive German voice with a Swedish accent has been conveying the brand’s identity since 1999.
In the context of advertising, brand managers often try to use brand elements in such a way that consumers perceive brands as human-like (Epley et al., 2007). The literature shows that this brand anthropomorphism strategy (Puzakova et al., 2009) can have a positive impact on consumers’ brand attitudes and behavioural intentions (Golossenko et al., 2020; Sharma & Rahman, 2022). However, while the use of visual brand elements such as the logo or the brand character to elicit brand anthropomorphism has been extensively explored in the literature, there is a lack of insight into the role of brand voices in this context.
Given the human-like character of a brand’s spokesperson voice, it can be proposed that brand voices also have the potential to help consumers perceive brands as anthropomorphic. Prior studies on the relationship between AI-generated voice assistants as conversational agents and perceived anthropomorphism point in this direction (Vernuccio et al., 2021). However, the effects of (pre-recorded vs. synthetic) brand voices via perceived brand anthropomorphism on brand perceptions and attitudes have not been examined in an advertising context yet.
Against this backdrop, this study conducts a 3 × 2 online experiment to investigate the relationship between (AI-generated) brand voices, perceived brand anthropomorphism and consumer-based brand equity, understood as the sum of consumers’ perceptions of, attitudes towards and behavioural intentions towards a brand (Keller, 1993). More specifically, we examine the effects of two different types of brand voices (pre-recorded vs. synthetic) on brand anthropomorphism. Furthermore, we analyse how brand anthropomorphism, in turn, mediates the effects of brand voices on brand equity. To better understand the role of AI-generated synthetic brand voices, we also examine the influence of AI disclosure on brand anthropomorphism and brand equity.
Overall, the contribution of this article is threefold. First, while the literature has established the positive effects of brand anthropomorphism, previous studies have predominantly examined the use of visual brand elements in advertising, such as logos and brand characters, to induce brand anthropomorphism. We fill this gap by analysing the role of brand voices in this context.
Second, the investigation of AI for brand voices to generate brand anthropomorphism has so far taken place mainly in the context of interactive chatbots. This study takes a different approach, as we investigate the mediating role of brand anthropomorphism on the impacts of brand voices on brand equity in a non-interactive advertising environment.
Third, our study aims to provide insights that can help practitioners evaluate the implementation of AI-generated brand voices for their business. More specifically, this study may assist brand managers in implementing AI to increase advertising impact in terms of brand anthropomorphism and resulting brand equity.
The remainder of this article is structured as follows. In the next section, hypotheses are developed based on a literature review regarding the interplay among brand voices, AI, brand anthropomorphism and brand equity. We then present the study design of the online experiment and the results of the empirical study. In the last section, we conclude the paper with a summary of our results and a discussion of the implications and limitations of this research as well as recommendations for future studies.
Literature Review
Brand Voice and Brand Anthropomorphism
Audio branding is the process of building and managing a brand through the use of acoustic elements (Kilian, 2009). Acoustic elements can be expressed, for example, through music, sounds or speech (Gustafsson, 2015). The term ‘audio branding’ is also understood in the literature as ‘brand sound’, ‘sound branding’, ‘corporate sound’, ‘acoustic branding’, ‘sonic branding’ and ‘sound mark’ (Kilian, 2009). The overarching aim of audio branding is to establish associative anchors in the minds of the consumers in order to achieve perceived differentiation from the competition (Gustafsson, 2015; Kilian, 2009).
Fraile et al. (2021) distinguish seven different categories of audio branding, including brand voice. They refer to brand voices as the vocal element of brand communication. Like other brand elements, brand voice acts as a representation of brand identity to target audiences to achieve a desired image (Zoghaib, 2017). Similarly, Patrizi et al. (2024) conceptualise brand voices as an ‘innovative, fundamental component of brand semiotics, contributing to the creation of the central brand meaning system’ (p. 1).
Zoghaib (2017) distinguishes the voice of a brand spokesperson as a unique type of brand voice that can be specified by its human sound. Regarding these brand voices with a human sound, Vernuccio et al. (2023a) differentiate between a pre-recorded human voice and a synthetic voice. While the latter is generated via computer software such as a voice synthesiser, the human brand voice is pre-recorded by an actual human being (Chérif & Lemoine, 2019; Vernuccio et al., 2023).
For example, Vernuccio et al. (2021), drawing on the example of in-car name-brand voice assistants, show that brand voices with a human sound are typically used by brand managers with the intention to develop the anthropomorphic profile of the voice assistant. Anthropomorphism is the tendency to attribute human-like descriptors (traits, intentions, motivations or emotions) to the imagined or real behaviour of non-human actors (Epley et al., 2007). These can include physical appearance and typically human emotional and mental states. In psychology, anthropomorphism is considered an automatic and invariant psychological process that is a normal part of human judgement (Epley et al., 2007).
Brands can also be charged with human descriptors. Thus, brand anthropomorphism describes the perception of brands as entities with human-like characteristics that are capable of mapping mental and emotional states (Epley et al., 2007). Accordingly, Puzakova et al. (2009) have defined brand anthropomorphism as ‘brands perceived by consumers as actual human beings with various emotional states, mind, soul, and conscious behaviors that can act as prominent members of social ties’ (Puzakova et al., 2009, pp. 413–414).
The perception of a brand as a human-like entity is often achieved through the use of human characteristics and traits in branding and advertising (Golossenko et al., 2020). Branded products have been humanised in broadcast advertising, particularly through visual tactics such as imbuing them with human characteristics (Aggarwal & McGill, 2007; Puzakova et al., 2013). This includes mascots and human-like characters (Patterson et al., 2013), depicting human actions (Golossenko et al., 2020) and using textual techniques such as first-person language (Aggarwal & McGill, 2007; Puzakova et al., 2013).
Prior research indicates that brand voices may also induce ‘humanised’ brands, for example, through the perceptions of warmth and competence (Wiener & Chartrand, 2014) as two dimensions of brand personality (Aaker, 1997). The concepts of brand personality and brand anthropomorphism are interrelated (Vernuccio et al., 2021). However, the role of brand voice as a vocal brand element in building brand anthropomorphism has relatively seldom been investigated in an advertising context. Consequently, Golossenko et al. (2020) as well as Guido and Peluso (2015) have called for further research on the effects of vocal brand elements on brand anthropomorphism.
On the other hand, regarding human–computer interactions, scholars have investigated the perception of voice assistants as a human interlocutor in several studies (Fernandes & Oliveira, 2021; Patrizi et al., 2021). In this context, experimental studies have highlighted the role of vocal stimuli as determinants of virtual assistants’ human-likeness. For instance, Cho et al. (2019) demonstrated that using voice instead of text enhances the perceived human-likeness of virtual assistants. For an overview, see Vernuccio et al. (2023).
Based on this, as well as the existing findings on the effect of brand elements that convey human characteristics on brand anthropomorphism in an advertising context, it is assumed in the following that a pre-recorded brand voice with a human sound, for example, in the form of a brand spokesperson’s voice, leads to perceived brand anthropomorphism. Therefore, the following hypothesis can be formulated.
H1: Advertisements featuring a pre-recorded human brand voice will lead to greater perceived brand anthropomorphism compared to advertisements without a brand voice.
AI and Brand Voice
AI is a catch-all term for algorithm-based machines that use a variety of technologies to learn from data and make predictions (Mustak et al., 2021). AI is transforming the modern economy in many areas: global AI revenues for 2022 were predicted to reach $383.3 billion, with annual growth projected in the high teens (IDC, 2022).
AI is also becoming increasingly important in marketing and brand management (Mustak et al., 2021). Among other things, efficiencies are expected in the processes of campaign creation, planning, targeting and evaluation (Lu et al., 2019). AI enables a more comprehensive understanding of consumer behaviour and market trends, enabling more effective marketing strategies (Syam & Sharma, 2018).
AI is also becoming increasingly relevant for branding in the context of brand voices. Brands can nowadays use AI to design their own brand voices and deliver them at relevant touchpoints to target groups (He & Zhang, 2023).
Synthesised brand voices based on AI offer several advantages in terms of cost and flexibility (Stern et al., 2006). These advantages have long been offset by the fact that the voice may appear unnatural due to a lack of inflection and prosody (Kühne et al., 2020). Consequently, prior literature has emphasised that consumers are more likely to perceive an agent as anthropomorphic when it has a pre-recorded human voice rather than a synthetic one (Epley, 2018; Schroeder & Epley, 2016). Furthermore, as Chérif and Lemoine (2019) have shown, virtual assistants with a human voice have a stronger positive effect on consumer responses.
However, the better the speech AI, the easier it is to overcome these hurdles. Until recently, only human brand voices could provide sufficient expressiveness and persuasiveness. Recent developments in voice synthesis make it possible to overcome these quality barriers (Tan et al., 2022). This means that AI-generated brand voices can more easily be imbued with human characteristics. Accordingly, Vernuccio et al. (2023) have argued that ‘if designed with human characteristics such as pitch, accent and quality, the human-like brand voice can induce in the consumer’s mind the perception of an anthropomorphic brand’ (p. 305).
Based on this, it can be assumed that when AI-generated synthetic brand voices are imbued with human characteristics, for example, as a brand spokesperson’s voice, consumers will get the impression of an anthropomorphic brand. This suggests that AI-generated synthetic brand voices will have a positive impact on brand anthropomorphism. With this in mind, the following hypothesis can be formulated.
H2: Advertisements featuring an AI-generated synthetic brand voice with a human sound will lead to greater perceived brand anthropomorphism compared to advertisements without a brand voice.
Brand Equity
In his seminal article, Keller (1993) understands brand equity as perceptual customer-based brand equity, which consists of several dimensions including brand awareness, brand perceptions, brand attitudes and behavioural intentions. Consumer-based brand equity is defined as the added value attributed to products or services based on consumers’ associations of a brand (Keller, 1993). Based on this, Yoo and Donthu (2001) have developed the concept of an overall brand equity, which represents the difference in consumer preferences between a focal branded product and an unbranded product given the same level of product features.
Many studies have investigated the antecedents and consequences of brand equity (e.g., Egbert & Rudeloff, 2023). However, no studies to date have analysed the relationships between (AI-generated) brand voices, brand anthropomorphism and brand equity. Thus, Vernuccio et al. (2023a) emphasise the need for further work on the effects of brand voice–induced brand anthropomorphism on consumers’ cognitive, affective and behavioural dimensions. This is plausible, as previous studies have highlighted the positive effects of perceived brand anthropomorphism on consumers’ perceptions of a brand.
For example, Valette-Florence et al. (2011) showed a positive relationship between brand personality and brand equity. Chen and Lin (2021) demonstrated that anthropomorphised brands positively affect consumers’ brand attachment by evoking positive emotions. Similarly, Ma et al. (2023) found that brand anthropomorphism significantly enhances emotional brand attachment.
Aggarwal and McGill (2007) have shown positive effects of brand anthropomorphism on brand evaluations. Guido and Peluso (2015) demonstrate the impacts of brand anthropomorphism on brand loyalty as ‘an important component of brand equity’ (p. 12). Furthermore, regarding behavioural intentions, Lee and Oh (2021) have shown that anthropomorphism in hotel advertising leads to higher visit intentions among customers.
Similarly, Golossenko et al. (2020) show a positive relationship between brand anthropomorphism and brand trust. Furthermore, Singh et al. (2021) draw the connection between brand anthropomorphism and brand love.
Furthermore, Patrizi et al. (2024) showed that brand anthropomorphism positively affects brand trust as well as consumer–brand engagement in voice-based AI contexts.
In summary, the literature suggests positive effects of brand anthropomorphism on consumer attitudes as well as behavioural intentions towards brands. Thus, referring to Keller (1993), it can be proposed that consumers add value to a brand through perceived anthropomorphism. In other words, it can be expected that brand anthropomorphism induced by brand voices will in turn impact brand equity.
Therefore, the following hypothesis can be formulated.
H3: Brand anthropomorphism mediates the effects of brand voice on brand equity.
The Role of AI Disclosure
With recent technological advancements in AI-powered speech synthesis, it is becoming increasingly challenging for consumers to differentiate between pre-recorded human and synthesised brand voices (Tan et al., 2022). As a result, brands and media outlets are expected to label AI-generated content, including synthesised brand voices, to provide greater transparency (Epstein et al., 2023). At the same time, in line with the principles of ‘algorithmic aversion’, it can be hypothesised that disclosing the AI origin of brand voices will influence their impact on consumer responses (Mahmud et al., 2022).
As Mahmud et al. (2022) conclude, people tend to focus their attention on the source of a message, consciously or unconsciously avoiding or rejecting information or decisions generated by AI. This behaviour is observed even when AI-generated decisions are identical to those proposed by humans (Berger et al., 2021).
Similarly, other studies show that people believe algorithmic work, such as recorded music or art designs, is less authentic than human work because they believe it exhibits comparatively less moral authenticity or sincerity relevant to a specific category (Jago, 2019).
Furthermore, in the domain of voice-based chatbots, research shows that while their effectiveness equals or even surpasses that of human agents, the prior disclosure that a conversation involves an AI chatbot as a source significantly reduces the subsequent purchase rate by nearly 80% (Luo et al., 2019).
Specifically, with regard to brand anthropomorphism, Vernuccio et al. (2023a) show that a human-like synthetic brand voice of an interactive service chatbot, when perceived as AI-generated, does not positively influence brand anthropomorphism.
Based on these findings, it is expected that AI disclosure will weaken the relationship between a synthetic brand voice and brand anthropomorphism as well as the relationship between brand anthropomorphism and brand equity.
Thus, the following hypotheses can be derived.
H4: Disclosure of the artificial nature of a synthetic brand voice will lead to a less positive effect of the brand voice on brand anthropomorphism compared to non-disclosure.
H5: Disclosure of the artificial nature of a synthetic brand voice will lead to a less positive effect of brand anthropomorphism on brand equity compared to non-disclosure.
Methods
To test the hypotheses, we conducted a between-subjects 3 (pre-recorded human brand voice, AI-generated synthetic brand voice with a human sound, no brand voice) × 2 (AI disclosure, non-disclosure) online experiment (n = 169).
Procedure
In the experiment, the fictitious brand ‘Upstream’ as an on-demand taxi service modelled on Uber or Lyft was presented. In the stimuli, the brand voices speak both the brand name and the claim ‘Upstream. I’ll call you a taxi.’
To develop the stimuli, we cooperated with the sound agency WESOUND network GmbH that has developed an AI-generated brand voice, espesy, based on the real human voice of the professional speaker Elisa Pape, who recorded around 55,000 sentences over several days. The human voice used in this experiment was part of the training data.
To ensure that the effects of the brand voice on brand anthropomorphism were not distorted by the visual part of the stimuli, a non-human trapezoidal shape was deliberately chosen for the logo presented. Care was also taken not to simulate human behaviour on a visual level. The brand name was also deliberately chosen to avoid any association with brand anthropomorphism.
At the beginning of the animation, the animated logo of the fictitious brand ‘Upstream’ is shown. At the same time, a brand’s spokesperson voice pronounces the brand name. During the fade-in of the logo, the brand voice also speaks the brand claim. Subsequently, two icons appear in the next step, referring to the Google and Apple app stores.
The control group was only presented with the visual stimulus without the brand voice. The experimental groups were presented with the visual stimulus with either an AI-generated or a human brand voice. Furthermore, in one experimental group, the AI origin of the brand voice was disclosed. For a better overview of the allocation of stimuli to the control and experimental groups, see Table 1.
Table 1. Overview of Control and Experimental Groups.
Measurements
Brand anthropomorphism was measured using the Brand Anthropomorphism Scale of Golossenko et al. (2020), which is based on the conceptualisation of brand anthropomorphism by Puzakova et al. (2009, 2013). The scale consists of four indicators, each of which contains three items. The interdependent indicators are ‘appearance’, ‘moral virtue’, ‘cognitive experience’ and ‘conscious emotionality’.
The development of the scale in Golossenko et al. (2020) was based on the use of visual stimuli. To use the measure with auditory stimuli, the items were slightly adapted.
Consumer-based brand equity was operationalised using the overall brand equity scale developed by Yoo and Donthu (2001), which measures the relative value of a brand to consumers compared to similar brands.
All brand anthropomorphism items were asked on a 7-point Likert scale. All brand equity items were asked on a 5-point Likert scale.
Table 2 displays all measurement items of the variables brand anthropomorphism and brand equity.
In addition, a manipulation check was carried out in groups 2–4. To check the perception of the auditory stimulus, participants in these groups were asked whether they had heard a voice in the video. In group 3, participants were informed of the AI origin of the brand voice and, after the stimulus presentation, had to confirm that they were aware that the brand voice in the video was AI-generated.
Table 2. Measurements.
Sample
Participants were recruited via convenience sampling between 1st May and 1st June 2023. The survey link was posted on various social media channels, and snowball sampling was used. A total of 213 people took part in the experiment. However, 44 participants (20.7%) did not complete the questionnaire in full. Therefore, after cleaning the data, 169 fully completed and usable data sets remained. Furthermore, all 169 participants correctly identified the intended manipulation, as evidenced by their appropriate answers in the manipulation check. Therefore, no further participants needed to be excluded prior to data analysis.
A total of 63 men, 103 women and 3 persons of a non-binary gender took part in the experiment. The most frequently represented educational group (30.8%) was persons with a bachelor’s degree, followed by high school graduates (20.7%). The least represented in the sample was the doctorate degree with 3.0%.
The most common age group represented in the experiment was 21–29-year-olds (38.5%), followed by 40–49-year-olds (17.2%). An overview of the age distribution can be found in Table 3.
Preliminary descriptive statistics indicate that for the brand anthropomorphism variable group 2 (pre-recorded brand voice, no AI disclosure) has the highest values (M = 3.94, SD = 1.34), while group 1 (no brand voice) has considerably lower mean values (M = 3.01, SD = 0.978). Additionally, the overall mean value for brand equity (M = 2.79, SD = 0.89) is lower than the mean value for brand anthropomorphism (M = 3.59, SD = 1.19). No clear differences are seen when comparing the mean values of brand equity in groups. However, group 2 (pre-recorded brand voice, no AI disclosure) has the highest mean value (M = 2.99, SD = 0.97). An overview of the distribution can be found in Table 4.
Table 3. Age Distribution.
Table 4. Descriptive Data for Brand Anthropomorphism (BA) and Brand Equity (BE).
Randomisation Check
To check the randomisation, a Kruskal–Wallis test was performed, which indicated no significant differences between the groups for the variables age (p = .774) and education (p = .873). To test for group differences regarding gender, a chi-square test was performed. This also showed no significant differences (p = .851). It can, therefore, be concluded that the randomisation was successful.
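The gender check can be illustrated with a minimal Python sketch of a Pearson chi-square test of independence. Only the row totals (63, 103 and 3) are taken from the sample description. The individual cell counts are hypothetical, and the function names are illustrative. The p-value relies on a closed-form upper-tail formula that holds for even degrees of freedom, which covers a 3 × 4 table (df = 6).

```python
import math

def chi_square_independence(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variable
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution (even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

# Rows: men, women, non-binary; columns: groups 1-4.
# Cell counts are hypothetical; only the row totals come from the text.
gender_by_group = [
    [13, 16, 17, 17],
    [21, 26, 27, 29],
    [1, 1, 1, 0],
]

chi2, df = chi_square_independence(gender_by_group)
p_value = chi2_sf_even_df(chi2, df)
print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.3f}")
```

A p-value above .05 in such a table would be read, as in the text, as no evidence of gender imbalance across groups.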
Reliability
To check the reliability of the latent constructs, Cronbach’s alpha was computed. The scale reliability was α = 0.884 for brand anthropomorphism and α = 0.949 for brand equity. Thus, both constructs can be considered reliable.
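For readers unfamiliar with the coefficient, the following Python sketch computes Cronbach’s alpha from raw item ratings using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The ratings are hypothetical and serve only to show the calculation. They are not the study data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list of k item ratings per respondent."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the respondents' sum scores
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings of five respondents on a three-item scale
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.3f}")
```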
Hypotheses Testing
To test H1, a t-test was performed, indicating a significant difference between the control group (n = 35) and the group exposed to the pre-recorded human brand voice (n = 43) (t(76.0) = 3.43, p < .001, d = 0.780). The group in which the human brand voice was used (M = 3.94, SD = 1.34) reported significantly higher perceived brand anthropomorphism than the control group (M = 3.01, SD = 0.978). This confirms H1.
To test H2, a t-test was performed. The results showed a significant difference between the control group (n = 35) and the experimental groups (n = 91) (t(124) = 2.72, p = .007, d = 0.542) regarding brand anthropomorphism. The groups in which the AI-generated synthetic brand voice was used (M = 3.65, SD = 1.24) differed significantly from the control group in which no brand voice was used (M = 3.01, SD = 0.978). This confirms H2.
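The group comparisons for H1 and H2 can be approximately retraced from the reported means, standard deviations and group sizes. The sketch below is illustrative and rests on three assumptions that the text does not state explicitly: the pre-recorded voice group is taken to comprise the 43 participants that remain when the control group (35) and the two synthetic voice groups (45 and 46) are subtracted from the total of 169; the non-integer degrees of freedom for H1 are read as a Welch correction; and the integer degrees of freedom for H2 are read as a pooled-variance test. Because the inputs are rounded, the recomputed t values are close to the reported ones but not identical.

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def student_t(m1, s1, n1, m2, s2, n2):
    """Student's t statistic (equal variances assumed) and its df."""
    d = cohens_d(m1, s1, n1, m2, s2, n2)
    return d / math.sqrt(1 / n1 + 1 / n2), n1 + n2 - 2

# (mean, SD, n) as reported for brand anthropomorphism
control = (3.01, 0.978, 35)
human = (3.94, 1.34, 43)      # n = 43 is inferred: 169 - 35 - 45 - 46
synthetic = (3.65, 1.24, 91)  # both synthetic voice groups combined

t_h1, df_h1 = welch_t(*human, *control)
d_h1 = cohens_d(*human, *control)
t_h2, df_h2 = student_t(*synthetic, *control)
d_h2 = cohens_d(*synthetic, *control)

print(f"H1: t({df_h1:.1f}) = {t_h1:.2f}, d = {d_h1:.3f}")
print(f"H2: t({df_h2}) = {t_h2:.2f}, d = {d_h2:.3f}")
```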
To test H3, a bootstrapping procedure was used to analyse the mediating effect of brand anthropomorphism on the relationship between brand voice and brand equity. Following Hayes and Scharkow (2013), we used 5,000 bootstrap resamples to generate a 95% confidence interval (CI) for the indirect effect. The results show that the indirect effect of brand voice on brand equity via brand anthropomorphism was significant (p < .001, b = 0.6, SE = 0.717, 95% CI = [0.2541, 0.936]). At the same time, no significant direct effect of brand voice on brand equity was found (p = .745, b = 0.0207, 95% CI = [–0.0969, 0.148]), thus illustrating the crucial role of brand anthropomorphism for building brand equity through brand voice. This confirms H3.
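The logic of a bootstrapped indirect effect can be sketched in a few lines of Python. This is a simplified illustration with simulated data, not the software or data used in the study. The indirect effect is the product of path a (mediator regressed on the predictor) and path b (the mediator’s coefficient when the outcome is regressed on predictor and mediator), and a percentile interval is taken over the resampled estimates. All variable names and simulation parameters are hypothetical, and the call uses 2,000 resamples rather than 5,000 only to keep the runtime short.

```python
import random

def indirect_effect(x, m, y):
    """a * b, with a from M ~ X and b the M coefficient in Y ~ X + M (OLS)."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample respondents with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[round((1 - level) / 2 * n_boot)]
    upper = estimates[round((1 + level) / 2 * n_boot) - 1]
    return lower, upper

# Simulated data: x = brand voice present (1) or absent (0),
# m = brand anthropomorphism, y = brand equity. Parameters are hypothetical.
sim = random.Random(42)
x = [i % 2 for i in range(169)]
m = [3.0 + 1.0 * xi + sim.gauss(0, 1) for xi in x]
y = [1.0 + 0.5 * mi + sim.gauss(0, 0.8) for mi in m]

point = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y, n_boot=2000)
print(f"indirect effect = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero is read as evidence of mediation, which is the criterion applied to the indirect effect reported above.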
The mediation estimates and the path estimates are displayed in Tables 5 and 6.
Table 5. Mediation Estimates.
Table 6. Path Estimates.
To test H4, a t-test was performed between groups 3 and 4 regarding brand anthropomorphism. The results indicated no significant difference between group 3 (n = 45) and group 4 (n = 46) (t(87.4) = −1.73, p = .173, d = −0.288). The group with AI disclosure (M = 3.47, SD = 1.14) did not differ significantly from the group without AI disclosure (M = 3.83, SD = 1.33). H4 is therefore not supported.
To test H5, two linear regressions of brand equity on brand anthropomorphism were run to compare groups 3 and 4. The regressions were statistically significant for group 3 (F(1, 44) = 120, p < .001, R² = 0.731, b = 0.617, p < .001) as well as for group 4 (F(1, 43) = 25.7, p < .001, R² = 0.374, b = 0.536, p < .001). Since both regressions were statistically significant, it can be assumed that brand anthropomorphism induced by a synthetic brand voice has a positive effect on brand equity even when the artificial nature of the brand voice is disclosed. H5 is therefore not supported.
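The reported F values and R² values are linked by a simple identity for a regression with an intercept, F = (R² / k) / ((1 − R²) / df_residual), where k is the number of predictors. The short Python sketch below applies it to the figures above, assuming that the degrees of freedom given in parentheses are the residual degrees of freedom and that each model has a single predictor. The function name is illustrative.

```python
def f_from_r_squared(r_squared, df_residual, n_predictors=1):
    """F statistic implied by R-squared for an OLS model with an intercept."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / df_residual
    return explained / unexplained

# R-squared and residual df as reported for the two regressions
f_group3 = f_from_r_squared(0.731, 44)
f_group4 = f_from_r_squared(0.374, 43)
print(f"group 3: F = {f_group3:.1f}; group 4: F = {f_group4:.1f}")
```

Under these assumptions the recomputed values agree with the reported F statistics up to rounding.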
To reinforce confidence in our findings, we conducted power analyses using G*Power 3.1. The significance level was set at α = .05. We computed the achieved power via post hoc analysis for each hypothesis test based on the sample size and reported effect size. The results show that the achieved power of the tests ranges between 0.856 and 0.999 and is thus above the suggested threshold of 0.8, except for H4 with a power of 0.389.
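Post hoc power for a two-group comparison can be approximated without specialised software. The Python sketch below uses a normal approximation, whereas G*Power works with the noncentral t distribution, so the values are only approximate. Whether the original analysis was one-tailed or two-tailed is not stated in the text. The one-tailed setting used here is an assumption, chosen because it yields roughly 0.39 for H4 and roughly 0.86 for H2, close to the reported figures.

```python
from statistics import NormalDist

def approx_power_two_groups(d, n1, n2, alpha=0.05, tails=1):
    """Normal-approximation power of a two-sample test for effect size d."""
    z = NormalDist()
    # Expected value of the test statistic under the alternative
    delta = abs(d) * (n1 * n2 / (n1 + n2)) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / tails)
    power = 1 - z.cdf(z_crit - delta)
    if tails == 2:
        # Probability of rejecting in the opposite tail
        power += z.cdf(-z_crit - delta)
    return power

# Effect sizes and group sizes as reported for H4 and H2
power_h4 = approx_power_two_groups(0.288, 45, 46)
power_h2 = approx_power_two_groups(0.542, 35, 91)
print(f"H4: {power_h4:.3f}, H2: {power_h2:.3f}")
```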
Discussion
Overall, this study brings two important findings to light. First, (AI-generated synthetic) brand voices with a human sound positively impact perceived brand anthropomorphism, which in turn contributes to consumer-based brand equity. Second, AI-generated synthetic brand voices have been shown to be equally powerful in inducing brand anthropomorphism and subsequently brand equity, even when their artificial origin is disclosed. These are novel findings that can enrich the growing body of research on brand anthropomorphism and the use of AI technologies in branding and advertising.
Specifically, as our study shows, the use of brand voices has a significant effect on brand anthropomorphism (H1). This study extends the current research on the use of human characteristics and traits in branding and advertising. Previous studies have primarily focused on the impact of visual brand elements on brand anthropomorphism. Furthermore, the result confirms previous research in the field of human–computer interaction, which has shown that vocal stimuli play a crucial role in determining the human-likeness of voice assistants (Vernuccio et al., 2023). This study extends the state of research by demonstrating the positive influence of brand voices on brand anthropomorphism in the advertising context, that is, a non-interactive setting in which the brand voice is not a conversational agent.
As H2 is also confirmed, it can be concluded that AI-generated brand voices have a similar impact. Thus, both types of brand voice (pre-recorded and synthetic) exert significant positive effects on brand anthropomorphism. As the effect of the pre-recorded human brand voice was larger than that of the AI-generated voice (d = 0.780 vs. d = 0.542), it could be argued that a real human brand voice has a slightly more positive impact on brand anthropomorphism. However, this difference is one of degree and, therefore, rather negligible.
Prior research examining the effects of brand voices has concluded that a pre-recorded voice has a more positive impact than a synthetic voice. For instance, Stern et al. (2006) demonstrated that a pre-recorded human voice is often more persuasive than a synthetic voice. Chérif and Lemoine (2019) found that a virtual assistant with a human voice creates a stronger sense of social presence, a concept closely linked to anthropomorphism, compared to a virtual assistant with a synthetic voice.
It can be proposed that this study arrived at different conclusions due to recent advancements in AI-based brand voice synthesis technology. These have made it more difficult for consumers to differentiate between pre-recorded and synthetic brand voices with a human sound (Efthymiou et al., 2023).
Furthermore, as H3 was confirmed, a strong mediating effect of brand anthropomorphism on the impact of brand voice on brand equity was shown. Thus, it can be concluded that building anthropomorphic brands through brand voices can potentially lead to a competitive advantage for companies by shaping positive perceptions of and attitudes as well as behavioural intentions towards the brand. Similarly, for example, Golossenko et al. (2020) have demonstrated the positive effects of brand anthropomorphism on brand trust and brand commitment. Furthermore, prior studies have shown a positive influence of anthropomorphism perceptions on trust towards voice-based virtual assistants (Pitardi & Marriott, 2021). Also, more specifically, Patrizi et al. (2024) have recently shown that brand anthropomorphism positively affects brand trust and consumer–brand engagement in voice-based AI contexts. Thus, our findings align with prior research on the important role of brand anthropomorphism and extend those findings regarding the specific effects of vocal brand elements in a non-interactive advertising setting.
Furthermore, our results relate to prior studies which have analysed the impact of brand voices on brand equity. For example, Wiener and Chartrand (2014) have demonstrated the positive effects of auditory stimuli on consumer responses to advertisements. Also, Chattopadhyay et al. (2003) have shown the impact of speech characteristics on brand attitudes. Our study extends this literature by demonstrating the impact of brand voices on behavioural intentions via perceived brand anthropomorphism.
Contrary to H4 and H5, AI disclosure attenuated neither the positive effects of brand voices on perceived anthropomorphism nor the relationship between brand anthropomorphism and brand equity. This finding contradicts some of the existing literature which has indicated predominantly negative effects of AI disclosure on consumer perceptions (e.g., Luo et al., 2019). However, it must be taken into account that most of the existing studies assessing negative effects of AI disclosure were conducted in the context of interactive chatbots. For instance, Patrizi et al. (2024) found that perceived privacy risks moderate the influence of brand anthropomorphism on brand trust when consumers interact with name–brand voice assistants. It can be argued, though, that perceived privacy risks depend on the interactive nature of the exchange between consumers and AI-generated brand voice assistants. In an interactive setting, consumers share information with the AI agent, which does not occur in the advertising environment we have investigated. Thus, it can be hypothesised that the lack of privacy risks linked to non-interactive brand voices may have contributed to our finding that AI disclosure did not have a negative impact on the effect of brand anthropomorphism on brand equity.
Thus, our study extends the literature by investigating brand voices as brand elements in a non-interactive setting. Based on our results, it can be assumed that the situational context in which brands implement brand voices may affect how consumers evaluate AI disclosure. This seems plausible, as consumers will have different expectations about the origin of a brand voice depending on whether they are interacting with the AI or not, and at what stage of the customer journey they are confronted with an AI brand voice (He & Zhang, 2023). Prior studies have mentioned several variables, such as perceived brand personality fit (Yang & Hu, 2022) or ideological views (Davenport & Ronanki, 2018), that may affect how AI disclosure will be evaluated by consumers. Our findings contribute to this literature on potential influencing factors regarding consumer assessments of AI disclosure.
Furthermore, our findings emphasise the pivotal role of anthropomorphism in the relationship between AI disclosure and consumer perceptions. Based on our results, it can be assumed that potentially negative perceptions of AI disclosure may be attenuated by endowing AI artefacts with human-like features that are capable of inducing brand anthropomorphism. This is also reflected in prior studies (Konya-Baumbach et al., 2023) and can be illustrated by the new phenomenon of human-like AI-generated social media influencers, which can generate brand benefits similar to those produced by human celebrity endorsers (Böhndel et al., 2023). These endorsers possess distinct human-like visual representations, thus enabling them to induce anthropomorphic perceptions. In line with this, Ahn et al. (2022) have found that the virtual influencers’ perceived anthropomorphism effectively enhances their social presence, which in turn drives consumer evaluation outcomes.
As our study shows, not only human-like visual representations but also human-like vocal representations are enablers of anthropomorphic perceptions and can, therefore, have positive effects on consumers’ evaluations of brands that have implemented AI technologies in branding.
Practical Implications
Both AI-generated synthetic and pre-recorded human brand voices have a positive impact on brand anthropomorphism and consumer-based brand equity. Thus, by harnessing and fostering brand anthropomorphism through (AI-generated) brand voices, companies may improve brand perceptions and thus operate more successfully in the marketplace.
As AI-generated brand voices offer various advantages in terms of cost and flexibility, it is conceivable that companies will rely on AI instead of humans for creating brand voices in the future. The opportunities that come with AI make it possible to generate brand voices more flexibly and even to keep the voices of speakers alive after their deaths. For example, long-dead founders of famous brands could be brought back to life vocally and staged as iconic brand voices.
Furthermore, brand managers may create specific voice profiles for different stages of a consumer journey (Efthymiou et al., 2023). For instance, brands may address consumers in their advertisements differently in the pre-purchase phase than in the post-purchase phase. As suggested by Efthymiou et al. (2023), using a whispering tone in a brand voice may create a sense of intimacy and increase levels of trust with consumers in the post-purchase phase.
Brand managers could also consider adapting the brand voice in their adverts more flexibly to the different target groups they are addressing. This may be particularly relevant in relation to the perceived gender and age of the brand voice, as matching these to the target group may increase perceived consumer–brand fit and, thus, improve advertising performance. Additionally, AI-generated brand voices may be used to promote different brand personalities more flexibly. However, it is important to ensure that a consistent brand identity is maintained overall and that the different brand voices do not dilute the brand image.
Given these possibilities, we can expect to see a significant increase in the extent to which the rapid improvements in voice AI are applied to audio branding.
Limitations and Future Research
One limitation of our study was the relatively small sample size, which may have influenced the statistical power of our analyses regarding H4. The small sample size may have limited our ability to detect significant effects, potentially leading to the rejection of H4 that might have been supported with a larger sample.
To address the limitations of the current study, future research endeavours should consider employing larger sample sizes to improve statistical power and enhance the validity of our conclusions.
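To illustrate how the required sample size grows as the expected effect shrinks, the following minimal sketch approximates the number of participants per group needed for a two-sided comparison of two group means. It is not the analysis used in this study: it relies on a normal approximation, ignores the 3 × 2 design and the moderation tested in H4, and the function name and the effect sizes (Cohen's d of 0.5 and 0.2) are illustrative assumptions rather than estimates from our data.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-group
    comparison of means (normal approximation, equal group sizes).

    effect_size_d is the assumed standardised mean difference (Cohen's d);
    it is a hypothetical input, not a value reported in the study.
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    n = 2 * ((z_alpha + z_power) / effect_size_d) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Illustrative effect sizes only (medium vs. small by Cohen's conventions).
    for d in (0.5, 0.2):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under these assumptions, a small effect requires roughly six times as many participants per group as a medium effect, which indicates why weak moderation effects such as the one proposed in H4 may go undetected in smaller samples. An exact calculation based on the t distribution or on the full factorial design would yield somewhat different numbers.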
In terms of study design, the online experiment presented short fictitious brand stimuli for only eight seconds. Thus, it could be fruitful to investigate the effects of brand voice on brand anthropomorphism and brand equity with other, longer brand assets. For example, a three-minute image film or a typical 15-second YouTube ad could be considered.
Furthermore, the stimuli were adapted from a real mobility provider campaign in Germany and created in cooperation with a professional sound agency to increase external validity. However, we did not conduct a pretest of the stimuli, which is a limitation of our study that should be addressed in future research.
The brand voice in the treatment was deliberately chosen to be female, as people generally prefer to interact with female AI voices (Van den Hende & Mugge, 2014). In the future, it would be interesting to compare the effects of male and female brand voices.
Furthermore, while the randomisation checks showed no significant differences between groups regarding socio-demographic variables, we did not control for other potential confounders, such as prior exposure to AI-generated brand voices. This is a limitation of our study which should be addressed in future research.
As our results differ partly from previous studies that have investigated the influence of AI brand voices in the context of chatbots or voice assistants, it would be useful to further evaluate the effects of brand voices by directly comparing interactive and non-interactive settings.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
