Abstract
The authors examine (1) the extent to which cross-national marketing scholars report measurement invariance (MI) assessment results and (2) what cross-national marketing scholars think about MI assessment in general. In Study 1, the authors analyze all cross-national empirical articles (243) published in 15 well-respected and peer-reviewed marketing journals from 2000 to 2005. Although the results indicate a steady growth of published cross-national empirical marketing research and assessment of MI, only 28% of the studies undertook the procedure. In Study 2, the authors analyze responses from 86 cross-national empirical marketing scholars regarding their knowledge about, attitudes toward, and use of MI assessment. The results indicate that the relatively low utilization of MI assessment is due to low MI knowledge and the sophistication of the techniques. The authors conclude with suggested implications for the field of international marketing and a discussion of future research directions.
Globalization continues to drive rapid growth of international trade (Holt, Quelch, and Taylor 2004), global corporations, and nonlocal consumption alternatives (Alden, Steenkamp, and Batra 2006). Therefore, it is not surprising that the number of studies examining cross-national marketing topics is growing (e.g., Griffith, Myers, and Harvey 2006). Although such research provides valuable insights for both academics and practitioners, several scholars have emphasized the importance of minimizing the possibility of underlying biases in cross-national empirical research due to faulty data collection or analysis. Recommended approaches to avoid such problems include controlling for biases before or during data collection (e.g., Craig and Douglas 2000; Griffith and Schuster 2002; Van Herk, Poortinga, and Verhallen 2005) and assessing the measurement invariance (MI) of data already collected (e.g., Mullen 1995; Steenkamp and Baumgartner 1998).
With respect to the latter recommendation, several scholars have argued that the validity of cross-national data analyses could be questioned if MI is not established and reported (Hui and Triandis 1985; Sekaran 1983; Singh 1995; Van de Vijver and Leung 1997). Yet the results of MI are often not included in published studies (Aulakh and Kotabe 1993; Hult et al., in press; Malhotra and Agarwal 1996; Sin and Cheung 1999). Through content analysis of consumer studies published between 1991 and 1996, Sin and Cheung (1999) find that researchers often take steps to minimize bias before data collection (e.g., by conducting double-back translation of surveys), but few report MI assessment thereafter.
More recently, Steenkamp and Baumgartner (1998) and Myers and colleagues (2000) have stressed the importance of conducting MI assessment in cross-national empirical marketing research and have provided step-by-step procedures. Following their call, in Study 1, we content-analyzed 243 cross-national empirical studies published between 2000 and 2005 in 15 top marketing journals for the presence or absence, methodology (if present or reported), and proper use or misuse of MI assessment. In Study 2, we conducted a survey of scholars who published cross-national empirical marketing research during this same period to understand more completely the results of the content analysis reported in Study 1. Before reporting the results of both studies, we offer a brief review of focal issues in MI assessment in cross-national marketing.
MI in Cross-National Empirical Marketing Research
Cross-national researchers frequently collect data in multiple countries. However, this draws attention to a potential source of bias: Namely, it is possible that observed differences in the results are not due to the manipulations or relationships of interest but rather to systematic cultural differences in interpretation and/or responses. For example, consumers may not interpret a given construct similarly across cultures (Myers et al. 2000; Singh 1995), or they may vary cross-culturally in their tendency to respond to certain scale items (Hui and Triandis 1989; Riordan and Vandenberg 1994). As a result of potential biases, scholars have argued that construct, metric, and scalar equivalence should be examined before results are compared across cultural or national boundaries (e.g., Mullen 1995; Steenkamp and Baumgartner 1998; Van Herk, Poortinga, and Verhallen 2005).
Over the years, researchers have recommended various procedures for establishing cross-national equivalence. For example, some have suggested assessing construct equivalence by comparing factors across independent samples using exploratory factor analyses (Reynolds and Harding 1983). Others have recommended methods such as visually checking patterns for possible invariance (Van de Vijver and Leung 1997) or comparing Cronbach's alpha across groups. Moreover, researchers have proposed score standardization and ipsatization (Cunningham, Cunningham, and Green 1977) procedures to remedy possible biases. In addition, optimal scaling can help researchers investigate response set biases (Mullen 1995). Unfortunately, these methods either lack statistical power (e.g., exploratory factor analysis) or run the risk of eliminating all possible differences between groups, including those related to the research topic (e.g., standardization and ipsatization; Van de Vijver and Leung 1997).
To address these limitations, researchers have recently proposed more detailed cross-national MI procedures. These approaches often apply multigroup confirmatory factor analysis (CFA) or multigroup measurement model testing, using structural equations to diagnose both construct- and scale-related biases. For example, Mullen (1995) recommends use of multigroup LISREL to analyze the equality of the error covariance matrices in the measurement model. Singh (1995) pays special attention to construct equivalence and suggests use of latent variable structural equations to assess factor loading equivalence across cultures. Cheung and Rensvold (1999) extend Byrne, Shavelson, and Muthén's (1989) procedure of examining factorial invariance through CFA. They maintain that factorial invariance has two requirements: (1) that item responses load on the same constructs across groups and (2) that factor loadings are not significantly different from each other. Furthermore, Myers and colleagues (2000) note that examination of the covariance matrix for errors associated with items in the measurement model is also necessary.
Steenkamp and Baumgartner (1998) recommend a multigroup CFA approach to assessing MI, including the following: (1) configural—to determine whether the basic meanings and structure of research constructs are understood and conceptualized similarly in the different groups (e.g., countries, cultures); (2) metric—to determine whether scale intervals are perceived similarly across the groups; (3) scalar—to determine whether there is systematic response bias due to cross-group differences; and (4) two additional steps with even stricter requirements for determining MI. They also emphasize the importance of matching the level of invariance assessment with the research purpose.
Configural invariance is sufficient if the research purpose is to explore cross-nationally the basic structure of the construct. However, configural, metric, and scalar invariance are required for comparisons of means across countries. Finally, further complexities may arise depending on the specific analysis conducted by each research project—for example, whether standardized or unstandardized regression coefficients are compared. Moreover, the CFA approach has been extended to cross-national empirical studies that deploy emic or country-specific measures (Baumgartner and Steenkamp 1998). Many researchers find multigroup CFA to be a valuable diagnostic tool for evaluating MI (Myers et al. 2000).
Because several scholars note that enhanced validity accompanies establishment of MI (e.g., Steenkamp and Baumgartner 1998), it is important to investigate how MI tests have been received by the cross-national marketing academic community. With this in mind, in Study 1, we examine all cross-national empirical marketing studies in 15 top area journals from 2000 through 2005.
Study 1
Sample
Empirical marketing articles with cross-national content published from 2000 to 2005 in 15 peer-reviewed journals constituted the sample for this study (see Table 1). The selection of journals was based on existing journal rankings and journals used in prior marketing content analyses (e.g., Nakata and Huang 2005). We selected the period 2000–2005 because it followed the publication of important methodological studies involving cross-national MI (e.g., Mullen 1995; Singh 1995; Steenkamp and Baumgartner 1998). As such, this time frame should effectively capture the assimilation and application of the most recently introduced MI assessment approach—namely, the multigroup CFA approach. The adoption of other, more traditional MI approaches has been well documented by prior content analysis research (e.g., Hult et al., in press; Sin and Cheung 1999).
Definitions for Coding Scheme
Using the title and abstract of each article published in the stated period, two researchers identified articles to be included, following these criteria: (1) analysis of multicountry samples, (2) inclusion of an empirical component, and (3) use of self-report data collected by a survey or in an experiment. Studies using only objective data (e.g., foreign direct investment, per capita income) were excluded. This process produced a sample of 243 articles.
Coding Procedure
The coding scheme is described in Table 1. Two researchers coded articles independently. We assessed intercoder reliability (91.6% agreement) using Perreault and Leigh's (1989) coding criteria. We report indexes for individual coding categories in Table 1. Differences were resolved through discussion.
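As an illustration of the reliability check described above, the following minimal sketch computes simple percent agreement between two coders and the Perreault and Leigh (1989) index, I_r = sqrt((F_o/N - 1/k) * k/(k - 1)), where F_o/N is the observed proportion of agreement and k is the number of coding categories. The category labels and the six coded items are hypothetical and are not taken from our data.

```python
from math import sqrt


def percent_agreement(codes_a, codes_b):
    """Share of items on which two coders assigned the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def perreault_leigh_index(agreement, n_categories):
    """Perreault and Leigh (1989) reliability index I_r.

    Returns 0 when observed agreement falls below the chance level 1/k.
    """
    if n_categories < 2:
        raise ValueError("at least two coding categories are required")
    chance = 1 / n_categories
    if agreement < chance:
        return 0.0
    return sqrt((agreement - chance) * n_categories / (n_categories - 1))


# Hypothetical codes for six articles (illustration only, not the study data).
coder_a = ["cfa", "other", "none", "none", "cfa", "none"]
coder_b = ["cfa", "none", "none", "none", "cfa", "none"]

agreement = percent_agreement(coder_a, coder_b)
index = perreault_leigh_index(agreement, n_categories=3)
print(f"agreement = {agreement:.3f}, I_r = {index:.3f}")
```

Because the index corrects observed agreement for the number of categories, the same percent agreement implies a different I_r for a two-category dimension than for a dimension with more categories.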
Column 2 of Table 1 shows specification of the dimensions. If MI was assessed in the study, it was placed in one of two categories. We label the first category “CFA approaches.” These procedures use CFA or similar tools to assess MI (e.g., Mullen 1995; Singh 1995; Steenkamp and Baumgartner 1998). Although CFA approaches can vary in their specific analysis, they all assess MI by testing a multigroup measurement model, which in general is viewed as a high-quality diagnostic tool for evaluating MI (Myers et al. 2000). We refer to the second MI category as “other approaches.” These include use of other analytical efforts to assess MI, such as exploratory factor analysis (Reynolds and Harding 1983), variance checks for floor or ceiling effects (Van de Vijver and Leung 1997), and score standardization (Cunningham, Cunningham, and Green 1977).
The three dimensions of “purpose,” “level of MI assessment,” and “fit” address Steenkamp and Baumgartner's (1998) recommendation that the level of invariance assessment match the research purpose. For studies using CFA approaches other than Steenkamp and Baumgartner's (1998) method, coders assessed the level of MI by matching the analyses reported in these studies with the different levels of Steenkamp and Baumgartner's methodology.
Results
As noted previously, our screening process identified 243 empirical international articles in 15 major journals from 2000 to 2005. Table 2 presents the number and percentage of articles published in each year. It shows that from 2000 to 2002, the number of cross-national articles grew steadily. Although a sharp decline occurred in 2003, the number of cross-national articles published since 2003 has increased. Table 3 depicts the distribution of articles across the 15 journals included in this study.
Table 2. Cross-National Empirical Marketing Articles per Year
Table 3. Cross-National Empirical Marketing Articles per Journal
Cross-National MI Assessment
In this section, we review our data set of 243 cross-nationally focused marketing articles to determine (1) whether these studies reported MI testing following data collection; (2) if so, which MI assessment technique was reported; and (3) whether the assessment technique reported fits the research purpose. Table 4 presents our overall findings regarding the use of MI assessment from 2000 to 2005. Of 243 articles, 67 (28%) reported assessing MI.
MI Assessment: Overview
Among studies that reported assessment of MI, 82% employed CFA. Non-CFA methods used in the other studies included (1) exploratory factor analysis (Begley and Tan 2001; Mehta 2001; Neelankavil 2000; Tsang 2002), (2) generalizability theory (Cronbach et al. 1972; Sharma and Weathers 2003), (3) Cronbach's alpha (e.g., Mattila and Patterson 2004; Souchon et al. 2003), (4) profile analysis (carried out by displaying item means for each country and visually checking for equivalence; Morris and Pavett 1992; Souchon et al. 2003), and (5) face validity (Deshpandé, Farley, and Webster 2000). We now turn to a more detailed discussion of CFA use.
According to Steenkamp and Baumgartner (1998), it is essential to assess MI at a level that matches the research purpose. Thus, we also determined whether the MI assessment level matched the study's stated research purpose. For example, if researchers are interested in examining nomological structure across national samples, establishing metric equivalence is sufficient. In contrast, if the general linear model is to be used, scalar equivalence is needed.
Of 55 CFA studies, 41 reported MI assessment at the level recommended, given research objectives and analyses, whereas 14 other studies did not report MI assessment at the appropriate level. For example, a few studies only reported configural invariance when scalar invariance should have been analyzed and reported as well. Several other studies employed analysis of variance to examine cross-national differences without reporting scalar invariance. In these studies, the potential for bias remains. Overall, our content analysis suggests that the validity of many cross-national empirical marketing studies could be enhanced with more consistent and complete investigation of MI.
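The matching rule behind the "fit" dimension can be sketched as a simple lookup. The sketch below is a simplification under stated assumptions: the purpose labels are hypothetical names introduced here, the mapping covers only the three cases mentioned in the text (exploring basic structure, examining nomological relations, comparing means), and the levels are treated as strictly nested, so that a study reporting scalar invariance is assumed to have established configural and metric invariance as well.

```python
# Invariance levels in increasing order of strictness
# (first three steps of Steenkamp and Baumgartner 1998).
LEVELS = ("configural", "metric", "scalar")

# Hypothetical purpose labels mapped to the minimum level the text describes.
REQUIRED_LEVEL = {
    "explore_structure": "configural",   # basic structure of the construct
    "nomological_relations": "metric",   # relationships among constructs
    "mean_comparison": "scalar",         # e.g., analysis of variance across countries
}


def fits_purpose(purpose, reported_level):
    """Return True if the reported level is at least as strict as required."""
    required = REQUIRED_LEVEL[purpose]
    return LEVELS.index(reported_level) >= LEVELS.index(required)


# A study comparing country means that reports only configural invariance
# would be coded as a misfit; one reporting scalar invariance would fit.
print(fits_purpose("mean_comparison", "configural"))
print(fits_purpose("mean_comparison", "scalar"))
```

This mirrors the kind of judgment the coders made, although the actual coding relied on reading each study's stated purpose and analyses rather than on fixed labels.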
Table 5 displays the extent of MI assessment in different journals. Journal of the Academy of Marketing Science ranks first with all four cross-national empirical articles published from 2000 to 2005 reporting MI. Among journals emphasizing cross-national research, International Journal of Research in Marketing has the highest percentage of studies (57%, 8 of 14) that report MI. Journal of International Marketing is second with 48% (12 of 25). We found lower percentages of MI reporting in journals that publish relatively few cross-national studies, including Journal of Marketing Research (0%, 0 of 4) and Journal of Consumer Research (11%, 1 of 9). However, among journals that publish relatively more cross-national studies, there are some with relatively low levels of reported MI assessment (e.g., Journal of International Business Studies, International Marketing Review).
Table 5. MI Assessment per Journal
Table 6 presents information regarding MI assessment for studies with different research topics. Our analysis shows that the likelihood of reporting CFA approaches varies significantly across different research areas (χ2 = 7.86, d.f. = 2, p = .02). Specifically, 23.3% (21 of 90) of the studies in consumer behavior, 20.0% (29 of 145) of the studies in strategy, and 62.5% (5 of 8) of the studies in methodology reported the use of CFA approaches. Overall, our review indicates that the majority (72.4%) of cross-national empirical marketing studies between 2000 and 2005 did not report MI.
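The test statistic for research topic can be recomputed from the counts given in the text. The sketch below builds the 2 x 3 table of studies that did and did not report CFA approaches and computes the Pearson chi-square statistic. The p-value uses the closed form exp(-χ2/2), which holds only for 2 degrees of freedom. The assumption that the reported test is an uncorrected Pearson chi-square on this table is ours.

```python
from math import exp

# Counts taken from the text: studies reporting CFA approaches and
# total studies per research topic.
topics = ["consumer behavior", "strategy", "methodology"]
cfa_counts = [21, 29, 5]
topic_totals = [90, 145, 8]

# Rows: reported CFA, did not report CFA. Columns: research topics.
observed = [cfa_counts, [t - c for t, c in zip(topic_totals, cfa_counts)]]


def pearson_chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_sums = [sum(r) for r in rows]
    col_sums = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            expected = row_sums[i] * col_sums[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_sums) - 1)
    return stat, df


chi2, df = pearson_chi_square(observed)
# Closed-form upper-tail probability; valid only because df == 2 here.
p_value = exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, d.f. = {df}, p = {p_value:.3f}")
```

Under these assumptions the computation gives a statistic of about 7.86 with 2 degrees of freedom and a p-value of about .02, in line with the values reported above.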
Table 6. MI Assessment by Research Topic
Possible Reasons for Nonadoption of MI Assessment
At least two limitations of the MI assessment technique may have hindered its adoption. First, multigroup analysis might prove daunting when data from a large number of countries are collected (Baumgartner 2004). Multigroup CFA has sample size requirements (Bagozzi and Yi 1989; Bollen 1989; Myers et al. 2000). Thus, obtaining sufficient numbers of respondents in each country site could prove problematic, leading to poor model fit. Second, studies using single-item measures cannot be tested for MI with multigroup CFA (Mullen 1995).
To examine the first possibility, we checked whether MI assessment differed depending on the number of countries included in a study (see Table 7). This analysis showed that the likelihood of reporting CFA approaches varied significantly depending on the number of countries investigated (χ2 = 5.44, d.f. = 2, p = .033). Among 21 studies that collected data in more than ten countries, only one reported MI using CFA. This finding suggests that the difficulty of multigroup CFA increases with the number of countries involved. A study conducted by Van Birgelen and colleagues (2002) is the only one with data from ten or more countries to report MI. However, the authors obtained 68 or more observations in each country—a sufficiently large sample in all countries for multigroup CFA.
Table 7. MI Assessment by Number of Countries Investigated
To examine the second possible reason for the relatively limited number of cross-national empirical marketing studies reporting MI, we coded each study in the sample on whether single or multiple items were used to compare constructs of interest. Our analysis identified seven studies that used single-item measures. In these studies, MI could not be and was not assessed using multigroup CFA. Combined, these two reasons exemplify challenges that may help explain the relatively limited use of multigroup CFA analyses to establish MI.
Conclusion
In Study 1, we addressed the behavioral component of the question, How are proposed MI approaches received by cross-national empirical marketing scholars? As such, we attempted to learn more about what cross-national empirical marketing scholars actually do with regard to MI testing. Notably, we found an increase in published cross-national empirical marketing research during the examined period and a parallel trend of increased MI reporting. Although this trend is encouraging, only 28% of all articles reviewed reported MI. Furthermore, of the 55 articles that reported MI results using CFA approaches, 25.5% (14 articles) did not report MI assessment at the recommended level.
Given the results of Study 1, it is reasonable to question why MI results are not reported in the majority of the cross-national empirical studies under examination. In part, the answer can be found by better understanding what cross-national empirical marketing scholars think about MI assessment: Were the authors unaware of the necessity of MI assessment? Did they lack the statistical knowledge to conduct MI assessment? Did they decide to leave out MI assessment because it was detrimental to their results? Were they advised by reviewers or editors not to report MI assessment results? To answer these questions, we conducted Study 2.
Study 2
Sample
We collected data using an online questionnaire. The initial sample consisted of all the authors of the 243 articles analyzed in Study 1. Excluding overlaps (i.e., same author across articles) and authors for whom we could not find a valid e-mail address, we identified 335 unique marketing scholars. We sent these scholars an e-mail, inviting them to participate in an online survey about cross-national MI assessment. We sent a follow-up e-mail after approximately four weeks. This procedure resulted in a response rate of 26% and 86 usable surveys.
Measures
In the online questionnaire, we asked respondents to rate their knowledge of MI assessment with three items on seven-point scales. We also asked respondents to select the MI approaches they believed to be capable of establishing cross-national MI (from a list of ten approaches that we had identified in Study 1). Following this, we asked respondents whether they had reported MI assessments in their own cross-national empirical studies and, if so, which approaches they had employed. We also asked them about their reasons for not reporting MI results as well as for their feedback from the reviewers and editors regarding nonreporting of MI assessment. Finally, we asked about their general attitudes toward MI assessment.
Results
MI Knowledge
Although all scholars who participated in Study 2 had published cross-national empirical research (i.e., data were collected in more than one country), their self-reported knowledge of MI assessment was relatively low (M = 4.51 on a seven-point scale, where 7 = “in-depth knowledge”). In addition, reported experience with conducting MI tests was limited. For example, 17.4% of the respondents had never assessed MI, and 30.2% had assessed MI only once or twice (see Table 8). Consequently, almost 50% of the sample had no or limited experience with the method.
Table 8. Experience on MI Assessment
Notably, respondents did not view MI assessment as particularly important (M = 4.12 on a seven-point scale, where 7 = “very important”). However, MI knowledge was positively associated with respondents’ ratings of importance. Respondents who reported higher levels of MI assessment training were more likely to believe that establishment of MI is critical to the validity of a cross-national empirical study (standardized β = .47, t = 4.88, p < .001, adjusted R2 = .21). These results imply that one driver of the limited MI assessment reporting in cross-national empirical studies may be insufficient knowledge.
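The reported regression statistics can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes a single-predictor regression estimated on all 86 respondents; both are assumptions, because the text does not state the number of predictors or the number of complete responses. With one predictor, the standardized coefficient equals the correlation, so R2 = β2 and t = β·sqrt(n - 2)/sqrt(1 - β2).

```python
from math import sqrt

n = 86           # respondents in Study 2 (assumes no missing responses)
beta = 0.47      # reported standardized coefficient
predictors = 1   # assumption: MI knowledge is the only predictor

# With a single predictor, the standardized beta equals Pearson's r.
r_squared = beta ** 2
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)
t_value = beta * sqrt(n - 2) / sqrt(1 - r_squared)

print(f"R2 = {r_squared:.3f}")
print(f"adjusted R2 = {adjusted_r_squared:.2f}")
print(f"t = {t_value:.2f}")
```

Under these assumptions, the implied adjusted R2 (about .21) and t value (about 4.88) agree with the reported figures, up to rounding of β.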
Rating of Alternative MI Assessment Approaches
Table 9 presents respondents’ perceived validity and preferences of different MI assessment approaches. Almost 56% of all respondents (48 of 86) believed that the CFA approach was capable of ensuring the validity of cross-national data. The item response theory (IRT) approach (21%) and face validity approach (17%) were second and third, respectively. Notably, 14% of all respondents viewed none of the ten MI approaches as valid. Instead, they proposed other methods, such as multigroup causal analysis or maximum difference scaling, as possible means to assess MI.
Table 9. MI Assessment Approaches
In confirmation of Study 1's findings, a high percentage of cross-national empirical marketing scholars had not reported MI assessment results in their published cross-national empirical marketing research. Indeed, 58% of the respondents stated that they had not reported MI assessment results in all of their published cross-national empirical research, and 15% said that they had never included such information. When reported, CFA was the most frequently mentioned MI assessment method (82.35%), followed by exploratory factor analysis (27.45%) and Cronbach's alpha (25.49%). Significantly, the IRT approach, though believed to be a valid method by approximately 21% of the sample, was reportedly used by only 7.84%.
Reasons for Not Reporting MI Assessment Results
As we noted previously, approximately 58% of all respondents stated that they did not report MI assessment results in all of their cross-national empirical research. Respondents offered three explanations for this: (1) The data were not conducive to MI assessment (32%), (2) MI assessment was not viewed as necessary (32%), and (3) familiarity with MI assessment methodology was insufficient. In addition, 10% of the respondents conducted MI assessment but did not report the results (see Table 10). Of the scholars in our sample who did not include MI assessment in their study, 72% stated that neither reviewers nor editors mentioned the need for such information during the review process.
Table 10. Reasons for Not Reporting MI
In conclusion, Study 2 implies that the relatively slow growth of MI assessment in cross-national empirical research, as identified in the Study 1 literature review, may be due to a lack of MI knowledge. The technical sophistication of the MI approaches also might hinder their use. As one respondent noted, MI assessment tools “are so complex and time-consuming that the focus of our research work could end up deviating from substantive issues.”
General Discussion
Cross-national empirical marketing researchers, by definition, collect data in more than one country. Often, these data are then compared to determine the extent of national or cultural similarities and differences. This can be problematic, however, if perceptions of the measurement scale are dissimilar. To ensure valid analyses, cross-national empirical researchers have proposed several methods of MI assessment (e.g., Steenkamp and Baumgartner 1998).
Given the role of MI in establishing cross-national comparative validity, it is important to know how frequently MI is assessed and whether it is analyzed correctly. It is also important to know what cross-national empirical marketing scholars think about MI. Our findings are surprising. MI assessment has been repeatedly recommended in the cross-national empirical marketing literature (e.g., Craig and Douglas 2000; Mullen 1995; Myers et al. 2000; Steenkamp and Baumgartner 1998). Yet only 28% of the published cross-national empirical marketing studies from 2000 to 2005 reported doing this, and of those that used CFA approaches, approximately 75% did so at a level appropriate to the research purpose, according to our review. With regard to our second purpose, we found that the low adoption rate may be due in part to cross-national empirical marketing scholars’ lack of MI knowledge coupled with the perceived sophistication of the different MI approaches.
Consideration of the current distribution channels of academic technology suggests possible strategies to increase awareness and understanding of cross-national MI assessment approaches. First, conferences and colloquia could provide vehicles for wider exposure to MI techniques. Second, special journal issues on MI assessment and related approaches might generate further discussion. Third, the gatekeepers in the marketing discipline (e.g., editors, reviewers) might consider making MI assessment (or some other generally accepted approach set) an important criterion during the evaluation of cross-national empirical manuscripts.
Our findings also suggest that cross-national empirical marketing scholars should continue to conduct research on this topic to further help disseminate the assessment of MI in cross-national empirical research. The slow adoption of MI assessment approaches found in Study 1 might be partially due to the sheer number of possible approaches. To illustrate, cross-national empirical marketing scholars can choose among multigroup CFA (Mullen 1995; Myers et al. 2000; Singh 1995; Steenkamp and Baumgartner 1998), the IRT approach (De Jong, Steenkamp, and Fox 2007; Lord and Novick 1969), measurement theory (Ewing, Salzberger, and Sinkovics 2005; Rasch 1960), and so on. The problem is that there are no agreed-on standards. This no doubt contributes to uncertainty among researchers, reviewers, and editors. Further research should compare different approaches and identify contingent factors that support the use of one method or another.
The respondents in Study 2 repeatedly called for a simple MI assessment approach. To date, most methodological studies on MI are somewhat complex. A straightforward manual might be helpful for cross-national empirical marketing scholars. Method simplification is also important for cross-national marketing managers who need to understand and adopt MI assessment approaches as well. For example, “a checklist for establishing data equivalence,” as Hult and colleagues (in press) propose, might be helpful.
With this said, it is unclear whether failure to assess MI always leads to false results. One respondent pointed out that it would be useful to show that MI assessment has a significant impact on a researcher's results and conclusions. The potential for bias clearly exists when MI is not assessed. However, further research should empirically investigate whether the failure to address MI is a fatal flaw or a study limitation, specifically the extent to which nonassessment of MI has produced inaccurate hypothesis tests and potentially incorrect conclusions. A meta-analysis might be an appropriate means to investigate this issue.
Despite repeated calls to report MI assessment, our research reveals a somewhat surprising reality: limited reports of MI in cross-national empirical marketing articles and a lack of MI knowledge among cross-national marketing scholars. This reality raises questions about the validity of many cross-national empirical marketing studies. Establishing methodological standards for all published cross-national empirical marketing articles (e.g., reporting MI assessment) would increase confidence in and respect for the field. Doing so would require a collective effort on the part of scholars, conference organizers, reviewers, and editors. However, the effort required to make MI assessment standard practice would surely yield returns that far exceed the initial investment. A few international marketing journals (e.g., Journal of International Marketing, International Journal of Research in Marketing) already have led the way. We hope that our results will help convince others to follow.
