Abstract
This article proposes a cognitive structural account of the emergence of social worlds that organize the expectations of producers and audiences. The proposed model considers the emergence process in three stages: (1) the arrival of distinctive clusters of objects and their producers, (2) the conceptualization of the labeled clusters in high-energy social structures that we call “scenes,” and (3) the coordinating concept of a scene becoming generally taken for granted. We first develop an ontology for analyzing these processes. We then analyze the role of social structure in these stages and define the emergence process in terms of hazards, that is, the event rates of each stage. The proposed account can explain the emergence of worlds in a variety of social contexts, including organizational forms in markets, genres in creative industries, disciplines in science, and forms of social protest.
Social life abounds with complexes of producers and audiences whose interactions are shaped by collective concepts that distinguish their activities. Examples include styles in fashion, genres in art, schools of thought in science, and product types in exchange markets. Such concepts set expectations for specific sets of works, performances, or goods. These concepts also shape aesthetics, valuation, and participation/choice decisions.
We have a good understanding of how such concepts work in particular contexts, for example, how aesthetic conventions help organize the work of artists, galleries, collectors, and museums in contemporary art worlds (Becker 1982) or how industry categories help coordinate analyst coverage and price movements in the stock market (Zuckerman 1999). Less work, however, has examined the emergence of these concepts.
We ask: How do these culturally coordinated producer-audience systems emerge in the first place? How do producers and audiences come to conceptualize a new “thing,” engage with it, and reach a collective understanding of it? The question of emergence is an important one theoretically and pragmatically. Addressing this question contributes to a more comprehensive account of the diversity of organized forms of social life (DiMaggio and Powell 1983; Hannan and Freeman 1986). Our theory advances existing accounts in several ways. We base the theory on cognition. We propose a mechanism based on engagement with objects that links individual and collective cognition. Our model assumes that collective concepts emerge from observations and interpretations of others’ categorizations. Finally, because social structure affects who can observe whose actions, we examine the causal role of social structure in the emergence process.
We focus on a particular type of cultural and social system we call a “producer-audience world.” Throughout, when we refer to a world, we mean such a world. The collective cognitive component of a world concerns its core activity. This activity’s orienting, or focal, concept distinguishes one world from related worlds.
Novel worlds arise in all kinds of domains. Researchers use different terminology when addressing different domains. In the case of social-movement worlds, the focal concepts are usually called forms, for example, craft and industrial forms of unionism (Hannan and Freeman 1987) or ethnonationalist and Marxist social movements (Olzak 2022). In commercial worlds, collective concepts are referred to as market categories; for example, retail banking participants distinguish between traditional institutions, such as banks or credit unions, and unconventional lenders, such as payday loan providers (Negro, Visentin, and Swaminathan 2014). In art worlds, collective concepts are called “genres” or “styles” (DiMaggio 1987).
A world includes observable public representations that instantiate and express the concept (Sperber 1996). For example, the jazz world is populated by performances and recordings. Such public representations make visible the private meanings and aesthetics of agents in that world (Phillips 2013). We refer to these representations simply as “objects.”
We cast the emergence of a novel world as a three-stage process. For each stage, we consider only one existing “parent” world. The first stage occurs when a cluster of objects and their producers are recognized as distinctive in this world and their distinctiveness is signaled by giving the cluster a new label. So, the first stage concerns the arrival of new labeled clusters in an existing world.
The second stage begins when members of the parent world who focus on the cluster seek to conceptualize it. In this phase, agents seek to understand what makes the objects in the cluster different. Such understandings do not arise from thin air. Usually, these understandings emerge from loci of intense interaction, which we refer to as “scenes” (Lena and Peterson 2008; Straw 1991). In our account, scenes develop within worlds and sometimes develop into new, full-fledged worlds. Therefore, the second step involves modeling the likelihood of the emergence of a scene.
The final step examines the likelihood a scene will develop into a world. We propose that this step involves the initiation of widespread taken-for-grantedness of the meaning of the scene’s organizing concept.
We first develop an ontology of clusters, scenes, and worlds. We then use this ontology to define the key parameters: (1) the hazard of arrival of a labeled cluster, (2) the hazard of emergence of a scene from a cluster (where actors interested in a cluster actively search for meaning), and (3) the hazard of emergence of a world from a scene. Finally, we examine how the social structure of a world affects these hazards. This application shows how the theory can be deployed in sociological arguments.
Related Work
Prior sociological research typically discusses the growth and expansion of producer-audience concepts in single settings. Some work focuses on the emergence of organizational forms,¹ such as small-scale, artisanal beer (Carroll and Swaminathan 2000). This research highlights the role of public participation in social spaces as instrumental in the evolution of the collective concept. However, this work pays less attention to cognitive mechanisms than to the provision of organizational resources or shifts in social trends, such as the diffusion of anti–mass market sentiments among consumers, as factors that facilitate the development of the collective concept.
Typical analyses of emergence suffer from a single-minded focus on the positive cases of emergence. These analyses take as a starting point that some focal collective concept did emerge, and they then examine the conditions that appear to explain the observed emergence. Such “endogenous sampling” precludes valid inference about the causes of emergence. One practical implication is bias in estimates of effects of social structural conditions on emergence.
To the best of our knowledge, only a few studies have avoided this selectivity problem. Ruef (2000) analyzed the historical emergence of organizational forms in the health care sector by using media mentions to build a semantic space, divided the space into regions, and treated the first media mention of a new specific form in any subregion of the space as the measure of emergence. This approach allowed the identification of both positive (observed emergence in a region) and negative (no emergence in a region) cases.
Van Venrooij’s (2015) analysis of the emergence of genres of electronic dance music follows the general outlines of Ruef’s (2000) study. Van Venrooij uses a rich corpus of texts to build a semantic space and analyze the occurrence and nonoccurrence of arrivals of new genres in regions of the meaning space. The arrival of a genre in that semantic region is measured as the release of a compilation album of music in that genre. Koch, Silvestro, and Foster (2020) shift the unit of analysis from genre to music bands as objects; they analyze the proliferation and diversification of styles of metal music by estimating birth and death rates of bands in these styles in the metal domain.
These studies rely implicitly on a cognitive argument. They use the theory of density-dependent legitimation and competition (Hannan 1986; Hannan and Carroll 1992). One component of this model treats the hazards of founding and mortality of organizations characterized by a focal form as being shaped by its taken-for-grantedness, where this term refers to being perceived as a natural part of the social world.
One limitation of this work is that it treats an organizational form as having emerged when the first instance has appeared (Bogaert et al. 2014). But a collective concept such as an organizational form gets established with the emergence of consensus. The theory we propose fits better with the imagery that a collective concept builds with multiple instances through taken-for-grantedness (Hannan 2022b).
Taken-for-grantedness refers to familiarity (what more recent work calls cognitive fluency; Hannan et al. 2019) with the collective concept. Such familiarity strengthens with greater exposure to repeated instances of that concept. According to the standard model of density dependence, taken-for-grantedness increases with exposure at a decreasing rate.
Research based on density dependence relies implicitly on collective cognition, but it does not treat this explicitly. This work does not develop a micro-macro connection (there are no agents in this account), and it does not examine the role of social structure in enabling interaction and communication among social agents.
This article highlights the role of cognitive mechanisms and proposes a way to avoid selectivity. In addition, we examine how social structure among agents can shape the dynamics of cognition and engagement in the emergence process. We build the argument using standard first-order logic (the table in the Appendix provides a glossary of relevant notation).
Concepts and Categories
Individual Level
We begin by considering individual cognition and establishing the notions on which we build an account of collective concepts. A concept is a mental representation by a person that provides expectations and beliefs about entities (e.g., people and events), which we call objects. These expectations concern the values of a set of relevant “features.” A person’s mental representation of a particular entity, such as a software product, a sports contest, or an academic publication, can be represented as a position in semantic space. The dimensions of such a space are the possible values of the features for which the concepts provide expectations and that matter for judging whether entities are instances of the concept. Concepts are generally paired with labels (linguistic tags) because people cannot directly share their mental representations but do communicate using labels. Our formalization uses labels to indicate concepts.
A contemporary rendering treats a concept as a probability density function, π(x | c), defined over a semantic space. Here, x denotes the position of the mental representation of an object in the space. This function, called a “concept likelihood,” gives the subjective probability (or belief) that an object that is an instance of a concept will have some combination of values of relevant features (a specific position in the semantic space). Contemporary quantitative semantics (including that built into large language models) treats this probability measure as providing the “meaning” of a concept (e.g., Boutyline and Arseniev-Koehler 2025).
The foundational empirical work on concepts and categories did not pay much attention to semantic spaces. Work in the style of the pioneer Eleanor Rosch uses commonly understood concepts and subconcepts (e.g., fruit: apples, grapes, and tomatoes).² Researchers reasonably take for granted that subjects from the same culture use a common semantic space to represent such concepts. If researchers have doubts that people use common semantic spaces, they generally conduct experiments in which they teach subjects novel concepts. That is, researchers design the space and use it to represent stimuli.³ However, when attention shifts to concept learning in natural settings, attention must be paid to the space. How do agents come up with featural representations? Previous research reveals that some features of the context, such as the task at hand, induce agents to engage in this cognitive work. We argue that the emergence of a distinctive cluster of objects simplifies the task. Collective action by interested agents supplies the necessary motivation.
Categorization means labeling, or assigning observable entities to concepts, for example, “That is a large language model.” Because concepts lack boundaries, categorization involves some uncertainty. The modern Bayesian approach looks at categorization as a problem of statistical inference (Anderson 1991; Tenenbaum and Griffiths 2002). Such an analysis considers a person deciding whether an object is an instance of a concept. The instance-of idea can be expressed formally with the predicate IS-A(c, o), which has the intended reading of “The focal person believes object o is an instance of concept c.” This approach interprets the strength of an agent’s belief in the truth of hypothesis IS-A(c, o) as a subjective probability.
Suppose one must answer “yes” or “no” to the question, “Is this researcher, o, a cognitive sociologist, c?” The Bayesian approach holds that the probability of answering “yes” is given by the Bayesian categorization probability:
P(c | x) = π(x | c) Pr(c) / Pr(x), (1)
where π(x | c) is the concept likelihood and x is the position of object o in the semantic space. The other terms on the right side, Pr(x) and Pr(c), are the agent’s priors on the general distribution of feature values in the domain and on the likelihood an entity in that domain is an instance of the concept. This formulation links the notion of a concept to judgments about objects.
It might be helpful to give a toy example of a probabilistic concept. Suppose someone’s concept “circus” involves three binary features: (1) a traveling troupe of performers, (2) high-wire acrobatics, and (3) trained animal acts. The semantic space here is a cube with eight vertices ([111], [110], and so forth). An individual’s circus concept assigns probabilities to the eight feature combinations, and the concept is given by π(f1, f2, f3 | c). For instance, it might assign 0.70 to [111], 0.15 to [110], 0.15 to [101], and zero (epsilon) to the others. Entities with all three features are highly likely to be categorized as a circus, those with a troupe of performers and exactly one other feature have a low likelihood, and all other configurations have extremely low (essentially zero) likelihood.
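The toy example can be worked through numerically. The following is a minimal sketch, not the authors' implementation: the concept likelihood comes from the example above, whereas the prior Pr(c) = 0.30 and the uniform likelihood for non-circus entities are assumptions introduced only so that Pr(x) can be computed coherently. The resulting categorization probabilities depend on these assumed values.

```python
from itertools import product

# Concept likelihood pi(x | c) for the toy "circus" concept over three binary
# features (traveling troupe, high-wire acrobatics, trained animal acts).
# These values are taken from the example in the text.
EPSILON = 0.0  # the text's "zero (epsilon)"; set to exactly zero for simplicity
concept_likelihood = {x: EPSILON for x in product((0, 1), repeat=3)}
concept_likelihood[(1, 1, 1)] = 0.70
concept_likelihood[(1, 1, 0)] = 0.15
concept_likelihood[(1, 0, 1)] = 0.15

# ASSUMPTIONS (not in the text): the agent's prior that an entity in the domain
# is a circus, and a uniform likelihood over the eight vertices for entities
# that are not circuses.
PRIOR_C = 0.30
non_concept_likelihood = {x: 1 / 8 for x in concept_likelihood}


def prior_x(x):
    """Pr(x) by the law of total probability over 'circus' and 'not circus'."""
    return (concept_likelihood[x] * PRIOR_C
            + non_concept_likelihood[x] * (1 - PRIOR_C))


def categorization_probability(x):
    """Bayes' rule: P(c | x) = pi(x | c) * Pr(c) / Pr(x)."""
    return concept_likelihood[x] * PRIOR_C / prior_x(x)


if __name__ == "__main__":
    for position in sorted(concept_likelihood, reverse=True):
        print(position, round(categorization_probability(position), 3))
```

Under these assumptions, the position with all three features receives the highest categorization probability, the two positions with a troupe and one other feature receive a lower one, and every other position receives zero.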
A category is a stochastic realization of an underlying Bayesian categorization process. In formal terms, a person’s category for the concept labeled c is the crisp set of objects the person judges to be instances:
Collective Level
The analysis of worlds requires the concepts and categorizations of individuals be aggregated at a collective level. What groups of agents “think” reflects the concepts of each member because collectives do not possess concepts in the usual sense. But exactly how do collectives think? This is an unresolved question. We address it with a formulation of the degree to which a set of agents believes their meanings are the same or at least highly similar.
Taken-for-Grantedness
The taken-for-grantedness of concept c for individual m, in notation, g(c, m), is related to the categorizations of other members of the group. Because others’ concepts are not knowable, agents learn what others think by attending to their observable categorizations of objects. The more an agent observes that categorizations by other members agree with their own, the higher the likelihood of relying on such categorizations for inferences about objects when their feature values are unobserved. We refer to this process as involving taken-for-grantedness.
We use the model proposed by Hannan et al. (2019), which begins with the effect of an object’s categorization by one group member on the inference made by another. The postulated process works as follows. A Bayesian agent faces a situation that calls for an expectation about the feature values of a not-yet-seen object. Absent information about others’ categorizations, a Bayesian will use the prior probability distribution of the feature values in the domain, Pr(x) in Equation 1, to form an expectation. If a member learns of another member’s categorization and treats it as credible, then her expectations about the relevant feature values will instead be based on her own concept likelihood, π(x | c).
The postulated process treats inference in the case of possible social influence as involving a mixture distribution, a weighted combination of Pr(x) and π(x | c) with weights that reflect social influence at work. If alter’s categorization has no effect on ego’s beliefs, then the mixture puts all the weight on the prior; there is no information update. Otherwise, the weight on the concept likelihood increases with the strength of the influence.⁴
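The mixture can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name, the two-position semantic space, and the numerical values are hypothetical, and the single scalar weight stands in for whatever influence weights the full model specifies.

```python
def mixture_expectation(prior, concept_likelihood, weight):
    """Return (1 - weight) * Pr(x) + weight * pi(x | c) for every position x.

    weight = 0 means alter's categorization has no influence on ego, so the
    expectation is just the prior; weight = 1 means ego relies fully on her
    own concept likelihood.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {
        x: (1 - weight) * prior[x] + weight * concept_likelihood[x]
        for x in prior
    }


if __name__ == "__main__":
    # Hypothetical two-position semantic space, for illustration only.
    prior = {"x1": 0.5, "x2": 0.5}
    likelihood = {"x1": 0.9, "x2": 0.1}
    for w in (0.0, 0.25, 1.0):
        print(w, mixture_expectation(prior, likelihood, w))
```

Because both inputs are probability distributions over the same positions, any weight in [0, 1] yields a distribution that also sums to one.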
The next step considers one member and the group. The taken-for-grantedness of a concept for an individual in the context of a group is the strength of the individual’s propensity to accept the implications of categorizations by any other group member for whom they lack observations of prior categorizations. An agent takes the meaning of a concept for granted in the context of a group if she treats categorizations by any of its members as informative, even without prior information about the member (Hannan et al. 2019:236).
We assume taken-for-grantedness in a group is about the meaning of a concept,
If all group members rely only on their own categorizations (there is no social influence), then G is zero. Otherwise, G increases with the average over all (ordered) pairs of members of the weight placed on others’ observable categorizations (Hannan et al. 2019:241).
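A minimal sketch of this aggregation follows. The text states only that G is zero without social influence and increases with the average pairwise weight; setting G equal to that average is the simplest function with these properties and is an assumption made here, as are the function name and the example weights.

```python
from itertools import permutations


def group_taken_for_grantedness(weights, members):
    """Average influence weight over all ordered pairs (ego, alter), ego != alter.

    `weights[(ego, alter)]` is the weight ego places on alter's observable
    categorizations; missing pairs are treated as zero influence.
    ASSUMPTION: G is taken to equal this average, the simplest increasing
    function of it.
    """
    pairs = list(permutations(members, 2))
    if not pairs:
        return 0.0
    return sum(weights.get(pair, 0.0) for pair in pairs) / len(pairs)


if __name__ == "__main__":
    members = ["a", "b", "c"]
    no_influence = {}
    some_influence = {("a", "b"): 0.6, ("b", "a"): 0.6, ("c", "a"): 0.3}
    print(group_taken_for_grantedness(no_influence, members))
    print(group_taken_for_grantedness(some_influence, members))
```

Using ordered pairs keeps the measure sensitive to asymmetric influence, for example when one member attends to another's categorizations but not the reverse.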
In defining a collective version of categorization and other judgments, we carry over the perspective built at the individual level. Doing so encounters complex aggregation issues (Hannan 2022a, 2022b). Some members of a collective might regard an object as an instance of a concept and others not.
Introducing a complicated account of the nature of aggregation can deflect us from the substantive issue. We simply refer to collective-level categorization and other judgments and leave open exactly what this means. We refer to general agreement to identify situations in which the level of agreement among members is high enough to justify concluding the collective thinks (nearly) alike about the matter.
We introduce a predicate, TFG(c, M), to express such an aggregation. Its intended reading is “The concept denoted by c has a generally taken-for-granted meaning for the set of agents M.” The following meaning postulate fills in the meaning of this predicate, links it to taken-for-grantedness, and instantiates the existence of the threshold.
Meaning Postulate 1: A generally taken-for-granted concept is defined as follows.
Let c denote a label for a concept and M denote membership of a group.
By leaving the threshold ω unspecified, we construct a qualitative account, one that expresses only our directional intuitions.
The tendency to update expectations about objects from categorizations by others comes from learning that other group members use the concept nearly the same way (Hannan et al. 2019). This means observing that others’ categorizations largely agree with one’s own. Because linguists and logicians dealing with concepts and categorization generally refer to a category as an extension of the concept, the argument we follow is expressed in terms of extensional agreement.
Group-level conceptual agreement is not observable because this requires measuring the distances between members’ concepts, and concepts are not observable. In contrast, agreement about categorizations can be observed and measured (Hsu 2006). This means empirical tests of the theory we present will likely depend heavily on the agreement on categorization.
We next consider an agent’s experienced agreement in categorization, first with another group member and then with the group as a whole. Let ea(c, a, b) denote the similarity of the categorizations in c by the focal agent, a, and by another group member, b, and let
Postulate 1: The taken-for-grantedness of a concept for an individual in the context of a social group is an increasing function of their experienced extensional agreement with group members about that concept⁵:
To connect the individual and group levels, we need to specify the relation between extensional agreement and individual-level taken-for-grantedness. We assume a linear relationship exists between the two quantities. We formulate this in the following auxiliary assumption⁶:
Auxiliary Assumption 1:.
With this additional assumption, G(c, M) is increasing in the group-level experienced extensional agreement, EA(c, M) (Hannan et al. 2019:248).
Proposition 1: The taken-for-grantedness of a concept in a group is increasing with average experienced extensional agreement about that concept.
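The chain from observed categorizations to group-level taken-for-grantedness can be sketched as follows. Several choices here are assumptions rather than part of the theory: pairwise similarity is measured with the Jaccard index, an agent's experienced agreement is the mean similarity with all other members, the linear map uses hypothetical slope and intercept parameters, and the group-level quantity is the mean of the individual ones.

```python
def jaccard(first, second):
    """Overlap of two categories (sets of objects); 1.0 when both are empty."""
    union = first | second
    return len(first & second) / len(union) if union else 1.0


def experienced_agreement(categories, agent):
    """Mean similarity between the agent's category and each other member's."""
    others = [other for other in categories if other != agent]
    if not others:
        raise ValueError("agreement needs at least two group members")
    total = sum(jaccard(categories[agent], categories[other]) for other in others)
    return total / len(others)


def individual_tfg(categories, agent, slope=1.0, intercept=0.0):
    """ASSUMED linear form: g = intercept + slope * experienced agreement."""
    return intercept + slope * experienced_agreement(categories, agent)


def group_tfg_from_agreement(categories, slope=1.0, intercept=0.0):
    """ASSUMED aggregation: the mean of individual taken-for-grantedness."""
    values = [individual_tfg(categories, a, slope, intercept) for a in categories]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical categories: each member's set of objects judged instances of c.
    high = {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {1, 2}}
    low = {"a": {1}, "b": {2}, "c": {3}}
    print(group_tfg_from_agreement(high), group_tfg_from_agreement(low))
```

With a positive slope, the group whose members categorize nearly the same objects scores higher than the group whose categories do not overlap, which is the direction Proposition 1 states.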
An Ontology for a Producer-Audience World
The Cognitive Level: Concepts
Our representation of a world has two components: cognitive and material. The cognitive part consists of a triplet of concepts. The first is the organizing or focal concept c that gives meaning to a set of observable objects, as described in the previous section. The other two concepts are “producer of c” and “audience for c.”
Agents have concepts that provide meanings for producer of c and audience for c. Fit to expectations about the values of relevant features, these concepts shape judgments about who gets regarded as a producer or audience member. Concepts that give the meaning “producer of c” establish expectations about the making and public presentation of objects. For example, a producer of science, or a scientist, is expected to pursue the creation of new knowledge by conducting research, publishing their findings in journals or books, presenting these results at conferences, and giving lectures or teaching.
Similarly, the concept “audience of c” establishes expectations about agents who have some significant involvement with the objects presented by producers. For example, in banking, the audience includes the consumers making deposits and borrowing money and the regulators supervising the conduct of financial institutions. The audience role is not passive; enacting this role requires conceptual work and social behavior. Take the role of the critic in cultural production. Critics proclaim that some objects are exemplary, and they provide interpretations, cultural context, and aesthetic or quality judgments (Caves 2000).
Often, all that is expected of audience members is that they engage with what producers present publicly. But sometimes, the roles are more detailed. For example, in education and medical care, the roles of student and patient come with expectations that involve hierarchical components of authority, power, and prestige (Parsons 1975). Given that our model of taken-for-grantedness depends on agents learning from others’ categorizations, we consider only public engagement, or actions observable to others.
The concepts of producer and audience presumably have a probabilistic structure. Some configurations of feature values are highly likely, and others are less likely or unlikely for incumbents in these roles. The meanings of these concepts can vary in different contexts and over time. However, introducing such extreme flexibility introduces too much complexity. For tractability, we rely on the simplifying assumption that the meanings of the producer and audience roles are inherited from higher-order worlds, not newly created in a context. For instance, the role of professor varies little over the nature of the subject (e.g., literature or logic), as does the role of peer evaluator or student (the audience). With this simplification, we can identify worlds by labels or their focal coordinating concepts.
The Material Level: Objects and Agents
So far, our construction is purely cognitive. We fill out the meaning of a world by considering the agents who take on the roles and what they produce and evaluate. Objects have a central role because they are public demonstrations of private representations, carriers of forms both produced and observed. This notion reflects Sperber’s (1996, 2005) framework for cultural evolution, which treats culture as a set of private representations diffusing via public discussions of these interpretations of objects that manifest these representations.
Producers and objects are interdependent. An observable realization of a concept is the result of a producer’s thoughts and actions, and an agent is a producer by virtue of creating and presenting objects intended to exemplify the concept. To take the producer role with respect to a concept, an individual must have conceptualized it and presented an object as an instance. We denote the set of producers of the focal concept as “P.”
An agent is an audience member because the actor has conceptualized an object as a realization of c, inspected it, and engaged with it. We denote the set of audience members for the focal concept (at a time point) as “A.”
Production and engagement are social. Developing a new concept in complete isolation can produce novelty and, in principle, can sow the seeds of the emergence of a novel world. However, unless the work is shared with others and allows others to become interested, connect, and participate, the concept and its observable realization will remain private and will not have sociological significance.
Engagement requires nontrivial individual participation in a social context. It can take different forms and vary in intensity. In a music world, for example, attending a concert, opening a music club, performing music in public, signing an artist to a record label, and reviewing music performances each represent a form of engagement. Delivery of cleaning supplies to the club where music is performed, in general, does not.⁷
Concept-related engagement means engagement in light of the concept. Engagement differs from consumption because audience members can engage with an object or performance with the goal of making a purchase decision or simply inspecting it out of curiosity. Similarly, consumers can buy without paying attention to the aesthetics that informed production.
We define the audience for a concept (genre, style, or form) as the set of agents that exhibit nontrivial levels of concept-related involvement (levels exceeding a threshold). The producer concept involves attending to the work of other producers. For instance, in science worlds, expectations for scientists include evaluating the work of other scientists, for example, as referees for publication or funding, for promotion and recognition as part of professional committees, or providing developmental feedback in a mentorship relationship. In general, producers are audience to one another.
The notions of object, producer, and audience are tied to a particular coordinating concept. This matters because two agents might engage an object in light of different concepts that come from different higher-level worlds. For example, a customer in a restaurant might consider the meal an instance of Japanese fusion, and another might consider it an instance of luxury dining.
Finally, membership of a world is the collection of producers and audience members, M = P ∪ A = A, where the second equality follows from the assumption that producers are audience to each other. We complete the ontology by combining the three components to define a producer-audience world, using the predicate WORLD(c, O, M).
Definition 1: Producer-audience world:
WORLD(c, O, M) is the case if and only if
1. the sets of its members, M, and objects, O, are not empty; and
2. its coordinating concept has a generally taken-for-granted meaning for the membership, that is, TFG(c, M).
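Definition 1 can be read as a simple conjunction of checks. The sketch below is illustrative only: the threshold value is hypothetical because the theory leaves ω unspecified, and treating the threshold comparison as "greater than or equal to" is an assumption.

```python
def tfg(g_value, omega):
    """TFG(c, M): group taken-for-grantedness G(c, M) reaches the threshold.

    ASSUMPTION: the comparison is non-strict; the text leaves omega unspecified.
    """
    return g_value >= omega


def is_world(objects, members, g_value, omega):
    """WORLD(c, O, M): O and M are non-empty and c is generally taken for granted."""
    return bool(objects) and bool(members) and tfg(g_value, omega)


if __name__ == "__main__":
    omega = 0.8  # hypothetical threshold, for illustration only
    print(is_world({"recording"}, {"critic", "musician"}, 0.9, omega))
    print(is_world({"recording"}, {"critic", "musician"}, 0.4, omega))
    print(is_world(set(), {"critic"}, 0.9, omega))
```

Note that nothing in the check refers to social ties among members, which matches the deliberately unrestrictive definition.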
This definition does not require members of a world share ongoing social ties or an identity. We opt for this less restrictive definition to accommodate situations in which the organizing concept has been highly institutionalized. In such worlds, such as classical Western music or medical care, the nonproducer audience does not typically share a common identity or interact outside the world. Nonetheless, many worlds (e.g., political movements) have a more social character and develop shared identities.
The Stages in World Emergence
Stage 1: Distinctive Clusters of Objects
When do novel worlds arise? This process starts with the rise of what gets perceived as distinctive clusters of objects in a parent world. Hannan, Pólos, and Carroll (2007) argue that new organizational forms are created when distinctive clusters of producers/products appear in some setting, are labeled, and gain collective meaning. Explaining exactly where these objects and producers come from is not the focus of our theory. However, we expect that producers experimenting with ongoing activities will originate such transformations, for example, scientists conducting research that introduces new ideas or artists composing, painting, and designing new aesthetic solutions to creative problems. New producers in a setting using new methods also contribute to these dynamics.
Sperber’s (1996) model of cultural evolution provides a representation of how the process can start. An object connects the mental representation of its “creator” to that of another agent, who then goes on to produce something of their own. In this transmission process, either by direct communication or by imitation, objects transform; they do not necessarily reproduce exactly. This happens because some information might be lost in the process or because the agents do not seek to produce a replica but rather something that follows their own dispositions and preferences (Sperber 2005).
Another aspect of Sperber’s (1996) model helps explain how object clusters arise. Objects and their mental representations do not, he argues, depart randomly from those that preceded them. Instead, they tend to gravitate around existing variants in the space of what is seen to be possible. Existing objects and concepts, in other words, function as “attractors.” With cultural items clustering around these attractors, culture allows for change while also maintaining some stability.
The key is that a subset of members of the focal world applies a new collective label to the cluster of distinctive objects. In building this construction, we use a label function, lab(o, a), that maps pairs of objects and agents to a collection of labels. This notion of label differs from the IS-A predicate introduced previously. The latter is a conceptual statement: The agent regards the object as an instance of the concept. The label function does not require this. This function is extensional: It points to a set of objects.
A subset of the existing world’s membership that applies the same label, say l, to some subsets of the world’s objects is given by
This generic construction does not identify a unique set. Suppose M(l) contains two agents, a and b. The formula says that both members apply the label under consideration. But then the sets that contain only a and only b also satisfy the condition defining the l-applying membership (Equation 3). This is a problem if we want to connect this formulation to a research design that supports tests of hypotheses about properties of worlds that make them more (or less) likely to generate clusters. Doing this requires a definition that identifies a unique set of members. It seems natural to single out the most inclusive set.
We can do this by examining the powerset of the set of members, denoted
Definition 2: A maximal label-l cluster in a world is a set of objects, the extension of l for the maximal collection of members that apply l.
Finally, there is an issue of the social significance of the cluster. According to the proposed definition, a single member can be responsible for a cluster. We want to restrict attention to clusters that are large enough to serve as a basis for the emergence process. We introduce a threshold on the minimum size of the maximal membership that applies a particular label. For example, one might impose the restriction |M*| > θ.
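The construction above can be illustrated with a minimal Python sketch. It rests on two assumptions that the text does not settle: a member "applies" a label if it attaches the label to at least one object, and the extension of the label is the union of the objects so labeled by any member. Under the first assumption, the most inclusive qualifying membership is simply the set of all agents who apply the label, so the power set need not be enumerated. The names `lab`, `maximal_membership`, `maximal_cluster`, and `is_significant`, and the data, are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

# lab maps (object, agent) pairs to the set of labels the agent applies to the object.
Lab = Dict[Tuple[str, str], Set[str]]


def maximal_membership(lab: Lab, label: str) -> FrozenSet[str]:
    """Most inclusive set of agents that apply `label` to at least one object (M*)."""
    return frozenset(agent for (_obj, agent), labels in lab.items() if label in labels)


def maximal_cluster(lab: Lab, label: str) -> FrozenSet[str]:
    """Objects labeled `label` by at least one member of M* (assumed: union over members)."""
    return frozenset(obj for (obj, _agent), labels in lab.items() if label in labels)


def is_significant(lab: Lab, label: str, theta: int) -> bool:
    """Size restriction on the maximal membership: |M*| > theta."""
    return len(maximal_membership(lab, label)) > theta


# Invented example: agents a and b apply the label "punk"; agent c does not.
lab = {
    ("o1", "a"): {"punk"},
    ("o2", "a"): {"punk", "rock"},
    ("o2", "b"): {"punk"},
    ("o3", "c"): {"rock"},
}
members = maximal_membership(lab, "punk")  # agents a and b
cluster = maximal_cluster(lab, "punk")     # objects o1 and o2
```

Both {a} and {b} alone satisfy the label-applying condition in this example, which is the nonuniqueness the text describes; taking the most inclusive set resolves it.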
We see no basis on which to predict what clusters emerge, just that some do. In this view, the natural way to model the emergence of clusters in a world is as an arrival process. Let N(W, t) denote the number of labeled clusters (of any label) that have emerged in the world W until time t. The arrival process is defined in terms of the following hazard.
Definition 3: The hazard of arrival of (labeled) clusters is given by
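A conventional way to write such a hazard for the counting process N(W, t) is sketched below. It assumes that κ (the symbol used in Proposition 2) denotes this rate and that at most one cluster arrives in a sufficiently short interval.

```latex
\kappa(W, t) \;=\; \lim_{u \downarrow 0} \frac{\Pr\{\, N(W, t+u) - N(W, t) = 1 \,\}}{u}
```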
Stage 2: Exploring the Meaning of a Cluster—Scene
A cluster of objects recognized by producers and other audience members in a world is a candidate novel world. Whether a candidate becomes a full-fledged world depends on whether the world’s membership comes to conceptualize the label and then generally takes for granted that others attribute the same meaning to this label.
The cognitive and behavioral work of conceptualizing a cluster and coordinating concepts and categorizations generally takes place in localized structures characterized by repeated patterns of perception, interaction, and discussion among social agents concerning objects. As visible representations, objects have mental representations as their cause and effect. Sperber formalized this idea in a model of cultural evolution that treats constructive cognitive processes as involved both in representing mental inputs and producing observable outputs (Claidiere and Sperber 2007; Sperber 1996). Mental representations sparked by perceptions of objects can, in turn, cause the production of further objects that can spark revisions of mental representations, and so on.
We focus on a localized high-energy social structure, which we call a scene. The concept of a scene has been used in sociological studies of the production, performance, and reception of popular music. Such work focuses on situations in which spatially close groups of producers, musicians, and fans come together to share their common musical tastes, create music for their own enjoyment, and distinguish themselves from others (Lena 2012; Lena and Peterson 2008; Peterson and Bennett 2004; van Venrooij 2015).
Consider, for example, the scene that developed from 1973 to 1975 in New York City’s East Village and Gramercy neighborhoods. During that time, famed music clubs, such as CBGB and Max’s Kansas City, opened to book young, untested talent to play urgent, edgy, anti-establishment music (Lena 2012). This “new wave” of music came to be referred to as “punk” (Kristal 2005).
CBGB’s owner, Hilly Kristal, recounted there were many young bands in New York at the time but just a few places where they could perform and play their own music. The new clubs offered them a home. Tommy Ramone of the band the Ramones, who had an early residency at CBGB, recalled that the initial patrons were “artists, bohemians, drag queens, and Hell’s Angels” (Kristal 2005). Others started coming around, musicians primarily, and then customers, too. Magazines such as Creem started reporting on the scene. The reputations forged at these clubs would lead many bands to sign record-label deals, with Patti Smith among the first to do so in 1975 with the new Arista label.
Despite its widespread use, the notion of scene lacks a precise definition. Silver and Clark (2015) describe it as an open idea, and Woo, Rennie, and Poyntz (2015:288) consider it a “sensitizing concept” used to give a general sense of reference and guidance. We try to be more specific about what we mean by a scene: an interaction of producers and audiences characterized by high energy. This use appears consistent with the original interpretation of scenes in cultural studies. For instance, Straw (1991, 2015) characterizes scenes as social spaces of effervescence, perhaps echoing Durkheim’s notion of the feeling of belonging and assimilation produced by collective ritual action, where agents express an excess of sociability. Collins (2004) developed a related notion of emotional energy, a feeling of enthusiasm, confidence, and motivation individuals derive from participating in a group.
High energy is a subjective state, but it is connected to cognitive and behavioral processes that appear critical in a scene. Becker (1982) claims that groups that become interested in new artistic work do not simply observe and choose favorites among known producers but also engage in actions requiring significant effort. In a scene, individuals take producer and audience roles and interact to understand and refine the meaning of the core activity and associated aesthetics. In doing so, they explicitly reflect on conceptual issues.
Scenes, in our view, are not restricted to arts worlds but occur in all kinds of domains. Consider, for example, this description of the early days of cognitive science:
In some respects, Cognitive Science was most successful at the very moment it was born. It is difficult to recapture the excitement of the first Cognitive Science Society meeting, but it was palpable. Each of the subdisciplines associated with cognitive science was well represented. It was like going to an exotic marketplace and sampling an intriguing variety of products. That there was not a common language only added to the interest and desire for commerce. (Bender, Hutchins, and Medin 2010:374)
Negro, Hannan, and Olzak (2022) examine such a dynamic in the domain of winemaking, specifically, the production of Barolo and Barbaresco wines (considered among the world’s best). Until the late 1970s, producers of these wines used a method that had not changed in a century. This traditional method produced very tannic wines that needed long aging in large barrels that were difficult to clean, which often gave a “barnyard” odor to the wine (appreciated mainly by connoisseurs). Younger producers looked to France for superior methods and began to use the small barrel common in the Bordeaux region. Soon, more young producers joined them. About 20 producers started meeting as a group, talking to each other about their experiments in vineyard and cellar practices, tasting and comparing each other’s wines, and learning from these critiques. By the mid-1980s, as these meetings continued, the members began to produce a distinctive cluster of wines, which critics called “modern.”
As these examples highlight, scenes are social structures in which a membership explores the meaning of a cluster of objects. As we see it, such structures heighten the chances for the emergence of collective taken-for-grantedness. The model of this process sketched previously requires that agents in a group can observe each other’s relevant categorizations. The intense interaction and communication that characterize scenes supply the needed opportunity.
Insofar as meanings diverge for members of the scene and the rest of the world, “disfluent” exchanges require effort at clarification over the boundary of the scene. Sorting out misunderstandings and disagreements requires effort to articulate, discuss, share, compare, and contrast ideas and experiences. In other words, such situations can be characterized as involving high-energy states.
The onset of taken-for-grantedness lowers the collective energy state. Once members assume they have (nearly) the same meaning for the key coordinating concept, they are less likely to engage in discussion and debate about it. For this reason, we restrict the meaning of scene to the higher-energy state characterized by a lack of taken-for-grantedness.
We introduce a predicate to express the idea that the membership that attends to a cluster is exploring its meaning, seeking to understand how the cluster differs from what is usually encountered in the focal world. Let EXP(l, M(l)) indicate “the membership M(l) is actively exploring the meaning of the cluster labeled l.” The high-energy activity of exploring meaning by introspection and communication with others is inconsistent with the low-energy state characterized by general taken-for-grantedness.
Postulate 2: The meaning of a label cannot be both under exploration and taken for granted by a membership at a time point.
We now turn to a formal definition of the scene.
Definition 4: A collection of members interested in a cluster of objects to which they assign a common label creates a scene if they begin (generally) to explore the meaning of the label.
Consider a (parent) world W = ⟨c, O, M⟩.
SCENE(⟨W, l, O*, M*⟩) is the case if and only if l ≠ c, O* ⊂ O, M* ⊂ M, and the membership M* is actively exploring the possible meanings of the label of the cluster, that is, EXP(l, M*) is the case.
The hazard of emergence of a scene is the limiting probability that those who attend to a cluster begin this search for meaning. So we express the hazard that a cluster becomes a scene as follows (here and in what follows, we must make the time points explicit in the notation).
Definition 5: The hazard of scene emergence is defined as
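One way to write this limiting probability is sketched below. It assumes that λ (the symbol used in Proposition 3) denotes the hazard and that the predicate EXP is indexed by time, with S = ⟨W, l, O*, M*⟩ as in Definition 4. The time-indexed form of EXP is an assumption of this sketch.

```latex
\lambda(S, t) \;=\; \lim_{u \downarrow 0}
  \frac{\Pr\{\, \mathrm{EXP}(l, M^{*}, t+u) \mid \neg\, \mathrm{EXP}(l, M^{*}, t) \,\}}{u}
```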
Stage 3: Emergence of a Novel World
Once a cluster has emerged, there are two labels in the picture: c and l. The two labels have a different standing: c is associated with a taken-for-granted concept, but l has only an extensional collective meaning. The last stage in the emergence of a novel world is for l to gain an intensional meaning, to become paired with a new taken-for-granted concept.
According to Definition 4, a scene is not a world (because its core concept is not taken for granted), but it would become one if the taken-for-grantedness grew to reach the threshold ω. The conversion of a scene into a world occurs if and when its membership takes for granted the meaning of the coordinating concept. Therefore, we define the hazard of the emergence of a world as follows.
Definition 6: The hazard of emergence of a novel world from a scene in a parent world is defined as follows.
Let
1 W ≡ ⟨c, O, M⟩ and S ≡ ⟨W, l, O*, M*⟩, where l ≠ c, O* ⊂ O, and M* ⊂ M;
2 WORLD(W, t) be the case for all the time points under consideration.
The first line in this definition gives a literal statement of the limiting probability of a transition from a scene to a world. The second line gives a logically equivalent restatement using Definition 4, which holds that the only difference between a scene and a world is the lack of taken-for-grantedness of the coordinating concept of a scene. The third line gives an equivalent statement using Meaning Postulate 1.
Social Structural Effects on the Three Transitions
The ontology and specification of hazards governing transitions in the dynamics of world emergence establish a possible foundation for testing causal theories. At the beginning of this article, we proposed that social structure affects who can observe whose actions, including categorizations. Hence, social structure can play an important role in the emergence process. We now develop this idea further and propose arguments about the effects of several dimensions of social structure on the world emergence process.
Transition 1: Arrival of Distinctive Object Clusters in a World
What properties of a world make it likely to spawn clusters of objects similar to each other and dissimilar from those in the rest of the world? Producers’ decisions and actions shape the features of objects, so we focus on producers. Producers are generally tied to other producers. They share new ideas with each other, and sharing new ideas in subgroups of producers can result in new clusters of objects. From this perspective, the degree of localization of social ties, generally called “network clustering” (Holland and Leinhardt 1971; Watts and Strogatz 1998), plays a key function. Clusters in a network of producers result in producers making objects that are similar to each other but different from the rest of the objects in the world.
The experiences, interactions, and communication shared in a group of producers are instrumental in achieving consensus on concepts. Research in cognitive science suggests that repeated interactions allow individuals to discuss and integrate different experiences and knowledge and to derive shared abstract representations of a problem (Fjaellingsdal et al. 2021). Sociological research has developed arguments consistent with this idea. Farrell (2003) studied groups of collaborators in art, science, and politics. These groups, which he called “circles,” are small scenes of peers who, through periods of dialogue and collaboration, construct a collective vision on how to produce new kinds of artworks, literary works, and social reform actions.
As Farrell’s (2003) analysis shows, the escalating reciprocity among scene members who work together leads them to experiment and discover new ideas or techniques that none would imagine alone. Members of circles generally show commonalities in beliefs, attitudes, and values. Their communication reinforces such commonalities so that they cohere into consistent cultural assumptions. These conditions appear to facilitate cognitive consensus: “[T]hinking aloud together, the collaborative process draws on each other’s memories, ideas, and thought processes. They operate almost as one mind” (Farrell 2003:285).
We formalize these ideas in terms of the coefficient of clustering in the network of ties among a world’s producers (in notation, PC). Our main line of argument here links the rise of clusters of objects to the degree of producer clustering. Producer clustering is measurable with the clustering coefficient, or degree of “community” structure as it is described in contemporary network science (Girvan and Newman 2002).
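As an illustration of how PC might be measured, the following Python sketch computes the average local clustering coefficient in the sense of Watts and Strogatz (1998) for an undirected producer network. This is only one of several possible operationalizations, and the function names and the four-producer network are invented for the example.

```python
from itertools import combinations
from typing import Dict, Set

Adjacency = Dict[str, Set[str]]


def local_clustering(adj: Adjacency, node: str) -> float:
    """Share of pairs of a node's neighbors that are themselves tied."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: nodes with fewer than two neighbors contribute zero
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj: Adjacency) -> float:
    """Mean of the local clustering coefficients over all producers."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical network: a triangle of producers (p1, p2, p3) plus p4, tied only to p3.
adj = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
pc = average_clustering(adj)  # (1 + 1 + 1/3 + 0) / 4 = 7/12
```

Modularity-based measures of community structure would be an alternative; the choice of measure is left open in the text.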
Postulate 3: Producer clustering increases the hazard of object clustering: [PC ↑ κ].
In the foregoing postulate, we use the symbol ↑ to denote a positive monotonic temporal relationship between the term to the left and the one to the right. More specifically, φ ↑ ψ means that if the level of ψ is the same for two units, say x and x′, at a time point and φ(x) > φ(x′) over a subsequent interval, then ψ(x) > ψ(x′) at the end of the interval. The Appendix provides the formal definition.
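The informal reading of φ ↑ ψ can be turned into a simple consistency check on panel-style observations. The Python sketch below is not the formal definition in the Appendix. It assumes that φ is constant for each unit over the interval and that "the same level of ψ" means equality up to a small tolerance. The `Unit` record and the numbers are invented.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterable


@dataclass(frozen=True)
class Unit:
    phi: float        # level of φ over the interval (assumed constant)
    psi_start: float  # level of ψ at the start of the interval
    psi_end: float    # level of ψ at the end of the interval


def consistent_with_up(units: Iterable[Unit], tol: float = 1e-9) -> bool:
    """True if no ordered pair of units contradicts φ ↑ ψ.

    A pair (x, y) contradicts the relation when ψ starts at the same level for
    both, φ(x) > φ(y), and yet ψ(x) is not greater than ψ(y) at the end.
    """
    for x, y in permutations(list(units), 2):
        same_start = abs(x.psi_start - y.psi_start) <= tol
        if same_start and x.phi > y.phi and not (x.psi_end > y.psi_end):
            return False
    return True


supporting = [Unit(0.8, 1.0, 1.6), Unit(0.3, 1.0, 1.2), Unit(0.5, 2.0, 2.1)]
violating = [Unit(0.8, 1.0, 1.1), Unit(0.3, 1.0, 1.2)]
ok = consistent_with_up(supporting)   # no contradicting pair
bad = consistent_with_up(violating)   # higher φ ends with lower ψ
```

Units whose starting levels of ψ differ are never compared, which mirrors the conditional form of the definition.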
What accounts for the high clustering of producers? We propose that heterogeneity plays an important role. A producer network with high social heterogeneity is likely to fracture into relatively disconnected clusters due to homophily effects. Homophily draws together socially similar producers. When the overall level of heterogeneity is high, distinct clusters of producers form that are more similar within and more dissimilar between.
A cluster, by definition, is characterized as having a relatively high level of interaction (Ertug et al. 2022) and communication (Bang and Frith 2017). Evidence consistent with these claims comes from studies at multiple levels of analysis. Sociological research at the individual or dyadic level finds that culturally similar individuals are more likely to voluntarily sort into social connections (Lizardo 2006; Mark 1998). At the meso level, Farrell (2003:285) found that more social and cultural similarities among peers in professional collaborations were positively associated with the formation of groups, particularly egalitarian groups. For example, members become more knowledgeable about each other’s work. They provide intellectual, emotional, and material support helpful for producing new kinds of objects and developing collective meaning. These groups also stand out from the rest, presenting what they do as an alternative—sometimes even an opposition—to the status quo. This situation is positively associated with homogeneity within producer clusters and heterogeneity between them and with the degree of clustering in the overall producer network.
Postulate 4: The coefficient of clustering of the producer network in a world increases with its social heterogeneity: [PHet ↑ PC].
Postulates 3 and 4 warrant the main result about the effect of heterogeneity on clustering.
Proposition 2: The hazard of the arrival of a cluster in a world increases with the social heterogeneity of the world’s producers: [PHet ↑ κ].
The proof of this proposition, and most that follow, applies the chain (or cut) rule: If φ, ψ, and χ are formulas, then φ → ψ and ψ → χ imply φ → χ. We supply proofs for the other cases.
Transition 2: The Emergence of Scenes
The previous stage focuses on the arrival of clusters of objects and producers. Cognitive work is involved in this process, with producers thinking about their activity and sharing their experiences with their peers. It does not necessarily follow that clustering leads to a collective representation of these objects and producers by the full membership.
How does social structure contribute to the likelihood a cluster gets conceptualized by a group? Hannan et al. (2007) argue that agents are unlikely to engage in cognitive and social effort to coordinate on a new label unless the cluster is relatively distinctive and large. Distinctiveness matters because it improves perception and comprehension (Murphy 2002). Distinctive clusters are sets of objects that are similar to each other, closely spaced in the semantic space, and distant, on average, from objects in the parent world. Size also matters because very small clusters (e.g., songs by a single artist recorded at nearly the same time or publications of a single research group about the results of related experiments) will generally have the greatest average similarity. The principles of cognitive economy, which discourage the formation of a great multiplicity of concepts for a world, imply that very small clusters are unlikely to be conceptualized. However, very large clusters are also unlikely to be conceptualized because they are unlikely to be distinctive from the rest of the world. Due to these opposing forces, clusters are most likely to be conceptualized at some middle levels of size and distinctiveness. For a Bayesian concept learner, this is a rational way to make sense of the world (Anderson 1991; Dasgupta and Griffiths 2022; Tenenbaum and Griffiths 2002).
This conclusion is similar to Rosch’s claim about the importance of basic-level concepts (Rosch et al. 1976). Rosch argued there is a trade-off between descriptive accuracy and parsimony. Suppose we want to use concepts that provide detailed and accurate distinctions. To do so, we would use very narrow concepts that create very high similarity among the items included. However, this cognitive strategy involves making a very large number of conceptual distinctions. The generally accepted view holds that humans are cognitive misers who seek to economize the use of scarce cognitive resources; this results in a tendency to form what Rosch called basic-level concepts. “Chair,” for example, is a basic-level concept; people find it easier to use “chair” than “furniture” (higher level) or “bar stool” (lower level). A large body of research supports Rosch’s claims (Murphy 2002).
We now build on the static image developed in this previous work using the notion of salience. Taylor and Fiske (1978) defined (perceptual) salience as the quality of a stimulus that makes it prominent, conspicuous, or otherwise noticeable compared with its surroundings; they describe salience as the “figure” that stands out against the less noticeable “ground.” Let salience(k, m, t) be a nonnegative real-valued function that records the perceptual salience of the object cluster k at time t for individual member m, and let Salience(t) denote the average salience of the focal cluster for members at time t.
Postulate 5: The likelihood a set of members of a world seeks to understand the meaning of a cluster label increases with the average salience of the cluster: [Salience ↑ Pr(EXP)].
Proposition 3: The hazard that a scene develops around a cluster increases with its average salience. If Conditions 1 and 2 in Definition 4 are satisfied at all the time points considered, then [Salience ↑ λ].
Proof: The consequent in this proposition is a limit of the probability that the members in M* are exploring the meaning of l at time t + u given they have not done so at time t (Definition 5). Postulate 5 and the definition of ↑ complete the chain.
Transition 3: The Emergence of Novel Worlds
A novel world is more likely to emerge from a scene when participation in it is more stable and coordinated. High consensus on the collective concept underlies such stability and coordination. This consensus is achieved when the taken-for-grantedness of the concept reaches a high level, which, as Berger and Luckmann (1966) argue, gives “cognitive validity” to shared meanings.
Empirical network science suggests that systematic engagement within a group generally increases the likelihood of consensus (Baronchelli 2018). However, sociological research following Simmel (1904) suggests that interaction and communication can also spark disagreement and cause conflict, even division within a group. Disagreement can arise from divergence in members’ vision of the scene as its activity becomes more intense. Or preexisting disagreement, not previously recognized, can become apparent from more active discussions among members that reveal different motives for participating in the scene. Or disagreement can reflect competition to gain higher status within the scene. The drive for control over interpretation of the activity can precede or be the by-product of scene emergence. Either way, disagreement can result in scene members developing different interpretations (concepts).
We do not need to adjudicate between these rival claims. Either path can yield the pattern we claim. On one path, engagement allows many opportunities to learn categorizations of other members. If these categorizations generally agree, possibly due to the updating of concepts to move closer to the group, then the resulting taken-for-grantedness will be high (Hannan et al. 2019). On another path, members who find their categorizations do not agree with the consensus are likely to leave the scene.
We denote the average level of engagement for members of a scene as Eng(S). The main intuition is that high average engagement promotes the development of extensional agreement for the coordinating concept.
Postulate 6: The degree of extensional agreement about a coordinating concept increases with the average level of engagement of the scene’s members: [Eng ↑ EA].
Postulate 7: The average level of engagement in a scene decreases with the social heterogeneity among its members: [MHet ↓ Eng].
Now we have implications that connect member heterogeneity with extensional agreement and the hazard of emergence.
Lemma 1: Extensional agreement about a scene’s focal concept decreases with the social heterogeneity among its members: [MHet ↓ EA].
Proposition 1, Lemma 1, and Definition 6 imply the main result.
Proposition 4: The taken-for-grantedness of the focal concept of a scene decreases with the social heterogeneity among members of the scene: [MHet ↓ G].
We want to develop the implications of the argument chain that warrants Proposition 4 for the hazard of emergence. It seems intuitive that a higher G at one time increases the likelihood that G at a later time will have crossed the threshold ω. However, this does not follow logically from the information contained in the argument. One would need a model of the dynamics of G to learn the conditions under which this intuition can be proven. Lacking such a model, we offer a hypothesis that builds on the argument behind the proposition but does not follow logically from it. We do this so we can tie the argument back to the hazard of emergence.
In stating the hypothesis, we need to impose the condition that G lies below the threshold for TFG, denoted here by ω, during the interval that ends just before the hazard is evaluated because the hazard is undefined otherwise. We do this as follows (the Appendix explains the notation):
Hypothesis 1: The hazard that a novel world emerges from a scene decreases with the social heterogeneity among its members: [Ω → MHet ↓ µ].
Role Heterogeneity in Scenes
Social heterogeneity, as discussed previously, concerns sociodemographic characteristics. Another relevant form of heterogeneity involves role differences among scene members. When a high proportion of members adopt the producer role, the membership engages in conceptual work and responds to one another’s ideas. One can expect communication to flow more freely and openly.
Participating in a scene as a producer does not simply mean making an object and presenting it to an audience. Producing something novel requires several activities that take time and effort. Research on creativity suggests the production of novel objects is a multistage process (Mumford 2003). The production process starts with investing effort in learning and practicing the knowledge, skills, and expertise for work in a domain. After this preparation stage, ideas can flow through conscious reflection or unconscious incubation. A stage of insight follows where a producer experiences the emergence of new ideas. Subsequently, the ideas are evaluated to determine whether to discard, retain, or review them. The final stage is the externalization of an idea in a concrete and observable form.
Producers thus engage with a concept before they make an object, and the very activity of making the object also results in engagement. Nobel laureate Morrison (1993) described herself as someone who “practices and practices and practices,” and by her own admission, it was this sustained application of ideas that made her aware of formal structures in her art and promoted literary inventions.
The practice of production also contributes to agreement about the collective concept. Farrell (2003) studied groups of scientists, social activists, and artists, including the Impressionists, who in the 1860s challenged the rules of French academic painting. The Impressionists lived and worked in proximity; they often painted alongside one another, observing and commenting on each other’s work. They participated in aesthetic experimentation, collectively judging their innovations, incorporating some and rejecting others, and effectively working out a consensus on what their new art was and was not. Farrell (2003) calls this engagement process “instrumental intimacy,” a situation in which individual members align their cognitive processes. When such alignment occurs, group members feel validated in challenging established concepts and authority because their individual ideas “feel” stronger because they are shared by others in the group. In such social contexts, high engagement sustains the scene, and the associated shared experience facilitates consensus on the focal concept.
Because producers are audience to one another, our model does not require the presence of nonproducer scene members for a world to emerge. In practice, many scenes include nonproducers. Members of a scene not directly involved in production, of course, engage with objects and their producers. They communicate and interact. Some of these members develop enough expertise to become intermediaries, such as professional critics paid to share their assessments and opinions with the public. At the same time, nonproducers have less access to and know less about the production activity than producers themselves. In scientific worlds, the output of academic knowledge looks fully interpretable; nonetheless, scientists find it illuminating to discuss their work with their peers. In seminars or project collaborations, they can share their discovery process and discuss ideas they explored but that did not work out or results that do not get reported in research publications.
More direct knowledge of the production process contributes to shared participation in an activity and agreement about what is or is not part of that activity. However, with a larger share of nonproducers, fewer scene members will have developed an understanding of the scope of the preproduction process, and fewer will therefore share all the relevant information. Cognitive processes are less aligned across the membership as a result, and extensional agreement will also be lower. In this situation, different understandings and perspectives can coexist, which we call heterogeneity in orientation (in notation, OHet).
Because we assume that producers also adopt the audience role, we implement the intuition described in the preceding paragraphs in terms of the proportion of a scene’s membership that occupies only the audience role:
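A direct way to express this proportion is sketched below, writing P* ⊆ M* for the members of the scene who are producers. The symbol P* is introduced only for this sketch; A-only is the name used in Postulate 8.

```latex
\text{A-only}(S) \;=\; \frac{\left|\{\, m \in M^{*} : m \notin P^{*} \,\}\right|}{\left| M^{*} \right|}
```

For instance, a scene with 20 members of whom 15 are producers would have A-only(S) = 5/20 = 0.25.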
Now we turn to the postulate that codifies our intuition that proliferation of nonproducer members in a scene lessens engagement by some producers in introspective and conceptual work and in interaction and communication that sustain common mental representations. In general, less engagement can undermine consensus among producers and reduce the chance of world emergence.
Postulate 8: The orientation heterogeneity of a scene increases with the proportion of the nonproducer audience: [A-only ↑ OHet].
There are several reasons to expect that greater heterogeneity in orientation will reduce agreement about what is and is not c (extensional agreement). First, such divergence will cause networks of producers to fragment, thereby reducing opportunities to discuss marginal objects and possibly reducing agreement. Divergence in orientations will also generally cause factions to pay attention to different (new) objects. Finally, nonproducer scene members might activate different sets of motivations for engaging with the scene (e.g., market success in a purely aesthetic activity), which can shape members’ perspectives on the focal activity and their patterns of collaboration.
Postulate 9: Extensional agreement about a focal concept in a scene declines with orientation heterogeneity of its scene members: [OHet ↓ EA].
Proposition 5: The taken-for-grantedness of a collective concept decreases with the proportion of the scene’s membership purely in the audience role: [A-only ↓ G].
Hypothesis 2: The hazard that a world emerges from a scene decreases with the proportion of the scene’s membership purely in the audience role: [Ω → A-only ↓ µ].
The Ecology of World Emergence
A scene that could spawn a world develops in the presence of other scenes from the same or different parent worlds. So, it makes sense to adopt an ecological perspective. Membership engagement in a scene follows the logic of resource allocation under constraint. Each individual member presumably has a limited budget of resources—cognitive effort, time, money, and so forth—to invest in participation in a world (McPherson 1983). An individual who allocates more resources to one scene has fewer resources to allocate to others. Therefore, scenes compete for engagement. Lena (2012) describes how the growth of grunge music stalled when performers began to devote their energies to other musical styles and streams within rock.
Homophily provides a basis for examining the ecology of the emergence of worlds. Individuals who are similar in sociodemographic characteristics, motivations, or roles attract each other and are more likely to form and maintain relationships, interact, and communicate effectively with each other (McPherson, Smith-Lovin, and Cook 2001). The ecological theory of voluntary affiliation asserts that associations compete for members in a property space defined by members’ sociodemographic characteristics (McPherson 1983). Individuals join groups based on sociodemographic similarity (McPherson et al. 2001). Competition between voluntary associations occurs when their membership niches overlap, that is, when groups seek to recruit the same kind of members. The more groups or organizations recruit in a neighborhood of the social space (the combined set of sociodemographic characteristics of individuals that influence social interaction), the more intense their competition is.
The combination of competition and homophily results in a distinctive dynamic of voluntary-group affiliations (Popielarz and McPherson 1995). Membership in an association located close in social space to many other associations has higher turnover rates. Furthermore, members whose sociodemographic characteristics are atypical in an organization leave groups at higher rates because they are weakly attracted to the focal association’s membership and are more likely to have connections with members of other nearby associations.
When applied to our issue, these arguments suggest that the likelihood of the emergence of a world from a scene decreases as the overlap of its membership with that of other scenes increases. Consider two scenes in a focal world, S = ⟨W, l, O*, M*⟩ and S′ = ⟨W, l′, O*′, M*′⟩. A useful measure of their overlap is the Jaccard similarity of their memberships:
ov(S, S′) = |M* ∩ M*′| / |M* ∪ M*′|.
The total overlap of a scene with all other scenes in the focal world is
OV(S) = Σ_{S′ ≠ S} ov(S, S′).
Postulate 10: Increasing total overlap of a scene lowers its average levels of engagement: [OV ↓ Eng].
The postulates that justify Postulate 6, Definition 6, and Postulate 10 imply the main result.
Proposition 6: Greater total overlap of a scene lowers the taken-for-grantedness of its focal concept: [OV ↓ G].
Hypothesis 3: The hazard that a world emerges from a scene decreases with the scene’s total overlap: [Ω → OV ↓ µ].
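The overlap measures used in this section can be illustrated with a minimal Python sketch. It assumes that each scene is represented only by its membership set and that total overlap is the sum of pairwise Jaccard similarities with the other scenes in the same parent world. The scene names, member names, and function names are hypothetical and serve only as illustration.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; set to 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def total_overlap(focal: str, memberships: Dict[str, Set[str]]) -> float:
    """Sum of Jaccard similarities between the focal scene and every other scene."""
    return sum(
        jaccard(memberships[focal], members)
        for scene, members in memberships.items()
        if scene != focal
    )


# Hypothetical scenes in one parent world, each mapped to its membership set.
scenes = {
    "scene_a": {"ann", "bo", "cy", "di"},
    "scene_b": {"cy", "di", "ed"},
    "scene_c": {"fay", "gus"},
}

overlap_a = total_overlap("scene_a", scenes)  # shares two of five members with scene_b
overlap_c = total_overlap("scene_c", scenes)  # shares no members with other scenes
```

Under Hypothesis 3, a scene such as the hypothetical `scene_a`, with higher total overlap, would face a lower hazard of world emergence than an isolated scene such as `scene_c`, other things being equal.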
Discussion
The world emergence process can reshape the cultural landscape in two ways. In one, the new collective concept is considered a subconcept of the parent world. For instance, the rise of the beer type called “hazy” (or “New England”) IPA results in a conceptual and material elaboration of beer. In the other case, the new collective concept combines semantic spaces of two or more parent concepts. This is a kind of hybridization. Examples include brewpub, computational linguistics, and romantic comedy.
In the hybrid case, our focus on one parent world and clusters and scenes in that world might appear myopic. With this focus, we can see the consequences of actions by agents who belong to that world, but we do not observe that they also belong to other worlds, as we discussed in modeling the ecology of worlds. This narrow approach gives a useful partial view of the complex emergence process. It has the advantage of avoiding the selectivity problem discussed in the beginning of this article. One could also avoid the problem by somehow limiting the set of higher-level worlds and analyzing clusters and scenes in all the combinations of those worlds. However, this approach would be very complex to operationalize.
The empirical purchase of the theory we present depends, at least partly, on the feasibility of designing measures of the key constructs in each stage of the emergence process. The analysis starts with clustering and labeling by a subset of a world’s membership. The “audience turn” in organizational analysis (Pólos, Hannan, and Carroll 2002) involved a recognition that a cognitive perspective depends on learning what the audience thinks and not what distinctions analysts make. One could elicit clusters from surveys of members, but a more feasible approach would focus on observable categorization (labeling).
In the kinds of worlds we have described, labeled objects can be physical or digital products or services in market settings; speeches, party platforms, social media posts, or public demonstrations in political settings; or images, three-dimensional objects, or sounds in artistic settings. Earlier analyses, such as Ruef (2000), traced the emergence of new organizational forms to the first mention of a new label in a search of histories of relevant forms or corpora of media publications. However, when a new label appears in these sources, it has already generally received widespread attention and survived a selection process. This research design needs to be revised by expanding the search for early collective use of labels, such as in art-exhibition catalogs or music compilations of the kind used in van Venrooij (2015). Online archives can be used to measure early label adoption, including for labels that reached only a limited local use.
For an analysis of the appearance of clusters that get labeled, the key is to measure aspects of the social structure of producers, especially network clustering. Here, researchers can rely on metrics such as the local average clustering coefficient (Watts and Strogatz 1998) for producers’ neighborhoods associated with new labels in a parent world, particularly its variants for two-mode networks (Latapy, Magnien, and Del Vecchio 2008), to measure ties to objects. An alternative approach could build on measures of niche overlap for producers associated with the new label, comparing their overlap with other producers from the parent world with and without the new label, similar to studies of technological innovation and competition in market settings (Podolny, Stuart, and Hannan 1996). These different approaches are not mutually exclusive and can be combined.
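As a rough illustration of the first metric, the following Python sketch computes the local clustering coefficient for each producer in a small undirected network and averages it over the producers associated with a new label. It covers only the one-mode coefficient, not the two-mode variants, and the network, the producer identifiers, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def local_clustering(graph: Dict[str, Set[str]], node: str) -> float:
    """Share of pairs of a node's neighbors that are themselves tied to each other."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # the coefficient is conventionally set to 0 with fewer than two neighbors
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


def average_clustering(graph: Dict[str, Set[str]], nodes: Iterable[str]) -> float:
    """Mean local clustering coefficient over a chosen subset of nodes."""
    nodes = list(nodes)
    return sum(local_clustering(graph, n) for n in nodes) / len(nodes)


# Hypothetical undirected producer network stored as symmetric adjacency sets.
producers = {
    "p1": {"p2", "p3", "p4"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2"},
    "p4": {"p1"},
}

# Hypothetical subset of producers associated with the new label.
labeled = ["p1", "p2", "p3"]
labeled_clustering = average_clustering(producers, labeled)
```

Comparing this average with the corresponding value for producers in the parent world who do not use the label would give one simple indicator of whether the labeled cluster sits in an unusually dense neighborhood.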
Analyses of the second stage of world emergence require measures of time-varying average levels of engagement in a scene. Events such as technology trade shows or art shows can be a source of data on engagement. Fairs are gatherings involving creators or companies representing them who display their goods to potential buyers as well as critics, the media, and competitors. The producers inform participants of their respective interpretations of their goods and the social situations in which they become involved for the sake of this trade (Lampel and Meyer 2008). Data on participation in such events can be collected through different empirical strategies, from surveys to ethnographic participation and quantitative databases.
Observing engagement in a scene at the member level means measuring instances of verbal communication about or physical participation with the objects in terms of frequency, intensity, variability, and so on. One example of a study measuring multiple dimensions of communication in scene-like spaces is Budak and Watts’s (2015) computational analysis of the Occupy movement in Turkey. Although Budak and Watts did not focus on concepts, their empirical strategy offers a useful operationalization of forms of engagement in a scene-like setting. They collected Twitter posts and users in Turkey who used any of the hashtags associated with the Occupy movement during the period in which protest events took place in Gezi Park in Istanbul. Participation in protest events was measured using Foursquare check-ins, which allowed users to share their locations and create a record of their experiences. Each measurement strategy can be used individually or conjointly to measure engagement with a collective concept and heterogeneity in some dimensions of social characteristics, such as political beliefs.
Finally, we come to perhaps the most important empirical challenge, measuring taken-for-grantedness. As noted earlier, agents cannot share their cognitive processes, so the sociological question of conceptual agreement and the analysis of the hazard of world emergence become more tractable by relying on extensions. Following the idea that the emergence process represents what members of a scene think and do through reference to objects, researchers could conduct surveys and interviews that ask members about their own use of labels, their beliefs, and the share of other members they think use the same labels to identify the shared activity.
Insofar as the research design collects data on how scene members categorize objects, observed categorizations allow direct measurement of extensional consensus. Hsu (2006) was the first study to follow this approach. Hsu’s analysis of audience appeal for feature films used archival media records of genre categorizations of films by several highly visible professional critics. Hsu calculated consensus about a film’s genre classification as a similarity measure of genre label assignments for pairs of critics and then averaged over pairs. Empirical analysis of our model’s propositions can adapt this operationalization of consensus to collective concepts. If archival materials provide such categorization matrices for multiple time points, then research can trace the evolution of consensus. In the best case, such a record would cover the early history of a scene.
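A consensus measure of this kind can be sketched in Python as follows. The sketch assumes Jaccard similarity as the pairwise similarity of label assignments, which is an illustrative choice rather than a restatement of the exact measure in Hsu (2006). The critic identifiers, the genre labels, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def label_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two label sets; two empty sets count as full agreement."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def object_consensus(assignments: Dict[str, Set[str]]) -> float:
    """Mean pairwise label similarity across all pairs of categorizers for one object."""
    pairs = list(combinations(assignments.values(), 2))
    if not pairs:
        raise ValueError("consensus requires at least two categorizers")
    return sum(label_similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical genre assignments by three critics for a single film.
film_labels = {
    "critic_1": {"comedy", "romance"},
    "critic_2": {"comedy"},
    "critic_3": {"comedy", "romance"},
}

consensus = object_consensus(film_labels)
```

Applying the same function to categorization records from successive time points would yield a consensus series for each object, which could then be averaged over the objects carrying the focal label.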
Beyond consensus at a single point in time, observing the diffusion of categorizations over time allows measurement of the dynamics of taken-for-grantedness of a focal concept. Media sources can be probed to identify label usage in the public domain. For example, the Discogs database contains information about audio recordings (commercial releases as well as promotional, bootleg, and off-label recordings) and uses stylistic tags generated by users to catalog them. Measuring the historical trajectory of new tags across users over time can provide a proxy for the taken-for-grantedness of emerging music genres.
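One simple way to summarize such a trajectory is the cumulative share of users who have applied a tag at least once by a given year. The Python sketch below illustrates this under stated assumptions: the input is a hypothetical list of (year, user) tagging events for one tag, the size of the user population is known, and no actual Discogs data or interface is used.

```python
from typing import Dict, Iterable, Tuple


def adoption_curve(events: Iterable[Tuple[int, str]], total_users: int) -> Dict[int, float]:
    """Cumulative share of users who have used the tag at least once, by year of first use."""
    first_use: Dict[str, int] = {}
    for year, user in events:
        # Keep only each user's earliest recorded use of the tag.
        if user not in first_use or year < first_use[user]:
            first_use[user] = year

    curve: Dict[int, float] = {}
    adopters = 0
    # Only years in which at least one new user adopts the tag appear in the curve.
    for year in sorted(set(first_use.values())):
        adopters += sum(1 for y in first_use.values() if y == year)
        curve[year] = adopters / total_users
    return curve


# Hypothetical tagging events for one emerging style tag: (year, user).
tag_events = [
    (2001, "u1"),
    (2001, "u2"),
    (2002, "u1"),  # repeat use by an existing adopter does not add a new adopter
    (2003, "u3"),
    (2003, "u4"),
]

curve = adoption_curve(tag_events, total_users=10)
```

A curve that rises quickly and then levels off at a high share would be consistent with a tag becoming taken for granted, although any threshold for that judgment would have to be set by the analyst.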
Future empirical research can discover whether and how each function—measuring clusters of distinctive objects, extensional agreement, and engagement in a scene—is specific to a content domain or occurs across multiple domains. For example, physical participation in collective events might matter less for the vitality of scenes because the formation of such spaces occurs in many corners of the digital world. At the same time, sustained face-to-face interactions might be more significant for the diffusion of taken-for-grantedness in content domains like the arts, where conceptual work is often driven by intuition and must be unpacked through complex communication possible only when agents are physically co-located. In contrast, experimentation with new engineering solutions is based on scientific knowledge that is more clearly codified and can be sustained more effectively through virtual interactions.
Acknowledgements
We thank the anonymous reviewers for guidance. We received helpful comments from Greta Hsu, Gael Le Mens, Helena Miton, Susan Olzak, Michele Piazzai, Elizabeth Pontikes, and Andy Walder.
