Abstract
This essay develops an integrated account of aesthetic experience by bringing neuroscience into dialogue with psychoanalysis. It critiques disembodied, oculocentric models of visual perception, proposing instead that aesthetic engagement is mediated by embodied simulation—a neurofunctional mechanism enabling viewers to reenact observed gestures, affects, and movements. This simulation activates a prereflective, affective unconscious rooted in bodily memory and relational experience. Drawing on Winnicott’s concept of transitional phenomena and Kris’s notion of regression in the service of the ego, the author frames the aesthetic image as a transitional object that facilitates affective modulation and subjective reorganization. Aesthetic experience emerges not as symbolic interpretation but as a temporally structured act of play, attunement, and transformation. By articulating the convergences between neuroscience and psychoanalysis, the essay offers a novel model for understanding how images engage our bodies, shape our unconscious, and participate in the ongoing formation of subjectivity.
Introduction: Rethinking Aesthetics Beyond the Visual Brain
The experience of artistic images cannot be adequately understood within the confines of a purely visualist paradigm. The long-standing dominance of what can be termed oculocentrism in aesthetic theory and empirical aesthetics has privileged the eye as the primary organ of aesthetic access, while marginalizing the bodily and affective dimensions of perception. This reductive approach, still prevalent in much of cognitive neuroscience, treats the visual aesthetic object as a stimulus to be decoded by specialized areas of the visual cortex, thereby severing the encounter with the image from the lived body of the beholder (Kawabata & Zeki, 2004; Zeki, 1999; see also Kandel, 2016). In contrast to this representational model, the framework of embodied simulation, developed over the past two decades, offers a radical reconfiguration of aesthetic experience as inherently multimodal, affectively charged, and motorically grounded (Freedberg & Gallese, 2007; Gallese, 2009, 2017, 2018, 2024).
From this perspective, perceiving an image is never a passive registration of its features, but an active bodily engagement with its affordances, gestures, and expressive qualities. The image is not a “picture” observed from a distance, but a site of potential bodily resonance.
This resonance is made possible by the functional architecture of the brain-body, in which sensorimotor, affective, and cognitive domains are tightly interwoven. Numerous neuroscientific studies—including those investigating the mirror neuron system and its human homologues—have shown that the observation of actions, facial expressions, and even static depictions of movement activates motor and emotion-related areas of the brain in the observer (Gallese, 2014; Gallese, Eagle, & Migone, 2007; Keysers et al., 2004; Wicker et al., 2003). These activations are not epiphenomenal: They constitute the prereflective scaffolding of our capacity to understand and emotionally relate to the world, including the domain of artistic images.
The hypothesis of embodied simulation proposes that the aesthetic power of images arises, at least in part, from their capacity to activate these prereflective, bodily based mechanisms of understanding. Such mechanisms allow the viewer to internally reenact, simulate, and thus “feel into” the gesture, movement, or affect depicted in the image. Crucially, this process is not restricted to the domain of action perception. It extends to our engagement with images that depict emotional states, bodily sensations, or even abstract forms that suggest a certain kinetic or affective potential. In all these cases, aesthetic experience emerges as a form of mediated intersubjectivity, that is, a virtual, yet bodily grounded as-if relation between the subjectivity of the beholder and that of the artist, encoded or evoked by the images’ contents and/or by the artistic gestures that created the image. Here, intersubjectivity is not used in the strict psychoanalytic sense of a reciprocal meeting of two minds (Benjamin, 1990), but as an as-if relationality. The image is literally what stands between beholders and the artist, mediating their bodily relation (Freedberg & Gallese, 2007; Gallese, 2017). By describing aesthetic experience as a form of “mediated intersubjectivity,” I do not suggest that the artwork itself possesses subjectivity (but see Belting, 2011, and Bredekamp, 2017, for an attribution of agency to images). Rather, I emphasize that the image functions as a mediating presence that recruits embodied dispositions—such as motor resonance, affective attunement, and implicit relational knowing—that ordinarily support face-to-face social interactions. In this sense, the relation between beholder and image is not interpersonal in a literal sense, but intersubjective by analogy: It activates the bodily and affective architecture through which intersubjectivity is enacted. 
Intersubjectivity here designates a mode of embodied relation grounded in mechanisms that evolved for social interaction, but which can be redirected toward cultural artifacts. Calling it “mediated” highlights that the relation is triangular: self–image–world, where the image acts as a mediating device that reactivates intersubjective dispositions while simultaneously reshaping them within symbolic and cultural frames.
This view entails a significant theoretical shift. It requires us to reconsider not only the mechanisms of aesthetic perception, but the very nature of subjectivity and its relation to the image. It suggests that aesthetic experience is not primarily about symbolic meaning or interpretative decoding, but about a specific mode of relational being-in-the-world—an encounter that mobilizes our sensorimotor system, our affective attunement, and our embodied memory. This opens the door to a richer account of aesthetic experience, one that resonates with certain strands of philosophical aesthetics (Dewey, 1934; Merleau-Ponty, 1964) and psychoanalytic thought, while remaining firmly grounded in empirical neuroscience.
In this essay, I aim to extend this embodied approach by reintroducing the notion of the unconscious—but in a way that departs from the classical Freudian model of repression and symbolic substitution. The unconscious I refer to here is not a repository of repressed desires or infantile fantasies, but a domain of bodily based, nondeclarative processes that structure our engagement with the world in a prereflective and affectively modulated manner. It is, in this sense, a sensorimotor and affective unconscious—a set of embodied dispositions and affective schemata that shape perception, emotion, and action below the threshold of conscious awareness (Bargh & Morsella, 2008; Damasio, 1999; LeDoux, 1996). By affective unconscious I mean the domain of prereflective, nondeclarative processes—sensorimotor patterns, vitality affects, and implicit relational schemata—that shape experience without entering awareness. This usage draws on Freud’s (1915/1957) descriptive unconscious, Bucci’s (1997) subsymbolic systems, Stern’s (2010) vitality affects, and Lyons-Ruth’s (1998) implicit relational knowing. The affective unconscious thus refers to the bodily and relational ground of subjectivity that remains implicit yet active, continuously informing perception, affect, and aesthetic response.
Such a reconceptualization of the unconscious resonates with recent developments in cognitive neuroscience and affect theory, but it also finds important precedents in psychoanalytic authors who emphasized the role of embodied processes and creative play in the formation of subjectivity. As the essay unfolds, I will draw on the work of Donald Winnicott and Ernst Kris—two figures who, in different ways, theorized the aesthetic experience as a site of transitional, transformative engagement with the world. Their insights offer a fruitful complement to the neuroscience of embodied simulation, providing a vocabulary for describing the subjective, affective, and developmental dimensions of image perception that are not captured by brain activation maps alone (Kris, 1952; Winnicott, 1971/1989).
This rearticulation of aesthetics in embodied and psycho-biological terms has important implications. It allows us to move beyond the dichotomy between cognition and affect, or between perception and imagination, by showing how these dimensions are always already entangled in our bodily engagement with images. It reframes the aesthetic encounter not as a contemplative act of disinterested judgment, but as a dynamic interaction between organism and artifact, shaped by motor intentionality, affective resonance, and the sedimentation of prior experiences. And it opens the possibility of theorizing images not merely as representational objects, but as affective dispositives—structures that solicit, modulate, and transform the viewer’s embodied experience (Manning, 2012; Marks, 2002).
The aim of this essay, then, is twofold. First, to elaborate a theoretical framework that integrates embodied neuroscience with a nonrepressive, affectively grounded notion of the unconscious. Second, to apply this framework to the analysis of aesthetic experience, understood as a temporally structured, affectively resonant, and sensorimotorically enacted form of mediated intersubjectivity. The focus will be on manmade images—paintings, photographs, and digital artifacts—but the underlying principles potentially extend to other modalities of artistic and symbolic production.
By combining empirical findings on mirror mechanisms, affective resonance, and aesthetic judgment with psychoanalytic insights into creativity, transitional phenomena, and regression in the service of the ego, this essay aims to develop a novel account of how we engage with images, and what is at stake in this engagement. In doing so, it hopes to contribute to a broader rethinking of aesthetics as a science not only of beauty or form, but of the embodied processes through which we encounter and are transformed by the world.
The Image as Encounter: Temporality and Affective Resonance
To understand the aesthetic experience of images, it is not sufficient to describe what is depicted or to account for how the visual brain decodes pictorial stimuli. Rather, we must explore what happens in the encounter between the image and the beholder—an encounter that unfolds as a dynamic process of resonance, projection, and transformation. This encounter is not abstract or disembodied; it is affectively and temporally structured. The image appears as something more than an object of contemplation: It is a site of potential activation, drawing the viewer into a relational space that mobilizes memories, affects, and motor intentionality.
When we look at a painted figure in motion, for example, we do not merely register its position on the canvas. We implicitly recognize the gesture as part of a movement that has a past and anticipates a future. This dynamic temporal structure is central to what I have elsewhere described as the narrative temporality of images. The painted instant is not a frozen slice of time, but a condensate of a temporal trajectory. Our sensorimotor system spontaneously projects forward and backward: The raised arm is understood as having risen from a previous position and as likely to descend or extend further. This projection is not a conscious inference but an embodied simulation—a prereflective, motorically grounded enactment of the movement encoded in the image.
The temporality at stake here is not clock time but lived time, phenomenologically and affectively modulated. The aesthetic experience of an image thus entails a temporal unfolding within the beholder—a temporal activation that is bodily in nature. This dimension has often been overlooked in both philosophical aesthetics and neuroscientific accounts. Yet it is crucial if we are to grasp the force of images, their capacity to arrest and transform perception.
There is a striking parallel between this temporal activation and the psychoanalytic notion of Nachträglichkeit, or deferred action (après-coup in French). First introduced by Freud (1895/1960, 1914/1958) to describe the retroactive structuring of psychic meaning through temporal delay and resignification, the concept was profoundly reelaborated by Jean Laplanche (1999), who elevated Nachträglichkeit to a fundamental dynamic of the unconscious—highlighting how meaning emerges not at the moment of experience, but through its translation and reactivation in a later context, often in relation to enigmatic messages from the other.
In the context of aesthetic experience, we can interpret Nachträglichkeit as the process by which an image reactivates previous embodied schemata—motor, affective, mnemonic—imbued with a new salience. The image does not simply present something new; it reconfigures what was already latent in the subject’s body. This process, while not repressive in the Freudian sense, belongs to a broader conception of the unconscious as temporal, embodied, and relational.
The image, under specific perceptual and affective conditions, can function as what ethology terms an Auslöser, or releaser: a stimulus that evokes a patterned, embodied response without requiring conscious mediation. First developed in the work of Konrad Lorenz (1937, 1964), the concept of Auslöser refers to perceptual cues that trigger innate behaviors in animals. I employ this concept heuristically, not literally: Unlike Lorenz’s fixed-action patterns in animals, the responses to artworks are not innate reflexes. Here, Auslöser does not denote instinctive behavior but refers to the capacity of the image to elicit embodied simulation through its formal, gestural, or affective affordances. In this sense, the image functions as an affordance (Gibson, 1979), selectively eliciting sensorimotor and affective resonance. It operates less as representation than as an affective device, mobilizing what could be called an embodied mneme of relational life.
This understanding converges with Aby Warburg’s (2010) Pathosformel—historically recurring visual motifs charged with affective intensity—which likewise operate across temporal and cultural boundaries as affective triggers. The image, as Auslöser, reanimates the viewer’s affective unconscious: not through symbolization, but through resonance. This shift—toward understanding the image as a neuroaffective releaser—marks a development in the embodied aesthetics proposed here.
In this view, the image triggers affective and sensorimotor patterns that were inscribed in the subject’s history. These patterns may be autobiographical, but they nevertheless belong to the shared architecture of embodied intersubjectivity, the neural and bodily substrate that enables the projection and reenactment of affective intentionality. The neuroscientific literature provides compelling evidence that the observation of emotionally salient images activates cortical and subcortical structures implicated in affective resonance, such as the anterior insula, the anterior cingulate cortex, and the premotor and somatosensory cortices (Bastiaansen, Thioux, & Keysers, 2009; Bonini et al., 2022; Wicker et al., 2003). These activations are not merely correlative; they instantiate a bodily reenactment of the affective quality perceived in the image. In this sense, the aesthetic encounter becomes a reencounter with the other, through the body’s own capacity for response.
But images do not only trigger memories or the simulation of movements. They also produce a specific mode of attention and presence. Hans Ulrich Gumbrecht (2004) has argued that aesthetic experience is characterized by the “production of presence”—a form of intensity that displaces everyday instrumental perception and makes us momentarily inhabit a different experiential register. While Gumbrecht does not develop a neurobiological account of this phenomenon, his phenomenology of aesthetic presence resonates with the notion of embodied simulation. The sense of presence is not a purely cognitive effect; it emerges from the activation of bodily schemata that underlie our capacity to feel the image as here, now, and meaningful to us—even if that meaning resists explicit articulation.
This affective immediacy is also temporal in structure. The image captures a moment, but that moment is experienced as thick with potential: It evokes a before and an after, a tension between what has happened and what might happen. This is especially evident in depictions of movement, such as Caravaggio’s Judith Beheading Holofernes, where the gesture is frozen at its most charged point—halfway through the act of decapitation (Figure 1). The viewer is caught in the suspense of the gesture, which simultaneously suggests a past action (the lifting of the sword) and a future consequence (the fall of Holofernes’s head). The image thus generates what I would call a sensorimotor anticipation, a forward-directed simulation of the kinetic and emotional consequences embedded in the scene.

Figure 1. Judith Beheading Holofernes by Michelangelo Merisi da Caravaggio (ca. 1598–1599). Oil on canvas, 145 × 195 cm. Galleria Nazionale d’Arte Antica, Palazzo Barberini, Rome.
This anticipatory structure engages not only motor systems but affective ones as well. The image does not merely show pain or ecstasy—it enacts them in the viewer through affective mirroring.
This sensorimotor anticipation is psychodynamically charged: It may mobilize, for example, unconscious fears of castration, fragmentation, and the collapse of symbolic authority. These are, of course, only possible associations; in different beholders the same image might evoke different memories, feelings, and personal associations. The suspended gesture, midway between action and consequence, becomes a temporal fold where the viewer encounters both fascination and horror, echoing Julia Kristeva’s (1982) notion that the abject both repels and attracts, marking the edge of the subject’s symbolic coherence.
This movement, however, is not directed solely outward toward the depicted object or scene. It simultaneously turns us inward, mobilizing a reflective process that is not primarily cognitive, but affective and temporal in nature. The aesthetic encounter thus acts as a dynamic mirror: not reflecting stable knowledge, but modulating our bodily states, shifting our affective tonality, and reorganizing our temporal orientation. Engaging with the image transforms our internal landscape—it alters our bodily affect, redirects our attention, and reconfigures our sense of time as we oscillate between memory and anticipation. In this way, the image operates as a transitional object (Winnicott, 1971/1989): a mediating artifact that sustains the negotiation between inner and outer reality, between past experiences and future possibilities, and ultimately between self and other.
The image, then, is not a passive object but an active participant in a relational process. It stages an encounter whose temporality is that of the body: rhythmic, affective, anticipatory. To look at an image is to be taken into a trajectory—to inhabit a temporal arc that is not given but enacted in the act of perception. This enactment draws on prereflective bodily knowledge, accumulated through past interactions with the world, others, and cultural artifacts. The image resonates because it finds a place within this sensorimotor-affective history, and because it reorganizes it in the present.
Such a view of aesthetic experience challenges the traditional boundaries between perception, imagination, and memory. It also invites a reconsideration of the unconscious—not only as a repository of symbolic content, but as the embodied archive of past engagements that continue to shape perception in the present. This unconscious is not hidden, but enacted; not repressed, but modulated; not static, but dynamic and temporally inflected.
In sum, aesthetic experience is best understood as a temporally structured encounter with an image that resonates within the embodied subject. This resonance is affective, motoric, and historical. It involves a prereflective simulation of the depicted gesture or affect, a reactivation of prior embodied schemata, and an anticipatory structuring of the perceptual present. The image becomes a catalyst for transformation—a site where perception, memory, and motor intentionality converge in an embodied act of meaning-making.
This account, however, remains too general. The complex interplay between narrative temporality and affective presence can be further illuminated by a brief detour into the visual logic of a single image, Giotto’s San Francesco dona il mantello a un povero, which offers a vivid example of how framing and proximity modulate the embodied experience of an artwork.
Interlude: Two Views of Giotto—Narrative Depth and Affective Presence
Giotto’s San Francesco dona il mantello a un povero (Saint Francis Giving His Cloak to a Poor Man, 1296–1299, Upper Basilica of San Francesco, Assisi) offers a revealing experiment in the modulation of aesthetic experience through framing and proximity (Figure 2). Observing the fresco in its full compositional context, the viewer is drawn into a spatially coherent narrative. The architectural backdrops, the sloping terrain, and the disposition of the figures construct a perspectival stage on which the act of charity unfolds. This mode of viewing engages embodied simulation in a temporal and relational register: We project movement, anticipate gesture, and empathetically enter the action. The aesthetic experience here is one of narrative resonance—an image that simulates time through its structure.

Figure 2. Saint Francis Giving His Cloak to a Poor Man by Giotto di Bondone (ca. 1296–1299). Fresco, Upper Basilica of San Francesco, Assisi. (A) Full image. (B) Detail.
In contrast, when we isolate the detail by coming closer to the fresco or zooming in on the image, focusing on the moment of exchange, the gaze, and the drapery, the fresco is transformed. Spatial depth collapses into surface; the naturalistic background gives way to a gold-toned abstraction that evokes the timeless aura of icon painting. The affective tone shifts: from empathetic witnessing to ecstatic suspended presence. This is no longer a story observed, but a moment inhabited. Sensorimotor simulation becomes focused on the haptic and the intercorporeal: the subtle inflections of hand, fabric, and gaze. What we encounter here is not the temporally extended action experienced when beholding the fresco in its entirety, but a suspended relational field: The fresco’s detail becomes a transitional object, a surface that holds and intensifies an affective moment.
This oscillation—between narrative temporality and suspended embodied presence—illustrates with clarity the multilayered nature of aesthetic experience. It invites us to consider how formal and perceptual framing recalibrates the simulation enacted by the viewer. The image does not change; what changes is our being-with the image. And in that change lies a model for how the same aesthetic object can mobilize distinct registers of the affective unconscious: from movement to stillness, from projection to immersion, from narrative empathy to a focused, suspended form of embodiment that hovers at the threshold of action and presence.
Embodied Simulation and the Affective Unconscious
The embodied approach to aesthetic experience hinges on a simple yet profound proposition: that perceiving an image is not a disembodied act of interpretation, but—first of all—a situated, affectively modulated event grounded in the sensorimotor and affective capacities of the body (Gallese & Guerra, 2019). Embodied simulation provides the neurofunctional mechanism through which this resonance occurs. Originally developed to account for action understanding, the notion of simulation has since been extended to encompass a broader array of perceptual, emotional, and cognitive phenomena. At its core lies the idea that the brain-body does not merely process inputs, but actively models the bodily states associated with them—reenacting, within the observer, aspects of what is seen, imagined, or felt.
This reenactment is prereflective and automatic. When we observe a painting that depicts pain, our brain activates regions associated with our own experience of pain—not metaphorically, but through actual patterns of neural activity. Functional imaging studies show that viewing facial expressions of pain, whether rendered artistically or captured in naturalistic photography, recruits the anterior insula, the anterior cingulate cortex, and primary somatosensory areas (Ardizzi et al., 2021). These activations reflect a bodily resonance with the depicted state, which grounds the possibility of empathy. Importantly, the strength of this neural resonance is not uniform: It is modulated by individual traits (such as empathic disposition), context (art vs. documentary), and the intensity of the aesthetic judgment. Aesthetic experience is not merely about “seeing” an emotion, but about feeling into its embodied texture.
What emerges here is a conception of the unconscious that differs from its classical psychoanalytic definition. This is not the unconscious of repression, symbolic condensation, or dreamwork. Rather, it is a procedural and affective unconscious—comprising sensorimotor dispositions, affective tendencies, and embodied schemata that structure our relation to the world prior to and independently of verbal cognition.
This view resonates with contemporary understandings of implicit memory and nondeclarative processing (Damasio, 1999; Schacter, 1987), but also with psychoanalytic perspectives that emphasize the body as the locus of early relational experience (Stern, 2004; Winnicott, 1971/1989). These bodily traces, sedimented through lived experience, remain active below the threshold of conscious awareness and are reactivated in aesthetic encounters.
Embodied simulation thus offers a bridge between neuroscience and a prereflective and nonpropositional account of the unconscious. It allows us to understand how aesthetic experience can be shaped by prior relational histories. For example, the spontaneous activation of facial mimicry when observing expressive images is not governed by a deliberate act of empathy, but by an automatic, embodied resonance. Such activation can be attenuated or amplified depending on the viewer’s affective memory—an internal archive of past experiences with similar gestures, expressions, or moods. This memory is not stored as a linguistic proposition, but as a set of bodily dispositions, ready to be reenacted when the right perceptual affordance—the right Auslöser—is encountered.
This process bears important implications for our understanding of aesthetic pleasure and aesthetic transformation. The act of viewing an artwork is not passive reception but a bodily event, in which the viewer’s sensorimotor and affective systems are mobilized. This mobilization is often subtle and not consciously noticed, yet it can lead to significant shifts in mood, attention, and relational orientation. These shifts are the result of a reorganization of the subject’s embodied state—a transformation that is best described in terms of attunement rather than interpretation. This aligns with Daniel Stern’s (2010) notion of vitality affects—dynamic patterns of feeling that are not reducible to specific emotions but modulate the tone and rhythm of experience.
From this perspective, aesthetic experience becomes a privileged site for the emergence and reconfiguration of the affective unconscious. Images do not simply express emotions—they solicit, amplify, and transform affective states in the viewer. They operate not by transmitting explicit messages, but by modulating the relational field. This is particularly evident in the response to abstract art, where the absence of narrative or representational content makes the viewer’s own sensorimotor history the primary locus of meaning. The embodied simulation of brushstrokes, lines, or rhythmic arrangements of form and color engages the viewer’s proprioception and kinesthetic memory. As proposed by Freedberg and Gallese (2007), even static visual stimuli can activate motor areas of the brain, suggesting a covert bodily engagement with the gesture implied in the mark.
This activation is not uniform but selectively tuned to the expressive qualities of the image. Just as we can sense the tension or fluidity of a dancer’s movement, we can feel the force or delicacy of a painter’s stroke. These qualities are not properties of the image alone; they are relational effects, coproduced by the viewer’s embodied simulation of the depicted or implied artist’s gesture. In this sense, the affective unconscious is not private or solipsistic—it is intercorporeal, shaped by our capacity to resonate with the bodies of others, even when these bodies are virtual or symbolic.
This intercorporeality is not restricted to human figures. We can also resonate with objects, landscapes, and abstract configurations when they are perceived as expressive (Böhme, 2018; Griffero, 2016; Lingiardi, 2025; Morelli, 2011). The neural reuse of sensorimotor circuits allows us to project affective intentionality onto forms that lack agency, yet evoke action or emotion through their shape, texture, or composition. This capacity underpins the metaphorical structure of aesthetic experience: A sloping line can feel “melancholic,” a jagged contour “violent,” a smooth curve “sensual.” These are not arbitrary associations, but the result of embodied mappings grounded in our history of action and perception (Lakoff & Johnson, 1999).
By activating and modulating the affective unconscious, aesthetic experience can also acquire a therapeutic or transformative function. It can reorganize the subject’s embodied schemas, reconfigure habitual affective patterns, and open new modes of relational being. This idea finds resonance in Ernst Kris’s (1952) seminal notion of “regression in the service of the ego.” In artistic experience, he argued, the mind can temporarily suspend reality-bound constraints and enter a state of controlled regression (i.e., from secondary processes to primary processes, in Freudian terminology)—reviving earlier, more plastic modes of experiencing without disintegration. This regression, I propose, is not only symbolic or imaginative, but also sensorimotor: a loosening of fixed bodily patterns and a reactivation of latent affective possibilities.
I submit that what facilitates this regression in the service of the ego is precisely its embodied framing. The simulated movement or affect is not acted out impulsively, but experienced within a safe, distanced context. The viewer does not become the suffering or ecstatic figure in the painting but resonates with it at a safe remove. This dynamic—simultaneous identification and distinction—is central to the aesthetic modulation of the affective unconscious: It has been framed as “liberated embodied simulation” (Gallese, 2017, 2018; Gallese & Guerra, 2019; Wojciehowski & Gallese, 2011). It allows the viewer to experiment with affective positions, to inhabit forms of movement or stillness that may not be accessible in ordinary life, and to return from the encounter with a reorganized experiential horizon.
In this light, we can view embodied simulation not merely as a mechanism of empathy or perception, but as a modality of affective reorganization (Loewald, 1960). Aesthetic images do not speak to us only through propositions or symbols; they also act upon us, shaping our bodily and affective states in ways that remain, for the most part, outside the domain of conscious control. This action is not irrational or chaotic; it follows the logic of the body—of resonance, rhythm, and attunement. To theorize the unconscious in aesthetic experience, then, is to recognize the primacy of this embodied logic, and to understand how it can be mobilized in creation, transformation, and aesthetic enjoyment.
Transitional Phenomena and the Creative Illusion (Winnicott)
If embodied simulation provides a functional description of how aesthetic images engage the sensorimotor and affective systems of the viewer, psychoanalysis—particularly in the work of Donald Winnicott—offers a conceptual framework for understanding the subjective experience of this engagement. Winnicott’s theory of transitional phenomena and the potential space they inhabit is especially fertile for thinking about art and aesthetic experience. These concepts allow us to describe the encounter with the image not only in neurofunctional or phenomenological terms, but as a mode of being-with—a relational and imaginative act grounded in the body, yet oriented toward transformation.
In his seminal book Playing and Reality, Winnicott (1971/1989) argues that the formation of the self depends on the ability to negotiate the tension between inner and outer reality. This tension is not resolved through repression or rationalization but is mediated through transitional objects and phenomena (soft toys, sounds, gestures, or activities) that serve as bridges between the subjective world of fantasy and the shared world of reality. The space in which these transitional phenomena occur, the potential space, is a zone of experience that is neither entirely inside nor outside the subject but co-constituted by both. It is in this space that creativity arises, and it is here that Winnicott locates the origin of cultural experience.
Aesthetic images, I argue, can be understood as adult forms of transitional phenomena. They do not simply depict reality or express inner states; they create a shared space in which the viewer can play with perception, feeling, and meaning. This play is not frivolous; it is a serious, embodied activity that allows for the modulation of affective states and the rehearsal of alternative relational configurations. The “illusion” offered by the artwork is not deceptive, but creative. It enables a kind of make-believe that is experientially real and transformative precisely because it is held: neither fully believed nor dismissed, but lived through as if.
This aesthetic illusion is fundamentally embodied. It is not confined to the cognitive elaboration of symbolic content but enacted through sensorimotor resonance and affective attunement. The viewer simulates the gestures, affects, or dynamic forms embedded in the image, and in doing so, enters a transitional relation with it. The image, like the infant’s transitional object, is not questioned—its ontological status is bracketed in favor of experiential engagement. The viewer does not ask whether the image is real or fictional; what matters is the quality of the encounter it affords. This echoes Winnicott’s claim that the transitional object “is not an internal object (which is a mental concept)—it is a possession. Yet it is not (for the infant) an external object either” (Winnicott, 1971/1989, p. 9).
The potential space of the aesthetic encounter is also temporally structured. As discussed earlier, images evoke a lived temporality, a tension between past, present, and future that is enacted within the viewer’s sensorimotor system. In Winnicott’s terms, this corresponds to the experience of playing: a temporally extended act in which the child rehearses scenarios, tests affect, and explores relational possibilities without being overwhelmed by them. The adult viewer of art engages in a structurally analogous process. The image becomes a space of exploration, a container for affective experimentation in which meaning emerges not through interpretation alone, but through movement, attunement, and embodied play.
This insight has significant implications for understanding the aesthetic function of form. Consider, for example, the expressive line in drawing or the dynamic gesture of a brushstroke. These are not merely stylistic features; they are invitations to simulation. The viewer’s eye traces the curve, enacts the gesture, and through this micromovement enters into relation with the artist’s embodied act. This process recalls Winnicott’s emphasis on the continuity between playing and cultural experience. Just as the child uses the object to symbolize the presence of the mother, the viewer uses the image to sustain a relation—virtual, yet affectively real—with the embodied intentionality of the artist.
Importantly, this encounter is not regressive in a pathological sense. Following Ernst Kris, Winnicott distinguishes between pathological regression, which entails a loss of ego functions, and regression in the service of the ego, which enables creative reorganization. The transitional space of aesthetic experience supports this second type of regression. It permits the loosening of habitual patterns of perception and feeling, making room for novelty, ambiguity, and the capacity to tolerate it. Viewers are not required to resolve the tension between form and content, gesture and figure, or beauty and pain; rather, they are invited to dwell in it, sustaining a complex, affectively rich engagement without premature closure.
From a neurocognitive perspective, this dwelling is underpinned by the same sensorimotor systems that support action and social cognition. Embodied simulation allows for a decoupled yet vivid experience of gesture, posture, and expression. In the context of transitional phenomena, this simulation takes on a dual character: It is both a form of recognition (of familiar bodily intentionalities) and a site of play (where those intentionalities can be reconfigured). The image, in this sense, becomes a kind of embodied hypothesis—a space where new affective combinations can be tested, rehearsed, and integrated.
Winnicott’s insistence on the holding environment—the reliable yet flexible relational field provided by the caregiver—finds a conceptual echo in the material and spatial features of the image. The frame of the painting, the edge of the screen, or even the stylistic coherence of an abstract composition provides a boundary that supports the viewer’s exploration. This boundary is containing: It allows for affective experimentation without the threat of disintegration. It is what makes the experience of powerful or painful images tolerable, even meaningful. The analogy between the pictorial frame and Winnicott’s holding environment is not meant to collapse their different contexts, but to underline a shared structural function. The frame, of course, does not “care” for the painting as the caregiver cares for the infant, but it establishes a boundary that makes possible a space of play and transformation. In Winnicott’s sense, holding provides a secure containment in which the child can experiment with impulses and affects without the risk of disintegration. Likewise, the frame and formal boundaries of an artwork create a delimited space where regression and affective experimentation can unfold without overwhelming the beholder. The emphasis is therefore on the formal and structural analogy: Both frame and holding environment sustain an intermediate area of experience by containing intensity while permitting openness and play. As such, the aesthetic field shares with the therapeutic setting the function of holding: a space of protected encounter in which affective truth can emerge.
This also suggests a deeper affinity between art and play—not in the trivial sense of entertainment, but in the Winnicottian sense of creative exploration. The viewer, like the playing child, brings past experiences, bodily dispositions, and relational templates into the encounter, and in return receives the possibility of transformation. The aesthetic illusion is not a flight from reality, but a way of engaging with it differently: from a position of openness, plasticity, and embodied imagination.
The unconscious engaged by the aesthetic experience is not hidden or censored, but latent—waiting for the right affordance to be reactivated and reconfigured. The image provides that affordance: It offers a transitional field in which affective material can surface, be held, and potentially be worked through. This process is not one of insight, but of enactment. It is lived through the body, through simulation, tension, and release. The viewers may not be able to articulate what has changed, but they may leave the encounter with a subtly altered relation to themselves, others, or the world.
To conceive of aesthetic experience in Winnicottian terms is thus to recognize its fundamentally relational and embodied structure. It is to understand the image not as a representation to be deciphered, but as a presence to be inhabited—an invitation to play with affect, meaning, and the boundaries of the self. This play, sustained by the containment of form and the openness of simulation, is at once profoundly personal and intrinsically social. It connects the viewer to the otherness embedded in the image, and through that connection, to the shared ground of human embodiment.
Ernst Kris and the Dual Function of Art
In Ernst Kris’s pioneering reflections on the psychology of art, we find a conceptual bridge between psychoanalysis and aesthetics that foregrounds creativity not as an exceptional gift but as a transformation of everyday mental processes. In his essay “The Aesthetic Enjoyment and the Reality Principle” (Kris, 1938), and later in Psychoanalytic Explorations in Art (Kris, 1952), Kris articulates a dual function of aesthetic experience: It simultaneously engages the ego’s capacities for mastery and reality-testing while allowing for a controlled regression to earlier, more fluid forms of thought. This tension between the structured and the regressive, between control and surrender, provides an important complement to the embodied approach I have outlined so far.
Kris’s insight is that the aesthetic experience occupies a liminal zone, where primary and secondary processes can coexist without conflict. The primary processes, characterized by condensation, displacement, and associative freedom, are not eliminated in adult consciousness but remain latent, manifesting in dreams, symptoms, or—more constructively—in art. Under ordinary conditions, the ego maintains strict control over these archaic forms of experience. But in the aesthetic encounter, this control is temporarily relaxed. The viewer is permitted to regress—to let go of rigid cognitive and emotional defenses—without losing coherence. This form of regression is not pathological; it is “in the service of the ego.” That is, it enables the subject to enrich and reorganize experience without being overwhelmed.
When considered from the standpoint of embodied simulation, this controlled regression is not only symbolic but sensorimotor and affective. The aesthetic image—through its form, gesture, and affective charge—invites the viewer into a state of bodily resonance. This resonance draws on deeply ingrained patterns of perception, movement, and feeling that were first acquired in early relational contexts. In the presence of art, the viewers may simulate a gesture they have never performed, empathize with an emotion they have never felt, or reenact a bodily state whose origin is not consciously accessible. These simulations activate prereflective layers of the self—layers where affect, memory, and movement are still entwined.
The aesthetic image acts as a medium for reaccessing early affective and sensorimotor configurations, but within a bounded and symbolically structured field. This is the core of Kris’s model: Art creates the conditions for “playing with reality” without disavowing it. The viewer is not asked to believe in the literal reality of the image, but neither is the image inert or purely decorative. It functions as a zone of transformation, a space where bodily affective memories can be reactivated and potentially reorganized through symbolic play.
This model offers an important corrective to overly cognitive accounts of aesthetic experience. While Kris remains within a Freudian framework in many respects, his emphasis on ego functions and on the structuring capacity of form opens a path for integrating psychoanalytic theory with contemporary neuroscience. The aesthetic image, as I propose, is a site of dynamic interaction between ego and id, between voluntary control and involuntary resonance. The viewer enters a state of dual consciousness, simultaneously aware of the image’s fictive nature and absorbed in its affective reality.
This duality is also reflected in the neural architecture of aesthetic engagement. As empirical studies have shown, perceiving emotionally charged or dynamic images activates not only visual areas but also regions involved in action planning, touch, and interoception (Gallese, 2017, 2019; Gallese & Freedberg, 2007). These activations suggest a distributed, embodied network of aesthetic processing—one that supports Kris’s view of the aesthetic as a hybrid state, where cognition and affect, representation and enactment, imagination and embodiment coexist.
The dual function of art also involves the tension between aesthetic distance and affective immersion. This tension, far from being a limitation, is generative. The structured form of the artwork—its composition, rhythm, internal coherence—provides a scaffolding that allows the viewer to approach intense or even disturbing affects without disintegration. The image “holds” the affect, in much the same way as Winnicott’s transitional object holds the child’s conflicting desires and anxieties. The viewer can approach ambivalence, contradiction, and emotional intensity within a frame that permits both containment and play.
This scaffolding is not only formal but temporal. As we have seen in the previous sections, aesthetic experience unfolds in time, allowing for successive phases of identification, distancing, and reengagement. In this temporal unfolding, the viewer can navigate affective landscapes that might otherwise remain inaccessible. A painting depicting suffering, for instance, may first provoke empathic resonance, then cognitive reflection, then aesthetic pleasure at the rendering of form. Each phase involves different levels of bodily and affective engagement, structured by the viewer’s capacity to move in and out of resonance with the image.
Kris’s model thus converges with the idea that aesthetic experience is not merely passive reception but active transformation. The viewer is not only stimulated but subtly reorganized through the aesthetic encounter. This reorganization is not always conscious; it often takes place on the level of affective disposition, bodily tone, or relational orientation. The image functions not as a message but as a device—a dispositive—that modulates the viewer’s embodied relation to meaning.
Moreover, Kris’s notion of regression “in the service of the ego” can be reread today through the lens of contemporary models of neural plasticity. The idea that structured play with affect and perception can lead to reconfiguration of experiential patterns aligns with findings in affective neuroscience, which show that repeated exposure to affective stimuli in safe contexts can reshape emotional responses (LeDoux & Pine, 2016). In this light, aesthetic experience becomes not only a domain of enjoyment or reflection but also a mechanism of implicit affective learning, a way of expanding the repertoire of felt relations to the world.
This is also relevant to the contemporary digital landscape, where images proliferate and aesthetic forms often operate outside traditional institutions or frames. The question becomes: Under what conditions can digital images support this dual function? Can the scrolling viewer, momentarily gripped by a meme, a video fragment, or an artful interface, enter a Winnicottian or Krisian transitional space? My hypothesis is that they can, but only when the image can hold both aspects of the dual function: the regression to affectively charged, sensorimotor resonance, and the structuring function that allows for symbolic containment and transformation. In this regard, the difference between an image that manipulates and one that transforms may lie in the latter’s capacity to engage both poles simultaneously.
Kris’s theory also sheds light on the role of formal innovation in art. By disrupting habitual forms of representation, modern and contemporary art force the viewer to renegotiate their perceptual and affective stance. This disruption can induce a temporary loss of ego control—what Kris calls a “regression”—but if the work provides a new formal scaffolding, the viewer may emerge with a restructured sense of relation, both to the artwork and to themselves. Embodied simulation, in this case, does not merely reproduce familiar patterns but participates in the generation of new ones.
In sum, Ernst Kris offers a conceptual model that complements the embodied account of aesthetic experience by emphasizing its dynamic structure, its capacity for transformation, and its grounding in a balanced play between archaic and adult modes of being. His notion of regression in the service of the ego finds new resonance when understood not only as a psychic dynamic but as a neurobiological and sensorimotor process. Art, in this light, may not just be the sublimation of conflict but its embodied rehearsal and transformation.
Conclusion
What does it mean to encounter an image today—an image that moves, demands response, records reaction, and circulates through networks far beyond the individual gaze? This essay has sought to articulate a framework for answering this question by integrating embodied neuroscience with psychoanalytic theory, to understand aesthetic experience as a relational, sensorimotor, and affective event. Rejecting reductive accounts that isolate visual perception from the lived body, I have argued that aesthetic images engage us through embodied simulation: a neurofunctional process that reenacts gestures, affects, and movements, creating a bodily resonance with what is perceived. This process does not unfold in isolation; it is shaped by the subject’s affective history and relational templates—by what I have described, following both psychoanalysis and contemporary neuroscience, as the affective unconscious.
This unconscious is not the Freudian unconscious of repression and drive displacement. It is nonrepresentational and fundamentally bodily. It resides in the sensorimotor dispositions, the microaffective adjustments, and the sedimented patterns of response that shape how we encounter the world, including works of art. From the moment an image solicits our motor system, it begins to act upon this unconscious register, activating not only recognition but also the possibility of transformation.
While the present account emphasizes the affective and sensorimotor dimensions of the aesthetic encounter, this is not to deny the relevance of cognitive and symbolic processes. Cognitive appraisals, conceptual framing, and interpretive reflection play important roles in how we engage with images—particularly in sustained, culturally mediated aesthetic practices. Similarly, the Freudian model of the unconscious as repressed remains crucial for understanding how aesthetic forms can mobilize symbolic displacement, condensation, and the return of the repressed.
This affective unconscious should not be confused with the Freudian notion of the preconscious, which refers to mental content that is latent yet symbolically structured and accessible to awareness. By contrast, the register I refer to is prereflective and embodied: It consists of sensorimotor dispositions, affective tendencies, and relational templates that shape experience without necessarily becoming available to introspection. It is closer to what Bucci (1997) describes as the subsymbolic, and what Stern (2004) calls implicit relational knowing—forms of embodied and affective organization that operate beneath the threshold of representation, yet are foundational to how we feel, move, and relate.
Rather than viewing these frameworks as mutually exclusive, I propose that they operate on different levels of the aesthetic experience, and that a complete account must consider their interplay. Nonetheless, my emphasis here falls on the prereflective and affective strata, as these are often neglected in contemporary theory and yet foundational to any subsequent elaboration of meaning or form.
It is precisely this foundational role that makes the structure and containment of aesthetic form essential. This is why the aesthetic field has long been theorized as a space of play, symbolization, and sublimation. But the approach developed here, informed by Winnicott and Kris, reframes these processes in more corporeal and relational terms. The artwork, in the first place, is a transitional object: a medium of encounter between inner and outer, self and other, sensation and imagination. It permits a temporary regression into affective and sensorimotor configurations not otherwise accessible, while maintaining the integrative functions of the ego. This regression is not disorganization, but reorganization (Loewald, 1960)—a loosening of fixed forms of experience that allows for new configurations to emerge.
To theorize aesthetics today requires moving beyond classical dichotomies—between subject and object, mind and body, reality and illusion. It requires understanding the image not as a window or a text, but as a threshold—a space where perception becomes relation, and where relation becomes a means of knowing and becoming. Embodied simulation provides the neural grounding for this threshold; the affective unconscious provides its dynamic field; transitional phenomena provide its experiential structure.
We do not simply look at images—we are moved by them, in every sense of the word. We enact them with our bodies, resonate with them through our histories, and sometimes, through them, become other than we were. The image, when it works, makes something happen: not only in the space of the retina or the cortex, but in the deeper strata of feeling, memory, and being. It is this happening—corporeal, affective, relational—that defines the true site of the aesthetic.
Acknowledgements
The author wishes to thank Paolo Migone for his insightful comments on a previous version of this paper.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Ministry of University and Research, the National Recovery and Resilience Plan, project MNESYS (PE0000006), and PRIN grant 2020YB7J25, awarded to the author.
