Abstract
This commentary responds to Gillian Rose's ‘Visualising human life in volumetric cities: city digital twins and other disasters’ as a framework for thinking about crisis and visual cultures in digital twinning practices. It frames Rose's analysis as a specific intervention into the visual and cultural relationship between filmic depictions of disaster and catastrophe replicated through white and masculinist imaginaries of the ordered rational city. We respond to her call for investigation of the ‘untwinnable’ elements of the city by re-approaching the city digital twin from multiple genealogies including the concept of the twin, the software used to produce 3D visualisations and the panoptic fantasies of techno-surveillant discourses. As a result, we conclude by expanding Rose's arguments into the factory, the originary vision of the digital twin as a quality control system in production lines, and returning to a technological vision of the city presented by digital twins that is not only deeply masculinist, but symptomatic of the crisis of capitalism.
Rose's ‘Visualising human life in volumetric cities: city digital twins (CDTs) and other disasters’ offers a compelling analysis of the ways in which digital twinning technologies – and specifically urban or CDTs – produce and perpetuate a white masculinist vision of the city. The article focuses on the role played by visuality and visual culture – including the increasing use of dimensionality in urban digital models; the referentiality of a ‘highly mobile’ perspective towards filmic convention, and the illusion of seamlessness afforded by the rendering of digital three-dimensional (3D) objects. There is much to say on these topics, and Rose's contribution is expansive in its scope, while particular in its argument: yes, CDTs can be understood through visual cultures and imaginaries, and yes, they continue to enact technocratic visions of the city which exclude, other and impose. As Rose (describing a method for interpreting disaster media) notes of Gergan et al. (2020), it is necessary to risk overgeneralisation in order to demonstrate broad-scale shifts. The same is true of visualising the world digitally, whether through CDTs or digitally animated disaster films, as Rose attests.
Inevitably, given the focus, Rose's characterisation of CDTs is emblematic of the fundamental ambiguity and uncertainty of digital twins and twinning practices, in which specific questions around volume, computation and representation are conceptually muddled by technological claims to perfect, real-time correspondence, used to hype the spectacle of the CDT. As Rose argues, we can approach digital twins, and particularly CDTs, differently if we understand them as technological tools that ‘picture’ volume and digitally mediate human life through existing regimes (in this case, the cultural imaginary of urban management technologies, as well as the disaster film and white masculinity). However, the underlying technological slipperiness of the CDT – and its close connection to domains like urban planning, simulation and modelling and digital animation – threatens to mute the impact of the case being made, since many of these examples might also apply to other technological contexts or media (e.g. games and digital art or digital mapping). Yet, this speaks to the co-constitutive aspects of CDTs and their wider cultural milieux, as argued by Rose – undergirding the need for different modes of techno-cultural analysis of such technologies. Nevertheless, there is a risk of collapsing different modes and media (e.g. 3D modelling and four-dimensional or even five-dimensional digital twins that incorporate data, live or real-time feeds and human input), or over-ascribing qualities to CDTs which are, in fact, more general – though this is, again, part and parcel of the ambiguous CDT landscape. The stock image Rose shows in Figure 5 of ‘Visualising human life in volumetric cities’ is a good example here – as Rose and others have also articulated elsewhere (Degen et al., 2017; Rose et al., 2014), smart cities and mega city projects also use similar digital aesthetics to project future cities and imaginaries.
Taking these unavoidable challenges into account, we understand Rose's article as an invitation to expand the analysis of the ‘untwinnable’ and to apply techno-cultural approaches to the problems of computational optimisation and expansive volumetric regimes (and their power), via the kinds of digital mediation available to CDTs. Beyond the filmic and the scopic, we point to academic debates from critical cartography, urban planning and game studies to support the centrality of white masculinist (and we argue, capitalist) ways of seeing in relation to digital twins. Much of this contribution follows our own analyses of the way in which the digital twin was formulated first from the factory as a production-line monitoring tool (Fraser et al., 2025; Payne et al., n.d.), as well as ongoing work on the role of dimensionality in cartography (Wilmott, 2020b), urban computation (Payne, 2024) and scopic regimes and discursive frames of disaster (Fraser, 2022).
From factory to urban fantasy
Rose's analysis centres on the formation of city or urban digital twins as ways of seeing that ‘universalis[e] a disembodied, rational, white masculine gaze which acts on cities, infrastructure and populations based on objective, and objectifying, data’ (p. 164). As she argues, CDTs project a fantasy of the city in crisis – either currently or potentially – in which the digital twin offers a technological solution. For Rose, this is a symbol of a wider crisis of both masculinity, and of computation and ‘tech bro’ culture, which can in part be traced through the history of digital animation in disaster films from 1998 to the present day. In addition to translating visions of disorder (borrowed from other visual media), we argue for the importance of tracing the digital twin's technical imaginary back to its genesis in Michael Grieves’ white papers (Grieves, 2014) – their original use for quality control in factory environments cements Rose's point: digital twins are fundamentally crisis-oriented – but they are also anchored in techno-politics of production and order not typically applied to urban space.
Through digital twins’ origins, manufacturing errors are entangled with wider crises of efficiency, and ultimately capitalism as a crisis. As Diego Botin-Sanabria et al. argue: ‘humans are more complex than manufacturing processes’ (2022: 12, as quoted by Rose, 2025: 164). So too are cities more complex than factories; urbanisms more complicated than digital twins. As the digital twin transposes from factory to urban forum it brings along underlying ideological aspects and affordances which technically prioritise productivity and profitability above the social and relational. As we argue elsewhere (Fraser et al., 2025; Payne et al., n.d.), the focus of many high-profile CDTs on limited domains like automated traffic management and building energy performance is a function of their orientation towards technical or engineering problems over the multifaceted complexity of urban social life. Reducing friction from urban transportation and minimising waste from HVAC systems are not necessarily the most urgent or compelling goals for cities with entrenched crises of inequality and ecocide, but they are the ones solvable by the tools ready to hand. Tracing this transposition sees yet another layer of urban crisis emerge – this, too, is depicted over and over again across disaster films, video games, urban photography and other media (Fraser, 2019, 2024). Read from the factory, the relation between CDTs and crisis extends beyond the fantasy of imposing order against human chaos into a crisis of endless progress and the maximisation of surplus – crises which masculinist order and rationality seek to resolve: a visual culture of control designed to centralise production as the single function of the city.
Volumetric visions
Rose argues that CDTs are part of a changing visual landscape in urban planning and architectural disciplines: ‘As animated three-dimensional models of cities, they are part of a much broader shift towards three-dimensional imagery’ (2025: 150). Rose captures the longer trajectory of 3D animation and modelling as a cultural tendency (intentionally presenting more human contexts to counter the primarily technical academic accounts of CDTs to date). While we appreciate Rose's argument that ‘most of the academic literature on CDTs consists of technical discussions of their design or implementation’ (2025: 148), we nevertheless must relay more technological detail in order to unpack questions of volume and vision. CDT imagery is an end product of 3D modelling processes, as architectural and game engine software renders dimensional spaces, collapsing vector-based digital objects, timelines, trajectories, lighting and equations, alongside raster textures and skins, into digital video or into discrete visual frames that are run together as animated scenes. Technically speaking, an animated 3D model is enacted in four dimensions – the three geometric dimensions, which are then captured in a fourth dimension, time, through a series of still frames-run-together, which in turn, offer their own regimentation of temporality (Wilmott, 2016). This means that there are in digital media, as Verhoeff (2012) argues, at least two visualities at work: that of the flat cinematic screen (the subject of Rose's article), and that of the multi-dimensional stage (which in film would be a set and in digital systems would be software). To complicate things further, digital twins can be described as ‘5D’, to articulate the added layer of data and real-time feedback or sensing that enables management and tracking in intensive ‘live’ detail, using models and interactive interfaces. 3D is just one component of both the wider software environment, and the associated imaginaries and spaces of CDT.
Prior to 3D special effects in film production, city models would be built as physical miniatures across which a camera would pan or zoom (Batty, 2007), an enactment of the god trick that Rose clearly describes. This ‘stage’ model is now compiled through digital software (e.g. Unreal Engine, as Rose describes, as well as Unity, Blender, ArcGIS Reality and ArcGIS CityEngine, or NVIDIA's AI-enabled OpenUSD), undergirded by coordinate systems on Cartesian planes – x, y, z – through which all objects are built and placed, and all trajectories projected. This presents a system of duelling dual visualities, where elements and processes of CDT production just as closely mirror cartographic productions in the development (or ‘dev’) phase, as they do filmic ones in the renders. The difference between Rose's films and the Cartesian coordinate plane lies in the politics of the universal, inverted: in the classic film, the negatives and projections themselves are near-infinitely detailed while the stage is limited by the materiality of the set. In cartography, the opposite is true: the map is limited by the size of the page, while in theory, the stage or the landscape depicted is scalable from the minute to the massive – largely using Cartesian coordinate systems. It is in this scalability that the visual politics of volume were also presaged: maps have long had the capacity to portray multidimensional information in ratio, even if they themselves are flat – from contouring, hachuring, sounding depths and bathymetry, to the scale bar itself, a measured and rationalised z plane is a cartographic fixture (Wilmott, 2020b).
This poses an additional set of questions about the politics of looking dimensionally, centred on the regimentation of voluminous spaces into volumetric geometries. As Billé (2020) argues, volume experiences the same tensions between materiality and representation that all geographies do: voluminous phenomena become rationalised on the z-axis, and so become pointed, and generalised through lineation and surfacing (Wilmott, 2020b), all the while slipping and escaping from the limits of such visual regimes (Wilmott, 2020a). Rose has opened a wide range of directions for further thinking on these topics, understanding concepts of dimension, volume and digitality through digital cultures, but also gesturing towards further debates around fundamental – even ontological – questions of digital mediation.
Technological surveillance as digital panopticon
That the city can be represented as a series of geometric objects moving through space via a series of equations is one concern raised by Rose (expressed through computation and volumetric regimes) – but to what end? As noted above, the simplification of space and spatial processes by digital twins originates in the factory as a solution to inefficiencies in production and prototyping. In the factory, it is less necessary to think expansively about the possibilities inherent in a more complex space (like a city), because a factory is a tightly controlled environment (in theory, at least). A factory has a singular pre-defined purpose, wherein ideologies of production like the Six Sigma strategies approach the asymptote of mimesis by allowing fewer than four defects per million outputs (see Huxley, 2015). While a CDT might produce many identical copies of a physical object, and follow this logic of mimesis at scale, it is important to question both the truth of these models in context, and the pre-history of the city-as-(digital)-object(s).
Regimes and imaginaries of CDTs and other twins predate Grieves coining the term ‘digital twin’ in 2002 (cf. Grieves and Vickers, 2017). In the early 2000s, there was considerable debate on the politics of digital-spatial media architectures, from urban planning platforms (such as Digital City Amsterdam) or virtual replicas of real cities (for instance, Virtual Helsinki) to proto-digital twinning systems. Katy Börner uses the term ‘twin worlds’ as she outlines a series of virtual archival projects – iPalace and iGarden – which would allow users to virtually engage with archival collections using spatial memory techniques (Börner, 2002). Börner (2002: 260) describes these twin worlds as ‘collaborative memory palaces’, a distinct ‘respatialization’ of the archive from the rigid and techno-centric tabular information science models that characterised the early internet, a deliberate use of digital 3D environments to resocialise technical systems that Rose (drawing on McKittrick) has critiqued as systems of homogenisation or domination. Similarly, Ishida et al. (2002) described emplaced digital spaces such as Digital City Kyoto, which was a three-thousand-store shopping mall mirrored on the shopping street in Kyoto, where users could visit stores, chat to store workers and purchase items. Importantly, however, while these early authors note that ‘the internet yields the possibility of building virtual malls comprising a huge number of shops that cannot exist in any physical city’ (Ishida et al., 2002: 247), they also argue that the basis of the success of digital city projects is that trust is pre-established in the social space of the local and situated before being extended across to the digital environment. These examples eschew the now-prevalent masculinist vision of technological dominance, which, for Rose, is housed in the fantasy of absolute transparency (McKittrick, 2006) both espoused and consumed by the tech-sector.
Instead, such examples demonstrate an early understanding about the importance of co-production in urban digital spaces, grounded in material as well as social relations in a mode that is arguably deeply feminist.
These pre-CDT imaginaries of technologically-produced twin worlds were subsequently overshadowed by the push toward a more masculine visioning of the twin concept, imbued with a power of computation and representation that spectacularised the twin itself when represented in the visual mode (thus the kinds of images we see in Figure 5). The question is, then, where are the politics of masculinity grounded in these later iterations of digital cities as CDTs: what happened between early imaginaries of twin worlds as collective and fundamentally human, and current realities of city digital twins that symbolise a struggle between the human and the technological; the emotional and the rational? The digital twin of the factory, as conceived by Grieves and Vickers, was not intended to be collaborative, nor to imagine the complex social space of the factory itself. Even the slippage from twin as an adjective (as used by Börner for ‘twin worlds’) towards ‘digital twins’ as nouns or holistic objects, demonstrates this focus shift towards production and the transmission of information across tightly controlled spaces.
Rose writes that ‘a stubborn investment in the fantasy of the CDT as both offering, yet failing to deliver, technocratic control over a city splits the CDTs’ affiliated demi-gods into the manager and his other’. On this theme, Grieves, in a white paper from 2014, specifically argues that no longer does the manager need to be in a glass office above the floor, but is now able to exercise the same oversight (or technocratic control) over multiple factories across the world simultaneously. Instead of Bentham's physical panopticon, or even Foucault's figurative one, we are left with a hybrid distributed version of control through observation, neither fully metaphorical (since the creators and managers of the large tech conglomerates really do have the ability to surveil most of the world's population, and have frequently abused this power both with and without the participation of sovereign states) nor with a discrete tangible form. Here we can offer an even more expansive reading of a profoundly techno-capitalist, as well as techno-masculinist, vision when compared to the imaginaries of collaborative memory palaces that conceive of digital environments as co-produced, social and egalitarian mirror worlds. Is the tendency towards capital, rather than the social, rendered in and through digital twinning one of the reasons why large-scale tech companies (from NVIDIA to Microsoft) have been eager to invest significant resources in the computing required to engineer CDTs as industrial spatial systems? Who is the manager's other, in the factory? The inventor, or innovator, who tinkers with machines, creating mechanical looms and spinning jennies, production lines, robotic arms, surveillance software and now, autonomous mobile robots supported by GPS trackers, digital sensors and digital twin models in order to improve efficiency, increase productivity and maximise profits.
Thus, where the hero of the disaster film inverts the crisis-defender of the CDT-enabled planner, it is the inventor (or even the ‘move fast and break things’ figure of the tech bro) who chaotically innovates (and masculinises) digital tools to achieve the impossibility of perfect, infinite replication – the fantasy of the original digital twin.
Conclusion
As Rose clearly articulates, ‘CDTs must be understood as a fantasy as much as a feasible technique for urban management, as an imaginary as much as a technology’ (2025: 151). To this end, again, we offer an expansion of Rose's point: the fantasy is not only in the spectacularisation of volumetrics, computation, or even the image of the city, but the projection of a future twin that – through the myth of the god-trick and masculinised logics of technical efficiency – envisions a seamless and highly controlled city grounded by digital twin systems. As such, the digital twin is not merely reflective of the status quo (whether gender, power or otherwise). It is anticipatory and projective, predicting its own expansion: the CDT ‘offers a prospective vision rather than a settled technical landscape: a momentary projection of its yet-to-be realized future’ (Fraser et al., 2025: n.p.). This is perhaps where a CDT is most like film, in the conventional sense – not in the animations captured as digital objects moving across parabolic equations, informed by sensor-derived data, amalgamated and visualised, but rather in the subjugating and worlding qualities that persistently underlie technological media under industrialisation.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
