Abstract
This article examines the ways videogames become animated by looking at gaming practices that subvert traditional notions of play: specifically tool-assisted speedruns (TAS). A TAS is a playthrough of a videogame that is preprogrammed by a human so that the inputs can be automatically played back in full without a human operator. This practice requires an intimate knowledge of the inner workings of gaming systems, often to the point of productively breaking the games through glitches and exploits. These extreme practices give a unique insight into the ways animation occurs within videogames and reveal games to be animated in a variety of ways that are often not primarily directed towards the visual or towards humans. This article outlines five of these modes of animation, separating them into multi-tiered ‘layers of animation’: sensory output, game state, code, material, and operator. TASs help to demonstrate that these layers are actually discrete forms of animation that can each become animated without necessarily impacting one another.
Videogames are animations. This statement is perhaps self-evident. After all, so much of videogame marketing emphasizes pushing graphical fidelity to the limits, to the point that it is nearly impossible to distinguish game from reality. But simply calling videogames animation elides the complex technological operations that animate the assemblage of videogames. It is not just videogames’ audiovisuals that are animated. As Andrew Johnston (forthcoming) argues, discrete elements of videogames such as ‘unseen microtemporal actions’ are actually ‘animations unto themselves’. Videogame animation, or computer animation generally, is an assemblage of many animations working in concert to produce experiences for human and nonhuman observers alike. Johnston dubs this assemblage a ‘stack of animation’. This article builds on the concept of ‘stacks of animation’ to lay out a framework to better understand the complicated ways videogames come to life on screens, in virtual worlds, within abstract computational frameworks, across circuit boards, and through the use of input devices. I use the concept of the stack as a means to pull apart the multiple interrelated yet distinct animations that are continuously present at every moment of videogame play.
In order to do this, I interrogate a player practice called tool-assisted speedrunning, or TAS. A TAS is a preprogrammed script of inputs that is fed to a videogame, causing it to play automatically based on the precise deterministic string of commands. As will be shown, the act of TASing requires a unique pulling apart of the layers of animation, which lays bare the animations that are always present during videogame play. The first section explores my understanding of animation and the concept of layers of animation that will be illustrated through the TAS case studies used throughout the article. I then move to an overview of the practice of TAS before working through sections dedicated to an example from specific TASs or TASers that exemplifies each individual layer of videogame animation: sensory output, game state, code, material, and operator.
Computer animation
When I use the term ‘animation’ I do not simply mean images on a screen, although that is a part of videogame animation. My use of ‘animation’ is indebted to Bernard Stiegler’s (1999: 17) definition of technics as ‘the pursuit of life by means other than life’. This leads to a much richer sense of animation that includes what Daniel Johnson (2017: 224) refers to as ‘the sense of vitality that is experienced in a game world as it is produced by [a] sense of movement’. I will complicate the notions of ‘experiencing’ and ‘sensing’ as I explore the various ways videogames are animated, but this is a good point to start an investigation of videogame animation. What I mean when I say that videogames are animated is that videogames pulse with life; they are ‘disturbingly lively’ as Donna Haraway (1991) would put it. There is an anima to videogames that is difficult to locate in any one place or at any one time but is found in all of the various movements that bring videogames to life.
Animation is not an easy term to pin down as its form changes over time. Andrew Johnston (2020: 1) argues that ‘animation moves’. This not only refers to the ways that animation creates illusions of movement, but it also draws attention to the ways the definition of animation shifts with the technologies used to produce it. As the technologies for producing animation shifted to digital forms in the mid-20th century, Johnston argues that ‘articulations of temporal synthesis co-existed through two models: one indebted to the optical registration of the instant, the other to the algorithmic calculations of information processing’ (p. 9). Through the animation of the computer’s internal processes, animation ceases to be solely oriented towards viewers of images on screens or, as Johnston phrases it: ‘The life of animation is not simply bound to its moving images, but also its technical changes, which contain a vitality that produces actions’ (p. 15).
Viewers, however, do not directly access the ways computers are animated, as these kinds of animations are outside of human perceptual thresholds, which, as Mark Hansen (2015) points out, is a defining feature of new computational media: they ‘operate predominantly, if not entirely, outside the scope of human modes of awareness (consciousness, attention, sense perception, etc.)’ (p. 5). Shane Denson and Andreas Jahn-Sudmann (2013: 15) describe this phenomenon as ‘blindness to computational temporality’. This does not mean that we cease to engage with animation or computers in sensory ways, but any on-screen or auditory output is a result of imperceptible and extremely rapid micro-processes.
Computers (and by extension videogames), then, pose an interesting problem when it comes to the ontology of animation. Where exactly does animation lie during videogame play? Jacob Gaboury’s (2015) analysis of the development of computer generated graphics in the 1970s is an instructive exploration of this question. Gaboury notes that ‘the computer is not a visual medium. We might argue it is primarily mathematical, or perhaps electrical, but it is not in the first instance concerned with questions of vision or image’ (p. 40). Because the computer is not naturally oriented towards the visual, it is not trivial to force a computer to accurately render graphics. Computers do not have the luxury of a natural vantage point, so it is especially complicated when the image represented must have the illusion of three dimensions while rendered from a single vantage point on a two-dimensional screen. In early computer graphics, computers rendered the entirety of an image simultaneously with no regard to the perspective of the viewer, creating three-dimensional images that displayed portions of an image that should be obstructed. There were many solutions to this ‘hidden surface problem’, but the problem itself reveals something interesting about computer animation. Even though the computer may not graphically represent portions of an image on the screen, the computer still constructs the entire object in non-visual ways. Gaboury explains that ‘this world of graphical objects thus exists prior to the rendered output of the screen – as patch definitions, object databases, and graphical algorithms – and the image is only one of many meaningful forms this data might take’ (p. 57). There is an ontology to the object even if it is not visually rendered. It is animated through the computer’s processes, and that underlying animation is fundamentally different than more traditional forms of animations. 
Seth Giddings (2005) notes this directly in regard to videogames saying: ‘new media such as digital games maintain these entities’ animate existence.’ Ontological objects are created and maintained at all times in videogames. They are animated even if they never appear on a screen or are in any way perceived by a human.
This account of computer graphics demonstrates the interconnected stacks of animation that are at play in videogames. I borrow the concept of ‘the stack’ from Johnston and, although he is clearly influenced by Benjamin Bratton’s (2016) work, I am more directly interested in Bratton’s understanding of the stack as an organizing structure of computers. He argues that the same computational logic can extend to planetary-scale computing embedded in political geographies. Though my aims are not nearly as far reaching, the stack is a useful structure because of the ways Bratton conceives of the relations between every layer within the stack of computation. He argues that ‘each [layer] is considered on its own terms and as a dependent layer within a larger architecture’ (p. 11). Johnston (forthcoming) is interested in ‘moving up and down its stacks to reveal animation’s role in the diverging temporalities that mark these technologies and this moment’. Revealing the various temporalities in the stack is important to my project as well. However, while moving through the stack I will also be breaking the stack apart into its constituent parts to more fully interrogate the way that the individual layers of the stack become animated. I locate at least five layers of animation at play in videogames that are distinct yet built upon one another: sensory output, game state, code, material, and operator.
Layers are independent, yet enter into a relationship with each other, reciprocally informing one another. This is similar to Nathan Altice’s (2011) borrowing of the term ‘dynamic heterarchy’ from N Katherine Hayles (2008) to describe the ways players and videogame systems form a cybernetic feedback circuit. Hayles defines dynamic heterarchy as a ‘multi-tiered system in which feedback and feedforward loops tie the system together through continuing interactions’, whose tiers ‘continuously inform and mutually determine each other’. The stack functions as a holistic assemblage, while every layer also has its own agency to act independently of the larger structure. Instances where the layers do not impact each other demonstrate the nature of videogame animation most dramatically. By decoupling the layers, the black box that is videogame animation opens itself.
Tool-assisted speedrunning uniquely breaks the stacks of animation so that each layer can be analyzed separately. What I aim to make clear through my analysis of TAS that follows are some of the diverse interconnected yet discrete ways that videogames become animated as a means to more fully engage with the animation that happens at every moment of videogame play. The following section gives an overview of the practice of TASing which sets the stage for investigations of the individual layers in the stack of animation. I begin this section with a description of a TAS to demonstrate the possibilities of a TAS’s animation.
Tool-assisted speedruns
The title screen for Pokémon: Yellow (Nintendo, 1998), complete with the iconic mascot Pikachu, briefly flashes on-screen before ‘New Game’ is selected. Professor Oak quickly lectures about the titular Pokémon creatures, the player-character and his rival’s names are entered, and the adventure starts. The new Pokémon trainer sits in his room but, instead of exploring the world, the pause menu is immediately brought up, the save function selected, and the game is powered down and reset. Back to the title screen. This time ‘Continue’ is selected. For the briefest of moments the trainer is seen sitting in his room before the menu is brought up again. The cursor dances through the menu rearranging Pokémon and items that were never acquired. The screen takes on a strange blue tone, seemingly random alpha-numeric characters flash on the screen, and after about 30 seconds of tossing items out of the inventory and rearranging non-existent Pokémon at breakneck speeds, the screen fades to white (see Figure 1). Fade back in. The player-character enters the final room of the game where they are normally entered into the Hall of Fame after defeating the fearsome Elite Four, but, instead of congratulating the hero, Professor Oak says an iconic line from a different Nintendo title: ‘Thank you Mario! But our princess is in another castle!’ Fade to white. Another title screen, this time it belongs to Pokémon: Gold (Nintendo, 1999). A game file is loaded, and more impossibilities follow. A Pokémon that cannot normally be encountered appears, but, even more fantastic, talking to a non-playable character (NPC) brings up a game of Tetris (Pajitnov, 1984). After Tetris plays for a few moments, leaving a building sends the character to The Legend of Zelda: Link’s Awakening (Nintendo, 1993, see Figure 2), and then back to Pokémon: Gold. Examining a television summons level 1-1 of Super Mario Bros. (Nintendo, 1985). Level complete. Back to Pokémon: Yellow. Leaving a room warps to the Hall of Fame again. This time, Professor Oak doesn’t talk to the character, but he sings the credits song from Portal (Valve, 2007), ‘Still Alive’. After the rendition, a short, low-resolution clip from Spongebob Squarepants plays (see Figure 3). In this clip, Patrick the starfish asks what the viewer has likely been thinking during this whole video: ‘How does he do that?’ Finally, after 5:48.28 real time, 20,802 individual frames, and an in-game time of 0:00, Pokémon: Yellow’s credits roll (see Figure 4).

Figure 1. Glitchy visuals appear during menuing. Frame grab from Submission #5384: MrWint’s GBC Pokémon: Yellow Version ‘Arbitrary Code Execution’ in 05:48.28 (2017). Available at: http://tasvideos.org/5384S.html

Figure 2. Link’s Awakening appears in Pokémon: Yellow. Frame grab from Submission #5384: MrWint’s GBC Pokémon: Yellow Version ‘Arbitrary Code Execution’ in 05:48.28 (2017). Available at: http://tasvideos.org/5384S.html

Figure 3. Patrick from Spongebob Squarepants asks ‘How does he do that?’ within Pokémon: Yellow. Frame grab from Submission #5384: MrWint’s GBC Pokémon: Yellow Version ‘Arbitrary Code Execution’ in 05:48.28 (2017). Available at: http://tasvideos.org/5384S.html

Figure 4. Screenshot of Pokémon: Yellow end screen with in-game time of 0:00. Frame grab from Submission #5384: MrWint’s GBC Pokémon: Yellow Version ‘Arbitrary Code Execution’ in 05:48.28 (2017). Available at: http://tasvideos.org/5384S.html
Despite the forays into many different videogame titles (and a television program), this series of events comes from a tool-assisted speedrun (TAS) of Pokémon: Yellow for the Game Boy Color, a role-playing game that tasks the player with traveling across the world to train, capture, and battle with the titular Pokémon. The run was played by Christian Koch, AKA MrWint, and is meant to show off the power of an exploit called arbitrary code execution (ACE) that allows MrWint to completely reprogram the game in real time through the interface of the game itself (Submission #5384). But, as we will see, that is not an entirely accurate description. This is not exactly a speedrun. It’s not exactly Pokémon: Yellow on the Game Boy Color, nor is it completely true to say that it was played by MrWint either. This account is a dramatically incomplete description of a ‘movie’ submitted to tasvideos.org, a website dedicated to promoting impressive tool-assisted speedruns.
Speedrunning is the practice of playing a videogame as quickly as possible. Methods for speedrunning are split into two main categories. Speedruns that are performed by a human playing the game live are called real-time attacks (RTA). TASs, on the other hand, are a separate practice and utilize various outside programs and devices that allow players to do things such as slow the game down in order to perfect particular strings of inputs, peer into the game’s memory to find specific values, or reload save states to quickly test different methods of progression. These tools allow players to laboriously program strings of inputs that are fed to the game system, causing it to register those inputs at precise moments. The inputs are recorded and automatically played back, resulting in the final playthrough. There is no human player pressing buttons in a TAS. The run can be likened to a player piano that simply executes preprogrammed inputs.
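The player-piano logic of playback can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the one-line-per-frame input format, the `ToyGame` class, and the button names are hypothetical stand-ins of my own, not the format or behavior of any actual TAS tool or emulator. What the sketch is meant to show is that, because the toy game responds only to its inputs, feeding it the same script twice yields exactly the same result.

```python
from dataclasses import dataclass

# Hypothetical input script: one line per frame, listing the buttons held on
# that frame ("." means no buttons). Real TAS movie formats differ.
MOVIE_TEXT = """\
R
R
R A
.
L
"""


@dataclass
class ToyGame:
    """A stand-in for a deterministic game: its state depends only on inputs."""
    x: int = 0
    jumps: int = 0

    def step(self, buttons):
        # Advance the game by one frame using the buttons held on that frame.
        if "R" in buttons:
            self.x += 1
        if "L" in buttons:
            self.x -= 1
        if "A" in buttons:
            self.jumps += 1


def parse_movie(text):
    """Turn the script into a list of per-frame button sets."""
    frames = []
    for line in text.splitlines():
        frames.append(frozenset(t for t in line.split() if t != "."))
    return frames


def play_back(frames):
    """Feed every frame of input to a fresh game; no human presses anything."""
    game = ToyGame()
    for buttons in frames:
        game.step(buttons)
    return game


if __name__ == "__main__":
    first = play_back(parse_movie(MOVIE_TEXT))
    second = play_back(parse_movie(MOVIE_TEXT))
    print(first, "identical on replay:", first == second)
```

The design choice worth noting is that the script contains no information about what appears on screen; it is a list of button states indexed by frame, and everything else follows from the game’s determinism.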
The challenges and pleasures of videogames are completely recentered through this practice. Patrick LeMieux (2014) aptly describes the creation of TASs as turning games that normally require quick physical reactions into ‘turn-based puzzle games’ because the ‘tool-assisted speedrunners’ are able to ‘consider the best strategy frame-by-frame’. RTA speedruns highlight the physical abilities of human players to perfect the inputs necessary to complete games as fast as possible, whereas TASs foreground technical knowledge of the game’s systems and creative problem solving. There is one other major difference between RTAs and TASs. While many people watch recordings and live streams of RTA speedruns for entertainment, at their core, RTAs are concerned with completing games as quickly as possible. This means that players attempt to optimize their runs regardless of how enjoyable the action on screen is for viewers. Many TASs, on the other hand, are created purely for viewers’ entertainment, so a TAS may show off interesting glitches or intentionally appear technically impressive regardless of how long it takes to complete the game.
The final product of a TAS is called a movie, but the movie is not primarily visual in orientation. The TAS movie is a text file that contains information about the individual inputs programmed for a run. A recording of the audiovisual output of the game executing those particular inputs is often made; my description of MrWint’s run above, for example, is based entirely on an mp4 recording. But the audiovisual outputs and the mp4 file are only a part of the TAS movie. The other part of the movie is comprehensible only to the system that interprets the text file as inputs. TASs, then, are not as they seem on the surface. When programming a TAS, the screen can often become an afterthought as inputs are directed at manipulating microtemporal computational processes. However, surfaces still play a pivotal role in their production, as is evidenced by the screen-centric entertainment that is often the exigence for creating TASs in the first place. The TAS can be thought of as a multilayered artifact. Nathan Altice (2015: 324), for example, describes TASs as ‘all at once a text file, a performance, a dance, an animation, a procedural event, sport, entertainment, a social act, an ethics, an archive, and yet still a videogame’.
Animation is a particularly important descriptor to analyze here. TASs may be unique because of the ways players meticulously comb through and manipulate the game’s procedures to create incredibly precise performances, but through these fantastic performances they reveal the ways that all videogames are animated. In the same way that the recording of the TAS itself masks its own nature, so too does animation in videogames more generally, as much of videogames’ animation is not primarily directed at the human. In order to explore the ways videogames are animated, I will analyze MrWint’s TAS of Pokémon: Yellow along with other TASs and TASers that employ techniques similar to MrWint’s in their runs. The following sections work through the different layers of animation that I locate as central to videogame animation: sensory output, game state, code, material, and operator.
Layer 1: Sensory output
The layer of animation that presents itself most readily to players is sensory output. This is the form of animation that players usually have direct access to, and is what they perceive of the gameworld during play. It comprises what is typically thought of as animation in other forms of media such as cartoons or CGI: the moving images on screen and sounds that accompany them which bring the media to life. However, audiovisuals are not the only ways videogames affect players sensorily as they can also include haptic feedback like rumble features that animate the controller in players’ hands. Thus, I use the more general term ‘sensory output’ instead of ‘audiovisuals’ to describe this layer.
Sensory output often correlates to and gives some amount of access to the ‘game state’. The game state describes the objects generated within the gameworld and their properties at any given moment. For example, when there is, say, a wall in a game environment, the visuals may represent a solid object, but those visuals also imply that there is a solid barrier that does not permit other objects to pass through its particular coordinates. The solidity of the wall is a part of the game state; however, the visuals are not necessary for the wall to prevent movement. This is commonplace in many games that use invisible walls to demarcate the extreme edges of gamespaces, for example. The visual representation of the wall and the physical properties of the wall are independent of one another, as neither necessarily depends on the other to create the gamespace.
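The invisible wall can be made concrete with a minimal sketch. It assumes a hypothetical tile-based room of my own invention, stored as two separate grids (`VISUAL` for what is drawn, `SOLID` for what blocks movement); it does not describe the data layout of any particular game. The point is simply that the movement check never consults the drawn tiles.

```python
# Two independent grids describe the same hypothetical room.
VISUAL = [
    "#####",
    "#...#",
    "#....",  # no wall tile is drawn at the right edge of this row
    "#####",
]
SOLID = [
    "XXXXX",
    "X...X",
    "X...X",  # yet the right edge of this row still blocks movement
    "XXXXX",
]


def draws_wall(x, y):
    """Sensory output: is a wall tile drawn at this coordinate?"""
    return VISUAL[y][x] == "#"


def can_move_to(x, y):
    """Game state: movement consults only the collision grid."""
    return SOLID[y][x] != "X"


if __name__ == "__main__":
    x, y = 4, 2
    print("wall drawn:", draws_wall(x, y), "| passable:", can_move_to(x, y))
```

At coordinate (4, 2) nothing is drawn and yet nothing can pass: the sensory output and the game state disagree without either one breaking.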
To explore this, let’s return to MrWint’s TAS of Pokémon: Yellow. Their run is meant to show off the power of an exploit called arbitrary code execution (ACE) that is present in all Generation 1 Pokémon games (Pokémon: Blue, Red, and Yellow). The term ACE comes from computer security, where it describes a hacker’s ability to execute commands or code on a targeted machine. In speedrunning, ACE is the act of programming and executing code through the interface of the game itself. In effect, players hack the game through gameplay to force it to run their own customized code, which is also coded through actions taken within the game. There is no single method to accomplish these hacks. Programs feature unique vulnerabilities and methods to execute ACE but, generally speaking, ACE is most often accomplished by taking control of the program’s instruction pointer, which controls what procedures the computer executes, forcing it to point to an area of the game’s memory that the player has the ability to manipulate, thus writing code within the game. ACE allows different levels of control from game to game. At a bare minimum, it is usually possible to force a game to initiate its credit sequence, which is undoubtedly the first thing speedrunners learn how to do. In some games, including Pokémon: Yellow, players are able to completely take control of the game, gaining the ability to rewrite all of the game’s code, only being restricted by the limitations of the hardware. In effect, players reanimate the game as they see fit.
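The general mechanism of redirecting an instruction pointer into player-controlled data can be illustrated with a toy machine. Everything in this sketch is an assumption made for illustration: the four opcodes, the ten-address memory layout, and the ‘glitch’ that corrupts a jump target are hypothetical, and they bear no relation to the Game Boy’s actual instruction set or to the specific exploit MrWint uses. The sketch only shows how values the player arranges as inventory data can end up being executed as code.

```python
HALT, PLAY, CREDITS, JUMP = 0, 1, 2, 3  # hypothetical opcodes
INVENTORY_START = 6                     # hypothetical player-writable region


def fresh_memory():
    # Addresses 0-5 hold the toy game's own code; 6-9 hold inventory data.
    return [PLAY, PLAY, JUMP, 5, PLAY, HALT, 0, 0, 0, 0]


def run(memory, max_steps=100):
    """Execute memory from address 0 and return a log of what happened."""
    log, ip = [], 0  # ip is the instruction pointer
    for _ in range(max_steps):
        op = memory[ip]
        if op == HALT:
            break
        if op == PLAY:
            log.append("normal gameplay")
            ip += 1
        elif op == CREDITS:
            log.append("credits roll")
            ip += 1
        elif op == JUMP:
            ip = memory[ip + 1]  # execution continues at the stored target
        else:
            ip += 1  # unknown values are skipped in this toy machine
    return log


def exploited_memory():
    memory = fresh_memory()
    # The player arranges inventory values that double as opcodes...
    memory[INVENTORY_START] = CREDITS
    memory[INVENTORY_START + 1] = HALT
    # ...and a stand-in glitch corrupts the jump target so that the
    # instruction pointer lands inside the inventory.
    memory[3] = INVENTORY_START
    return memory


if __name__ == "__main__":
    print(run(fresh_memory()))
    print(run(exploited_memory()))
```

In the exploited run the machine makes no distinction between the game’s own code and the inventory: once the instruction pointer arrives there, the data is simply executed, which is why forcing the credits is the minimal demonstration of control.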
MrWint’s stated goal in their TAS is to push ACE to its limits by programming another game within Pokémon: Yellow, specifically one of Pokémon: Yellow’s sequels, Pokémon: Silver, Gold, or Crystal. However, they came up against a major issue. The sequels’ cartridges feature hardware that the original games did not have: a battery-powered clock. Because of the differences in the hardware, the sequels’ code cannot simply be run through the same cartridge, even if the cartridge is being emulated on a computer, which is the case here. However, a TAS need not demonstrate a complete game. Because the TAS runs a narrow set of predefined inputs, not all of the sequel’s code needs to be programmed, only the code that will be run during the movie. So, if the processes relegated to the additional hardware are avoided, there is no issue, but MrWint developed an even simpler solution. They explain this in their submission to tasvideos.org (2017):
I realized that all the instructions that really mattered are those that put tiles on the screen or played some sounds. So all I need to do is emulate the actual audio-visual output of the game with the right timing, without any internal game state. This realization was the key to this run, as it opened many more possibilities: The source of the A/V doesn’t need to be another game.
They realized that, instead of reproducing code dedicated to producing game states, such as the physics of walls or the mechanics of encountering enemies, they needed to program only the code related to producing visuals on the screen and sound, as these are the only forms of sensory output native to the Game Boy Color.
Accomplishing this is still no easy feat and involves a lot of complicated manipulations of the game’s inner workings. Executing ACE in Pokémon: Yellow begins with a simple exploit that makes the game think the player has 255 Pokémon in their party instead of the usual maximum of six. This overflows the party information out of the memory it is typically allotted. The position of party members can be swapped around, which changes the values associated with memory addresses those party members are written to, including the addresses that are outside of the normal area the player is supposed to have direct interactions with. By manipulating the order of the party, the item inventory can also be forced to overflow in the same way. Overflowing the item information takes up much more space in the game’s memory than the party information, giving more space to manipulate code via tossing certain items out of the inventory and rearranging others. Using this method to manipulate the code allows the instruction pointer to be controlled and audiovisual information for multiple games, songs, and television programs to be painstakingly programmed. This is technically possible to do in an RTA on cartridge, but the thousands of precise inputs that are required make a TAS the only realistic way to perform these exploits.
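The logic of this overflow can be sketched with a flat block of memory. The layout here is hypothetical and heavily simplified (one byte per party slot, an arbitrary neighboring value at address 7, and the party count set directly rather than through the actual in-game exploit); it is not Pokémon: Yellow’s real memory map. It only illustrates why an inflated count lets an ordinary menu action reach memory outside the party’s allotted space.

```python
PARTY_COUNT = 0          # address holding the number of party members
PARTY_START = 1          # first party slot (one byte per slot in this toy)
PARTY_MAX = 6            # slots the party is normally allotted
NEIGHBOR = PARTY_START + PARTY_MAX  # address 7: data that is not party data


def new_memory():
    memory = bytearray(256)
    memory[PARTY_COUNT] = 2
    memory[PARTY_START] = 25      # arbitrary value for the first party member
    memory[PARTY_START + 1] = 7   # arbitrary value for the second
    memory[NEIGHBOR] = 99         # arbitrary unrelated value next door
    return memory


def swap_party_slots(memory, a, b):
    """The menu's swap action: it trusts the stored count as its only bound."""
    count = memory[PARTY_COUNT]
    if not (0 <= a < count and 0 <= b < count):
        return False  # the menu refuses slots beyond the party size
    ia, ib = PARTY_START + a, PARTY_START + b
    memory[ia], memory[ib] = memory[ib], memory[ia]
    return True


if __name__ == "__main__":
    memory = new_memory()
    print("swap before exploit:", swap_party_slots(memory, 0, 6))
    memory[PARTY_COUNT] = 255  # stand-in for the exploit's effect
    print("swap after exploit:", swap_party_slots(memory, 0, 6))
    print("neighboring value is now:", memory[NEIGHBOR])
```

Once the count reads 255, ‘slot 6’ is treated as a legitimate party member even though its address belongs to other data, so rearranging the party becomes a way of writing values into that neighboring memory.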
When these programmed procedures, called the payload, execute, the game fades to white and, even though the next thing that appears on screen is the player-character walking across a room, the underlying processes that animate the image are different than what normally occurs during gameplay. The payload only contains information regarding the visual tiles, sprites, and pixels that are displayed on the screen and the audio that is output through the speakers, whereas when the game is run normally, other underlying processes occur regarding the game state. For instance, for a few seconds between the 4:04 and 4:07 marks in MrWint’s movie the player-character appears to walk across the room Pokémon: Yellow begins in (see Figure 5). The player-character walks from the stairs in the upper right corner to the television at the center of the screen. The audiovisuals of this short section of the movie mirror Pokémon: Yellow’s perfectly, but much of the animation is not actually occurring when this TAS is run. For example, the Pokémon series heavily uses pseudo-random number generators (PRNG) to create sequences of numbers that influence all kinds of occurrences within the world that are supposed to be unpredictable. The PRNG cycles through its sequence very rapidly. Pokémon: Yellow’s PRNG cycles every two frames and, as the game runs at about 30 frames per second, there are 15 distinct numbers generated per second. Additionally, whenever the PRNG is queried to use one of its numbers, the sequence also advances in its cycle. Loading a sprite of an NPC is one action that queries the PRNG because NPCs have unpredictable walking patterns that use a random number to determine their paths. However, even if an NPC does not walk, the PRNG is still queried. In fact, anything that can be interacted with by the player-character is treated as though it is an NPC. The game does not differentiate between items on the ground, signs, or the television at the center of the room. When the television loads on the screen, the game would normally query the PRNG and advance the sequence by one, thus changing the game state. However, none of these processes occur during this section of the TAS movie. There is no code associated with the PRNG operating: only code associated with the audiovisuals.
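The timing described above can be worked through in a short sketch. The generator below is a toy of my own choosing (a simple linear congruential formula over one byte), not the algorithm Pokémon: Yellow actually uses, and the `simulate` function and its frame numbers are hypothetical. The sketch follows only the two rules stated in the text: the sequence advances once every two frames, and it advances once more whenever something queries it.

```python
class ToyPRNG:
    """A stand-in generator; the formula is NOT the game's real algorithm."""

    def __init__(self, seed=0):
        self.state = seed
        self.advances = 0

    def advance(self):
        self.state = (5 * self.state + 1) % 256
        self.advances += 1
        return self.state


def simulate(frames, query_frames):
    """Run the generator for a number of frames.

    query_frames is the set of frame numbers on which an interactable
    object (an NPC, a sign, a television) loads and queries the generator.
    """
    prng = ToyPRNG()
    for frame in range(frames):
        if frame % 2 == 0:
            prng.advance()      # regular advance every two frames
        if frame in query_frames:
            prng.advance()      # extra advance caused by a query
    return prng


if __name__ == "__main__":
    one_second = 30  # about 30 frames, following the figure given in the text
    quiet = simulate(one_second, set())
    with_tv = simulate(one_second, {10})
    print("advances with nothing loading:", quiet.advances)
    print("advances when a television loads:", with_tv.advances)
    print("same game state afterwards:", quiet.state == with_tv.state)
```

Thirty frames with nothing loading produce fifteen advances, and a single loaded object shifts the sequence by one and so leaves the generator in a different state. In the payload section of MrWint’s movie there is nothing corresponding to this bookkeeping at all: the television appears on screen without any such state changing behind it.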

Figure 5. Player-character walks across room in MrWint’s Pokémon: Yellow TAS. Frame grab from Submission #5384: MrWint’s GBC Pokémon: Yellow Version ‘Arbitrary Code Execution’ in 05:48.28 (2017). Available at: http://tasvideos.org/5384S.html
This distinction between sensory output and game states is an important one to make, especially when it comes to determining the veracity of speedruns. Does triggering the audiovisuals of the game’s ending count as completing the game, or is there an underlying ontological game state that may not necessarily be expressed visually that counts? That is not as much of an issue in this particular run because it is created for entertainment and the demonstration of an exploit, but if one of the run’s goals is to recreate a game within another game, does it truly achieve that goal? Does the audiovisual output of Tetris, Pokémon: Gold, The Legend of Zelda: Link’s Awakening, and Super Mario Bros. constitute the game? The definition of ‘game’, one that is notoriously slippery and contentious in game studies literature (for an overview of the difficulties in defining games, see Bergonse, 2017 and Arjoranta, 2019), becomes even more uncertain in light of videogames’ multilayered animations. But this question of definition is not simply theoretical.
TASs become one player practice among many that destabilizes what Stephanie Boluk and Patrick LeMieux (2017) dub the ‘standard metagame’. They describe this as ‘invisible rules’ (p. 40) that ‘train players to consume software in particular, often narrowly defined, ways’ (p. 279). Videogames have come to be defined by largely standardized rules embedded in their design, marketing, and dissemination that eschew their status as one contingent historical construction. A largely homogeneous industry defines videogames (and videogame players), which in turn creates homogeneous videogame experiences and hails a narrow group of players as ‘real gamers’ (Chess, 2017). Videogames (and the rules surrounding their creation, distribution, use, and technical operations) collapse into a black box that obscures (among other things) the diverse ways they become animated.
The deliberate decoupling of layers of animation (in this case between sensory output and game state, although discrepancies can also arise between other layers) allows for a wider definition of videogames that emphasizes the perceptual, technological, and creative potentials of videogames outside of the ‘standard metagame’. Other artifacts and practices shed light on similar issues involving shifting the centrality of sensory output and the necessity of game states (and often inputs as well). Scholars have noted similar phenomena in examples such as Cory Arcangel’s (2002) completely non-interactive artistic videogame ‘Super Mario Clouds’ (Franklin, 2009); narrative-centric videogame genres like visual novels and so-called ‘walking simulators’ (Ruberg, 2019); the phenomenon of esports broadcasting gameplay on Twitch.tv (Taylor, 2015); machinima videos that leverage videogame audiovisuals and game engines as their creative materials (Lowood, 2006); or even the common use of cut-scenes (Cheng, 2007). The black box of videogame animation is opened in these instances, allowing for different understandings of the potential of videogames. All of these, TASs included, become part of alternative metagames that reveal videogames, themselves, to be a contingent historical construction with a moving definition, just as animation is.
The posing of questions like the ones described above also demonstrates the differences between conventional animation and videogame animation. It is not only the sensory output that animates the world; there is an ontological underpinning to the world through the creation of digital objects and computational processes that manifest in a game state. To explore these ideas further, I will turn to a TAS that uses ACE to recreate the game states of games within another game, instead of simply recreating their visuals.
Layer 2: Game state
At 2014’s Awesome Games Done Quick, a charitable event where speedrunners gather to play games on live streams for donations, MasterJun’s TAS of Super Mario World (Nintendo, 1990), a platform game for the Super Nintendo, was demonstrated (TASVideosChannel, 2014). The run begins with Mario entering the first level of the game. For about one and a half minutes, Mario runs back and forth through the level placing shells, blocks, and other objects in precise locations, all while performing various exploits to spawn other items into the level (see Figure 6). The screen then fades to black and a menu appears in bright green font with the options ‘Pong’, ‘Snake’, and ‘The End’. Pong is selected first, bringing up a screen that uses platforms from Super Mario World as paddles and Mario’s head as a ball to play the classic game Pong (see Figure 7). After two quick volleys, the screen returns to the menu and Snake is selected. This time Mario’s head becomes the front of the snake, which eats apples, causing the tail made of bricks to grow incrementally longer. After playing for a short time, Snake is exited and an end credits screen is displayed. Following the demonstration, two human players plug controllers into the system and play a few rounds of Pong against each other.

Figure 6. Mario places sprites to set up ACE. Frame grab from TASVideosChannel (2014) AGDQ 2014 - TASBot playing SMW total control and various other TASes. Available at: https://www.youtube.com/watch?v=Uep1H_NvZS0

Figure 7. Pong plays within Super Mario World. Frame grab from TASVideosChannel (2014) AGDQ 2014 - TASBot playing SMW total control and various other TASes. Available at: https://www.youtube.com/watch?v=Uep1H_NvZS0
While this is similar to MrWint’s Pokémon: Yellow TAS in that it attempts to create games within an already existing game, there is a fundamental difference here in that MasterJun’s payload features the game state from the emulated games, but not the same sensory output. Both Pong and Snake are fully playable, as demonstrated by the two players that pick up controllers and play against one another. There are game mechanics, such as the physics of the ball in Pong, that adhere to an underlying ontology executed through code. However, the visuals draw from the collection of tiles and backgrounds found in Super Mario World’s video RAM.
The importance of sensory output versus game state for players and viewers alike is again brought to the fore here. Ian Bogost (2007) calls the sensory output of a game a ‘skin’. He argues that ‘the surface representation or graphical skin in a game is not a mere dressing for the abstract rules, such that any particular presentation of a procedural model is essentially arbitrary and dispensable’ (p. 242). He draws from Jesper Juul (2005: 15) who separates games into two layers: rules and fiction. The rules are the underlying processes of the game, and the fiction is the audiovisual output that creates the fiction of the gameworld. Juul, like Bogost, argues that it is commonplace to create games with the same rules and different fictions, but these games do not create the same experience for the player. Bogost uses the example of Raph Koster’s reimagining of the classic Tetris that swaps blocks for bodies, becoming a grotesque and disturbing staging of mass burials and a reflection on the horrors of the Holocaust. But it is not just the images of bodies that animate this game. It is also the underlying mechanics of the game state. The graves become more and more crowded as bodies pile up, crushing the ones at the bottom and clearing space for more to be tossed in. The audiovisuals and game state are both essential to create meaningful animations, yet we still recognize the game as a form of Tetris. MasterJun’s reskinning of Pong and Snake may be much more lighthearted examples, but they still demonstrate the ways that sensory outputs and game states are distinct animations that animate in tandem. The altered audiovisuals may or may not cause players to recognize the games programmed into Super Mario World as authentic versions of Pong and Snake (or Super Mario World for that matter), but what is important here is that the layers of animation are separate operations that work together to animate the game.
Layer 3: Code
While Juul’s delineation between rules and fiction has some parallels to my categorization, it ignores other layers of animation, such as the layer of abstract code. Juul compares two games, describing them as follows: In the first game, the player controls a spaceship in a battle against the heads of the hosts of a television program. In the second game, the player controls a spaceship in a battle against various theories, in this case a narratological model. (p. 13)
Juul claims that both of these games are based on ‘identical rules (and programming), but with different graphics’ (p. 13). However, this is not exactly true. The programming is not identical in these games. Because there are different sensory outputs in the games, there necessarily must be different programming, even if only slightly. Programming, the game’s abstracted language of code, becomes another distinct layer of animation.
Sensory output and game states are, by necessity, expressions of the game’s programming and always must be based on computational processes, yet the abstract programming language is a distinct, autonomous layer that both animates and is animated. To demonstrate how a game’s programming is brought to life, I will turn to another run of Super Mario World for the Super Nintendo that uses ACE to warp to the end credits directly from the first level. This method for executing ACE was first discovered by a runner who goes by あんた, or Anta, and was optimized by Masterjun along with Doomsday31415 and BrunoVisnadi to trigger the credits in only 41.68 seconds (Submission #6432). The underlying processes of this ACE are not quite as complicated as programming Pong or Snake into the game, but it still uses the same fundamental methods as the previous TAS. The highly precise inputs involved in the trick were originally thought to be possible only in TASs, but, interestingly, they have been performed in RTA speedruns. I will examine one such RTA performed live on Twitch.tv by runner Dotsarecool (Dotsarecool, 2015). While the TAS version has pushed the time for completion under a minute, Dotsarecool takes 2:44. However, the slower pace and the strategies used to execute the trick make for a slightly easier explanation of the execution of the ACE, demonstrating important characteristics of the animation of videogame code.
Dotsarecool’s run of Super Mario World begins similarly to the previous Super Mario World TAS. Dotsarecool guides Mario to the first level, throws shells, gets on Yoshi (Mario’s rideable dinosaur companion), and breaks some blocks but, in the blink of an eye, the game is simply over. What seems like haphazard movement throughout the level is actually incredibly precise manipulation of the x-coordinates (or horizontal location) of sprites (the two-dimensional objects within the gameworld). The game is able to load a maximum of 10 sprites at any time, each written to one of 10 memory slots. When a sprite is spawned, its information is written to the highest available slot. So, the first sprite will load to slot 10, the second to slot 9, and so on. When a sprite despawns for any reason, it vacates the slot, allowing another sprite to be spawned, but information about a sprite’s x- and y-coordinates (vertical location) is saved until a new sprite is spawned and loaded to that slot. With this knowledge, sprites, like shells, enemies, and powerups, can be manipulated so that the x-coordinates of the highest seven sprite slots contain a specific string of hexadecimal bytes. The specific bytes in Dotsarecool’s run are as follows: A9 1C 92 75 4C 46 FF.
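The slot behavior described above can be sketched as a toy model. This is a minimal illustration under stated assumptions, not the game's actual memory layout: the class `SpriteTable`, its method names, and the byte order returned by `leftover_x` (highest slot first) are all hypothetical, and each sprite's x-coordinate is simply set at spawn time rather than moved into place as a runner would do.

```python
NUM_SLOTS = 10  # slot count taken from the description above


class SpriteTable:
    """Toy model of the sprite slots; not the game's real memory layout."""

    def __init__(self):
        self.active = [False] * NUM_SLOTS  # is a sprite alive in this slot?
        self.x = [0x00] * NUM_SLOTS        # last x byte written to each slot

    def spawn(self, x):
        """Load a sprite into the highest free slot and return its number (1-10)."""
        for index in range(NUM_SLOTS - 1, -1, -1):
            if not self.active[index]:
                self.active[index] = True
                self.x[index] = x & 0xFF   # overwrite whatever was left behind
                return index + 1
        return None                        # every slot is occupied

    def despawn(self, slot):
        """Free the slot but leave its last x-coordinate in memory."""
        self.active[slot - 1] = False

    def leftover_x(self, count):
        """x bytes of the `count` highest slots, highest slot first (assumed order)."""
        return [self.x[i] for i in range(NUM_SLOTS - 1, NUM_SLOTS - 1 - count, -1)]


PAYLOAD = [0xA9, 0x1C, 0x92, 0x75, 0x4C, 0x46, 0xFF]

table = SpriteTable()
slots = [table.spawn(x) for x in PAYLOAD]  # fills slots 10, 9, ..., 4
for slot in slots:
    table.despawn(slot)                    # sprites vanish, bytes remain

print(" ".join(f"{b:02X}" for b in table.leftover_x(7)))
```

Even though no sprite is alive at the end of the sketch, the seven bytes are still sitting in the table, which is the property the setup depends on. Spawning a new sprite afterwards would overwrite the byte in slot 10 first.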
Everything within the coding of the game is stored as numbers, so these bytes can be interpreted as different types of code (not just positional information) if the instruction pointer is forced to read them as such. To make the instruction pointer do this, a glitch called the ‘Item Swap Glitch’ is performed that allows Yoshi to eat a sprite that is normally not able to be eaten: the ‘Charging Chuck’ enemy. When Yoshi eats any item, the game attempts to jump to an assigned address in memory which corresponds to an executable command. Because it is usually impossible to eat a Charging Chuck, the address associated with it has nothing mapped in memory. If an action is unable to be completed because of an invalid address, the instruction pointer defaults to reading from what is called the Data Bus, which is a Memory Data Register that keeps track of every read and write value that happens within the game, including the string of hexadecimal values that have been set by the sprites’ positions. The complex process that happens next is nearly instantaneous. The game reads the bytes that represent the x-coordinates as instructions; these bytes become the payload, having been specifically placed to be read as commands that trigger the credits.
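The double life of these bytes, as positions and as instructions, can be illustrated with a very small decoder. This is a sketch under assumptions: the three opcode entries follow the standard 65C816 opcode table for the Super Nintendo's processor family, the accumulator is assumed to be in 8-bit mode (so the immediate operand is one byte), and the function name `disassemble` is hypothetical. The article does not give a disassembly of the payload, and the sketch makes no claim about what the resulting instructions do inside the game.

```python
# Tiny subset of 65C816 opcodes: opcode -> (mnemonic, operand format, operand size)
OPCODES = {
    0xA9: ("LDA", "#${:02X}", 1),   # load accumulator, immediate (8-bit mode assumed)
    0x92: ("STA", "(${:02X})", 1),  # store accumulator, direct-page indirect
    0x4C: ("JMP", "${:04X}", 2),    # jump, absolute address (little-endian operand)
}


def disassemble(data):
    """Read a byte string as instructions, using only the opcodes listed above."""
    lines = []
    pc = 0
    while pc < len(data):
        opcode = data[pc]
        entry = OPCODES.get(opcode)
        if entry is None or pc + entry[2] >= len(data):
            lines.append(f".db ${opcode:02X}")  # unknown or truncated: raw byte
            pc += 1
            continue
        mnemonic, fmt, size = entry
        operand = int.from_bytes(data[pc + 1 : pc + 1 + size], "little")
        lines.append(f"{mnemonic} {fmt.format(operand)}")
        pc += 1 + size
    return lines


PAYLOAD = bytes.fromhex("A9 1C 92 75 4C 46 FF")

print("as positions:   ", list(PAYLOAD))          # the same bytes as x-coordinates
print("as instructions:", disassemble(PAYLOAD))   # the same bytes as code
```

Nothing about the bytes themselves changes between the two printed lines; only the reading applied to them does, which is the point the paragraph above makes about the instruction pointer.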
There is one other major piece of setup for this exploit to work. The x-coordinates for the sprites serve as the executable instructions, but another value is used to direct the instruction pointer to those specific values. Dotsarecool breaks two bricks during the exploit’s setup. When a brick is broken, it shatters into small pieces that quickly disappear (see Figure 8). Small sprites, such as the pieces of broken bricks, are loaded into a different table than sprites like enemies and shells; however, the small sprites’ coordinates are stored in the same way. Even after they have despawned, they persist in a table, and in this case the broken bricks’ y-coordinates are used to direct where the game looks to for its instructions. The first brick’s y-coordinate will always be 00, but the value for the second broken brick can be any one of D8, DA, DC, DE, E0, or E3. Each of these values sends the instruction pointer to a different point in memory. While the payload itself may consist of the seven bytes: A9 1C 92 75 4C 46 FF, there is actually much more information about both the sprites’ x- and y-coordinates the game has stored. A slightly longer string of stored bytes related to sprite positional information would look like this: 60 30 F4 06 F0 4C 70 70 5E 70 6A

Figure 8. Small sprites of broken bricks appear briefly as Mario breaks a brick. Frame grab from Dotsarecool (2015) Super Mario World - 0-exit in 2:44. Available at: https://www.youtube.com/watch?v=NvwVdY8pf_E
This run sheds some light on the complicated processes that happen at the level of code at all times during gameplay outside of the player’s direct perceptions. The game’s code is made of numbers that, in turn, create actions at the layers of sensory output and game states. This is a small snapshot of the ways numbers move and transform themselves, becoming animated, based on the rules of the programming language, operator inputs, and the hardware the code operates on.
However, the starkest example of the ways code is animated comes from the four bytes that precede the payload. The instruction pointer first goes to bytes that seemingly have no function. Within fractions of a second, the game passes over them and launches the credits, but the game still attempts to interpret this small packet of information. A micro-temporal action happens; an animation happens. Though players do not directly perceive this, these actions are vitally important to the execution of the game. And players are aware of the motions that the system undergoes, evidenced by their manipulation of the game to produce fantastically precise results. The code is animated, even if nothing is instantiated on the screen, demonstrating that code is a unique layer of animation that is animated outside of its effect on other layers of the animation stack.
Layer 4: Material
Abstracted code necessarily has a material instantiation. As Matthew Kirschenbaum (2008: 61) notes, ‘while bits are the smallest symbolic units of computation, they are not the smallest inscribed unit.’ There are wires, circuits, and electrical impulses that travel through a game’s hardware that provide a platform for the code to operate on and in turn create the other animations of the game. The previous example of Super Mario World nicely demonstrates the ways that the materiality of videogames is its own layer of animation. Bytes of information were ignored by the instruction pointer, but an electrical impulse was still sent to where they were stored. Just because nothing is instantiated on screen or in the game state does not mean that electricity did not flow through circuits in the game’s hardware attempting to execute a coded program. These micro-temporal processes are not inconsequential. They are fundamental to videogame animation.
This is especially true for TASers. Every computational process takes time. Pokémon: Yellow, as already discussed, runs at 30 frames per second, but that is only the speed at which the audiovisuals and game state are updated. Videogames can often technically accept inputs at a much faster rate. In the case of the Game Boy Color running Pokémon: Yellow, thousands of inputs can be accepted per second. This, of course, is physically impossible for human players to input, but through a TAS these speeds can be achieved. While thousands of inputs per second seems extremely fast, there is still an upper limit to the number of inputs the game can accept that is based on both the computer’s ability to process data and the speed at which signals travel from interface to processor. The speed at which electrical currents can flow from the input device to the game’s processor becomes a major limiting factor on what can be achieved within the videogame, and consequently how a videogame is animated. When videogames are pushed to this extreme, it becomes clear that micro-temporal animations are vitally important, as every thousandth of a second of processing becomes central to the game’s ability to register the rapid-fire information coming from the inputs. The material layer, then, creates the layers stacked on top of it, while being its own distinct layer of animation composed of electrical currents running through wires and dancing across circuit boards.
Attention to the technical specs of hardware and environmental conditions surrounding the running of videogame systems reveals the importance of the material layer. At a panel presentation about speedrunners’ hardware at the 2019 summer version of the Games Done Quick charity event, TASer dwangoAC describes some of the technical specs of the Super Nintendo’s internal clocks: The Super Nintendo . . . had a 21 MHz main clock and a 24.576 MHz sound clock. The main clock is a quartz crystal. The sound processor is driven by a ceramic oscillator that varies over temperature, and is now 25 years old. And varies by age too. Which means that you can sometimes find consoles that are so out of spec, they play notes that are off key. (12:40–13:07)
The materiality of the console’s parts has direct impacts on the animations of the game’s sensory output to the point that animations change based on the material conditions of the equipment, which is affected by things like the temperature of the room. However, the issues with the sound processor do not just affect sound. DwangoAC (2019) goes on to explain: In the case of Super Metroid on the Super Nintendo it might affect you by taking longer to get through door transitions, because the main CPU is waiting for the sound processor to say ‘Hey, done processing the sample.’ Your clock is out of spec and it is running lower than it should. It’s going to take longer, and those frames add up over a run. (13:36–13:51)
Both the sensory output and the game state are affected and, because of this, the operator as well. The small, extra amount of time it takes for the animations of the electrical impulses to travel between CPU and sound processor can impact any player, not just speedrunners, but it is absolutely vital for TASers.
A TAS requires precise, predictable processes to work as it is a preprogrammed script. Anything that even slightly decouples the expected timing of the electrical impulses sent from the inputs to the processors, such as the sound clock with Super Metroid, causes what is called a desync. When a desync happens, the TAS script continues to run but the inputs are not perfectly aligned temporally with the game’s ability to meaningfully register them, usually resulting in the game exhibiting seemingly random behavior instead of the mind-bogglingly precise movements of most TAS runs. It is not, then, just electrical impulses that animate the other layers. It is precisely coordinated electrical impulses that require specific material instantiations to create meaningful animations in other layers from moment to moment.
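A desync of this kind can be sketched in a few lines. This is a deliberately simplified model under assumptions, not a description of any real playback device: the script is treated as one input per frame, the hypothetical `lag_frames` parameter marks frames on which the game is busy (for instance, waiting on a slow sound processor) and does not read the controller, and the playback side is blind, sending the next entry every frame regardless.

```python
def replay(script, lag_frames=frozenset()):
    """Return the inputs the game registers, in the order it registers them.

    script     -- one input per frame, as the TAS author intended
    lag_frames -- frames on which the (hypothetical) game does not read input
    """
    registered = []
    for frame, button in enumerate(script):
        if frame in lag_frames:
            continue               # the input was sent, but the game never saw it
        registered.append(button)
    return registered


def first_desync(script, registered):
    """Index of the first game step that received a different input than intended."""
    for step, (wanted, got) in enumerate(zip(script, registered)):
        if wanted != got:
            return step
    return None


script = ["RIGHT", "RIGHT", "JUMP", "RIGHT", "DOWN", "A"]

on_time = replay(script)                # console behaves as the author expected
drifted = replay(script, lag_frames={2})  # one unexpected lag frame

print(first_desync(script, on_time))
print(first_desync(script, drifted))
```

In the drifted case the script keeps running to its end, but from the lagged frame onward every input lands one game step away from where it was meant to, which is why a single slow frame can turn the rest of a run into seemingly random behavior.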
Layer 5: Operator
The inputs themselves become the final layer of animation. I have focused primarily on the videogame system, the hardware and software, up until now, but the videogame is not the only actor involved in creating the game’s animation. The player also takes part in animating. This phenomenon is not completely unique to videogames. Other forms of media like the DVD that serves as an interactive database cinema (Manovich, 2001) or the vast variety of ‘ergodic literature’ such as hypertext literature (Aarseth, 1997) involve the viewer in the creation of the text as well. In videogames, specifically, Alexander Galloway (2006: 5) delineates between two types of actions: machine acts and operator acts. This is a useful distinction to make as the operator’s input guides the ways animations occur within the game to a large degree. Galloway’s classification of gamic actions makes clear that the relative role the machine and operator play in generating action shifts from moment to moment. Take, for example, the cut-scene that is dominated by the machine, bringing a cinematic scene to life with no player input. On the other hand, guiding a player-character through an environment involves some level of operator actions regarding the path the player chooses. Even if the machine takes over all action at times, operators are still able to bring life to the game through their inputs at other times. I find the term operator to be especially apt when describing this layer of animation more broadly. TASs are programmed by humans, just as the videogames themselves are, but there is no human inputting the individual commands every time the TAS is run. ‘Player’ implies a human agent that is both playful in their interactions and actively and intentionally guides play, which is not necessarily the case. The presence of an operator, human or machine, that is able to input commands is all that is necessary to animate videogames.
Human videogame operators often input commands based upon their reactions to the sensory layer of the game, which is their only access to the game state and computational layers. For example, if a player sees an enemy standing in their way in Super Mario World, they may input commands to jump over the enemy. The animation caused by the operator’s actions is stacked on the visuals of the sensory layer, which the operator, in turn, interprets as an indication of an underlying game state (i.e. the visuals of an enemy correspond to an area that will cause damage to the player-character). Kristine Jørgensen (2013) explains this through the idea of a ‘gameworld interface’. For those literate in playing a game, the game world becomes an information-rich environment that gives players some access to the game’s underlying functions. Players understand the sensory output to adhere to a specific game state, which in turn is predicated on computational processes. There is a kind of feedback loop, or dynamic heterarchy, at play here. The layers are informed by one another and mutually determine the ways each operates.
However, when a TAS is run, the operator is no longer human and the inputs are not directly predicated on any of the other layers of animation. The TAS is a predetermined text file, a script, that directs specific inputs at precise times. During desyncs, for example, the operator continues inputting even if the inputs are nonsensical based on the other layers of animation.
Outside of TASs, human speedrunners often separate the animations of their inputs from the other layers as well because of the rote operations they execute based on their ‘attunement’ to the game (Ash, 2013). Take Dotsarecool’s Super Mario World speedrun for instance. The particular run described above was their 1044th attempt at completing that particular version of the run. The knowledge of executing it is not primarily based on what they see on the screen, which is consistent throughout attempts. They mechanistically execute the inputs again and again until they do not have to consciously think about the gameworld or the inputs; the run becomes a form of ‘automated play’ (Taylor and Elam, 2018) that decouples visual animations from the animations of inputs. Conversely, those who have no literacy in reading sensory output or using controllers (Marcotte, 2018) will not be able to animate the videogame with intention. Yet, similarly to desyncs, they can inexpertly input commands to bring a gameworld to life in a nonsensical dance that may become more intentional as the player becomes aware of the connections between their inputs and the resulting sensory output (Schmalzer, 2020).
The operator, whether human or machine, is itself animated and has a hand in animating the game, but the other layers are separate forms of animation that do not necessarily impact the operator’s animations. The operator’s inputs can bring the game to life, animating movement, but the operator may be indifferent (or oblivious) to the ways other layers are animated. This is yet another example of the ways animation in videogames is not solely oriented towards the human, as the human operator is not necessary for animation to occur.
Conclusion
TASs reveal something about the nature of videogames that typical gameplay practices obscure. Because sensory output is the only layer of animation produced by the videogame system that the operator has access to through normal gameplay, it typically hides other layers of animation in an attempt to create and animate a coherent fictional world. The audiovisuals of Pokémon: Yellow, for instance, create a vast land to travel across, where fantastic creatures roam, waiting to be tamed. The fictional world of Pokémon is not the only thing that is animated, even if that is what is implied by the standard metagame created by the videogame industry broadly.
The processes of animation outlined here are not solely in service of animating the fiction of the game. The hardware and software of the game itself are brought to life by becoming animated through the actions they perform. All the layers of animation are an important animation in their own right, even if players never perceive them and, through these layers, videogames become animated and reanimated during every moment of gameplay.
While sensory output provides a skin that, in tandem with the game state, emulates coherent logics such as three-dimensional space and consistent physics, obscuring the computational nature of games, the TASs I have analyzed here tear apart these stacks of animation. But TASs are not the only place these animations occur. TASs are, of course, gameplay practices and artifacts created with and surrounding the same videogames that non-speedrunners play. In terms of how they are animated, there is nothing special about, say, Pokémon: Yellow or MrWint’s TAS of it; they contain the same layers of animation as any other videogame. However, TASs’ ability to deconstruct videogame animations sheds light on the ways animation occurs both within, and well below, the perceptions of human players and viewers, giving an account of the multifaceted ways in which videogame animation moves.
Funding
The author received no financial support for the research, authorship, and publication of this article, and there is no conflict of interest.
