Abstract

Humans have used technology for millennia to enhance and extend their limited vision. Today, machine vision fulfills this promise, enabling people to see more, see differently, and even see everything. However, each form of vision has its own characteristics and limitations, resulting in blind spots and distortions. Rettberg’s (2023) new book delves into the relationship between humans and machine vision technologies from both technical and sociocultural perspectives. The book meticulously traces the evolution of machine vision, from early inventions like mirrors and lenses to television cameras and modern digital surveillance technologies. It highlights how technological advancements have continuously expanded human perceptual capabilities. Moreover, the book focuses on the profound impact of machine vision technologies on society and individuals, addressing ethical and social issues such as surveillance, privacy, security, and identity recognition.
The book comprises an introduction, five main chapters, and a conclusion. In each chapter, Rettberg employs a wide range of situations and stories, from art and games to films and fiction, as analytical tools. These examples provide emotionally rich and often visceral, embodied insights. Furthermore, Rettberg integrates diverse theoretical frameworks, including technical cognition theory, posthumanism, and cultural criticism, to deepen her analysis. This combination of case studies and theories highlights the book’s richness and interdisciplinary nature.
In the introduction, Rettberg presents critical concepts such as machine vision, which she defines as “the registration, analysis, and representation of visual information by machines and algorithms” (p. 3). This broad definition encompasses various types of machine vision, from recent developments to the early history of seeing with technologies. Accordingly, machine vision’s “seeing” is not biological but rather the result of an assemblage in which humans, technologies, and cultural contexts interact. Rettberg further distinguishes between representational and operational images: in the former, “the main point of the image is to show something,” whereas in the latter “the main point is to do something” (p. 16). As Laura Kurgan puts it, truth “is intimately related to resolution, to measurability, to the construction of a reliable algorithm for translating between representation and reality” (Kurgan 2013, 12–13). In this sense, contemporary machine vision undoubtedly belongs to the post-optical era.
The first chapter discusses how humans have used technology to see more. The main argument is that technology not only enhances our ability to access and interact with the world but also profoundly influences our understanding and perception of it. The chapter details the historical development of various visual technologies, including polished obsidian mirrors, the Renaissance invention of linear perspective, the camera obscura, and the creation of telescopes and photography. The author devotes considerable attention to reviewing the history of these technologies. This historical overview not only provides a foundation for understanding contemporary technologies, such as algorithmic image production, but also highlights the interactions between humans and technology. The chapter illustrates how machine vision technologies become part of an assemblage within different cultural contexts. As Rettberg aptly states, “Machine vision is about more than technology: it is an assemblage of human bodies, human culture, and technology” (p. 27).
The second chapter explores how humans use technology to see differently. It explains how machine vision technologies enable us to observe the world in various ways and how these variations in perception grant machines a form of agency. The chapter first draws on N. Katherine Hayles’s concept of technical cognition to discuss the autonomy of machine vision technologies. Then, by analyzing the work of the Soviet filmmaking collective the Kinoks and Flusser’s technical images, the author presents two different perspectives on the autonomy of cameras. The former emphasizes the agency of machines, portraying them as anthropomorphized beings that collaborate with humans in creation. The latter views technology as oppressive, with machines enslaving photographers and reducing them to mere operators. The author clearly states her position, asserting that “machine vision doesn’t control us absolutely” (p. 73). In her view, the relationship between humans and machines is not oppositional but rather an assemblage. Drawing on the differences between biosemiotics and cybersemiotics, the author then distinguishes how humans, other animals, and machines sense the world and create meaning differently.
The third chapter explores the dream of seeing everything through technology. It illustrates how people aspire to achieve an all-encompassing, objective perspective through surveillance technology and examines its impact on social trust. The chapter uses a real case in Oak Park, a suburb just outside Chicago, to illustrate the controversy over installing automated license plate readers (ALPRs). The case showcases the public’s desire to see everything through surveillance and their yearning for a sense of security. This chapter holds personal significance for the author, as the case coincided with her own experience of an unexpected assault. Continuing the discussion of the human-technological assemblage from the second chapter, the author further emphasizes the influence of specific sociocultural contexts on machine vision. Although surveillance technology is often seen as a promise of security, there is no clear evidence to suggest that machine vision can reduce crime. Meanwhile, the potential risks of privacy invasion and racism frequently trigger criticism. The author is also concerned that this reliance on technological solutions may further perpetuate a cycle of distrust and fear.
The fourth chapter focuses on how machine vision sees humans. This chapter introduces the concept of the algorithmic gaze and examines, through three case studies, how algorithms perceive us in ways different from human vision. The first case study discusses selfie filters and how biometrics and facial recognition algorithms interpret human faces. Machine learning has a normalizing effect: models make averaged predictions or inferences from their training datasets, thereby normalizing human facial features and behaviors. To some extent, this normalizing effect reflects and reinforces societal biases. The second case study explores how machine vision is used to automate grocery shopping, library access, and other interactions. This case demonstrates that the application of machine vision technologies in different social environments, such as in Norway and the United States, can lead to different outcomes. The final case study examines a fictional example of a benevolent AI dictator, Thunderhead, from Neal Shusterman’s series of novels. Although such novels showcase the protective and nurturing role of technology, familiar to teenagers growing up with artificial intelligence, the author expresses skepticism and concern about this utopian surveillance. The numerous case studies in this chapter further reinforce the author’s argument that machine vision should be seen as an assemblage.
The fifth chapter discusses the blind spots of machine vision. The author uses vivid examples to demonstrate how machine vision can be leveraged for oppression and control, as well as the possibilities for humans to resist it. Although machine vision is often perceived as capable of seeing everything, its design and the limitations of its training data introduce blind spots and flaws, preventing it from fully realizing this ideal. Through scenes from movies and science fiction, the author illustrates how humans employ various methods to trick and evade machine vision. Additionally, the chapter explores real-world examples of protesters, artists, and activists who use creative strategies to hide from facial recognition. The author also discusses algorithmic bias, highlighting how biases in training data lead to recognition errors and unfairness. These fictional and real-life examples inspire new ways of thinking about technology and society and anticipate the theme of the book’s conclusion, “Hope.”
In the conclusion, the author expresses the hope and future possibilities that machine vision brings. Rettberg emphasizes the complexity of the interaction between technology and humans, noting that despite issues of algorithmic bias and trust, technology can still foster sympathy and community. By using machine vision, humans can transcend the limitations of perception, view different aspects of the world, and better understand and reconnect with each other. Rettberg hopes that this book will help us reflect on the kinds of machine vision we want and the kinds of assemblages we want to be part of.
In the digital age, machine vision has deeply infiltrated people’s lives, significantly transforming the way we see the world. Rettberg’s book systematically demonstrates how machine vision technology reshapes our perception of the world. The author effectively integrates theoretical analysis with practical applications, exploring the potential and challenges of machine vision, which makes the content both practical and relevant. However, the book contains some technical jargon and details that may be challenging for non-technical readers. Additionally, owing to her background, Rettberg confines her cultural comparisons to Western cultures, which leaves the role of cultural differences underexplored. For example, people living in an Eastern collectivist context might be more inclined to view surveillance cameras as a form of state protection and maintenance of collective interests. This perspective contrasts sharply with the more individualistic viewpoint prevalent in Western cultures, where surveillance is often seen as an invasion of privacy and a potential threat to personal freedom. As a Chinese reader, the reviewer feels a sense of missed cultural dialogue. Overall, Rettberg’s book is a deep and comprehensive study that provides rich information and diverse perspectives on machine vision technology, and it is likely to inspire scholars from various disciplines, including media, art, policy, and marketing.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Humanity and Social Science Youth Foundation of the Ministry of Education of China (No. 23YJC860039).
