Generative AI

Abstract

Ever since computers were invented as devices for accessing and processing digital information some 75 or more years ago, there has been endless speculation about how contemporary society is embedding and absorbing them into its social and economic fabric, particularly into cities (Batty, 2024a; Meier, 1962). New concepts and techniques are continually being developed and defined, but there is little sign of any convergence in the conceptual structure and the language of IT onto a stable platform, apart from the fact that the basic notion of representing many aspects of our world in terms of zeros and ones, the binary code, remains inviolate. At the onset of the industrial revolution some 250 years ago, it was assumed that the move to mechanical automation represented a clean break with the analogue world of the pre-industrial past, but since then there have been successive revolutions derived from a line of new technologies (electrical, digital, genetic, bioinformatic, and so on) which are getting closer and closer to one another, with many of their features beginning to blur and merge (Batty, 2024a). The implication is that new technologies are now invented so quickly that it is no longer useful to separate them from one another. They crowd into each other, disrupting what already exists, only for many of them to come back to regenerate and rekindle what has already been invented. The world of IT is thus increasingly entangled with layer upon layer of related but different technologies.

The latest wave, which threatens to overwhelm everything else, involves AI, artificial intelligence. Although AI was there at the very beginning of digital computation, articulated amongst others by Alan Turing in his seminal 1950 paper ‘Computing machinery and intelligence’, for a long time it remained a background concept which a small group of scientists argued had the potential to automate methods of human reasoning. The idea that one could program computers to mimic human intelligence was always a possibility, and various steps were made in the intervening years to demonstrate how symbolic reasoning, rule-based systems, pattern recognition, and a host of other formalised logics could be programmed into machines. Amongst many other things, computers and their software evolved to ‘play a good game of chess’, and the best methods showed that some of the most basic actions of a brain could be simulated using analogies with neural nets and related forms of what is still called connectionism (Rumelhart et al., 1986).

AI would never have taken off in the way it is now portrayed had it not been for the massive increases in computing power, with speed, memory, and cost changing at exponential and even super-exponential rates, as reflected in Moore’s (1965) Law. This enabled extremely rapid processing of enormous data bases, such as those that characterise the World Wide Web and related systems. Big data is the driving force of AI, and this is currently changing the nature of science in general in that there are now many systems that have the potential to discover meaningful patterns in such data. These systems are moving science from being theory-led to data-led, from deduction to induction, although this does not mean that theory is being dispensed with, only that new ideas can emerge and converge from either or both of these directions. The widely quoted paper by Anderson (2008) entitled ‘The end of theory’, written some 15 or more years ago, heralded a world where theory could be dispensed with on the assumption that machines would discover much more meaningful explanations than the sorts of human reflection that had characterised science hitherto. Of course, any evaluation of the history of science quickly reveals that this is manifestly false, for to explain the world one always needs prior theory, even if big data provides some of the sparks that direct our attention towards meaningful patterns.

When computers first emerged, the notion that computation could be reduced to binary code set the tone for applications based on simple rule-based algorithms, involving the idea that many problems could be cast as sequences of conditional logics. As we have implied, this was almost immediately directed towards games of strategy, with checkers and chess becoming favourites, and the notion of exploring the space in which optimal solutions to such games were played out dominated the earliest AI. Optimisation in formal terms also developed alongside strategies for defining solutions in terms of costs, benefits, utilities, and similar measures of optimality, but severe limits began to emerge as to the extent to which such solution spaces could yield the best answers. These spaces remain, we think, beyond our abilities to define them in that they are simply too big. Although these early ideas elevated the role of symbology rather than mathematical transformations as being central to AI, progress was slow, and different paradigms were needed to advance the field. The idea that we could ever chart the combinatorial explosion that came with every characterisation of where a solution might lie simply daunted our abilities to make progress with this kind of AI. It is still an open question as to whether we will ever be able to chart these limits.
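The scale of this combinatorial explosion can be illustrated with a rough calculation. The sketch below assumes a uniform game tree in which every position offers the same number of moves; the figures of about 35 moves per position and 80 plies per game are commonly quoted rough estimates for chess rather than values taken from this editorial, and the function names are hypothetical.

```python
def game_tree_leaves(branching_factor: int, depth: int) -> int:
    """Number of leaf positions in a uniform game tree: b ** d."""
    return branching_factor ** depth


def order_of_magnitude(n: int) -> int:
    """Largest power of ten not exceeding n (for n >= 1)."""
    return len(str(n)) - 1


# Assumed, illustrative figures for chess (rough estimates only).
CHESS_BRANCHING = 35
CHESS_PLIES = 80

if __name__ == "__main__":
    for plies in (2, 4, 10, CHESS_PLIES):
        leaves = game_tree_leaves(CHESS_BRANCHING, plies)
        print(f"{plies:>3} plies: roughly 10^{order_of_magnitude(leaves)} positions")
```

Even under these crude assumptions, the count of positions grows by more than three orders of magnitude for every two plies added, which is why exhaustive search is out of reach for all but the shallowest look-ahead.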

In fact, some of the ideas from these first years lay dormant during the 1980s when AI entered what came to be referred to as an ‘AI winter’, a term coined by analogy with a nuclear winter. Simpler approaches based on more pragmatic ideas, such as expert systems, thus became popular. The idea that computers could be likened to brains had nevertheless been popular ever since it was suggested that many elements defined as processing units might constitute the appropriate symbology for working out how brains might function. This involved continued transformation of the values associated with processing units in the form of what came to be called neural nets. These were developed by McCulloch and Pitts as early as 1943 and then by Rosenblatt (1958) in what he called a ‘perceptron’, which he argued should be the basic building block for how brains might process inputs in sequential form, akin to the way it was then thought the brain operates. Rosenblatt’s perceptron, however, was largely a deterministic formulation of how a set of symbolic instructions might learn. His work, which was extremely promising, was cut short by his untimely death, and the field assumed that his articulation of such networks for encoding intelligence was a dead end (Minsky and Papert, 1972).
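The learning rule usually associated with the perceptron can be sketched in a few lines. What follows is the simplified textbook version, a single threshold unit with an error-correction update, and not a reconstruction of Rosenblatt’s original formulation; the function names, learning rate, epoch limit, and training data are illustrative assumptions. The unit learns the logical AND function, which is linearly separable, but cannot learn exclusive-or (XOR), which is the kind of limitation emphasised by Minsky and Papert.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[int, ...], int]

AND_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(weights: Sequence[float], bias: float, x: Sequence[int]) -> int:
    """Threshold unit: output 1 if the weighted sum of inputs exceeds zero."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train_perceptron(samples: List[Sample], epochs: int = 20,
                     learning_rate: float = 1.0) -> Tuple[List[float], float]:
    """Error-correction rule: nudge weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # every sample classified correctly, so stop early
            break
    return weights, bias


if __name__ == "__main__":
    for name, data in (("AND", AND_DATA), ("XOR", XOR_DATA)):
        w, b = train_perceptron(data)
        learned = all(predict(w, b, x) == t for x, t in data)
        print(f"{name}: learned correctly = {learned}")
```

No choice of weights for a single threshold unit can separate the XOR cases, so the second run fails however many epochs are allowed.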

One of the key features which now defines AI is the notion that there are many different versions. It is surprising that these different features have not been more widely discussed, but this is as much because AI is potentially so broad a concept that it is hard to define; there is no real agreement over what these different features are, largely because one can approach AI in terms of the substantive problems that might be addressed in contrast to the methods that might be used. That these differences have not been writ large implies a degree of confusion about their different meanings. As we argue here, AI is essentially a wide array of contemporary information technologies dealing with different symbology, models, data bases, and so on; in short, AI covers many different variants of IT, weak and strong, general purpose and specific, broad and narrow. We can use several different dichotomies to focus the debate. One only has to examine current discussion in the serious but popular scientific press as well as the more general hype in public commentaries to realise how confusing it has become. Aaron Wildavsky (1973), one of the most perceptive policy analysts, commenting on urban planning 50 years ago, said: ‘If planning is everything, maybe it’s nothing’. We could use the same phraseology for AI when we now say: ‘If AI is everything, maybe it’s nothing’. This is not meant to be a scurrilous play on words. It is a significant point in gauging the depth of new ideas and where they might lead us.

Explaining the world and changing it through design using AI is very different from methods and models that seek to explain as much as possible. Take chess, for example. Despite the development of computers that can engage in massive search of the potential solution space, there is no guarantee that a best strategy for generating solutions can be found. This is notwithstanding that there are now many examples where computers can always beat the best chess grandmasters, but there is no closure for problems like this. Many of the earliest applications of AI, based on the most rudimentary of neural nets, games of strategy, and AI methods, formalise the solution process as one in which elements of the solution are invented rather than discovered as the problem-solving progresses. Most AI models in this sense generate solutions to problems and they can do this by continually adding to the solution space. This is best seen in connectionist systems, in neural nets where we do not know the architecture of what is related to what before we attempt to model the problem. We can add new layers of unknown elements to the network as we proceed and, in this way, begin to approach a better explanation of the problem under scrutiny: this is the way in which neural nets are used to provide an architecture for deep learning, an architecture which can have as many layers of hidden nodes and their connections as we consider intuitively useful.
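The effect of adding a hidden layer can be sketched with a small forward pass. The weights below are hand-chosen for illustration rather than learned, and the names are hypothetical; the point is only that a list of layers can be extended at will, and that one hidden layer is already enough to compute exclusive-or (XOR), a function that no single threshold unit can represent.

```python
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights per unit, bias per unit)


def step(z: float) -> int:
    """Threshold activation: fire if the input is positive."""
    return 1 if z > 0 else 0


def apply_layer(inputs: Sequence[float], layer: Layer) -> List[int]:
    """Compute the output of every unit in one layer."""
    weights, biases = layer
    return [step(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]


def forward(inputs: Sequence[float], layers: List[Layer]) -> List[int]:
    """Pass the inputs through each layer in turn; more layers can be appended."""
    activations: List[float] = list(inputs)
    for layer in layers:
        activations = apply_layer(activations, layer)
    return list(activations)


# Hidden unit 0 behaves like OR, hidden unit 1 like AND (hand-set weights).
HIDDEN: Layer = ([[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5])
# Output fires when OR is on but AND is off, which is XOR.
OUTPUT: Layer = ([[1.0, -1.0]], [-0.5])
XOR_NET: List[Layer] = [HIDDEN, OUTPUT]

if __name__ == "__main__":
    for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(pair, "->", forward(pair, XOR_NET)[0])
```

In deep learning the weights are of course learned from data rather than set by hand, and the layers are far wider and more numerous than in this sketch.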

In fact, we can adapt these AI models to systems that enable solutions to multivariate problems to be sought in much the same way as we use statistical analysis to find solutions to models that optimise goodness of fit to data. These models are often called discriminative in that they attempt to explain everything we observe, whereas the class of models that seeks to explain by adding new data is called generative. The key difference is that the solution space for generative models is not known in advance, and we have to invent it as we go along. This makes generative models much more suitable for problems where invention rather than discovery is of the essence and, in this context, such AI appears much more suited to problems of design: in our context here, inventing future forms for cities which optimise various criteria rather than explaining forms that already exist.
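The contrast can be made concrete with a toy example. The sketch below follows standard machine-learning usage, which is narrower than the sense used in this editorial: a discriminative model learns only a boundary between labelled classes, while a generative model learns a distribution from which new, unobserved cases can be drawn. The data values, the one-dimensional setting, the Gaussian assumption, and all function names are hypothetical.

```python
import random
import statistics

# Hypothetical one-dimensional observations for two classes.
SMALL = [0.8, 1.0, 1.1, 1.2]
LARGE = [2.9, 3.0, 3.1, 3.3]
LABELLED = [(x, 0) for x in SMALL] + [(x, 1) for x in LARGE]


def fit_threshold(samples):
    """Discriminative: pick the cut point that misclassifies the fewest samples.

    The rule is 'predict class 1 if x > threshold'. Nothing is learned about
    how the observations themselves arise.
    """
    xs = sorted(x for x, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(threshold):
        return sum((x > threshold) != (label == 1) for x, label in samples)

    return min(candidates, key=errors)


def classify(x, threshold):
    """Apply the learned boundary to a new observation."""
    return 1 if x > threshold else 0


def fit_gaussian(values):
    """Generative: summarise the observations by a mean and standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_new(values, count, rng):
    """Draw new values from the fitted (assumed Gaussian) distribution."""
    mean, spread = fit_gaussian(values)
    return [rng.gauss(mean, spread) for _ in range(count)]


if __name__ == "__main__":
    threshold = fit_threshold(LABELLED)
    print("class of 1.05:", classify(1.05, threshold))
    print("class of 3.20:", classify(3.20, threshold))
    print("invented values:", sample_new(LARGE, 3, random.Random(0)))
```

The first model can only sort cases into classes it has already seen, whereas the second can propose cases that were never observed, which is the property that matters for design.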

In some senses, it might be argued that all AI models are generative. They all admit the possibility of inventing solutions. For example, Rosenblatt’s perceptron is a neural net which did not get much beyond the idea of explaining how neurons fired to produce specific outcomes, and it took developments in deep learning to move these structures to the point where they were truly adaptive and generative. This came first in pattern recognition, using the famous example of building a neural net with enough hidden layers so that we can provide the net with images such as, say, ‘dogs’ with a focus on explaining outcomes such as ‘cats’. In short, when the observer takes an image (such as a dog), the AI attempts to explain, classify even, whether the outcome is a cat. To an extent this is discriminative AI, although there are in principle no limits on the data that are provided to classify the outcomes correctly. Until about a decade or so ago, most AI problems that used neural nets to build an architecture of connections that would explain these kinds of outcome were dominated by image processing. Since then, the field has exploded, with applications in text and even sound extending it.

The focus on text has been boosted by the development of massive data bases, the easiest of which to appreciate is the data associated with the World Wide Web. Of course, Google and the early search engines focussed on organising such data so that it could become accessible to powerful algorithms that could extract meaning from it, but in the last 5–10 years these have led quite quickly to what are now called large language models (LLMs). These combine such data with neural-network-like architectures which enable models to be built that generate different but meaningful outcomes, which can be regarded as solutions to the problem of generating detailed answers to elaborate queries. In short, the models that have emerged, such as ChatGPT and similar LLMs, are now providing researchers and practitioners in many fields with new ways of making predictions and designs. The classic examples are the construction of text which meets high levels of integrity and consistency. Such systems are now able to ‘write a good essay’, with their outputs using neural net models that are large enough to embrace all the features that define what constitutes good practice. These reflect standards of plausibility and consistency, notwithstanding that such models can at times be badly wrong, hallucinating and drawing completely incorrect conclusions from the corpus of text assembled.
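The generative step can be caricatured with a toy next-word model. This is a deliberately crude sketch based on counting which word follows which in a tiny corpus; it is not a neural network and not how ChatGPT or any real LLM is implemented. The corpus, the function names, and the random seed are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical miniature corpus standing in for the text of the web.
CORPUS = "the city grows and the city changes and the model learns"


def build_followers(text):
    """Record, for every word, the words observed to follow it."""
    words = text.split()
    followers = defaultdict(list)
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return followers


def generate_text(followers, start, max_words, rng):
    """Extend the text one word at a time by sampling an observed follower."""
    output = [start]
    while len(output) < max_words:
        options = followers.get(output[-1])
        if not options:  # the last word was never followed by anything
            break
        output.append(rng.choice(options))
    return output


if __name__ == "__main__":
    followers = build_followers(CORPUS)
    print(" ".join(generate_text(followers, "the", 8, random.Random(0))))
```

Each generated sentence is locally plausible, because every adjacent pair of words was seen in the corpus, yet nothing guarantees that the whole is true or sensible, which gives some intuition for why much larger models can still produce confident errors.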

The problem of defining different types of AI is closely linked to the different types of application area to which such systems can be applied (Wolfram, 2023). In short, there are almost as many variants of AI as there are distinct problem areas that appear relevant to models such as ChatGPT. In this sense, the ground rules for defining appropriate AI are rather loose and it is becoming ever clearer that domain knowledge is critical to successful applications of such AI technology. Many applications to date focus on methods rather than substance and these technologies can only improve when different problems are addressed. This editorial is called ‘Generative AI’ because, as far as we can see, most AI is generative in the sense that it is open-ended. In some respects, all simulation and modelling using digital as well as analogue computers is generative too, as there is always scope to include features in the model that will improve its performance. To an extent, the narrower forms of AI (discriminative, so-called) are closed systems much more akin to statistical models that attempt to explain everything, but most of the focus at present on using ChatGPT is generative, producing outcomes, predictions, ideas, solutions, and designs that have not been developed before and are unique to each application.

With respect to the focus in this journal, generative AI is uniquely suited to designing solutions which improve the human condition, in our own case the quality of life, the sustainability and the prosperity associated with cities. In a previous editorial (Batty, 2024b), I sketched out how generative AI was an early theme in the development of configurational statistics, shape grammars, design methods, pattern languages, and related optimisation models which we published in this journal. In fact, this is likely to herald a revival in ideas about design coming from this area in the next decade. There are various articles in the works and as ever, the journal welcomes contributions that enable us to pursue this research and its applications.

References

Anderson (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine. June 23, 2008. https://www.wired.com/2008/06/pb-theory/

Batty (2024a) The Computable City: Histories, Technologies, Stories, Predictions. Cambridge MA: The MIT Press.

Batty (2024b) AI and design. Environment and Planning B: Urban Analytics and City Science 51(4): 799–802. DOI: 10.1177/2399808324123661.

McCulloch and Pitts (1943) A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–133.

Meier (1962) A Communications Theory of Urban Growth. Cambridge MA: The MIT Press.

Minsky and Papert (1972) Perceptrons: An Introduction to Computational Geometry. Cambridge MA: The MIT Press.

Moore (1965) Cramming more components onto integrated circuits. Electronics Magazine 38(8): 114–117.

Rosenblatt (1958) The Perceptron: A Theory of Statistical Separability in Cognitive Systems. Buffalo, NY: Cornell Aeronautical Laboratory, Inc. Rep. No. VG-1196-G-1.

Rumelhart, Hinton and Williams (1986) Learning representations by back-propagating errors. Nature 323: 533–536.

Turing (1950) Computing machinery and intelligence. Mind 59(236): 433–460. DOI: 10.1093/mind/LIX.236.433.

Wildavsky (1973) If planning is everything, maybe it’s nothing. Policy Sciences 4: 127–153.

Wolfram (2023) What Is ChatGPT Doing ... and Why Does It Work? Urbana-Champaign, Illinois: Wolfram Media Inc.