Intelligent Machines: a brief history (Parts 1-3)

Below is a series of three blogs (part 1, part 2, part 3) I wrote for Autonomy last year on the history of intelligent machines. This serves as an introduction to anyone curious about artificial intelligence and how it might shape the future of digital automation in work and society more generally.

Introduction

The notion of what constitutes intelligence, and therefore what constitutes an intelligent machine, has been widely debated throughout the history of Western thought. Descartes’ mind-body dualism, Marx’s humanist distinction between the intentionality of an architect and the functionality of a bee, and Allen Newell and Herbert Simon’s ‘Physical Symbol System’ hypothesis, which argued that a physical symbol system “has the necessary and sufficient means for general intelligent action”, are just a few examples. Stories of something approximating an intelligent machine go back to the eighth century BCE in Homer’s Iliad. These self-moving machines or ‘automata’ were made by Hephaestus, the god of smithing, and were servants “made of gold, which seemed like living maidens. In their hearts there is intelligence, and they have voice and vigour”.[i] In De Motu Animalium, Aristotle essentially conceived of planning as information-processing.[ii] In developing ontology and epistemology he also arguably provided the bases of the representation schemes that have long been central to AI.[iii] The first edition of Russell and Norvig’s famous text Artificial Intelligence: A Modern Approach [iv] even shows the notation of Alice in Wonderland author Lewis Carroll[v] on Aristotle’s theory of the syllogism – the basis for logic-based AI – on the cover.

From Descartes to Turing

The idea that we can test machinic intelligence is nearly as old as the concept of intelligent machines. Writing in 1637, Descartes proposed two differences that distinguish human from machine in a way that is much more demanding than the Turing Test (see below):

“If there were machines which bore a resemblance to our body and imitated our actions as far as it was morally possible to do so, we should always have two very certain tests by which to recognise that, for all that, they were not real men”.[vi]

The first test imagines a machine’s “being” established such that it can “utter words, and even emit some responses to action on it of a corporeal kind, which brings about a change in its organs”. However, this machine cannot yet fully produce speech such that it could “reply appropriately to everything that may be said in its presence”. These are essentially the criteria for many contemporary artificial intelligences. The second test concerns situations in which machines can “perform certain things as well as or perhaps better than any of us can do”, yet fall short in others, which means that they do not “act from knowledge”, but rather only from “the disposition of their organs”. An intelligent machine can only pass both of Descartes’ tests if its functionality goes beyond a narrowly defined intelligence, such that it has the capacity for knowledge. It must understand any given question well enough to answer beyond programmed responses. This leads Descartes to the conclusion that it is “impossible that there should be sufficient diversity in any machine to allow it to act in all the events of life in the same way as our reason causes us to act”[vii].

Intelligent machines that approximate human understanding have yet to be produced. However, intelligent machines of a narrower type have existed – first virtually, then in reality – since Charles Babbage’s Analytical Engine of 1834. This machine was designed to use punch cards (an early means of feeding instructions and data to a machine) and could perform operations based on the mathematization of first-order logic. The Countess of Lovelace, Ada Byron King – popularly known as Ada Lovelace – worked with Babbage and prophesied the implications of the algorithms that underpinned it. We can think of algorithms as a type of virtual machine or an “information-processing system that the programmer has in mind when writing a program, and that people have in mind when using it”[viii]. Ada Lovelace theorised virtual machines that formed the foundations of modern computing, including stored programs, feedback loops and bugs, among other things. She also recognised the potential generality of such a machine to represent nearly “all subjects in the universe”, predicting that a machine “might compose elaborate and scientific pieces of music of any degree of complexity or extent”, though she could not say how[ix].

Advancements in mathematics and logic allowed for a breakthrough in 1936, when Alan Turing showed that every possible computation can in principle be performed by a mathematical system. This is now called a Universal Turing Machine[x]. Turing spent the next decade codebreaking at Bletchley Park during World War II and thinking about how this virtual machine could be turned into an actual physical machine. He helped design the first modern computer, which was completed in Manchester in 1948. Turing is usually credited with providing the theoretical break that led to modern computation and AI. In an unpublished paper from 1947, Turing discussed “intelligent machines”. A few years later, in 1950, he published his famous paper in which he asks, “Can machines think?” and argues that machines are capable of intelligence. To make his case, he first constructs an “imitation game”, now known as the “Turing Test”, which continues to influence popular debates about AI[xi]. The test involves three people – a man (A) and a woman (B) who communicate through typescript with an interrogator (C) in a separate room. The interrogator aims to determine which of the other two is the man and which is the woman. Turing argues that the question “What will happen when a machine takes the part of A in this game?” should replace the original question “Can machines think?”. A failure to distinguish between machine and human indicates the intelligence of the machine. Turing then goes on to consider nine different objections, which form the classical criticisms of artificial intelligence. One of the most enduring is ‘Lady Lovelace’s Objection’, in which she argues that the Analytical Engine has “no pretensions to originate anything. It can do whatever we know how to order it to perform”[xii]. However, contemporary “expert systems” and “evolutionary” AI have reached conclusions unanticipated by their designers[xiii]. Interestingly, a machine with a set of responses that happen to perfectly fit the questions asked by a human would pass a Turing test, but not pass Descartes’ test.

From Russell to MINDER

Following the innovations of Turing and Lovelace, the advancement of intelligent machines picks up speed from the 1950s into the 1970s, due in large part to three developments: Turing’s work, Bertrand Russell’s propositional logic and Charles Sherrington’s theory of neural synapses. In a famous paper titled “A Logical Calculus of the Ideas Immanent in Nervous Activity,” the neurologist and psychiatrist Warren McCulloch and the mathematician Walter Pitts combined the binary systems of Turing, Russell and Sherrington by mapping the 0/1 of individual states in Turing machines onto the true/false values of Russell’s logic, and onto the on/off activity of Sherrington’s brain cells.[xiv] During this time a number of different proto-intelligent machines were built. For example, a Logic Theory Machine proved eighteen of Russell’s key logical theorems and even improved on one of them. There was also the General Problem Solver (GPS), which could apply a set of computations to any problem that could be represented according to specific categories of goals, sub-goals, actions and operators.[xv] At the time, these intelligent machines relied almost exclusively on formal logic and representation, which dominated the early development of computing. Margaret Boden terms this type of artificial intelligence “Good Old-Fashioned AI” or GOFAI.
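
The McCulloch and Pitts mapping described above can be made concrete with a small sketch. What follows is only an illustration under simplifying assumptions, not a reconstruction of the 1943 paper’s notation: the function names (mp_unit, logical_and, logical_or, logical_not) are hypothetical, and a unit is reduced to a threshold on its excitatory inputs plus an absolute veto from any active inhibitory input.

```python
def mp_unit(excitatory, threshold, inhibitory=()):
    """Simplified McCulloch-Pitts style unit (illustrative assumption).

    Inputs and output are 0 or 1. The unit stays off if any inhibitory
    input is on; otherwise it turns on when enough excitatory inputs are on.
    """
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def logical_and(a, b):
    # Fires only when both inputs are on (threshold 2).
    return mp_unit([a, b], threshold=2)


def logical_or(a, b):
    # Fires when at least one input is on (threshold 1).
    return mp_unit([a, b], threshold=1)


def logical_not(a):
    # Fires by default (threshold 0) unless the inhibitory input is on.
    return mp_unit([], threshold=0, inhibitory=[a])


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
    print("NOT 0:", logical_not(0), "NOT 1:", logical_not(1))
```

Read this way, a single on/off unit behaves like a true/false connective in propositional logic, which is the sense in which the three binary systems line up.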

The binary systems synthesised by McCulloch and Pitts helped to catalyse the embryonic cybernetics movement, which emerged alongside the symbolic/representational paradigm discussed above. The term “cybernetics” was coined in 1948 by Norbert Wiener, an MIT mathematician and engineer who developed some of the first automatic systems. Wiener defined cybernetics as “the study of control and communication in the animal and the machine.”[xvi] Cyberneticians examined a variety of phenomena related to nature and technology including autonomous thought, biological self-organisation, autopoiesis and human behaviour. The driving idea behind cybernetics was the feedback loop or “circular causation”, which allows a system to make continual adjustments to itself based on the aim it was programmed to achieve. Such cybernetic insights were later applied to social phenomena by Stafford Beer to model management processes, among others. Wiener and Beer’s insights were used in Project Cybersyn – a pathbreaking method of managing and planning the Chilean national economy under the presidency of Salvador Allende from 1971-73.[xvii] However, as AI gained increasing attention from the public and government funding bodies, a split began to emerge between two paradigms – the symbolic/representational paradigm which studied mind and the cybernetic/connectionist paradigm which studied life itself. The symbolic/representational paradigm came to dominate the field.
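
The feedback loop at the heart of cybernetics can be sketched in a few lines. This is a minimal illustration assuming a simple proportional correction, not a model of any system Wiener or Beer actually built; the names feedback_step, run_loop and gain are hypothetical.

```python
def feedback_step(state, target, gain=0.5):
    """One pass around the loop: measure the error, then correct part of it."""
    error = target - state          # how far the system is from its aim
    return state + gain * error     # adjust in proportion to the error


def run_loop(state, target, steps=20, gain=0.5):
    """Repeat the feedback step and record the state after each adjustment."""
    history = [state]
    for _ in range(steps):
        state = feedback_step(state, target, gain)
        history.append(state)
    return history


if __name__ == "__main__":
    # Example: a room at 10 degrees with a target of 20 degrees.
    for step, value in enumerate(run_loop(10.0, 20.0, steps=8)):
        print(step, round(value, 3))
```

Each output of the system becomes the input to its next adjustment, so the gap between the current state and the aim shrinks with every pass. That circularity is what “circular causation” refers to.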

There were numerous theoretical and technological developments from the 1960s through to the present that provided the foundations for the range of intelligent machines that we rely on today. One of the most important was the re-emergence in 1986 of parallel distributed processing, which formed the basis for artificial neural networks, a type of computing that loosely mimics the workings of the brain. Artificial neural networks are composed of many interconnected units that are each capable of computing one thing; but instead of executing sequential, top-down instructions given by formal logic, they use a huge number of parallel processes, controlled from the bottom up based on probabilistic inference. They are the basis for what is called “deep learning” today. “Deep learning” uses multi-layer networks and algorithms that trace an error back to the connections responsible for it, thus allowing the network to adapt and improve itself. Another important development was Rosalind Picard’s ground-breaking work on “affective computing”, which inaugurated the study of human emotion and artificial intelligence in the late 1990s.[xviii] Marvin Minsky also influenced the incorporation of emotion into AI in considering the mind as a whole, inspiring Aaron Sloman’s MINDER program in the late 1990s.[xix] MINDER indicates some ways in which emotions can control behaviour by scheduling competing motives. Their approaches also inspired more recent hybrid models of machine consciousness such as LIDA (Learning Intelligent Distribution Agent), developed by researchers led by Stan Franklin.[xx]

What puts the ‘intelligence’ in Artificial Intelligence?

Today there are many different kinds of intelligent machines, with many different applications. The study of intelligent machines was essentially rebranded as “artificial intelligence” in a 1955 proposal for a conference held at Dartmouth College during the summer of 1956.[xxi] In the proposal, the authors state that “a truly intelligent machine will carry out activities which may best be described as self-improvement”.[xxii] However, a single definition of artificial intelligence is difficult to adhere to, especially in a field rife with debate. For perspective, Legg and Hutter provide over seventy different definitions of the term.[xxiii] It has been variously described as the “art of creating machines that perform functions that require intelligence when performed by people”,[xxiv] as well as “the branch of computer science that is concerned with the automation of intelligent behaviour”.[xxv] One of the best definitions comes from the highly influential philosopher and computer scientist Margaret Boden: “Artificial intelligence (AI) seeks to make computers do the sorts of things that minds can do”.[xxvi] Within this definition, Boden (2016, p. 6) classifies five major types of AI, each with its own variations. The first is classical, or symbolic, “Good Old-Fashioned AI” (GOFAI, mentioned above), which can model learning, planning and reasoning based on logic; the second is artificial neural networks or connectionism, which can model aspects of the brain, recognise patterns in data and facilitate “deep learning”; the third type of AI is evolutionary programming, which models biological evolution and brain development; the last two types, cellular automata and dynamical systems, are used to model development in living organisms.

None of these types of AI can currently approximate anything close to human intelligence in terms of general cognitive capacities. A human level of AI is usually referred to as artificial general intelligence or AGI. AGIs should be capable of solving complex problems across many different domains, with the ability to exercise autonomous control and with their own thoughts, worries, feelings, strengths, weaknesses and predispositions (Goertzel and Pennachin, 2007). The only AI that exists right now is of a narrower type (often called artificial narrow intelligence or ANI), in that its intelligence is generally limited to the frame in which it is programmed. Some intelligent machines can currently evolve autonomously through deep learning, but these are still a weak form of AI relative to human cognition. In an influential essay from 1980, John Searle makes the distinction between “weak” and “strong” AI. This distinction is useful in understanding the current capacities of AI versus AGI. For weak AI, “the principal value of the computer in the study of the mind is that it gives us a very powerful tool”; while for strong AI “the appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states”.[xxvii] For strong AI, the programs are not merely tools that enable humans to develop explanations of cognition; the programs themselves are essentially the same as human cognition.

The Prospect of General Intelligence

While we currently do not have AGI, investment in ANI is only increasing and will have a significant impact on scientific and commercial development. These narrow intelligences are very powerful, able to perform a huge number of computations that would in some cases take humans multiple lifetimes. For example, some computers can beat world champions in popular games of creative reasoning such as chess (IBM’s Deep Blue in 1997), Jeopardy! (IBM’s Watson in 2011), and Go (Google’s AlphaGo in 2016). The Organisation for Economic Co-operation and Development (OECD) found that the share of worldwide private equity investment going to AI start-ups increased from just 3% in 2011 to roughly 12% in 2018.[xxviii] Germany is planning to invest €3 billion in AI research between now and 2025 to help implement its national AI strategy (“AI Made in Germany”), while the UK has a thriving AI start-up scene and £1 billion of government support.[xxix] The USA had US$5 billion of AI investments by VCs in 2017 and US$8 billion in 2018.[xxx] The heavy investment in ANI start-ups and the extremely high valuations of some of the leading tech companies funding AGI research might lead to an artificial general intelligence in the coming years.

Achieving an artificial general intelligence could be a watershed moment for humanity and allow for complex problems to be solved at a scale once unimaginable. However, the rise of AGI comes with significant ethical issues and there is a debate as to whether AGI would be a benevolent or malevolent force in relation to humanity. There are also people who fear such developments could lead to an artificial super intelligence (ASI), which would be “much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills”.[xxxi] With an increasingly connected world (referred to as the internet of things), artificial super intelligences could potentially “cause human extinction in the course of optimizing the Earth for their goals”.[xxxii] It is important, therefore, that humans remain in control of our technologies to use them for social good. As Stephen Hawking noted in 2016, “The rise of powerful AI will be either the best or the worst thing ever to happen to humanity. We do not yet know which”.

Endnotes

[i] Homer, 1924. The Iliad. William Heinemann, London. pp. 417–421

[ii] Aristotle, 1978. Aristotle’s De motu animalium. Princeton University Press, Princeton.

[iii] Glymour, C., 1992. Thinking Things Through. MIT Press, Cambridge, Mass.

[iv] Russell, S.J. and Norvig, P., 2010. Artificial intelligence: a modern approach, 3rd ed. Pearson Education, Upper Saddle River, N.J.; Harlow.

[v] Carroll, L., 1958. Symbolic logic, and, The game of logic : (both books bound as one), Mathematical recreations of Lewis Carroll. Dover, New York.

[vi] Descartes, R., 1637, 1931. The philosophical works of Descartes. Cambridge University Press, Cambridge.

[vii] Ibid., p. 116

[viii] Boden, M.A., 2016. AI : Its Nature and Future. OUP, Oxford. p. 4

[ix] Lovelace, A.A., 1989. Notes by the Translator (1843), in: Hyman, R.A. (Ed.), Science and Reform: Selected Works of Charles Babbage. Cambridge University Press, Cambridge, pp. 267–311.

[x] Turing, A.M., 1936. “On Computable Numbers with an Application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society, Series 2, 42/3 and 42/4., in: Davis, M. (Ed.), The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems, and Computable Functions. Raven Press, Hewlett, NY, pp. 116–53.

[xi] Nilsson, N.J., 1998. Artificial Intelligence: A New Synthesis. Morgan Kaufmann, San Francisco.

[xii] Lovelace, A.A., 1989. Notes by the Translator (1843), in: Hyman, R.A. (Ed.), Science and Reform: Selected Works of Charles Babbage. Cambridge University Press, Cambridge, p. 303.

[xiii] See Boden, M.A., 2016. AI : Its Nature and Future. OUP, Oxford. See also Luger, G.F., 1998. Artificial intelligence : structures and strategies for complex problem solving. England, United Kingdom.

[xiv] McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133. https://doi.org/10.1007/BF02478259

[xv] See Newell, A., Simon, H., 1956. The logic theory machine–A complex information processing system. IRE Transactions on Information Theory 2, 61–79. https://doi.org/10.1109/TIT.1956.1056797. See also Newell, A., Simon, H.A., 1972. Human problem solving. Prentice-Hall, Englewood Cliffs, N.J.

[xvi] Wiener, N., 1961. Cybernetics : or, Control and communication in the animal and the machine, 2nd ed. M.I.T. Press, New York.

[xvii] Medina, E., 2014. Cybernetic revolutionaries : technology and politics in Allende’s Chile. The MIT Press, Cambridge.

[xviii] Picard, R.W., 1997. Affective computing. MIT Press, Cambridge, Mass.

[xix] Minsky, M., 2006. The Emotion Machine: Commonsense Thinking, Artificial Intelligence, and the Future of the Human Mind. Simon & Schuster, Riverside.

[xx] Baars, B.J., Franklin, S., 2009. CONSCIOUSNESS IS COMPUTATIONAL: THE LIDA MODEL OF GLOBAL WORKSPACE THEORY. International Journal of Machine Consciousness 1, 23–32. https://doi.org/10.1142/S1793843009000050

[xxi] McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E., 2006. A proposal for the Dartmouth summer research project on artificial intelligence: August 31, 1955. AI Magazine 27, 12.

[xxii] Ibid., p 14

[xxiii] Legg, S., Hutter, M., 2007. Universal Intelligence: A Definition of Machine Intelligence. Minds and Machines: Journal for Artificial Intelligence, Philosophy and Cognitive Science 17, 391. https://doi.org/10.1007/s11023-007-9079-x

[xxiv] Kurzweil, R., 1990. The age of intelligent machines. MIT Press, London;Cambridge, Mass

[xxv] Luger, G.F., 1998. Artificial intelligence: structures and strategies for complex problem solving. England; p. 1.

[xxvi] Boden, M.A., 2016. AI : Its Nature and Future. OUP, Oxford. p.1.

[xxvii] Searle, J.R., 1980. Minds, brains, and programs. Behavioral and Brain Sciences 3, p. 417. https://doi.org/10.1017/S0140525X00005756

[xxviii] OECD, 2018. Private Equity Investment in Artificial Intelligence (OECD Going Digital Policy Note). Paris.

[xxix] Deloitte, 2019. Future in the balance? How countries are pursuing an AI advantage (Insights from Deloitte’s State of AI in the Enterprise, No. 2nd Edition survey). Deloitte, London.

[xxx] Ibid.

[xxxi] Bostrom, N., 2006. How Long Before Superintelligence? Linguistic and Philosophical Investigations 5, p.11.

[xxxii] Yudkowsky, E., Salamon, A., Shulman, C., Nelson, R., Kaas, S., Rayhawk, S., McCabe, T., 2010. Reducing Long-Term Catastrophic Risks from Artificial Intelligence. Machine Intelligence Research Institute. p. 1