David Rostcheck, 4/30/2023
Our concept of intelligence once extended solely to humanity. As our understanding of the topic broadened, we began to recognize other forms of intelligence, such as the swarm intelligence expressed in termite mounds and beehives, where the colony as a whole is more intelligent than any individual member. We also came to see that forming structured networks of intelligent actors, such as governments and corporations, may create new super-intelligent systems. Finally, we have begun to create artificially intelligent systems such as large language models, whose intelligence we can directly measure through IQ tests. It has become clear to me that we need a general framework to characterize, understand, and qualify intelligence.
In this article, we will explore the concept of intelligence from a computational perspective. We'll define intelligence in relation to a system's ability to solve problems and discuss how it's connected to the materials used in computing. Our discussion will cover the following topics:
We can think of a system's intelligence as its ability to process information effectively to solve a given problem. To determine how "effective" a system is, we must introduce an objective function, which is a problem that the system is trying to solve. The objective function comes from outside the system and can vary depending on the context. For example, a data processing system might need to process data according to certain criteria, while a living organism's main objective is to survive and reproduce.
As the complexity of the objective function increases, so does the intelligence required to solve it. Consider a simple task, like rotating every third item on an assembly line by 90 degrees. A basic machine could accomplish this task. However, if we add more complexity by introducing different objects that require different actions, the system needs to be more intelligent to handle the challenge.
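The assembly-line example can be made concrete with a small sketch. Everything here is an illustrative assumption rather than part of the original argument: items are represented as hypothetical (kind, angle) pairs, the kinds "box", "bottle", and "tray" and their rotations are invented, and `score` is simply the fraction of items a machine handles as the externally supplied objective demands.

```python
from typing import Callable, Dict, List, Tuple

Item = Tuple[str, int]  # (kind, rotation in degrees); hypothetical representation
System = Callable[[List[Item]], List[Item]]

# Illustrative rule table for the more complex objective (assumed values).
ROTATION_BY_KIND: Dict[str, int] = {"box": 90, "bottle": 180, "tray": 0}


def basic_machine(items: List[Item]) -> List[Item]:
    """Rotate every third item by 90 degrees, ignoring what the item is."""
    return [
        (kind, (angle + 90) % 360) if position % 3 == 0 else (kind, angle)
        for position, (kind, angle) in enumerate(items, start=1)
    ]


def kind_aware_machine(items: List[Item]) -> List[Item]:
    """Inspect each item's kind and apply the rotation that kind requires."""
    return [(kind, (angle + ROTATION_BY_KIND[kind]) % 360) for kind, angle in items]


def simple_objective(items: List[Item]) -> List[Item]:
    """Externally imposed goal: every third item ends up rotated by 90 degrees."""
    wanted = []
    for position, (kind, angle) in enumerate(items, start=1):
        if position % 3 == 0:
            angle = (angle + 90) % 360
        wanted.append((kind, angle))
    return wanted


def complex_objective(items: List[Item]) -> List[Item]:
    """Externally imposed goal: each kind of object needs its own rotation."""
    return [(kind, (angle + ROTATION_BY_KIND[kind]) % 360) for kind, angle in items]


def score(system: System, items: List[Item], objective: System) -> float:
    """Fraction of items whose final state matches what the objective demands."""
    produced, wanted = system(items), objective(items)
    return sum(p == w for p, w in zip(produced, wanted)) / len(items)


line = [("box", 0), ("bottle", 0), ("tray", 0), ("box", 0), ("bottle", 0), ("tray", 0)]
simple_score = score(basic_machine, line, simple_objective)
complex_score = score(basic_machine, line, complex_objective)
aware_score = score(kind_aware_machine, line, complex_objective)
```

On this toy line the basic machine fully satisfies the simple objective but fails the complex one, while the machine that distinguishes object kinds satisfies it. The objective functions live outside both machines, which mirrors the point that the objective is imposed externally.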
We can regard intelligence as a continuum, with systems becoming more intelligent as they better approximate solutions to complex objective functions. In this context, intelligence is essentially an optimization algorithm for solving an externally imposed objective function. More intelligent systems are those that more effectively navigate the challenges imposed by the world around them.
The concept of generality is also important when discussing intelligence. It refers to a system's ability to solve a wide range of complex and dynamic problems. For instance, a simple pendulum can approximate the local gravity field but can't do much else. A more intelligent system would be able to tackle various tasks.
By defining intelligence as an optimizer for an external objective function, we can better understand and compare different forms of intelligence. Consider swarm intelligence, a phenomenon where individual systems with limited intelligence come together to solve more complex problems than they could individually handle. For example, termite colonies build intricate structures, a task beyond the cognitive ability of any individual termite. This definition allows us to compare the intelligence of systems as different as a mechanical pendulum, a termite colony, a tax preparation program, a human being, and a large language model.
By considering these diverse systems under a common framework we can recognize and analyze entities that might not be typically seen as intelligent, like a business corporation or a government. This perspective can help us better understand and appreciate the complexity of intelligence in different contexts.
In the previous section, we defined a system's general intelligence as its ability to handle multiple, complex, and/or time-varying objective functions. Now, let's explore a method for quantifying the generality of intelligence.
One approach to measuring the generality of intelligence is to evaluate the variability of information in a system's inputs and outputs over time. To do this, we can use vector encoding, a technique commonly used in large language models. Vector encoding transforms input language into vectors within a high-dimensional semantic space, grouping related concepts together. This method takes advantage of natural language's inherent patterns to map text into the semantic space. Questions about diverse subjects produce vectors pointing to different areas of the space.
By calculating the span, or distance, between all the input vectors and output vectors, we can quantify the variability of information processed by the system. However, we want to ensure the system's output is useful and not just random. To achieve this, we first filter the inputs and outputs, keeping only those that meet a certain accuracy threshold.
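The filter-then-measure procedure can be sketched in a few lines. This is only one possible reading, under stated assumptions: the text does not specify how the span is computed, so the sketch uses the mean pairwise Euclidean distance; the vectors are toy two-dimensional stand-ins for real embeddings; and the record layout, the accuracy scores, and the names `mean_pairwise_distance` and `filtered_span` are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Record = Tuple[Vector, Vector, float]  # (input vector, output vector, accuracy score)


def mean_pairwise_distance(vectors: List[Vector]) -> float:
    """Average Euclidean distance over all pairs; 0.0 if fewer than two vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def filtered_span(records: List[Record], threshold: float) -> Tuple[float, float]:
    """Drop records below the accuracy threshold, then measure input and output spread."""
    kept = [(inp, out) for inp, out, accuracy in records if accuracy >= threshold]
    inputs = [inp for inp, _ in kept]
    outputs = [out for _, out in kept]
    return mean_pairwise_distance(inputs), mean_pairwise_distance(outputs)


# Toy data: the third record is far from the others but too inaccurate to count.
records: List[Record] = [
    ((0.0, 0.0), (0.0, 0.0), 0.90),
    ((3.0, 4.0), (6.0, 8.0), 0.95),
    ((100.0, 100.0), (100.0, 100.0), 0.10),
]
input_span, output_span = filtered_span(records, threshold=0.8)
```

Filtering first matters here: without it, the inaccurate third record would inflate both spans even though the system did not handle that input usefully. In practice the vectors would come from an embedding model, and a different distance (for example cosine distance) might be a better fit for a semantic space.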
We can then define the generality of intelligence as: