The emergence of modern machine-based artificial intelligence (“generative AI”) has captivated popular interest. But generative AI is not the first form of artificial intelligence that humanity has encountered. What we commonly think of as “organizations” - corporations, governments, universities, and organized religions - are actually aggregate intelligences formed from collections of humans. Although composed of humans, such entities possess their own unique intelligence and capability, which often exceed those of their individual members. In this article I explore the phenomenon of aggregate intelligence.
Imagine that you attend a county fair where an exhibit encourages visitors to guess the number of jelly beans in a large jar. When the results are published, you, having a mathematical mindset, notice that the average of the guesses proves surprisingly accurate.
This phenomenon was explored by economics professor Jack Treynor in his paper “Market Efficiency and the Bean Jar Experiment.” Although individuals occasionally exceed the accuracy of the averaged group, the group together consistently gives highly accurate answers - more accurate than almost all of its members. What is going on here?
To investigate the phenomenon further, you engage your friends Alice and Bob to each independently take an IQ test. Suppose you then put them both in the same room and allow them to collaborate on the test. How will their joint effort compare to their individual tests?
We can reason that their joint score is likely to be higher. If both know an answer, they will agree. If neither knows, they are no worse off. But if only one knows, the other now has access to a correct answer that he or she did not have before. Since Alice and Bob can communicate, they can discuss any answer in dispute. It is possible they sometimes choose the wrong answer of the two. But it is more likely that they choose the correct answer because they each understand and can communicate their uncertainty about the answer.
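The three cases above can be turned into a small calculation. This is only a sketch under assumptions the thought experiment does not state: Alice and Bob each know a given answer independently with probabilities `p_alice` and `p_bob`, and when exactly one of them knows it, discussion settles on the correct answer with probability `p_resolve`. Lucky guesses are ignored, and all three names are hypothetical.

```python
def joint_accuracy(p_alice, p_bob, p_resolve):
    """Chance the pair answers a question correctly (illustrative model only).

    Assumes the two know each answer independently, and that a dispute
    where exactly one of them knows the answer is resolved correctly
    with probability p_resolve.
    """
    both_know = p_alice * p_bob
    exactly_one_knows = p_alice * (1 - p_bob) + (1 - p_alice) * p_bob
    # If neither knows, the pair gains nothing, so that case contributes 0.
    return both_know + p_resolve * exactly_one_knows


if __name__ == "__main__":
    for p_resolve in (0.5, 0.8, 1.0):
        score = joint_accuracy(0.7, 0.7, p_resolve)
        print(f"p_resolve={p_resolve}: joint accuracy {score:.3f}")
```

With these illustrative numbers, two people who each know 70% of the answers gain nothing if they settle disputes by coin flip (0.70), but reach about 0.83 if they pick the right answer 80% of the time. In this model the pair's advantage comes entirely from resolving disagreements better than chance.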
This thought experiment gives some insight into aggregate intelligence. In the jelly bean jar guessing game, averaging the participants’ guesses improved the projection’s accuracy even though the participants did not communicate with each other at all. This happened because the errors in their numerical answers were spread roughly evenly above and below the true value, so overestimates and underestimates cancelled each other out. This situation does not apply in an IQ test, where answers are discrete (right or wrong), not continuous. There, Alice and Bob improved their joint intelligence through communication, a more sophisticated form of combining information.
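The cancellation effect in the jelly bean game is easy to check with a simulation. The sketch below is not Treynor's experiment: the true count, the number of guessers, and the Gaussian spread of the guesses are all invented for illustration, and the function name is hypothetical. It assumes each guess is independent and centered on the true value.

```python
import random
import statistics


def simulate_bean_jar(true_count=850, n_guessers=200, noise_sd=250, seed=0):
    """Compare the averaged guess with individual guesses (illustrative only).

    Assumes independent guesses whose errors are symmetric around the
    true count; real crowds may be biased in one direction.
    """
    rng = random.Random(seed)
    # Guesses cannot be negative, so clip at zero.
    guesses = [max(0.0, rng.gauss(true_count, noise_sd)) for _ in range(n_guessers)]

    crowd_error = abs(statistics.mean(guesses) - true_count)
    individual_errors = [abs(g - true_count) for g in guesses]

    # Fraction of individuals whose own guess was worse than the average.
    fraction_beaten = sum(e > crowd_error for e in individual_errors) / n_guessers
    return crowd_error, fraction_beaten


if __name__ == "__main__":
    crowd_error, fraction_beaten = simulate_bean_jar()
    print(f"error of the averaged guess: {crowd_error:.1f} beans")
    print(f"share of guessers beaten by the average: {fraction_beaten:.0%}")
```

Under these assumptions the averaged guess beats most individual guesses while still losing to a lucky few. If the guesses shared a common bias, averaging would not remove it.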
Alice and Bob have formed an ensemble. It is well-known from machine learning that ensembles, which combine the results from a set of diverse algorithms, tend to outperform individual algorithms. The reason, discussed above, is that they compensate for each other’s weaknesses. A larger and more diverse ensemble almost invariably outperforms a smaller and more uniform one. The Alice-Bob ensemble proves more intelligent than either of them in isolation. This simple observation explains much of human society.
In biology, such ensembles are termed “swarms” after the insect groupings in which such aggregate intelligence was first noted. Ants, termites, and bees can solve optimization problems well beyond the computing capabilities of their simple brains. Although one ant cannot solve a maze, a set of ants can explore it independently, leaving pheromone trails for others to follow and reinforcing them when they return successfully with food. The pheromone markers on the dead-end trails eventually decay, and the colony has learned the best path through the maze.
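The reinforce-and-decay mechanism can be sketched as a toy simulation. This is a simplified model, not a description of real ant behavior: the route names, lengths, colony size, and evaporation rate are all hypothetical. It assumes each ant picks a route with probability proportional to its pheromone level, and that a successful trip deposits pheromone in inverse proportion to the route's length.

```python
import random


def forage(routes, n_ants=50, n_rounds=200, evaporation=0.1, seed=0):
    """Toy pheromone model (illustrative assumptions throughout).

    routes maps a route name to its length, or to None for a dead end
    that never leads to food and so is never reinforced.
    """
    rng = random.Random(seed)
    names = list(routes)
    pheromone = {name: 1.0 for name in names}  # all routes start equal

    for _ in range(n_rounds):
        # Each ant independently picks a route, favoring stronger trails.
        weights = [pheromone[name] for name in names]
        choices = rng.choices(names, weights=weights, k=n_ants)

        # Existing trails decay a little every round.
        for name in names:
            pheromone[name] *= 1.0 - evaporation

        # Ants that found food reinforce their route; shorter routes
        # receive a larger deposit per trip.
        for name in choices:
            length = routes[name]
            if length is not None:
                pheromone[name] += 1.0 / length

    return pheromone


if __name__ == "__main__":
    trails = forage({"short": 3, "long": 6, "dead_end": None})
    for name, level in sorted(trails.items(), key=lambda item: -item[1]):
        print(f"{name}: {level:.3f}")
```

No individual ant in this model compares routes or remembers anything. The preference for the short route exists only in the shared pheromone levels.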
People newly exposed to the swarm phenomenon often dismiss the solution as something other than intelligence. Yet it is real intelligence - the ants can solve the maze problem, and others like it, as repeatably as a rat can run a maze. They just can’t do so individually. It is the entire colony that solves the problem. If we replaced all the ants with different ants, the colony would still solve the maze, although a different ant would reach the goal.
While the term “swarm” generally refers to a colony of one type of insect, we can find ensembles in which the members are heterogeneous. ChatGPT4 proposes labeling such systems “complex swarms.” For instance, humans are made up of eukaryotic cells containing human DNA, alongside populations of gut-dwelling bacteria that carry out essential functions and communicate via the vagus nerve. Therefore even a single human could be considered a complex swarm.
Intelligence researchers generally apply the term "swarm" to communication patterns where each member independently contributes to an outcome, usually through voting. Direct democracy is the application of the swarm pattern to civic decision making. Exploring this idea, we note there exist other patterns that contain the same elements but use a different communications structure. Here the individual elements still contribute, but in a more limited way. I employ the term “heterogeneous ensemble” to refer generally to aggregate intelligence formed by diverse individual members and their implicit or explicit communication structure. Examples include a republic versus a democracy in the civic domain, and a modern corporation in the business sphere. In both these examples, individuals contribute to decision-making, but their contribution is mediated by the structure of the organization.
To explore the effects of the communication structure, let’s return to our experiment with Alice and Bob. We might ask what happens if we continue to scale the ensemble. As we add team members, we would expect the team’s collective intelligence to improve, or at least to not get any worse - up to a point. However, once the group begins to grow large, communication factors begin having a noticeable impact. For example, a dominant personality might monopolize decisions, or lengthy discussion time may impact the team’s velocity (IQ tests are typically timed). If the environment becomes noisy or disruptive, individual team members’ test-taking performance could potentially decline.
If we repeat this experiment over many teams, we would expect that the most effective teams employ structures to minimize these issues. Perhaps all members quietly take the test in parallel, then reserve time at the end to choose the most popular answers (applying the swarming algorithm from earlier). They might incorporate internal quality checks to validate answers or assign challenging questions to a sub-team of top performers.
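The parallel-then-vote structure can be sketched as follows. The model is deliberately idealized and all of its parameters are hypothetical: every member answers each multiple-choice question correctly with the same probability `p_correct`, mistakes are independent and spread evenly over the wrong options, and the team submits the most popular answer. It leaves out the communication costs discussed above.

```python
import random
from collections import Counter


def team_vote_accuracy(n_members, p_correct=0.6, n_options=4,
                       n_questions=1000, seed=0):
    """Share of questions a team gets right by plurality vote (toy model).

    Assumes members answer independently; correlated mistakes, such as
    a shared misconception, would weaken the effect.
    """
    rng = random.Random(seed)
    correct_option = 0  # label the right answer as option 0
    team_right = 0

    for _question in range(n_questions):
        answers = []
        for _member in range(n_members):
            if rng.random() < p_correct:
                answers.append(correct_option)
            else:
                # A wrong answer is one of the other options, chosen at random.
                answers.append(rng.randrange(1, n_options))

        # The team submits the most popular answer; ties go to the
        # answer that was given first.
        winner, _votes = Counter(answers).most_common(1)[0]
        team_right += winner == correct_option

    return team_right / n_questions


if __name__ == "__main__":
    for size in (1, 3, 9):
        print(f"team of {size}: {team_vote_accuracy(size):.2f}")
```

Because wrong answers scatter across several options while correct answers pile up on one, the vote can be right even when no single option has a majority. The quality checks and expert sub-teams mentioned above would be refinements on top of this baseline.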
It is evident that both the number of team members (computation) and their communication structure (network architecture) contribute to the team's overall effectiveness. In future work, I intend to explore measuring the communication structure’s effect on the team’s combined intelligence.