Intelligence Is Plural

Science aspires to reach truths: propositions that hold for all people, beyond their individual subjectivities. The formal sciences validate such propositions by deriving theorems within closed axiomatic systems, without uncertainty. The empirical (data-based) sciences, in contrast, must validate their propositions in open natural systems that contain regions hidden from our perception. Is it possible to determine the "truth value" of a proposition whose real state is uncertain to us? At least we know what it means not to lie: not to claim more than what is known, and not to hide what is known. Mathematically, this is defined as maximizing uncertainty (entropy) subject to the available information (constraints), a principle known as maximum entropy, or MaxEnt [1].

Distributions that adhere to the principle of not lying have the property of aligning our individual subjectivities intuitively. For example, if we know that a gift is hidden in one of three identical boxes, we naturally avoid assigning all our belief to one of them (we lack absolute certainty about the gift's position) and also avoid assigning greater belief to one box than to the others (we have no information that favors any of them). All people, regardless of their culture or ideology, ultimately agree that the only belief distribution compatible with the available information divides belief equally among the three boxes (maximizing uncertainty) without assigning any belief outside them (respecting the available information).
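The three-box example can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `entropy` and the three candidate distributions are chosen here for illustration. It measures uncertainty as Shannon entropy in bits and compares beliefs that all respect the constraint of placing no belief outside the boxes.

```python
import math

def entropy(beliefs):
    """Shannon entropy in bits; boxes with zero belief contribute nothing."""
    return sum(p * math.log2(1 / p) for p in beliefs if p > 0)

# Hypothetical candidate beliefs over the three boxes (each sums to 1).
candidates = {
    "all belief on one box": [1.0, 0.0, 0.0],
    "one box favored": [0.5, 0.25, 0.25],
    "equal division": [1 / 3, 1 / 3, 1 / 3],
}

entropies = {name: entropy(b) for name, b in candidates.items()}
for name, h in entropies.items():
    print(f"{name}: {h:.3f} bits")

# The candidate that claims the least beyond what is known.
most_uncertain = max(entropies, key=entropies.get)
print("maximum entropy:", most_uncertain)
```

Among these candidates the equal division has the highest entropy, log2(3) bits, which is the largest value any distribution over three alternatives can reach.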

The principle of not lying, present in all cultures around the world, predates modern science and forms the foundation upon which empirical sciences reach truths (intersubjective agreements) in contexts of uncertainty. The rules of probability, proposed in the late 18th century and since adopted as the reasoning system in all data-driven sciences, encode this principle. They are conceptually intuitive. The product rule (or conditional probability) updates belief distributions by preserving prior belief (prior information) that remains compatible with the new information (data), thus maximizing uncertainty given the available information. The sum rule (or marginal probability) predicts unobserved events by incorporating the contribution of all alternative (mutually exclusive) hypotheses, integrating all individual predictions.
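As an illustration of the two rules, here is a small sketch that continues the three-box example under an added assumption that is not in the text: we look inside box 3 and find it empty. The variable names are hypothetical. The product rule keeps only the prior belief compatible with that observation, and the sum rule predicts the observation by adding the contributions of all hypotheses.

```python
from fractions import Fraction

# Prior belief: equal division among the three boxes.
prior = {box: Fraction(1, 3) for box in (1, 2, 3)}

# Assumed datum: box 3 is inspected and found empty.
# P(box 3 seen empty | gift is in `box`)
likelihood = {1: Fraction(1), 2: Fraction(1), 3: Fraction(0)}

# Product rule: joint belief in each hypothesis together with the datum.
joint = {box: prior[box] * likelihood[box] for box in prior}

# Sum rule: prediction of the datum, integrating over all hypotheses.
evidence = sum(joint.values())

# Updated belief: the surviving prior belief, renormalized.
posterior = {box: joint[box] / evidence for box in joint}

print("P(box 3 empty) =", evidence)
print("posterior =", posterior)
```

The belief that was compatible with the observation keeps its relative proportions (boxes 1 and 2 remain equally credible), and nothing more is claimed than the datum supports.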

Our everyday causal arguments, like scientific theories and models, can be seen as higher-level hypotheses, since they consist of systems of variables (sets of mutually exclusive individual hypotheses) interacting through cause-effect relationships (conditional probabilities). Correct causal theories, those that predict the data with the same probability with which the underlying causal reality generates them, are the higher-level hypotheses (aggregations of individual hypotheses) that produce the least surprise on average. No artificial intelligence model, no matter how complex, can outperform them. So far, no new reasoning system has emerged.
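The claim about least surprise can be illustrated with a small sketch. It assumes that surprise is measured as the negative log probability a model assigns to each observation, so that average surprise is the cross-entropy between reality and model. The distributions and names below are hypothetical.

```python
import math

def average_surprise(truth, model):
    """Expected surprise in bits (cross-entropy) of `model` when data follow `truth`."""
    return sum(p * math.log2(1 / q) for p, q in zip(truth, model) if p > 0)

# Hypothetical reality generating one of three outcomes.
truth = [0.7, 0.2, 0.1]

# Hypothetical competing models of that reality.
models = {
    "correct": [0.7, 0.2, 0.1],
    "uniform": [1 / 3, 1 / 3, 1 / 3],
    "overconfident": [0.98, 0.01, 0.01],
}

surprise = {name: average_surprise(truth, q) for name, q in models.items()}
for name, s in surprise.items():
    print(f"{name}: {s:.3f} bits per observation")

best = min(surprise, key=surprise.get)
print("least surprised:", best)
```

The model that matches the generating probabilities is the least surprised of the three. This is an instance of Gibbs' inequality: no distribution can achieve a lower average surprise than the one that actually generates the data.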

Although no better practical reasoning system for contexts of uncertainty has been proposed since the rules of probability were formulated, their strict application (the Bayesian approach) has historically been limited by the computational cost of evaluating the entire hypothesis space. While many models were fully solved by the late 19th century, especially in statistical physics, the 20th century saw the emergence of criteria that arbitrarily select a single hypothesis from the space to avoid that cost. It was not until the dawn of the 21st century that it became generally possible to compute optimal belief distributions given the available information across all fields of science.

The multiplicative nature of both hypothesis evaluation and the selection of life forms drives learning in probability and in evolution. This was pointed out by John L. Kelly in his article "A New Interpretation of Information Rate" [2], endorsed by Claude Shannon: "The cost function approach [...] can actually be used to analyze nearly any branch of human endeavor. [...] The point here is that an arbitrary combination of a statistical transducer (i.e., a channel) and a cost function does not necessarily constitute a communication system. What can be done, however, is to take some real-life situation which seems to possess the essential features of a communication problem, and to analyze it without the introduction of an arbitrary cost function. The situation which will be chosen here is one in which a gambler uses knowledge of the received symbols of a communication channel in order to make profitable bets on the transmitted symbols."
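The gambler's situation can be sketched in its simplest form. The example below assumes repeated even-money bets in which the received symbol is correct with probability p = 0.75 (a value chosen here for illustration) and the gambler stakes a fixed fraction f of current wealth on each bet. Because wealth is multiplied at every bet, the quantity that matters in the long run is the expected logarithm of the growth factor.

```python
import math

def growth_rate(f, p):
    """Expected log2 growth per even-money bet when staking fraction f, win probability p."""
    return p * math.log2(1 + f) + (1 - p) * math.log2(1 - f)

def binary_entropy(p):
    """Entropy in bits of a binary outcome with probability p."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

p = 0.75  # hypothetical reliability of the received symbol

# Search staking fractions from 0.000 to 0.999 for the highest growth rate.
fractions = [i / 1000 for i in range(1000)]
best_fraction = max(fractions, key=lambda f: growth_rate(f, p))

print("best fraction:", best_fraction)
print("growth at best fraction:", round(growth_rate(best_fraction, p), 5))
print("1 - H(p):", round(1 - binary_entropy(p), 5))
print("growth when staking 99%:", round(growth_rate(0.99, p), 5))
```

Under these assumptions the best fraction found by the search coincides with 2p - 1, and the growth rate it achieves coincides with 1 - H(p), the information the channel conveys per symbol. Staking almost everything yields a negative growth rate even though every individual bet is favorable.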

Evaluating alternative hypotheses under the strict rules of probability is a multiplicative process: a sequence of predictions. Prior belief is filtered through surprise, the only source of information. If a hypothesis assigns probability 1 to the observed data (zero surprise), the prior belief in that hypothesis is fully preserved. If it assigns probability 0 (total surprise), the hypothesis is permanently ruled out. Evolutionary processes of life form selection are likewise multiplicative: sequences of survival and reproduction rates. In fact, the standard model of evolution (the replicator dynamics [3]) is structurally equivalent to Bayes' theorem [4]. A single zero in the sequence causes extinction.
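The structural equivalence can be made concrete with a short sketch. It assumes the discrete-time form of the replicator dynamics, in which each share is multiplied by its fitness and the result is renormalized. The function `reweight`, the coin hypotheses, and the fitness values are hypothetical names and numbers used only for illustration.

```python
from fractions import Fraction

def reweight(shares, factors):
    """Multiply each share by its factor and renormalize.

    With beliefs and predictions this is a Bayesian update; with population
    shares and fitness values it is one step of discrete replicator dynamics.
    """
    weighted = [s * f for s, f in zip(shares, factors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Three hypothetical hypotheses about a coin's probability of heads.
p_heads = [Fraction(0), Fraction(1, 2), Fraction(9, 10)]
belief = [Fraction(1, 3)] * 3

# Assumed data: heads, heads, tails.
for outcome in "HHT":
    predictions = [p if outcome == "H" else 1 - p for p in p_heads]
    belief = reweight(belief, predictions)

print("belief after HHT:", belief)

# The same function applied to two hypothetical life forms with fitness 2 and 1.
population = reweight([Fraction(1, 2), Fraction(1, 2)], [2, 1])
print("population after one generation:", population)
```

The hypothesis that assigns probability 0 to heads is eliminated by the first observation and never recovers, while the belief in the other two ends up proportional to the product of their successive predictions.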

In multiplicative processes, the impact of losses is greater than that of gains, favoring variants that reduce fluctuations through individual diversification, cooperation, and specialization. This is evident in the evolution of our own life, which depends on at least four levels of cooperative specialization without which we could not survive: the cell with the mitochondrion, the multicellular organism, society, and the ecosystem [5]. In probability, elementary hypotheses group to form variables, variables relate to form causal models, and systems of models form theories [6]. In human history, the cultural transition had profoundly positive effects: before knowledge transmission between individuals, we faced grave extinction risks; afterward, we were able to occupy all ecological niches on Earth [7, 8].
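A toy calculation illustrates why losses weigh more than gains in a multiplicative process and how cooperation reduces fluctuations. It assumes a hypothetical game, not taken from the text, in which each individual's resources are multiplied every round by 1.5 or by 0.6 with equal probability, and in which cooperating individuals pool their resources after each round and split them equally.

```python
import math

UP, DOWN = 1.5, 0.6  # hypothetical growth factors, each with probability 1/2

def log_growth(n):
    """Exact expected log growth per round when n individuals pool and split equally."""
    total = 0.0
    for heads in range(n + 1):
        prob = math.comb(n, heads) / 2**n          # probability that `heads` individuals gain
        shared_factor = (heads * UP + (n - heads) * DOWN) / n
        total += prob * math.log(shared_factor)
    return total

arithmetic_mean = (UP + DOWN) / 2
rates = {n: log_growth(n) for n in (1, 2, 3, 10)}

print("average factor per round:", arithmetic_mean)
for n, rate in rates.items():
    trend = "grows" if rate > 0 else "shrinks"
    print(f"group of {n}: long-run log growth {rate:+.4f} ({trend})")
```

With these numbers the average factor is above 1, yet an isolated individual shrinks in the long run, because a gain of 50% does not compensate a loss of 40%. Pooling among three or more individuals turns the same game into long-run growth, and the rate keeps improving as the group gets larger.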

The advantage of plurality is practical, not theoretical. When plurality is disrupted, negative effects become evident. In probability, arbitrarily selecting a single hypothesis from the space causes what is known as overfitting [9]. In evolution, genetic diversity is fundamental for species adaptation, and its loss has negative consequences known as inbreeding depression. In human history, the massive loss of cultural diversity caused by imposing a single societal type during colonial-modernity has had increasingly evident environmental consequences [10, 11].
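The effect of selecting a single hypothesis can be seen in a small sketch, in the spirit of the coin example commonly used to introduce overfitting. It assumes three coin tosses that all came up heads, a grid of 101 hypotheses about the probability of heads, and a uniform prior over them. All of these are illustrative choices.

```python
heads, tails = 3, 0  # hypothetical data: three tosses, all heads

# Hypotheses: probability of heads equal to 0.00, 0.01, ..., 1.00.
grid = [k / 100 for k in range(101)]

def likelihood(p):
    """Probability of the observed tosses if the probability of heads is p."""
    return p**heads * (1 - p)**tails

# Selecting a single hypothesis: the one that best predicts the data.
point_estimate = max(grid, key=likelihood)
point_pred_tails = 1 - point_estimate

# Keeping the plurality: every hypothesis predicts, weighted by its posterior
# belief (the uniform prior cancels in the normalization).
weights = [likelihood(p) for p in grid]
total = sum(weights)
avg_pred_tails = sum(w * (1 - p) for w, p in zip(weights, grid)) / total

print("single hypothesis, P(next toss is tails):", point_pred_tails)
print("all hypotheses,    P(next toss is tails):", round(avg_pred_tails, 3))
```

The selected hypothesis declares tails impossible after only three tosses, so a single tails would be a total surprise. The average over all hypotheses keeps a probability of tails close to 1/5, the value given by Laplace's rule of succession in the continuous limit.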

Although deep neural networks are trained without strictly applying the rules of probability, since training selects a single point from the space of possible parameters, recent years have seen the development of algorithms with capabilities that, for the first time, can be considered artificial intelligence. This has only been achieved when truly large neural networks, on the order of billions of parameters, have been trained on enormous datasets. The negative consequences of selecting a single hypothesis (overfitting) were mitigated through a type of plurality similar to that employed by life in evolution, based on the coexistence of a vast number of individual units (neurons), leading to the emergence of intelligence (double descent) [12].

Despite all these advances, metropolitan science remains incapable of compensating for the loss of millennia-old knowledge during colonial-modernity, and the current ecological crisis continues to deepen. However, the accumulated experience of the world's most diverse communities has independently led to a universal obligation to give and receive and to the development of reciprocity technologies that reactivate communal bonds through exchange rituals (festive or coercive). Similarly, the institutions that have demonstrated the ability to manage common goods are local community institutions that directly regulate the cycles of exchange with ecological systems [13]. The sudden replacement of these cultural systems with external institutions, whether state or market-based, has led to devastating ecological consequences [14].

The word Plurinational in the Americas represents the coexistence of our local cultural diversities. The word Bayes represents the reasoning system that allows us to reach intersubjective agreements in contexts of uncertainty by adhering to the principle of not lying, which compels us to believe in mutually contradictory hypotheses while rejecting absolute truths that are unjustified. Intelligence is nothing more than the ability of life forms to survive and reproduce throughout evolution, continuing to inhabit the Earth today. Destroying life in the name of truths claimed to be absolute or intelligences claimed to be superior is the highest form of ignorance. Just as all intelligence emerges from plurality, societies adapt to life through the coexistence of their local diversities.

The goal of Bayes Plurinacional is to promote Bayesian Intelligence in Plurinational America and the peoples of the Global South.

References:
[1] Jaynes ET. Information theory and statistical mechanics. Physical review. 1957;106(4):620.
[2] Kelly JL. A new interpretation of information rate. The Bell System Technical Journal. 1956.
[3] Taylor PD, Jonker LB. Evolutionary stable strategies and game dynamics. Mathematical biosciences. 1978;40(1-2):145–156.
[4] Czégel D, Giaffar H, Tenenbaum JB, Szathmáry E. Bayes and Darwin: How replicator populations implement Bayesian computations. BioEssays. 2022; p. 2100255.
[5] Maynard Smith J, Szathmary E. The Major Transitions in Evolution. New York: Oxford University Press; 1995.
[6] Winn J. Causality with gates. In: Artificial Intelligence and Statistics. Proceedings of Machine Learning Research; 2012. p. 1314–1322.
[7] Hrdy SB, Burkart JM. The emergence of emotionally modern humans: implications for language and learning. Philosophical Transactions of the Royal Society B. 2020;375(1803):20190499.
[8] Boyd R, Richerson PJ, Henrich J. The cultural niche: Why social learning is essential for human adaptation. Proceedings of the National Academy of Sciences. 2011;108(Supplement 2):10918–10925.
[9] Bishop CM. Pattern recognition and machine learning. Springer. 2006.
[10] Dussel E. Sistema mundo y transmodernidad. In: Modernidades coloniales. México DF: El Colegio de México; 2004. p. 201–226.
[11] Segato RL. La crítica de la colonialidad en ocho ensayos y una antropología por demanda. Prometeo; 2013.
[12] Bishop CM, Bishop H. Deep learning: Foundations and concepts. Springer Nature; 2023.
[13] Ostrom E. Governing the commons: The evolution of institutions for collective action. Cambridge university press. 1990.
[14] Ostrom E. Beyond markets and states: polycentric governance of complex economic systems. American economic review. 2010.

