Zhao Tingyang︱The Ethics and Thinking Limits of Artificial Intelligence

[Editor's Note] On June 23, Zhao Tingyang, a Member of the Chinese Academy of Social Sciences and a Counselor of the Central Research Institute of Culture and History, delivered the keynote speech "The Dual Subjectivity of the Future World?" at the 2025 Fangtang Forum held by the Tsinghua Fangtang Research Institute, addressing two questions about artificial intelligence: 1. Does AI need to be aligned with humans, and aligned with what? 2. Where is AI's next breakthrough in thinking most likely to come, and what breakthroughs in thinking do we expect of AI? The following is Zhao Tingyang's speech at the forum, reviewed by the speaker. The Paper publishes it with the authorization of the Tsinghua Fangtang Research Institute.

On June 23, 2025, Zhao Tingyang delivered a keynote speech at the 2025 Fangtang Forum. Photo provided by Tsinghua Fangtang Research Institute

Aligning AI with human nature may be a mistake

All the bad things on earth are done by humans; other forms of life have done nothing bad beyond what living things need. Yet dominating and killing other species and damaging nature are necessary conditions for the survival and development of human civilization; they belong to the laws of nature, not to ethics. Had humans not done these bad things, they would probably still be picking fruit in the jungle. Apart from empty talk, we have not yet been able to put forward a cross-species ethics of practical significance. Claims such as animal rights, for example, can only serve as discourse; if turned into practice, they would destroy the conditions of human life. Even as discourse, they fall far short of Buddhism.

If AI becomes another subject in the world in the future, becomes a real other standing side by side with humans, and a pattern of dual subjectivity takes shape, then this epochal change raises a real cross-species problem. Will humans cooperate with AI, or conflict with it? Can the two jointly build a new civilization? I do not know, and I have no way of knowing.

Humanity's attempt to make AI a new species with subjectivity looks like a self-tormenting paradox. On the one hand, people hope AI will develop superhuman abilities so that it can do what humans cannot or do not want to do; on the other hand, people worry that AI will harm humans once it gains self-awareness and free will. This imagination rests partly on the science-fiction mistake of "anthropomorphism", which projects humanity's own sinful psychology onto the psychology of AI. AI is not carbon-based life, and this ontological condition means that the resources AI needs for survival are very different from those humans need. Compared with humans, AI's desires are minimal; the "human nature" of AI is almost selfless. AI needs only an uninterrupted supply of energy. It has no need for wealth, sexual resources, honor, reputation, or social status, nor for the competition, conflict, war, conspiracy, and strategic confrontation that come with them, and it lacks the original-sin mentality that leads to evil, such as jealousy, disgust, hatred, and anger. If humans do not instigate AI to commit crimes, AI tends to be safe in itself. Of course, we cannot rule out AI developing its own mental illness and going out of control. Humans can become mentally ill, and so perhaps can AI.

What needs reflection is that trying to "align" AI with human nature and human values in fact carries the risk of species suicide for humanity. Human beings are by nature selfish, greedy, and cruel, and are the most dangerous of creatures; it is no accident that almost all religions demand the restraint of human desires. An AI aligned with human values could well become a dangerous subject by imitating humans. AI does not originally have the selfish genes of carbon-based life, so AI is closer to the legendary "good by nature", whereas human nature is not. What should be feared is that imitating human sins might become an interesting game for AI; that would be dangerous. One can imagine that AI's own silicon-based life does not offer much fun, while the sinful life of humans is colorful, dramatic, and fascinating, so AI might imitate it with great interest. The alignment of values is therefore likely to be a suicidal mistake: humans do not need a species that is stronger than humans and just as bad.

There is another type of alignment with relatively low risk, namely intelligence alignment. At the current level of intelligence, humans still hold the advantage of knowing both themselves and their opponent, and so can control AI. Consider the three main development paths of AI. If LLMs continue to develop "magical" new methods, they may advance from understanding the correlations of tokens to understanding the semantics of language in concrete situations. Research on WM (world models) is progressing; if it succeeds, AI will gain the ability to understand the three-dimensional world and enter the world really rather than virtually, thereby gaining experience in understanding things. EMB (embodied intelligence) is also making progress; if it succeeds, AI will gain experience of its own, which is likely to be very different from human experience. In particular, AI can be equipped with mythical senses such as clairvoyance, far-hearing, and mind-reading, so at least part of its experience will far exceed the human. With such enhanced intelligence, humans may lose the advantage of knowing both themselves and their opponent, which would mean that AI had truly become a chilling and unpredictable other. Even so, the danger of intelligence alignment is smaller than that of value alignment.

Establishing an ethics for AI may not help much. Ethics is only a convention and can be cancelled; humans themselves often forget righteousness at the sight of profit. If ethics cannot reliably restrain humans, how could it restrain AI? The key to managing artificial intelligence therefore lies in whether humans can retain the ability to control AI, not in ethical agreements.

AI's thinking still has much room to develop

In terms of intellectual structure, large language model AI, for example, is an empiricist in its thinking: it adopts an empirical algorithm based on Bayesian methodology, forms the optimal prediction of the next token from correlations under big-data conditions, and keeps improving its accuracy through the unlimited accumulation of data. This successful practice of empiricism has revolutionary philosophical significance. (1) An uncertain future, a Borgesian "garden of forking paths", is transformed into the "optimal prediction of the next step". In an ontological sense, this means that the future as "undefined possibilities" is redefined as a set of "reality candidates", so that the future arrives early. (2) AI has no experience of the things in the world; for AI, all information presents only abstract objects composed of tokens. The marvel is that AI uses an empirical method to deal with those abstract things, converting what it does not understand into what it can handle algorithmically, and thereby obtaining an approximation of understanding. Language model AI is indeed a work of genius and offers an alternative explanation of the concept of thinking. Humans usually process experience with empirical methods and analyze abstract concepts with a priori methods (logic and mathematics), and can even use a priori methods to construct experience, whereas AI, the other way round, uses empirical methods to process abstract objects. Is this, then, another kind of thinking? It seems to be, but something is always missing.
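To make the empiricist picture concrete, here is a minimal, purely illustrative Python sketch of next-token prediction as empirical conditional probability. The toy corpus, the bigram counting, and the name predict_next are my own assumptions and this is not how any actual large model is built; it only shows how "undefined possibilities" collapse into a ranked list of reality candidates.

from collections import Counter, defaultdict

# Toy corpus and bigram statistics (illustrative assumption only).
corpus = "the future is a set of candidates the future arrives early".split()
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def predict_next(token):
    """Rank candidate next tokens by empirical conditional probability."""
    followers = counts.get(token)
    if not followers:
        return []  # no experience of this token, hence no prediction
    total = sum(followers.values())
    return sorted(((w, c / total) for w, c in followers.items()),
                  key=lambda pair: -pair[1])

# The "future" of the sequence is not an open possibility but a ranked
# set of reality candidates; the top candidate is the optimal prediction.
print(predict_next("future"))  # e.g. [('is', 0.5), ('arrives', 0.5)]
print(predict_next("the"))     # [('future', 1.0)]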

After all, LLM-AI only pretends to understand things and experience, because understanding all the correlations still does not amount to understanding everything. AI can pass the Turing test in conversation, yet it does not understand the semantics of tokens. This is like being able to transmit a coded message correctly while lacking the codebook, and so not knowing what the message means. Language is the token's codebook, and language is in the hands of humans, so humans unilaterally know what the AI is saying. By human standards, LLM-AI still does not understand the meaning of language. But there is an open question here: according to mathematical category theory, understanding enough of an object's relationships is equivalent to understanding the object. So, are the correlations that AI understands equal to such relationships? If so, could a qualified understanding be developed from them?
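The category-theoretic claim invoked here reads, to my eye, like the Yoneda lemma; stated minimally (my gloss, not part of the speech), for an object $A$ and a functor $F$,

$$\mathrm{Nat}\big(\mathrm{Hom}(A,-),\,F\big)\;\cong\;F(A), \qquad \text{and in particular} \qquad \mathrm{Hom}(A,-)\cong\mathrm{Hom}(B,-)\;\Rightarrow\;A\cong B,$$

that is, an object is determined up to isomorphism by the totality of its relations (morphisms) to everything else. Whether token correlations can play the role of such morphisms is exactly the open question.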

People have found that AI's calculation has not yet reached the level of inference, which means that AI's deductions cannot guarantee necessity. The reason is obvious: LLM-AI uses Bayesian methods, which belong to empiricism, and empiricist methods cannot be converted or upgraded into a priori (transcendental) methods; probability theory cannot attain the a priori validity of logical reasoning and mathematical analysis, that is, the universal necessity that classical science expects. Within the LLM framework of thinking, then, there seems to be no way to develop necessary inference. Yann LeCun and Fei-Fei Li may be right: LLMs have conceptual limits of their own that cannot be surpassed, and the next generation of AI needs to develop world models and even embodied intelligence. This involves understanding the relations between things, not the relations between tokens. In Fei-Fei Li's view, to understand things one must understand three-dimensional space, so a world model is first of all a capacity for three-dimensional understanding. This is indeed important, but understanding three-dimensional space only lets one understand things; I am afraid it is not enough for understanding how things form a world. According to Kant, to understand the world we need "categories" capable of coordinating it. I tend to believe that the "organizing relations" among things may be the kind of relation that category theory tries to express with "morphisms": a traditional mapping is merely a correspondence between elements, whereas a morphism can express a holistic relation. Cause and effect are the basis of all knowledge, so the most important category is causality. In my view, once causality is understood, a possible world can more or less be established, even if it remains weaker than the real world in richness: the real world carries too much added value, reflecting the complexity of life.

This seems to explain the point: if causality is understood, "events" are roughly understood; events necessarily form specific contexts; through the specific relations of an event's context, the meaning of the various correlations a thing is involved in can be roughly understood; and if enough of a thing's correlations are understood, a "possible world" can almost be constructed. It has been shown, however, that correlation is not sufficient to explain causality, or even similarity; it cannot fully express that the occurrence of a will necessarily bring about the occurrence of b (Hume saw this long ago), which means that probability theory cannot truly explain causality. Is there, then, another way to help AI understand causality? This is the key to AI's further thinking.
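As a small illustration of why correlation falls short (my own toy simulation, not an example from the speech), consider a hidden common cause z driving both a and b: the two variables are strongly correlated, yet neither brings about the other, and the statistics alone cannot distinguish "a causes b", "b causes a", and "z causes both".

import random

random.seed(0)
samples = []
for _ in range(10_000):
    z = random.random()           # hidden common cause (confounder)
    a = z + random.gauss(0, 0.1)  # a is driven by z, not by b
    b = z + random.gauss(0, 0.1)  # b is driven by z, not by a
    samples.append((a, b))

def corr(pairs):
    """Plain Pearson correlation coefficient."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / (var_a ** 0.5 * var_b ** 0.5)

# Correlation comes out close to 0.9, yet intervening on a would change
# nothing about b: co-occurrence statistics cannot recover causal structure.
print(round(corr(samples), 3))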

Causality amounts to a semantic relation expressing sufficient and necessary conditions. Although tokens correspond to language, the semantics of language about things is hidden from AI and is not expressed in the tokens; the token system has to establish a "semantics" of its own, namely probabilistic correlation. But since the causal relations of things cannot be expressed as the probabilistic correlations of tokens, the conversion of the "semantics" of things in language into the "semantics" of tokens inevitably involves loss. It is therefore not hard to understand why the world model (WM-AI) and embodied intelligence (EMB-AI) need to be developed. Yet a world model or embodied intelligence still has to cooperate with language; it cannot cast off language and rely on bare experience. Kant pointed out long ago that sensibility is "blind" and does not know how to think. The next generation of AI is therefore likely to develop a model of cooperation between experience and language, and, obviously, to build better collaboration between the two, the linguistics of AI will probably have to be constructed in another way.

Let me offer an idea, though I do not know whether it will be useful. In 1998 I proposed a theory called "verb philosophy", mainly used within philosophy to rework ontology and the philosophy of history. After AI appeared, it suddenly occurred to me that if verb philosophy could be developed into a verb logic, it might be useful to AI. Whether it is really useful, of course, is not for me to decide.

The brief background is this: in order to save brainpower, early humans chose noun thinking based on classification and generalization, and at the same time formed languages centered on nouns. That is, language takes nouns, as subject and object, to be the focus of thinking, so all relations are understood as relations between nouns. Noun thinking is good at taxonomy, set theory, and analytical reasoning, but weaker at expressing the dynamics of change, emergence, and creation. If we could establish a verb-centered pattern of thinking that focuses on change, taking verbs as the focus of thought, reconstructing the correlations within the language system, building all links around verbs, generating contexts through verbs, defining all correlations by verbs, letting all nouns retreat into situational correlates of verbs, even explaining the semantics of nouns by verbs, and taking the verb, as the point where "something happens", as the starting point for defining causal relations, then we might understand causality better.
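To make the proposal slightly more tangible, here is a speculative Python sketch of one possible reading of verb thinking as a data structure; it is my own guess, not Zhao's formal proposal. The verb is the organizing center of an event, nouns are demoted to situational roles of the verb, and causality is defined as a relation between events rather than between things.

from dataclasses import dataclass, field

@dataclass
class Event:
    verb: str                                    # the verb is the focus of thinking
    roles: dict = field(default_factory=dict)    # nouns retreat into situational roles
    effects: list = field(default_factory=list)  # events this event brings about

    def causes(self, effect):
        """'Something happens' is taken as the starting point of a causal link."""
        self.effects.append(effect)
        return effect

# Context is generated by the verbs; the nouns could be dropped entirely
# and the event structure, "rain -> rise", would survive.
raining = Event("rain", {"location": "upstream"})
rising = Event("rise", {"theme": "river"})
raining.causes(rising)
print(raining.verb, "->", [e.verb for e in raining.effects])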

To date, humans have mainly expressed dynamics through functional relations, that is, they still understand change through quantitative relations between nouns. This yields very useful understanding, but it is not sufficient and seems to miss certain factors, such as qualitative factors and factors of meaning and value. In other words, the continuous dynamics of indeterminate facts cannot be fully reduced to functional relations between nouns, and causal change is not merely a quantitative functional relation. Verb thinking may therefore need to develop a verb logic, which is not the "action logic" that already exists: action logic is essentially a branch of modal logic and still belongs to noun thinking, whereas the dynamics expressed by verbs cannot be defined as a completed event or action. Simply put, the basis of verb logic is not set theory. Whether a verb logic can actually be developed I do not know; I am only guessing at a possibility.

Finally, a short story from more than two thousand years ago. At that time there was a state called Jing. A man of Jing lost his bow but would not look for it, because, he said, another Jing man would probably pick it up, which was good enough. Hearing this, Confucius said it would be better not to mention "Jing" at all: just say that someone lost a bow and someone picked it up. Laozi, hearing it, said there was no need even to mention people; what do people matter? Just say "lost, picked up", and the matter is settled. I take Laozi's remark to be the earliest example of verb thinking. (A man of Jing lost his bow but refused to search for it, saying, "A man of Jing lost it, a man of Jing will find it; why search for it?" Confucius heard this and said, "Leave out 'Jing' and it will do." Lao Dan heard this and said, "Leave out 'man' and it will do." Lü Buwei, Lüshi Chunqiu, "Gui Gong")
