Liu Wei: From Technology To Ethics, Solving The Problem Of AI "lying"

Source: Global Times

Recently, a rumor claiming that "the mortality rate for those born in the 1980s will exceed 5.2% by the end of 2024" sparked heated discussion, and many people believed it to be true. It was later discovered that the "originator" of the rumor was most likely artificial intelligence (AI): a large AI model apparently made a calculation error when answering a question, and the erroneous figure was then widely spread with the help of self-media accounts.

With the rapid development of large models and the exponential growth in the number of users, the quality of their training corpora has become increasingly uneven. "Machine deception" and "machine hallucination" have become core challenges facing today's generative artificial intelligence, profoundly affecting its credibility and practicality. Strictly speaking, they are an inevitable consequence of the nonlinear composite functions that make up deep, multi-layer neural networks, an "Achilles' heel" that is difficult to eradicate.

"Machine deception" refers to large models that generate content that seems reasonable but is actually false and misleading, and deliberately conceals its uncertainty, such as fabricating authoritative data in question and answer systems, actively avoiding (or even inducing) sensitive questions instead of acknowledging knowledge blind spots, etc. There are roughly three reasons for this: First, the corpus and training data deviate, causing the model to learn from data that contains false information or misleading remarks, and the output will naturally be wrong results; second, the objective function driving mechanism set up by large models simply takes "user satisfaction" as the optimization goal, which will cause the model to tend to provide "answers that users want to hear" rather than real answers; third, most models lack moral alignment and do not clearly embed "integrity" as a core principle, so that the model may choose "efficiently achieve goals" rather than "correctly".

"Machine hallucination" generally refers to content generated by large models that is logically self-consistent but divorced from reality. It typically manifests as fictitious facts, characters, and events, such as fabricating details of historical events or inventing non-existent scientific theories. Strictly speaking, machine illusion is not intentional deception, but an inherent flaw in the model when generating "reasonable text" based on probability. The main cause is statistical pattern dependence. This leads to insurmountable flaws in its genes. For example, in a multi-inner layer neural network system, there is a nonlinear composite function formed by the superposition of a linear function and a trigger function. This is the fundamental reason why its parameter weight distribution cannot be explained. It is also the internal reason why the model generates text through word frequency co-occurrence black boxes instead of understanding the semantic authenticity. The result is that the knowledge boundaries of large models are blurred. The time lag of training data makes it impossible to distinguish outdated information from current facts. At the same time, causal reasoning is missing and the causal chain of real-world events cannot be established. Logical links are only relying on superficial correlations, resulting in output logic that is often specious.

The impact of machine deception and machine hallucination shows up mainly as information pollution, including the spread of false content and the influence of erroneous data on public decision-making. The consequences of letting them proliferate are hard to overstate. First, trust between humans and machines may collapse: users who are repeatedly deceived may abandon AI tools altogether. Second, if models are used for attacks on social systems, malicious deception, and the like, they may even trigger a social ethics crisis. Third, they may distort cultural cognition: fictionalized accounts of history and culture can foster false collective memories and provoke a crisis of group belief.

As noted above, machine deception and machine hallucination are difficult to eradicate; their effects can only be mitigated through continuous optimization. At the technical level, alignment training should be strengthened first, with "honesty first" explicitly required through RLHF (reinforcement learning from human feedback). Second, a hybrid architecture should be adopted that couples the generative model with a retrieval system, using a "generation-verification" closed loop for dynamic fact checking against real-time databases, including academic journals and news media, to verify outputs. Uncertainty quantification should also be strengthened by requiring the model to state its confidence in an answer, for example "I am 90% sure this figure comes from 2024 statistics," and to cite its information sources more precisely. At the ethical and regulatory level, transparency standards should be established, such as requiring AI systems to declare their knowledge cutoffs and potential error ranges, and industry certification mechanisms and AI-output review processes should be promoted to strengthen oversight of what models produce.
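The "generation-verification" closed loop can be sketched as follows. Everything here is a hypothetical placeholder (the article does not specify an implementation): in a real system, generate_draft would call the large model, retrieve_evidence would query the real-time databases mentioned above, and supported would run a proper fact-checking step.

```python
# Minimal sketch of a generation-verification loop with confidence marking.
# All names and function bodies are hypothetical placeholders, not a real API.
from dataclasses import dataclass

@dataclass
class VerifiedAnswer:
    text: str
    confidence: float    # the model's stated certainty, e.g. 0.9 -> "90% sure"
    sources: list[str]   # citations used for fact checking

def generate_draft(question: str) -> str:
    # Placeholder for the generative model's first-pass answer.
    return "Draft answer to: " + question

def retrieve_evidence(claim: str) -> list[str]:
    # Placeholder for retrieval over academic journals, news media, etc.
    return ["example-source-2024"]

def supported(claim: str, sources: list[str]) -> bool:
    # Placeholder for checking the claim against the retrieved sources.
    return len(sources) > 0

def answer_with_verification(question: str) -> VerifiedAnswer:
    draft = generate_draft(question)
    sources = retrieve_evidence(draft)
    if supported(draft, sources):
        # Verified claim: report it together with confidence and sources.
        return VerifiedAnswer(text=draft, confidence=0.9, sources=sources)
    # Unverified claim: acknowledge uncertainty instead of asserting it.
    return VerifiedAnswer(
        text="I could not verify this against current sources.",
        confidence=0.3,
        sources=[],
    )

result = answer_with_verification("Which 2024 statistic supports this claim?")
print(f"{result.text} (confidence: {result.confidence:.0%})")
```

The point of the loop is that the system only asserts claims it can tie to retrieved evidence and otherwise downgrades its confidence, which is the "honesty first" behavior the paragraph calls for.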

In short, the root cause of machine deception and hallucination is that most current large AI models focus on technology alone and lack an "understanding" of the world and a sense of "values." To reverse this, we need to shift from purely probabilistic models toward "cognitive architectures" that introduce symbolic logic, causal reasoning, and ethical constraints, making the model more "human." Only when machines truly understand "true and false," "beauty and ugliness," and "good and evil," and effectively integrate that understanding with human experience, common sense, and task environments, can the challenges of deception and hallucination be solved at the root. (The author is the director of the Human-Computer Interaction and Cognitive Engineering Laboratory at Beijing University of Posts and Telecommunications.)
