From Technology to Ethics: Solving the Problem of AI "Lying"

Recently, a rumor that "the death rate of the post-80s generation had exceeded 5.2% by the end of 2024" sparked heated discussion, and many people believed it. It was later discovered that the "initiator" of the rumor was most likely artificial intelligence (AI): the model probably made an error in its calculations while answering a question, and the mistake was then widely circulated, fueled by self-media accounts.

With the rapid development of large models and the exponential growth in their user numbers, corpus quality has become increasingly uneven. "Machine deception" and "machine hallucination" have become core challenges for today's generative artificial intelligence, profoundly affecting its credibility and practicality. Strictly speaking, they are an inevitable consequence of the nonlinear composite functions inside multilayer neural networks, an "Achilles' heel" that is difficult to eradicate.
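
To make "nonlinear composite function" concrete: in standard textbook notation (not tied to any particular model), a feed-forward network of depth L computes nested affine maps interleaved with a nonlinear activation:

```latex
% A depth-L feed-forward network as a composite function:
% affine maps W_i x + b_i interleaved with a nonlinear activation \sigma.
f(x) = \sigma\!\left( W_L \,\sigma\!\left( \cdots \,\sigma\!\left( W_1 x + b_1 \right) \cdots \right) + b_L \right)
```

Each weight matrix W_i influences the output only through this deep nesting, so no single parameter carries an isolated, human-readable meaning; this is the sense in which the problem is hard to eradicate.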

"Machine fraud" refers to the large model generating seemingly reasonable but actually false and misleading content, and deliberately covering up its uncertainty, such as fabricating authoritative data in a question-and-answer system, actively avoiding (or even inducing) sensitive issues rather than acknowledging blind spots in knowledge. There are roughly three reasons: First, the deviation between the corpus and the training data causes the model to learn from data containing false information or misleading remarks, and the output is naturally a wrong result; Second, the objective function driving mechanism set by the big model simply takes "user satisfaction" as the optimization goal, which will lead to the model tending to provide "answer that users want to hear" rather than real answers; Third, most models lack moral alignment and do not explicitly embed "integrity" as the core principle, making the model possible to choose "efficiently achieve the goal" rather than "correct".

"Machine Illusion" generally refers to the logically self-consistent but detached from reality generated by the big model. It is typically manifested as fictional facts, characters, and events, such as fabricating historical events details or inventing scientific theories that do not exist. Strictly speaking, machine hallucination is not intentional deception, but an inherent flaw in the model when generating "reasonable text" based on probability, and its main cause lies in statistical model dependence. This leads to insurmountable defects in its genes. For example, there is a nonlinear composite function composed of linear functions and trigger functions in a multi-inner neural network system. This is the fundamental reason why its parameter weight allocation is unexplainable, and it is also the inherent reason why the model generates text through word frequency co-occurrence, rather than understanding the authenticity of semantics. The result is that the knowledge boundaries of the big model are relatively vague, and the time lag of the training data makes it impossible to distinguish outdated information from current facts. At the same time, causal reasoning is missing, and the causal chain of real-world events cannot be established. It only relies on surface associations for logical links, resulting in the logic of the output being often plausible.

The impact of machine deception and machine hallucination is mainly reflected in information pollution, including the spread of false content and the effect of wrong data on public decision-making. The consequences of letting them proliferate are hard to overstate. First, trust between humans and machines may collapse: users who are repeatedly deceived may abandon AI tools altogether. Second, if models are used for attacks on social systems or for malicious deception, they may even trigger social and ethical crises. Third, they may distort cultural cognition: fictionalized accounts of history and culture can foster false collective memory and provoke a crisis of collective belief.

As noted above, machine deception and machine hallucination are difficult to eradicate; their impact can only be mitigated through continuous optimization. At the technical level, alignment training should first be strengthened, with "honesty first" made an explicit requirement through RLHF (reinforcement learning from human feedback). Second, hybrid architectures should be adopted: combining the generative model with a retrieval system enables dynamic fact-checking through a "generate-then-verify" closed loop (sketched below), in which real-time databases, including academic journals and news media, are consulted to verify the output. Uncertainty quantification should also be strengthened by requiring the model to state the confidence of its answers, for example "I am 90% confident this figure comes from the 2024 statistics", to improve the traceability of information sources. At the ethical and regulatory level, transparency standards should be established, for example requiring an AI system to declare its knowledge cutoff and potential error range; industry certification mechanisms and review processes for AI output should be implemented; and oversight of outputs should be strengthened.
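
A schematic of the "generate-then-verify" closed loop described above (a minimal sketch: the fact store, the stub draft, and all function names are hypothetical stand-ins for a real model plus a retrieval backend): draft an answer, check each claim against retrieved sources, and surface a confidence figure instead of asserting unverified content.

```python
from dataclasses import dataclass

# Schematic "generate-then-verify" loop. A real system would call a
# generative model and query live sources (journals, news databases).

@dataclass
class Claim:
    text: str
    supported: bool  # did retrieval find corroborating evidence?

FACT_STORE = {"official 2024 statistics were published"}  # stand-in retrieval backend

def generate_draft(question: str) -> list[str]:
    # Stand-in for the model's raw answer, already split into checkable claims.
    return ["official 2024 statistics were published",
            "the post-80s death rate exceeds 5.2%"]

def verify(claims: list[str]) -> list[Claim]:
    return [Claim(text, text in FACT_STORE) for text in claims]

def answer(question: str) -> str:
    claims = verify(generate_draft(question))
    confidence = sum(c.supported for c in claims) / len(claims)
    unverified = [c.text for c in claims if not c.supported]
    if unverified:  # surface uncertainty instead of asserting the draft as fact
        return (f"Confidence {confidence:.0%}: could not verify {unverified}; "
                "treat these statements as uncertain.")
    return f"Confidence {confidence:.0%}: all claims verified against sources."

print(answer("Did the post-80s death rate exceed 5.2% by the end of 2024?"))
# Confidence 50%: could not verify ['the post-80s death rate exceeds 5.2%']; ...
```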

In short, machine deception and hallucination are rooted in the fact that most current AI models focus on technology while lacking an "understanding" of the world and "values". To reverse this, we need to shift from pure probability models toward "cognitive architectures", introducing symbolic logic, causal reasoning, and ethical constraints to make models more "human-like". Only when machines truly understand "true and false", "beautiful and ugly", and "good and evil", and genuinely combine these with human experience, common sense, and the task environment, can the challenges of deception and hallucination be fundamentally resolved. (The author is the director of the Laboratory of Human-Computer Interaction and Cognitive Engineering, Beijing University of Posts and Telecommunications)
