OpenAI's Latest Research: Why GPT-5 and Other LLMs Still Hallucinate
OpenAI has released a new research paper stating that even though large language models (LLMs) such as GPT-5 have made significant progress, "AI hallucinations" remain a fundamental problem that may never be completely eliminated. Through experiments, the research team showed that models can confidently give completely wrong answers to certain questions, and it proposed reforming the "evaluation mechanism" to help curb models' tendency to guess blindly.
Researchers tested an AI chatbot with different questions, and every answer was wrong.
Researchers asked a widely used chatbot for the topic of a certain person's doctoral dissertation and received three incorrect answers in a row. They then asked for that person's birthday, and the chatbot again gave three different dates, all of them wrong.
The research indicates that when a question touches on information that appears only rarely in the training data, AI models tend to answer with high confidence yet can be wildly incorrect.
Pre-training only learns the "surface of language" and does not understand factual accuracy.
The research explains that models are pre-trained by "predicting the next word" over vast amounts of text, but that data carries no "true or false" labels. In other words, the model only learns the surface of language, not factual correctness.
As models grow larger, errors in regular patterns such as spelling or bracket matching gradually disappear.
However, highly random information, such as a person's birthday, cannot be inferred from language patterns alone, so the model is prone to hallucinating it.
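To make that point concrete, here is a minimal sketch of the next-word-prediction objective on a made-up two-sentence corpus (a toy bigram model, not OpenAI's actual training setup): the loss only measures how well the next word is predicted, so a factually true sentence and a factually false one can be scored identically.

```python
# Toy illustration: pre-training optimizes next-word prediction only.
# No "true/false" label ever enters the objective.
import math
from collections import Counter, defaultdict

corpus = [
    "the moon orbits the earth",  # factually true sentence
    "the moon orbits the sun",    # factually false sentence
]

# Count bigram transitions; both sentences contribute equally,
# because the objective has no notion of factual correctness.
counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

def next_word_probs(prev):
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

def cross_entropy(sentence):
    tokens = sentence.split()
    nll = sum(-math.log(next_word_probs(p)[n]) for p, n in zip(tokens, tokens[1:]))
    return nll / (len(tokens) - 1)

# Both sentences receive the same loss: the training signal cannot tell them apart.
print(cross_entropy("the moon orbits the earth"))
print(cross_entropy("the moon orbits the sun"))
```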
Current evaluations encourage AI models to "guess blindly," and the evaluation method needs to be reformed.
The research emphasizes that evaluation needs a major overhaul: the focus should not be merely on "right or wrong," but on heavily penalizing confident-but-incorrect answers while rewarding the AI for honestly saying "I don't know." In other words, a wrong answer should cost the AI more than admitting it doesn't know.
Conversely, an answer of "uncertain" should earn some partial credit rather than being scored as zero. And this cannot be achieved by bolting on a few extra tests for show; the current evaluation system, which looks only at accuracy, has to be fundamentally overhauled. If the evaluation method is not fixed, AI will keep guessing at random.
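As a rough illustration of that rubric, the sketch below scores a single question (the numbers and the self-reported confidence parameter are hypothetical, not values from the paper): confident wrong answers are penalized, abstentions earn partial credit, and correct answers earn full credit.

```python
# Hedged sketch of the proposed scoring idea, with illustrative values only.

def score(answer: str, correct_answer: str, confidence: float) -> float:
    """Score one question.

    answer         -- the model's reply, or "I don't know"
    correct_answer -- the reference answer
    confidence     -- the model's self-reported confidence in [0, 1] (hypothetical)
    """
    if answer.strip().lower() == "i don't know":
        return 0.25                  # honest abstention earns partial credit
    if answer.strip().lower() == correct_answer.strip().lower():
        return 1.0                   # correct answer: full credit
    return -2.0 * confidence         # wrong answer: penalty grows with confidence

# Under accuracy-only scoring, guessing and abstaining both score 0, so a model
# that always guesses looks no worse. Under this rubric, a confident wrong guess
# scores below an honest "I don't know."
print(score("March 3", "June 17", confidence=0.9))        # -1.8: confident and wrong
print(score("I don't know", "June 17", confidence=0.0))   #  0.25: honest abstention
print(score("June 17", "June 17", confidence=0.8))        #  1.0: correct
```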
The research ultimately concludes that reducing hallucinations has to start with the evaluation system, by building tests that genuinely reward "caution and honesty." Rather than demanding that AI "get it right every time," it is more important to establish rules of the game that accept an AI saying "I don't know."
This article, OpenAI's Latest Research: Why GPT-5 and Other LLMs Still Hallucinate, first appeared in Chain News ABMedia.