Mon 07 April 2025:
Chinese artificial intelligence (AI) start-up DeepSeek has introduced a new method for enhancing the reasoning abilities of large language models (LLMs), reportedly surpassing current approaches.
Working with researchers from Tsinghua University, DeepSeek developed a dual technique that combines generative reward modeling (GRM) with self-principled critique tuning, the South China Morning Post reported on Sunday.
This dual method is designed to enable LLMs to deliver faster and more accurate responses to general queries, according to a paper published on Friday.
The researchers said the resulting DeepSeek-GRM models outperformed existing techniques, achieving “competitive performance” with robust public reward models. Reward modeling is a process used to align an LLM’s behavior with human preferences.
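For readers unfamiliar with the concept, conventional scalar reward models are typically trained on pairs of human-ranked responses using a Bradley-Terry style preference loss. The sketch below is a minimal, generic illustration of that standard training objective; it is not DeepSeek's GRM or self-principled critique tuning method (which, per the paper, generates critiques rather than a single score), and all names and values in it are illustrative.

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(reward_chosen: torch.Tensor,
                         reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry preference loss: push the score the model assigns
    to the human-preferred response above the score of the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy scores a reward model might assign to two candidate responses.
chosen = torch.tensor([1.8, 0.4])    # scores for preferred responses
rejected = torch.tensor([0.2, -0.1]) # scores for rejected responses
print(pairwise_reward_loss(chosen, rejected))  # loss shrinks as the gap widens
```

Minimizing this loss teaches the model to assign higher rewards to responses humans prefer, which is what "aligning an LLM's behavior with human preferences" means in practice.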
DeepSeek plans to make its GRM models open source, the researchers said, although no specific timeline was given.
The paper, published on the online scientific repository arXiv, comes amid growing interest in the company’s future developments, following the global attention drawn by its V3 foundation model and R1 reasoning model.
-Source: AA