Technological advances

Post by **monira444** » Mon Feb 17, 2025 4:27 am

The creators of the neural network used a relatively new training method that requires fewer computing resources. The model was trained in just two months on a cluster of Nvidia H800 GPUs at a cost of $5.5 million (for comparison, OpenAI spent $78 million on training GPT).

The neural network also stands out for its “smart” architecture, which uses resources only when they are really needed, the expert notes. It combines Multi-token Prediction (MTP), Mixture of Experts (MoE), and Multi-head Latent Attention (MLA). According to the expert, these techniques made it possible to improve the model's accuracy and performance, speed up training, and handle varied input data. For example, he says, MLA lets the model extract key details from a text fragment several times rather than just once, which helps it avoid missing important information.
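To make the "uses resources only when needed" idea concrete, here is a minimal sketch of generic top-k Mixture of Experts routing. It is a toy illustration, not DeepSeek's actual implementation: the function names (`moe_forward`, `make_expert`), the gate weights, the expert count, and the toy "experts" are all assumptions made up for this example.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical "expert": a toy function standing in for a
    # feed-forward sub-network inside the model.
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    # Gate score per expert: dot product of the input with that
    # expert's gate vector, turned into probabilities.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Keep only the top_k highest-scoring experts; the rest are never run.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm       # renormalise over the chosen experts
        y = experts[i](x)              # only these experts cost compute
        out = [o + weight * v for o, v in zip(out, y)]
    return out, chosen

random.seed(0)
dim, n_experts = 4, 8                  # assumed toy sizes
experts = [make_expert(scale=i + 1) for i in range(n_experts)]
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]

x = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(x, gate_weights, experts, top_k=2)
print(f"active experts: {active} ({len(active)} of {n_experts})")
```

With `top_k=2`, only two of the eight experts do any work for this input, which is where the compute savings of this kind of architecture come from.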

Economic efficiency

The "Convenient City" expert is sure that the Chinese chatbot's popularity stems both from its general availability and from the fact that tens of times less money was spent on training it than competitors spent on theirs. In addition, the cost of use is up to 27 times lower than OpenAI's, and the model depends far less on expensive hardware to run.

"According to publicly available data, the cost of creating DeepSeek was only 2% of the investment in OpenAI. That's $12 million versus $500 million spent on developing GPT-5."

Alexander Kasyanov

Leader of the international phygital project "Convenient City"
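The cost figures quoted in this thread can be compared directly. A quick sketch using only the numbers cited above (the variable names are mine, and the figures themselves are taken as quoted, not verified):

```python
# Figures exactly as quoted in the post, in millions of US dollars.
deepseek_training = 5.5   # DeepSeek training cost
openai_training = 78      # OpenAI training cost for GPT
deepseek_total = 12       # DeepSeek creation cost (Kasyanov's quote)
openai_total = 500        # OpenAI development cost (Kasyanov's quote)

training_ratio = openai_training / deepseek_training
total_share = deepseek_total / openai_total * 100

print(f"Training cost ratio: {training_ratio:.1f}x")    # 14.2x
print(f"Creation cost share: {total_share:.1f}%")       # 2.4%
```

So the quoted training figures differ by a factor of about 14, and $12 million is 2.4% of $500 million, which is roughly the "2%" in the quote.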

According to him, the popularity is also the result of an aggressive market entry policy: DeepSeek applied a comprehensive launch strategy aimed at capturing the market as quickly as possible. The creators released a web chat for the general public, launched mobile applications, offered developer tools at revolutionarily low prices, and opened the source code, allowing the community to take part in development. A flexible payment system makes it easy to experiment with repeated requests. At the same time, ordinary users face no one-time purchases or subscriptions: everything is available for free right away.

"DeepSeek has no regional restrictions, unlike ChatGPT. The network operates in Russia, the US, Europe, China, and the Middle East, supports more than 20 languages, and you don't need to jump through hoops to use it. This, among other things, allows the startup to quickly expand its audience," says Kasyanov.

What DeepSeek V3 Can Do

The new neural network can analyze up to 300 pages of text. It generates texts in different genres, searches for information on the Internet, interprets diagrams, and explains pictures. It can program in C++, Go, Java, JavaScript, Python, and Rust, and it integrates with code editors, the expert notes.

The AI can write code, solve complex problems, and reason step by step in DeepThink mode. DeepSeek V3 works in multiple languages and is strongest with Chinese and English texts.