Finca Calvia
Founded Date: December 3, 1926
Sectors: Education Training
Posted Jobs: 0
Viewed: 108
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
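
The distillation described above amounts to supervised fine-tuning of a smaller model on reasoning data produced by a larger one. As a rough illustration of the data-preparation step only, the sketch below turns teacher outputs into prompt/completion records. Everything named here is an assumption made for illustration: `teacher_generate` is a hypothetical stub standing in for a call to a large reasoning model, the `<think>` wrapper is just one possible way to mark the reasoning span, and the record layout is a generic JSONL format rather than the DeepSeek team's actual pipeline.

```python
import json
from typing import Callable, Dict, List


def teacher_generate(prompt: str) -> Dict[str, str]:
    """Hypothetical stand-in for querying a large teacher model.

    A real pipeline would call the teacher here; this stub returns
    fixed placeholder text so the example runs offline.
    """
    return {
        "reasoning": f"Work through the problem step by step: {prompt}",
        "answer": "placeholder answer",
    }


def build_sft_records(
    prompts: List[str],
    teacher: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's reasoning and final answer."""
    records = []
    for prompt in prompts:
        out = teacher(prompt)
        # Assumed layout: reasoning wrapped in <think> tags, then the answer.
        completion = f"<think>\n{out['reasoning']}\n</think>\n{out['answer']}"
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    prompts = ["What is 2 + 2?", "Is 17 a prime number?"]
    print(to_jsonl(build_sft_records(prompts, teacher_generate)))
```

The resulting records could then be fed to any standard supervised fine-tuning loop for the smaller model. In practice, a filtering step that keeps only traces with correct final answers would typically sit between generation and training.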
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.

