update paper and upload supported languages

guoday 2024-06-18 11:45:00 +08:00
parent c84bfab8c9
commit b12b33c0d9
3 changed files with 340 additions and 2 deletions


@@ -56,14 +56,14 @@
 # DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
 ## 1. Introduction
-We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from DeepSeek-Coder-V2-Base with 6 trillion tokens sourced from a high-quality and multi-source corpus. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-Coder-V2-Base, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K.
+We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K.
 <p align="center">
 <img width="100%" src="figures/performance.png">
 </p>
-In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found in the paper.
+In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found [here](supported_langs.txt).
 ## 2. Model Downloads

paper.pdf

Binary file not shown.

supported_langs.txt (new file, 338 lines added)

@@ -0,0 +1,338 @@
ABAP
AGS Script
AMD GPU
AMPL
ANSYS Parametric Design Language
ANTLR
APL
ASP
AWK
ActionScript
Ada
Agda
Alloy
AmbientTalk
Apache Configuration
AppleScript
Arc
Arduino
AspectJ
Assembly
Asymptote
Augeas
AutoHotkey
AutoIt
BC
BNF
BST
Berry
BitBake
BlitzBasic
BlitzMax
Bluespec
Boo
Boogie
Brainfuck
BrightScript
Bro
C
C#
C++
C2HS Haskell
CADL
CMake
COBOL
COBOLFree
CSS
CUDA
CapDL
Ceylon
Chapel
ChucK
Cirru
Click
Clojure
CoffeeScript
ColdFusion CFC
Common Lisp
Crystal
Csound
Csound Score
Cypher
Cython
DASM16
DM
Darcs Patch
Dart
Debian Control File
DeviceTree
Diff
Docker
Dockerfile
Dylan
EBNF
ELPi
Eiffel
Elixir
Elm
Emacs Lisp
EmberScript
Erlang
Execline
F#
F*
Factor
Fancy
Fantom
Felix
Fennel
Fish
Flux
Fortran
Fortran Fixed Form
FoxPro
FreeFem
FreeMarker
Futhark
G-Code
GAP
GAS
GDScript
GLSL
GSQL
Genshi
Gentoo Ebuild
Gentoo Eclass
Gettext Catalog
Glyph
Gnuplot
Go
Gosu
Grace
Gradle
Grammatical Framework
GraphQL
Graphviz DOT
Groff
Groovy
Groovy Server Pages
HCL
HLSL
HTML
HTML Django
HTML ERB
HTML PHP
HTTP
Handlebars
Haskell
Haxe
Hy
IGOR Pro
Idris
Inform 6 Template
Inno Setup
Io
Isabelle
J
JAGS
JCL
JFlex
JSON
JSONiq
JSX
Jade
Jasmin
Java
Java Server Pages
JavaScript
JavaScript MozPreproc
Julia
Jupyter Notebook
K
KRL
Kconfig
Koka
Kotlin
LFE
LLVM
LSL
Lean
Less
Lex
Lighttpd Configuration File
LilyPond
Limbo
Linker Script
Liquid
Literate Agda
Literate CoffeeScript
Logtalk
Lua
M4
MATLAB
MQL
MUF
Makefile
Mako
Mason
Maxima
Meson
Metal
MiniScript
Mirah
Mizar
Modelica
Modula-2
Monkey
MooCode
MoonScript
Mosel
MuPAD
NASM
NCL
NSIS
NetLinx
Nginx Configuration File
Nimrod
Ninja
Nit
Nix
Nu
NuSMV
OCaml
OMG Interface Definition Language
Objdump
Objective-C
Objective-C++
Octave
Odin
Opa
OpenCL
OpenEdge ABL
OpenSCAD
Ox
Oz
PAWN
PEG
PHP
POD
POV-Ray
Papyrus
Parrot Internal Representation
Pascal
Perl
Perl 6
Pike
PkgConfig
Pony
PowerShell
Praat
Processing
Propeller Spin
Protocol Buffer
Pug
Puppet
PureBasic
PureScript
Python
Q
QML
QVTO
R
RAML
RConsole
REALbasic
REXX
RHTML
Racket
Ragel in Ruby Host
Rd
ReasonML
Red
Ren'Py
RenderScript
Ride
Robot Framework
Rouge
Ruby
Rust
S
SARL
SAS
SCSS
SMT
SPARQL
SQF
SQL
SWIG
Sage
Sass
Scala
Scheme
Scilab
Self
ShExC
Shell
Sieve
Silver
Singularity
Slim
Smali
Smarty
Smithy
Solidity
SourcePawn
Squirrel
Stan
Standard ML
Stata
Stylus
SuperCollider
Swift
SystemVerilog
Tcl
Tcsh
TeX
Tea
Terminfo
Thrift
Transact-SQL
Treetop
Turing
Twig
TypeScript
TypoScript
USD
Unity3D Asset
Uno
UnrealScript
UrWeb
VBScript
VCL
VHDL
Vala
Velocity
Verilog
VimL
Visual Basic
Vue
Web IDL
WebAssembly
Whiley
X10
XBase
XC
XML
XML Lasso
XQuery
XS
XSLT
Xtend
Xtlang
YANG
Zeek
Zephir
Zig
Zimpl
eC
ooc