2 files changed: +0 -13 lines changed
@@ -25,7 +25,6 @@ DevOps-Eval is a comprehensive evaluation suite specifically designed for founda
 ## 📜 Table of Contents
 
 - [🏆 Leaderboard](#-leaderboard)
-- [🛠️ Results On Validation Split](#-results-on-validation-split)
 - [⏬ Data](#-data)
 - [🚀 How to Evaluate](#-how-to-evaluate)
 - [🧭 TODO](#-todo)
@@ -74,12 +73,6 @@ Below are zero-shot and five-shot accuracies from the models that we evaluate in
 | Baichuan2-7B-Chat | 60.61 | 64.95 | 81.19 | 75.88 | 71.23 | 75.69 | 78.36 | 79.17 | 70.49 |
 | Internlm-7B-Base | 62.12 | 65.25 | 77.52 | 80.7 | 74.06 | 78.82 | 79.85 | 75.46 | 69.17 |
 
-
-## 🛠️ Results On Validation Split
-coming soon
-<br>
-<br>
-
 ## ⏬ Data
 #### Download
 * Method 1: Download the zip file (you can also simply open the following link with the browser):
@@ -24,7 +24,6 @@ DevOps-Eval是一个专门为DevOps领域大模型设计的综合评估数据集
 ## 📜 目录
 
 - [🏆 排行榜](#-排行榜)
-- [🛠️ 验证集结果](#-验证集结果)
 - [⏬ 数据](#-数据)
 - [🚀 如何进行测试](#-如何进行测试)
 - [🧭 TODO](#-todo)
@@ -73,11 +72,6 @@ Below are zero-shot and five-shot accuracies from the models that we evaluate in
 | Baichuan2-7B-Chat | 60.61 | 64.95 | 81.19 | 75.88 | 71.23 | 75.69 | 78.36 | 79.17 | 70.49 |
 | Internlm-7B-Base | 62.12 | 65.25 | 77.52 | 80.7 | 74.06 | 78.82 | 79.85 | 75.46 | 69.17 |
 
-## 🛠️ 验证集结果
-coming soon
-<br>
-<br>
-
 ## ⏬ 数据
 #### 下载
 * 方法一:下载zip压缩文件(你也可以直接用浏览器打开下面的链接):