Fix the column according to the original paper #3

Open
wants to merge 2 commits into
base: master
3 changes: 1 addition & 2 deletions README.md
@@ -1,4 +1,3 @@

<div align="center">

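<h1>质衡 (Q-Bench): A Benchmark for General-Purpose Foundation Models on Low-Level Vision </h1>
@@ -107,7 +106,7 @@ _Q-Bench Chinese edition, including the Chinese [Low-Level Visual Question Answering] and [Low-Level Visual Descri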

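Because GPT-assisted evaluation in this task is highly subjective, scores on the Chinese description task cannot be compared directly with scores on the English description task. In addition, the absolute scores of the *precision* metric under Chinese GPT-assisted evaluation still have some issues (the relative proportions of the scores are broadly consistent with human perception), so the leaderboard in its current version is for reference only.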

-| **Model Name** | p_{0, completeness} | p_{0, completeness} | p_{2, completeness} | s_{completeness} | p_{0, precision} | p_{0, precision} | p_{2, precision} | s_{precision} | p_{0, relevance} | p_{0, relevance} | p_{2, relevance} | s_{relevance} | s_{total} |
+| **Model Name** | P<sub>0</sub> (completeness) | P<sub>1</sub> (completeness) | P<sub>2</sub> (completeness) | score (completeness) | P<sub>0</sub> (precision) | P<sub>1</sub> (precision) | P<sub>2</sub> (precision) | score (precision) | P<sub>0</sub> (relevance) | P<sub>1</sub> (relevance) | P<sub>2</sub> (relevance) | score (relevance) | Sum (total) |
| - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| llava_v1.5 | 4.61% | 52.53% | 42.86% | 1.38/2.00 | 67.95% | 26.84% | 5.20% | 0.37/2.00 | 0.65% | 23.36% | 75.99% | 1.75/2.00 | 3.51/6.00 |
| qwen_vl | 12.79% | 53.05% | 34.16% | 1.21/2.00 | 62.46% | 31.58% | 5.96% | 0.44/2.00 | 15.24% | 34.23% | 50.54% | 1.35/2.00 | 3.00/6.00 |
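As a sanity check on the column order, the visible rows are consistent with each score being the expected rating level, that is, 0·P0 + 1·P1 + 2·P2, and with the total being the sum of the three dimension scores. This relationship is inferred from the table values and is not stated in the diff itself; the function name `expected_score` and the normalization step are illustrative assumptions. A minimal sketch:

```python
def expected_score(p0, p1, p2):
    """Expected rating level (0, 1 or 2) from three percentages.

    Assumption: score = 0*P0 + 1*P1 + 2*P2, inferred from the table rows.
    Dividing by the sum absorbs rounding in the published percentages.
    """
    total = p0 + p1 + p2
    return (0 * p0 + 1 * p1 + 2 * p2) / total


# llava_v1.5 row: (P0, P1, P2) percentages for each dimension
llava = {
    "completeness": (4.61, 52.53, 42.86),
    "precision": (67.95, 26.84, 5.20),
    "relevance": (0.65, 23.36, 75.99),
}

scores = {name: expected_score(*p) for name, p in llava.items()}
overall = sum(scores.values())

for name, value in scores.items():
    print(f"{name}: {value:.2f}/2.00")
print(f"total: {overall:.2f}/6.00")
```

With the llava_v1.5 row this yields values that round to 1.38, 0.37, and 1.75, with a total of 3.51, matching the table. That agreement only holds when the three probability columns are read in the order P0, P1, P2.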