mirror of https://www.modelscope.cn/Shanghai_AI_Laboratory/internlm3-8b-instruct.git
synced 2025-08-13 20:25:53 +08:00

Update modeling_internlm3.py (#18)

- Update modeling_internlm3.py (94cd46f35e87e1b3b2b82df73230bdb5275cd652)
- Update tokenization_internlm3.py (0f3d7019880c0b6f7a9d35b392d21cbfca07478b)

This commit is contained in:
parent 2ecb8953b0
commit 03ffaab065

README.md (148 changed lines)
@@ -1,7 +1,7 @@
---
license: apache-2.0
pipeline_tag: text-generation
---

# InternLM
@@ -23,7 +23,7 @@ license: apache-2.0

[](https://github.com/internLM/OpenCompass/)

-[💻Github Repo](https://github.com/InternLM/InternLM) • [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new) • [📜Technical Report](https://arxiv.org/abs/2403.17297)
+[💻Github Repo](https://github.com/InternLM/InternLM) • [🤗Demo](https://huggingface.co/spaces/internlm/internlm3-8b-instruct) • [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new) • [📜Technical Report](https://arxiv.org/abs/2403.17297)

</div>
@@ -48,25 +48,26 @@ InternLM3 supports both the deep thinking mode for solving complicated reasoning

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results; you can visit the [OpenCompass leaderboard](https://rank.opencompass.org.cn) for more evaluation results.

-| Benchmark | | InternLM3-8B-Instruct | Qwen2.5-7B-Instruct | Llama3.1-8B-Instruct | GPT-4o-mini(close source) |
-| ------------ | ------------------------------- | --------------------- | ------------------- | -------------------- | ------------------------- |
+| | Benchmark | InternLM3-8B-Instruct | Qwen2.5-7B-Instruct | Llama3.1-8B-Instruct | GPT-4o-mini(closed source) |
+| ------------ | ------------------------------- | --------------------- | ------------------- | -------------------- | -------------------------- |
| General | CMMLU(0-shot) | **83.1** | 75.8 | 53.9 | 66.0 |
| | MMLU(0-shot) | 76.6 | **76.8** | 71.8 | 82.7 |
| | MMLU-Pro(0-shot) | **57.6** | 56.2 | 48.1 | 64.1 |
| Reasoning | GPQA-Diamond(0-shot) | **37.4** | 33.3 | 24.2 | 42.9 |
| | DROP(0-shot) | **83.1** | 80.4 | 81.6 | 85.2 |
| | HellaSwag(10-shot) | **91.2** | 85.3 | 76.7 | 89.5 |
| | KOR-Bench(0-shot) | **56.4** | 44.6 | 47.7 | 58.2 |
| MATH | MATH-500(0-shot) | **83.0*** | 72.4 | 48.4 | 74.0 |
| | AIME2024(0-shot) | **20.0*** | 16.7 | 6.7 | 13.3 |
| Coding | LiveCodeBench(2407-2409 Pass@1) | **17.8** | 16.8 | 12.9 | 21.8 |
| | HumanEval(Pass@1) | 82.3 | **85.4** | 72.0 | 86.6 |
| Instruction | IFEval(Prompt-Strict) | **79.3** | 71.7 | 75.2 | 79.7 |
| Long Context | RULER(4-128K Average) | 87.9 | 81.4 | **88.5** | 90.7 |
| Chat | AlpacaEval 2.0(LC WinRate) | **51.1** | 30.3 | 25.0 | 50.7 |
| | WildBench(Raw Score) | **33.1** | 23.3 | 1.5 | 40.3 |
| | MT-Bench-101(Score 1-10) | **8.59** | 8.49 | 8.37 | 8.87 |

- Values marked in bold indicate the **highest** among the open-source models.
- The evaluation results were obtained from [OpenCompass](https://github.com/internLM/OpenCompass/) (entries marked with * were evaluated in thinking mode), and the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
- The evaluation data may show numerical differences across version iterations of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
@@ -85,8 +86,9 @@ To load the InternLM3 8B Instruct model using Transformers, use the following code:

```python
import torch
-from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm3-8b-instruct')
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_dir = "internlm/internlm3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
```
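As a reference for the snippet above, here is a minimal chat-generation sketch assuming the `tokenizer` and `model` objects it creates; the prompt and generation settings are illustrative, not taken from the README:

```python
# minimal sketch: chat generation with the objects loaded above (settings are illustrative)
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Please tell me five scenic spots in Shanghai"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").cuda()
generated = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.8, top_p=0.8)
# strip the prompt tokens before decoding the reply
print(tokenizer.decode(generated[0][inputs.shape[1]:], skip_special_tokens=True))
```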
@@ -161,6 +163,7 @@ Find more details in the [LMDeploy documentation](https://lmdeploy.readthedocs.i

#### Ollama inference

First install ollama:

```bash
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
```
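The pull-and-chat steps that follow the install fall outside this hunk. As a quick check, something like the line below should start an interactive session, though the exact registry tag is an assumption here:

```bash
# tag is illustrative; use whatever tag the README's elided `ollama pull` / Modelfile steps produce
ollama run internlm/internlm3-8b-instruct
```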
@@ -199,17 +202,14 @@ stream = ollama.chat(

```python
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
```

#### vLLM inference

-We are still working on merging the PR (https://github.com/vllm-project/vllm/pull/12037) into vLLM. In the meantime, please use the following PR link to install it manually.
+Refer to [installation](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) to install the latest code of vllm.

```bash
-git clone -b support-internlm3 https://github.com/RunningLeon/vllm.git
-# and then follow https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source to install
-cd vllm
-python use_existing_torch.py
-pip install -r requirements-build.txt
-pip install -e . --no-build-isolation
+pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```

inference code:
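The inference code referenced here is beyond this hunk. For orientation, a minimal offline-generation sketch against vLLM's Python API, assuming the nightly wheel above is installed; the sampling values are illustrative:

```python
from vllm import LLM, SamplingParams

# load the model with vLLM (trust_remote_code mirrors the transformers snippets above)
llm = LLM(model="internlm/internlm3-8b-instruct", trust_remote_code=True)
sampling = SamplingParams(temperature=0.8, top_p=0.8, max_tokens=512)
outputs = llm.generate(["Please tell me five scenic spots in Shanghai"], sampling)
for output in outputs:
    print(output.outputs[0].text)
```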
@@ -306,8 +306,9 @@ Focus on clear, logical progression of ideas and thorough explanation of your material.

#### Transformers inference

```python
import torch
-from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm3-8b-instruct')
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_dir = "internlm/internlm3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
```
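The hunk header above cuts into the thinking-mode system prompt ("Focus on clear, logical progression of ideas and thorough explanation of your material"). A sketch of wiring such a prompt into generation, assuming the `tokenizer` and `model` from the block above; the abbreviated prompt text and token budget are placeholders, not the README's full prompt:

```python
# abbreviated stand-in for the README's full thinking-mode system prompt
thinking_system = (
    "You are an expert reasoner. Reason step by step before answering. "
    "Focus on clear, logical progression of ideas and thorough explanation of your material."
)
messages = [
    {"role": "system", "content": thinking_system},
    {"role": "user", "content": "Prove that the square root of 2 is irrational."},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").cuda()
generated = model.generate(inputs, max_new_tokens=2048)  # thinking traces run long; budget is illustrative
print(tokenizer.decode(generated[0][inputs.shape[1]:], skip_special_tokens=True))
```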
@@ -403,14 +404,10 @@ for chunk in stream:

#### vLLM inference

-We are still working on merging the PR (https://github.com/vllm-project/vllm/pull/12037) into vLLM. In the meantime, please use the following PR link to install it manually.
+Refer to [installation](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) to install the latest code of vllm.

```bash
-git clone https://github.com/RunningLeon/vllm.git
-# and then follow https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source to install
-cd vllm
-python use_existing_torch.py
-pip install -r requirements-build.txt
-pip install -e . --no-build-isolation
+pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```

inference code:
@@ -474,25 +471,26 @@ InternLM3 supports a deep thinking mode for solving complex reasoning tasks via long chains of thought

We conducted a comprehensive evaluation of InternLM with the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/), covering five capability dimensions: disciplinary, language, knowledge, reasoning, and comprehension. Some of the results are shown in the table below; you are welcome to visit the [OpenCompass leaderboard](https://rank.opencompass.org.cn) for more results.

-| Benchmark \ Model | | InternLM3-8B-Instruct | Qwen2.5-7B-Instruct | Llama3.1-8B-Instruct | GPT-4o-mini(close source) |
-| ------------ | ------------------------------- | --------------------- | ------------------- | -------------------- | ------------------------- |
+| | Benchmark \ Model | InternLM3-8B-Instruct | Qwen2.5-7B-Instruct | Llama3.1-8B-Instruct | GPT-4o-mini(closed source) |
+| ------------ | ------------------------------- | --------------------- | ------------------- | -------------------- | ----------------- |
| General | CMMLU(0-shot) | **83.1** | 75.8 | 53.9 | 66.0 |
| | MMLU(0-shot) | 76.6 | **76.8** | 71.8 | 82.7 |
| | MMLU-Pro(0-shot) | **57.6** | 56.2 | 48.1 | 64.1 |
| Reasoning | GPQA-Diamond(0-shot) | **37.4** | 33.3 | 24.2 | 42.9 |
| | DROP(0-shot) | **83.1** | 80.4 | 81.6 | 85.2 |
| | HellaSwag(10-shot) | **91.2** | 85.3 | 76.7 | 89.5 |
| | KOR-Bench(0-shot) | **56.4** | 44.6 | 47.7 | 58.2 |
| MATH | MATH-500(0-shot) | **83.0*** | 72.4 | 48.4 | 74.0 |
| | AIME2024(0-shot) | **20.0*** | 16.7 | 6.7 | 13.3 |
| Coding | LiveCodeBench(2407-2409 Pass@1) | **17.8** | 16.8 | 12.9 | 21.8 |
| | HumanEval(Pass@1) | 82.3 | **85.4** | 72.0 | 86.6 |
| Instruction | IFEval(Prompt-Strict) | **79.3** | 71.7 | 75.2 | 79.7 |
| Long Context | RULER(4-128K Average) | 87.9 | 81.4 | **88.5** | 90.7 |
| Chat | AlpacaEval 2.0(LC WinRate) | **51.1** | 30.3 | 25.0 | 50.7 |
| | WildBench(Raw Score) | **33.1** | 23.3 | 1.5 | 40.3 |
| | MT-Bench-101(Score 1-10) | **8.59** | 8.49 | 8.37 | 8.87 |

- Bold values indicate the highest among the open-source models compared.
- The evaluation results above were obtained with [OpenCompass](https://github.com/internLM/OpenCompass/) (entries marked with `*` were evaluated in deep thinking mode); see the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/) for test details.
- Numbers may differ across version iterations of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest [OpenCompass](https://github.com/internLM/OpenCompass/) results.
@@ -515,8 +513,9 @@ transformers >= 4.48

```python
import torch
-from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm3-8b-instruct')
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_dir = "internlm/internlm3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
```
@@ -634,17 +633,13 @@ for chunk in stream:

##### vLLM inference

-We are still pushing to get the PR (https://github.com/vllm-project/vllm/pull/12037) merged into vllm; for now, please install it manually from the PR link below.
+Refer to the [docs](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) to install the latest vllm code.

-```python
-git clone https://github.com/RunningLeon/vllm.git
-# and then follow https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source to install
-cd vllm
-python use_existing_torch.py
-pip install -r requirements-build.txt
-pip install -e . --no-build-isolation
+```bash
+pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```

Inference code:
@@ -740,11 +735,12 @@ Focus on clear, logical progression of ideas and thorough explanation of your material.

```python
import torch
-from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm3-8b-instruct')
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_dir = "internlm/internlm3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
-model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True, torch_dtype=torch.float16).cuda()
+model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
# (Optional) On low-resource devices, you can load the model in 4-bit or 8-bit via bitsandbytes to further save GPU memory.
# InternLM3 8B in 4-bit costs nearly 8GB of GPU memory.
# pip install -U bitsandbytes
```
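The commented hint above points to bitsandbytes for 4-bit/8-bit loading. A sketch of the 4-bit variant using transformers' `BitsAndBytesConfig`; the quantization settings are illustrative assumptions, not from the README:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit weights with bfloat16 compute; per the README's note this needs roughly 8GB of GPU memory
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm3-8b-instruct",
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map="auto",
)
```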
@@ -837,15 +833,10 @@ for chunk in stream:

##### vLLM inference

-We are still pushing to get the PR (https://github.com/vllm-project/vllm/pull/12037) merged into vllm; for now, please install it manually from the PR link below.
+Refer to the [docs](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) to install the latest vllm code.

-```python
-git clone https://github.com/RunningLeon/vllm.git
-# and then follow https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source to install
-cd vllm
-python use_existing_torch.py
-pip install -r requirements-build.txt
-pip install -e . --no-build-isolation
+```bash
+pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```

Inference code:
@@ -895,5 +886,4 @@ print(outputs)

  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
-```