Commit

Merge branch 'main' of https://github.com/mindspore-lab/mindocr into doc
horcham committed Dec 14, 2023
2 parents 8bf0224 + 6604ed9 commit 9429602
Showing 14 changed files with 1,120 additions and 573 deletions.
10 changes: 8 additions & 2 deletions README.md
@@ -184,6 +184,13 @@ You can do MindSpore Lite inference in MindOCR using **MindOCR models** or **Thi

</details>

+<details open markdown>
+<summary>Layout Analysis</summary>
+
+- [x] [YOLOv8](configs/layout/yolov8/README.md) ([Ultralytics Inc.](https://github.com/ultralytics/ultralytics))
+
+</details>
+
For the detailed performance of the trained models, please refer to [configs](./configs).

For details of MindSpore Lite and ACL inference models support, please refer to [MindOCR Models Support List](docs/en/inference/inference_quickstart.md) and [Third-party Models Support List](docs/en/inference/inference_thirdparty_quickstart.md) (PaddleOCR, MMOCR, etc.).
@@ -232,13 +239,12 @@ Frequently asked questions about configuring environment and mindocr, please ref
<summary>News</summary>
- 2023/12/05
1. Add new trained models
-- [YOLOv8 nano]()
+- [YOLOv8](configs/layout/yolov8) for layout analysis
- [VI-LayoutXLM](configs/kie/vi_layoutxlm/README_CN.md) for key information extraction
- [PP-OCRv3](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv3_introduction.md)
- [PP-OCRv3 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv3_det_cml.yaml) for text detection
- [PP-OCRv3 SVTR](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv3_rec_distillation.yml) for text recognition
2. Add new offline inference models
-- [YOLOv8 nano]() for table recognition, inference on Ascend310
- [PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv4_introduction.md) inference on Ascend310
- [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) for text detection
- [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml) for text recognition
8 changes: 7 additions & 1 deletion README_CN.md
@@ -183,6 +183,12 @@ python tools/infer/text/predict_system.py --image_dir {path_to_img or dir_to_img
- [x] [ABINet](configs/rec/abinet/README_CN.md) (CVPR'2021)
</details>

+<details open markdown>
+<summary>Layout Analysis</summary>
+
+- [x] [YOLOv8](configs/layout/yolov8/README_CN.md) ([Ultralytics Inc.](https://github.com/ultralytics/ultralytics))
+</details>
+
For the training procedures and results of the models above, see the readme in each model subdirectory under [configs](./configs).

For the list of models supported for [MindSpore Lite](https://www.mindspore.cn/lite) and [ACL](https://www.hiascend.com/document/detail/zh/canncommercial/63RC1/inferapplicationdev/aclcppdevg/aclcppdevg_000004.html) inference,
@@ -233,7 +239,7 @@ MindOCR provides [data format conversion tools](tools/dataset_converters) to

- 2023/12/05
1. Add new models
-- Document layout recognition [YOLOv8 nano]()
+- Layout analysis [YOLOv8](configs/layout/yolov8)
- Key information extraction [VI-LayoutXLM](configs/kie/vi_layoutxlm/README_CN.md), online training and inference
- [PP-OCRv3](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv3_introduction.md), third-party model training and inference
- Text detection [PP-OCRv3 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv3_det_cml.yaml)
907 changes: 606 additions & 301 deletions configs/det/dbnet/db_mobilenetv3_ppocrv3_param_map.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion configs/kie/layoutlm_series/ser_layoutxlm_xfund_zh.yaml
@@ -84,7 +84,7 @@ train:

loader:
shuffle: True
-batch_size: 4
+batch_size: 8
drop_remainder: True
num_workers: 8

2 changes: 1 addition & 1 deletion configs/kie/vi_layoutxlm/README_CN.md
@@ -54,7 +54,7 @@ Table Format:

| **Model** | **Task** | **Environment** | **Training Set** | **Parameters** | **Batch Size per Card** | **Graph-Mode Single-Card Training (s/epoch)** | **Graph-Mode Single-Card Training (ms/step)** | **Graph-Mode Single-Card Training (FPS)** | **hmean** | **Config** | **Model Weights Download** |
| :-----: | :-----: |:-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----:
-| VI-LayoutXLM | SER | D910Ax1-MS2.1-G | XFUND_zh | 265.7 M | 4 | 7.53 | 203.48 | 19.66 | 93.31% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/vi-layoutxlm/ser_vi_layoutxlm.ckpt) |
+| VI-LayoutXLM | SER | D910Ax1-MS2.1-G | XFUND_zh | 265.7 M | 8 | 3.06 | 169.7 | 47.2 | 93.31% | [yaml](ser_vi_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/vi-layoutxlm/ser_vi_layoutxlm-f3c83585.ckpt) |
</div>
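
The timing columns of the two VI-LayoutXLM rows above (batch size 4 and batch size 8) can be cross-checked with simple arithmetic. The sketch below assumes that FPS means samples per second on one card (batch size divided by step time) and that steps per epoch is epoch time divided by step time. Both formulas are inferred from the table, not taken from MindOCR's code, and the function names are hypothetical.

```python
def fps_from_step_time(batch_size: int, ms_per_step: float) -> float:
    """Samples processed per second on one card (assumed definition of FPS)."""
    return batch_size * 1000.0 / ms_per_step


def steps_from_timings(s_per_epoch: float, ms_per_step: float) -> float:
    """Approximate steps per epoch, derived from the two timing columns."""
    return s_per_epoch * 1000.0 / ms_per_step


# Values copied from the batch-size-4 row and the batch-size-8 row of the table.
old_fps = fps_from_step_time(4, 203.48)
new_fps = fps_from_step_time(8, 169.7)
print(f"batch 4: {old_fps:.2f} FPS, {steps_from_timings(7.53, 203.48):.1f} steps/epoch")
print(f"batch 8: {new_fps:.2f} FPS, {steps_from_timings(3.06, 169.7):.1f} steps/epoch")
```

Under these assumptions the batch-size-4 row reproduces its reported 19.66 FPS, and the batch-size-8 row gives roughly 47.1 FPS, close to the reported 47.2.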


4 changes: 2 additions & 2 deletions configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yaml
@@ -19,7 +19,7 @@ model:
head :
name: TokenClassificationHead
num_classes: 7
-use_visual_backbone: True
+use_visual_backbone: False
use_float16: True
pretrained:

@@ -85,7 +85,7 @@ train:

loader:
shuffle: True
-batch_size: 4
+batch_size: 8
drop_remainder: True
num_workers: 8
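
With `drop_remainder: True`, the final incomplete batch of each epoch is discarded, so raising `batch_size` from 4 to 8 also changes the number of steps per epoch. A minimal sketch of that arithmetic follows. The function name and the dataset size of 150 samples are hypothetical, chosen only for illustration.

```python
import math


def loader_steps_per_epoch(num_samples: int, batch_size: int, drop_remainder: bool) -> int:
    """Number of batches one pass over the dataset yields."""
    if drop_remainder:
        # The trailing partial batch is dropped.
        return num_samples // batch_size
    # The trailing partial batch is kept as a smaller batch.
    return math.ceil(num_samples / batch_size)


num_samples = 150  # hypothetical dataset size, not taken from the config
for batch_size in (4, 8):
    steps = loader_steps_per_epoch(num_samples, batch_size, drop_remainder=True)
    dropped = num_samples - steps * batch_size
    print(f"batch_size={batch_size}: {steps} steps, {dropped} samples dropped")
```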

2 changes: 1 addition & 1 deletion configs/layout/yolov8/README.md
@@ -140,7 +140,7 @@ Please [download](#2-results) the exported MindIR file first, or refer to the [M
python tools/export.py --model_name_or_config configs/layout/yolov8/yolov8n.yaml --data_shape 800 800 --local_ckpt_path /path/to/local_ckpt.ckpt
```

-The `data_shape` is the model input shape of height and width for MindIR file. The shape value of MindIR in the download link can be found in [Notes](#2-results) under results table.
+The `data_shape` is the model input shape of height and width for MindIR file. The shape value of MindIR in the download link can be found in [Notes](#2-results) under results table. `distribute` in yaml shall be set to False.

**2. Environment Installation**

4 changes: 2 additions & 2 deletions configs/layout/yolov8/README_CN.md
@@ -43,7 +43,7 @@ Table Format:

**Note:**

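-- Environment configuration: the training environment is written as {processor}x{number of processors}-{MS mode}, where the MindSpore mode is either G (graph mode) or F (pynative mode). For example, D910x8-MS2.2-G means training on 4 Ascend 910 NPUs in graph mode with MindSpore 2.2.
+- Environment configuration: the training environment is written as {processor}x{number of processors}-{MS mode}, where the MindSpore mode is either G (graph mode) or F (pynative mode). For example, D910x4-MS2.2-G means training on 4 Ascend 910 NPUs in graph mode with MindSpore 2.2.
- To reproduce the training results in a different environment configuration, make sure the global batch size matches that of the original config file.
- All models are trained from scratch, without any pre-training. For details on the training and test datasets, see the [PubLayNet Dataset Preparation](#3.1.2 PubLayNet数据集准备) section.
- All YOLOv8 MindIR files are exported with input shape (1, 3, 800, 800).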
@@ -154,7 +154,7 @@ python tools/eval.py --config configs/layout/yolov8/yolov8n.yaml
python tools/export.py --model_name_or_config configs/layout/yolov8/yolov8n.yaml --data_shape 800 800 --local_ckpt_path /path/to/local_ckpt.ckpt
```

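-Here, `data_shape` is the height and width of the model input shape used when exporting the MindIR. For the shape of the MindIR files in the download links, see the [notes](#2-评估结果)
+Here, `data_shape` is the height and width of the model input shape used when exporting the MindIR. For the shape of the MindIR files in the download links, see the [notes](#2-评估结果). `distribute` in the yaml must be set to False.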
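
The export invocation above can also be assembled programmatically, for instance when exporting several checkpoints. The sketch below only builds the argument list from the flags shown in the command (`--model_name_or_config`, `--data_shape`, `--local_ckpt_path`). The helper name `build_export_command` is hypothetical, and it neither checks nor modifies the `distribute` setting in the yaml, which still has to be set to False by hand.

```python
import shlex
from typing import List


def build_export_command(config: str, height: int, width: int, ckpt_path: str) -> List[str]:
    """Assemble the tools/export.py argument list (hypothetical helper)."""
    return [
        "python", "tools/export.py",
        "--model_name_or_config", config,
        # data_shape is passed as two values: height, then width.
        "--data_shape", str(height), str(width),
        "--local_ckpt_path", ckpt_path,
    ]


cmd = build_export_command(
    "configs/layout/yolov8/yolov8n.yaml", 800, 800, "/path/to/local_ckpt.ckpt"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.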

**2. Environment Setup**

2 changes: 1 addition & 1 deletion configs/layout/yolov8/yolov8n.yaml
@@ -7,7 +7,7 @@ system:
log_interval: 100
val_while_train: False
drop_overflow_update: False
-ckpt_max_keep: 100
+ckpt_max_keep: 500
device_id: 0

common: