Merge branch 'main' of https://github.com/mindspore-lab/mindocr into doc
horcham committed Dec 14, 2023
2 parents 8bf0224 + 6604ed9 commit cae8ff8
Showing 20 changed files with 1,371 additions and 771 deletions.
57 changes: 45 additions & 12 deletions README.md
@@ -184,6 +184,20 @@ You can do MindSpore Lite inference in MindOCR using **MindOCR models** or **Thi

</details>

<details open markdown>
<summary>Layout Analysis</summary>

- [x] [YOLOv8](configs/layout/yolov8/README.md) ([Ultralytics Inc.](https://github.com/ultralytics/ultralytics))

</details>

<details open markdown>
<summary>Key Information Extraction</summary>

- [x] [LayoutXLM SER](configs/kie/vi_layoutxlm/README_CN.md) (arXiv'2021)

</details>

For the detailed performance of the trained models, please refer to [configs](./configs).

For details on MindSpore Lite and ACL inference model support, please refer to [MindOCR Models Support List](docs/en/inference/inference_quickstart.md) and [Third-party Models Support List](docs/en/inference/inference_thirdparty_quickstart.md) (PaddleOCR, MMOCR, etc.).
@@ -219,6 +233,20 @@ MindOCR provides a [dataset conversion tool](tools/dataset_converters) to OCR da

</details>

<details close markdown>
<summary>Layout Analysis Datasets</summary>

- [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) [[paper](https://arxiv.org/abs/1908.07836)] [[download](https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz)]

</details>

<details close markdown>
<summary>Key Information Extraction Datasets</summary>

- [XFUND](https://github.com/doc-analysis/XFUND) [[paper](https://aclanthology.org/2022.findings-acl.253/)] [[download](https://github.com/doc-analysis/XFUND/releases/tag/v1.0)]

</details>

We will include more datasets for training and evaluation. This list will be continuously updated.

## Frequently Asked Questions
@@ -230,19 +258,24 @@ Frequently asked questions about configuring environment and mindocr, please ref

<details close markdown>
<summary>News</summary>
- 2023/12/05
1. Add new trained models
- [YOLOv8 nano]()
- [VI-LayoutXLM](configs/kie/vi_layoutxlm/README_CN.md) for key information extraction
- [PP-OCRv3](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv3_introduction.md)
- [PP-OCRv3 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv3_det_cml.yaml) for text detection
- [PP-OCRv3 SVTR](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv3_rec_distillation.yml) for text recognition
2. Add new offline inference models
- [YOLOv8 nano]() for table recognition, inference on Ascend310
- [PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv4_introduction.md) inference on Ascend310
- [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) for text detection
- [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml) for text recognition

- 2023/12/14
1. Add new trained models
- [LayoutXLM SER](configs/kie/vi_layoutxlm) for key information extraction
- [VI-LayoutXLM SER](configs/kie/layoutlm_series) for key information extraction
- [PP-OCRv3 DBNet](configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml) for text detection and [PP-OCRv3 SVTR](configs/rec/svtr/svtr_ppocrv3_ch.yaml) for text recognition, supporting online inference and fine-tuning
2. Add more benchmark datasets and their results
- [XFUND](configs/kie/vi_layoutxlm/README_CN.md)
3. Support multiple specifications on Ascend 910: DBNet ResNet-50, DBNet++ ResNet-50, CRNN VGG7, SVTR-Tiny, FCENet, ABINet
- 2023/11/28
1. Add offline inference support for PP-OCRv4
- [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) for text detection and [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml) for text recognition, supporting offline inference
2. Fix bugs in offline inference for third-party models
- 2023/11/17
1. Add new trained models
- [YOLOv8](configs/layout/yolov8) for layout analysis
2. Add more benchmark datasets and their results
- [PubLayNet](configs/layout/yolov8/README_CN.md)
- 2023/07/06
1. Add new trained models
- [RobustScanner](configs/rec/robustscanner) for text recognition
57 changes: 45 additions & 12 deletions README_CN.md
@@ -2,7 +2,7 @@

# MindOCR

[![CI](https://github.com/mindspore-lab/mindocr/actions/workflow/ci.yml/badge.svg)](https://github.com/mindspore-lab/mindocr/actions/workflows/ci.yml)
[![CI](https://github.com/mindspore-lab/mindocr/actions/workflow/ci.yml/badge.svg)](https://github.com/mindspore-lab/mindocr/actions/workflow/ci.yml)
[![license](https://img.shields.io/github/license/mindspore-lab/mindocr.svg)](https://github.com/mindspore-lab/mindocr/blob/main/LICENSE)
[![open issues](https://img.shields.io/github/issues/mindspore-lab/mindocr)](https://github.com/mindspore-lab/mindocr/issues)
[![PRs](https://img.shields.io/badge/PRs-welcome-pink.svg)](https://github.com/mindspore-lab/mindocr/pulls)
@@ -183,6 +183,20 @@ python tools/infer/text/predict_system.py --image_dir {path_to_img or dir_to_img
- [x] [ABINet](configs/rec/abinet/README_CN.md) (CVPR'2021)
</details>

<details open markdown>
<summary>Layout Analysis</summary>

- [x] [YOLOv8](configs/layout/yolov8/README_CN.md) ([Ultralytics Inc.](https://github.com/ultralytics/ultralytics))
</details>

<details open markdown>
<summary>Key Information Extraction</summary>

- [x] [LayoutXLM SER](configs/kie/vi_layoutxlm/README_CN.md) (arXiv'2021)

</details>


For the training methods and results of the models above, see the README in each model's subdirectory under [configs](./configs).

For the support lists for [MindSpore Lite](https://www.mindspore.cn/lite) and [ACL](https://www.hiascend.com/document/detail/zh/canncommercial/63RC1/inferapplicationdev/aclcppdevg/aclcppdevg_000004.html) model inference,
@@ -220,6 +234,20 @@ MindOCR provides a [data format conversion tool](tools/dataset_converters) to support

</details>

<details close markdown>
<summary>Layout Analysis Datasets</summary>

- [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) [[paper](https://arxiv.org/abs/1908.07836)] [[download](https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz)]

</details>

<details close markdown>
<summary>Key Information Extraction Datasets</summary>

- [XFUND](https://github.com/doc-analysis/XFUND) [[paper](https://aclanthology.org/2022.findings-acl.253/)] [[download](https://github.com/doc-analysis/XFUND/releases/tag/v1.0)]

</details>

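We will train and validate models on more datasets. This list will be continuously updated.

## Frequently Asked Questions
@@ -231,18 +259,23 @@ MindOCR provides a [data format conversion tool](tools/dataset_converters) to support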
<details close markdown>
<summary>Details</summary>

- 2023/12/05
- 2023/12/14
1. Add new models
- Document layout recognition: [YOLOv8 nano]()
- Key information extraction: [VI-LayoutXLM](configs/kie/vi_layoutxlm/README_CN.md), online training and inference
- [PP-OCRv3](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv3_introduction.md): third-party model training and inference
- Text detection: [PP-OCRv3 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv3_det_cml.yaml)
- Text recognition: [PP-OCRv3 SVTR](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv3_rec_distillation.yml)
2. Offline inference
- Document layout recognition: [YOLOv8 nano](), inference on Ascend 310
- [PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/PP-OCRv4_introduction.md): third-party model inference on Ascend 310
- Text detection: [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml)
- Text recognition: [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml)
- Key information extraction: [LayoutXLM SER](configs/kie/vi_layoutxlm)
- Key information extraction: [VI-LayoutXLM SER](configs/kie/layoutlm_series)
- Text detection [PP-OCRv3 DBNet](configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml) and text recognition [PP-OCRv3 SVTR](configs/rec/svtr/svtr_ppocrv3_ch.yaml), supporting online inference and fine-tuning
2. Add more benchmark datasets and their results
- [XFUND](configs/kie/vi_layoutxlm/README_CN.md)
3. Support multiple specifications on Ascend 910 hardware: DBNet ResNet-50, DBNet++ ResNet-50, CRNN VGG7, SVTR-Tiny, FCENet, ABINet
- 2023/11/28
1. Add offline inference support for PP-OCRv4 models
- Text detection [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) and text recognition [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml), supporting offline inference
2. Fix bugs in offline inference for third-party models
- 2023/11/17
1. Add new models
- Layout analysis: [YOLOv8](configs/layout/yolov8)
2. Add more benchmark datasets and their results
- [PubLayNet](configs/layout/yolov8/README_CN.md)
- 2023/07/06
1. Add new models
- Text recognition: [RobustScanner](configs/rec/robustscanner)