references #1
ocropus: https://github.com/jbest/typeface-corpus
Is there support for non-Latin languages like Chinese, Japanese, or Thai? Here is a report about training clstm to recognize Japanese:
Ocropus fork with sane defaults: kraken is a fork of ocropus intended to rectify a number of issues while preserving (mostly) functional equivalence. Its main goals are:
Ticked-off goals have been realized, while some others still require further work. Pull requests and code contributions are always welcome.

Recognition models for kraken and CLSTM: https://github.com/mittagessen/kraken-models
kraken-models: this repository contains recognition models for kraken, both legacy pyrnn (converted to pronn) and clstm ones. To have one or more models added, open a pull request or send an email to [email protected].

https://github.com/kendemu/char-rnn-chinese

OpenPhilology (https://github.com/OpenPhilology) repositories:
- An expandable and scalable OCR pipeline
- TEI customization for OCR-generated layout and content information
- Investigating text reuse in the Patrologia
- forked from ryanfb/ancientgreekocr-ocr-evaluation-tools: 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy.
- The OCR pipeline to succeed Rigaudon
- eLearning for historical languages
- For now we're just using the wiki to discuss future steps
- OPP work iterating PerseusDL
- forked from fbaumgardt/hocrinfoaggregator: HocrInfoAggregator
- forked from GreekOCR/OpenGreekAndLatin: based on Rigaudon, hOCRInfoAggregator and CoPhi Proofreader
- forked from brobertson/rigaudon: polytonic Greek OCR engine derived from Gamera and based on the work of Dalitz and Brandt
- Proof-reading system for OCR applied to Greek and Latin texts

https://github.com/jknollmeyer/whiteboard
A Node.js wrapper for ocropus
https://github.com/Totkichi/SciOCR
Latest papers on arXiv during 2015-2016 titled "OCR"

A curated list of deep-learning object-detection resources (link), covering RNN, MultiBox, SPP-Net, DeepID-Net, Fast R-CNN, DeepBox, MR-CNN, Faster R-CNN, YOLO, DenseBox, SSD, Inside-Outside Net and G-CNN
A former colleague's earlier job was writing a program for a hookup site that simulated women chatting with male users, scored by the number of chat exchanges. Hard-coded, it passed the hookup-site Turing test almost perfectly. //@編程菜菜: //@UB_吴斌: Don't let your customers know the chat partner is a computer; the consequences would be unthinkable.
Scene-text recognition: https://github.com/tongpi/basicOCR
Menus and receipts: "Applying OCR Technology for Receipt Recognition" by Ivan Ozhiganov, PDF: http://t.cn/Rqqsban http://t.cn/RqqFY4X
CAPTCHAs: https://zhuanlan.zhihu.com/p/21344595?f3fb8ead20=357481ecd0939762f4f9dcc75015e93a
License-plate recognition (extensible to other kinds of numbers)
Printed-text resumes: https://github.com/Halfish/cvOCR
Invoices and receipts: https://github.com/xuwenxue000/PJ_DARKNET
Preprocessing/binarization: https://github.com/zp-j/binarizewolfjolion
Text localization with OpenCV: http://stackoverflow.com/questions/23506105/extracting-text-opencv
Below is the code, written in Python with OpenCV; it should easily be ported to C++. The parameter values are illustrative and match the discussion that follows.

```python
import cv2

image = cv2.imread("card.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# step 2: black threshold (for light text on dark background, see below)
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)

# step 3: dilate so characters merge into text blocks
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
dilated = cv2.dilate(thresh, kernel, iterations=13)

# find contours of the dilated blobs (OpenCV 4 signature)
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# for each contour found, draw a rectangle around it on the original image
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)

# write the original image with added contours to disk
cv2.imwrite("contoured.jpg", image)
```
The last line writes the original image with added contours to disk as "contoured.jpg". The original image is the first image in your post. After preprocessing (grayscale, threshold and dilate, i.e. after step 3), the image shows the dilated text blobs. The resulting "contoured.jpg" shows the final bounding boxes for the objects in the image: you can see the text block on the left is detected as a separate block, delimited from its surroundings. Using the same script with the same parameters (except for the thresholding type, which was changed for the second image as described below) gives comparable results for the other two cards.

The parameters (threshold value, dilation parameters) were optimized for this image and this task (finding text blocks) and can be adjusted, if needed, for other card images or other types of objects to be found. For thresholding (step 2), I used a black threshold. For images where the text is lighter than the background, such as the second image in your post, a white threshold should be used, so replace the thresholding type with cv2.THRESH_BINARY. For the second image I also used a slightly higher threshold value (180). Varying the threshold value and the number of dilation iterations results in different degrees of sensitivity in delimiting objects in the image.

Finding other object types: for example, decreasing the dilation to 5 iterations in the first image gives a finer delimitation of objects, roughly finding all words in the image (rather than text blocks). Knowing the rough size of a word, I discarded areas that were too small (below 20 pixels in width or height) or too large (above 100 pixels in width or height), to ignore objects that are unlikely to be words.
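The word-size filter described above can be sketched in a few lines of library-free Python. The 20/100-pixel bounds come from the answer; the function name is illustrative:

```python
def plausible_word_boxes(boxes, min_side=20, max_side=100):
    """Keep bounding boxes (x, y, w, h) whose width and height fall in a
    plausible word-size range; everything smaller or larger is discarded
    as unlikely to be a word."""
    return [(x, y, w, h) for (x, y, w, h) in boxes
            if min_side <= w <= max_side and min_side <= h <= max_side]
```

The boxes would come from cv2.boundingRect over the contours found with the lighter dilation.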
https://github.com/danvk/oldnyc/blob/master/ocr/tess/crop_morphology.py
'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. https://github.com/ryanfb/ancientgreekocr-ocr-evaluation-tools
Reference its interfaces and libraries for the design of a template-editing tool
OCR output format: hOCR, and visualization of it
Links to awesome OCR projects: https://github.com/kba/awesome-ocr
WebAppFind OCR demo - applies Ocrad.js or GOCR.js to a PDF file opened via right-click from the desktop (the Firefox add-on is currently Windows-only; ports welcome!)
https://github.com/Shreeshrii/imagessan https://github.com/Shreeshrii/ocr-evaluation-tools
Optical character recognition of old and noisy print sources: https://github.com/digiah/oldOCR
https://github.com/jflesch/pyocr
Handwritten characters
https://github.com/mateogianolio/ocr
https://github.com/Kidel/In-Codice-Ratio-OCR-with-CNN In Codice Ratio (ICR) is a project curated by Roma Tre University in collaboration with the Vatican Secret Archives, with the purpose of digitizing the contents of documents and ancient texts from the Archive. The problem we faced in this repository was just a part of ICR, basically its core: classifying handwritten characters in Carolingian minuscule starting from an image of the character. The input is an ensemble of possible cuts of the word to be read, and our system has to decide whether a cut is correct and, if it is, which character it shows.
https://github.com/ruiwen905/MLTensorFlow Use of Google's open-source artificial intelligence API. Develop OCR and supervised-learning applications using TensorFlow, Scikit and Graphviz.
https://github.com/Shreeshrii/tess4tutorial https://github.com/Shreeshrii/tess4eval_deva (refer to these for installation and testing)
OCR evaluation brought to you by the University of Alicante: https://github.com/impactcentre/ocrevalUAtion/wiki
Glyph Miner, a system for extracting glyphs from early typeset prints: https://github.com/benedikt-budig/glyph-miner
📎 Using scanners and OCR to grep paper documents the easy way (Linux/Windows): https://openpaper.work/ https://github.com/jflesch/paperwork
The tess4 training process
A brand logo recognition system using deep convolutional neural networks. |
[Building a modern OCR pipeline with computer vision and deep learning] "Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning | Dropbox Tech Blog" by Brad Neuberg
We ended up using a classic computer vision approach named Maximally Stable Extremal Regions (MSERs).
Test page: http://www.onlineocr.net/. http://www.ocrwebservice.com/api/restguide: this site provides OCR APIs (SOAP and REST) with support for Chinese.
https://github.com/fierceX/cnn_ocr_mnist |
https://github.com/psoder3/OCRPractice An attempt at Optical Character Recognition without being tainted by knowledge of existing implementations
https://github.com/danielquinn/paperless |
Apache Tika bridge for Node.js. Text and metadata extraction, language detection and more. |
https://github.com/Muhimbi/PDF-Converter-Services-Online
OCR scripts for digitized NYC city directories: https://github.com/nypl-spacetime/ocr-scripts
Optical character recognition ANN for an AI class: https://github.com/StateFromJakeFarm/OCRANN
https://github.com/LanguageMachines/PICCL
https://github.com/CatWang/OCR-Picture-Generators/ A simple project for generating cropped images of characters, in Chinese or English; backgrounds are supported, and simulated medical bills are also included.
https://github.com/CatWang/Synthesize_text_generation_Python A fairly complex Python project for generating realistic scene text. The original project could only generate English; after modification it can also generate Chinese, and code was added for cutting the text out of the images and saving the corresponding labels.
https://github.com/Gr1f0n6x/OCR_NN Python, Keras, OpenCV
Passports and cards
https://github.com/JarveeLee/SynthText_Chinese_version This should be able to synthesize training data resembling film OCR. "Synthetic Data for Text Localisation in Natural Images", A. Gupta, A. Vedaldi, A. Zisserman [University of Oxford] (CVPR 2016)
Printed scientific documents
Setting Up a Simple OCR Server |
https://github.com/PedroBarcha/Context-Spelling-Correction Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the suggestion for the original phrase. The software was originally developed for correcting OCR output.
Screen capture
Image Processing Worms Assignment Report
This image is a good start, but two flaws are that the worm on the far left is a similar shade to the background and that the background is not all one colour. As a result, I decided to apply various threshold-based segmentations, implementing different segmentation methods depending on which thresholding method I was using.
i) Simple binary thresholding: I found a threshold value of 54 gave the best results. I then inverted the image so I could use morphological transformations effectively, i.e. all worms are now white on a black background.
ii) Adaptive mean thresholding: I repeated this process with adaptive thresholding in place of binary thresholding. Adaptive thresholding gives a much better output because the algorithm calculates a threshold for each small region of the image. This gave much better results in terms of the quality of the worms, but also left a border. I found a block size of 33 and a constant of 10 gave the best results for worm quality.
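A library-free sketch of adaptive mean thresholding with the report's parameters (block size 33, constant 10), using a padded integral image so every local mean costs O(1):

```python
import numpy as np

def adaptive_mean_threshold(gray, block=33, c=10):
    """Adaptive mean thresholding: each pixel is compared against the
    mean of its block x block neighbourhood minus a constant c
    (block=33, c=10 taken from the report). Edge-padded so the window
    is defined everywhere."""
    pad = block // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # integral image with a leading zero row/column
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    # window sums via four integral-image lookups per pixel, vectorized
    s = (ii[block:block + h, block:block + w] - ii[:h, block:block + w]
         - ii[block:block + h, :w] + ii[:h, :w])
    mean = s / (block * block)
    return np.where(gray > mean - c, 255, 0).astype(np.uint8)
```

OpenCV's cv2.adaptiveThreshold with ADAPTIVE_THRESH_MEAN_C computes the same quantity; this version just makes the arithmetic explicit.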
I then compared both methods (i) and (ii) to the ground-truth data by taking the difference between the two images. (Figures: segmentation method 1 comparison; segmentation method 2 comparison.) To get a better image to compare against the ground truth, I read in only the 'w2' band image and started with a power-law transform, using a gamma of 1.3 to brighten the image.
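The power-law step can be sketched as follows; the gamma value is the report's, and the normalize-then-rescale form is one common convention:

```python
import numpy as np

def power_law(gray, gamma=1.3):
    """Power-law (gamma) transform: normalize to [0, 1], raise to
    gamma, rescale to 8-bit. Gamma of 1.3 taken from the report."""
    norm = gray.astype(np.float64) / 255.0
    return np.clip(255.0 * norm ** gamma, 0, 255).astype(np.uint8)
```

Black and white are fixed points of the transform; only the mid-tones move, in a direction that depends on whether gamma is above or below 1.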
This gave a clear white frame with well-defined edges and an image with well-defined worms, which I then added together. I inverted the result before applying various morphological transforms, this time with a kernel of size 2, yielding very good segmentation compared to the ground truth. I then found the contours of the image and looped through them; for each contour with an area greater than 250 and less than 10000, I plotted a minimum-area bounding rectangle, then drew the contours on the image. The system counts 11 worms, a good level of accuracy, and the individual worms written to file match the ground truth closely, clearly showing the system performing the specific task of separating individual worms. (Figures: straight worms classified; worm written to file; individual worm ground truth. Sections on the watershed algorithm and sources follow.)
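The contour-filtering step reduces to a simple area gate (bounds from the report; the areas themselves would come from e.g. cv2.contourArea, but the gate needs no library):

```python
def worm_contour_indices(areas, lo=250, hi=10000):
    """Indices of contours whose area lies strictly between lo and hi,
    discarding specks below and merged blobs or the frame above."""
    return [i for i, a in enumerate(areas) if lo < a < hi]
```

The surviving contours are the ones that get a minimum-area bounding rectangle drawn around them.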
https://github.com/Transkribus?page=1 |
I'm probably missing other reasons as well, but this is just off the top of my head. That being said, I'm sure that there's going to be a lot of progress in the OCR + deep learning space soon.
ABBYY has dominated the field for many years (decades, really) and still outperforms every solution out there. OmniPage by Nuance is probably the second best. There is a lot going on in an OCR engine: layout analysis, dewarping, binarization, deskewing, despeckling (and more), and then the OCR itself. With Tesseract you have to do a lot of this yourself; you have to provide it with a clean image. The commercial packages do that for you automatically. ABBYY and other solutions also use NLP to augment and check the OCR results from a semantic-analysis perspective. Also, there is no "one size fits all" OCR; it is highly specific to the nature of the application. Consider the following use cases:
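The point that a full engine runs many stages before recognition can be pictured as a simple stage chain. The stage functions here are placeholders for illustration, not any engine's real API:

```python
def run_ocr_pipeline(image, stages):
    """Apply each preprocessing stage in order (layout analysis,
    dewarping, binarization, deskewing, despeckling, ...); every
    stage maps an image to an image, so the chain composes freely."""
    for stage in stages:
        image = stage(image)
    return image
```

With Tesseract, the user supplies this chain; commercial engines bundle it.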
If anyone is doing work with OCR + deep learning, I'd love to discuss!
There is a new set of DL-based OCR tools at https://github.com/NVlabs/ocropus3 OCR systems have traditionally relied on huge data collection efforts and supervised training. I think we can only make significant progress in the long run if we change over to self-supervised training. In the pre-DL world, we had some approaches to that (with the original OCRopus), but carrying that over into the DL world will still require significant effort. |
thx for your excellent work @tmbdev |
https://github.com/dhlab-epfl/dhSegment, a generic deep-learning approach for document segmentation.
https://github.com/AstarLight/CPS-OCR-Engine Recognition of invoices and similar documents, built by a student at Sun Yat-sen University
Autonomous feedback-based preprocessing using classification likelihoods
https://github.com/DriesSmit/GeneralOCR This software finds text and structure in images. |