OpenCV:
For object detection, you are only trying to determine whether the object is in the frame and approximately where it is located. OpenCV's features framework works well for this.
Tesseract:
If your documents have a fixed structure (a consistent layout of text fields), then tesseract-ocr is all you need.
I found Tesseract easy to use for font-based OCR, while OpenCV is good for recognizing handwriting.
The following steps worked well for me:
1. Obtain a grayscale version of the image.
2. Perform Canny edge detection on the grayscale image.
3. Apply a Gaussian blur to the grayscale image (store the result in a separate matrix).
4. Input the matrices from steps 2 and 3 into the SWT (Stroke Width Transform) algorithm.
5. Binarize (threshold) the resulting image.
6. Feed the image to Tesseract.
Please note: for step 4 you will need to build the C++ library in the link and then import it into your Android project with JNI wrappers. You will also need to fine-tune every step to get the best results, but this should at least get you started.
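Steps 1 and 5 of the list above (grayscale conversion and binarization) can be sketched in plain Python on nested lists. This is only an illustration under stated assumptions: the function names are hypothetical, and Otsu's method is one common way to pick the threshold, not necessarily the one used in the original steps. Steps 2 to 4 and step 6 are left out because they depend on OpenCV, the SWT library, and Tesseract.

```python
def to_grayscale(rgb_rows):
    """Step 1: turn rows of (r, g, b) tuples into rows of 0-255 gray levels."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


def otsu_threshold(gray_rows):
    """Pick the gray level that maximizes between-class variance (Otsu's method).

    Assumption: Otsu is swapped in here as an automatic threshold choice.
    """
    histogram = [0] * 256
    for row in gray_rows:
        for value in row:
            histogram[value] += 1

    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    background_count = 0
    background_sum = 0
    for level in range(256):
        # Pixels with value <= level form the background class.
        background_count += histogram[level]
        background_sum += level * histogram[level]
        if background_count == 0:
            continue
        foreground_count = total - background_count
        if foreground_count == 0:
            break
        mean_bg = background_sum / background_count
        mean_fg = (weighted_total - background_sum) / foreground_count
        variance = background_count * foreground_count * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold


def binarize(gray_rows, threshold):
    """Step 5: map every pixel above the threshold to 255, the rest to 0."""
    return [[255 if value > threshold else 0 for value in row] for row in gray_rows]


# Tiny 2x2 example image: two dark pixels and two light pixels.
pixels = [[(10, 10, 10), (200, 200, 200)],
          [(200, 200, 200), (10, 10, 10)]]
gray = to_grayscale(pixels)
binary = binarize(gray, otsu_threshold(gray))
```

In a real pipeline the binarized image would be handed to Tesseract. A fixed threshold can replace `otsu_threshold` when the lighting is known to be uniform.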
Some discussion of OpenCV vs. Tesseract for OCR:
http://stackoverflow.com/questions/11489824/how-do-i-choose-between-tesseract-and-opencv
Image processing to improve Tesseract OCR accuracy:
1. Fix the DPI if needed (300 DPI is the minimum).
2. Fix the text size (e.g. 12 pt should be OK).
3. Try to fix the text lines (deskew and dewarp the text).
4. Try to fix the illumination of the image (e.g. no dark parts in the image).
5. Binarize and de-noise the image.
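As a small illustration of item 1, the arithmetic for bringing a low-resolution scan up to the 300 DPI minimum can be sketched as follows. The function name and parameters are hypothetical, and the actual resampling is left to whatever image library is in use.

```python
def scale_to_min_dpi(width_px, height_px, current_dpi, min_dpi=300):
    """Return (new_width, new_height, factor) so the image reaches min_dpi.

    Images that already meet the minimum are returned unchanged.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    if current_dpi >= min_dpi:
        return width_px, height_px, 1.0
    factor = min_dpi / current_dpi
    return round(width_px * factor), round(height_px * factor), factor


# An 800x600 scan at 150 DPI has to be doubled in each dimension.
new_width, new_height, factor = scale_to_min_dpi(800, 600, 150)
```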
And some advice:
Three points to improve the readability of the image: 1) Resize the image to several sizes (multiply the image height and width by 0.5, 1, and 2). 2) Convert the image to grayscale. 3) Remove the noise pixels to make the image clearer (filter the image).
// Resize
// Requires `using System;` and `using System.Drawing;`. The methods below belong inside a class.
// Scales bmp to newWidth x newHeight with bilinear interpolation, then
// converts the result to grayscale and thresholds it.
public Bitmap Resize(Bitmap bmp, int newWidth, int newHeight)
{
    // Note: SetPixel does not support indexed pixel formats.
    Bitmap bmap = new Bitmap(newWidth, newHeight, bmp.PixelFormat);
    // Number of source pixels covered by one destination pixel on each axis.
    double nWidthFactor = (double)bmp.Width / newWidth;
    double nHeightFactor = (double)bmp.Height / newHeight;
    for (int x = 0; x < bmap.Width; ++x)
    {
        for (int y = 0; y < bmap.Height; ++y)
        {
            // Top-left source pixel of the 2x2 neighborhood.
            int fr_x = (int)Math.Floor(x * nWidthFactor);
            int fr_y = (int)Math.Floor(y * nHeightFactor);
            // Right and bottom neighbors, clamped at the image border.
            int cx = fr_x + 1;
            if (cx >= bmp.Width) cx = fr_x;
            int cy = fr_y + 1;
            if (cy >= bmp.Height) cy = fr_y;
            // Fractional offsets and their complements are the blend weights.
            double fx = x * nWidthFactor - fr_x;
            double fy = y * nHeightFactor - fr_y;
            double nx = 1.0 - fx;
            double ny = 1.0 - fy;
            Color color1 = bmp.GetPixel(fr_x, fr_y);
            Color color2 = bmp.GetPixel(cx, fr_y);
            Color color3 = bmp.GetPixel(fr_x, cy);
            Color color4 = bmp.GetPixel(cx, cy);
            // Blue: blend the top row, the bottom row, then the two results.
            byte bp1 = (byte)(nx * color1.B + fx * color2.B);
            byte bp2 = (byte)(nx * color3.B + fx * color4.B);
            byte nBlue = (byte)(ny * (double)bp1 + fy * (double)bp2);
            // Green
            bp1 = (byte)(nx * color1.G + fx * color2.G);
            bp2 = (byte)(nx * color3.G + fx * color4.G);
            byte nGreen = (byte)(ny * (double)bp1 + fy * (double)bp2);
            // Red
            bp1 = (byte)(nx * color1.R + fx * color2.R);
            bp2 = (byte)(nx * color3.R + fx * color4.R);
            byte nRed = (byte)(ny * (double)bp1 + fy * (double)bp2);
            bmap.SetPixel(x, y, Color.FromArgb(255, nRed, nGreen, nBlue));
        }
    }
    // Post-process the resized image before handing it to OCR.
    bmap = SetGrayscale(bmap);
    bmap = RemoveNoise(bmap);
    return bmap;
}
// SetGrayscale
// Returns a grayscale copy of img using the luma weights 0.299 R + 0.587 G + 0.114 B.
public Bitmap SetGrayscale(Bitmap img)
{
    Bitmap bmap = (Bitmap)img.Clone();
    for (int i = 0; i < bmap.Width; i++)
    {
        for (int j = 0; j < bmap.Height; j++)
        {
            Color c = bmap.GetPixel(i, j);
            byte gray = (byte)(.299 * c.R + .587 * c.G + .114 * c.B);
            bmap.SetPixel(i, j, Color.FromArgb(gray, gray, gray));
        }
    }
    return bmap;
}
// RemoveNoise
// Pushes dark pixels to black and light pixels to white, modifying bmap in place.
// Pixels with a channel equal to the threshold, or with channels on both sides of it, are left unchanged.
public Bitmap RemoveNoise(Bitmap bmap)
{
    const int threshold = 162;
    for (var x = 0; x < bmap.Width; x++)
    {
        for (var y = 0; y < bmap.Height; y++)
        {
            var pixel = bmap.GetPixel(x, y);
            if (pixel.R < threshold && pixel.G < threshold && pixel.B < threshold)
                bmap.SetPixel(x, y, Color.Black);
            else if (pixel.R > threshold && pixel.G > threshold && pixel.B > threshold)
                bmap.SetPixel(x, y, Color.White);
        }
    }
    return bmap;
}
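The interpolation that Resize performs is bilinear: each output pixel is a weighted blend of the four nearest source pixels, with weights taken from the fractional part of the source coordinate. A minimal single-channel sketch in Python, using nested lists instead of a Bitmap, may make the idea easier to follow. The function names are hypothetical and this is an illustration, not a port of the C# code.

```python
import math


def bilinear_sample(gray_rows, src_x, src_y):
    """Blend the four source pixels around the fractional point (src_x, src_y)."""
    height, width = len(gray_rows), len(gray_rows[0])
    x0, y0 = int(math.floor(src_x)), int(math.floor(src_y))
    # Right and bottom neighbors, clamped at the image border.
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = src_x - x0, src_y - y0
    # Blend horizontally along the top and bottom rows, then vertically.
    top = (1 - fx) * gray_rows[y0][x0] + fx * gray_rows[y0][x1]
    bottom = (1 - fx) * gray_rows[y1][x0] + fx * gray_rows[y1][x1]
    return (1 - fy) * top + fy * bottom


def resize_bilinear(gray_rows, new_width, new_height):
    """Resize a grayscale image given as rows of 0-255 values."""
    x_factor = len(gray_rows[0]) / new_width
    y_factor = len(gray_rows) / new_height
    return [
        [round(bilinear_sample(gray_rows, x * x_factor, y * y_factor))
         for x in range(new_width)]
        for y in range(new_height)
    ]


# Doubling a 2x2 gradient to 4x4 fills in the in-between gray levels.
small = [[0, 100],
         [100, 200]]
large = resize_bilinear(small, 4, 4)
```

For real images, a library resize routine is far faster than per-pixel loops like these (or like GetPixel/SetPixel in the C# version).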