Using Tesseract to Recognize CAPTCHAs

j4s0nh4ck

Blog categories: kali, web

1. Install Tesseract:

apt-get install tesseract-ocr
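Before moving on, it can help to confirm that the binary is actually reachable. The following is only a small sketch; find_tesseract is a hypothetical helper name, and it assumes Python 3.3 or later for shutil.which.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    # "find_tesseract" is a hypothetical helper, not part of any library.
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    print(path if path else "tesseract not found; install it first")
```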

2. Preprocess the image. Code snippet:

from PIL import Image


def crack(cap_name):
    # Open <cap_name>.JPEG and work on plain RGB pixels.
    img = Image.open(cap_name + '.JPEG')
    img = img.convert("RGB")
    pixdata = img.load()
    width, height = img.size

    # Pass 1: pixels with a low red value become black.
    for y in range(height):
        for x in range(width):
            if pixdata[x, y][0] < 90:
                pixdata[x, y] = (0, 0, 0)
    # Pass 2: pixels with a low green value become black.
    for y in range(height):
        for x in range(width):
            if pixdata[x, y][1] < 136:
                pixdata[x, y] = (0, 0, 0)
    # Pass 3: any pixel that still has blue in it becomes white,
    # which leaves black text on a white background.
    for y in range(height):
        for x in range(width):
            if pixdata[x, y][2] > 0:
                pixdata[x, y] = (255, 255, 255)

    # Save the cleaned image as TIFF for tesseract.
    ext = ".tif"
    img.save(cap_name + ext)
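The three passes amount to a single rule per pixel, which is easier to test and tune in isolation. The sketch below restates that rule as a pure function; classify_pixel and binarize are hypothetical names, and the thresholds 90 and 136 are simply the ones used in the snippet above, so they will likely need adjusting for other CAPTCHAs.

```python
def classify_pixel(pixel, red_max=90, green_max=136):
    """Apply the three passes to one (R, G, B) tuple in a single step.

    Hypothetical helper: an equivalent restatement of the loops above,
    not code from the original post.
    """
    r, g, b = pixel[:3]
    # Passes 1 and 2: dark red or dark green channels mark text, so go black.
    if r < red_max or g < green_max:
        return (0, 0, 0)
    # Pass 3: anything left with blue in it is background, so go white.
    if b > 0:
        return (255, 255, 255)
    # Bright red and green with zero blue is never touched by the passes.
    return (r, g, b)


def binarize(rows):
    """Apply classify_pixel to a nested list of pixel tuples."""
    return [[classify_pixel(p) for p in row] for row in rows]


if __name__ == "__main__":
    sample = [[(40, 200, 200), (220, 220, 220)],
              [(200, 100, 50), (250, 250, 0)]]
    for row in binarize(sample):
        print(row)
```

With Pillow, the same rule could be applied to the pixdata object inside one pair of loops instead of three.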

3. Recognize the image with the tesseract command:

tesseract imagename outbase [-l lang] [-psm N] [configfile ...]

-psm N selects the page segmentation mode:

0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
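For scripted use, the command line above can be assembled and run from Python. This is a minimal sketch, assuming tesseract is on PATH and writes its result to outbase.txt; build_tesseract_cmd and recognize are hypothetical names. For a short CAPTCHA, mode 7 (single text line) or 8 (single word) is a reasonable starting guess. Newer Tesseract releases spell the option --psm, so the flag is left as a parameter.

```python
import subprocess


def build_tesseract_cmd(image, outbase, lang=None, psm=None,
                        configs=(), psm_flag="-psm"):
    """Build: tesseract imagename outbase [-l lang] [-psm N] [configfile ...]"""
    cmd = ["tesseract", image, outbase]
    if lang:
        cmd += ["-l", lang]
    if psm is not None:
        # psm_flag is "-psm" as in the usage line; pass "--psm" for newer releases.
        cmd += [psm_flag, str(psm)]
    cmd += list(configs)
    return cmd


def recognize(image, outbase, **options):
    """Run tesseract and return the recognized text (hypothetical helper)."""
    subprocess.check_call(build_tesseract_cmd(image, outbase, **options))
    # tesseract appends ".txt" to the output base name.
    with open(outbase + ".txt") as f:
        return f.read().strip()


if __name__ == "__main__":
    print(" ".join(build_tesseract_cmd("captcha.tif", "out", lang="eng", psm=7)))
```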

4. Restrict the characters Tesseract searches for
1) Create a new config file in the tessdata/configs folder.
2) Add the following to the config file:

tessedit_char_whitelist abcdefghijklmnopqrstuvwxyz

3) Invoke the tesseract command with the newly created config file.
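The config file from steps 1) and 2) can also be generated by a script. The sketch below is only illustrative: write_whitelist_config and the config name "letters" are hypothetical, and the location of the tessdata/configs folder depends on the installation, so it is passed in rather than guessed.

```python
import os
import string


def write_whitelist_config(configs_dir, name, chars=string.ascii_lowercase):
    """Write a config file containing a single tessedit_char_whitelist line."""
    path = os.path.join(configs_dir, name)
    with open(path, "w") as f:
        f.write("tessedit_char_whitelist " + chars + "\n")
    return path


if __name__ == "__main__":
    # "configs" here is a stand-in directory; point it at the real
    # tessdata/configs folder of the local installation.
    os.makedirs("configs", exist_ok=True)
    print(write_whitelist_config("configs", "letters"))
```

The config is then named at the end of the command line, for example: tesseract captcha.tif out -psm 7 letters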

5. Train Tesseract to improve its recognition
Reference articles:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract2
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

2014-12-10 00:48