Jaccount Anti-Captcha

Environment: Python 3.6, VS2017

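  1. This captcha recognizer targets the Jaccount login captcha (https://jaccount.sjtu.edu.cn/jaccount/captcha, which generates a random captcha on every request). It may not work on other captcha samples.
  2. It is designed for characters that do not touch each other. Touching characters may reduce recognition accuracy.
  3. The test directory is for testing. Run python run.py to see the result: each run fetches a fresh captcha image from the site, generates the image data file, and automatically calls Network.exe to recognize it. Press any key to fetch and classify the next captcha. save.txt holds the trained parameters; do not delete it.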

Technical Notes

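  • get.py collects training samples, using pytesseract as an OCR aid. It fetches one captcha and shows the OCR result. If the result is correct, press Space and the image is saved in the src directory as a training sample, with the captcha text as its filename. If it is wrong, press any other key and the image is discarded. For simplicity the cached file is not deleted automatically, so delete src/captcha.jpg manually when you finish collecting.
  • classify.py extracts data from the training samples. It first converts the image to grayscale, boosts contrast, enhances the image, and binarizes it, then segments the characters. The segmentation rule: scanning left to right, the first character spans from the first column that contains black to the next column that contains none, and so on for the remaining characters. Within each character, the region runs from the first row containing black (scanning top down) to the first row containing black (scanning bottom up). Each cropped character is then moved to the top-left corner of a pure white 15*20 canvas. After segmentation, the character images are written to data.txt, one per line, in the format character + image data (0/1), 301 characters per line.
  • The Network folder is a C++ project implementing a 3-layer BP (backpropagation) network. data.txt must be copied into the exe directory manually. With the TRAINING macro enabled, the program runs in training mode: adjust the parameters, start training, and after a single training run the result is saved to save.txt. With the GOON macro enabled, it reads save.txt and continues training (the parameters must be kept consistent by hand). With the TEST macro enabled, it runs in test mode: it reads test.txt (each line holds 300 characters of image data) and outputs the result. You can print value manually to inspect the output data.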
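
The segmentation rule and the data.txt line format described for classify.py can be sketched roughly as follows. This is a minimal illustration, not the code from classify.py: it assumes the image has already been binarized into a list of rows, and the function names (split_columns, crop_rows, to_canvas, encode_line, extract) are hypothetical. It also assumes that 1 stands for black and 0 for white, that pixels are written row by row, and that glyphs larger than the canvas are clipped. The README does not state any of these details.

```python
from typing import List, Tuple

WIDTH, HEIGHT = 15, 20  # canvas size given in the README (15*20 = 300 pixels)

Bitmap = List[List[int]]  # assumption: 1 = black pixel, 0 = white pixel


def split_columns(img: Bitmap) -> List[Tuple[int, int]]:
    """Return one (start, end) column range per character, end exclusive."""
    n_cols = len(img[0]) if img else 0
    ranges, start = [], None
    for x in range(n_cols):
        has_black = any(row[x] for row in img)
        if has_black and start is None:
            start = x                      # first column containing black
        elif not has_black and start is not None:
            ranges.append((start, x))      # first blank column after it
            start = None
    if start is not None:                  # character touching the right edge
        ranges.append((start, n_cols))
    return ranges


def crop_rows(img: Bitmap, x0: int, x1: int) -> Bitmap:
    """Cut columns x0..x1 and trim the blank rows above and below the glyph."""
    glyph = [row[x0:x1] for row in img]
    rows = [y for y, row in enumerate(glyph) if any(row)]
    return glyph[rows[0]:rows[-1] + 1]


def to_canvas(glyph: Bitmap) -> Bitmap:
    """Place the glyph in the top-left corner of a white WIDTH x HEIGHT canvas."""
    canvas = [[0] * WIDTH for _ in range(HEIGHT)]
    for y, row in enumerate(glyph[:HEIGHT]):       # assumption: clip if too tall
        for x, value in enumerate(row[:WIDTH]):    # assumption: clip if too wide
            canvas[y][x] = value
    return canvas


def encode_line(label: str, canvas: Bitmap) -> str:
    """One data.txt line: label character followed by 300 pixel digits."""
    return label + "".join(str(v) for row in canvas for v in row)


def extract(img: Bitmap, labels: str) -> List[str]:
    """Segment a binarized captcha and encode each character with its label."""
    return [
        encode_line(label, to_canvas(crop_rows(img, x0, x1)))
        for label, (x0, x1) in zip(labels, split_columns(img))
    ]
```

For a training sample, labels would be the captcha text taken from the filename, and the returned lines would be appended to data.txt. Dropping the leading label character gives the 300-character form that test.txt uses.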
