New app - pyOcrHelper!
I just checked a new project into Google Code. I called it pyOcrHelper (because I couldn't think of anything else). Basically, it's a Python class that makes OCR software such as Tesseract or Ocropus easier to use: you don't have to think about converting your image or document into the format that Tesseract or Ocropus requires - pyOcrHelper takes care of this for you.
What it can do currently:
The first release provides the basic functionality that I needed: being able to OCR any image file and (importantly) also PDFs, since scanned documents are often sent as images embedded in a PDF. This works (kind of). It badly needs documentation and should probably also be packaged in the openSUSE Build Service. There are a couple of loose dependencies which could probably be dropped altogether.
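To give an idea of what "taking care of the conversion" means, here is a rough sketch. This is not the actual pyOcrHelper code. It assumes that the OCR engine wants TIFF input, that ImageMagick's convert is installed for the conversion, and that the tesseract command line tool is on the path. The function name plan_commands and its parameters are made up for illustration. It only builds the command lines and does not run anything.

```python
import os

# Assumption: the OCR engine wants TIFF input; adjust for your engine/version.
TARGET_EXT = ".tif"


def plan_commands(src, workdir="/tmp", lang="eng"):
    """Return the external commands (as argv lists) needed to OCR `src`.

    Nothing is executed here; each list could be handed to subprocess.run().
    """
    base = os.path.splitext(os.path.basename(src))[0]
    tif = os.path.join(workdir, base + TARGET_EXT)
    out_base = os.path.join(workdir, base)  # tesseract adds the .txt suffix itself

    commands = []
    ext = os.path.splitext(src)[1].lower()
    if ext == ".pdf":
        # Rasterise the PDF first; 300 dpi is a common choice for OCR input.
        commands.append(["convert", "-density", "300", src, tif])
        ocr_input = tif
    elif ext in (".tif", ".tiff"):
        # Already in the assumed target format, no conversion step needed.
        ocr_input = src
    else:
        # Any other image type: let ImageMagick convert it to TIFF.
        commands.append(["convert", src, tif])
        ocr_input = tif

    commands.append(["tesseract", ocr_input, out_base, "-l", lang])
    return commands
```

Running the planned commands is then a matter of looping over the list with subprocess.run(cmd, check=True). Multi-page PDFs and Ocropus would need more care than this sketch shows.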
Next steps:
The next steps are to tighten up the code (a lot), to make it readable, and to start raising worthwhile exceptions instead of having class methods bail out with sys.exit() after a sys.stderr.write(). I also want to do some work on output formats. Currently, Ocropus produces half-usable HTML, but this could easily be improved upon - and XML can't be that hard to output either. Apart from that, there are other things that I might consider, like taking the opportunity to get to grips with PyQt4 and KDE4/Plasma. I'm thinking of a nice Plasma desktop app where you can drop any file and have the OCR'd version jump back out at you...
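For the exceptions part, something along these lines is what I have in mind. It's only a sketch: the class names, the check_input function and the list of supported extensions are all invented for illustration and are not in the current code.

```python
import os


class OcrHelperError(Exception):
    """Base class for all errors raised by the (hypothetical) helper."""


class InputFileError(OcrHelperError):
    """The input file is missing or is not a regular file."""


class UnsupportedFormatError(OcrHelperError):
    """The input file type cannot be converted for OCR."""


# Assumption: an illustrative list, not what pyOcrHelper really supports.
SUPPORTED = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}


def check_input(path):
    """Validate `path` and return its lower-cased extension, or raise."""
    if not os.path.isfile(path):
        raise InputFileError("no such file: %s" % path)
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED:
        raise UnsupportedFormatError("cannot OCR files of type %r" % ext)
    return ext
```

With a common base class, a command line wrapper can catch OcrHelperError in one place, write the message to stderr and exit there, while anyone using the class as a library gets to decide for themselves what to do with the error.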
Similar applications:
Just spotted another Python project on Google Code, Clarify, which aims to do more or less the same thing as I'm aiming at - but possibly with multithreading as well. I must have a look at the code and the results. Maybe I can learn something from it.