Image to Text – OCR – Tesseract – Linux – Tutorial




  1. Really interesting project. I have been playing with Tesseract and having pretty good luck with it. I hit a roadblock, however, when I try to get it to read digital LCD screens. Google turns up some potential leads, but so far nothing workable. I would like to use it to read the digital output on a flow-rate meter. I can see lots of applications for such a tool in science and engineering (reading equipment output).
    Most modern science/engineering equipment can be coupled with a computer these days, but not all. For less than $100 this could allow us to digitize even older equipment. Going to keep chasing.

  2. So I scanned some text (see below) from a magazine that was printed in columns, but how do I stop it from appearing bunched up like this? Surely there must be a way to stop it pasting exactly the way it was copied? I tried using Notepad without word wrap but it makes no difference. Can anyone think of a way to make it run horizontally from left to right instead of in one long vertical column? Rearranging this manually line by line is brain-numbing. There must be a hack to sort this out, or is this really an impossible task? I believe there must be a way but I can't fathom how to do it.

    The first article is another
    Me William last-minute
    special this time featuring 
    Asbury Park's second son
    Johnny Lyon. He's known
    to you as Southside. Plus
    lots of pictures.
    Now we know Omaha
    Rainbow has told you 
    everything about John 
    Stewart from his album 
    serial numbers to his inside 
    leg measurement. However
    for a concise over view of
    the man's music look no
    further than guest writer
    Spencer Leigh s critique.
    Ace shu Herman Eugen
    Beer (of CAMRA?) rapped
    briefly with the revitalised
    John Martyn recently.
    Here s what went down.
    TWILLEY BAND (above).
    Famed correspondent of
    Creem magazine Peter
    Langley has come up with
    his first piece for us <*t
    last. The Big Star/Raspbe-
    rries/Twilley angle has
    been covered elsewhere,
    but Pete gives this field
    of American music a new
    and perceptive treatment.
    Mike Heron has been
    steadily building a second
    stage to his career over the
    last few years. If he's
    escaped your attentions
    then catch up with Mike
    Davies of his solo activi-
    Features (on 2 pages)
    fewer reviews but some
    good grist on ASHLEY
    Since the very first issue
    we 've been promising you
    A Lou Reed article, so
    it’s fitting that it should
    appear on our first birth-
    day. Dave Seal never got
    round to writing it, nor
    did Dave Smith, but His
    brother Martin DID (he
    of Spector series fame).
    Inside is a candid apprai-
    sal of the man’s activities
    this decade. The Velvet
    Underground article? Well
    maybe next year.
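
One way to approach the column problem above is to join the short lines back into paragraphs after OCR. The following is only a rough sketch: the function name `reflow_columns` is made up for illustration, and it assumes that a line ending in a hyphen is a word split across two lines, which will be wrong for genuinely hyphenated words that happen to break at a line end.

```python
def reflow_columns(text: str) -> str:
    """Join narrow column lines into paragraphs.

    Assumptions (not from the original post): blank lines separate
    paragraphs, and a trailing hyphen marks a word split across lines.
    """
    paragraphs = []
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-"):
            # Rejoin a split word: drop the hyphen, add no space.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return "\n\n".join(paragraphs)


sample = "Inside is a candid apprai-\nsal of the man's activities\nthis decade."
print(reflow_columns(sample))
```

Saving the OCR output to a text file and passing its contents through a function like this gives one long line per paragraph, which any editor can then word-wrap to the window width.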

  3. That's the reverse history search you get in terminals. Press Ctrl+r, then start typing the command you want to search for. It will constantly update the screen with the most recent command that matches what you've typed so far.

  4. At about 10m23s you're doing some kind of command history search. Can you point me towards the command, or some reference material where I can learn more about it? It looked especially handy! Cool video, very informative!

  5. This will not work on a good captcha. That's why captchas are so hard to read: so you can't do this. If a spam bot is getting past a captcha, it's more likely that there is another security hole on the site and the bot is bypassing the captcha altogether.

  6. The thing with OCR is that (in my experience) even the commercial OCR programs work poorly if the font isn't set properly. If the font is set to the same one as in the image, then the results are almost perfect (the font needs to be installed). That said, I have tried a couple of open-source ones and didn't get good results, but I didn't know about this neat trick with the resolution :). If there is an open-source one that has font settings, I think the results should be better. Great vid btw

  7. I am guessing OCR software has an initial character segmentation stage, where it tries to separate the characters from the image, and it helps to have higher-res images for that.
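
For anyone who wants to try the resolution idea from the last two comments, here is a minimal sketch that only builds the two command lines: one to enlarge the image and one to run Tesseract on the result. It assumes ImageMagick's `convert` and the `tesseract` binary are installed (newer ImageMagick releases may prefer the `magick` command instead). The helper name `build_ocr_commands`, the `resized_` filename prefix, and the 300% factor are arbitrary choices for illustration, not settings taken from the video.

```python
from pathlib import Path


def build_ocr_commands(src, scale_percent=300, out_base="output"):
    """Return (resize_cmd, ocr_cmd) as argument lists.

    Hypothetical helper: nothing is executed here, the lists are
    meant to be passed to subprocess.run() one after the other.
    """
    src_path = Path(src)
    # Write the enlarged copy next to the original image.
    resized = str(src_path.with_name("resized_" + src_path.name))
    resize_cmd = ["convert", str(src_path), "-resize", f"{scale_percent}%", resized]
    # Tesseract writes its text to <out_base>.txt.
    ocr_cmd = ["tesseract", resized, out_base]
    return resize_cmd, ocr_cmd


resize_cmd, ocr_cmd = build_ocr_commands("scan.png")
print(" ".join(resize_cmd))
print(" ".join(ocr_cmd))
```

Whether enlarging actually helps depends on the source image, so it is worth comparing the text output at a couple of different scale factors.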
