📖   Chapter 10

Paste an image to OCR text

Learn how to use shell commands and the pasteboard to make a cool paste widget.

APIs used today    Description
hs.execute         Allows you to call shell commands from Hammerspoon.
hs.eventtap        Lets you type keystrokes automatically.
hs.hotkey          Everything you need to bind functions to hotkeys.
hs.pasteboard      Lets you interact with the system clipboard.

What you’re building

You’re going to build a fancy paste widget that reads an image from your clipboard and pastes out OCR’d text from it. To do this, you’ll combine the open-source tesseract OCR library with some simple Hammerspoon APIs.

Any image in your clipboard can be pasted as text.

Install tesseract

In order to convert an image to OCR text, we need to install the tesseract library:

brew install tesseract
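
Where Homebrew puts the binary depends on the machine: /usr/local/bin on Intel Macs and /opt/homebrew/bin on Apple Silicon. The code later in this chapter hard-codes the Intel path, so here is a minimal sketch of one way to pick whichever exists. The helper name `findTesseract` is hypothetical, not part of Hammerspoon, and the candidate list is an assumption about a default Homebrew install.

```lua
-- Hypothetical helper (not a Hammerspoon API): return the first path in
-- `candidates` that can be opened for reading, or nil if none can.
local function findTesseract(candidates)
  for _, path in ipairs(candidates) do
    local file = io.open(path, "r")
    if file then
      file:close()
      return path
    end
  end
  return nil
end

-- Assumed default Homebrew locations: Apple Silicon first, then Intel.
local tesseractPath = findTesseract({
  "/opt/homebrew/bin/tesseract",
  "/usr/local/bin/tesseract",
})
```

If `tesseractPath` comes back nil, tesseract is probably installed somewhere else, and `which tesseract` in a terminal should show where.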

Create a new file

First, make a config file to hold all your code for this project:

touch ~/.hammerspoon/ocr-paste.lua

And require it in your main config, ~/.hammerspoon/init.lua:

require("ocr-paste")

Convert your pasteboard to OCR

Next, you’ll write a function that takes an image out of your pasteboard and converts it to text with OCR. We write the image out to a temporary location and pass it through tesseract, which saves the recognized text to a file. Once we have that, we can read the result back in and programmatically type it out to simulate a “paste” event.

local function pasteOcrText()
  local image = hs.pasteboard.readImage()

  -- Only continue if the clipboard actually holds an image.
  if image then
    -- First, we save the image to a /tmp path so tesseract can read it in.
    local imagePath = "/tmp/ocr_image.png"
    image:saveToFile(imagePath)

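    -- We shell out to tesseract. It appends ".txt" to the output base name,
    -- so the OCR result lands in /tmp/ocr_output.txt. Homebrew on Apple
    -- Silicon installs to /opt/homebrew/bin, so adjust this path if needed.
    local _, success = hs.execute(
      "/usr/local/bin/tesseract -l eng " .. imagePath .. " /tmp/ocr_output"
    )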

    -- hs.execute returns a success value of true or false, which we can use
    -- to make sure OCR worked ok.
    if success then
      -- Next, we read in the OCR result to `content` using Lua's built-in `io`
      -- library.
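      local file = io.open("/tmp/ocr_output.txt", "r")
      -- io.open returns nil if the file is missing, so bail out in that case.
      if not file then return end
      local content = file:read("*all")
      file:close()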

      -- To simulate "pasting" we can just literally type every character out
      -- using `hs.eventtap`.
      hs.eventtap.keyStrokes(content)

      -- Finally, we need to clean up our two /tmp files.
      hs.execute("rm " .. imagePath .. " /tmp/ocr_output.txt")
    end
  end
end
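
One detail worth guarding against: tesseract's text output usually ends with a newline, and recent versions also append a form feed as a page separator, so typing the file verbatim can leave stray characters after the pasted text. The sketch below shows one way to read the result defensively and trim it. Both helper names, `trimTrailingWhitespace` and `readOcrOutput`, are hypothetical and not part of Hammerspoon.

```lua
-- Hypothetical helper: strip trailing whitespace. Lua's %s class covers
-- spaces, tabs, newlines, and the form feed character.
local function trimTrailingWhitespace(text)
  -- The extra parentheses drop gsub's second return value (the match count).
  return (text:gsub("%s+$", ""))
end

-- Hypothetical helper: read a whole file and trim it, returning nil if the
-- file cannot be opened.
local function readOcrOutput(path)
  local file = io.open(path, "r")
  if not file then
    return nil
  end
  local content = file:read("*a")
  file:close()
  return trimTrailingWhitespace(content)
end
```

Inside pasteOcrText, the result of `readOcrOutput("/tmp/ocr_output.txt")` could be passed to hs.eventtap.keyStrokes whenever it is not nil.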

Bind a hotkey

Now that we have our function, all we need to do is wire it up.

hs.hotkey.bind(super, 'p', pasteOcrText)
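
The binding above relies on a `super` value being visible inside ocr-paste.lua, presumably a modifier list defined as a global elsewhere in your config. If it was declared `local` in another file, it will be nil here. A minimal sketch of a defensive version follows. The name `ocrModifiers` and the cmd+alt+ctrl fallback are assumptions for illustration, and the empty function only stands in for the real pasteOcrText so the snippet runs on its own.

```lua
-- Reuse a global `super` if one exists; otherwise fall back to an assumed
-- cmd+alt+ctrl chord (hypothetical default, pick whatever suits you).
local ocrModifiers = super or { "cmd", "alt", "ctrl" }

-- Placeholder standing in for the pasteOcrText function defined earlier.
local function pasteOcrText() end

-- `hs` only exists inside Hammerspoon, so guard the call to keep this
-- snippet loadable in a plain Lua interpreter.
if hs then
  hs.hotkey.bind(ocrModifiers, "p", pasteOcrText)
end
```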


Get the entire script

Want to just paste in this whole project to your ocr-paste.lua file?

TK fill this in at the very end once we're sure all the code is solid