I just needed to grab text from some Photoshop images and put it into a LaTeX document. But how do I get that text on Linux? GIMP can’t do it, nor can any other program I tried (they all rasterize the text). Fortunately the text is saved as plain text in the .psd file, so it’s relatively easy to get it out of there. However, it’s encoded as UTF-16, so with just grep it’s quite painful. Here’s a little Python script which finds and decodes text from .psd images (inline version below).
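To see why plain grep struggles, here is a small illustration. The buffer is made up and only stands in for a piece of a .psd file. I am assuming the layout the script relies on, namely big-endian UTF-16 followed by a two-byte null terminator. The names `text`, `stored` and `buf` are just for this sketch.

```python
# Illustration only: a made-up byte buffer standing in for part of a .psd file.
text = "This is a text layer."

# Assumed layout: UTF-16 big-endian, terminated by two zero bytes.
stored = text.encode("utf-16-be") + b"\x00\x00"
buf = b"8BIM junk " + stored + b" more junk"

# An ASCII search fails because every character is preceded by a zero byte.
print(text.encode("ascii") in buf)      # False
# Searching for the UTF-16BE encoding of the same text succeeds.
print(text.encode("utf-16-be") in buf)  # True
# The first four characters as they sit in the buffer:
print(stored[:8])                       # b'\x00T\x00h\x00i\x00s'
```

So the text really is there in the clear, just with a zero byte in front of every ASCII character.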
You use it by passing it the .psd file and the beginning of the text you want to find, and it’ll give you the rest.
python psdtextextract.py myimage.psd "This is"
will result in “This is a text layer.” if that’s the text your layer indeed contains. To find a sample of the text, you can open the image in any viewer, such as Gwenview. Quite stupid, I know, but it works 😉
Inline source code:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from codecs import encode, decode


def get_next_occurence(text, buf, start):
    # Encode the search text as UTF-16 and cut the BOM. On a little-endian
    # machine this yields "T\x00h\x00...", which matches the big-endian
    # data in the file one byte after the real start of the string.
    text = encode(text, "utf-16")[2:]  # cut BOM
    index = buf.find(text, start)
    if index == -1:
        return b"", -1
    # The text ends at the next pair of zero bytes.
    end = buf.find(b'\x00\x00', index)
    if end == -1:
        return b"", -1
    # Put back the leading zero byte that the match skipped.
    chunk = b'\x00' + buf[index:end + 2]
    # Drop backslash bytes before decoding.
    return chunk.replace(b'\x5C', b''), end


def get_all_occurences(text, buf):
    start = 0
    items = []
    while start != -1:
        try:
            found, start = get_next_occurence(text, buf, start)
            items.append(decode(found, "utf-16be").replace('\r', '\n'))
        except UnicodeDecodeError as e:
            # Report the problem first, then move on to the next match.
            print(" - undecodable match skipped:", e)
            continue
    return items


if __name__ == '__main__':
    import sys
    if len(sys.argv) > 4 or len(sys.argv) < 3:
        print("Usage: psdtextextract.py file.psd pattern [display-all-matches]")
        sys.exit(1)
    with open(sys.argv[1], 'rb') as f:
        items = get_all_occurences(sys.argv[2], f.read())
    if "display-all-matches" not in sys.argv:
        # Only the first match by default.
        print(items[0])
    else:
        for item in items:
            print(item)
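
If you want to play with the idea without a .psd at hand, here is a minimal sketch of a variation. It is not the script above. It searches for the big-endian encoding of the prefix directly and walks in two-byte steps to find the terminator. The function name `find_text` and the sample buffer are hypothetical, and it does not do the backslash removal the script performs. It also assumes the same layout as before (UTF-16BE with a two-byte null terminator).

```python
def find_text(buf, prefix):
    """Return every UTF-16BE string in buf that starts with prefix."""
    needle = prefix.encode("utf-16-be")
    results = []
    pos = buf.find(needle)
    while pos != -1:
        end = pos
        # Walk in 2-byte steps so the terminator is only recognised
        # on a code-unit boundary relative to the match.
        while end + 2 <= len(buf) and buf[end:end + 2] != b"\x00\x00":
            end += 2
        results.append(buf[pos:end].decode("utf-16-be", errors="replace"))
        # Continue after this string; max() guarantees forward progress.
        pos = buf.find(needle, max(end, pos + 2))
    return results


# Synthetic buffer for demonstration; not real .psd data.
sample = (
    b"\x01\x02"
    + "This is a text layer.".encode("utf-16-be") + b"\x00\x00"
    + b"\xff"
    + "This is another one".encode("utf-16-be") + b"\x00\x00"
)
print(find_text(sample, "This is"))
# ['This is a text layer.', 'This is another one']
```

Stepping two bytes at a time avoids cutting a string short when the low byte of one character and the high byte of the next both happen to be zero.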