I’ve shared a small project I recently wrote on SourceForge: pdfocrwrapper.sourceforge.net. This tool is a wrapper around the excellent ABBYY OCR engine, which is also available for Linux. The wrapper recursively walks a directory tree and submits every PDF it finds (and has not yet processed) to ABBYY. It is also flexible enough to work with other engines.
With this wrapper, you can keep scanning all your documents into PDFs. The wrapper runs asynchronously over those PDFs and applies the OCR engine to them. As a result, you’ll be able to index and search the content of these files.
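To give a rough idea of the recursive scan described above, here is a minimal Python sketch. It is not the actual pdfocrwrapper code. The marker-file suffix `.ocr-done` and the function names are hypothetical, and the OCR engine is passed in as a generic command list because the exact ABBYY command line is not shown here. The sketch assumes the engine accepts the PDF path as its last argument and signals success with exit code 0.

```python
import subprocess
from pathlib import Path

# Hypothetical suffix of the marker file that records a finished PDF.
DONE_SUFFIX = ".ocr-done"


def find_pending_pdfs(root):
    """Return all PDFs under root that have no marker file yet, sorted."""
    pending = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() == ".pdf":
            marker = path.with_name(path.name + DONE_SUFFIX)
            if not marker.exists():
                pending.append(path)
    return sorted(pending)


def run_engine(command, pdf):
    """Run the OCR command with the PDF path appended; True on exit code 0."""
    result = subprocess.run(list(command) + [str(pdf)])
    return result.returncode == 0


def process_tree(root, command):
    """OCR every pending PDF under root and mark the ones that succeed."""
    processed = []
    for pdf in find_pending_pdfs(root):
        if run_engine(command, pdf):
            # Leave a marker so the next run skips this file.
            pdf.with_name(pdf.name + DONE_SUFFIX).touch()
            processed.append(pdf)
    return processed
```

Calling `process_tree("scans", ["my-ocr-command"])` from a scheduled job (both names are placeholders) would pick up only the PDFs added since the last run. Because the engine is just a command list, swapping in a different OCR engine means changing that one argument.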