Friday, February 19, 2010

Is an OCR PDF larger than a TIFF?

A question that often comes up around the Optical Character Recognition (OCR) process is one of file size: is a searchable PDF going to be larger than a plain image-only PDF or a TIFF image?  When converting to image only, PDF and TIFF files are typically about the same size, and both use compression.  The text layer in a searchable PDF adds only a very small increment compared to the overall size of the file.  The key to keeping file sizes as small as possible is to apply image processing before recognition to clean your image, remove speckles, etc.  This requires an OCR engine or document capture application that provides the means to process images.
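
To illustrate why cleanup helps, here is a minimal sketch in Python of removing speckles from a black-and-white page and comparing compressed sizes before and after. Everything in it is an assumption made for illustration: the page is synthetic (solid bars standing in for text, plus random specks), the function names `despeckle`, `compressed_size`, and `make_noisy_page` are hypothetical, and zlib is only a stand-in for the compression a real TIFF or PDF would use. A capture application would use its own, more sophisticated filters.

```python
import random
import zlib


def despeckle(image):
    """Return a copy of a bilevel image with isolated black pixels removed.

    `image` is a list of rows; each pixel is 1 (black) or 0 (white).
    A black pixel counts as a speckle when none of its 8 neighbours is black.
    """
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_black_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbour:
                cleaned[y][x] = 0
    return cleaned


def compressed_size(image):
    """Pack pixels 8 per byte and return the zlib-compressed length in bytes."""
    bits = [pixel for row in image for pixel in row]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return len(zlib.compress(bytes(packed), 9))


def make_noisy_page(width=256, height=256, speckles=400, seed=0):
    """Build a synthetic page: solid bars standing in for text, plus random specks."""
    rng = random.Random(seed)
    page = [[0] * width for _ in range(height)]
    for top in range(32, height - 32, 32):
        for y in range(top, top + 8):
            for x in range(32, width - 32):
                page[y][x] = 1
    for _ in range(speckles):
        page[rng.randrange(height)][rng.randrange(width)] = 1
    return page


noisy = make_noisy_page()
clean = despeckle(noisy)
print("noisy page:     ", compressed_size(noisy), "bytes")
print("despeckled page:", compressed_size(clean), "bytes")
```

Random specks are hard for a compressor to predict, so the noisy page should come out larger than the cleaned one. The rule used here (drop any black pixel with no black neighbour) is deliberately crude and could erase small marks such as the dot of an "i" on a low-resolution scan, which is one reason to rely on the tuned filters in a proper capture tool.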

