
1.65. How to emphasize existing OCR for improved text recognition of PDF/KES files

Last Updated: March 2021

Applies to: Kurzweil 3000 v20 Subscription/Web License Edition, kurzweil3000.com, Kurzweil 3000 v20 for Windows

Issue:

Prior to Kurzweil 3000 version 20.07 (March 2021), the Kurzweil 3000 platform emphasized its own recognition over pre-existing remediated OCR in PDF documents.

In most cases, this meant that Kurzweil 3000 would rerun OCR on PDF documents that already contained recognized text, rather than using the existing text.

For example, if you obtain PDF files from the Access Text Network or a publisher, Kurzweil 3000 can now emphasize the pre-existing remediated OCR and use it instead of redoing recognition.

If you use PDF files from publisher sources, the existing OCR of those documents is usually of high quality. We recommend enabling the settings described below because those PDFs will open faster and Kurzweil 3000 will emphasize their existing OCR.

These features are only available in Kurzweil 3000 v20.07 or later.

Solution:

The setting is tied to your account on kurzweil3000.com. If you also use the installed Kurzweil 3000 desktop application, check the setting there as well. This solution works best with the web app (kurzweil3000.com).

kurzweil3000.com

1. Log in to kurzweil3000.com with your username/password.

2. Go to My Account > Settings.

3. On the PDF Scanning Settings page, make sure that Enable Remediated Processing and Emphasize Embedded Text are checked. Then click Update PDF Scanning Settings.

4. Go back to your Universal Library, drill into your private folder, and click Upload to upload a PDF file whose recognition was not ideal before.

5. After the file is uploaded, click it to open it. Go through the chute to open the file, and be patient while the pages load. You will most likely see a red error message saying that the file could not be displayed. This is expected: clicking the file started the processing job on our server.

6. Try again about 30 seconds after the error. The file should open once the processing job has progressed far enough for it to display.

7. The file is now ready to copy and distribute to students. The recognized text of the file should match the embedded text from the publisher.

Notes:

The upside of this method is that the resulting PDF/KES file has clean text from the publisher. You shouldn't have to make manual corrections to the file via Zone Editing or Editing Underlying Text.

The downside of this method is that the processing happens up front, so the file often errors out the first time you try to open it. If it does, try opening the file again a few moments later and it should open normally.
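For readers who think in code, the "try, wait about 30 seconds, try again" pattern described above can be sketched as a small retry loop. This is only an illustration of the pattern, not a Kurzweil 3000 API: `open_with_retry`, `FileNotReadyError`, and the `open_file` callable are hypothetical names, and the 30-second delay simply mirrors the wait suggested in step 6.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FileNotReadyError(Exception):
    """Hypothetical stand-in for the red 'file could not be displayed' error."""


def open_with_retry(
    open_file: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call open_file until it succeeds, waiting delay_seconds between tries.

    open_file is a hypothetical callable that raises FileNotReadyError while
    the server-side processing job is still running.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return open_file()
        except FileNotReadyError:
            # Give up after the last allowed attempt; otherwise wait and retry.
            if attempt == attempts:
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting.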

Kurzweil 3000 for Windows

1. Update/patch Kurzweil 3000 to the latest version using the links below:

Windows: https://www.kurzweiledu.com/news-resources/software-updates/ki-3000-windows-version20-updates.html

Mac: https://www.kurzweiledu.com/news-resources/software-updates/ki-3000-macintosh-version20wl-updates.html

2. Within the Kurzweil 3000 menu (for the Windows app), go to Tools > Options > Scanning and make sure that Emphasize Embedded Text is checked.

 

 

 

 
