Word-Level Alignment of Paper Documents with their Electronic Full-Text Counterparts

04/30/2021
by Mark-Christoph Müller, et al.

We describe a simple procedure for automatically creating word-level alignments between printed documents and their respective full-text versions. The procedure is unsupervised, uses only standard, off-the-shelf components, and reaches an F-score of 85.01 in the basic setup and up to 86.63 with pre- and post-processing. Potential areas of application are manual database curation (including document triage) and biomedical expression OCR.
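To make the task concrete, the following is a minimal sketch of word-level alignment and its F-score evaluation. It is not the procedure from the paper, whose components the abstract does not name: it assumes both documents are already tokenized and uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner. The function names `align_words` and `f_score` and the sample tokens are hypothetical.

```python
from difflib import SequenceMatcher


def align_words(ocr_tokens, text_tokens):
    """Return (ocr_index, text_index) pairs for tokens that match exactly.

    Assumption: difflib is used here as a generic stand-in aligner;
    it is not the component used in the paper.
    """
    matcher = SequenceMatcher(a=ocr_tokens, b=text_tokens, autojunk=False)
    pairs = []
    for block in matcher.get_matching_blocks():
        # Each block describes a run of `size` identical tokens.
        for offset in range(block.size):
            pairs.append((block.a + offset, block.b + offset))
    return pairs


def f_score(predicted, gold):
    """Harmonic mean of precision and recall over alignment pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample: the OCR output contains one misrecognized word.
ocr_tokens = ["The", "procedure", "is", "unsupervisod", "."]
text_tokens = ["The", "procedure", "is", "unsupervised", "."]
gold_pairs = [(i, i) for i in range(len(text_tokens))]

predicted_pairs = align_words(ocr_tokens, text_tokens)
score = f_score(predicted_pairs, gold_pairs)
```

Exact matching leaves the misrecognized token unaligned, which lowers recall while precision stays perfect. A practical system would presumably need some tolerance for OCR errors, which this sketch deliberately omits.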