The CATMuS initiative: building large and diverse corpora for handwritten text recognition
ENS-PSL
45 rue d'Ulm
75005 Paris
France
DHAI seminar session with Thibault Clérice (INRIA) and Malametenia Vlachou-Efstathiou (IRHT/IMAGINE-ENPC)
The CATMuS (Consistent Approaches to Transcribing ManuScripts) initiative is a set of datasets and guidelines for training large HTR (handwritten text recognition) models that generalize well. In this presentation, we discuss the challenges of recognizing handwritten text in historical documents spanning a long period and many languages, the choices we faced, and how we addressed them. We will present the resulting dataset for the Middle Ages, the first to be published by the CATMuS initiative, along with initial results from some models.