30 nov 2015 14:25
The Digital Age we live in has brought with it large-scale digitization of historical records. Collections of documentary material in many languages are currently being digitized around the globe and made available on the Internet. Thus, the modern scholar is often confronted with easily accessible images of hundreds of thousands of potentially relevant full or fragmentary documents, as well as numerous transcribed texts. Without appropriate computer aids, however, one is faced with the challenge of isolating the sought-after needles in a proverbial haystack of online images and texts. State-of-the-art optical character recognition still comes nowhere near providing quality searchable texts for historical handwritten material, but there are now prototypes of an array of other computational tools that can be brought to bear on such documents. We describe the underlying machine-learning methods and the scholarly tools that are emerging from them. For such tools to be of value to scholars, the results and the reasoning behind them must be presented in a meaningful form.

Nachum Dershowitz will speak as part of the PSL E-philologie programme.

The E-Philologie programme is led by the École Nationale des Chartes (ENC), the École des Hautes Études en Sciences Sociales (EHESS), the École Normale Supérieure (ENS) and the École Pratique des Hautes Études (EPHE), and is funded by PSL. It aims to establish a research seminar in digital philology, paired with courses for advanced students (second-year master's and doctoral level) and researchers, in order to train students in electronic publishing techniques and to create research synergies between institutions at the national and international level.

