Seminar

Building Multilingual BookNLP

Wednesday, 21 June 2023
10am to 12pm
ENS, Data Science Center conference room

ENS-PSL
45 rue d'Ulm
75005 Paris
France

David Bamman's next seminar will take place on Wednesday, 21 June, from 10am to 12pm at the École normale supérieure (rue d'Ulm, Paris).

Abstract:

BookNLP (Bamman et al. 2014) is a natural language processing pipeline for reasoning about the linguistic structure of text in books, specifically designed for works of fiction.  In addition to its pipeline of part-of-speech tagging, named entity recognition, and coreference resolution, BookNLP identifies the characters in a literary text, and represents them through the actions they participate in, the objects they possess, their attributes, and dialogue.  The availability of this tool has driven much work in the computational humanities, especially surrounding character (Underwood et al. 2018; Kraicer and Piper 2018; Cheng 2020).  At the same time, however, BookNLP has had one major limitation: it currently only supports texts written in English.  In this talk, I will describe our efforts to expand BookNLP to support literature in languages beyond English, and create a blueprint for others to develop it for additional languages in the future.

The talk will be followed by a Q&A session on BookNLP.

Bio:

David Bamman is an associate professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture. Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University. Bamman’s work is supported by the National Endowment for the Humanities, National Science Foundation, the Mellon Foundation and an NSF CAREER award.

Wednesday, 21 June 2023, 10am-12pm (Paris time)

Data Science Center conference room
(3rd floor, corridor between staircase B and staircase C)
Ecole normale supérieure 
45 rue d'Ulm 75005 Paris

Pre-registration by Tuesday, 20 June at noon at https://forms.gle/SwhBqjMFh8Fj5oTh6 is mandatory. If you have any trouble with the Google Form, you can also register by emailing the organizer.

Due to the security protocol in force, pre-registration is mandatory (see the end of this email). The session should in principle be streamed on Zoom (conditions permitting) for participants who cannot attend in person. A link will be sent by email on the day of the event.

Funded by Translitterae: https://www.translitterae.psl.eu/david-bamman/
