Wednesday 6 November 2024
From 3 to 6 PM
ENS-PSL, salle de conférences du Centre Sciences des Données
ENS-PSL, 45 rue d'Ulm, 75005 Paris, France
48.8418371, 2.3440403
Fifth session of the course "Enquêtes quantitatives. Boîte à outils pour sciences sociales", given by Théo Boulakia.
R. was born in 1946. From the age of 14 to 35, he worked for free on the family farm. He then worked as a tenant farmer until his retirement at 56. How many trajectories resemble R.'s? This question falls within the scope of a family of methods known as sequence analysis, whose best-known representative is Optimal Matching. The basic unit of this method is a sequence, i.e. a series of ordered elements: a professional career, a day's schedule, a sentence, a dance. To calculate the distance between two sequences, the algorithms count the minimum number of elementary operations (insertion, deletion, substitution) required to move from one sequence to the other. The resulting distance matrix can be used to create sequence typologies. Implementing and interpreting the results with R is particularly straightforward. Most of the trial and error lies upstream: how do I recode my data so that the elementary components of each sequence reflect as precisely as possible the question I'm asking? What "cost" should be assigned to the substitution of one portion of a sequence for another? We'll explore these questions with two surveys: one on professional careers, the other on time use.
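The distance computation described above can be illustrated with a minimal sketch. It is written in Python for convenience, whereas the session itself works in R, and it is not the course's own code. Several things here are assumptions made for the example: the function name `om_distance`, the yearly coding of R.'s career into states `F` (unpaid family worker), `T` (tenant farmer) and `R` (retired), the age boundaries (14 to 34, 35 to 55, 56 to 60), the second, entirely hypothetical trajectory, and the costs (insertion/deletion at 1, substitution at 2 unless a specific cost is supplied).

```python
def om_distance(a, b, indel=1.0, sub_cost=None, default_sub=2.0):
    """Weighted edit distance between two state sequences.

    a, b        : sequences of states (here, strings with one letter per year)
    indel       : cost of inserting or deleting one element (assumed value)
    sub_cost    : optional dict {(state_x, state_y): cost} for specific pairs
    default_sub : substitution cost for pairs not listed in sub_cost (assumed)
    """
    def sub(x, y):
        if x == y:
            return 0.0
        if sub_cost is not None:
            # Costs are treated as symmetric: look up (x, y), then (y, x).
            return sub_cost.get((x, y), sub_cost.get((y, x), default_sub))
        return default_sub

    # prev[j] holds the cheapest way to turn the first i-1 elements of a
    # into the first j elements of b; start with i-1 = 0 (insertions only).
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            curr[j] = min(
                prev[j] + indel,                         # delete a[i-1]
                curr[j - 1] + indel,                     # insert b[j-1]
                prev[j - 1] + sub(a[i - 1], b[j - 1]),   # substitute (or match)
            )
        prev = curr
    return prev[len(b)]


# R.'s career, one state per year of age from 14 to 60 (assumed coding).
r_traj = "F" * 21 + "T" * 21 + "R" * 5
# Hypothetical comparison: someone who became a tenant farmer ten years earlier.
other = "F" * 11 + "T" * 31 + "R" * 5

d_default = om_distance(r_traj, other)                          # 20.0
d_cheap = om_distance(r_traj, other, sub_cost={("F", "T"): 1.0})  # 10.0
print(d_default, d_cheap)
```

The two results show why the choice of costs matters: with substitution at 2, the ten differing years cost 20, and with a lower cost of 1 between the two farming states the same pair of careers is only 10 apart. Applying the function to every pair of sequences would give the distance matrix from which typologies are built.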
Data analysis course in social sciences, delivered by Théo Boulakia. The sessions take place on Mondays, from 3 p.m. to 6 p.m., in the conference room of the Data Science Center (ENS-PSL, 45 rue d'Ulm, at the top of stairs B or C).
Goals of the course: This course offers an introduction to various quantitative methods based on social science surveys. Each session is organized around the meeting between a question (sociological, anthropological, historiographical), data and a method: cartography, dimensionality reduction, partitioning, sequence analysis, textual analysis, Bayesian statistics. The objective of the course is to acquire a schematic understanding of the implementation of these methods, their merits and their limitations. How can spatial data and temporal sequences be represented? What do Bayesian statistics provide that the frequentist approach lacks? How can the morphosyntactic properties of a text be analyzed? How does one go from a large number of variables to a small number of classes? These questions will arise in context, through the back-and-forth adjustment between data, method and research question that characterizes an investigation. Programming questions will be covered only in broad terms; no experience in this area is required.
Validation: Submit a four-page document applying one of the methods covered in class to your own data: presentation of the data, relevance of the method, implementation and interpretation. People without programming experience will be assisted with implementation.