Erin Pacquetet, Ph.D

Data Scientist and Linguist

erinmorr@buffalo.edu

About me

I hold a doctorate in Linguistics and specialize in Data Science and Natural Language Processing. My research centers on the analysis of language production data, with a special focus on keystroke logs and typing patterns. I have received graduate-level training in Machine Learning, Information Retrieval, Data Analysis, and Statistics. I am also well-versed in multiple programming languages and NLP toolkits, which I use daily in various research projects.

My dissertation focuses on the linguistic analysis of typing patterns, with the aim of understanding the relationship between how someone types and what they are typing. I am interested in how computational techniques can be used to investigate production as it unfolds in time, in order to better understand the production processes at play and how they are influenced by the linguistic characteristics of the language being typed. For this project, I secured funding, set up experiments, and collected large typing datasets. I then wrote my own Python scripts to analyze these datasets qualitatively and quantitatively, in order to create a production model of typing.

Projects

Since January 2022, I have been part of the Deep Learning for Language Assessment (DLLA) project, a collaboration between Université de Paris Cité and King’s College London. The goal of this project is to use typing data to automatically assess a learner's language proficiency and to provide automated visual feedback that helps them learn more effectively.

In 2020-2021, I participated in the Alexa Prize Socialbot Grand Challenge 4 with Team PROTO as a linguistic specialist, and we achieved 3rd place worldwide. We worked on designing a Socialbot able to sustain varied and engaging conversations on a wide range of topics, using conversational AI techniques. As a linguistic specialist, my main tasks included helping to collect a large dataset of conversational data using the "Alexa let's chat" feature, analyzing and parsing this data using MySQL, proposing sets of conversational rules and fixes to implement, and contributing to technical research papers.


From Fall 2020 to Spring 2022, I was a research assistant for Dr. Fabiola Henri, working on a lexical database of French-based Creoles. This involved compiling dictionary data and writing Python scripts to automate the collection of new words from different text sources.

CV

My full academic CV in .pdf format can be found here

Education:
2024 Ph.D in Linguistics, University at Buffalo
2021 M.S in Computational Linguistics, University at Buffalo
2018 M.A in Linguistics, Université Paris Diderot
2016 B.A in English Language, Literature and Foreign Civilisations, Université Paris Diderot

Publications:
2021 Proto: A Neural Cocktail for Generating Appealing Conversations - Saha, Das, Soper, Pacquetet, Srihari. In 4th Proceedings of the Alexa Prize
2019 Prototype de feedback visuel des productions écrites d’apprenants francophones de l’anglais sous Moodle - Ballier, Gaillat, Pacquetet. EIAH 2019
2019 Investigating Keylogs as Time-Stamped Graphemes - Ballier, Pacquetet, Arnold. Grafematik 2018

Skills:
Human Languages: French (native), English (Full Professional Proficiency), German (Limited Working Proficiency)
Programming Languages: Python, Perl, R, JavaScript