A record-setting blizzard is raging through Montreal. Your friend says "Wow, the weather is so amazing!" While humans can easily interpret sarcastic or ironic statements like this one, LLMs often struggle with such linguistic phenomena. In this paper (ACL 2025 - Main Conference), we propose a rhetorical-strategy-aware probabilistic framework to model such uses of language. We show that this framework enables LLMs to interpret non-literal language more realistically. In a nutshell, our framework models figurative language through different rhetorical listeners: one listener interprets language literally (they think you mean exactly what you say), another interprets language ironically (they think you mean the opposite of what you say), and so on. These listeners and their interpretations are then marginalized together to produce distributions which, as we show, are often compatible with human expectations!
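To make the marginalization step concrete, here is a minimal Python sketch of the idea. The listener set, the listener priors, and the `opposite` map are illustrative assumptions, not the paper's exact model:

```python
# A minimal sketch of marginalizing over rhetorical listeners,
# P(m | u) = sum_l P(m | u, l) * P(l). Illustrative only.
from collections import defaultdict

def literal_listener(utterance, meanings):
    """Puts all probability mass on the utterance's literal meaning."""
    return {m: 1.0 if m == utterance else 0.0 for m in meanings}

def ironic_listener(utterance, meanings, opposite):
    """Puts all probability mass on the opposite of the literal meaning."""
    return {m: 1.0 if m == opposite[utterance] else 0.0 for m in meanings}

def interpret(utterance, meanings, opposite, listener_prior):
    """Marginalize listener-specific interpretations under a listener prior."""
    listeners = {
        "literal": literal_listener(utterance, meanings),
        "ironic": ironic_listener(utterance, meanings, opposite),
    }
    posterior = defaultdict(float)
    for name, dist in listeners.items():
        for m, p in dist.items():
            posterior[m] += p * listener_prior[name]
    return dict(posterior)

# Toy example: a blizzard is raging, so the ironic reading should dominate
# if we believe the speaker is probably being sarcastic.
meanings = ["amazing weather", "terrible weather"]
opposite = {"amazing weather": "terrible weather",
            "terrible weather": "amazing weather"}
print(interpret("amazing weather", meanings, opposite,
                listener_prior={"literal": 0.2, "ironic": 0.8}))
# {'amazing weather': 0.2, 'terrible weather': 0.8}
```

With a prior that favours the ironic listener (as the blizzard context would suggest), the marginal distribution places most of its mass on the non-literal reading.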
Does This Summary Answer My Question? Modeling Query-Focused Summary Readers with Rational Speech Acts
This paper, which was presented as a poster at the CustomNLP4U Workshop, proposes modeling the readers of query-focused summaries with LLMs in order to re-rank candidate summaries based on readers' information needs. We found that likelihood-based scoring methods (e.g., LLM-induced conditional likelihoods) can perform worse than random when scoring candidate summaries, and that our method mitigates this through a reader-based re-ranking procedure grounded in the Rational Speech Acts framework.
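As a rough illustration of the reader-based re-ranking idea, the sketch below scores each candidate summary by how strongly a simulated reader would recover the target information need from it, normalizing over competing queries. The function `reader_likelihood` (here a toy word-overlap stand-in for an LLM-induced conditional likelihood) and all other names are assumptions for illustration, not the paper's implementation:

```python
import math

def toy_likelihood(query, summary):
    # Crude stand-in for an LLM reader's log P(query | summary):
    # log of the word overlap between query and summary.
    overlap = len(set(query.lower().split()) & set(summary.lower().split()))
    return math.log(overlap + 1e-6)

def rerank(candidate_summaries, queries, reader_likelihood, target_query):
    """Rank candidates by how informative each summary is about the target
    query relative to competing queries (an L0 -> S1 style normalization)."""
    scored = []
    for summary in candidate_summaries:
        # Simulated reader: log P(q | summary) for every plausible query.
        logps = {q: reader_likelihood(q, summary) for q in queries}
        norm = math.log(sum(math.exp(lp) for lp in logps.values()))
        scored.append((logps[target_query] - norm, summary))
    return [s for _, s in sorted(scored, reverse=True)]

queries = ["What are the side effects?", "What is the dosage?"]
candidates = ["Side effects include nausea.", "Take two tablets daily."]
print(rerank(candidates, queries, toy_likelihood, "What are the side effects?"))
# The summary that answers the target query is ranked first.
```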
Identifying and Analyzing Task-Encoding Tokens in Large Language Models (2024)
This preprint was the result of a project led by the wonderful Yu Bai, who analyzed the behaviour of large language models (LLMs) during in-context learning. We found that the downstream performance of LLMs is quite sensitive to task-encoding tokens (e.g., the tokens making up the prompt template). For example, we observe that model performance drops to 0% (yes, worse than random) when the prompt contains input-output demonstrations but no template tokens.
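For illustration only (the tasks and templates in the preprint may differ), here is what the two prompt conditions might look like for a sentiment task, with and without task-encoding template tokens:

```python
# Hypothetical example of the two prompt conditions; not the paper's exact setup.
demos = [("I loved this movie!", "positive"), ("Utterly boring.", "negative")]
query = "A stunning, heartfelt film."

# With task-encoding template tokens ("Review:", "Sentiment:").
with_template = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in demos)
with_template += f"\nReview: {query}\nSentiment:"

# Without template tokens: the same demonstrations, input-output pairs only.
without_template = "\n".join(f"{x}\n{y}" for x, y in demos) + f"\n{query}\n"
```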
Qualitative Code Suggestion: A Human-Centric Approach to Qualitative Coding (2023)
This publication was the result of the work I carried out during my Master's at McGill with my wonderful advisors Jackie Cheung and Samira Rahimi. We showed that qualitative coding, a qualitative research technique, can be partially automated in a way that better aligns with the desires of the scientists who frequently conduct these analyses.
McGill BabyLM Shared Task Submission: The Effects of Data Formatting and Structural Biases (2023)
Ziling Cheng, Rahul Aralikatte, Ian Porada, Cesare Spinoso-Di Piano, and Jackie CK Cheung. 2023. McGill BabyLM Shared Task Submission: The Effects of Data Formatting and Structural Biases. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, pages 179–192, Singapore. Association for Computational Linguistics.
This publication was the result of a project led by the brilliant and tremendously hard-working Ziling Cheng in the context of the BabyLM Challenge. Through our experiments, we showed that more careful data preprocessing decisions can lead to performance gains for language models trained on very small amounts of data.
Mental Health–Related Emergency Department Visits in Adolescents Before and During the COVID-19 Pandemic: A Multicentric Retrospective Study (2021)
This publication was the result of work conducted with Dr. Nicholas Chadi and Dr. Olivier Drouin while I was a data analyst at the Research Centre of Sainte-Justine University Hospital. Through our analyses, we showed a significant increase in eating-disorder-related emergency department visits among adolescents between 2018-2019 (pre-pandemic) and 2020 (the peak of the pandemic).