Selection of Projects

Project: scikit-learn
Date: 2025-2026
Summary: scikit-learn is a Python package for machine learning, covering tasks such as classification, regression, and clustering.
My role in it: Starting October 2025, I will assist the scikit-learn maintainers with the general maintenance of the project and the development of new features.
Links: My contributions

Project: Refactoring the clembench framework
Date: September - December 2024
Summary: The clembench framework is designed to evaluate Large Language Models (such as the GPTs and other open or commercially available models) by letting them play language games against each other.
My role in it: I initiated the refactoring of the code to separate the specific games from the framework. This paved the way for making the framework a separate module and made it easier to add new games.
Links: Code

Project: Exploring entity status in contextual word embeddings
Date: 2022
Summary: My colleague and I set out to explore whether current language models encode (i.e., “know”) that an entity (like an object or person) has already been mentioned in the context. This is essential for knowing how to refer to it: “the book” assumes that we already know which book we are talking about, whereas “a book” can be used to introduce a new one.
My role in it: I implemented a classifier that took the contextualized word representations from a model and used them to predict whether the corresponding entity was known or new in the given context.
Links: Code, Paper

Project: Introduction to huggingface
Date: August 2022
Summary: Together with some colleagues, I designed and gave a workshop on Deep Learning for linguists within the CRC 1287.
My role in it: I set up a Jupyter notebook on using models from the huggingface library (as it was in August 2022) and guided participants through it.
Links: Notebook

Anne Beyer