Marion Di Marco (née Weller)

Post-Doc Researcher at TUM (School of Computation, Information and Technology)

Generative Models on Text

Summer Semester 2025

Master's Seminar: Generative Models on Text

Large language models (such as GPT-2, GPT-3, GPT-4, Llama, and T5) and the intelligent chatbots built on them (such as ChatGPT, Claude, Gemini, and Copilot) are a highly timely topic.

Contents: N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models

Instructors: Prof. Alexander Fraser, Marion Di Marco


Location: Room D.2.11

Time: Tuesday 16:15 – 17:45


Lectures

Lecture 1 29. 04. 2025 Organization and Introduction to Linguistic Concepts
Lecture 2 06. 05. 2025 N-gram Models (without section 3.7)
(Dan Jurafsky and James H. Martin (2025). Speech and Language Processing)
Slides
Lecture 3 13. 05. 2025 Bengio et al. (2003): A Neural Probabilistic Language Model.
(Journal of Machine Learning Research 3, 1137-1155)
Lecture 4 20. 05. 2025 Talk by Prof. Victoria Nash: AI and the Evolution of Digital Childhood
(16:15 – 18:00, Room D.0.0.1)
Lecture 5 27. 05. 2025 Smith (2019): Contextual Word Representations: A Contextual Introduction.
(arXiv)
Lecture 6 03. 06. 2025 Lena Voita. NLP Course: Neural Language Models and
Sequence to Sequence and Attention (Web Tutorial)
10. 06. 2025 Whitsun Vacation – no lecture
Lecture 7 17. 06. 2025 Vaswani et al. (2017): Attention Is All You Need (NIPS)
Guest lecture by Dr. Lukas Edman (slides)
Lecture 8 24. 06. 2025 Devlin et al. (2019): BERT: Pre-training of Deep Bidirectional Transformers
Lecture 9 01. 07. 2025 Ouyang et al. (2022): Training language models to follow instructions
with human feedback. (arXiv) slides
Lecture 10 08. 07. 2025 Paper Presentations
(1) Kang et al. (2025): Unfamiliar Finetuning Examples Control How
   Language Models Hallucinate.
(2) Hou et al. (2023): Effects of sub-word segmentation on performance
   of transformer language models.
Lecture 11 15. 07. 2025 Paper Presentations
(1) Hu et al. (2025): Fine-Tuning Large Language Models with Sequential
   Instructions.
(2) Mondshine et al. (2025): Beyond English: The Impact of Prompt Translation
   Strategies across Languages and Tasks in Multilingual LLMs.
(3) Liu et al. (2025): Is Translation All You Need? A Study on Solving
   Multilingual Tasks with Large Language Models.
(4) Bang et al. (2024): Measuring Political Bias in Large Language Models:
   What Is Said and How It Is Said.
Lecture 12 22. 07. 2025 Paper Presentations
(1) Xu et al. (2025): LLM The Genius Paradox: A Linguistic and Math Expert’s
   Struggle with Simple Word-based Counting Problems.
(2) Luo et al. (2025): Self-Training Large Language Models for Tool-Use
   Without Demonstrations.
(3) Zhang et al. (2025): Tomato, Tomahto, Tomate: Do Multilingual Language
   Models Understand Based on Subword-Level Semantic Concepts?
(4) Helm et al. (2025): Token Weighting for Long-Range Language Modeling.

Literature

Speech and Language Processing
Dan Jurafsky and James H. Martin (2024; 3rd ed. draft)


Paper Presentations

Notes on Presentation and References

Please select a paper from the list below by emailing me your top two choices. Alternatively, you may select another paper relevant to the seminar – in this case, please also contact me by email.

Contact: marion.dimarco –AT– tum.de


Presentation details: 20-minute presentation, followed by discussion

Written summary: 6 pages plus references. Please use the ACL template.

Summary deadline: 3 weeks after the presentation