Marion Di Marco (née Weller)

Post-Doc Researcher at TUM (School of Computation, Information and Technology)

Generative Models on Text

Summer Semester 2025

Master's Seminar: Generative Models on Text

Large language models (such as GPT-2, GPT-3, GPT-4, Llama, and T5) and the intelligent chatbots built on them (such as ChatGPT, Claude, Gemini, and Copilot) are a highly timely topic.

Contents: N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models

Instructors: Prof. Alexander Fraser, Marion Di Marco


Location: Room D.2.11

Time: Tuesday 16:15 – 17:45


Lectures

Lecture 1 29. 04. 2025 Organization and Introduction to Linguistic Concepts
Lecture 2 06. 05. 2025 N-gram Models (without section 3.7)
(Dan Jurafsky and James H. Martin (2025). Speech and Language Processing)
Slides
Lecture 3 13. 05. 2025 Bengio et al. (2003): A Neural Probabilistic Language Model.
(Journal of Machine Learning Research 3, 1137-1155)
Lecture 4 20. 05. 2025 Talk by Prof. Victoria Nash: AI and the Evolution of Digital Childhood
(16:15 – 18:00, Room D.0.0.1)
Lecture 5 27. 05. 2025 Smith (2019): Contextual Word Representations: A Contextual Introduction.
(arXiv)
Lecture 6 03. 06. 2025 Lena Voita. NLP Course: Neural Language Models and
Sequence to Sequence and Attention (Web Tutorial)
10. 06. 2025 Whitsun Vacation – no lecture
Lecture 7 17. 06. 2025 Vaswani et al. (2017): Attention Is All You Need (NIPS)
Guest lecture by Dr. Lukas Edman (slides)
Lecture 8 24. 06. 2025 Devlin et al. (2019): BERT: Pre-training of Deep Bidirectional Transformers
Lecture 9 01. 07. 2025 Ouyang et al. (2022): Training language models to follow instructions
with human feedback. (arXiv) slides
Lecture 10 08. 07. 2025 Paper Presentations
(1) Kang et al. (2025): Unfamiliar Finetuning Examples Control How
   Language Models Hallucinate.
(2) Hou et al. (2023): Effects of sub-word segmentation on performance
   of transformer language models.
Lecture 11 15. 07. 2025 Paper Presentations
(1) Hu et al. (2025): Fine-Tuning Large Language Models with Sequential
   Instructions.
(2) Mondshine et al. (2025): Beyond English: The Impact of Prompt Translation
   Strategies across Languages and Tasks in Multilingual LLMs.
(3) Liu et al. (2025): Is Translation All You Need? A Study on Solving
   Multilingual Tasks with Large Language Models.
(4) Bang et al. (2024): Measuring Political Bias in Large Language Models:
   What Is Said and How It Is Said.
Lecture 12 22. 07. 2025 Paper Presentations
(1) Xu et al. (2025): LLM The Genius Paradox: A Linguistic and Math Expert’s
   Struggle with Simple Word-based Counting Problems.
(2) Luo et al. (2025): Self-Training Large Language Models for Tool-Use
   Without Demonstrations.
(3) Zhang et al. (2025): Tomato, Tomahto, Tomate: Do Multilingual Language
   Models Understand Based on Subword-Level Semantic Concepts?
(4) Helm et al. (2025): Token Weighting for Long-Range Language Modeling.

Literature

Speech and Language Processing
Dan Jurafsky and James H. Martin (2024; 3rd ed. draft)


Paper Presentations

Notes on Presentation and References

Please select a paper from the list below by emailing me your top two choices. Alternatively, you may select another paper relevant to the seminar – in this case, please also contact me by email.

Contact: marion.dimarco –AT– tum.de


Presentation details: 20-minute presentation, followed by discussion

Written summary: 6 pages plus references. Please use the ACL template.

Summary deadline: 3 weeks after the presentation