Summer Semester 2025
Bachelor's Seminar: Large Language Models
Large language models (such as GPT-2, GPT-3, GPT-4, Llama, T5) and intelligent chatbots (such as ChatGPT, Claude, Gemini, and Copilot) are a very timely topic.
Contents: N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models
Instructors: Prof. Alexander Fraser, Marion Di Marco
Location: Room D.2.11
Time: Tuesday 16:15 – 17:45
Lectures
Lecture 1 | 29.04.2025 | Organization and Introduction to Linguistic Concepts
Lecture 2 | 06.05.2025 | N-gram Models (without section 3.7), in Dan Jurafsky and James H. Martin (2025): Speech and Language Processing. Slides
Lecture 3 | 13.05.2025 | Bengio et al. (2003): A Neural Probabilistic Language Model. (Journal of Machine Learning Research 3, 1137-1155)
Lecture 4 | 20.05.2025 | Talk by Prof. Victoria Nash: AI and the Evolution of Digital Childhood. May 20, 2025, 16:15-18:00 in Room D.0.0.1
Lecture 5 | 27.05.2025 | Smith (2019): Contextual Word Representations: A Contextual Introduction. (arXiv)
Lecture 6 | 03.06.2025 | Lena Voita. NLP Course: Neural Language Models and Sequence to Sequence and Attention (Web Tutorial)
10.06.2025 | Whitsun Vacation – no lecture
Lecture 7 | 17.06.2025 | Vaswani et al. (2017): Attention Is All You Need (NIPS). Lecture by Dr. Lukas Edman. Slides
Lecture 8 | 24.06.2025 | Devlin et al. (2019): BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL-HLT)
Lecture 9 | 01.07.2025 | Ouyang et al. (2022): Training language models to follow instructions with human feedback. (arXiv) Slides
Lecture 10 | 08.07.2025 | Paper Presentations
(1) Kang et al. (2025): Unfamiliar Finetuning Examples Control How Language Models Hallucinate.
(2) Hou et al. (2023): Effects of sub-word segmentation on performance of transformer language models.
Lecture 11 | 15.07.2025 | Paper Presentations
(1) Hu et al. (2025): Fine-Tuning Large Language Models with Sequential Instructions.
(2) Mondshine et al. (2025): Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs.
(3) Liu et al. (2025): Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models.
(4) Bang et al. (2024): Measuring Political Bias in Large Language Models: What Is Said and How It Is Said.
Lecture 12 | 22.07.2025 | Paper Presentations
(1) Xu et al. (2025): LLM The Genius Paradox: A Linguistic and Math Expert’s Struggle with Simple Word-based Counting Problems.
(2) Luo et al. (2025): Self-Training Large Language Models for Tool-Use Without Demonstrations.
(3) Zhang et al. (2025): Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts?
(4) Helm et al. (2025): Token Weighting for Long-Range Language Modeling.
Literature
Speech and Language Processing
Dan Jurafsky and James H. Martin (2024; 3rd ed. draft)
Paper Presentations
Notes on Presentation and References
Please select a paper from the list below and email me your top two choices. Alternatively, you can propose another paper that is relevant to the seminar; in this case, please also contact me by email.
Contact: marion.dimarco –AT– tum.de
Gonen et al. (2025): Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models. NAACL 2025.
Ide et al. (2025): How to Make the Most of LLMs’ Grammatical Knowledge for Acceptability Judgments. NAACL 2025.
Wahle et al. (2024): Paraphrase Types Elicit Prompt Engineering Capabilities. EMNLP 2024.
Liu et al. (2025): Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models. NAACL 2025.
Presentation details: 15-minute presentation + discussion time
Written summary: 4 pages + references. Please use the ACL template.
Summary deadline: 3 weeks after the presentation