Large Language Models

Summer Semester 2025

Bachelor's Seminar: Large Language Models

Large Language Models (such as GPT2, GPT3, GPT4, Llama, T5) and Intelligent Chatbots (such as ChatGPT, Claude, Gemini and Copilot) are a very timely topic.

Contents: N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models

Instructors: Prof. Alexander Fraser, Marion Di Marco

Location: Room D.2.11

Time: Tuesday 16:15 – 17:45

Lectures


Lecture 1	29. 04. 2025	Organization and Introduction to Linguistic Concepts

Lecture 2	06. 05. 2025	N-gram Models (without section 3.7)
		(Dan Jurafsky and James H. Martin (2025). Speech and Language Processing)
		Slides

Lecture 3	13. 05. 2025	Bengio et al. (2003): A Neural Probabilistic Language Model.
		(Journal of Machine Learning Research 3, 1137-1155)

Lecture 4	20. 05. 2025	Talk by Prof. Victoria Nash: AI and the Evolution of Digital Childhood
		May 20, 2025 16:15-18:00 in Room D.0.0.1

Lecture 5	27. 05. 2025	Smith (2019): Contextual Word Representations: A Contextual Introduction.
		(arXiv)

Lecture 6	03. 06. 2025	Lena Voita. NLP Course: Neural Language Models and
		Sequence to Sequence and Attention (Web Tutorial)

	10. 06. 2025	Whitsun Vacation – no lecture

Lecture 7	17. 06. 2025	Vaswani et al. (2017): Attention Is All You Need (NIPS)
		Lecture by Dr. Lukas Edman slides

Lecture 8	24. 06. 2025	Devlin et al. (2019): BERT: Pre-training of Deep Bidirectional Transformers
		for Language Understanding (NAACL-HLT)

Lecture 9	01. 07. 2025	Ouyang et al. (2022): Training language models to follow instructions
		with human feedback. (arXiv) slides

Lecture 10	08. 07. 2025	Paper Presentations
		(1) Kang et al. (2025): Unfamiliar Finetuning Examples Control How
		Language Models Hallucinate.
		(2) Hou et al. (2023): Effects of sub-word segmentation on performance
		of transformer language models.

Lecture 11	15. 07. 2025	Paper Presentations
		(1) Hu et al. (2025): Fine-Tuning Large Language Models with Sequential
		Instructions.
		(2) Mondshine et al. (2025): Beyond English: The Impact of Prompt Translation
		Strategies across Languages and Tasks in Multilingual LLMs.
		(3) Liu et al. (2025): Is Translation All You Need? A Study on Solving
		Multilingual Tasks with Large Language Models
		(4) Bang et al. (2024): Measuring Political Bias in Large Language Models:
		What Is Said and How It Is Said

Lecture 12	22. 07. 2025	Paper Presentations
		(1) Xu et al. (2025): LLM The Genius Paradox: A Linguistic and Math Expert’s
		Struggle with Simple Word-based Counting Problems.
		(2) Luo et al. (2025): Self-Training Large Language Models for Tool-Use
		Without Demonstrations.
		(3) Zhang et al. (2025): Tomato, Tomahto, Tomate: Do Multilingual Language
		Models Understand Based on Subword-Level Semantic Concepts?
		(4) Helm et al. (2025): Token Weighting for Long-Range Language Modeling.

Literature

Speech and Language Processing
Dan Jurafsky and James H. Martin (2024; 3rd ed. draft)

Paper Presentations

Notes on Presentation and References

Please select a paper from the list below by emailing me your top two choices. Alternatively, you can propose another paper that is relevant to the seminar – in this case, please also contact me by email.

Contact: marion.dimarco –AT– tum.de

Gonen et al. (2025): Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models. NAACL 2025.
Ide et al. (2025): How to Make the Most of LLMs’ Grammatical Knowledge for Acceptability Judgments. NAACL 2025.
Wahle et al. (2024): Paraphrase Types Elicit Prompt Engineering Capabilities. EMNLP 2024.
~~Liu et al. (2025): Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models. NAACL 2025.~~

Presentation details: 15-minute presentation + discussion time

Written summary: 4 pages + references. Please use the ACL template

Summary deadline: 3 weeks after the presentation

Marion Di Marco (née Weller)

Post-Doc Researcher at TUM (School of Computation, Information and Technology)
