CCAI9003 Artificial Intelligence
Course Description
Artificial Intelligence (AI) systems are rapidly improving in capability, and this advancement poses potentially catastrophic risks to humanity. This course is an interdisciplinary introduction to AI safety. Students will learn about (i) the present and future capabilities of AI systems; (ii) the catastrophic risks that AI systems may pose to humanity; (iii) AI alignment, which seeks to give AI systems goals that reflect human values; (iv) AI governance, which seeks to design optimal regulations and institutions for controlling AI systems; and (v) foundational ethical and philosophical challenges in designing safe AI. The course draws on perspectives from computer science, philosophy, political science, and complex systems theory. Through discussions, case studies, and policy briefs, students will critically analyse real-world scenarios where AI safety plays a pivotal role.
Course Learning Outcomes
On completing the course, students will be able to:
- Communicate effectively about the core catastrophic risks posed by AI systems.
- Understand core questions related to AI alignment.
- Explain key proposals in AI governance.
- Demonstrate competence with concepts from safety engineering and complex systems theory.
Offer Semester and Day of Teaching
First semester (Wed)
Study Load
| Activities | Number of hours |
| --- | --- |
| Lectures | 24 |
| Tutorials | 12 |
| Reading / Self-study | 60 |
| Assessment: Essay / Report writing | 24 |
| Assessment: Presentation (incl. preparation) | 12 |
| Total | 132 |
Assessment: 100% coursework
| Assessment Tasks | Weighting (%) |
| --- | --- |
| Case analysis | 20 |
| Issue papers | 20 |
| Design proposal | 20 |
| Reflective journal | 10 |
| Group presentation | 20 |
| Tutorial discussion | 10 |
Required Reading / Viewing
Overview of Catastrophic AI Risks
- Hendrycks, D., et al. (2023). An Overview of Catastrophic AI Risks. arXiv. From https://arxiv.org/abs/2306.12001
AI Fundamentals
- Pullen, J. (2017). But what is a neural network? From https://www.3blue1brown.com/lessons/neural-networks
- Karpathy, A. (2023). Intro to Large Language Models. From https://www.youtube.com/watch?v=zjkBMFhNj_g
- Epoch AI. (2024). Key Trends and Figures in Machine Learning. From https://epochai.org/trends
Single Agent Safety
- Carlsmith, J. (2022). Is Power-Seeking AI an Existential Risk? arXiv. From https://arxiv.org/abs/2206.13353
- John, Y. J., Caldwell, L., McCoy, D. E., & Braganza, O. (2023). Dead rats, dopamine, performance metrics, and peacock tails: Proxy failure is an inherent risk in goal-oriented systems. Behavioral and Brain Sciences, 47, e67. From https://doi.org/10.1017/S0140525X23002753
Safety Engineering and Complex Systems
- Hendrycks, D. (2025). AI Safety, Ethics, and Society. [Chap. 5]
Beneficial AI and Machine Ethics
- Hendrycks, D., et al. (2022). What Would Jiminy Cricket Do? Towards Agents That Behave Morally. arXiv. From https://arxiv.org/abs/2110.13136
Collective Action Problems
- Fearon, J. D. (1995). Rationalist Explanations for War. International Organization, 49(3), 379-414.
Governance
- Erdil, E. & Besiroglu, T. (2023). Explosive growth from AI automation: A review of the arguments. arXiv. From https://arxiv.org/abs/2309.11690
- Shavit, Y. (2023). What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring. arXiv. From https://arxiv.org/abs/2303.11341
Course Co-ordinator and Teacher(s)
| Course Co-ordinator | Contact |
| --- | --- |
| Professor S.D. Goldstein, School of Humanities (Philosophy), Faculty of Arts | Email: sgold@hku.hk |

| Teacher(s) | Contact |
| --- | --- |
| Professor S.D. Goldstein, School of Humanities (Philosophy), Faculty of Arts | Email: sgold@hku.hk |