This is the course material for CSC 83060: Speech and Audio Understanding at the CUNY Graduate Center, as taught by Michael Mandel in Fall 2016.

See the course announcements below.

Topics (syllabus)

Note that this schedule might change, so check back frequently!

Date Topic Assignments Readings
2016/08/26 Introduction
Part I: Fundamentals
2016/09/02 Digital signal processing
2016/09/09 [No class: Interspeech]
2016/09/16 Acoustics
2016/09/23 Auditory perception
2016/09/30 Machine learning and neural networks
  • Review: Deng and Yu, 2014, Deep learning (Chapter 7 only: applications to speech and audio)
  • Zhaoheng: Lee et al., 2009, Unsupervised feature learning for audio classification
2016/10/07 Project proposal presentations
2016/10/14 [No class: Tuesday schedule]
Part II: Core machine listening topics
2016/10/21 Speech models and speech synthesis
2016/10/28 Speech recognition front ends (features, acoustic modeling, noise robustness)
2016/11/04 Speech recognition back ends (language modeling, search, finite state transducers)
2016/11/11 Music analysis and modeling
2016/11/18 Source separation and spatial sound
2016/11/25 [No class: Thanksgiving]
2016/12/02 Environmental sound analysis
2016/12/09 Final project presentations. Assignment: Final project assignment
2016/12/16 Final papers due (no class). Assignment: Final project assignment

Recommended textbooks

Announcements

2016/11/03
See this article for instructions on viewing the detailed feedback I provide on your assignments in Blackboard.
2016/08/26
Welcome to class. The course website has been updated again with a tentative schedule, but the readings list is still incomplete.
2016/08/04
Welcome to class. The course website has been updated for this semester.