What is multimodal interaction?
What are the knowledge sources (models) of speech recognition?
Name four sources of variability in human speech recognition.
What is the motivation for machine learning? Why are classic software engineering methods not useful for machine learning?
What is audio-visual speech recognition? What are its components?
What is feature extraction, and why is it important for machine learning?
What are typical features used in speech recognition?
What is a multilayer perceptron? How can it be trained?
What is the local minimum problem?
Name two modalities of speech recognition (other than speech) and explain them briefly.