What is multimodal interaction?

What are the knowledge sources (models) of speech recognition? Name four
kinds of variability in human speech recognition.

What is the motivation for machine learning? Why are classic software
engineering methods not suitable for the problems machine learning addresses?

What is audio-visual speech recognition? What are its components?

What is feature extraction and why is it important for machine learning? What
are typical features used in speech recognition?

What is a multilayer perceptron? How can it be trained? What is the local
minimum problem?

Name two modalities used in speech recognition (other than speech) and explain
them briefly.