2. (10 points) Please circle all that apply. There are deductions for wrong answers; the denoted points will be taken away for each question.

(a) (1 point) (T/F) Q-learning is a model-free algorithm, which does not explicitly learn the transition function T(s, a, s') or the reward function R(s, a, s').

(b) (2 points) (T/F) Since the posterior probability takes into account the prior probability, Maximum A Posteriori (MAP) is always more accurate than Maximum Likelihood Estimation (MLE).

(c) (2 points) (T/F) When concerned only with the most likely class under MAP, without its actual posterior probability, the prediction can be computed more efficiently as y_MAP = arg max_y P(Y|X) = arg max_y P(X|Y)P(Y), without considering P(X).

(d) (1 point) (T/F) If a machine learning model is too simple, then it has a high bias and suffers from overfitting.

(e) (1 point) (T/F) When a Decision Tree has a huge test error even though the training error is small, pruning or early stopping can be used to resolve the overfitting problem.

(f) (1 point) (T/F) The standard Decision Tree algorithm based on information gain finds the optimal tree, i.e., the smallest (simplest) decision tree, in polynomial time.

(g) (1 point) (T/F) If two hypotheses are consistent with the data, then the shorter (simpler) hypothesis is preferred, and this principle is called Occam's Razor.

(h) (1 point) (T/F) An identical concept (function) can be represented by multiple Decision Trees of different sizes.
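For reference, the decision rule in (c) can be sketched in Python. The class names, priors, and likelihoods below are hypothetical illustrative numbers; the point is that P(X) is a constant across classes, so dropping it leaves the arg max unchanged.

```python
# Sketch of the MAP decision rule from (c): predict the class y that
# maximizes P(X|Y=y) * P(Y=y). The evidence P(X) is identical for every
# class, so omitting it does not change which class attains the maximum.
# The priors and likelihoods here are made-up numbers for illustration.

priors = {"spam": 0.3, "ham": 0.7}          # P(Y=y)
likelihoods = {"spam": 0.05, "ham": 0.01}   # P(X|Y=y) for one observed X

def map_predict(priors, likelihoods):
    # arg max_y P(X|Y=y) * P(Y=y); the normalizer P(X) is omitted.
    return max(priors, key=lambda y: likelihoods[y] * priors[y])

print(map_predict(priors, likelihoods))  # spam: 0.015 > ham: 0.007
```

Normalizing by P(X) would rescale both scores by the same factor, which is why the question's shortcut is computationally attractive.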