The main differences between ID3 and CART, the algorithm that scikit-learn implements, are the following:
Type of learning:
- ID3: classification only; it has no regression variant
- CART: "Classification And Regression Trees" covers both classification trees and regression trees, grown with binary splits. In R's implementation, rpart(), you can specify the tree type through the "method" argument (for example "class"), but if you omit it, rpart infers the type from the dependent variable.
Loss functions used for split selection:
- ID3: splits on information gain (IG), which is the reduction in entropy between the parent node and the weighted sum of its child nodes
- CART: for classification, splits the data into the subsets that minimize the weighted Gini impurity
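
To make the two criteria concrete, here is a minimal sketch in plain Python that scores one candidate split both ways. The helper names (entropy, gini, impurity_decrease) and the toy labels are made up for illustration; this is not how scikit-learn or any ID3 implementation is written internally.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Toy data (hypothetical): 10 samples split into two children of 5.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

# ID3-style score: information gain, i.e. the decrease in entropy.
info_gain = impurity_decrease(parent, [left, right], entropy)

# CART-style score: decrease in Gini impurity. Since the parent impurity is
# fixed, maximizing this is the same as minimizing the weighted child Gini.
gini_decrease = impurity_decrease(parent, [left, right], gini)

print(f"information gain: {info_gain:.4f}")
print(f"Gini decrease:    {gini_decrease:.4f}")
```

In both cases the tree builder would compute this score for every candidate split and keep the best one; only the impurity function differs.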