Machine Learning

WIP

Decision Trees

Objective

Recursively split the dataset until the tree is left with pure leaf nodes, i.e. nodes that contain only one class.

Decision trees are greedy (each split is chosen locally, without backtracking) and supervised.

ID3 Algorithm

  1. Calculate the entropy of the current dataset.
  2. Split the data on the attribute that yields the highest information gain.
  3. Create a decision node for that attribute, with one branch per value.
  4. Repeat recursively on each branch until the nodes are pure or no attributes remain.

There are two types of nodes: decision nodes and leaf nodes.

Note

When a branch arrives at a non-pure leaf node (likely once the data becomes complex), classify based on the majority class in that node.
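The steps above, plus the majority-vote fallback, can be sketched as a small recursive function. This is a minimal illustration, not a full ID3 implementation; the toy attributes (`outlook`, `windy`) and labels are made up for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum over classes of p_i * log2(p_i)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def id3(rows, labels, attrs):
    # Pure leaf: every example has the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Non-pure leaf with no attributes left: majority-vote fallback.
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the attribute whose split reduces entropy the most.
    def gain(attr):
        subsets = {}
        for row, label in zip(rows, labels):
            subsets.setdefault(row[attr], []).append(label)
        return entropy(labels) - sum(
            len(s) / len(labels) * entropy(s) for s in subsets.values())
    best = max(attrs, key=gain)
    # Decision node: one branch per observed value of the chosen attribute.
    node = {}
    for value in {row[best] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node[value] = id3([rows[i] for i in idx], [labels[i] for i in idx],
                          [a for a in attrs if a != best])
    return {best: node}

# Hypothetical toy data.
rows = [{"outlook": "sunny", "windy": "no"},
        {"outlook": "sunny", "windy": "yes"},
        {"outlook": "rainy", "windy": "no"},
        {"outlook": "rainy", "windy": "yes"}]
labels = ["play", "play", "play", "stay"]
tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
```

The returned tree is a nested dict: decision nodes map an attribute to its branches, and leaves are class labels.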

Entropy

Measures the amount of uncertainty or impurity in the dataset.

  H(S) = -Σᵢ pᵢ log₂(pᵢ)

  • pᵢ - probability of class i in the dataset S
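The entropy formula is a one-liner over the class counts. A minimal sketch, using only the standard library:

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum over classes of p_i * log2(p_i), in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

# A pure node has zero entropy; a 50/50 split has the maximum, 1 bit.
print(entropy(["yes", "no", "yes", "no"]))  # 1.0
```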

Information Gain

Measures the reduction in entropy or Gini impurity after a dataset is split on an attribute.

  IG(S, A) = H(S) - Σᵥ (|Sᵥ| / |S|) · H(Sᵥ)

  • Sᵥ - subset of S where attribute A takes value v
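Information gain is the parent's entropy minus the size-weighted entropy of the child subsets. A small sketch, with a made-up `windy` attribute where the split perfectly separates the classes:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """IG(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "windy" splits the labels perfectly,
# so the gain equals the full parent entropy of 1 bit.
rows = [{"windy": "yes"}, {"windy": "yes"}, {"windy": "no"}, {"windy": "no"}]
labels = ["stay", "stay", "play", "play"]
print(information_gain(rows, labels, "windy"))  # 1.0
```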

Gini Impurity Index

  • Todo