
Information Gain In Machine Learning






Purpose

Information gain (IG) measures the reduction in entropy achieved by splitting a dataset on a particular attribute. Decision-tree algorithms such as ID3 use it to decide which attribute to split on at each node.

Significance

The attribute with the highest information gain produces the purest child subsets, so splitting on it first tends to yield shorter, more accurate decision trees. An information gain of 0 means the split tells us nothing new about the class label.

Formula


IG(S, A) = Entropy(S) − Σ over v ∈ Values(A) of ( |Sv| / |S| ) × Entropy(Sv)

where Entropy(S) = −Σ p_i log2(p_i), summed over the proportion p_i of each class in S.


  1. S = the original dataset (before the split)
  2. A = the attribute for which IG is being calculated
  3. v ∈ Values(A) = each possible value of attribute A
  4. Sv = the subset of S where attribute A has value v
  5. |Sv| = number of samples in subset Sv
  6. |S| = total number of samples
  7. Entropy(S) = measure of impurity in dataset S
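The formula and the terms above can be sketched directly in Python (the function and variable names here are illustrative, not from the article):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum over classes of p * log2(p)."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """IG(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    n = len(labels)
    weighted = 0.0
    for v in set(row[attr] for row in rows):  # v in Values(A)
        sv = [lab for row, lab in zip(rows, labels) if row[attr] == v]
        weighted += len(sv) / n * entropy(sv)  # |Sv|/|S| * Entropy(Sv)
    return entropy(labels) - weighted
```

The worked example in the next sections applies exactly this computation to the Outlook attribute.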

Sample Dataset




ID  Outlook   Temperature  Humidity  Windy  Play Tennis
1   Sunny     Hot          High      False  No
2   Sunny     Hot          High      True   No
3   Overcast  Hot          High      False  Yes
4   Rain      Mild         High      False  No
5   Rain      Cool         Normal    False  No
6   Rain      Cool         Normal    True   No
7   Overcast  Cool         Normal    True   Yes
8   Sunny     Mild         High      False  Yes
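For readers who want to follow along in code, the sample dataset can be encoded as a list of dicts (a sketch; the keys simply follow the column headers):

```python
# The sample dataset from the table above, one dict per row.
dataset = [
    {"ID": 1, "Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Windy": False, "Play Tennis": "No"},
    {"ID": 2, "Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Windy": True,  "Play Tennis": "No"},
    {"ID": 3, "Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "High",   "Windy": False, "Play Tennis": "Yes"},
    {"ID": 4, "Outlook": "Rain",     "Temperature": "Mild", "Humidity": "High",   "Windy": False, "Play Tennis": "No"},
    {"ID": 5, "Outlook": "Rain",     "Temperature": "Cool", "Humidity": "Normal", "Windy": False, "Play Tennis": "No"},
    {"ID": 6, "Outlook": "Rain",     "Temperature": "Cool", "Humidity": "Normal", "Windy": True,  "Play Tennis": "No"},
    {"ID": 7, "Outlook": "Overcast", "Temperature": "Cool", "Humidity": "Normal", "Windy": True,  "Play Tennis": "Yes"},
    {"ID": 8, "Outlook": "Sunny",    "Temperature": "Mild", "Humidity": "High",   "Windy": False, "Play Tennis": "Yes"},
]
```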



Entropy Of Full Dataset




The dataset has 3 "Yes" and 5 "No" labels for Play Tennis, so p(Yes) = 3/8 and p(No) = 5/8.

Entropy(S) = −(3/8) log2(3/8) − (5/8) log2(5/8) ≈ 0.5306 + 0.4238 ≈ 0.9544
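The same calculation in Python, as a quick check of the numbers:

```python
from math import log2

# 3 "Yes" and 5 "No" out of 8 rows
p_yes, p_no = 3 / 8, 5 / 8
entropy_s = -(p_yes * log2(p_yes) + p_no * log2(p_no))
print(round(entropy_s, 4))  # 0.9544
```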



Entropy Of Sunny




The Sunny subset (rows 1, 2, and 8) has 1 "Yes" and 2 "No", so p(Yes) = 1/3 and p(No) = 2/3.

Entropy(Sunny) = −(1/3) log2(1/3) − (2/3) log2(2/3) ≈ 0.9183
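Checking the Sunny entropy in Python:

```python
from math import log2

# Sunny subset: 1 "Yes", 2 "No"
p_yes, p_no = 1 / 3, 2 / 3
entropy_sunny = -(p_yes * log2(p_yes) + p_no * log2(p_no))
print(round(entropy_sunny, 4))  # 0.9183
```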



Entropy Of Overcast

The Overcast subset (rows 3 and 7) has 2 "Yes" and 0 "No" — a pure subset.

Entropy(Overcast) = −(2/2) log2(2/2) = 0







Entropy Of Rain

The Rain subset (rows 4, 5, and 6) is all "No" — also pure.

Entropy(Rain) = 0






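A pure subset always has entropy 0, because its single class has p = 1 and log2(1) = 0. A quick check of both pure subsets (the helper name is illustrative):

```python
from math import log2

def subset_entropy(counts):
    """Entropy from class counts; skips empty classes, since 0*log2(0) -> 0."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

print(subset_entropy([2, 0]))  # Overcast: 2 Yes, 0 No -> 0.0
print(subset_entropy([0, 3]))  # Rain: 0 Yes, 3 No -> 0.0
```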

Weighted Average Entropy For Outlook


Each subset's entropy is weighted by its share of the 8 rows:

Weighted Entropy(Outlook) = (3/8) × 0.9183 + (2/8) × 0 + (3/8) × 0 ≈ 0.3444
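The weighted average in Python:

```python
# Weights are the subset sizes (Sunny: 3, Overcast: 2, Rain: 3) over 8 rows.
weighted = (3 / 8) * 0.9183 + (2 / 8) * 0.0 + (3 / 8) * 0.0
print(round(weighted, 4))  # 0.3444
```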



Information Gain (Outlook)


IG(Outlook) = Entropy(S) − Weighted Entropy(Outlook) ≈ 0.9544 − 0.3444 ≈ 0.6101
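Putting it all together, a short end-to-end sketch that recomputes IG(Outlook) from the raw columns (variable names are illustrative):

```python
from collections import Counter
from math import log2

# Outlook and Play Tennis columns from the sample dataset
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast", "Sunny"]
play = ["No", "No", "Yes", "No", "No", "No", "Yes", "Yes"]

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

# Weighted entropy: |Sv|/|S| * Entropy(Sv) for each Outlook value v
weighted = sum(
    len([p for o, p in zip(outlook, play) if o == v]) / len(play)
    * entropy([p for o, p in zip(outlook, play) if o == v])
    for v in set(outlook)
)
ig_outlook = entropy(play) - weighted
print(round(ig_outlook, 4))  # 0.6101
```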





Posted By  -  Karan Gupta
 
Posted On  -  Tuesday, September 2, 2025
