
Bag Of Words In NLP






What Is A Bag Of Words?







Sparse Matrix
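
To give a feel for why Bag of Words matrices are described as sparse, the sketch below counts the zero entries in a small count matrix and then stores only the non-zero ones. The matrix values and the `to_sparse_rows` helper are assumptions made up for illustration; they do not come from any library.

```python
def to_sparse_rows(dense):
    """Keep only the non-zero entries of each row as {column_index: count}."""
    return [{j: v for j, v in enumerate(row) if v != 0} for row in dense]

# Toy count matrix: 3 documents x 8 vocabulary words (made-up values)
dense = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 2, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 1],
]

total = sum(len(row) for row in dense)             # number of cells in the matrix
zeros = sum(v == 0 for row in dense for v in row)  # cells that hold a zero
sparsity = zeros / total                           # fraction of zero cells

sparse_rows = to_sparse_rows(dense)

print(f"{zeros} of {total} entries are zero")
print(sparse_rows)
```

Each document uses only a handful of the words in the vocabulary, so most cells are zero, and the share of zeros typically grows as the vocabulary grows. This is why scikit-learn's `CountVectorizer.fit_transform` returns a SciPy sparse matrix, and why the example further down calls `X.toarray()` before printing.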







How Does A Bag Of Words Work?






Sentence 1: "I like NLP"
Sentence 2: "I like machine learning"










Vocabulary: [I, like, NLP, machine, learning]
"I like NLP" → [1, 1, 1, 0, 0]
"I like machine learning" → [1, 1, 0, 1, 1]

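The two vectors above can be reproduced with a few lines of plain Python. This is a minimal sketch rather than a reference implementation: the helper names `build_vocabulary` and `vectorize` are made up for illustration, tokenization is a simple whitespace split, and the vocabulary is kept in order of first appearance so that the columns line up with the vectors shown above.

```python
from collections import Counter

def build_vocabulary(sentences):
    """Collect unique tokens in order of first appearance (hypothetical helper)."""
    vocab = []
    seen = set()
    for sentence in sentences:
        for token in sentence.split():
            if token not in seen:
                seen.add(token)
                vocab.append(token)
    return vocab

def vectorize(sentence, vocab):
    """Count how often each vocabulary word occurs in the sentence."""
    counts = Counter(sentence.split())
    return [counts[token] for token in vocab]  # missing words count as 0

sentences = ["I like NLP", "I like machine learning"]
vocab = build_vocabulary(sentences)
vectors = [vectorize(s, vocab) for s in sentences]

print(vocab)
for sentence, vector in zip(sentences, vectors):
    print(sentence, "->", vector)
```

Library implementations usually make different choices. For instance, scikit-learn's `CountVectorizer` sorts its vocabulary alphabetically, lowercases text by default, and drops single-character tokens such as "I", so its columns would not match this hand-built ordering.
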


Example


from sklearn.feature_extraction.text import CountVectorizer

# Example corpus
documents = [
    "I like NLP",
    "I like machine learning",
    "NLP is fun and powerful"
]

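# Initialize CountVectorizer
# Note: the default tokenizer keeps only tokens of 2+ characters, so "I" is dropped
vectorizer = CountVectorizer(
    lowercase=True,          # convert text to lowercase before tokenizing
    stop_words='english'     # remove common English stop words such as "is" and "and"
)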

# Fit and transform documents
X = vectorizer.fit_transform(documents)

# Get vocabulary
vocab = vectorizer.get_feature_names_out()

print("Vocabulary:", vocab)

print("\nBag of Words Matrix:")
print(X.toarray())



Output


[Image: output of the Bag of Words example]



How Is Bag Of Words Different From Count Vectorization?





Bag Of Words And Semantic Meaning




  1. Which words appear?
  2. How many times do they appear?




I love this movie
I do not love this movie




Sentence                    I  love  this  movie  do  not
I love this movie           1  1     1     1      0   0
I do not love this movie    1  1     1     1      1   1
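
The counts above can be checked with a short sketch. It assumes whitespace tokenization and the same column order (I, love, this, movie, do, not); the `bow_vector` helper is hypothetical. It also shows that shuffling the words of a sentence leaves its vector unchanged, which is the sense in which Bag of Words ignores word order.

```python
from collections import Counter

# Column order assumed to match the counts shown above
VOCAB = ["I", "love", "this", "movie", "do", "not"]

def bow_vector(sentence, vocab=VOCAB):
    """Return the word counts of a sentence in vocabulary order."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocab]

positive = bow_vector("I love this movie")
negative = bow_vector("I do not love this movie")
shuffled = bow_vector("movie this love I")  # same words, different order

# Vocabulary words whose counts differ between the two sentences
differing = [w for w, a, b in zip(VOCAB, positive, negative) if a != b]

print(positive)
print(negative)
print("Differing columns:", differing)
print("Shuffled sentence has the same vector:", shuffled == positive)
```

The two sentences have opposite meanings, yet their vectors differ only in the "do" and "not" columns. Nothing in the representation records that "not" modifies "love".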





Different Words With The Same Meaning




The car is fast
The vehicle is quick
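
A small sketch makes the synonym problem concrete. It assumes lowercase whitespace tokenization; the `bow_counts` and `cosine_similarity` helpers are written by hand for illustration, and the set of function words is chosen only for this example.

```python
import math
from collections import Counter

def bow_counts(sentence):
    """Lowercase, split on whitespace, and count each word."""
    return Counter(sentence.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two word-count mappings."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

s1 = bow_counts("The car is fast")
s2 = bow_counts("The vehicle is quick")

shared = sorted(s1.keys() & s2.keys())  # words present in both sentences
similarity = cosine_similarity(s1, s2)

# Compare content words only by dropping the two function words
function_words = {"the", "is"}
c1 = Counter({w: n for w, n in s1.items() if w not in function_words})
c2 = Counter({w: n for w, n in s2.items() if w not in function_words})
content_similarity = cosine_similarity(c1, c2)

print("Shared words:", shared)
print("Similarity with all words:", similarity)
print("Similarity with content words only:", content_similarity)
```

The only overlap between the two sentences comes from "the" and "is". Once those are removed, the vectors share nothing, even though a reader would say the sentences mean nearly the same thing: Bag of Words treats "car" and "vehicle", or "fast" and "quick", as unrelated columns.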





Posted By  -  Karan Gupta
 
Posted On  -  Thursday, September 11, 2025
 
Updated On  -  Tuesday, February 10, 2026
