Bag Of Words In NLP
What Is A Bag Of Words?
Sparse Matrix
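A bag-of-words matrix has one column per vocabulary word, and each document uses only a few of those words, so most entries are zero. The sketch below illustrates that idea in plain Python; it is not a library implementation. It assumes naive lowercase whitespace tokenization, stores only the nonzero counts for each document (the idea behind a sparse matrix), and reports what fraction of the equivalent dense matrix would be zero. The helper name `sparse_rows` is hypothetical.

```python
from collections import Counter

documents = [
    "I like NLP",
    "I like machine learning",
    "NLP is fun and powerful",
]


def sparse_rows(documents):
    """Return one {word: count} dict per document, holding nonzero counts only."""
    return [dict(Counter(doc.lower().split())) for doc in documents]


rows = sparse_rows(documents)

# The dense matrix would need one column for every distinct word in the corpus.
vocabulary = sorted({word for row in rows for word in row})

total_cells = len(rows) * len(vocabulary)
nonzero_cells = sum(len(row) for row in rows)
zero_fraction = 1 - nonzero_cells / total_cells

print("Vocabulary size:", len(vocabulary))
print("Stored (nonzero) entries:", nonzero_cells, "of", total_cells)
print("Fraction of zeros in the dense matrix:", round(zero_fraction, 2))
```

Even with three short documents, more than half of the dense matrix is zero; the proportion grows quickly as the vocabulary gets larger.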
How Does A Bag Of Words Work?
Sentence 1: "I like NLP"
Sentence 2: "I like machine learning"
"I like NLP" → [1, 1, 1, 0, 0]
"I like machine learning" → [1, 1, 0, 1, 1]
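The two vectors above can be reproduced with a few lines of plain Python. This is a minimal sketch rather than a definitive implementation: the helper names `build_vocabulary` and `vectorize` are hypothetical, tokenization is a naive whitespace split, and the vocabulary is kept in first-seen order so that the columns read I, like, NLP, machine, learning.

```python
def build_vocabulary(sentences):
    """Collect unique tokens in the order they first appear."""
    vocabulary = []
    for sentence in sentences:
        for token in sentence.split():
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary


def vectorize(sentence, vocabulary):
    """Count how often each vocabulary word occurs in the sentence."""
    tokens = sentence.split()
    return [tokens.count(word) for word in vocabulary]


sentences = ["I like NLP", "I like machine learning"]
vocabulary = build_vocabulary(sentences)
vectors = [vectorize(sentence, vocabulary) for sentence in sentences]

print(vocabulary)
for sentence, vector in zip(sentences, vectors):
    print(sentence, "->", vector)
```

Each position in a vector corresponds to one vocabulary word, so the vectors have the same length regardless of how long the sentences are.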
Example
from sklearn.feature_extraction.text import CountVectorizer
# Example corpus
documents = [
    "I like NLP",
    "I like machine learning",
    "NLP is fun and powerful"
]
# Initialize CountVectorizer
vectorizer = CountVectorizer(
    lowercase=True,
    stop_words='english'  # remove English stopwords
)
# Fit and transform documents
X = vectorizer.fit_transform(documents)
# Get vocabulary
vocab = vectorizer.get_feature_names_out()
print("Vocabulary:", vocab)
print("\nBag of Words Matrix:")
print(X.toarray())
Output
How Is Bag Of Words Different From Count Vectorization?
Bag Of Words And Semantic Meaning
- Which words appear?
- How many times do they appear?
I love this movie
I do not love this movie
| Sentence | I | love | this | movie | do | not |
|---|---|---|---|---|---|---|
| I love this movie | 1 | 1 | 1 | 1 | 0 | 0 |
| I do not love this movie | 1 | 1 | 1 | 1 | 1 | 1 |
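The rows of the table can be recomputed with a short sketch. It assumes whitespace tokenization and the same column order as the table; the helper name `count_vector` is hypothetical. The comparison at the end shows that the two sentences differ only in the counts for "do" and "not", and nothing in either vector records that "not" applies to "love".

```python
from collections import Counter

positive = "I love this movie"
negative = "I do not love this movie"

# Column order taken from the table above.
columns = ["I", "love", "this", "movie", "do", "not"]


def count_vector(sentence, columns):
    """Return the word counts of a sentence in the given column order."""
    counts = Counter(sentence.split())
    return [counts[word] for word in columns]


positive_vector = count_vector(positive, columns)
negative_vector = count_vector(negative, columns)

# Words whose counts differ between the two sentences.
differing = [
    word
    for word, a, b in zip(columns, positive_vector, negative_vector)
    if a != b
]

print(positive_vector)
print(negative_vector)
print("Columns that differ:", differing)
```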
Different Words With The Same Meaning
The car is fast
The vehicle is quick
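One way to see the problem is to compare the count vectors of the two sentences. The sketch below uses cosine similarity as the comparison measure, which is an illustrative choice and not something the article prescribes; tokenization is a lowercase whitespace split, and the helper names `bag` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def bag(sentence):
    """Lowercase, split on whitespace, and count the tokens."""
    return Counter(sentence.lower().split())


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two word-count mappings."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b)


first = bag("The car is fast")
second = bag("The vehicle is quick")

overlap = sorted(set(first) & set(second))
similarity = cosine_similarity(first, second)

print("Shared words:", overlap)
print("Cosine similarity:", similarity)
```

The only overlap comes from "the" and "is". The words that carry the meaning (car/vehicle, fast/quick) occupy separate columns, so the representation treats them as unrelated.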
Posted By: Karan Gupta
Posted On: Thursday, September 11, 2025
Updated On: Tuesday, February 10, 2026