
Tokenization In NLP






What Is Tokenization?




[Figure: how tokenization works in NLP]



Tokenization Types




Tokenization Type | Purpose
Word tokenizer | Splits a piece of text into individual words.
Sentence tokenizer | Splits text into sentences.
Blank line tokenizer | Splits text into paragraphs based on blank lines.
Regexp tokenizer | Splits text into tokens using regular expression (regex) patterns.
Sub-word tokenizer | Splits text into units that are larger than characters but smaller than words.
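
The five tokenizer types in the table can be illustrated with a minimal Python sketch that uses only the standard `re` module. The function names (`split_words`, `split_sentences`, `split_paragraphs`, `split_by_pattern`, `split_subwords`) and the tiny sub-word vocabulary are hypothetical and chosen for illustration; they are not taken from any library. The sentence splitter is deliberately naive (it will break on abbreviations such as "Dr."), and the sub-word splitter is a simplified greedy longest-match scheme, loosely in the spirit of WordPiece but without continuation markers.

```python
import re


def split_words(text):
    """Word tokenizer: runs of word characters, plus each punctuation mark on its own."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_sentences(text):
    """Sentence tokenizer (naive): split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_paragraphs(text):
    """Blank line tokenizer: split wherever one or more blank lines occur."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_by_pattern(text, pattern):
    """Regexp tokenizer: every non-overlapping match of `pattern` is a token."""
    return re.findall(pattern, text)


def split_subwords(word, vocab):
    """Sub-word tokenizer (simplified): greedy longest-match against `vocab`.

    Returns ["[UNK]"] if some part of the word cannot be matched at all.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is found in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:  # no piece starting at `start` is in the vocabulary
            return ["[UNK]"]
        pieces.append(word[start:end])
        start = end
    return pieces


if __name__ == "__main__":
    print(split_words("Hello, world!"))
    print(split_sentences("Tokenization is fun. It splits text!"))
    print(split_paragraphs("Para one.\n\nPara two."))
    # Hypothetical pattern: dollar amounts such as $12.50
    print(split_by_pattern("Price: $12.50 and $3", r"\$\d+(?:\.\d+)?"))
    # Hypothetical toy vocabulary
    print(split_subwords("tokenization", {"token", "ization"}))
```

In practice a library tokenizer is usually preferable to hand-written patterns, since it handles abbreviations, contractions, and Unicode edge cases that this sketch ignores.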



Posted By  -  Karan Gupta
 
Posted On  -  Tuesday, August 19, 2025
 
Updated On  -  Saturday, December 27, 2025
