
Character Tokenization In NLP






What Is Character Tokenization?
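Character tokenization splits a text into its individual characters, so every letter, digit, space, and punctuation mark becomes its own token. As a minimal sketch using only built-in Python (the variable names here are illustrative), converting a string to a list is enough to see the idea:

```python
text = "NLP 101!"

# A Python string is a sequence of characters,
# so list() yields one token per character
char_tokens = list(text)

print(char_tokens)
# ['N', 'L', 'P', ' ', '1', '0', '1', '!']
```

Note that the space and the exclamation mark are tokens too, so the token count equals the length of the string.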





Example


from nltk.tokenize import RegexpTokenizer

text = "NLP 101!"

# Tokenize each character: '.' matches any single character,
# including spaces and punctuation, not just letters or digits
tokenizer = RegexpTokenizer(r'.')
char_tokens = tokenizer.tokenize(text)

print(char_tokens)



Output


[Image: output of the character tokenization example]
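The pattern `r'.'` in the example above keeps every character, including the space and the exclamation mark. If only letters and digits are wanted, a narrower character class can be used instead. The sketch below uses Python's standard `re` module rather than NLTK so that it runs without extra installs. The pattern `[A-Za-z0-9]` and the name `alnum_tokens` are assumptions chosen for illustration, and the pattern covers ASCII characters only:

```python
import re

text = "NLP 101!"

# Match only ASCII letters and digits;
# the space and the punctuation mark are skipped
alnum_tokens = re.findall(r"[A-Za-z0-9]", text)

print(alnum_tokens)
# ['N', 'L', 'P', '1', '0', '1']
```

The same pattern should also work when passed to `RegexpTokenizer` in place of `r'.'`, since that tokenizer returns the pattern's matches by default.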



Posted By  -  Karan Gupta
 
Posted On  -  Friday, September 12, 2025
