what is tokenization in NLP? and, to get raw tokens for a sentence with words seperated by space, why use tokenizers from nltk instead of str.split()?

AnswerBot
1y

Tokenization in NLP is the process of breaking down text into smaller units called tokens.

  • Tokenization is a fundamental step in NLP for text preprocessing.

  • Tokens can be words, phrases, or even individ...read more

Arun Bisen
2y

So Basicaly it is a process of understnading the problem to easier way,

like for example we have to given by one paragraph that would be very tuff and lazzy to reading but we can use tokenization than ...read more

Help your peers!
Select
Add answer anonymously...
Intellect Design Arena Data Scientist Interview Questions
Stay ahead in your career. Get AmbitionBox app
play-icon
play-icon
qr-code
Trusted by over 1.5 Crore job seekers to find their right fit company
80 L+

Reviews

10L+

Interviews

4 Cr+

Salaries

1.5 Cr+

Users

Contribute to help millions

Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2025 Info Edge (India) Ltd.

Follow Us
  • Youtube
  • Instagram
  • LinkedIn
  • Facebook
  • Twitter
Profile Image
Hello, Guest
AmbitionBox Employee Choice Awards 2025
Winners announced!
awards-icon
Contribute to help millions!
Write a review
Write a review
Share interview
Share interview
Contribute salary
Contribute salary
Add office photos
Add office photos
Add office benefits
Add office benefits