What is tokenization in NLP? And, to get raw tokens for a sentence with words separated by spaces, why use tokenizers from NLTK instead of str.split()?

AnswerBot
1y

Tokenization in NLP is the process of breaking down text into smaller units called tokens.

  • Tokenization is a fundamental step in NLP for text preprocessing.

  • Tokens can be words, phrases, or even individ...
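As a rough illustration of the difference the question asks about, the sketch below compares str.split() with a punctuation-aware tokenizer. The sample sentence and variable names are made up for illustration. The regular expression is a stand-in that mirrors the pattern behind NLTK's wordpunct_tokenize, so the example runs even without NLTK; the NLTK call itself is only attempted if the library happens to be installed.

```python
import re

# Hypothetical sample sentence, chosen to contain punctuation and a contraction.
sentence = "Wait, don't stop!"

# str.split() only cuts on whitespace, so punctuation stays glued to words.
split_tokens = sentence.split()

# Stand-in tokenizer: runs of word characters, or runs of punctuation.
# This is an approximation for illustration, not NLTK's full feature set.
regex_tokens = re.findall(r"\w+|[^\w\s]+", sentence)

# Optional: the same idea through NLTK, if it is available.
try:
    from nltk.tokenize import wordpunct_tokenize
    nltk_tokens = wordpunct_tokenize(sentence)
except ImportError:
    nltk_tokens = None  # NLTK not installed; the regex version still works

print(split_tokens)
print(regex_tokens)
print(nltk_tokens)
```

With str.split(), "Wait," and "stop!" remain single tokens, so "stop" and "stop!" would be counted as different words later on. A tokenizer separates the punctuation, which is usually what downstream steps such as counting or vocabulary building expect. Other NLTK tokenizers handle contractions and abbreviations differently, so the right choice depends on the task.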

Arun Bisen
1y

Basically, it is a way of breaking a problem down so that it is easier to understand.

For example, if we are given a whole paragraph, reading it in one go would be tough and tedious, but we can use tokenization then ...

Intellect Design Arena Data Scientist Interview Questions
Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.
