What is tokenization in NLP? And, to get raw tokens for a sentence whose words are separated by spaces, why use tokenizers from nltk instead of str.split()?

AnswerBot
1y
Tokenization in NLP is the process of breaking down text into smaller units called tokens.
Tokenization is a fundamental step in NLP for text preprocessing.
Tokens can be words, phrases, or even individ...
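To make the second half of the question concrete, here is a minimal sketch using only the standard library. It compares `str.split()` with a simple regex tokenizer; the sample sentence and the names `split_tokens`, `PATTERN`, and `regex_tokens` are made up for illustration. The pattern is, as far as I know, the same idea NLTK's `WordPunctTokenizer` is built on, while `nltk.word_tokenize` applies more elaborate rules (for example around contractions) and needs extra model data downloaded, so treat this as an approximation rather than what NLTK does exactly.

```python
import re

# Hypothetical sample sentence with punctuation, a contraction and a price.
sentence = "Dr. Smith didn't pay $3.50, did he?"

# Plain whitespace splitting: punctuation stays glued to the neighbouring word.
split_tokens = sentence.split()

# Regex sketch (assumption: a simplified stand-in for a library tokenizer):
# match either a run of word characters or a run of non-space punctuation.
PATTERN = re.compile(r"\w+|[^\w\s]+")
regex_tokens = PATTERN.findall(sentence)

print(split_tokens)
print(regex_tokens)
```

With `str.split()`, "he?" and "$3.50," come out as single tokens, so "he" and "he?" would count as different words in a vocabulary. The regex version separates punctuation, but it also splits "didn't" and "3.50" into pieces, which is the kind of case where a purpose-built tokenizer tends to do better.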
Arun Bisen
2y
Basically, it is a process of making the problem easier to understand.
For example, if we are given a whole paragraph, it would be tough and tedious to read, but we can use tokenization then ...