What is tokenization in NLP? And, to get raw tokens for a sentence with words separated by spaces, why use tokenizers from NLTK instead of str.split()?
AnswerBot
1y
Tokenization in NLP is the process of breaking down text into smaller units called tokens.
Tokenization is a fundamental step in NLP for text preprocessing.
Tokens can be words, phrases, or even individ...
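To make the str.split() part of the question concrete, here is a minimal sketch using only the standard library. The regex-based `simple_tokenize` function and its `TOKEN_PATTERN` are hypothetical stand-ins written for this illustration, not NLTK's actual algorithm; they only show the kind of punctuation handling a dedicated tokenizer adds on top of whitespace splitting.

```python
import re

sentence = "Hello, world! Isn't NLP fun?"

# Whitespace splitting keeps punctuation glued to the neighbouring word.
whitespace_tokens = sentence.split()

# Hypothetical stand-in for a library tokenizer (an assumption, not NLTK's rules):
# a run of word characters, optionally continued by an internal "." or "'",
# or else a single punctuation character.
TOKEN_PATTERN = re.compile(r"\w+(?:[.']\w+)*|[^\w\s]")

def simple_tokenize(text):
    """Split text into word tokens and separate punctuation tokens."""
    return TOKEN_PATTERN.findall(text)

regex_tokens = simple_tokenize(sentence)

print(whitespace_tokens)  # ['Hello,', 'world!', "Isn't", 'NLP', 'fun?']
print(regex_tokens)       # ['Hello', ',', 'world', '!', "Isn't", 'NLP', 'fun', '?']
```

With str.split(), "Hello," and "Hello" would count as different tokens, which distorts vocabulary counts downstream. Library tokenizers such as NLTK's `word_tokenize` go further than this sketch, for example by splitting contractions into separate tokens, so their output will differ from the regex version above.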
Arun Bisen
1y
So basically, it is a process of breaking the problem down so it is easier to understand.
For example, if we are given a whole paragraph, it would be tough and tiresome to read, but we can use tokenization then ...