Starting from:

$30

CS4395 Homework 2 -Solved

. Create a Python notebook (Jupyter or Colab) with appropriate headings. You will later print-to
pdf for uploading. Note: intersperse all the code cells below with text cells that use markdown 
to describe what the code is doing and its output. Make sure that your notebook displays the 
code output. 
2. If you use Jupyter notebook with NLTK and libraries installed plus the nltk book download, you 
are good to go. If you use Colab, insert a code chunk at the top of your notebook to install these 
items: 
import nltk 
nltk.download(‘stopwords’) 
nltk.download(‘wordnet’) 
nltk.download(‘punkt’) 
ntlk.download(‘omw-1.4’) 
3. Code cell: Each of the built-in 9 texts is an NLTK Text object. Look at the code for the Text object 
at this link: https://www.nltk.org/_modules/nltk/text.html. Look at the tokens() method. Extract 
the first 20 tokens from text1. List two things you learned about the tokens() method or Text 
objects in the text cell above this code cell. 
4. Look at the concordance() method in the API. Using the documentation to guide you, in code, 
print a concordance for text1 word 'sea', selecting only 5 lines. 
5. Code cell: Look at the count() method in the API. How does this work, and how is it different or 
the same as Python's count method? Write your commentary above the code cell. In the code 
cells, experiment with both count() methods. 
6. Code cell: Using raw text of at least 5 sentences of your choice from any source (cite the source), 
save the text into a variable called raw_text. Using NLTK's word tokenizer, tokenize the text into 
variable 'tokens'. Print the first 10 tokens. 
7. Code cell: Using the same raw text, and NLTK's sentence tokenizer sent_tokenize(), perform 
sentence segmentation and display the sentences. 
8. Code cell: Using NLTK's PorterStemmer(), write a list comprehension to stem the text. Display 
the list.
9. Code cell: Using NLTK's WordNetLemmatizer, write a list comprehension to lemmatize the text. 
Display the list. In the text cell above this code cell, list at least 5 differences you see in the 
stems verses the lemmas. You can just write them each on a line, like this: 
stem-lemma 
10. Comment cell: Write a paragraph outlining: 
a. your opinion of the functionality of the NLTK library 
b. your opinion of the code quality of the NLTK library 
c. a list of ways you may use NLTK in future projects 
Grading Rubric: 

More products