Exercise: Chapter 2 (20-22)

20.

>>> import nltk
>>> from nltk import FreqDist
>>> def word_freq(word, section):
...     # Count every token in the given Brown corpus category.
...     fdist = FreqDist(nltk.corpus.brown.words(categories=section))
...     return fdist[word]
... 
>>> word_freq('love', 'romance')
32
>>> word_freq('city', 'government')
7
>>> word_freq('train', 'adventure')
10
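
The same lookup can be tried without the Brown corpus. This is only a minimal sketch: `collections.Counter` stands in for `FreqDist`, and the token list and the names `word_freq_in` and `relative_freq_in` are made up for illustration. The relative frequency is an extra, in case "frequency" in the question is read as a proportion rather than a raw count.

```python
from collections import Counter

# Hypothetical toy tokens standing in for brown.words(categories=...).
TOY_TOKENS = ["the", "city", "and", "the", "state", "of", "the", "city"]


def word_freq_in(word, tokens):
    """Return how many times `word` occurs in `tokens` (0 if absent)."""
    counts = Counter(tokens)
    return counts[word]


def relative_freq_in(word, tokens):
    """Return the share of tokens equal to `word`, or 0.0 for an empty list."""
    return word_freq_in(word, tokens) / len(tokens) if tokens else 0.0


print(word_freq_in("city", TOY_TOKENS))      # 2
print(word_freq_in("train", TOY_TOKENS))     # 0
print(relative_freq_in("the", TOY_TOKENS))   # 0.375
```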

21.

Words with multiple pronunciations need to be handled so that each word is counted only once.

>>> text = ['love', 'new', 'yankee', 'really']
>>> new_entries = [entry for entry in nltk.corpus.cmudict.entries() if entry[0] in text]
>>> done_ent = []
>>> tot_len = 0
>>> for entry in new_entries:
...     # Count only the first pronunciation listed for each word.
...     if entry[0] not in done_ent:
...         tot_len += len(entry[1])
...         done_ent.append(entry[0])
... 
>>> tot_len
14
>>> new_entries
[('love', ['L', 'AH1', 'V']), ('new', ['N', 'UW1']), ('new', ['N', 'Y', 'UW1']), ('really', ['R', 'IH1', 'L', 'IY0']), ('really', ['R', 'IY1', 'L', 'IY0']), ('yankee', ['Y', 'AE1', 'NG', 'K', 'IY0'])]
>>>

The result looks correct: taking only the first pronunciation of each word gives 3 + 2 + 4 + 5 = 14.

Next, define this as a function and run it on the samples.

>>> def CountSE(text):
...     new_entries = [entry for entry in nltk.corpus.cmudict.entries() if entry[0] in text]
...     done_ent = []
...     tot_len = 0
...     for entry in new_entries:
...         # Count only the first pronunciation listed for each word.
...         if entry[0] not in done_ent:
...             tot_len += len(entry[1])
...             done_ent.append(entry[0])
...     return tot_len
... 
>>> CountSE(text)
14
>>> CountSE(fdist.samples())
36162
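
The counting above can also be sketched without loading cmudict, using the six entries printed earlier. This is a rough sketch, not a replacement for `CountSE`: the helper names are hypothetical, and a set and a dictionary replace the list membership tests. `count_phones` reproduces the total of 14. `count_syllables` is my assumption about another reading of the exercise, where a syllable is approximated by a vowel phone, that is, a phone ending in a stress digit (0, 1 or 2).

```python
# Entries copied from the cmudict output shown earlier in this post.
ENTRIES = [
    ("love", ["L", "AH1", "V"]),
    ("new", ["N", "UW1"]),
    ("new", ["N", "Y", "UW1"]),
    ("really", ["R", "IH1", "L", "IY0"]),
    ("really", ["R", "IY1", "L", "IY0"]),
    ("yankee", ["Y", "AE1", "NG", "K", "IY0"]),
]


def first_pronunciations(entries, words):
    """Map each wanted word to the first pronunciation listed for it."""
    wanted = set(words)
    first = {}
    for word, phones in entries:
        if word in wanted and word not in first:
            first[word] = phones
    return first


def count_phones(entries, words):
    """Total number of phones, one pronunciation per distinct word."""
    return sum(len(p) for p in first_pronunciations(entries, words).values())


def count_syllables(entries, words):
    """Total number of vowel phones (those ending in a stress digit)."""
    return sum(
        1
        for phones in first_pronunciations(entries, words).values()
        for phone in phones
        if phone[-1].isdigit()
    )


words = ["love", "new", "yankee", "really"]
print(count_phones(ENTRIES, words))     # 14
print(count_syllables(ENTRIES, words))  # 6
```

Like `CountSE`, both helpers count each distinct word once, so repeated words in a running text would need to be weighted by their frequency.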

22.

I am not sure I understand the question correctly, but here is my attempt.

>>> text = fdist.samples()[:50]
>>> def hedge(text):
...     new_text = []
...     counter = 1
...     for word in text:
...         new_text.append(word)
...         # Insert 'like' after every third word.
...         if counter % 3 == 0:
...             new_text.append('like')
...         counter = counter + 1
...     return new_text
... 
>>> hedge(text)
[',', '.', 'the', 'like', 'and', 'to', 'a', 'like', 'of', '``', "''", 'like', 'was', 'I', 'in', 'like', 'he', 'had', '?', 'like', 'her', 'that', 'it', 'like', 'his', 'she', 'with', 'like', 'you', 'for', 'at', 'like', 'He', 'on', 'him', 'like', 'said', '!', '--', 'like', 'be', 'as', ';', 'like', 'have', 'but', 'not', 'like', 'would', 'She', 'The', 'like', 'out', 'were', 'up', 'like', 'all', 'from', 'could', 'like', 'me', 'like', 'been', 'like', 'so', 'there']
>>> 
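
If the question means 'like' should only appear between words, the version above has one edge case: when the number of words is a multiple of three, it also adds 'like' after the last word. Below is a minimal sketch of a variant that skips that trailing insertion. The name `hedge_between` and the `filler` and `every` parameters are my own additions for illustration.

```python
def hedge_between(words, filler="like", every=3):
    """Insert `filler` after every `every`-th word, but never at the very end."""
    words = list(words)  # accept any iterable, since len() is needed below
    result = []
    for position, word in enumerate(words, start=1):
        result.append(word)
        if position % every == 0 and position < len(words):
            result.append(filler)
    return result


print(hedge_between(["a", "b", "c", "d", "e", "f"]))
# ['a', 'b', 'c', 'like', 'd', 'e', 'f']
```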

I will try the remaining questions (23 onward) after going through the entire book.