Combining Different Sequence Types (4.2.2-4.2.3)

Let's continue.

>>> words = 'I turned off the spectroroute'.split()
>>> wordlens = [(len(word), word) for word in words]
>>> wordlens.sort()
>>> ' '.join(w for (_, w) in wordlens)
'I off the turned spectroroute'
>>>

The first line splits the sentence into words. The second builds a list of (length, word) tuples, one per word. Sorting that list orders the pairs by length first, and alphabetically when lengths are equal. Finally the words are joined back together with spaces. The underscore (_) is a conventional name for a value we don't intend to use, here the length.
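
As an aside (my own sketch, not from the book), the same length ordering can be obtained directly with sorted() and a key function. The two approaches break ties differently: the tuple version falls back to alphabetical order for equal lengths, while key=len keeps the words' original order, which happens to give the same result for this sentence.

>>> sorted(words, key=len)
['I', 'off', 'the', 'turned', 'spectroroute']
>>> ' '.join(sorted(words, key=len))
'I off the turned spectroroute'
>>>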

>>> lexicon = [
...     ('the', 'det', ['Di:', 'D@']),
...     ('off', 'prep', ['Qf', 'O:f'])
... ]
>>> lexicon
[('the', 'det', ['Di:', 'D@']), ('off', 'prep', ['Qf', 'O:f'])]
>>>

Lists vs. tuples: a list is mutable, but a tuple is immutable.

>>> lexicon.sort()
>>> lexicon[1] = ('turned', 'VBD', ['t3:nd', 't3`nd'])
>>> del lexicon[0]
>>> lexicon
[('turned', 'VBD', ['t3:nd', 't3`nd'])]
>>> lexicon = tuple(lexicon)
>>> lexicon[1] = ('turned', 'ADJ', ['t3:nd', 't31nd'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> 
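
One more aside of my own (not from the book): a tuple's immutability is shallow. Item assignment on the tuple fails as above, but a mutable object stored inside it, such as the pronunciation list, can still be changed in place. The extra pronunciation below is just a made-up value for illustration.

>>> lexicon[0][2].append('tE:nd')
>>> lexicon
(('turned', 'VBD', ['t3:nd', 't3`nd', 'tE:nd']),)
>>>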

Generator Expressions:

>>> text = '''"When I use a word," Humpty Dumpty said in rather a scornful tone, 
... "it means just what I chooseit to mean - neither more nor less."'''
>>> [w.lower() for w in nltk.word_tokenize(text)]
['``', 'when', 'i', 'use', 'a', 'word', ',', "''", 'humpty', 'dumpty', 'said', 'in', 'rather', 'a', 'scornful', 'tone', ',', "''", 'it', 'means', 'just', 'what', 'i', 'chooseit', 'to', 'mean', '-', 'neither', 'more', 'nor', 'less', '.', "''"]
>>> max([w.lower() for w in nltk.word_tokenize(text)])
'word'
>>> max(w.lower() for w in nltk.word_tokenize(text))
'word'

The only difference in the second call is that the square brackets are omitted, which turns the list comprehension into a generator expression. Instead of building the whole list of lowercased tokens in memory before max() runs, the generator hands max() one item at a time, which saves memory and can improve runtime when the data is large.
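
Here is a small sketch of my own (not from the book) to make the difference visible: a generator expression is not a list, it produces items lazily, and it is exhausted after one pass (the memory address in its repr is elided).

>>> squares_list = [n * n for n in range(5)]
>>> squares_gen = (n * n for n in range(5))
>>> squares_list
[0, 1, 4, 9, 16]
>>> squares_gen
<generator object <genexpr> at 0x...>
>>> sum(squares_gen)
30
>>> sum(squares_gen)      # already exhausted, so sum() sees an empty sequence
0
>>>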