O'Reilly: Chapter 1 Exercise 23-29
23.
>>> for w in [w for w in text6 if w.isupper()]:
...     print w
...
....
ARTHUR
GALAHAD
KNIGHTS
TIM
ROBIN
ARTHUR
ROBIN
GALAHAD
ARTHUR
ROBIN
KNIGHTS
ARTHUR
TIM
I
KNIGHTS
....
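The same filtering can also be written as a plain loop with an if test, without building a temporary list first. This is only a small sketch: the `tokens` list below is a made-up stand-in for text6 (an assumption, so that it runs without NLTK), and print() is called with a single argument so that it behaves the same on Python 2 and Python 3.

```python
# Hypothetical stand-in for text6; the real text comes from nltk.book.
tokens = ['ARTHUR', 'rides', 'with', 'SIR', 'ROBIN', '.', 'I']

# Plain loop: test each token and print the ones written entirely in upper case.
for w in tokens:
    if w.isupper():
        print(w)

# The list-comprehension form collects the same tokens into a list.
upper_words = [w for w in tokens if w.isupper()]
```

Note that '.' is skipped because isupper() is False for strings with no cased characters.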
24.
At first, I thought the exercise asked for words that meet all of the conditions at once, but I could not find any. So I ran the conditions one by one.
>>> [w for w in text6 if w.endswith('ize')]
[]
>>> [w for w in text6 if 'z' in w]
['zone', 'amazes', 'Fetchez', 'Fetchez', 'zoop', 'zoo', 'zhiv', 'frozen', 'zoosh']
>>> [w for w in text6 if 'pt' in w]
['empty', 'aptly', 'Thpppppt', 'Thppt', 'Thppt', 'empty', 'Thppppt', 'temptress', 'temptation', 'ptoo', 'Chapter', 'excepting', 'Thpppt']
>>> set([w for w in text6 if (w.istitle() and len(w) > 1)])
set(['Welcome', 'Winter', 'Lead', 'Uugh', 'Does', 'Saint', 'Until', 'Today', 'Thou', 'Burn',
'Lucky', 'Uhh', 'Not', 'Now', 'Twenty', 'Where', 'Just', 'Course', 'Go', 'Erbert', 'Uther',
'Actually', 'Cherries', 'Thpppt', 'Bloody', 'Aramaic', 'Mmm', 'Put', 'Haw', 'True', 'Pull',
'Fiends', 'Agh', 'Yup', 'We', 'Arthur', 'Zoot', 'English', 'Alright', 'My', 'Silence', 'Clark',
'Bedevere', 'Bors', 'Back', 'Maynard', 'Fetchez', 'Seek', 'Exactly', 'Doctor', 'Rather', 'When',
'Three', 'Providence', 'Book', 'Therefore', 'Huh', 'Stay', 'Umhm', 'Aaaaaaaah', 'Huy', 'Those',
'Dingo', 'Cider', 'Chop', 'Aauuugh', 'So', 'Found', 'Guy', 'Oui', 'Anarcho', 'Torment', 'Our',
'Your', 'Lie', 'Almighty', 'Galahad', 'Britons', 'Lord', 'Who', 'Beast', 'Loimbard', 'Why', 'Don',
'Guards', 'Oooh', 'All', 'Aaauugh', 'Assyria', 'Yeaaah', 'One', 'Farewell', 'Greetings', 'Beyond',
'Blue', 'What', 'Ayy', 'His', 'Recently', 'Here', 'Hic', 'Away', 'Wait', 'Concorde', 'Herbert',
'Ere', 'Bad', 'She', 'Mother', 'Shh', 'Erm', 'Tower', 'Robin', 'Summer', 'Chaste', 'Enchanter',
'Skip', 'Four', 'Say', 'Anthrax', 'Mud', 'Armaments', 'Build', 'Which', 'Nador', 'Hiyaah', 'Woa',
'More', 'Picture', 'Holy', 'Very', 'Practice', 'Packing', 'Uuh', 'Hold', 'Huyah', 'Throw', 'Must',
'None', 'This', 'Leaving', 'Ives', 'Nine', 'Stand', 'Firstly', 'Brother', 'Oooo', 'Eh', 'Amen',
'Jesus', 'Camaaaaaargue', 'Divine', 'Speak', 'Even', 'Hallo', 'Dappy', 'Yay', 'Iiiives', 'Prepare',
'There', 'Please', 'Black', 'Pure', 'Quoi', 'Excalibur', 'Iesu', 'Hmm', 'Midget', 'Angnor',
'Splendid', 'Aggh', 'Lancelot', 'Victory', 'See', 'Will', 'Shrubberies', 'Court', 'Aauuuves', 'God',
'Father', 'Patsy', 'It', 'Peng', 'Other', 'Then', 'Halt', 'Thee', 'Ridden', 'Aaaah', 'Knight',
'Antioch', 'They', 'Ask', 'With', 'Gallahad', 'Off', 'Thy', 'Well', 'Didn', 'Anybody', 'Isn',
'Grail', 'Neee', 'The', 'Bridge', 'Thsss', 'Hiyah', 'Yapping', 'Robinson', 'Hah', 'Explain',
'Aauuggghhh', 'Hill', 'Forward', 'Behold', 'European', 'Shut', 'Meanwhile', 'Chickennn', 'French',
'Psalms', 'Auuuuuuuugh', 'Ector', 'Aah', 'Keep', 'Quick', 'Once', 'Right', 'Help', 'Over', 'Anyway',
'Aaaugh', 'For', 'France', 'Umm', 'Walk', 'Dramatically', 'Good', 'Run', 'That', 'Arimathea',
'Forgive', 'Ecky', 'King', 'Could', 'Quiet', 'Hooray', 'Himself', 'African', 'Launcelot', 'Gable',
'Bravest', 'Bring', 'Shrubber', 'Aaah', 'Yes', 'Death', 'Christ', 'Would', 'Hey', 'Waa', 'Hee',
'Sorry', 'Heh', 'Get', 'Crapper', 'But', 'Hiyya', 'Aaaaaaaaah', 'Schools', 'Hurry', 'Princess',
'Together', 'Dragon', 'Honestly', 'Caerbannog', 'Action', 'Knights', 'Round', 'And', 'Old', 'How',
'Winston', 'Mercea', 'Battle', 'Follow', 'Aaaaugh', 'Open', 'Ahh', 'Bedwere', 'Hya', 'Tis', 'Til',
'Tim', 'Charge', 'Wood', 'You', 'Nay', 'Tell', 'Stop', 'Aaaaaah', 'Excuse', 'Riiight', 'Supposing',
'Aaauggh', 'Attila', 'Do', 'Clear', 'Alice', 'Apples', 'Bristol', 'Order', 'Try', 'Piglet', 'Tall',
'Spring', 'Is', 'Mind', 'Mine', 'Have', 'In', 'Table', 'Dennis', 'If', 'Wayy', 'Thank', 'Ninepence',
'Said', 'Hyy', 'Churches', 'Be', 'Augh', 'Ewing', 'Far', 'Oooohoohohooo', 'Surely', 'Consult', 'By',
'On', 'Unfortunately', 'Oh', 'Did', 'Of', 'Supreme', 'Morning', 'Tale', 'Ow', 'England', 'Or',
'Dis', 'Brave', 'Ohh', 'Pin', 'Pendragon', 'Are', 'Bones', 'Fine', 'Prince', 'Too', 'Iiiiives',
'Since', 'Pie', 'Idiom', 'Between', 'Whoa', 'Listen', 'Monsieur', 'Oooooooh', 'Frank', 'Quite',
'Let', 'Ho', 'Hm', 'Nothing', 'Ha', 'He', 'Chapter', 'Look', 'Thppppt', 'Um', 'Un', 'Uh', 'Bon',
'Hello', 'First', 'Ages', 'Autumn', 'Looks', 'Olfin', 'Message', 'Really', 'Ni', 'Use', 'Cut', 'No',
'Make', 'Aauuuuugh', 'Two', 'Quickly', 'Everything', 'Thpppppt', 'Nu', 'Rheged', 'Most', 'Hang',
'Ooh', 'Hand', 'Gawain', 'Every', 'Aaagh', 'Come', 'Bread', 'Peril', 'Steady', 'Thppt', 'Ulk',
'Silly', 'Defeat', 'Eee', 'Castle', 'Grenade', 'Camelot', 'Aagh', 'Britain', 'Joseph', 'Badon',
'Sir', 'Hoa', 'Perhaps', 'Hoo', 'Saxons', 'Lake', 'Thursday', 'To', 'Shall', 'May', 'Never',
'Eternal', 'As', 'Cornwall', 'Running', 'Five', 'Gorge', 'Lady', 'Man', 'Great', 'Like', 'Yeaah',
'Remove', 'Swamp', 'Heee', 'Ah', 'Am', 'Yeah', 'An', 'Bravely', 'Allo', 'At', 'Ay', 'Roger',
'Chicken'])
Although no words were found for the first condition, there are some which end with 'ise' instead of 'ize'.
>>> [w for w in text6 if w.endswith('ise')]
['wise', 'wise', 'apologise', 'surprise', 'surprise', 'surprise', 'noise', 'surprise']
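For comparison, here is a rough sketch of what "all conditions at once" versus "any one condition" looks like in a single list comprehension. The `tokens` list is invented for illustration (an assumption, not data from text6), and it uses istitle() for the titlecase condition.

```python
# Hypothetical tokens; text6 itself contains no word that ends in 'ize'.
tokens = ['Optimize', 'optimize', 'zone', 'empty', 'Arthur', 'surprise']

# Words that satisfy every condition: joined with "and".
all_four = [w for w in tokens
            if w.endswith('ize') and 'z' in w and 'pt' in w and w.istitle()]

# Words that satisfy at least one condition: joined with "or".
any_of = [w for w in tokens
          if w.endswith('ize') or 'z' in w or 'pt' in w or w.istitle()]

print(all_four)
print(any_of)
```

With "and", a single failing condition (such as no word ending in 'ize') is enough to make the whole result empty, which matches what happened with text6.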
25.
>>> sentx = ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore']
>>> sentx
['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore']
>>> for w in [w for w in sentx if w.startswith('sh')]:
...     print w
...
she
shells
shore
>>> for w in [w for w in sentx if len(w) > 4]:
...     print w
...
sells
shells
shore
26.
>>> sum([len(w) for w in text1])
999044
This code sums the length (len) of every token in text1. The result should be the total number of characters in text1, counting punctuation tokens but not the whitespace between tokens.
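A tiny example makes the arithmetic easier to check by hand. The four-token list below is made up for illustration (an assumption, not taken from text1).

```python
# Hypothetical token list; punctuation counts as its own token.
tokens = ['Call', 'me', 'Ishmael', '.']

# Sum of token lengths: 4 + 2 + 7 + 1.
total_chars = sum(len(w) for w in tokens)
print(total_chars)  # 14

# Joining with spaces adds three characters that the sum above does not see.
joined = ' '.join(tokens)
print(len(joined))  # 17
```

So the total is a count of characters inside tokens only, which is why it is smaller than the length of the original running text.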
27.
>>> def vocab_size(text):
...     return len(set(text))
...
>>> vocab_size(text1)
19317
>>> len(set(text1))
19317
28.
>>> from __future__ import division
>>> def percent(word, text):
...     wf = text.count(word)
...     pt = wf / len(text) * 100
...     rt = []
...     rt.append(wf)
...     rt.append(pt)
...     return rt
...
>>> percent('whale', text1)
[906, 0.3473673313677301]
Not beautiful. I know I don't have a good sense of coding...
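One possible tidier version, offered only as a sketch: return the count and the percentage directly as a tuple instead of appending to a list. The token list is a made-up example (an assumption), and multiplying by 100.0 keeps the division in floating point on both Python 2 and Python 3.

```python
def percent(word, text):
    """Return (count, percentage) for `word` in the token list `text`."""
    count = text.count(word)
    # 100.0 forces float arithmetic, so Python 2 does not truncate to an integer.
    return count, 100.0 * count / len(text)

# Hypothetical five-token text: 'the' appears twice.
tokens = ['the', 'whale', 'and', 'the', 'sea']
count, pct = percent('the', tokens)
print('%d occurrences, %.1f%%' % (count, pct))
```

Unpacking the tuple into `count` and `pct` gives each value a name at the call site, which reads a little more clearly than indexing into a list.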
29.
>>> set(sent3) < set(text1)
True
>>> set(text3) < set(text1)
False
>>> len(set(text3))
2789
>>> len(set(text1))
19317
Whenever I used set() before, I always wrapped it in len(), so I assumed that < was just comparing sizes. That cannot explain the second result, though: set(text3) is smaller than set(text1), yet set(text3) < set(text1) is False.
According to the Python library reference (http://docs.python.org/2/library/sets.html), sets support set-to-set comparison: s < t tests whether s is a proper subset of t. So the first expression asks whether set(text1) includes every element of set(sent3) (and whether the two sets differ). It does, so the result is True.
>>> set(sent3) < set(text1)
True
The next one checks whether set(text1) includes every element of set(text3).
>>> set(text3) < set(text1)
False
It returns False because at least one element of set(text3) is not included in set(text1).
If so, this would be very helpful when comparing two sets.
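A small sketch of the subset comparison on toy data may help. The three sets below are invented for illustration (assumptions, not the NLTK texts); set difference is used to show which elements break the subset relation.

```python
# Hypothetical vocabularies, built the same way as set(text1) and friends.
small = set(['the', 'whale'])
big = set(['the', 'whale', 'sea'])
other = set(['the', 'garden'])

print(small < big)   # True: every element of small is in big, and they differ
print(other < big)   # False: 'garden' is not in big
print(big < big)     # False: < means proper subset, so equal sets fail
print(big <= big)    # True: <= allows equality

# The elements that prevent other from being a subset of big.
print(sorted(other - big))  # ['garden']
```

Running sorted(set(text3) - set(text1)) in the same way should list the words that appear in text3 but not in text1.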