Python 3, Belatedly (48) - Chapter 7 Review Exercises

Chapter 7 review exercises. It may be an introductory book, but the exercises are getting steadily harder.

入門 Python 3


7.1
>>> import unicodedata
>>> mystery = '\U0001f4a9'
>>> mystery
'💩'
>>> unicodedata.name(mystery)
'PILE OF POO'

The sense of deflation when I finally arrived at this answer...
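The lookup also works in the other direction. A minimal sketch, assuming nothing beyond the standard unicodedata module; the variable names other than mystery are my own:

```python
import unicodedata

mystery = '\U0001f4a9'

# Character -> official Unicode name.
name = unicodedata.name(mystery)

# Official Unicode name -> character.
same_char = unicodedata.lookup(name)

# The name can also be written directly inside a string literal.
by_name_escape = '\N{PILE OF POO}'

print(name, same_char == mystery, by_name_escape == mystery)
```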

7.2
>>> pop_byte = mystery.encode('utf-8')
>>> pop_byte
b'\xf0\x9f\x92\xa9'

7.3
>>> pop_string = pop_byte.decode('utf-8')
>>> mystery == pop_string
True
>>> pop_string
'💩'

Why does this smiling face mildly irritate me every time I see it?
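What 7.2 and 7.3 show is that one character and its UTF-8 encoding have different lengths. A small sketch of that round trip (the helper variable names are mine, not from the book):

```python
mystery = '\U0001f4a9'
pop_bytes = mystery.encode('utf-8')

char_count = len(mystery)     # counts Unicode code points
byte_count = len(pop_bytes)   # counts encoded bytes

# Decoding with the same codec restores the original string.
round_trip = pop_bytes.decode('utf-8')

print(char_count, byte_count, round_trip == mystery)
```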

7.4
>>> poem = '''My kitty cat likes %s,
... My kitty cat likes %s,
... My kitty cat fell on his %s,
... And now thinks he's a %s.''' % (
... 'roast beef', 'ham', 'head', 'clam')
>>> print(poem)
My kitty cat likes roast beef,
My kitty cat likes ham,
My kitty cat fell on his head,
And now thinks he's a clam.
>>> 

Honestly, I'm more used to the old style.
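For comparison, here is roughly how the same substitution looks in the new style. This is only a sketch: the template is the poem from the exercise, and converting %s to {} with replace() is a shortcut for this demo, not a general conversion technique.

```python
words = ('roast beef', 'ham', 'head', 'clam')

template_old = ("My kitty cat likes %s,\n"
                "My kitty cat likes %s,\n"
                "My kitty cat fell on his %s,\n"
                "And now thinks he's a %s.")

# Demo-only shortcut: swap each %s placeholder for an empty {} field.
template_new = template_old.replace('%s', '{}')

old_style = template_old % words          # old style: % operator with a tuple
new_style = template_new.format(*words)   # new style: positional arguments

print(old_style == new_style)
```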

7.5
>>> letter = '''Dear {salutation} {name},
... 
... Thank you for your letter. We are sorry that our {product} {verbed} in your {room}. Please note that it should never be used in a {room}, especially near any {animals}.
... 
... Send us your receipt and {amount} for shipping and handling. We will send you another {product} that, in our tests, is {percent}% less likely to have {verbed}.
... 
... Thank you for your support.
... 
... Sincerely,
... {spokesman}
... {job_title}
... '''
>>> print(letter)
Dear {salutation} {name},

Thank you for your letter. We are sorry that our {product} {verbed} in your {room}. Please note that it should never be used in a {room}, especially near any {animals}.

Send us your receipt and {amount} for shipping and handling. We will send you another {product} that, in our tests, is {percent}% less likely to have {verbed}.

Thank you for your support.

Sincerely,
{spokesman}
{job_title}

>>> 

I only need to create it, right?

7.6

First, build the dictionary.

>>> response = {}
>>> response['salutation'] = 'Ms.'
>>> response['name'] = 'Suzuki'
>>> response['product'] = 'TV set'
>>> response['verbed'] = 'broken'
>>> response['room'] = 'bathroom'
>>> response['animals'] = 'tigers'
>>> response['amount'] = 1000
>>> response['percent'] = 0.2
>>> response['spokesman'] = 'Bill Blackgates'
>>> response['job_title'] = 'Chairman'
>>> response
{'verbed': 'broken', 'room': 'bathroom', 'job_title': 'Chairman', 'percent': 0.2, 'product': 'TV set', 'animals': 'tigers', 'amount': 1000, 'salutation': 'Ms.', 'name': 'Suzuki', 'spokesman': 'Bill Blackgates'}
>>> 

Now try printing letter.

>>> print(letter.format(response))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'salutation'
>>> response['salutation']
'Ms.'

It complains even though the key is there.

>>> print(letter.format(**response))
Dear Ms. Suzuki,

Thank you for your letter. We are sorry that our TV set broken in your bathroom. Please note that it should never be used in a bathroom, especially near any tigers.

Send us your receipt and 1000 for shipping and handling. We will send you another TV set that, in our tests, is 0.2% less likely to have broken.

Thank you for your support.

Sincerely,
Bill Blackgates
Chairman

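Since (**response) went through, I first wondered whether the number of dictionary keys had to match the number of insertion points, but that isn't the reason. format(response) passes the dictionary as a single positional argument, while named fields such as {salutation} are looked up among the keyword arguments. ** unpacks the dictionary into keyword arguments, which is why it works.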
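The difference can be reproduced with a shortened template. A minimal sketch; template is a hypothetical stand-in for letter, and the dictionary holds only two of the keys:

```python
template = 'Dear {salutation} {name},'
response = {'salutation': 'Ms.', 'name': 'Suzuki'}

# The dict arrives as positional argument 0, so the named field
# {salutation} finds no keyword argument and raises KeyError.
try:
    template.format(response)
    positional_failed = False
except KeyError:
    positional_failed = True

# ** unpacks the dict into salutation='Ms.', name='Suzuki'.
via_kwargs = template.format(**response)

# format_map accepts the mapping directly, without unpacking.
via_map = template.format_map(response)

# A positional dict still works if the fields index into argument 0.
via_index = 'Dear {0[salutation]} {0[name]},'.format(response)

# Extra keys are simply ignored, so the counts need not match.
with_extra = template.format(**dict(response, unused='ignored'))

print(positional_failed, via_kwargs)
```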

7.7

The text is available here. Note, though, that it is not the whole thing.

http://www.gutenberg.org/cache/epub/36068/pg36068.txt

>>> mammoth = '''
... We have seen thee, queen of cheese,
...     Lying quietly at your ease,
...     Gently fanned by evening breeze,
...     Thy fair form no flies dare seize.
... 
...     All gaily dressed soon you'll go
...     To the great Provincial show,
...     To be admired by many a beau
...     In the city of Toronto.
... 
...     Cows numerous as a swarm of bees,
...     Or as the leaves upon the trees,
...     It did require to make thee please,
...     And stand unrivalled, queen of cheese.
... 
...     May you not receive a scar as
...     We have heard that Mr. Harris
...     Intends to send you off as far as
...     The great world's show at Paris.
... 
...     Of the youth beware of these,
...     For some of them might rudely squeeze
...     And bite your cheek, then songs or glees
...     We could not sing, oh! queen of cheese.
... 
...     We'rt thou suspended from balloon,
...     You'd cast a shade even at noon,
...     Folks would think it was the moon
...     About to fall and crush them soon.
... '''

The indentation came out misaligned, but I'll trust that it won't get in the way of the exercises.

7.8
>>> import re
>>> re.findall(r'c.*', mammoth)
['cheese,', 'cial show,', 'city of Toronto.', 'cheese.', 'ceive a scar as', 'cheek, then songs or glees', 'could not sing, oh! queen of cheese.', 'cast a shade even at noon,', 'crush them soon.']
>>> 

It looks plausible, but these aren't words.

>>> re.findall(r'\bc\w*', mammoth)
['cheese', 'city', 'cheese', 'cheek', 'could', 'cheese', 'cast', 'crush']
>>> 

So \b is a word boundary, \w is a word character (a letter, digit, or underscore), and * means zero or more repetitions.

ちなみにこんな間違いも犯した。

>>> re.findall(r'c\w*\b', mammoth)
['cheese', 'cial', 'city', 'cheese', 'ceive', 'car', 'cheek', 'could', 'cheese', 'cast', 'crush']

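Thinking of \b as the end of a word, I moved it to the end of the pattern, but that also picks up the tails of words with a c in the middle ('cial', 'ceive', 'car'), so it's no good.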

>>> re.findall(r'\bc\w*\b', mammoth)
['cheese', 'city', 'cheese', 'cheek', 'could', 'cheese', 'cast', 'crush']

Actually, this one feels like the more proper form. (Or maybe not.)
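As far as I can tell the trailing \b makes no difference here: \w* is greedy, so it stops only where the word characters run out, which is already a word boundary. A quick check on a made-up sample string (not the poem):

```python
import re

text = "cheese scar receive city c"

# Leading boundary only: the word must start with c.
without_trailing = re.findall(r'\bc\w*', text)

# Leading and trailing boundary.
with_trailing = re.findall(r'\bc\w*\b', text)

# Trailing boundary only: also catches the tail of 'scar' and 'receive'.
trailing_only = re.findall(r'c\w*\b', text)

print(without_trailing, with_trailing, trailing_only)
```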

7.9

Again, failed attempts included.

>>> re.findall(r'\bc...', mammoth)
['chee', 'city', 'chee', 'chee', 'coul', 'chee', 'cast', 'crus']
>>> re.findall(r'\bc\w..', mammoth)
['chee', 'city', 'chee', 'chee', 'coul', 'chee', 'cast', 'crus']
>>> re.findall(r'\bc\w..\b', mammoth)
['city', 'cast']
>>> re.findall(r'\bc\w{3}\b', mammoth)
['city', 'cast']
>>> 

The key seems to be putting \b, the word boundary, at the end as well.

7.10
>>> re.findall(r'\b\w*r\b', mammoth)
['your', 'fair', 'Or', 'scar', 'Mr', 'far', 'For', 'your', 'or']

How should Mr. be judged? Out? Safe?

The answer key says that words ending in l give odd results, so I tried that.

>>> re.findall(r'\b\w*l\b', mammoth)
['All', 'll', 'Provincial', 'fall']
>>> re.findall(r'\b[\w\']*l\b', mammoth)
['All', "you'll", 'Provincial', 'fall']
>>> 

The explanation given is that \w matches only letters, digits, and underscores, so apostrophes cause trouble.
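To see exactly what \w does and does not cover, here is a small check on a made-up sample string:

```python
import re

sample = "you'll R2D2 snake_case"

# \w alone: digits and underscores are included, the apostrophe is not,
# so "you'll" is split in two.
plain = re.findall(r'\w+', sample)

# Adding the apostrophe to the character class keeps the contraction whole.
with_apostrophe = re.findall(r"[\w']+", sample)

print(plain)
print(with_apostrophe)
```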

7.11
>>> re.findall(r'\b[\w\']*[aeiou]{3}[\w\']*\b', mammoth)
['queen', 'quietly', 'beau', 'queen', 'squeeze', 'queen']
>>>

Looks good. But then I realized this also picks up words with four or more consecutive vowels.

>>> re.findall(r'\b[\w\']*[aeiou]{3}[^aeiou][\w\']*\b', mammoth)
['queen', 'quietly', 'queen', 'squeeze', 'queen']
>>>

Huh, beau has disappeared. It's not as if it has four or more vowels in a row.

>>> re.findall(r'\b[\w\']*[aeiou]{3}[^aeiou\s]*[\w\']*\b', mammoth)
['queen', 'quietly', 'beau', 'queen', 'squeeze', 'queen']
>>> 

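beau is immediately followed by a newline, so I guessed the newline was to blame: [^aeiou] swallows the newline, and after that \b can no longer match in front of the indentation. Adding \s (whitespace) to the class to make it [^aeiou\s] looks fine. Note, however, that what really brings beau back is the * I put after the class: it lets the class match nothing at all, so this pattern once again accepts runs of four or more vowels.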
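If runs of four or more vowels really should be rejected, one option is to use lookarounds instead of consuming a character. This is my own sketch, not the book's answer, and the sample string is made up:

```python
import re

sample = "queen beau\n    queue quietly"

# The pattern from above: the starred class may match nothing,
# so 'queue' (four vowels in a row) is still accepted.
starred = re.compile(r"\b[\w']*[aeiou]{3}[^aeiou\s]*[\w']*\b")

# Lookbehind/lookahead assert that the three vowels are not preceded
# or followed by another vowel, without consuming any character.
exactly_three = re.compile(
    r"\b[\w']*(?<![aeiou])[aeiou]{3}(?![aeiou])[\w']*\b")

starred_hits = starred.findall(sample)
exact_hits = exactly_three.findall(sample)

print(starred_hits)
print(exact_hits)
```

Because the lookahead consumes nothing, the newline after beau does no harm.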

7.12

unhexlify lives in binascii, so start with the import.

>>> import binascii
>>> gifhex = '47494638396101000100800000000000ffffff21f9' + \
... '0401000000002c000000000100010000020144003b'
>>> gif = binascii.unhexlify(gifhex)
>>> gif
b'GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x01D\x00;'

What's the semicolon at the end? I'll carry on for now.
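On the semicolon: the repr of a bytes object shows any byte in the printable ASCII range as its character, and the last byte of the data is 0x3b, which is the ASCII code for ';' (the GIF format uses 0x3B as its trailer byte). The same goes for '!', ',' and 'D' in the middle. A quick check:

```python
import binascii

gifhex = ('47494638396101000100800000000000ffffff21f9'
          '0401000000002c000000000100010000020144003b')
gif = binascii.unhexlify(gifhex)

# Indexing a bytes object yields an int.
last_byte = gif[-1]

# The same value written three ways.
as_char = chr(last_byte)
as_bytes = bytes([last_byte])

print(hex(last_byte), as_char, as_bytes)
```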

7.13
>>> gif[:6] == b'GIF89a'
True
>>> 

Forget the leading b and this is what happens.

>>> gifprefix = 'GIF89a'
>>> gif[:6] == gifprefix
False
>>> type(gif[:6])
<class 'bytes'>
>>> type('GIF89a')
<class 'str'>
>>> 

Well, of course.
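If the prefix does arrive as a str, it has to be converted to one side or the other before comparing. A sketch with a shortened, hypothetical stand-in for gif:

```python
gif_header = b'GIF89a\x01\x00\x01\x00'   # first 10 bytes only
prefix_text = 'GIF89a'

# bytes and str never compare equal, whatever their contents.
direct = gif_header[:6] == prefix_text

# Either encode the str ...
encoded_match = gif_header[:6] == prefix_text.encode('ascii')

# ... or decode the bytes.
decoded_match = gif_header[:6].decode('ascii') == prefix_text

# startswith avoids the slicing altogether.
starts = gif_header.startswith(b'GIF89a')

print(direct, encoded_match, decoded_match, starts)
```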

7.14

I get the feeling I'm being told to use struct, so:

>>> import struct
>>> width, height = struct.unpack('>2H', gif[6:10])
>>> width
256
>>> height
256
>>> 

I got 256 instead of 1.

Incidentally, doing it this way gives 1.

>>> width, height = struct.unpack('<2H', gif[6:10])
>>> width
1
>>> height
1

But that's little-endian, not big-endian. GIF does store these fields least significant byte first, so I suspect a mistake in the exercise.
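The two readings can be laid side by side using just the four size bytes. A minimal sketch; int.from_bytes is included only as a cross-check on struct:

```python
import struct

size_bytes = b'\x01\x00\x01\x00'   # gif[6:10] from above

# '<' = little-endian: the first byte is the least significant.
little = struct.unpack('<2H', size_bytes)

# '>' = big-endian: the first byte is the most significant.
big = struct.unpack('>2H', size_bytes)

# The same interpretation of the first two bytes without struct.
width_le = int.from_bytes(size_bytes[:2], 'little')   # 0x0001
width_be = int.from_bytes(size_bytes[:2], 'big')      # 0x0100

print(little, big, width_le, width_be)
```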

(To be continued)