CSV for Unicode, NumPy (4.8.3-4.8.4)
Handling CSV files: I have a good CSV file that was exported from my Chinese vocabulary learning database. It contains 5800 words.
>>> import csv, codecs
>>> import_file = codecs.open("/Users/xxx/Documents/workspace/NLTK Learning/text files/ChineseWords.csv",encoding="utf-8")
>>> for row in csv.reader(import_file):
...     print row[0].encode('utf-8'), row[1]
...
按−来− an4 lai2
熬夜 ao2 ye4
懊恼 ao4 nao3
巴不得 ba1 bu de
吧嗒 ba1 da1
把握 ba3 wo4
霸 ba4
白头雕 bai2 tou2 diao1
....
白菜 bai2 cai4
白天 bai2 tian
班长 ban1 zhang3
板 ban3
半导体 ban4 dao3 ti3
>>>
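The session above is Python 2. For comparison, here is a minimal Python 3 sketch of the same idea. In Python 3 the csv module works on text directly, so no codecs or manual encode step should be needed. The helper name read_word_rows and the two-column layout (word, then pinyin) are my assumptions based on the output above, and the sample rows are copied from that output rather than read from the real file.

```python
import csv
import io


def read_word_rows(stream):
    """Yield (word, pinyin) pairs from a CSV text stream.

    Assumes each row has the word in column 0 and the pinyin in column 1;
    shorter rows are skipped.
    """
    for row in csv.reader(stream):
        if len(row) >= 2:
            yield row[0], row[1]


# In-memory stand-in for the real file (rows copied from the output above).
# For a file on disk, open it as text instead:
#     open(path, encoding="utf-8", newline="")
sample = io.StringIO("熬夜,ao2 ye4\n懊恼,ao4 nao3\n把握,ba3 wo4\n", newline="")

pairs = list(read_word_rows(sample))
for word, pinyin in pairs:
    print(word, pinyin)
```

Passing newline="" when opening a real file is what the csv documentation recommends, so that line endings inside quoted fields are handled by the csv module itself.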
Using this data, I would like to build something for people learning Chinese as a second (or third) language...
NumPy:
>>> from numpy import array
>>> cube = array([ [[0,0,0], [1,1,1], [2,2,2]],
...                [[3,3,3], [4,4,4], [5,5,5]],
...                [[6,6,6], [7,7,7], [8,8,8]] ])
>>> cube[1,1,1]
4
>>> cube[2].transpose()
array([[6, 7, 8],
       [6, 7, 8],
       [6, 7, 8]])
>>> cube[2, 1:]
array([[7, 7, 7],
       [8, 8, 8]])
>>>
I don't understand what this next part means. What is LSA?
>>> from numpy import linalg
>>> a = array([[4,0], [3,-5]])
>>> u, s, vt = linalg.svd(a)
>>> u
array([[-0.4472136 , -0.89442719],
       [-0.89442719,  0.4472136 ]])
>>> s
array([ 6.32455532,  3.16227766])
>>> vt
array([[-0.70710678,  0.70710678],
       [-0.70710678, -0.70710678]])
>>>
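As far as I can tell, LSA stands for latent semantic analysis, a technique that applies singular value decomposition (SVD) to a term-document matrix and keeps only the largest singular values. The sketch below is not LSA itself. It only checks, in Python 3 syntax, what the three results of linalg.svd mean for the same small matrix: multiplying u, the diagonal matrix of s, and vt should give back a, and keeping only the first singular value gives a rank-1 approximation. The names rebuilt and rank1 are mine.

```python
import numpy as np

# Same matrix as in the session above.
a = np.array([[4.0, 0.0], [3.0, -5.0]])
u, s, vt = np.linalg.svd(a)

# u and vt are orthogonal matrices; s holds the singular values,
# sorted from largest to smallest.
rebuilt = u @ np.diag(s) @ vt
print(np.allclose(rebuilt, a))

# Keep only the largest singular value: a rank-1 approximation of a.
# LSA-style methods do the same kind of truncation on much larger matrices.
rank1 = s[0] * np.outer(u[:, 0], vt[0])
print(rank1)

# The approximation error (Frobenius norm) equals the dropped singular value.
print(np.linalg.norm(a - rank1), s[1])
```

The truncation step is the part that matters for LSA: dropping the small singular values is meant to keep the main structure of the matrix and discard the rest.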