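Notes on The Python Tutorial

Reading notes, python

After reading <Dive Into Python> and coming away thoroughly confused, I decided to reread the official tutorial. I had read part of it when I first started learning Python, but never finished because of the language barrier. Coming back to it now, it really is one of the documents best suited to beginners: it progresses step by step, is clearly structured, and is sprinkled with useful tips. I still got a lot out of it this time.
These notes skip the everyday Python syntax. Instead they fill gaps: tips and details I had not paid attention to before, together with some experiments of my own. They are a personal record, and I hope they are useful to others too.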

2. Using the Python Interpreter

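The interpreter

In the Python interpreter, Ctrl+D exits Python and Ctrl+C raises KeyboardInterrupt.
Use python -c <exp> to execute the statement <exp>; for example, python -c "print 'hello'" prints hello.
Use python -i test.py to run test.py interactively: after test.py finishes you stay in the interpreter, with the script's variables still defined.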
```
$ cat interactive.py
a = 13
b = 15
print a-b
$ python -i interactive.py
-2
>>> a+b
28
>>>
```
python -m module runs a module as a script, e.g. python -m random.
Source file encoding: add # -*- coding: utf-8 -*- at the top of the file; utf-8 can be replaced with another encoding.

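Passing arguments

The interpreter stores the script name and any additional arguments in the argv variable of the sys module. Use import sys to access it.

When no script and no arguments are given, sys.argv[0] is an empty string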

⇒  python
Python 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Aug 21 2014, 15:21:46)
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>> import sys
>>> print sys.argv
['']

When a script name is given, sys.argv[0] is that file name

⇒  cat argument.py
import sys
print sys.argv
⇒  python argument.py
['argument.py']

With -c, sys.argv[0] is set to '-c'

⇒  python -c "import sys; print sys.argv"
['-c']

With -m, sys.argv[0] is set to the full name of the located module
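The -c and -m cases can be checked from a script by launching child interpreters. This is a minimal sketch written for Python 3 (the notes themselves use Python 2.7); the helper argv_seen_by and the module name showargv are made up for the illustration.

```python
import os
import subprocess
import sys
import tempfile


def argv_seen_by(args, cwd=None):
    """Run a child interpreter with `args` and return what it printed."""
    out = subprocess.check_output([sys.executable] + args, cwd=cwd)
    return out.decode().strip()


# -c: the child sees '-c' as sys.argv[0]
argv_c = argv_seen_by(["-c", "import sys; print(sys.argv[0])"])

# -m: the child sees the path of the module file that was located
with tempfile.TemporaryDirectory() as workdir:
    # showargv is a hypothetical throwaway module
    with open(os.path.join(workdir, "showargv.py"), "w") as fh:
        fh.write("import sys\nprint(sys.argv[0])\n")
    argv_m = argv_seen_by(["-m", "showargv"], cwd=workdir)

print(argv_c)                      # -c
print(os.path.basename(argv_m))    # showargv.py
```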

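Python customization files

Creating a global customization file

First locate the site-packages directory with the method below, then create sitecustomize.py or usercustomize.py in it. At startup, Python imports sitecustomize and then usercustomize.
Note: both modules are imported by the site module on every interpreter startup, for scripts and -c commands as well as interactive sessions, unless site processing is disabled with -S (or, for usercustomize, the user site directory is disabled with -s). A file that should run only in interactive sessions is what the PYTHONSTARTUP environment variable is for.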

>>> import site
>>> site.getusersitepackages()
'/home/user/.local/lib/python3.2/site-packages'

Note: with a third-party Python distribution the .local path may not exist; use site.getsitepackages() instead to find the site-packages path.

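Creating a per-directory customization file

Sometimes we want some commands to run before the interpreter starts in the current directory. To do that, add code to the global customization file that looks for a specific file in the current directory at startup and executes it if it is found.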

# Create usercustomize.py in the site-packages directory; it checks whether the
# current directory contains .pythonrc.py and, if so, executes it
⇒  cat ~/anaconda/lib/python2.7/site-packages/usercustomize.py
import os
if os.path.isfile('.pythonrc.py'):
    execfile('.pythonrc.py')
# A per-directory customization file for testing
⇒  cat .pythonrc.py
print "this is a startup"
# Start the interpreter; the first line printed is "this is a startup", so the per-directory file was loaded
⇒  python
this is a startup
Python 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Aug 21 2014, 15:21:46)
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>>
# Press Ctrl+D to exit, go up one directory, and start python again: the line is gone, so the per-directory file only takes effect in its own directory
⇒  cd ..
⇒  python
Python 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Aug 21 2014, 15:21:46)
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>>

3. An Informal Introduction to Python

Complex number support

Complex numbers are also supported; imaginary numbers are written with a suffix of j or J. Complex numbers with a nonzero real component are written as (real+imagj), or can be created with the complex(real, imag) function.

>>> 1j * 1J
(-1+0j)
>>> 1j * complex(0,1)
(-1+0j)
>>> 3+1j*3
(3+3j)
>>> (3+1j)*3
(9+3j)
>>> a=1.5+0.5j
>>> a.real  # real part
1.5
>>> a.imag  # imaginary part
0.5

`_` holds the last printed result (in interactive mode)

>>> tax = 12.5 / 100
>>> price = 100.50
>>> price * tax
12.5625
>>> price + _
113.0625
>>> round(_, 2)
113.06

Unicode string encoding

>>> ur'Hello\u0020World !'
u'Hello World !'
>>> u"äöü".encode('utf-8')
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
u'\xe4\xf6\xfc'

List operations

>>> a
['apple', 'banana']
>>> a*3 + ['pear']
['apple', 'banana', 'apple', 'banana', 'apple', 'banana', 'pear']
>>> a = range(10)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[1:2]=[100, 200, 300]  # replace a slice
>>> a
[0, 100, 200, 300, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[:0] = [11,22]         # insert a list at the front
>>> a
[11, 22, 0, 100, 200, 300, 2, 3, 4, 5, 6, 7, 8, 9]
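Slice assignment can also delete elements, append, or empty a list in place. A short sketch of those cases, written for Python 3, where range() has to be wrapped in list(); the variable names are only illustrative.

```python
a = list(range(10))

a[2:5] = []             # assigning an empty list deletes the slice
after_delete = a[:]     # [0, 1, 5, 6, 7, 8, 9]

a[len(a):] = [10, 11]   # assigning past the end appends
after_append = a[:]     # [0, 1, 5, 6, 7, 8, 9, 10, 11]

alias = a
a[:] = []               # clears the same list object in place
print(after_delete, after_append, alias)
```

Because a[:] = [] mutates the existing object, every other name bound to the list (alias here) sees it become empty as well.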

4. More Control Flow Tools

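Avoid mutable objects as default argument values

A default value is evaluated only once, when the function is defined. The function below sets the default value of L to [], so the same list is shared by every call: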

def f(a, L=[]):
    L.append(a)
    return L

print f(1)      # [1]
print f(2)      # [1, 2]
print f(3)      # [1, 2, 3]

The code above can be rewritten as

def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
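One way to see the shared default is to inspect the function's __defaults__ attribute after a few calls. The sketch below uses Python 3 print syntax, and the names append_shared and append_fresh are invented for the comparison.

```python
def append_shared(a, L=[]):
    # The default list is created once and reused by every call.
    L.append(a)
    return L


def append_fresh(a, L=None):
    # A new list is created on each call that omits L.
    if L is None:
        L = []
    L.append(a)
    return L


append_shared(1)
append_shared(2)
print(append_shared.__defaults__)         # ([1, 2],)
print(append_fresh(1), append_fresh(2))   # [1] [2]
```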

Unpacking argument lists

>>> range(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l = [1, 10]
>>> range(*l)   # * unpacks the list; equivalent to range(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> def mySum(a,b,c):return a+b+c
...
>>> mySum(1,2,3)
6
>>> mySum(a=8,c=1,b=4)  # arguments can be passed by name, in any order
13
>>> d = {'a':1, 'b':2,'c':3}
>>> mySum(**d)          # ** unpacks d; equivalent to mySum(a=1, b=2, c=3)
6

5. Data Structures

5.1 More on Lists

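Using a list as a queue

Inserting or popping at the front of a list is slow because every other element has to shift, so collections.deque is the better tool for a queue: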

>>> from collections import deque
>>> queue = deque([1,2,3])

The methods available on queue are

In [148]: queue.<tab>
queue.append      queue.count       queue.maxlen      queue.remove
queue.appendleft  queue.extend      queue.pop         queue.reverse
queue.clear       queue.extendleft  queue.popleft     queue.rotate
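A few of the methods listed above in action, as a small sketch in Python 3 print syntax; the variable names first and recent are only illustrative.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.append(4)            # enqueue on the right: deque([1, 2, 3, 4])
first = queue.popleft()    # dequeue from the left: returns 1
queue.rotate(1)            # shift right by one: deque([4, 2, 3])

# With maxlen set, the oldest items are discarded automatically.
recent = deque(maxlen=2)
for x in range(5):
    recent.append(x)

print(first, list(queue), list(recent))   # 1 [4, 2, 3] [3, 4]
```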

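Functional programming tools

Python has three very handy built-in functions for working with lists: filter(), map(), and reduce().

filter(function, sequence)

returns the subsequence of items from sequence for which function(item) is true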

>>> def f(x): return x % 2 != 0 and x % 3 != 0
...
>>> filter(f, range(2,25))
[5, 7, 11, 13, 17, 19, 23]

map(function, sequence) calls function(item) for each of the sequence’s items and returns a list of the return values.

>>> def cube(x): return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]

More than one sequence may be passed; the function must then take as many arguments as there are sequences, for example:

>>> a = range(1,11)
>>> b = range(11,21)
>>> def add(x,y): return x+y
...
>>> map(add, a, b)
[12, 14, 16, 18, 20, 22, 24, 26, 28, 30]

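reduce(function, sequence) calls function on the first two items of the sequence, then on that result and the third item, and so on until the sequence is exhausted. An optional third argument supplies a starting value.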
```
>>> def product(x,y): return x*y
...
>>> reduce(product, range(1,11))    # equivalent to (((1*2)*3)...*10)
3628800
>>> reduce(product, range(1,11),0)  # equivalent to (((0*1)*2)...*10)
0
```
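For readers on Python 3, the same three tools behave slightly differently: filter() and map() return lazy iterators, and reduce() lives in functools. A sketch of the examples above under that assumption; the variable names are mine.

```python
from functools import reduce
import operator

# filter() and map() return iterators in Python 3, so wrap them in list()
survivors = list(filter(lambda x: x % 2 != 0 and x % 3 != 0, range(2, 25)))
cubes = list(map(lambda x: x ** 3, range(1, 11)))
sums = list(map(operator.add, range(1, 11), range(11, 21)))

# 1 is the neutral starting value for a product; starting from 0 gives 0
factorial_10 = reduce(operator.mul, range(1, 11), 1)

print(survivors)      # [5, 7, 11, 13, 17, 19, 23]
print(factorial_10)   # 3628800
```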

list/set/dictionary comprehension

# List comprehension: the even numbers from 1 to 10
>>> [n for n in range(1,11) if n%2==0]
[2, 4, 6, 8, 10]
# Same, but 1 to 10 is repeated twice, so the result is repeated too
>>> [n for n in range(1,11)*2 if n%2==0]
[2, 4, 6, 8, 10, 2, 4, 6, 8, 10]
# Changing [] to {} makes it a set comprehension; the result is a set, with duplicates removed
>>> {n for n in range(1,11)*2 if n%2==0}
set([8, 2, 4, 10, 6])
# Dictionary comprehension: changing the leading n to n:n**2 builds a dict with n as key and n squared as value
>>> {n:n**2 for n in range(1,11) if n%2==0}
{8: 64, 2: 4, 4: 16, 10: 100, 6: 36}

Looping Techniques

Use enumerate() to get the index while looping

>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print i, v
...
0 tic
1 tac
2 toe

Use zip() to loop over several lists of the same length at once

>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your {0}?  It is {1}.'.format(q, a)
...
What is your name?  It is lancelot.
What is your quest?  It is the holy grail.
What is your favorite color?  It is blue.

Note: this also works with more than two lists

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> c = [7,8,9]
>>> zip(a,b,c)
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
>>> for i,j,k in zip(a,b,c):
...     print i,j,k
...
1 4 7
2 5 8
3 6 9
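When the lists differ in length, zip() silently stops at the shortest one. The sketch below shows that and one way to pad instead, using itertools.zip_longest. It assumes Python 3; in Python 2 the function is named izip_longest.

```python
from itertools import zip_longest

a = [1, 2, 3]
b = ['x', 'y']

truncated = list(zip(a, b))                       # stops after two pairs
padded = list(zip_longest(a, b, fillvalue=None))  # pads the short list

print(truncated)   # [(1, 'x'), (2, 'y')]
print(padded)      # [(1, 'x'), (2, 'y'), (3, None)]
```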

Use d.keys(), d.values(), and d.iteritems() to loop over keys, values, and key-value pairs

>>> d
{8: 64, 2: 4, 4: 16, 10: 100, 6: 36}
>>> d.keys()
[8, 2, 4, 10, 6]
>>> d.values()
[64, 4, 16, 100, 36]
>>> d.iteritems()
<dictionary-itemiterator object at 0x1005e40a8>
>>> list(d.iteritems())
[(8, 64), (2, 4), (4, 16), (10, 100), (6, 36)]
>>> for k, v in d.iteritems():
...     print k, "'s square is ", v
...
8 's square is  64
2 's square is  4
4 's square is  16
10 's square is  100
6 's square is  36

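When modifying a list while looping over it, loop over a copy so the changes do not affect the iteration. l[:] creates a shallow copy of the original list, which is all that is needed here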

>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']
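A slice copy is a new list object whose elements are the same objects as in the original. The sketch below repeats the loop and then contrasts the slice copy with copy.deepcopy on a nested list; it uses Python 3 print syntax, and the names snapshot, shallow, and deep are illustrative.

```python
import copy

words = ['cat', 'window', 'defenestrate']
snapshot = words[:]            # new list object, same elements
for w in snapshot:
    if len(w) > 6:
        words.insert(0, w)     # mutating words does not disturb the loop

nested = [[1, 2], [3]]
shallow = nested[:]            # inner lists are shared with nested
deep = copy.deepcopy(nested)   # inner lists are copied as well
nested[0].append(99)

print(words)     # ['defenestrate', 'cat', 'window', 'defenestrate']
print(shallow)   # [[1, 2, 99], [3]]
print(deep)      # [[1, 2], [3]]
```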

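Comparing sequences

Lists, tuples, and strings are compared lexicographically: elements are compared pairwise, numbers by value and strings by character order, until a difference decides the result. If one sequence runs out before a difference is found, the shorter one is the smaller.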

(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2)                 < (1, 2, -1)
(1, 2, 3)             == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
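The same ordering is what sorted() applies when it sorts sequences, which is where these rules usually matter in practice. A small sketch in Python 3 print syntax with made-up sample data:

```python
# Tuples sort element by element; a prefix sorts before its extensions.
tuples = sorted([(1, 2, -1), (1, 2), (1, 1, 9), (0,)])

# Strings sort by character order, so uppercase 'ABC' precedes 'C'.
names = sorted(['Python', 'C', 'Pascal', 'ABC'])

print(tuples)   # [(0,), (1, 1, 9), (1, 2), (1, 2, -1)]
print(names)    # ['ABC', 'C', 'Pascal', 'Python']
```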

6. Modules

6.1. More on Modules

Executing modules as scripts

⇒  more add.py
def add(a, b):
    return a+b

if __name__ == '__main__':
    import sys
    print sys.argv
    x = sys.argv[1] 
    y = sys.argv[2]
    print add(x,y)

⇒  python add.py 1 2
['add.py', '1', '2']    # 【1】
12                      # 【2】

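【1】: The first value in sys.argv is the script name; the rest are the arguments we passed, all as strings.
【2】: We wanted the sum of the two arguments but got 12. The arguments 1 and 2 were treated as strings, so adding them produced "12". Convert x and y with int() when reading the arguments, i.e. x = int(sys.argv[1]).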
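A corrected version of add.py might look like the sketch below. It uses Python 3 print syntax, and the main(argv) wrapper is my own addition so the conversion can be exercised without a real command line.

```python
def add(a, b):
    return a + b


def main(argv):
    # argv[0] is the script name; the arguments arrive as strings
    x = int(argv[1])
    y = int(argv[2])
    return add(x, y)


# Simulates: python add.py 1 2
print(main(['add.py', '1', '2']))   # 3
```

In the real script, the `if __name__ == '__main__':` block would call main(sys.argv) after importing sys.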

"Compiled" Python files

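When a module is imported, Python automatically compiles the .py file to bytecode, a .pyc file. The modification time of the .py file is recorded in the .pyc file; if the two do not match, the .py file has been modified and the .pyc file is ignored.
The compiled .pyc file is written to the directory containing the .py file. It is not an error if this write fails; a .pyc file that was not written completely is recognized as invalid and ignored, so the program is unaffected.
.pyc files are platform independent.
With the -O flag, the compiler generates optimized code: all assert statements are removed and the output file has the suffix .pyo.
With the -OO flag, the compiler optimizes further; currently this only removes __doc__ strings, giving more compact output. Programs that rely on docstrings may break, so use it only if you know what you are doing.
Compiled files do not run any faster; they only load faster.
When a script is run by naming it on the command line, its bytecode is never written to a .pyc or .pyo file. Moving most of a large script's code into a module and importing that module can therefore reduce startup time, because the module is compiled to a .pyc file, saved, and loaded directly next time.
A .pyc or .pyo file can be used without the corresponding .py file. This lets you distribute code in a form that is moderately hard to reverse engineer, though it does not prevent decompilation.
The compileall module can create .pyc files for all modules in a directory.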
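Byte-compilation can also be triggered from code. The sketch below uses py_compile for a single file and compileall for a directory, and assumes Python 3, where compileall writes into a __pycache__ subdirectory instead of next to the source; hello.py is a throwaway file created for the demonstration.

```python
import compileall
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "hello.py")   # throwaway module for the demo
    with open(src, "w") as fh:
        fh.write("print('hello')\n")

    # Compile one file to an explicit target path
    pyc_path = py_compile.compile(src, cfile=os.path.join(d, "hello.pyc"))
    pyc_exists = os.path.isfile(pyc_path)

    # Compile every module under the directory; quiet=1 prints only errors
    all_ok = compileall.compile_dir(d, quiet=1)

print(pyc_exists, bool(all_ok))
```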

Standard Modules

>>> import sys
>>> sys.ps1
'>>> '
>>> sys.ps2
'... '
>>> sys.ps1 = 'C> '
C> print 'Yuck!'
Yuck!
C>

Packages

First, some setup: create a myPackage directory in the current directory containing two files, calc.py and str_calc.py, with the following contents:

⇒  tree
.
└── myPackage
    ├── calc.py
    └── str_calc.py
⇒  cat myPackage/calc.py
def add(x,y):
    return x+y

def product(x,y):
    return x*y
⇒  cat myPackage/str_calc.py
def str_join(s1, s2):
    return s1+s2

def str_head(s):
    return s[0]

Now open Python in the current directory and try to import the package:

⇒  python -c "import myPackage"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named myPackage

Because the myPackage directory contains no __init__.py file, Python does not recognize it as a package.

⇒  touch myPackage/__init__.py
⇒  python -c "import myPackage"

Once the file exists (even if it is empty), the import no longer fails. Next, check whether the functions inside can be called

⇒  python -c "import myPackage.calc;print myPackage.calc.add(1,2)"
3

The contents of __init__.py are executed at the import statement. To verify this, add a line of code to __init__.py and rerun the previous command.

⇒  more myPackage/__init__.py
print "this is init"
⇒  python -c "import myPackage.calc;print myPackage.calc.add(1,2)"
this is init
3

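What gets imported is a module, that is, a .py file. Importing just the directory name only brings in what __init__.py defines. Modify its contents once more to verify this.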

⇒  more myPackage/__init__.py
my_init = "This is in the __init__.py."
⇒  python
>>> import myPackage        # importing the package name only brings in what __init__.py defines
>>> myPackage.calc.add(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'calc'
>>> myPackage.my_init       # variables in __init__.py are accessible through the package name
'This is in the __init__.py.'
>>> import myPackage.calc   # import the module explicitly
>>> myPackage.calc.add(1,2) # its functions are now accessible
3
>>> import myPackage.calc.add   # trying to import a function this way fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named add

Besides importing the package directly, you can also use from package import module or from package.module import function (or class).

>>> from myPackage import calc
>>> calc.add(1,2)
3
>>> from myPackage.calc import add
>>> add(1, 2)
3

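The difference is that a plain import requires the full dotted path to reach a function, whereas this form lets you use the name given after import directly.
You can also use from package import * to pull in the modules of a package, but only if __init__.py defines an __all__ list. When __all__ is defined, only the names it lists are imported (listed submodules are loaded automatically), so a variable such as my_init is left out unless it is listed too. When __all__ is not defined, the statement imports only the public names defined in __init__.py, plus any submodules that were already imported. from package.module import * imports all public variables, functions, and classes from module, or only those named in the module's own __all__ if it defines one.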

⇒  python
>>> from myPackage import *
>>> my_init
'This is in the __init__.py.'
>>> calc.add(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'calc' is not defined
>>> 
# Press Ctrl+D to exit Python, then edit __init__.py
⇒  more myPackage/__init__.py
my_init = "This is in the __init__.py."
__all__ = ["calc"]
⇒  python
>>> from myPackage import *
>>> calc.add(1,2)   # calc is listed in __all__, so it is accessible
3
>>> str_calc.str_join("a","b")   # str_calc is not listed in __all__, so it is not
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'str_calc' is not defined
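The effect of __all__ on a star import can be reproduced in one script by building a throwaway package on disk. This is a sketch for Python 3; the package name demo_pkg_all and its contents are invented to mirror the myPackage layout above.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg_all")   # hypothetical package name
os.mkdir(pkg)
files = {
    "__init__.py": "my_init = 'from init'\n__all__ = ['calc']\n",
    "calc.py": "def add(x, y):\n    return x + y\n",
    "str_calc.py": "def str_join(s1, s2):\n    return s1 + s2\n",
}
for name, body in files.items():
    with open(os.path.join(pkg, name), "w") as fh:
        fh.write(body)

sys.path.insert(0, root)   # make the temporary package importable

# Run the star import in a scratch namespace and list what it brought in
namespace = {}
exec("from demo_pkg_all import *", namespace)
imported = sorted(n for n in namespace if not n.startswith("__"))

print(imported)                        # ['calc']
print(namespace["calc"].add(1, 2))     # 3
```

Only calc appears: with __all__ defined, neither str_calc nor the my_init variable is imported.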

7. Input and Output

8. Errors and Exceptions

A complete exception-handling example:

import sys
def divide(x, y):
    try:    #【1】
        print "divide({}, {})".format(x, y)
        result = x / y
    except ZeroDivisionError, ex:   # 【2】
        print ZeroDivisionError, ':', ex
    except: # 【3】
        print sys.exc_info()
    else:
        print "result is", result
    finally:
        print "executing finally clause"

Test results:

>>> divide(2,1)
divide(2, 1)
result is 2
executing finally clause
>>> divide(2,0)
divide(2, 0)
<type 'exceptions.ZeroDivisionError'> : integer division or modulo by zero
executing finally clause
>>> divide("2", 9)
divide(2, 9)
(<type 'exceptions.TypeError'>, TypeError("unsupported operand type(s) for /: 'str' and 'int'",), <traceback object at 0x1098b6248>)
executing finally clause

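Notes:
【1】: On entering the function, the statements in try run first. If an error occurs, control jumps to the matching except clause; if none occurs, the else clause runs after try completes. Either way, the finally clause runs last.
【2】: An exception whose name is known and needs special handling can be named explicitly, like ZeroDivisionError here. ex receives the extra error information; after divide(2,0) it holds integer division or modulo by zero. This clause can also be written in other ways:

except name: catches the exception without binding it to ex
except (name1, name2): catches several exception types
except Exception as e: binds the caught exception to the name e

【3】: An except clause with no type catches every exception not matched by the earlier clauses; sys.exc_info() from the sys module returns the details of the error.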
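The control flow described in these notes can be traced by collecting events instead of printing them. The sketch below uses the `as` form of except, which is the only form Python 3 accepts, and substitutes `except Exception` for the bare except; the events list is my own device for making the order visible.

```python
def divide(x, y):
    events = []
    try:
        result = x / y
    except ZeroDivisionError as ex:
        events.append("zero: %s" % ex)
    except Exception as ex:          # stands in for the bare except
        events.append("other: %s" % type(ex).__name__)
    else:
        events.append("result: %s" % result)
    finally:
        events.append("finally")     # runs on every path
    return events


print(divide(2, 1))     # ['result: 2.0', 'finally'] under Python 3
print(divide(2, 0))
print(divide("2", 9))   # ['other: TypeError', 'finally']
```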

9. Classes

10. Brief Tour of the Standard Library

10.1. Operating System Interface

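# operating system interface
import os
os.getcwd()                 # get the current working directory
os.chdir('/path/to/dir')    # change the working directory
os.system("echo 'hello'")   # run a shell command

# higher-level file and directory management
import shutil
shutil.copyfile('data.db', 'archive.db')    # copy a file
shutil.move('source', 'des')    # move a file or directory

# wildcard (glob) file search
import glob
glob.glob('*.py')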
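The calls above assume that files such as data.db already exist. A self-contained variant that can be run anywhere works inside a temporary directory; the sketch is written for Python 3, and the file names are placeholders.

```python
import glob
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "data.db")
    with open(src, "w") as fh:
        fh.write("rows")

    archive = os.path.join(d, "archive.db")
    shutil.copyfile(src, archive)                      # copy the file
    shutil.move(archive, os.path.join(d, "backup.db")) # rename/move the copy

    # Collect the matching file names before the directory is removed
    names = sorted(os.path.basename(p)
                   for p in glob.glob(os.path.join(d, "*.db")))

print(names)   # ['backup.db', 'data.db']
```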

13. Interactive Input Editing and History Substitution

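Interpreter keyboard shortcuts

In the list below, C stands for Control

Cursor movement and editing
- C-A: move the cursor to the beginning of the line
- C-E: move the cursor to the end of the line
- C-B: move the cursor one position to the left
- C-F: move the cursor one position to the right
- Backspace: delete the character to the left of the cursor
- C-D: delete the character to the right of the cursor
- C-K: kill (delete) everything to the right of the cursor
- C-Y: yank (paste) the last killed text at the cursor position
- C-underscore: undo the last change, like Control+Z elsewhere
History
- C-P: previous command, same as the Up arrow key
- C-N: next command, same as the Down arrow key
- C-R: reverse search through the command history
- C-S: forward search through the command history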
