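Translated over the seven-day National Day holiday from http://cs231n.github.io/python-numpy-tutorial/
This is the Python and numpy primer from the course CS231n Convolutional Neural Networks for Visual Recognition.
This tutorial was originally contributed by Justin Johnson.
We will use the Python programming language for all assignments in this course. Python is a great general-purpose programming language on its own, and with the help of a few popular libraries (numpy, scipy, matplotlib) it becomes a powerful environment for scientific computing.
We expect that most of you will have some experience with Python and numpy; for the rest of you, this tutorial will serve as a crash course on Python for scientific computing.
Some of you may have previous knowledge of Matlab, in which case we also recommend the numpy for Matlab users page.
You can also find an IPython notebook version of this tutorial, created by Volodymyr Kuleshov and Isaac Caswell for this course.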
Python
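Python is a high-level, dynamically typed, multiparadigm programming language. Python code is often said to be almost like pseudocode, since it allows you to express very powerful ideas in very few lines of code while being very readable. As an example, here is an implementation of the classic quicksort algorithm in Python: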
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1]))
# Prints "[1, 1, 2, 3, 6, 8, 10]"
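Python versions
At the time of writing there were two supported versions of Python, 2.7 and 3.5. Somewhat confusingly, Python 3.0 introduced many backwards-incompatible changes to the language, so code written for 2.7 may not work under 3.5 and vice versa. For this class all code will use Python 3.5 or later.
You can check your Python version at the command line by running python --version.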
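The same check can be made from inside a script. The following is a minimal sketch using only the standard sys module; the (3, 5) threshold is simply the version stated above.

```python
import sys

# sys.version_info behaves like a tuple: (major, minor, micro, ...)
major, minor = sys.version_info[:2]
print('Running Python %d.%d' % (major, minor))

# Tuples compare element by element, so this tests for "3.5 or later".
is_supported = sys.version_info >= (3, 5)
print(is_supported)
```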
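Basic data types
Like most languages, Python has a number of basic types including integers, floats, booleans, and strings. These data types behave in ways that are familiar from other programming languages.
Numbers: Integers and floats work as you would expect from other languages: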
x = 3
print(type(x)) # Prints "<class 'int'>"
print(x) # Prints "3"
print(x + 1) # Addition; prints "4"
print(x - 1) # Subtraction; prints "2"
print(x * 2) # Multiplication; prints "6"
print(x ** 2) # Exponentiation; prints "9"
x += 1
print(x) # Prints "4"
x *= 2
print(x) # Prints "8"
y = 2.5
print(type(y)) # Prints "<class 'float'>"
print(y, y + 1, y * 2, y ** 2) # Prints "2.5 3.5 5.0 6.25"
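Note that unlike many languages, Python does not have unary increment (x++) or decrement (x--) operators.
Python also has built-in types for complex numbers; you can find all of the details in the documentation.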
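As a brief illustration of the complex type mentioned above, here is a small sketch; the value 3 + 4j is chosen only because its magnitude is a round number.

```python
z = 3 + 4j                  # The suffix j marks the imaginary part
print(type(z))              # Prints "<class 'complex'>"
print(z.real, z.imag)       # Prints "3.0 4.0"
print(abs(z))               # Magnitude; prints "5.0"
print(z.conjugate())        # Prints "(3-4j)"
print(z * 1j)               # Multiplying by j rotates by 90 degrees; prints "(-4+3j)"
```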
Booleans: Python implements all of the usual operators for Boolean logic, but uses English words (and, or, etc.) rather than symbols (&&, ||, etc.):
t = True
f = False
print(type(t)) # Prints "<class 'bool'>"
print(t and f) # Logical AND; prints "False"
print(t or f) # Logical OR; prints "True"
print(not t) # Logical NOT; prints "False"
print(t != f) # Logical XOR; prints "True"
Strings: Python has great support for strings:
hello = 'hello' # String literals can use single quotes
world = "world" # or double quotes; it does not matter.
print(hello) # Prints "hello"
print(len(hello)) # String length; prints "5"
hw = hello + ' ' + world # String concatenation
print(hw) # prints "hello world"
hw12 = '%s %s %d' % (hello, world, 12) # sprintf style string formatting
print(hw12) # prints "hello world 12"
String objects have a bunch of useful methods; for example:
s = "hello"
print(s.capitalize()) # Capitalize a string; prints "Hello"
print(s.upper()) # Convert a string to uppercase; prints "HELLO"
print(s.rjust(7)) # Right-justify a string, padding with spaces; prints "  hello"
print(s.center(7)) # Center a string, padding with spaces; prints " hello "
print(s.replace('l', '(ell)')) # Replace all instances of one substring with another;
# prints "he(ell)(ell)o"
print(' world '.strip()) # Strip leading and trailing whitespace; prints "world"
You can find a list of all string methods in the documentation.
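Besides the sprintf-style % operator shown earlier, Python offers two other common ways to format strings. The sketch below assumes Python 3.6 or later for the f-string line; str.format works on every Python 3 version.

```python
hello = 'hello'
world = 'world'

# str.format fills the {} placeholders in order
s1 = '{} {} {}'.format(hello, world, 12)
print(s1)                   # Prints "hello world 12"

# f-strings (Python 3.6+) embed expressions directly in the literal;
# the :04d format spec pads the integer with zeros to width 4
s2 = f'{hello} {world} {12:04d}'
print(s2)                   # Prints "hello world 0012"
```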
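Containers
Python includes several built-in container types: lists, dictionaries, sets, and tuples.
Lists
A list is the Python equivalent of an array, but is resizeable and can contain elements of different types: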
xs = [3, 1, 2] # Create a list
print(xs, xs[2]) # Prints "[3, 1, 2] 2"
print(xs[-1]) # Negative indices count from the end of the list; prints "2"
xs[2] = 'foo' # Lists can contain elements of different types
print(xs) # Prints "[3, 1, 'foo']"
xs.append('bar') # Add a new element to the end of the list
print(xs) # Prints "[3, 1, 'foo', 'bar']"
x = xs.pop() # Remove and return the last element of the list
print(x, xs) # Prints "bar [3, 1, 'foo']"
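You can find all the details about lists in the documentation.
Slicing: In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as slicing:
nums = list(range(5)) # range is a built-in that yields the integers 0..4; list() collects them into a list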
print(nums) # Prints "[0, 1, 2, 3, 4]"
print(nums[2:4]) # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:]) # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2]) # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:]) # Get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1]) # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9] # Assign a new sublist to a slice
print(nums) # Prints "[0, 1, 8, 9, 4]"
We will see slicing again in the context of numpy arrays.
Loops: You can loop over the elements of a list like this:
animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print(animal)
# Prints "cat", "dog", "monkey", each on its own line.
If you want access to the index of each element within the body of a loop, use the built-in enumerate function:
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line
List comprehensions:
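When programming, we frequently want to transform one type of data into another. As a simple example, consider the following code that computes square numbers: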
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
    squares.append(x ** 2)
print(squares) # Prints [0, 1, 4, 9, 16]
You can make this code simpler using a list comprehension:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares) # Prints [0, 1, 4, 9, 16]
List comprehensions can also contain conditions:
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares) # Prints "[0, 4, 16]"
Dictionaries
A dictionary stores (key, value) pairs, similar to a Map in Java or an object in Javascript. You can use it like this:
d = {'cat': 'cute', 'dog': 'furry'} # Create a new dictionary with some data
print(d['cat']) # Get an entry from a dictionary; prints "cute"
print('cat' in d) # Check if a dictionary has a given key; prints "True"
d['fish'] = 'wet' # Set an entry in a dictionary
print(d['fish']) # Prints "wet"
# print(d['monkey']) # KeyError: 'monkey' not a key of d
print(d.get('monkey', 'N/A')) # Get an element with a default; prints "N/A"
print(d.get('fish', 'N/A')) # Get an element with a default; prints "wet"
del d['fish'] # Remove an element from a dictionary
print(d.get('fish', 'N/A')) # "fish" is no longer a key; prints "N/A"
You can find all you need to know about dictionaries in the documentation.
Loops: It is easy to iterate over the keys in a dictionary:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal in d:
    legs = d[animal]
    print('A %s has %d legs' % (animal, legs))
# Prints "A person has 2 legs", "A cat has 4 legs", "A spider has 8 legs"
If you want access to keys and their corresponding values, use the items method:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal, legs in d.items():
    print('A %s has %d legs' % (animal, legs))
# Prints "A person has 2 legs", "A cat has 4 legs", "A spider has 8 legs"
Dictionary comprehensions:
These are similar to list comprehensions, but allow you to easily construct dictionaries. For example:
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square) # Prints "{0: 0, 2: 4, 4: 16}"
Sets
A set is an unordered collection of distinct elements. As a simple example, consider the following:
animals = {'cat', 'dog'}
print('cat' in animals) # Check if an element is in a set; prints "True"
print('fish' in animals) # prints "False"
animals.add('fish') # Add an element to a set
print('fish' in animals) # Prints "True"
print(len(animals)) # Number of elements in a set; prints "3"
animals.add('cat') # Adding an element that is already in the set does nothing
print(len(animals)) # Prints "3"
animals.remove('cat') # Remove an element from a set
print(len(animals)) # Prints "2"
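As usual, everything you want to know about sets can be found in the documentation.
Loops: Iterating over a set has the same syntax as iterating over a list; however, since sets are unordered, you cannot make assumptions about the order in which you visit the elements of the set: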
animals = {'cat', 'dog', 'fish'}
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Might print "#1: fish", "#2: dog", "#3: cat"; the order is not guaranteed
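When a reproducible order matters, one common approach is to sort the set before iterating. This is a minimal sketch; the name ordered is introduced here only for illustration.

```python
animals = {'cat', 'dog', 'fish'}

# sorted() returns a new list, so the iteration order is always alphabetical
ordered = sorted(animals)
for idx, animal in enumerate(ordered):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: fish", each on its own line
```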
Set comprehensions:
Like lists and dictionaries, we can easily construct sets using set comprehensions:
from math import sqrt
nums = {int(sqrt(x)) for x in range(30)}
print(nums) # Prints "{0, 1, 2, 3, 4, 5}"
Tuples
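A tuple is an (immutable) ordered list of values.
A tuple is in many ways similar to a list; one of the most important differences is that tuples can be used as keys in dictionaries and as elements of sets, while lists cannot. Here is a trivial example: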
d = {(x, x + 1): x for x in range(10)} # Create a dictionary with tuple keys
t = (5, 6) # Create a tuple
print(type(t)) # Prints "<class 'tuple'>"
print(d[t]) # Prints "5"
print(d[(1, 2)]) # Prints "1"
The documentation has more information about tuples.
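The difference between tuples and lists as dictionary keys can be seen directly. The sketch below catches the TypeError that Python raises for an unhashable key; the flag list_key_ok is a hypothetical name used only to record the outcome.

```python
d = {}
d[(1, 2)] = 'tuple key works'     # Tuples are immutable and hashable

try:
    d[[1, 2]] = 'list key fails'  # Lists are mutable and therefore unhashable
    list_key_ok = True
except TypeError as err:
    list_key_ok = False
    print('TypeError:', err)

print(list_key_ok)                # Prints "False"
print((1, 2) in d)                # Prints "True"
```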
Functions
Python functions are defined using the def keyword. For example:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))
# Prints "negative", "zero", "positive"
We will often define functions to take optional keyword arguments, like this:
def hello(name, loud=False):
    if loud:
        print('HELLO, %s!' % name.upper())
    else:
        print('Hello, %s' % name)

hello('Bob') # Prints "Hello, Bob"
hello('Fred', loud=True) # Prints "HELLO, FRED!"
There is a lot more information about Python functions in the documentation.
Classes
The syntax for defining classes in Python is straightforward:
class Greeter(object):

    # Constructor
    def __init__(self, name):
        self.name = name # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO, %s!' % self.name.upper())
        else:
            print('Hello, %s' % self.name)

g = Greeter('Fred') # Construct an instance of the Greeter class
g.greet() # Call an instance method; prints "Hello, Fred"
g.greet(loud=True) # Call an instance method; prints "HELLO, FRED!"
You can read a lot more about Python classes in the documentation.
Numpy
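Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. If you are already familiar with MATLAB, you might find this tutorial useful to get started with Numpy.
Arrays
A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
We can initialize numpy arrays from nested Python lists, and access elements using square brackets: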
import numpy as np
a = np.array([1, 2, 3]) # Create a rank 1 array
print(type(a)) # Prints "<class 'numpy.ndarray'>"
print(a.shape) # Prints "(3,)"
print(a[0], a[1], a[2]) # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print(a) # Prints "[5 2 3]"
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
print(b.shape) # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0]) # Prints "1 2 4"
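The rank and shape described above can be read straight from an array's attributes. A short sketch using the standard ndim, shape and size attributes of numpy arrays:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])
print(b.ndim)                   # Rank (number of dimensions); prints "2"
print(b.shape)                  # Size along each dimension; prints "(2, 3)"
print(b.size)                   # Total number of elements; prints "6"
print(len(b.shape) == b.ndim)   # The shape tuple has one entry per dimension; prints "True"
```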
Numpy also provides many functions to create arrays:
import numpy as np
a = np.zeros((2,2)) # Create an array of all zeros
print(a) # Prints "[[ 0. 0.]
# [ 0. 0.]]"
b = np.ones((1,2)) # Create an array of all ones
print(b) # Prints "[[ 1. 1.]]"
c = np.full((2,2), 7) # Create a constant array; the dtype follows the fill value
print(c) # Prints "[[7 7]
         #          [7 7]]"
d = np.eye(2) # Create a 2x2 identity matrix
print(d) # Prints "[[ 1. 0.]
# [ 0. 1.]]"
e = np.random.random((2,2)) # Create an array filled with random values
print(e) # Might print "[[ 0.91940167 0.08143941]
# [ 0.68744134 0.87236687]]"
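You can read about other methods of array creation in the documentation.
Array indexing
Numpy offers several ways to index into arrays.
Slicing: Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you must specify a slice for each dimension of the array: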
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
# [6 7]]
b = a[:2, 1:3]
# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1]) # Prints "2"
b[0, 0] = 77 # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) # Prints "77"
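You can also mix integer indexing with slice indexing. However, doing so will yield an array of lower rank than the original array. Note that this is quite different from the way that MATLAB handles array slicing: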
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :] # Rank 1 view of the second row of a
row_r2 = a[1:2, :] # Rank 2 view of the second row of a
print(row_r1, row_r1.shape) # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape) # Prints "[[5 6 7 8]] (1, 4)"
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape) # Prints "[ 2 6 10] (3,)"
print(col_r2, col_r2.shape) # Prints "[[ 2]
# [ 6]
# [10]] (3, 1)"
Integer array indexing:
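When you index into numpy arrays using slicing, the resulting array view will always be a subarray of the original array. In contrast, integer array indexing allows you to construct arbitrary arrays using the data from another array. Here is an example: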
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
# An example of integer array indexing.
# The returned array will have shape (3,).
print(a[[0, 1, 2], [0, 1, 0]]) # Prints "[1 4 5]"
# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]])) # Prints "[1 4 5]"
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]]) # Prints "[2 2]"
# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]])) # Prints "[2 2]"
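One practical consequence of this distinction is that a slice is a view into the original data, while integer array indexing returns a copy. The following sketch shows the difference; the names view and picked are introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

view = a[:2, :]                     # Slicing returns a view into a
picked = a[[0, 1, 2], [0, 1, 0]]    # Integer array indexing returns a copy

view[0, 0] = 100                    # Changes a as well
picked[1] = -1                      # Does not change a

print(a[0, 0])                      # Prints "100"
print(a[1, 1])                      # Prints "4"
print(picked)                       # Prints "[ 1 -1  5]"
```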
One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:
import numpy as np
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
print(a) # Prints "[[ 1  2  3]
         #          [ 4  5  6]
         #          [ 7  8  9]
         #          [10 11 12]]"
# Create an array of indices
b = np.array([0, 2, 0, 1])
# Select one element from each row of a using the indices in b
print(a[np.arange(4), b]) # Prints "[ 1 6 7 11]"
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10
print(a) # Prints "[[11  2  3]
         #          [ 4  5 16]
         #          [17  8  9]
         #          [10 21 12]]"
Boolean array indexing:
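Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition. Here is an example: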
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2) # Find the elements of a that are bigger than 2;
# this returns a numpy array of Booleans of the same
# shape as a, where each slot of bool_idx tells
# whether that element of a is > 2.
print(bool_idx) # Prints "[[False False]
# [ True True]
# [ True True]]"
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx]) # Prints "[3 4 5 6]"
# We can do all of the above in a single concise statement:
print(a[a > 2]) # Prints "[3 4 5 6]"
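Boolean indexing can also appear on the left-hand side of an assignment, which modifies the selected elements in place. A minimal sketch using the same array as above:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
a[a > 2] = 0          # Set every element greater than 2 to zero, in place
print(a)              # Prints "[[1 2]
                      #          [0 0]
                      #          [0 0]]"
```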
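For brevity we have left out a lot of details about numpy array indexing; if you want to know more you should read the documentation.
Datatypes
Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example: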
import numpy as np
x = np.array([1, 2]) # Let numpy choose the datatype
print(x.dtype) # Prints "int64" on most platforms (the default integer type is platform dependent)
x = np.array([1.0, 2.0]) # Let numpy choose the datatype
print(x.dtype) # Prints "float64"
x = np.array([1, 2], dtype=np.int64) # Force a particular datatype
print(x.dtype) # Prints "int64"
You can read all about numpy datatypes in the documentation.
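An existing array can also be converted to another datatype with the astype method. A short sketch; note that converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

x = np.array([1.7, 2.2, -0.5])       # float64 by default
y = x.astype(np.int32)               # Conversion to int truncates toward zero
print(y, y.dtype)                    # Prints "[1 2 0] int32"

z = np.array([1, 2], dtype=np.float32)
print(z.dtype, z.itemsize)           # Prints "float32 4" (4 bytes per element)
```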
Array math
Basic mathematical functions operate elementwise on arrays, and are available both as operator overloads and as functions in the numpy module:
import numpy as np
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
# Elementwise sum; both produce the array
# [[ 6.0 8.0]
# [10.0 12.0]]
print(x + y)
print(np.add(x, y))
# Elementwise difference; both produce the array
# [[-4.0 -4.0]
# [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))
# Elementwise product; both produce the array
# [[ 5.0 12.0]
# [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))
# Elementwise division; both produce the array
# [[ 0.2 0.33333333]
# [ 0.42857143 0.5 ]]
print(x / y)
print(np.divide(x, y))
# Elementwise square root; produces the array
# [[ 1. 1.41421356]
# [ 1.73205081 2. ]]
print(np.sqrt(x))
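Note that unlike MATLAB, * is elementwise multiplication, not matrix multiplication. We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. dot is available both as a function in the numpy module and as an instance method of array objects: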
import numpy as np
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
v = np.array([9,10])
w = np.array([11, 12])
# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))
# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
# [43 50]]
print(x.dot(y))
print(np.dot(x, y))
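Since Python 3.5 the @ operator is also available for matrix multiplication, and for the 1-D and 2-D arrays used here it gives the same results as dot. A brief sketch with the same values as above:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

print(x @ y)      # Same as x.dot(y); prints "[[19 22]
                  #                            [43 50]]"
print(x @ v)      # Same as x.dot(v); prints "[29 67]"
print(np.array_equal(x @ y, np.dot(x, y)))   # Prints "True"
```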
Numpy provides many useful functions for performing computations on arrays; one of the most useful is sum:
import numpy as np
x = np.array([[1,2],[3,4]])
print(np.sum(x)) # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0)) # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1)) # Compute sum of each row; prints "[3 7]"
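You can find the full list of mathematical functions provided by numpy in the documentation.
Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the T attribute of an array object: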
import numpy as np
x = np.array([[1,2], [3,4]])
print(x) # Prints "[[1 2]
# [3 4]]"
print(x.T) # Prints "[[1 3]
# [2 4]]"
# Note that taking the transpose of a rank 1 array does nothing:
v = np.array([1,2,3])
print(v) # Prints "[1 2 3]"
print(v.T) # Prints "[1 2 3]"
Numpy provides many more functions for manipulating arrays; you can see the full list in the documentation.
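Because transposing a rank 1 array does nothing, a common way to obtain a column vector is to give the array a second dimension first. The sketch below shows two equivalent options, reshape and np.newaxis; the variable names are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])          # Rank 1, shape (3,)
col = v.reshape(3, 1)            # Rank 2 column vector, shape (3, 1)
col2 = v[:, np.newaxis]          # Same result using np.newaxis
row = col.T                      # Transposing a rank 2 array works: shape (1, 3)

print(col.shape, col2.shape, row.shape)   # Prints "(3, 1) (3, 1) (1, 3)"
```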
Broadcasting
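Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.
For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this: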
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x) # Create an empty matrix with the same shape as x
# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v
# Now y is the following
# [[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]
print(y)
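This works; however, when the matrix x is very large, computing an explicit loop in Python could be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically, then performing elementwise summation of x and vv. We could implement this approach like this: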
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1)) # Stack 4 copies of v on top of each other
print(vv) # Prints "[[1 0 1]
# [1 0 1]
# [1 0 1]
# [1 0 1]]"
y = x + vv # Add x and vv elementwise
print(y) # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
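Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting: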
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v # Add v to each row of x using broadcasting
print(y) # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
The line y = x + v works even though x has shape (4, 3) and v has shape (3,) due to broadcasting; this line works as if v actually had shape (4, 3), where each row was a copy of v, and the sum was performed elementwise.
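Broadcasting two arrays together follows these rules:
- If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.
- The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
- The arrays can be broadcast together if they are compatible in all dimensions.
- After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
- In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.
Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the documentation.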
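The broadcasting rules listed above can be sketched as a small pure-Python function that computes the resulting shape. The helper broadcast_shape is hypothetical, written only to illustrate the rules, and is not part of numpy; the last lines cross-check it against the shape numpy actually produces.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper for illustration; not a numpy function.
    """
    # Rule 1: prepend 1s to the shorter shape until the lengths match.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(a, b):
        # Rules 2 and 3: sizes must be equal, or one of them must be 1.
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError('shapes %s and %s are not compatible'
                             % (shape_a, shape_b))
        # Rules 4 and 5: a size-1 dimension stretches to match the other size.
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))      # Prints "(4, 3)"
print(broadcast_shape((3, 1), (2,)))      # Prints "(3, 2)"

# Cross-check against the shape numpy produces.
x = np.zeros((4, 3))
v = np.zeros(3)
print((x + v).shape == broadcast_shape(x.shape, v.shape))   # Prints "True"
```

Calling broadcast_shape((4, 3), (4,)) raises ValueError, mirroring the error numpy raises when adding arrays of those shapes.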
Here are some applications of broadcasting:
import numpy as np
# Compute outer product of vectors
v = np.array([1,2,3]) # v has shape (3,)
w = np.array([4,5]) # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4 5]
# [ 8 10]
# [12 15]]
print(np.reshape(v, (3, 1)) * w)
# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
# [5 7 9]]
print(x + v)
# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5 6 7]
# [ 9 10 11]]
print((x.T + w).T)
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))
# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2 4 6]
# [ 8 10 12]]
print(x * 2)
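Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.
Numpy Documentation
This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the numpy reference to find out much more about numpy.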
SciPy
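Numpy provides a high-performance multidimensional array and basic tools to compute with and manipulate these arrays. SciPy builds on this, and provides a large number of functions that operate on numpy arrays and are useful for different types of scientific and engineering applications.
The best way to get familiar with SciPy is to browse the documentation. We will highlight some parts of SciPy that you might find useful for this class.
Image operations
SciPy provides some basic functions to work with images. For example, it has functions to read images from disk into numpy arrays, to write numpy arrays to disk as images, and to resize images. Here is a simple example that showcases these functions: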
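# Note: imread, imsave and imresize were deprecated in SciPy 1.0 and removed
# in later releases, so this example needs an older SciPy installation.
from scipy.misc import imread, imsave, imresize
# Read a JPEG image into a numpy array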
img = imread('assets/cat.jpg')
print(img.dtype, img.shape) # Prints "uint8 (400, 248, 3)"
# We can tint the image by scaling each of the color channels
# by a different scalar constant. The image has shape (400, 248, 3);
# we multiply it by the array [1, 0.95, 0.9] of shape (3,);
# numpy broadcasting means that this leaves the red channel unchanged,
# and multiplies the green and blue channels by 0.95 and 0.9
# respectively.
img_tinted = img * [1, 0.95, 0.9]
# Resize the tinted image to be 300 by 300 pixels.
img_tinted = imresize(img_tinted, (300, 300))
# Write the tinted image back to disk
imsave('assets/cat_tinted.jpg', img_tinted)
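The tinting step itself depends only on numpy broadcasting, so it can be tried without any image file. The sketch below uses a small synthetic array as a stand-in for a photo (an assumption made so the example is self-contained) and converts the result back to uint8.

```python
import numpy as np

# Synthetic stand-in for a photo: a 4x2 RGB image with every channel at 200.
img = np.full((4, 2, 3), 200, dtype=np.uint8)

# Shape (4, 2, 3) times shape (3,): each pixel's channels are scaled separately.
img_tinted = img * np.array([1, 0.95, 0.9])
print(img_tinted.dtype)            # Prints "float64"

# Round and cast back to uint8 before saving or displaying the result.
img_tinted = np.rint(img_tinted).astype(np.uint8)
print(img_tinted[0, 0])            # Prints "[200 190 180]"
```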
MATLAB files
The functions scipy.io.loadmat and scipy.io.savemat allow you to read and write MATLAB files. You can read about them in the documentation.
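A minimal round trip with these two functions might look like the sketch below. The file name example.mat and the variable name x are arbitrary choices for illustration; the file is written to a temporary directory so nothing is left behind.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

x = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.mat')   # Hypothetical file name
    savemat(path, {'x': x})                      # Dict keys become MATLAB variable names
    data = loadmat(path)                         # Returns a dict of numpy arrays

loaded = data['x']
print(loaded.shape)                 # Prints "(2, 3)"
print(np.array_equal(loaded, x))    # Prints "True"
```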
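Distance between points
SciPy defines some useful functions for computing distances between sets of points.
The function scipy.spatial.distance.pdist computes the distance between all pairs of points in a given set: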
import numpy as np
from scipy.spatial.distance import pdist, squareform
# Create the following array where each row is a point in 2D space:
# [[0 1]
# [1 0]
# [2 0]]
x = np.array([[0, 1], [1, 0], [2, 0]])
print(x)
# Compute the Euclidean distance between all rows of x.
# d[i, j] is the Euclidean distance between x[i, :] and x[j, :],
# and d is the following array:
# [[ 0. 1.41421356 2.23606798]
# [ 1.41421356 0. 1. ]
# [ 2.23606798 1. 0. ]]
d = squareform(pdist(x, 'euclidean'))
print(d)
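You can read all the details about this function in the documentation.
A similar function (scipy.spatial.distance.cdist) computes the distance between all pairs across two sets of points; you can read about it in the documentation.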
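As a short illustration of cdist, the sketch below computes the distances between a set of two points and a set of three points; the coordinates are made up so that the results are easy to verify by hand.

```python
import numpy as np
from scipy.spatial.distance import cdist

a = np.array([[0, 0], [1, 0]])            # 2 points in 2D
b = np.array([[0, 1], [3, 4], [1, 0]])    # 3 points in 2D

# d[i, j] is the Euclidean distance between a[i, :] and b[j, :]
d = cdist(a, b, 'euclidean')
print(d.shape)        # Prints "(2, 3)"
print(d[0, 1])        # Distance from (0, 0) to (3, 4); prints "5.0"
print(d[1, 2])        # Identical points; prints "0.0"
```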
Matplotlib
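Matplotlib is a plotting library.
This section gives a brief introduction to the matplotlib.pyplot module, which provides a plotting system similar to that of MATLAB.
Plotting
The most important function in matplotlib is plot, which allows you to plot 2D data. Here is a simple example: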
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on a sine curve
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
# Plot the points using matplotlib
plt.plot(x, y)
plt.show() # You must call plt.show() to make graphics appear.
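Running this code produces a plot of the sine curve.
With just a little bit of extra work we can easily plot multiple lines at once, and add a title, legend, and axis labels: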
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Plot the points using matplotlib
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()
You can read much more about the plot function in the documentation.
Subplots
You can plot different things in the same figure using the subplot function. Here is an example:
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Set up a subplot grid that has height 2 and width 1,
# and set the first such subplot as active.
plt.subplot(2, 1, 1)
# Make the first plot
plt.plot(x, y_sin)
plt.title('Sine')
# Set the second subplot as active, and make the second plot.
plt.subplot(2, 1, 2)
plt.plot(x, y_cos)
plt.title('Cosine')
# Show the figure.
plt.show()
You can read much more about the subplot function in the documentation.
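Matplotlib also offers an object-oriented alternative, plt.subplots, which returns the figure and its axes so that each panel can be addressed explicitly. The sketch below is one possible variant of the example above; it selects the non-interactive Agg backend and writes the PNG to an in-memory buffer, both of which are choices made here so the code runs without a display.

```python
import io

import matplotlib
matplotlib.use('Agg')            # Non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 3 * np.pi, 0.1)

# plt.subplots creates the figure and a 2x1 grid of axes in one call.
fig, axes = plt.subplots(2, 1)
axes[0].plot(x, np.sin(x))
axes[0].set_title('Sine')
axes[1].plot(x, np.cos(x))
axes[1].set_title('Cosine')

fig.tight_layout()               # Keep the titles from overlapping the plots
buf = io.BytesIO()
fig.savefig(buf, format='png')   # Render the figure to an in-memory PNG
```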
Images
You can use the imshow function to show images. Here is an example:
import numpy as np
# Note: imread and imresize were deprecated in SciPy 1.0 and removed in later releases.
from scipy.misc import imread, imresize
import matplotlib.pyplot as plt
img = imread('assets/cat.jpg')
img_tinted = img * [1, 0.95, 0.9]
# Show the original image
plt.subplot(1, 2, 1)
plt.imshow(img)
# Show the tinted image
plt.subplot(1, 2, 2)
# A slight gotcha with imshow is that it might give strange results
# if presented with data that is not uint8. To work around this, we
# explicitly cast the image to uint8 before displaying it.
plt.imshow(np.uint8(img_tinted))
plt.show()