The make_moons() function in sklearn.datasets:
(1) Function definition
Signature:
datasets.make_moons(
    n_samples=100,
    shuffle=True,
    noise=None,
    random_state=None,
)
Docstring:
Make two interleaving half circles

A simple toy dataset to visualize clustering and classification algorithms.

Parameters
----------
n_samples : int, optional (default=100)
    The total number of points generated.
shuffle : bool, optional (default=True)
    Whether to shuffle the samples.
noise : double or None (default=None)
    Standard deviation of Gaussian noise added to the data.
random_state : int, RandomState instance or None (default)
    Determines random number generation for dataset shuffling and noise.
    Pass an int for reproducible output across multiple function calls.
    See :term:`Glossary <random_state>`.

Returns
-------
X : array of shape [n_samples, 2]
    The generated samples.
y : array of shape [n_samples]
    The integer labels (0 or 1) for class membership of each sample.
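As a quick check of the return values described in the docstring, the following minimal sketch calls make_moons() and inspects the shapes and labels. The seed value 0 and the variable names are only illustrative choices, not part of the original notes.

```python
import numpy as np
from sklearn import datasets

# Generate 100 points; random_state is fixed only so the run is repeatable.
X, y = datasets.make_moons(n_samples=100, random_state=0)

print(X.shape)         # (100, 2): one row per sample, two coordinates each
print(y.shape)         # (100,): one label per sample
print(np.unique(y))    # the two class labels, 0 and 1
print(np.bincount(y))  # number of samples in each half circle
```

With an even n_samples the points are split equally between the two half circles.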
(2) Plotting the function
1. Import the required modules and packages
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
2. Generate the dataset with make_moons(), setting the noise level and the random seed, then plot the data points
X, y = datasets.make_moons(noise=0.15, random_state=666)
plt.scatter(X[y == 0, 0], X[y == 0, 1])
plt.scatter(X[y == 1, 0], X[y == 1, 1])
plt.show()
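To see what the two arguments used in step 2 do, here is a small sketch under the same settings. It checks that a fixed random_state reproduces the same data, and that with noise=None the class-0 points lie exactly on a unit half circle. The helper variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets

# Same seed and same noise level: the two calls return identical arrays.
X_a, y_a = datasets.make_moons(noise=0.15, random_state=666)
X_b, y_b = datasets.make_moons(noise=0.15, random_state=666)
same_seed_matches = np.array_equal(X_a, X_b) and np.array_equal(y_a, y_b)
print(same_seed_matches)

# With noise=None (the default) no Gaussian noise is added, so every
# class-0 point sits on the upper half of the unit circle (radius 1).
X_clean, y_clean = datasets.make_moons(noise=None, random_state=666)
radius_class0 = np.hypot(X_clean[y_clean == 0, 0], X_clean[y_clean == 0, 1])
print(np.allclose(radius_class0, 1.0))
```

Raising noise spreads the points further from the two arcs, which makes the classes overlap more and the classification task harder.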
3. SVM with polynomial features
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
def PolynomialSVC(degree, C=1.0):
    # Expand to polynomial features, standardize them, then fit a linear SVM
    return Pipeline([
        ("poly", PolynomialFeatures(degree=degree)),
        ("std_scaler", StandardScaler()),
        ("linearSVC", LinearSVC(C=C))
    ])
poly_svc = PolynomialSVC(degree=3)
poly_svc.fit(X, y)
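The pipeline lets a linear SVM draw a curved boundary because the classifier sees the expanded polynomial features rather than the two raw coordinates. The sketch below counts those features and reports the training accuracy. It rebuilds the same three steps so that it runs on its own; max_iter=10000 is an assumption added here to give the solver room to converge and is not in the original code.

```python
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = datasets.make_moons(noise=0.15, random_state=666)

# Degree-3 expansion of 2 features: 1, x0, x1, x0^2, x0*x1, x1^2
# plus the four cubic terms, i.e. 10 columns in total.
n_poly_features = PolynomialFeatures(degree=3).fit_transform(X).shape[1]
print(n_poly_features)  # 10

# Same three steps as the pipeline in step 3 (max_iter is an added assumption).
model = Pipeline([
    ("poly", PolynomialFeatures(degree=3)),
    ("std_scaler", StandardScaler()),
    ("linearSVC", LinearSVC(C=1.0, max_iter=10000)),
])
model.fit(X, y)

# Accuracy on the training data itself, so it is an optimistic estimate.
train_accuracy = model.score(X, y)
print(train_accuracy)
```

Because the score is computed on the data used for fitting, a held-out split would be needed to judge how well the model generalizes.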
4. Function for plotting the decision boundary
def plot_decision_boundary(model, axis):
    # axis = [x0_min, x0_max, x1_min, x1_max]; about 100 grid points per unit length
    x0, x1 = np.meshgrid(
        np.linspace(axis[0], axis[1], int((axis[1]-axis[0])*100)),
        np.linspace(axis[2], axis[3], int((axis[3]-axis[2])*100))
    )
    # One row per grid point, one column per feature
    x_new = np.c_[x0.ravel(), x1.ravel()]
    y_predict = model.predict(x_new).reshape(x0.shape)
    from matplotlib.colors import ListedColormap
    custom_cmap = ListedColormap(['#EF9A9A', '#FFF59D', '#90CAF9'])
    # Fill each predicted class region with its own color
    plt.contourf(x0, x1, y_predict, cmap=custom_cmap)
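The grid handling inside plot_decision_boundary can be hard to follow at full resolution, so here is the same meshgrid, ravel, and reshape sequence on a tiny grid. The threshold rule standing in for model.predict is hypothetical and only serves to produce labels.

```python
import numpy as np

# A tiny grid: 3 values along x0 and 2 values along x1.
x0, x1 = np.meshgrid(np.linspace(0.0, 1.0, 3), np.linspace(0.0, 1.0, 2))
print(x0.shape)  # (2, 3): rows follow x1, columns follow x0

# Flatten both coordinate grids and pair them column-wise,
# giving one (x0, x1) row per grid point.
grid_points = np.c_[x0.ravel(), x1.ravel()]
print(grid_points.shape)  # (6, 2)

# Hypothetical stand-in for model.predict: label 1 where x0 > 0.5.
labels = (grid_points[:, 0] > 0.5).astype(int)

# Reshape the flat predictions back to the grid layout expected by contourf.
label_grid = labels.reshape(x0.shape)
print(label_grid)  # [[0 0 1]
                   #  [0 0 1]]
```

The reshape step matters because contourf needs the predictions arranged in the same 2-D layout as x0 and x1.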
5. Plot the decision boundary (as the figure shows, it is nonlinear)
plot_decision_boundary(poly_svc, axis=[-1.5, 2.5, -1.0, 1.5])
plt.scatter(X[y == 0, 0], X[y == 0, 1])
plt.scatter(X[y == 1, 0], X[y == 1, 1])
plt.show()