Classification with a Perceptron

Theory

f:id:kdog08:20170430002037p:plain:w300

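The input signals ${\bf x}=\{x_1,x_2,\cdots,x_n\}$ are each multiplied by the corresponding weights ${\bf w}=\{w_1,w_2,\cdots,w_n\}$, and their sum $u$ is passed to the neuron. The sum $u$ is then passed to the activation function $f()$ (a step function is assumed here), which outputs the predicted label $y'$. The predicted label and the teacher label $y$ are passed to the error function, and the weights ${\bf w}$ are updated according to the result, scaled by the learning rate $\eta$.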

The sum passed to the neuron is therefore

$\displaystyle \begin{equation} u = b + \sum_{i=1}^{n}x_{i} w_{i} \end{equation}$

$b$ is called the bias; it acts as a threshold that adjusts how easily the neuron fires.

The predicted label is

$\begin{eqnarray} y' = f(u) = \left\{ \begin{array}{ll} 1 & \text{ if } u \ge 0 \\ -1 & \text{ otherwise }. \end{array}\right. \end{eqnarray}$
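
As a minimal sketch of the two formulas above, the net input and the step function can be written with NumPy as follows. The function names `net_input` and `step` and the numeric values are illustrative assumptions only, not part of the implementation used later.

```python
import numpy as np


def net_input(x, w, b):
    """Weighted sum u = b + sum_i x_i * w_i."""
    return b + np.dot(x, w)


def step(u):
    """Step activation: 1 if u >= 0, otherwise -1."""
    return np.where(u >= 0.0, 1, -1)


# Illustrative values (assumed for this example only).
x = np.array([2.0, 1.0])
w = np.array([0.5, -1.0])
b = 0.0

u = net_input(x, w, b)   # 0.0 + 2.0*0.5 + 1.0*(-1.0) = 0.0
y_pred = step(u)         # u == 0 falls on the "u >= 0" branch, so y' = 1
```

Note that the boundary case $u = 0$ is assigned to the class $1$, exactly as in the definition of $f$.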

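The error function used here simply takes the difference between the teacher label $y$ and the predicted label $y'$ output by the neuron. The weights and bias are therefore updated as follows for each training sample.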

$\begin{eqnarray} \left\{ \begin{array}{lll} w_i &:= w_i + \eta(y-y')x_i & (i = 1,2,\cdots,n) \\ b &:= b + \eta(y-y') & \end{array}\right. \end{eqnarray}$
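
To make the update rule concrete, here is a small worked example with $\eta = 0.01$, starting from zero weights and zero bias. The two sample vectors and their labels are illustrative assumptions, and `update_step` is a hypothetical helper name used only in this sketch.

```python
import numpy as np


def predict(x, w, b):
    """Step function applied to the net input u = b + x . w."""
    return 1 if b + np.dot(x, w) >= 0.0 else -1


def update_step(x, y, w, b, eta=0.01):
    """Apply w := w + eta*(y - y')*x and b := b + eta*(y - y') for one sample."""
    delta = eta * (y - predict(x, w, b))
    return w + delta * x, b + delta


w, b = np.zeros(2), 0.0

# Sample with label -1: u = 0 gives y' = 1, so delta = 0.01 * (-1 - 1) = -0.02.
w, b = update_step(np.array([5.1, 1.4]), -1, w, b)
# w is now approximately [-0.102, -0.028] and b is -0.02.

# Sample with label 1: u < 0 gives y' = -1, so delta = 0.01 * (1 - (-1)) = 0.02.
w, b = update_step(np.array([7.0, 4.7]), 1, w, b)
# w is now approximately [0.038, 0.066] and b is approximately 0.0.

# A correctly classified sample leaves the parameters unchanged (delta = 0).
w_same, b_same = update_step(np.array([7.0, 4.7]), 1, w, b)
```

Because $y - y'$ can only be $-2$, $0$, or $2$ with labels in $\{-1, 1\}$, every actual update changes the parameters by $\pm 2\eta$ times the input (or by $\pm 2\eta$ for the bias).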

Implementation

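import numpy as np


class Perceptron:
    """Perceptron classifier with a step activation and labels in {-1, 1}."""

    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta        # learning rate
        self.n_iter = n_iter  # number of passes (epochs) over the training data

    def train(self, X, y):
        # Start from zero weights and zero bias.
        self.w_ = np.zeros(X.shape[1])
        self.b_ = np.zeros(1)
        self.errors_ = []  # number of updates (misclassifications) per epoch

        # Weights and bias recorded at the end of every epoch.
        self.w_history = []
        self.b_history = []

        for _ in range(self.n_iter):
            errors = 0
            for xi, label in zip(X, y):
                # update is nonzero only when the sample is misclassified.
                update = self.eta * (label - self.predict(xi))
                self.w_ = self.w_ + update * xi
                self.b_ = self.b_ + update
                # update has shape (1,), so reduce it to a single bool first.
                errors += int(np.any(update != 0.0))
            self.errors_.append(errors)
            self.w_history.append(self.w_)
            self.b_history.append(self.b_)

        return self

    def predict(self, X):
        # Net input u = X . w + b, followed by the step function.
        A = np.dot(X, self.w_) + self.b_
        return np.where(A >= 0.0, 1, -1)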

Experimental Setup

Dataset: The Iris dataset from the UCI Machine Learning Repository was used. Only the samples of the two species Setosa and Versicolor were used, with two features: sepal length and petal length.
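
How `X` and `y` are built is not shown here, so the following is only a sketch of one way to do it. It assumes the comma-separated layout of the UCI `iris.data` file (sepal length, sepal width, petal length, petal width, species), embeds a five-row excerpt in place of the full file, and assumes the label encoding Setosa = -1, Versicolor = 1. `load_two_class_iris` is a hypothetical helper name.

```python
import csv
import io

import numpy as np

# Five-row excerpt in the layout of the UCI iris.data file (assumed here);
# in practice the full file would be read instead.
IRIS_EXCERPT = """\
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# Assumed label encoding; Virginica is absent on purpose and gets dropped.
LABELS = {"Iris-setosa": -1, "Iris-versicolor": 1}


def load_two_class_iris(text):
    """Return (X, y) with sepal length and petal length for Setosa/Versicolor."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 5 or row[4] not in LABELS:
            continue  # skip blank lines and other species
        features.append([float(row[0]), float(row[2])])  # columns 0 and 2
        labels.append(LABELS[row[4]])
    return np.array(features), np.array(labels)


X, y = load_two_class_iris(IRIS_EXCERPT)
```

With the excerpt above, `X` has shape `(4, 2)` and `y` holds two labels of each class; the Virginica row is filtered out.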

Environment: All programs were run in Jupyter Notebook.

$ python -V
Python 3.5.3 :: Anaconda custom (x86_64)

Results

(Command 1)

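import matplotlib.pyplot as plt

# X: (n_samples, 2) feature array, y: labels in {-1, 1}, prepared as described in the setup.
ppn = Perceptron(eta=0.01, n_iter=10).train(X, y)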

plt.plot(range(1, len(ppn.errors_) + 1), ppn.errors_, marker='o')
plt.xlabel('Number of iteration')
plt.ylabel('Number of updates')

plt.tight_layout()
plt.show()

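(Output 1) The horizontal axis is the number of iterations (epochs) and the vertical axis is the number of errors, i.e. weight updates, in that epoch. The number of errors drops to 0 in the 6th epoch, which shows that training has converged.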

f:id:kdog08:20170430040319p:plain:w300

(Command 2)

w_history = np.array(ppn.w_history)
b_history = np.array(ppn.b_history)
print(w_history.T)
print(b_history.T)

(Output 2) Update history of the weights ${\bf w}$ and the bias $b$ (one column per epoch)

[[ 0.038  0.076  0.022  0.034 -0.068 -0.068 -0.068 -0.068 -0.068 -0.068]
 [ 0.066  0.132  0.168  0.21   0.182  0.182  0.182  0.182  0.182  0.182]]
[[ 0.    0.   -0.02 -0.02 -0.04 -0.04 -0.04 -0.04 -0.04 -0.04]]
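
The printed history can also be used to check when training settled. The sketch below copies the numbers from Output 2 and looks for the first epoch whose parameters equal those of the previous epoch. This is only a heuristic (updates within one epoch could in principle cancel out), and the variable names are chosen for this example.

```python
import numpy as np

# Values copied from Output 2: one column per epoch.
w_hist = np.array([
    [0.038, 0.076, 0.022, 0.034, -0.068, -0.068, -0.068, -0.068, -0.068, -0.068],
    [0.066, 0.132, 0.168, 0.21, 0.182, 0.182, 0.182, 0.182, 0.182, 0.182],
])
b_hist = np.array([[0.0, 0.0, -0.02, -0.02, -0.04, -0.04, -0.04, -0.04, -0.04, -0.04]])

# Stack weights and bias into one (3, 10) array of parameters per epoch.
params = np.vstack([w_hist, b_hist])

# changed[k] is True if any parameter differs between epoch k+1 and epoch k+2.
changed = np.any(~np.isclose(np.diff(params, axis=1), 0.0), axis=0)

# First epoch (1-based) at whose end the parameters equal the previous epoch's.
first_quiet_epoch = int(np.flatnonzero(~changed)[0]) + 2
```

With these values `first_quiet_epoch` is 6, consistent with the error count reaching 0 in the 6th epoch in Output 1.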

(Command 3) In the run above, the weights and bias were initialized as

self.w_ = np.zeros(X.shape[1])
self.b_ = np.zeros(1)

Change this, for example, to the following.

self.w_ = np.array([0.6,0.1])
self.b_ = np.zeros(1)

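(Output 3) The result shows that the weights and bias learned by the perceptron, and hence its classification boundary, depend on the initial conditions.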

f:id:kdog08:20170430043113p:plain:w300

[[ 0.146  0.086  0.124  0.064 -0.038 -0.038 -0.038 -0.038 -0.038 -0.038]
 [ 0.02   0.058  0.124  0.162  0.134  0.134  0.134  0.134  0.134  0.134]]
[[-0.1  -0.12 -0.12 -0.14 -0.16 -0.16 -0.16 -0.16 -0.16 -0.16]]
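
One way to see the dependence on the initial values is to compare the decision boundaries $w_1 x_1 + w_2 x_2 + b = 0$ implied by the final columns of the two histories. The sketch below rewrites each boundary as $x_2 = -(w_1 x_1 + b)/w_2$. The function name `boundary` and the two test points are assumptions made for illustration.

```python
import numpy as np


def boundary(w, b):
    """Slope and intercept of the line x2 = -(w1*x1 + b) / w2."""
    return -w[0] / w[1], -b / w[1]


def predict(x, w, b):
    """Step function applied to the net input."""
    return np.where(np.dot(x, w) + b >= 0.0, 1, -1)


# Final parameters copied from Output 2 (zero initialization)
# and Output 3 (w initialized to [0.6, 0.1]).
w_zero, b_zero = np.array([-0.068, 0.182]), -0.04
w_alt, b_alt = np.array([-0.038, 0.134]), -0.16

line_zero = boundary(w_zero, b_zero)
line_alt = boundary(w_alt, b_alt)

# Two illustrative points (sepal length, petal length), one per side.
points = np.array([[5.1, 1.4], [7.0, 4.7]])
labels_zero = predict(points, w_zero, b_zero)
labels_alt = predict(points, w_alt, b_alt)
```

The two lines have different slopes and intercepts, yet both place these two points on the same sides. This suggests that different initial values can lead to different boundaries that separate the data equally well.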