Can machine learning algorithms classify different waveforms?

Three waveforms

Suppose we have three kinds of waveforms, as shown in the figure.

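  • The blue pattern shows data sampled from a sine function;

  • The red pattern shows data sampled from a random distribution;

  • The green pattern shows data sampled from a combination of three sine functions.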
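
The three patterns can be sketched in a few lines of Python. This is only an illustration, not the code used in the experiment: the frequency range, the random phase, the averaging of the three sines, and the choice of a uniform distribution for the random pattern are all assumptions, because the article does not state them. The function names are hypothetical as well.

```python
import math
import random

N_POINTS = 500  # points per waveform, as in the experiment described below


def sine_wave(rng, n=N_POINTS):
    """One sine function sampled at n points (frequency and phase are assumed)."""
    freq = rng.uniform(1.0, 5.0)             # cycles over the window (assumption)
    phase = rng.uniform(0.0, 2 * math.pi)    # random phase (assumption)
    return [math.sin(2 * math.pi * freq * i / n + phase) for i in range(n)]


def random_wave(rng, n=N_POINTS):
    """Samples from a random distribution (uniform on [-1, 1] is an assumption)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def combined_sine_wave(rng, n=N_POINTS):
    """Average of three independently drawn sine functions (assumption)."""
    parts = [sine_wave(rng, n) for _ in range(3)]
    return [sum(values) / 3.0 for values in zip(*parts)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
waves = {
    "sine": sine_wave(rng),
    "random": random_wave(rng),
    "combined": combined_sine_wave(rng),
}
for name, wave in waves.items():
    print(name, len(wave), round(min(wave), 3), round(max(wave), 3))
```

Each call returns a plain list of 500 floats in the range [-1, 1], so one waveform corresponds to one row of a feature matrix.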

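Here is our question: given many samples made up of these three waveforms, can a machine learning algorithm classify them correctly, as in an ordinary multiclass classification problem in supervised learning?

The short answer? Yes! Both classical machine learning algorithms and deep learning models can handle this multiclass classification problem correctly. Here is a brief description of the experiment.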

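  • Data: 10,000 training waveforms and 10,000 test waveforms. Each waveform has 500 data points. The three patterns are distributed roughly evenly across the training and test sets.

  • Algorithms: four machine learning classifiers from Scikit-Learn (naive Bayes, random forest, gradient boosting, and support vector machine) and one neural network architecture implemented in Keras (a multilayer perceptron) were tested.

  • Performance: every classifier reached a prediction accuracy above 98%.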
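
A data set of this shape could be assembled roughly as follows. This is a sketch under assumptions rather than the original data pipeline: `make_dataset`, the three generator functions, their parameters, and the label order 0/1/2 are all hypothetical. Only the 500 points per waveform and the roughly even class distribution come from the description above.

```python
import math
import random

N = 500  # data points per waveform


def sine(rng):
    # Frequency and phase ranges are assumptions.
    f, p = rng.uniform(1, 5), rng.uniform(0, 2 * math.pi)
    return [math.sin(2 * math.pi * f * i / N + p) for i in range(N)]


def noise(rng):
    # Uniform noise is an assumption; the article only says "random distribution".
    return [rng.uniform(-1, 1) for _ in range(N)]


def three_sines(rng):
    # Averaging three sines is one possible "combination" (assumption).
    return [sum(v) / 3 for v in zip(sine(rng), sine(rng), sine(rng))]


def make_dataset(n_samples, generators, rng):
    """Return (X, Y): X is a list of waveforms, Y the matching integer labels.

    Each sample picks one generator uniformly at random, so the classes
    come out roughly balanced.
    """
    X, Y = [], []
    for _ in range(n_samples):
        label = rng.randrange(len(generators))
        X.append(generators[label](rng))
        Y.append(label)
    return X, Y


generators = [sine, noise, three_sines]  # labels 0, 1, 2 (assumed order)
rng = random.Random(42)

# The experiment uses 10,000 waveforms per split; 300 keeps this demo fast.
X, Y = make_dataset(300, generators, rng)
X_test, Y_test = make_dataset(300, generators, rng)

print(len(X), len(X[0]), [Y.count(k) for k in range(3)])
```

The names `X`, `Y`, `X_test`, and `Y_test` match the variables that the Scikit-Learn code expects. Scikit-Learn estimators accept nested lists like these, although converting them to NumPy arrays first is the more common practice.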

Python implementation of the four Scikit-Learn classifiers

# Imports: timer plus Scikit-Learn metrics --------------------------------------
# X, Y (training data) and X_test, Y_test (test data) are assumed to exist already.

import timeit

from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

# algorithm 1: Gaussian naive Bayes ---------------------------------------------

print(" Naive Bayes ... ")
start = timeit.default_timer()  # timing covers import, fit, and predict

from sklearn import naive_bayes
classifier = naive_bayes.GaussianNB()
nb_model = classifier.fit(X, Y)
prediction = nb_model.predict(X_test)

end = timeit.default_timer()
print(" accuracy = ", accuracy_score(Y_test, prediction), " time = ", end - start)
print(confusion_matrix(Y_test, prediction))
print("")

# algorithm 2: random forest (default hyperparameters) --------------------------

print(" Random Forest ... ")
start = timeit.default_timer()  # timing covers import, fit, and predict

from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier()
rf_model = classifier.fit(X, Y)
prediction = rf_model.predict(X_test)

end = timeit.default_timer()
print(" accuracy = ", accuracy_score(Y_test, prediction), " time = ", end - start)
print(confusion_matrix(Y_test, prediction))
print("")

# algorithm 3: gradient boosting (default hyperparameters) ----------------------

print(" Gradient Boosting ... ")
start = timeit.default_timer()  # timing covers import, fit, and predict

from sklearn.ensemble import GradientBoostingClassifier as gbc
classifier = gbc()
gbc_model = classifier.fit(X, Y)
prediction = gbc_model.predict(X_test)

end = timeit.default_timer()
print(" accuracy = ", accuracy_score(Y_test, prediction), " time = ", end - start)
print(confusion_matrix(Y_test, prediction))
print("")

# algorithm 4: support vector machine (default SVC settings) --------------------

print(" SVM ... ")
start = timeit.default_timer()  # timing covers import, fit, and predict

from sklearn import svm
classifier = svm.SVC()
svc_model = classifier.fit(X, Y)
prediction = svc_model.predict(X_test)

end = timeit.default_timer()
print(" accuracy = ", accuracy_score(Y_test, prediction), " time = ", end - start)
print(confusion_matrix(Y_test, prediction))
print("")

Three-layer multilayer perceptron implemented in Keras

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 32)                16032
_________________________________________________________________
activation_1 (Activation)    (None, 32)                0
_________________________________________________________________
dense_2 (Dense)              (None, 32)                1056
_________________________________________________________________
activation_2 (Activation)    (None, 32)                0
_________________________________________________________________
dense_3 (Dense)              (None, 32)                1056
_________________________________________________________________
activation_3 (Activation)    (None, 32)                0
_________________________________________________________________
dense_4 (Dense)              (None, 3)                 99
_________________________________________________________________
activation_4 (Activation)    (None, 3)                 0
=================================================================
Total params: 18,243
Trainable params: 18,243
Non-trainable params: 0
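
The parameter counts in this summary can be checked by hand. A fully connected layer has one weight per input-output pair plus one bias per output unit, and the activation layers add no parameters. The short sketch below reproduces the numbers; the input size of 500 is inferred from the 500 data points per waveform, and `dense_params` is a hypothetical helper name.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Input of 500 points, three hidden layers of 32 units, 3 output classes.
layer_sizes = [500, 32, 32, 32, 3]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [16032, 1056, 1056, 99]
print(sum(counts))  # 18243
```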

Accuracy and execution time (in seconds) of each algorithm

MLP (3 layers, 32 neurons, 30 epochs) ...
accuracy = 0.9986 time = 14.8910531305
[[3378    0    0]
 [   0 3407    0]
 [  13    1 3201]]

Naive Bayes ...
accuracy = 0.9874 time = 1.30055759897
[[3252    0  126]
 [   0 3407    0]
 [   0    0 3215]]

Random Forest ...
accuracy = 1.0 time = 3.23794154915
[[3378    0    0]
 [   0 3407    0]
 [   0    0 3215]]

Gradient Boosting ...
accuracy = 1.0 time = 128.849120774
[[3378    0    0]
 [   0 3407    0]
 [   0    0 3215]]

SVM ...
accuracy = 1.0 time = 116.055568152
[[3378    0    0]
 [   0 3407    0]
 [   0    0 3215]]
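
The reported accuracies can be recomputed from the confusion matrices. In Scikit-Learn's convention, rows are true classes and columns are predicted classes, so the diagonal holds the correctly classified waveforms. The sketch below checks the two classifiers that made errors; the helper names are hypothetical, and the article does not say which row corresponds to which waveform.

```python
def accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_correct(cm):
    """For each true class (row), the fraction predicted correctly."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


mlp = [[3378, 0, 0], [0, 3407, 0], [13, 1, 3201]]
naive_bayes = [[3252, 0, 126], [0, 3407, 0], [0, 0, 3215]]

print(accuracy(mlp))          # 0.9986
print(accuracy(naive_bayes))  # 0.9874
print([round(r, 4) for r in per_class_correct(naive_bayes)])
```

The multilayer perceptron misclassifies 14 of the 10,000 test waveforms, all from the third class. Naive Bayes misclassifies 126, all of them first-class waveforms predicted as the third class.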
