Building a Classification Model with Python, Step by Step
Chuc1803@gmail.com
This article shows how to build data classification models in Python, following these steps:
1. Loading the dataset.
2. Summarizing the dataset.
3. Visualizing the dataset.
4. Building and evaluating some classification algorithms.
5. Making some predictions.
Requirements: Python installed, along with the pandas, matplotlib, and scikit-learn libraries.
Dataset: iris (download here)
1. Loading the dataset
# Load libraries
import pandas as pd
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
# Load dataset (read_csv comes from pandas; adjust the path to where iris.csv is saved)
dataset = pd.read_csv("D:/Python_Pro/iris.csv")
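The later steps slice the data as four feature columns followed by one class column, so it can help to check that layout right after loading. The sketch below is only an illustration: the column names, the `load_iris_csv` helper, and the three in-memory sample rows are assumptions, and it presumes the CSV has no header row. If the downloaded file does have a header, the numeric check fails and `header=0` would be the option to try instead.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for iris.csv (the real file has 150 rows).
SAMPLE_CSV = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# Assumed column names; the article does not state them.
COLUMN_NAMES = ["sepal-length", "sepal-width", "petal-length", "petal-width", "class"]


def load_iris_csv(source):
    """Read a header-less iris CSV and verify the four feature columns are numeric."""
    dataset = pd.read_csv(source, header=None, names=COLUMN_NAMES)
    for column in COLUMN_NAMES[:4]:
        if not pd.api.types.is_numeric_dtype(dataset[column]):
            raise ValueError("column %r is not numeric; does the file have a header row?" % column)
    return dataset


# 'source' can be a file path such as the one used above, or any file-like object.
sample = load_iris_csv(io.StringIO(SAMPLE_CSV))
print(sample.shape)
print(sample.dtypes)
```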
2. Summarizing the dataset
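The article leaves this step without code, so the following is a minimal sketch of one common way to summarize a dataset with pandas: its dimensions, a peek at the first rows, per-column statistics, and the number of rows per class. To stay self-contained it uses scikit-learn's bundled copy of iris (`load_iris`) rather than the CSV file, and the column name `class` is an assumption.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Stand-in for the CSV: scikit-learn's bundled iris data (an assumption of this sketch).
iris = load_iris()
dataset = pd.DataFrame(iris.data, columns=iris.feature_names)
dataset["class"] = iris.target_names[iris.target]  # assumed column name

# Dimensions: (number of rows, number of columns)
print(dataset.shape)

# First 20 rows
print(dataset.head(20))

# Count, mean, min, max and percentiles of each numeric column
print(dataset.describe())

# Number of rows belonging to each class
class_counts = dataset.groupby("class").size()
print(class_counts)
```

With the DataFrame loaded from iris.csv, the same four calls should apply once the class column name is adjusted to match the file.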
3. Visualizing the dataset
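This step also has no code in the article. Since `scatter_matrix` and `matplotlib.pyplot` are imported at the start, a plausible sketch is to draw box plots and histograms for each feature, then a scatter matrix for pairs of features. The bundled iris data, the `class` column name, the output file names, and the choice of the non-interactive `Agg` backend (so the sketch runs without a display) are all assumptions; in an interactive session, `plt.show()` can replace the `savefig` calls.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix
from sklearn.datasets import load_iris

# Stand-in for the CSV: scikit-learn's bundled iris data (an assumption of this sketch).
iris = load_iris()
dataset = pd.DataFrame(iris.data, columns=iris.feature_names)
dataset["class"] = iris.target_names[iris.target]  # assumed column name
features = dataset.drop(columns="class")

# Univariate plots: one box-and-whisker plot per feature
features.plot(kind="box", subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.savefig("iris_boxplots.png")  # hypothetical file name

# Univariate plots: one histogram per feature
hist_axes = features.hist()
plt.savefig("iris_histograms.png")  # hypothetical file name

# Multivariate plot: scatter plots for every pair of features
scatter_axes = scatter_matrix(features)
plt.savefig("iris_scatter_matrix.png")  # hypothetical file name

plt.close("all")
```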
4. Building and evaluating some classification algorithms
# Split the dataset into training (80%) and validation (20%) sets
array = dataset.values
X = array[:, 0:4]  # first four columns: features
Y = array[:, 4]    # fifth column: class label
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(
    X, Y, test_size=0.2, random_state=7)
# Build the classification algorithms to evaluate
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
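# Evaluate each model in turn with 10-fold cross-validation
results = []
names = []
for name, model in models:
    # shuffle=True is needed for random_state to take effect in KFold
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=7)
    cv_results = model_selection.cross_val_score(
        model, X_train, Y_train, cv=kfold, scoring='accuracy')
    results.append(cv_results)
    names.append(name)
    # Report mean accuracy and standard deviation across the folds
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)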
# Compare algorithms
fig = plt.figure()
fig.suptitle('Algorithm Comparison')
ax = fig.add_subplot(111)
plt.boxplot(results)
ax.set_xticklabels(names)
plt.show()
5. Making some predictions
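The article gives no code for this step, but it imports `accuracy_score`, `confusion_matrix`, and `classification_report`, which suggests fitting one model on the training set and scoring its predictions on the validation set. The sketch below does that under some assumptions: it picks KNN purely as an example (the article does not say which model to use), and it loads scikit-learn's bundled iris data so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for the CSV: scikit-learn's bundled iris data (an assumption of this sketch).
iris = load_iris()
X = iris.data
Y = iris.target_names[iris.target]

# Same split settings as in step 4
X_train, X_validation, Y_train, Y_validation = train_test_split(
    X, Y, test_size=0.2, random_state=7)

# KNN is an arbitrary choice here; any of the models from step 4 could be used instead
knn = KNeighborsClassifier()
knn.fit(X_train, Y_train)
predictions = knn.predict(X_validation)

# Compare the predictions with the true labels of the validation set
accuracy = accuracy_score(Y_validation, predictions)
matrix = confusion_matrix(Y_validation, predictions)
print(accuracy)
print(matrix)
print(classification_report(Y_validation, predictions))
```

The confusion matrix counts, for each true class (rows), how many validation samples were assigned to each predicted class (columns), so off-diagonal entries are the misclassifications.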
Download the code file here.
View the video here.