Mlxtend is an open-source Python project that provides tools and extensions for everyday data science tasks. The project is hosted on GitHub: https://github.com/rasbt/mlxtend

As the User Guide section of the project documentation shows, mlxtend provides the following main categories of tool modules:

classifier

  • Adaline
  • EnsembleVoteClassifier
  • LogisticRegression
  • NeuralNetMLP
  • Perceptron

regressor

  • LinearRegression

regression_utils

  • plot linear regression

feature_selection

  • SequentialFeatureSelector

evaluate

  • Confusion Matrix
  • Plot decision regions
  • Plot learning curves
  • Scoring

preprocessing

  • DenseTransformer
  • MeanCenterer
  • Minmax scaling
  • Shuffle arrays unison
  • Standardize

data

  • AutoMPG data
  • Boston housing data
  • Iris data
  • Mnist data
  • Load mnist
  • Wine data

file_io

  • Find filegroups
  • Find files

general plotting

  • Category scatter
  • Enrichment plot
  • Stacked barplot

math

  • Num combinations
  • Num permutations

text

  • Generalize names
  • Generalize names duplcheck
  • Tokenizer

utils

  • Counter

general concepts

  • Activation functions
  • Gradient optimization
  • Linear gradient derivative
  • Regularization linear

Each of the modules above comes with examples, API documentation, and source code, so it is easy to look things up.
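To give a sense of what the smaller helpers in the list do, here is a minimal standard-library sketch of the ideas behind "Minmax scaling" and "Standardize" from the preprocessing group. The function names `minmax_scale_column` and `standardize_column` are hypothetical and exist only for illustration. They are not mlxtend's API, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def minmax_scale_column(values, min_val=0.0, max_val=1.0):
    """Linearly rescale values into [min_val, max_val] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to min_val.
        return [min_val for _ in values]
    span = max_val - min_val
    return [min_val + (v - lo) / (hi - lo) * span for v in values]


def standardize_column(values):
    """Center to zero mean and scale to unit population std (hypothetical helper)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column cannot be scaled; return all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


column = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = minmax_scale_column(column)
standardized = standardize_column(column)
print(scaled)
print(standardized)
```

For real work, the library's own preprocessing functions are the better choice, since they operate on whole arrays and selected columns; the sketch only shows the arithmetic.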

Here is the example from the project's home page:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import itertools
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.data import iris_data
# Note: newer mlxtend releases expose this function in mlxtend.plotting
from mlxtend.evaluate import plot_decision_regions

# Initializing Classifiers
clf1 = LogisticRegression(random_state=0)
clf2 = RandomForestClassifier(random_state=0)
clf3 = SVC(random_state=0, probability=True)
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[2, 1, 1], voting='soft')

# Loading some example data
X, y = iris_data()
# Keep only two feature columns (0 and 2) so the regions can be drawn in 2D
X = X[:, [0, 2]]

# Plotting Decision Regions
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         ['Logistic Regression', 'Random Forest', 'SVM', 'Ensemble'],
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X, y)
    ax = plt.subplot(gs[grd[0], grd[1]])
    fig = plot_decision_regions(X=X, y=y, clf=clf, legend=2)
    plt.title(lab)
plt.show()
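The example above builds the ensemble with `voting='soft'` and `weights=[2, 1, 1]`. As a rough illustration of what weighted soft voting means, the sketch below averages per-classifier class probabilities with those weights and picks the class with the highest average. It uses only the standard library; the `soft_vote` function and the probability values are made up for illustration and are not mlxtend's implementation.

```python
def soft_vote(probas, weights):
    """Return (averaged probabilities, winning class index) for one sample.

    probas: one list of class probabilities per classifier.
    weights: one non-negative weight per classifier.
    """
    total = sum(weights)
    n_classes = len(probas[0])
    # Weighted average of each class probability across classifiers.
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    # The predicted class is the one with the largest averaged probability.
    winner = max(range(n_classes), key=lambda c: avg[c])
    return avg, winner


# Made-up probabilities for a single sample with three classes.
probas = [
    [0.2, 0.7, 0.1],  # stands in for clf1 (weight 2)
    [0.6, 0.3, 0.1],  # stands in for clf2 (weight 1)
    [0.5, 0.4, 0.1],  # stands in for clf3 (weight 1)
]
avg, winner = soft_vote(probas, weights=[2, 1, 1])
print(avg, winner)
```

In this made-up case two of the three classifiers individually prefer class 0, yet the weighted average selects class 1, because the first classifier counts double and is fairly confident.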
