Scikit-learn Interface (alpha)¶
This module supports an interface between hyperspectral algorithms and scikit-learn.
- Cross Validation
- HyperAdaBoostClassifier
- HyperBaggingClassifier
- HyperExtraTreesClassifier
- HyperGaussianNB
- HyperGradientBoostingClassifier
- HyperKNeighborsClassifier
- HyperLogisticRegression
- HyperRandomForestClassifier
- Support Vector Supervised Classification (HyperSVC)
- Unsupervised clustering using KMeans
- Utility functions (hyper_scale, shape_to_XY)
See also
See the example file nbex_skl_snow for a use of HyperEstimatorCrossVal and HyperSVC. See test_sklearn for an example.
Note
This is an alpha version. This module will certainly grow with time and anything can change: class names, class interfaces and so on.
Cross Validation¶
class pysptools.skl.HyperEstimatorCrossVal(estimator, param_grid)
Do a cross validation on a hypercube or a concatenation of hypercubes, using scikit-learn KFold and GridSearchCV.
fit(X, y)
Run the cross validation.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
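A minimal sketch of a cross validation run on random stand-in data (the cube M, the class map cmap and the grid values are illustrative, not taken from the pysptools examples); whether the estimator is passed as a class or as an instance should be checked against the nbex_skl_snow example:

    import numpy as np
    from pysptools.skl import HyperEstimatorCrossVal, HyperSVC, shape_to_XY

    # Stand-in data: a (m x n x p) cube and a (m x n) class map.
    M = np.random.rand(20, 20, 50)
    cmap = np.random.randint(0, 3, (20, 20))   # 0 = background, 1..2 = classes

    X, y = shape_to_XY([M], [cmap])
    # Grid over two HyperSVC constructor parameters.
    param_grid = {'C': [1.0, 10.0, 100.0], 'gamma': [0.001, 0.01]}
    cv = HyperEstimatorCrossVal(HyperSVC, param_grid)
    cv.fit(X, y)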
HyperAdaBoostClassifier¶
class pysptools.skl.HyperAdaBoostClassifier(base_estimator=None, n_estimators=50, learning_rate=1.0, algorithm='SAMME.R', random_state=None)
Apply scikit-learn AdaBoostClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.ensemble.AdaBoostClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
display_feature_importances(n_labels='all', height=0.2, sort=False, suffix=None)
Display the feature importances. The output can be split into n graphs.
Parameters:
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
fit(X, y, sample_weight=None)
Same as the sklearn.ensemble.AdaBoostClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
- sample_weight – array-like of shape = [n_samples], optional. Sample weights. If None, the sample weights are initialized to 1 / n_samples.
fit_rois(M, ROIs)
Fit the HS cube M with the use of ROIs.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
- ROIs – ROIs type. Regions of interest instance.
plot_feature_importances(path, n_labels='all', height=0.2, sort=False, suffix=None)
Plot the feature importances. The output can be split into n graphs.
Parameters:
- path – string. The path where to save the plot.
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
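A hedged usage sketch on random stand-in data (variable names and the output path are illustrative); the same fit/classify/plot_feature_importances pattern applies to the other ensemble classifiers below:

    import numpy as np
    from pysptools.skl import HyperAdaBoostClassifier, shape_to_XY

    M = np.random.rand(20, 20, 50)             # stand-in (m x n x p) cube
    cmap = np.random.randint(0, 3, (20, 20))   # stand-in training class map

    X, y = shape_to_XY([M], [cmap])
    model = HyperAdaBoostClassifier(n_estimators=50)
    model.fit(X, y)
    amap = model.classify(M)                   # (m x n x 1) class map
    # Plot the band importances, 20 labels per graph, sorted.
    model.plot_feature_importances('./results', n_labels=20, sort=True)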
HyperBaggingClassifier¶
class pysptools.skl.HyperBaggingClassifier(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=1, random_state=None, verbose=0)
Apply scikit-learn BaggingClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.ensemble.BaggingClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
fit(X, y, sample_weight=None)
Same as the sklearn.ensemble.BaggingClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
- sample_weight – array-like, shape = [n_samples] or None. Sample weights. If None, then samples are equally weighted. Note that this is supported only if the base estimator supports sample weighting.
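A sketch of fitting with sample weights on stand-in data (the weighting scheme is illustrative); the default tree base estimator supports sample weighting:

    import numpy as np
    from pysptools.skl import HyperBaggingClassifier, shape_to_XY

    M = np.random.rand(20, 20, 50)
    cmap = np.random.randint(0, 3, (20, 20))
    X, y = shape_to_XY([M], [cmap])

    # Illustrative weighting: down-weight background pixels (label 0).
    w = np.where(y == 0, 0.5, 1.0)
    model = HyperBaggingClassifier(n_estimators=20)
    model.fit(X, y, sample_weight=w)
    result = model.classify(M)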
HyperExtraTreesClassifier¶
class pysptools.skl.HyperExtraTreesClassifier(n_estimators=10, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_split=1e-07, bootstrap=False, oob_score=False, n_jobs=1, random_state=None, verbose=0, warm_start=False, class_weight=None)
Apply scikit-learn ExtraTreesClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.ensemble.ExtraTreesClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
display_feature_importances(n_labels='all', height=0.2, sort=False, suffix=None)
Display the feature importances. The output can be split into n graphs.
Parameters:
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
fit(X, y, sample_weight=None)
Same as the sklearn.ensemble.ExtraTreesClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
- sample_weight – array-like, shape = [n_samples] or None. Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node. In the case of classification, splits are also ignored if they would result in any single class carrying a negative weight in either child node.
fit_rois(M, ROIs)
Fit the HS cube M with the use of ROIs.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
- ROIs – ROIs type. Regions of interest instance.
plot_feature_importances(path, n_labels='all', height=0.2, sort=False, suffix=None)
Plot the feature importances. The output can be split into n graphs.
Parameters:
- path – string. The path where to save the plot.
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
HyperGaussianNB¶
class pysptools.skl.HyperGaussianNB(priors=None)
Apply scikit-learn GaussianNB on a hypercube.
For the __init__ constructor parameters, see the sklearn.naive_bayes.GaussianNB class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
fit(X, y, sample_weight=None)
Same as the sklearn.naive_bayes.GaussianNB fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
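A sketch of training on a concatenation of two labeled cubes (all data are random stand-ins), which shape_to_XY makes straightforward:

    import numpy as np
    from pysptools.skl import HyperGaussianNB, shape_to_XY

    # Two stand-in labeled cubes with the same number of bands.
    M1 = np.random.rand(20, 20, 50)
    M2 = np.random.rand(30, 20, 50)
    cmap1 = np.random.randint(0, 3, (20, 20))
    cmap2 = np.random.randint(0, 3, (30, 20))

    X, y = shape_to_XY([M1, M2], [cmap1, cmap2])
    gnb = HyperGaussianNB()
    gnb.fit(X, y)
    result = gnb.classify(M1)   # (m x n x 1) class map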
HyperGradientBoostingClassifier¶
class pysptools.skl.HyperGradientBoostingClassifier(loss='deviance', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_split=1e-07, init=None, random_state=None, max_features=None, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto')
Apply scikit-learn GradientBoostingClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.ensemble.GradientBoostingClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
display_feature_importances(n_labels='all', height=0.2, sort=False, suffix=None)
Display the feature importances. The output can be split into n graphs.
Parameters:
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
fit(X, y)
Same as the sklearn.ensemble.GradientBoostingClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
fit_rois(M, ROIs)
Fit the HS cube M with the use of ROIs.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
- ROIs – ROIs type. Regions of interest instance.
plot_feature_importances(path, n_labels='all', height=0.2, sort=False, suffix=None)
Plot the feature importances. The output can be split into n graphs.
Parameters:
- path – string. The path where to save the plot.
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
HyperKNeighborsClassifier¶
class pysptools.skl.HyperKNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=1, **kwargs)
Apply scikit-learn KNeighborsClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.neighbors.KNeighborsClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
fit(X, y)
Same as the sklearn.neighbors.KNeighborsClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
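Since the class is instrumented for cross validation, a natural sketch (random stand-in data, illustrative grid) is tuning n_neighbors with HyperEstimatorCrossVal:

    import numpy as np
    from pysptools.skl import (HyperEstimatorCrossVal,
                               HyperKNeighborsClassifier, shape_to_XY)

    M = np.random.rand(20, 20, 50)
    cmap = np.random.randint(0, 3, (20, 20))
    X, y = shape_to_XY([M], [cmap])

    cv = HyperEstimatorCrossVal(HyperKNeighborsClassifier,
                                {'n_neighbors': [3, 5, 7]})
    cv.fit(X, y)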
HyperLogisticRegression¶
class pysptools.skl.HyperLogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)
Apply scikit-learn LogisticRegression on a hypercube.
For the __init__ constructor parameters, see the sklearn.linear_model.LogisticRegression class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
fit(X, y)
Same as the sklearn.linear_model.LogisticRegression fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
HyperRandomForestClassifier¶
class pysptools.skl.HyperRandomForestClassifier(n_estimators=10, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, bootstrap=True, oob_score=False, n_jobs=1, random_state=None, verbose=0, warm_start=False, class_weight=None)
Apply scikit-learn RandomForestClassifier on a hypercube.
For the __init__ constructor parameters, see the sklearn.ensemble.RandomForestClassifier class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
classify(M)
Classify a hyperspectral cube.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
display_feature_importances(n_labels='all', height=0.2, sort=False, suffix=None)
Display the feature importances. The output can be split into n graphs.
Parameters:
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
fit(X, y)
Same as the sklearn.ensemble.RandomForestClassifier fit call.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
fit_rois(M, ROIs)
Fit the HS cube M with the use of ROIs.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
- ROIs – ROIs type. Regions of interest instance.
plot_feature_importances(path, n_labels='all', height=0.2, sort=False, suffix=None)
Plot the feature importances. The output can be split into n graphs.
Parameters:
- path – string. The path where to save the plot.
- n_labels – string or integer. The number of labels to output per graph. If the value is 'all', only one graph is generated.
- height – float [default 0.2]. The bar height (in fact, the width).
- sort – boolean [default False]. If True, the feature importances are sorted.
- suffix – string [default None]. Add a suffix to the file name.
Support Vector Supervised Classification (HyperSVC)¶
See test_HyperSVC.py for an example.
class pysptools.skl.HyperSVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape=None, random_state=None)
Apply scikit-learn SVC on a hypercube.
For the __init__ constructor parameters, see the sklearn.svm.SVC class parameters.
The class is instrumented to be used with the scikit-learn cross validation. It uses the plot and display methods from the class Output.
Note: the class always does a preprocessing.scale before any processing.
Note: the C parameter is set to 1. The result of this setting is that the class_weight values are relative to C and that the first value of class_weight is the background. An example: if you wish to fit two classes "1" and "2" with the help of one ROI for each, you declare class_weight like this (see the sketch after this note):
- class_weight={0:1, 1:10, 2:10}
where 0 is always the background and is set to 1, 1 is the first class and 2 is the second. A value of 10 for both classes gives good results to start with.
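A sketch of that class_weight declaration on random stand-in data (two classes plus background; the weights are the starting values suggested above):

    import numpy as np
    from pysptools.skl import HyperSVC, shape_to_XY

    M = np.random.rand(20, 20, 50)
    cmap = np.random.randint(0, 3, (20, 20))   # 0 = background, classes 1 and 2
    X, y = shape_to_XY([M], [cmap])

    # C keeps its default of 1; the background class 0 keeps a weight of 1.
    model = HyperSVC(class_weight={0: 1, 1: 10, 2: 10})
    model.fit(X, y)              # preprocessing.scale is applied internally
    result = model.classify(M)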
classify(M)
Classify a hyperspectral cube. Does a preprocessing.scale first.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
Returns: numpy array. A class map (m x n x 1).
fit(X, y)
Same as the sklearn.svm.SVC fit call, but with a preprocessing.scale call first.
Parameters:
- X – numpy array. An array (n_samples, n_features) where each row of n_features values is a spectrum.
- y – numpy array. Target values (n_samples,). A zero value is the background. A value of one or more is a class value.
Unsupervised clustering using KMeans¶
See the file test_kmeans.py for an example.
class pysptools.skl.KMeans
KMeans clustering algorithm adapted to hyperspectral imaging.
display(interpolation='none', colorMap='Accent', suffix=None)
Display the cluster map.
Parameters:
- interpolation – string [default none]. A matplotlib interpolation method.
- colorMap – string [default 'Accent']. A color map element of ['Accent', 'Dark2', 'Paired', 'Pastel1', 'Pastel2', 'Set1', 'Set2', 'Set3']; 'Accent' is the default and it falls back on 'Jet'.
- suffix – string [default None]. Add a suffix to the title.
plot(path, interpolation='none', colorMap='Accent', suffix=None)
Plot the cluster map.
Parameters:
- path – string. The path where to put the plot.
- interpolation – string [default none]. A matplotlib interpolation method.
- colorMap – string [default 'Accent']. A color map element of ['Accent', 'Dark2', 'Paired', 'Pastel1', 'Pastel2', 'Set1', 'Set2', 'Set3']; 'Accent' is the default and it falls back on 'Jet'.
- suffix – string [default None]. Add a suffix to the file name.
predict(M, n_clusters=5, n_jobs=1, init='k-means++')
KMeans clustering algorithm adapted to hyperspectral imaging. It is a simple wrapper to the scikit-learn version.
Parameters:
- M – numpy array. A HSI cube (m x n x p).
- n_clusters – int [default 5]. The number of clusters to generate.
- n_jobs – int [default 1]. Taken from the scikit-learn doc: The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
- init – string or array [default 'k-means++']. Taken from the scikit-learn doc: Method for initialization, defaults to 'k-means++'. 'k-means++' selects initial cluster centers for k-means clustering in a smart way to speed up convergence. See the Notes section in k_init for more details. 'random': choose k observations (rows) at random from the data for the initial centroids. If an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.
Returns: numpy array. A cluster map (m x n x c), where c is the number of clusters.
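A short sketch on a random stand-in cube (the output path is illustrative):

    import numpy as np
    from pysptools.skl import KMeans

    M = np.random.rand(20, 20, 50)              # stand-in (m x n x p) cube
    km = KMeans()
    cluster_map = km.predict(M, n_clusters=5)
    km.plot('./results', colorMap='Accent')     # saves the cluster map plot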
hyper_scale¶
shape_to_XY¶
pysptools.skl.shape_to_XY(M_list, cmap_list)
Receives as input a list of hypercubes and the corresponding list of masks. The function reshapes and concatenates both to create the X and Y arrays.
Parameters:
- M_list – numpy array list. A list of HSI cubes (m x n x p).
- cmap_list – numpy array list. A list of class maps (m x n); as usual the classes are numbered: 0 for the background, 1 for the first class …
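A sketch with two stand-in cubes of different heights; the pixel counts add up in X and y:

    import numpy as np
    from pysptools.skl import shape_to_XY

    M1 = np.random.rand(20, 20, 50)
    M2 = np.random.rand(30, 20, 50)
    cmap1 = np.random.randint(0, 3, (20, 20))
    cmap2 = np.random.randint(0, 3, (30, 20))

    X, y = shape_to_XY([M1, M2], [cmap1, cmap2])
    # X: (20*20 + 30*20, 50) = (1000, 50) spectra; y: the matching labels.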