Set up analysis


Defines functions used for analysis.

Functions used to import training and analysis data


Both of these functions import datacube data using a query, and return an xarray dataset with multiple bands/variables and 'geo_transform' and 'proj' attributes. This format is required as an input to both randomforest_train and randomforest_classify, and ensures that both training and analysis data are consistent.

Import training data and fit model


Uses randomforest_train to extract training data from one or more training shapefiles, and returns a trained classifier (and, optionally, the training label and training sample arrays).
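
As a rough illustration of the fitting step only, the sketch below trains a scikit-learn random forest on synthetic samples. It assumes the classifier is a scikit-learn RandomForestClassifier, which this notebook text does not confirm, and it does not reproduce randomforest_train or the shapefile extraction. The array names and synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-in for pixels sampled beneath training polygons:
# 200 samples x 4 bands/variables, with two classes.
rng = np.random.default_rng(0)
train_samples = rng.normal(size=(200, 4))
train_labels = (train_samples[:, 0] + train_samples[:, 1] > 0).astype(int)

# Fit the ensemble; n_estimators is the number of trees in the forest.
classifier = RandomForestClassifier(n_estimators=50, random_state=0)
classifier.fit(train_samples, train_labels)

print("Bands seen during training:", classifier.n_features_in_)
print("Classes:", classifier.classes_)
```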

Import analysis data and classify


Classifies and exports an analysis dataset using a previously trained random forest classifier, provided this dataset has the same number of bands/variables as the data used to train the classifier. Using the same data function (e.g. tc_import, hltc_import) that was used to train the classifier will ensure this is the case. Setting 'class_prob = True' additionally exports a geotiff of predicted class probabilities alongside the classification output.
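
The band-count requirement and the optional probability output can be sketched as follows. This is a minimal example that assumes a scikit-learn classifier and a plain NumPy array of shape (bands, rows, cols). It is not the randomforest_classify implementation, and it skips the geotiff export entirely. All variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Train a small classifier on synthetic 3-band data (hypothetical).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(300, 3))
y_train = (X_train[:, 0] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=25, random_state=1).fit(X_train, y_train)

# Hypothetical analysis raster with shape (bands, rows, cols).
analysis = rng.normal(size=(3, 10, 12))
n_bands, n_rows, n_cols = analysis.shape

# The analysis data must have the same number of bands as the training data.
if n_bands != clf.n_features_in_:
    raise ValueError(
        f"Expected {clf.n_features_in_} bands, but analysis data has {n_bands}"
    )

# Flatten to (n_pixels, n_bands), predict, then restore the raster shape.
pixels = analysis.reshape(n_bands, -1).T
class_map = clf.predict(pixels).reshape(n_rows, n_cols)

# Optional: one probability layer per class, shape (n_classes, rows, cols).
class_prob = clf.predict_proba(pixels).T.reshape(-1, n_rows, n_cols)
```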

Feature/band/variable importance


Extracts the classifier's estimates of the relative importance of each band/variable for training the classifier. Useful for selecting a subset of input bands/variables for model training/classification (i.e. optimising the feature space).
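
For a scikit-learn random forest (an assumption here), these estimates are exposed as the feature_importances_ attribute, which sums to 1 across bands. The sketch below ranks hypothetical band names on synthetic data in which only one band carries signal.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical band names and synthetic data; only band_3 is informative.
band_names = ["band_1", "band_2", "band_3", "band_4"]
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 4))
y = (X[:, 2] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=2).fit(X, y)

# Impurity-based importances, one value per band, normalised to sum to 1.
importances = clf.feature_importances_
ranked = sorted(zip(band_names, importances), key=lambda pair: pair[1], reverse=True)

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Bands with consistently low importance are candidates for removal before retraining.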

Export tree diagrams


Export .png plots of each decision tree in the random forest ensemble. Useful for inspecting the splits used by the classifier to classify the data.
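
One possible way to do this, assuming a scikit-learn forest, is to write each tree to a Graphviz DOT file with sklearn.tree.export_graphviz and render it to .png separately (for example with the Graphviz command `dot -Tpng tree_0.dot -o tree_0.png`). The sketch below only writes the DOT files, so it runs without Graphviz installed. The output directory and feature names are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

# Small synthetic forest (hypothetical data) kept shallow so plots stay readable.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=5, max_depth=3, random_state=3).fit(X, y)

# Write one DOT file per tree in the ensemble.
out_dir = Path(tempfile.mkdtemp())
dot_files = []
for i, tree in enumerate(clf.estimators_):
    dot_path = out_dir / f"tree_{i}.dot"
    export_graphviz(
        tree,
        out_file=str(dot_path),
        feature_names=["band_1", "band_2", "band_3"],
        filled=True,
    )
    dot_files.append(dot_path)

print(f"Wrote {len(dot_files)} DOT files to {out_dir}")
```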

Plot performance of model by parameter values


Random forest classifiers contain many modifiable parameters that can strongly affect the performance of the model. This section evaluates the effect of these parameters by plotting out-of-bag (OOB) error for a set of classifier parameter scenarios, and exports the resulting plots to file.
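
A minimal sketch of the OOB comparison, assuming a scikit-learn forest: fitting with oob_score=True makes the OOB accuracy available as oob_score_, so OOB error is 1 - oob_score_. The parameter scenarios and synthetic data below are hypothetical, and plotting is left out so the example needs no plotting library.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic training data (hypothetical): 500 samples x 6 bands.
rng = np.random.default_rng(4)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Parameter scenarios to compare: max_features setting x number of trees.
oob_error = {}
for max_features in ("sqrt", None):
    for n_estimators in (25, 50, 100):
        clf = RandomForestClassifier(
            n_estimators=n_estimators,
            max_features=max_features,
            oob_score=True,   # score each sample with trees that did not see it
            bootstrap=True,   # OOB estimates require bootstrap sampling
            random_state=4,
        ).fit(X, y)
        oob_error[(max_features, n_estimators)] = 1.0 - clf.oob_score_

for (max_features, n_estimators), err in oob_error.items():
    print(f"max_features={max_features}, n_estimators={n_estimators}: OOB error {err:.3f}")
```

Each max_features setting gives one curve of OOB error against n_estimators, which can then be plotted and saved to file.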

Visualise random forest structure


Visualises the internal structure of the forest ensemble using histograms of leaf depths and number of samples.
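
The values behind such histograms can be collected by walking each fitted tree. The sketch below assumes scikit-learn trees, whose tree_ attribute exposes children_left, children_right and n_node_samples. The helper leaf_depths_and_samples is hypothetical and the plotting itself is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def leaf_depths_and_samples(tree):
    """Return the depth and sample count of every leaf in one fitted tree."""
    t = tree.tree_
    depths, samples = [], []
    stack = [(0, 0)]  # (node id, depth), starting at the root
    while stack:
        node, depth = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == right:  # leaves have both children set to -1
            depths.append(depth)
            samples.append(int(t.n_node_samples[node]))
        else:
            stack.append((left, depth + 1))
            stack.append((right, depth + 1))
    return depths, samples


# Small synthetic forest (hypothetical data).
rng = np.random.default_rng(5)
X = rng.normal(size=(300, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=10, random_state=5).fit(X, y)

# Pool leaf depths and leaf sample counts across the whole ensemble.
all_depths, all_samples = [], []
for tree in clf.estimators_:
    depths, samples = leaf_depths_and_samples(tree)
    all_depths.extend(depths)
    all_samples.extend(samples)

# Number of leaves at each depth; index i holds the count for depth i.
depth_counts = np.bincount(all_depths)
print("Leaves per depth:", depth_counts)
print("Median samples per leaf:", np.median(all_samples))
```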


Classification statistics (TBA)


Not currently working; will need a method for incorporating validation data.