# Agnostic Bayes Ensemble Documentation
## Overview
I have to thank my employer Auticon Berlin for letting me develop this package during my working time. Agnostic Bayes Ensemble is intended to be a base technology that will be refined over time. Furthermore, it forms one pillar of an upcoming machine learning framework, which is supposed to consist of three broad branches:
- cleaning and transformation of datasets.
- ensemble algorithms.
- generally applicable meta parameter learning.
There are minimal requirements regarding the installation and usage of this package. Right now, the only prerequisite is running on a machine with Julia 1.X installed. However, in upcoming releases GPU support in the form of CUDA will be integrated; from then on, the CUDA development kit will become a prerequisite.
This package has been developed to facilitate increased predictive performance by combining raw base models in an agnostic fashion, i.e. the methods don't use any assumptions regarding the underlying raw models. Furthermore, we specifically implemented ensemble algorithms that can deal with arbitrary loss functions and with both regression and classification problems. This holds true for all algorithms except the dirichletPosteriorEstimation algorithm, which is limited to classification problems.
The algorithms bootstrapPosteriorEstimation, bootstrapPosteriorCorEstimation, dirichletPosteriorEstimation, and TDistPosteriorEstimation infer an actual posterior distribution.
The algorithms δOptimizationMSE, δOptimizationHinge, δOptimizationHingeRegularized, and δOptimizationMSERegularized do not; these algorithms infer mixing coefficients that are not required to be true probability distributions.
Hint: In most cases it is advisable to deactivate hyperthreading for best performance. However, in some rare cases, depending on the hardware platform the package runs on, you will get the best performance with hyperthreading enabled. To be sure, it is best practice to measure the performance with and without hyperthreading.
## Generic methods
Make a prediction given trained mixing coefficients and an input matrix.

AgnosticBayesEnsemble.predictEnsemble — Function

`predictEnsemble( predictions::Matrix{Float64}, weights::Vector{Float64} )`

Perform Bayesian ensemble prediction.

# Arguments
- `predictions::Matrix{Float64}`: each column is the prediction of one hypothesis.
- `weights::Vector{Float64}`: mixing weights.

# Return
- `Vector{Float64}`: prediction y.
`predictEnsemble( predictions::Vector{Matrix{Float64}}, weights::Vector{Float64} )`

Perform Bayesian ensemble prediction.

# Arguments
- `predictions::Vector{Matrix{Float64}}`: each matrix is the prediction of one hypothesis.
- `weights::Vector{Float64}`: mixing weights.

# Return
- `Vector{Float64}`: prediction y.
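A minimal usage sketch (toy predictions and uniform weights; any trained mixing coefficients work the same way):

```julia
using AgnosticBayesEnsemble

# three hypotheses, five samples: each column is the prediction of one model
predictions = [ 0.9 0.8 1.0;
                0.1 0.2 0.0;
                0.8 0.7 0.9;
                0.2 0.1 0.3;
                0.6 0.5 0.7 ];
weights = fill( 1/3, 3 );            # uniform mixing weights
yEnsemble = predictEnsemble( predictions, weights );
```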
## List of algorithms
Basic algorithm for computing a true posterior distribution using bootstrap sampling and arbitrary loss functions.

AgnosticBayesEnsemble.bootstrapPosteriorEstimation — Function

`bootstrapPosteriorEstimation( errMat::Matrix{Float64}, samplingBatchSize::Int64, nrRuns::Int64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `samplingBatchSize::Int64`: sample size per main iteration.
- `nrRuns::Int64`: number of passes over predictions.

# Return
- `Vector{Float64}`: distribution p( h* = h | S ).
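A minimal sketch of a call, assuming an error matrix has already been computed (here a toy random one):

```julia
using AgnosticBayesEnsemble

errMat = rand( 1000, 4 );            # toy error matrix: each column one hypothesis
p = bootstrapPosteriorEstimation( errMat, 64, 10000 );
```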
Basic algorithm for computing a true posterior distribution using bootstrap sampling and arbitrary loss functions; in-place version that returns the result via the parameter `p`.

AgnosticBayesEnsemble.bootstrapPosteriorEstimation! — Function

`bootstrapPosteriorEstimation!( errMat::Matrix{Float64}, samplingBatchSize::Int64, nrRuns::Int64, p::Array{Float64} )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `samplingBatchSize::Int64`: sample size per main iteration.
- `nrRuns::Int64`: number of passes over predictions.
- `p::Vector{Float64}`: resulting posterior p( h* = h | S ).

# Return
- `nothing`: nothing.
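A sketch of the in-place variant; the preallocated buffer `p` only needs one entry per hypothesis:

```julia
using AgnosticBayesEnsemble

errMat = rand( 1000, 4 );                       # toy error matrix
p = zeros( Float64, size( errMat, 2 ) );        # preallocated result buffer
bootstrapPosteriorEstimation!( errMat, 64, 10000, p );
# p now holds the posterior p( h* = h | S )
```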
Basic algorithm for computing a true posterior distribution using bootstrap sampling and the linear correlation.

AgnosticBayesEnsemble.bootstrapPosteriorCorEstimation — Function

`bootstrapPosteriorCorEstimation( predictions::Matrix{Float64}, t::Vector{Float64}, samplingBatchSize::Int64, nrRuns::Int64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `predictions::Matrix{Float64}`: each column is the prediction of one hypothesis.
- `t::Vector{Float64}`: label vector.
- `samplingBatchSize::Int64`: sample size per main iteration.
- `nrRuns::Int64`: number of main iterations.

# Return
- `Vector{Float64}`: posterior p( h* = h | S ).

`bootstrapPosteriorCorEstimation( predictions::Vector{Matrix{Float64}}, T::Matrix{Float64}, samplingFactor::Float64, nrRuns::Int64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `predictions::Vector{Matrix{Float64}}`: each matrix is the prediction of one hypothesis.
- `T::Matrix{Float64}`: label matrix.
- `samplingFactor::Float64`: sampling factor determining the sample size per main iteration.
- `nrRuns::Int64`: number of main iterations.

# Return
- `Vector{Float64}`: posterior p( h* = h | S ).
Advanced algorithm, probabilistic inference using a Dirichlet prior.

AgnosticBayesEnsemble.dirichletPosteriorEstimation — Function

`dirichletPosteriorEstimation( errMat::Matrix{Float64}, G::Matrix{Float64}, nrRuns::Int64, α_::Float64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `G::Matrix{Float64}`: transformation matrix G.
- `nrRuns::Int64`: number of sampling runs.
- `α_::Float64`: scalar prior parameter.

# Return
- `Vector{Float64}`: posterior distribution p( h* = h | S ).

`dirichletPosteriorEstimation( errMat::Matrix{Float64}, nrRuns::Int64, α_::Float64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `nrRuns::Int64`: number of main iterations.
- `α_::Float64`: scalar prior parameter.

# Return
- `Vector{Float64}`: posterior p( h* = h | S ).
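A minimal sketch, using a toy {0,1}-loss error matrix (the Dirichlet algorithm is limited to classification, see the table below):

```julia
using AgnosticBayesEnsemble

errMat = Float64.( rand( 0:1, 1000, 4 ) );      # toy {0,1}-loss errors, 4 hypotheses
p = dirichletPosteriorEstimation( errMat, 10000, 1.0 );
```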
Advanced algorithm, probabilistic inference using a Dirichlet prior; improved performance and hardware usage under certain parameters.

AgnosticBayesEnsemble.dirichletPosteriorEstimationV2 — Function

`dirichletPosteriorEstimationV2( errMat::Matrix{Float64}, G::Matrix{Float64}, nrRuns::Int64, α_::Float64, sampleSize::Int64 )`

Compute posterior p( h* = h | S ), alternative version for improved performance.

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `G::Matrix{Float64}`: transformation matrix G.
- `nrRuns::Int64`: number of sampling runs.
- `α_::Float64`: scalar prior parameter.
- `sampleSize::Int64`: number of samples per run.

# Return
- `Vector{Float64}`: posterior distribution p( h* = h | S ).

`dirichletPosteriorEstimationV2( errMat::Matrix{Float64}, nrRuns::Int64, α_::Float64, sampleSize::Int64 )`

Compute posterior p( h* = h | S ), alternative version for improved performance.

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `nrRuns::Int64`: number of sampling runs.
- `α_::Float64`: scalar prior parameter.
- `sampleSize::Int64`: number of samples per run.

# Return
- `Vector{Float64}`: posterior distribution p( h* = h | S ).
Precomputation of the transformation matrix G; it should be computed once and reused if dirichletPosteriorEstimation gets called several times.

AgnosticBayesEnsemble.GMatrix — Function

`GMatrix( d::Int64 )`

Compute transformation matrix G.

# Arguments
- `d::Int64`: number of hypotheses used for prediction.

# Return
- `Matrix{Float64}`: transformation matrix G.
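A sketch of the intended reuse pattern, assuming several estimation runs share the same number of hypotheses:

```julia
using AgnosticBayesEnsemble

errMat = Float64.( rand( 0:1, 1000, 4 ) );      # toy {0,1}-loss errors
G = GMatrix( size( errMat, 2 ) );               # precompute once for 4 hypotheses
p1 = dirichletPosteriorEstimation( errMat, G, 10000, 1.0 );
p2 = dirichletPosteriorEstimation( errMat, G, 10000, 2.0 );
```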
Advanced algorithm, probabilistic inference using a Dirichlet prior; in-place version that returns the result via the parameter `p`.

AgnosticBayesEnsemble.dirichletPosteriorEstimation! — Function

`dirichletPosteriorEstimation!( errMat::Matrix{Float64}, nrRuns::Int64, α_::Float64, p::Vector{Float64} )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `nrRuns::Int64`: number of passes over predictions.
- `α_::Float64`: scalar prior parameter.
- `p::Vector{Float64}`: resulting posterior p( h* = h | S ).

# Return
- `nothing`: nothing.
Parameter search for the prior parameter α.

AgnosticBayesEnsemble.metaParamSearchValidationDirichlet — Function

`metaParamSearchValidationDirichlet( Y::Matrix{Float64}, t::Vector{Float64}, nrRuns::Int64, minVal::Float64, maxVal::Float64, nSteps::Int64, holdout::Float64, lossFunc )`

Compute best α parameter regarding predictive performance.

# Arguments
- `Y::Matrix{Float64}`: each column is the prediction of one hypothesis.
- `t::Vector{Float64}`: label vector.
- `nrRuns::Int64`: number of passes over predictions.
- `minVal::Float64`: minimum value of α.
- `maxVal::Float64`: maximum value of α.
- `nSteps::Int64`: number of steps between min and max value.
- `holdout::Float64`: fraction of the data used as holdout set.
- `lossFunc`: error function handle.

# Return
- `(Vector{Float64}, Vector{Float64})`: α sequence and corresponding performance sequence.
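A sketch of picking α from the returned sequences; the loss handle here is a hypothetical mean squared error, and we assume lower values of the performance sequence are better:

```julia
using AgnosticBayesEnsemble
using Statistics

Y = Float64.( rand( 0:1, 1000, 4 ) );           # toy predictions
t = Float64.( rand( 0:1, 1000 ) );              # toy labels
lossFunc = ( yH, y ) -> mean( ( yH .- y ).^2 ); # hypothetical loss handle
αSeq, perfSeq = metaParamSearchValidationDirichlet( Y, t, 1000, 0.1, 2.0, 20, 0.3, lossFunc );
αBest = αSeq[argmin( perfSeq )];                # assuming lower loss is better
```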
Advanced algorithm, probabilistic inference using a T-distribution prior.

AgnosticBayesEnsemble.TDistPosteriorEstimation — Function

`TDistPosteriorEstimation( errMat::Matrix{Float64}, nrRuns::Int64; κ_0::Float64=1.0, v_0::Float64=Float64( size( errMat, 2 ) ), α::Float64=0.5, β::Float64=0.25 )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `nrRuns::Int64`: number of main iterations.
- `κ_0::Float64=1.0`: regularization parameter.
- `v_0::Float64=Float64( size( errMat, 2 ) )`: regularization parameter.
- `α::Float64=0.5`: regularization parameter.
- `β::Float64=0.25`: regularization parameter.

# Return
- `Vector{Float64}`: posterior p( h* = h | S ).
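A sketch showing the keyword arguments with their documented defaults spelled out; all four are optional:

```julia
using AgnosticBayesEnsemble

errMat = rand( 1000, 4 );                       # toy error matrix
p = TDistPosteriorEstimation( errMat, 10000;
                              κ_0=1.0,
                              v_0=Float64( size( errMat, 2 ) ),
                              α=0.5,
                              β=0.25 );
```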
Advanced algorithm, probabilistic inference using a T-distribution prior; reference algorithm.

AgnosticBayesEnsemble.TDistPosteriorEstimationReference — Function

`TDistPosteriorEstimationReference( errMat::Matrix{Float64}, nrRuns::Int64 )`

Compute posterior p( h* = h | S ).

# Arguments
- `errMat::Matrix{Float64}`: each column is the prediction error of one hypothesis.
- `nrRuns::Int64`: number of main iterations.

# Return
- `Vector{Float64}`: posterior p( h* = h | S ).
## Fine tuning algorithms
Given a solution to the ensemble learning problem, this method seeks to further improve the solution by refining it using unconstrained optimization under the mean squared error loss function.
The resulting solutions aren't guaranteed to be valid probability distributions.
AgnosticBayesEnsemble.directOptimNaiveMSE — Function

`directOptimNaiveMSE( predMat::Matrix{Float64}, t::Vector{Float64}, p::Vector{Float64} )`

Compute refined solution for mixing parameter p.

# Arguments
- `predMat::Matrix{Float64}`: each column is the prediction of one hypothesis.
- `t::Vector{Float64}`: label vector.
- `p::Vector{Float64}`: initial solution for mixing coefficients.

# Return
- `Vector{Float64}`: improved solution.
Given a solution to the ensemble learning problem, this method seeks to further improve the solution by refining it using unconstrained optimization under the hinge loss function.
AgnosticBayesEnsemble.directOptimHinge — Function

`directOptimHinge( predMat::Matrix{Float64}, t::Vector{Float64}, p::Vector{Float64} )`

Compute refined solution for mixing parameter p.

# Arguments
- `predMat::Matrix{Float64}`: each column is the prediction of one hypothesis.
- `t::Vector{Float64}`: label vector.
- `p::Vector{Float64}`: initial solution for mixing coefficients.

# Return
- `Vector{Float64}`: improved solution.
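A sketch of the refinement step, assuming a posterior from one of the estimation algorithms above serves as the starting point:

```julia
using AgnosticBayesEnsemble

predMat = rand( 1000, 4 );                                    # toy predictions
t = rand( 1000 );                                             # toy labels
p0 = bootstrapPosteriorEstimation( ( predMat .- t ).^2, 64, 10000 );
pRefined = directOptimNaiveMSE( predMat, t, p0 );             # may leave the probability simplex
```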
## Tutorials

### Low level interface

The interface was designed to be easy to use: all parameters needed by the algorithms in this package are either the predictions y1, y2, y3, …, yk of the raw models along with the label vector t, or alternatively the errors e1, e2, e3, …, ek between predicted labels and the ground truth t. Some of the methods need additional (prior) parameters, but this simple basic structure is consistent across all ensemble methods implemented in this package. An example of building the error matrix follows below.
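For instance, an error matrix for squared loss can be built from raw predictions and labels like this (a sketch; any per-sample loss works the same way):

```julia
# columns of Y are the raw model predictions y1, …, yk; t is the label vector
Y = rand( 1000, 4 );
t = rand( 1000 );
errMat = ( Y .- t ).^2;    # columns are the error vectors e1, …, ek
```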
### Examples
"""
using AgnosticBayesEnsemble
using DataFrames
using Random
using Statistics
#== create artificial predictions and ground truth ==#
function distortBinaryPrediction( y::BitArray{1}, distortionFactor::Float64 )
res = deepcopy( y );
indices = rand( 1:1:size( y, 1 ), round( Int64, distortionFactor * size( y, 1 ) ) );
res[indices] = .!y[indices];
return res;
end
n = 100000;
y = Bool.( rand( 0:1,n ) );
yH1 = distortBinaryPrediction( y, 0.20 );
yH2 = distortBinaryPrediction( y, 0.21 );
yH3 = distortBinaryPrediction( y, 0.22 );
yH4 = distortBinaryPrediction( y, 0.23 );
yH5 = distortBinaryPrediction( y, 0.24 );
yH6 = distortBinaryPrediction( y, 0.24 );
yH7 = distortBinaryPrediction( y, 0.26 );
yH8 = distortBinaryPrediction( y, 0.27 );
yH9 = distortBinaryPrediction( y, 0.28 );
yH10 = distortBinaryPrediction( y, 0.29 );
yH11 = distortBinaryPrediction( y, 0.30 );
yH12 = distortBinaryPrediction( y, 0.33 );
yH13 = distortBinaryPrediction( y, 0.34 );
yH14 = distortBinaryPrediction( y, 0.35 );
yH15 = distortBinaryPrediction( y, 0.36 );
yH16 = distortBinaryPrediction( y, 0.37 );
#== split generated prediction set into disjoint sets eval and train==#
limit = round( Int64, 0.7 * size( y, 1 ) );
predictions = DataFrame( h1=yH1, h2=yH2, h3=yH3, h4=yH4, h5=yH5, h6=yH6, h7=yH7, h8=yH8, h9=yH9, h10=yH10, h11=yH11, h12=yH12, h13=yH13, h14=yH14, h15=yH15, h16=yH16 );
predTraining = predictions[1:limit,:];
predEval = predictions[limit+1:end,:];
predMatTraining = Matrix{Float64}( predTraining );
predMatEval = Matrix{Float64}( predEval );
errMatTraining = ( repeat( Float64.( y[1:limit] ), outer = [1, size( predictions, 2 )] ) .- predMatTraining ).^2;
sampleSize = 32
nrRuns = 100000
α_ = 1.0
#== use bootstrap correlation algorithm to estimate the model posterior distribution ==#
P = bootstrapPosteriorCorEstimation( predMatTraining, Float64.( y[1:limit] ), sampleSize, nrRuns );
#== use bootstrap algorithm to estimate the model posterior distribution ==#
p = bootstrapPosteriorEstimation( errMatTraining, sampleSize, nrRuns );
#== use Dirichletian algorithm to estimate the model posterior distribution ==#
P = dirichletPosteriorEstimation( errMatTraining, nrRuns, α_ );
#== use T-Distribution algorithm to estimate the model posterior distribution ==#
P = TDistPosteriorEstimation( errMatTraining, nrRuns );
#== make ensemble prediction ==#
prediction = predictEnsemble( predMatEval, p );
"""
## Supported problems per algorithm

| algorithm | univariate Classification | multivariate Classification | univariate Regression | multivariate Regression |
|---|---|---|---|---|
| bootstrap | yes | yes | yes | yes |
| bootstrap cor. | yes | no | yes | no |
| Dirichlet | yes, only {0,1}-loss | yes, only {0,1}-loss | no | no |
| t-distribution | yes | yes | yes | yes |
___
## Supported problems per fine tuning algorithm

| algorithm | univariate Classification | multivariate Classification | univariate Regression | multivariate Regression |
|---|---|---|---|---|
| δOptimizationMSE | yes | no | yes | no |
| δOptimizationHinge | yes | no | no | no |
| δOptimizationHingeRegularized | yes | no | no | no |
| δOptimizationMSERegularized | yes | no | yes | no |
## Index
- AgnosticBayesEnsemble.GMatrix
- AgnosticBayesEnsemble.TDistPosteriorEstimation
- AgnosticBayesEnsemble.TDistPosteriorEstimationReference
- AgnosticBayesEnsemble.bootstrapPosteriorCorEstimation
- AgnosticBayesEnsemble.bootstrapPosteriorEstimation
- AgnosticBayesEnsemble.bootstrapPosteriorEstimation!
- AgnosticBayesEnsemble.directOptimHinge
- AgnosticBayesEnsemble.directOptimNaiveMSE
- AgnosticBayesEnsemble.dirichletPosteriorEstimation
- AgnosticBayesEnsemble.dirichletPosteriorEstimation!
- AgnosticBayesEnsemble.dirichletPosteriorEstimationV2
- AgnosticBayesEnsemble.metaParamSearchValidationDirichlet
- AgnosticBayesEnsemble.predictEnsemble