A quick tour of NMoE

Introduction

NMoE (Normal Mixtures-of-Experts) provides a flexible modelling framework for heterogeneous data with Gaussian distributions. NMoE consists of a mixture of K Normal expert regressors (of degree p) gated by a softmax gating network (of degree q). The model is represented by the parameters of the gating network and, for each expert, its regression coefficients and its standard deviation.

This vignette was written in R Markdown, using the knitr package for production.

See help(package="meteorits") for further details, and citation("meteorits") for the references.
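For reference, the model described above is commonly written as a conditional mixture density. The notation below (Ψ for the full parameter vector, x_p and x_q for the polynomial covariate vectors, and the convention that the last gating vector is zero) is an assumption made for this sketch rather than something stated in this vignette:

```latex
% Sketch of the usual NMoE conditional density; the notation is assumed.
\[
\begin{aligned}
f(y \mid x; \Psi) &= \sum_{k=1}^{K} \pi_k(x; \alpha)\,
  \mathcal{N}\!\left(y;\ \beta_k^{\top} \mathbf{x}_p,\ \sigma_k^{2}\right), \\
\pi_k(x; \alpha) &= \frac{\exp\!\left(\alpha_k^{\top} \mathbf{x}_q\right)}
  {\sum_{l=1}^{K} \exp\!\left(\alpha_l^{\top} \mathbf{x}_q\right)},
  \qquad \alpha_K = 0, \\
\mathbf{x}_p &= (1, x, \dots, x^{p})^{\top}, \qquad
\mathbf{x}_q = (1, x, \dots, x^{q})^{\top}.
\end{aligned}
\]
```

Here the first line is the mixture of K Normal experts, and the second line is the softmax gating network that weights them as a function of the input.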

Application to a simulated dataset

Generate sample

library(meteorits) # Provides sampleUnivNMoE(), emNMoE() and the tempanomalies data

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n (named `sim` to avoid masking base::sample)
sim <- sampleUnivNMoE(alphak = alphak, betak = betak, sigmak = sigmak, x = x)
y <- sim$y
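To make the generating process concrete, the following base R sketch draws from a two-expert model with the same parameter values. It is a hand-rolled illustration, not the code behind sampleUnivNMoE(): the seed, the variable names, and the choice of expert 2 as the reference class of the gate are all assumptions.

```r
# Hand-rolled simulation from a two-expert NMoE; illustrative only.
set.seed(1)  # hypothetical seed, used only to make this sketch repeatable

n <- 500
x <- seq(-1, 1, length.out = n)
X <- cbind(1, x)                       # design matrix for p = q = 1

alpha <- c(0, 8)                       # gating coefficients of expert 1 (assumed: expert 2 is the reference)
beta  <- cbind(c(0, -2.5), c(0, 2.5))  # one column of regression coefficients per expert
sigma <- c(1, 1)                       # standard deviation of each expert

# With two experts, the softmax gate reduces to a logistic function
pi1 <- 1 / (1 + exp(-drop(X %*% alpha)))

# Draw the expert label of each observation, then the response
z  <- ifelse(runif(n) < pi1, 1L, 2L)
mu <- X %*% beta                       # n x 2 matrix of expert means
y_sim <- rnorm(n, mean = mu[cbind(seq_len(n), z)], sd = sigma[z])
```

Each observation first picks an expert according to the gate probability at its input, then draws a Normal response around that expert's regression line.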

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)
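As a rough check on model size, the number of free parameters can be counted by hand. The sketch below assumes the usual parameterization (K - 1 gating vectors of length q + 1, K regression vectors of length p + 1, and K variances); the variable names are illustrative and not part of the package.

```r
K <- 2  # number of experts
p <- 1  # polynomial order of the experts
q <- 1  # polynomial order of the gating network

n_gating  <- (K - 1) * (q + 1)  # gating coefficients (last expert assumed to be the reference)
n_experts <- K * (p + 1)        # regression coefficients
n_var     <- K                  # one variance per expert

df <- n_gating + n_experts + n_var
df
```

With K = 2 and p = q = 1 this count is 8, which agrees with the df value reported by nmoe$summary() in this document.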

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: -841.382104469357
## EM NMoE: Iteration: 2 | log-likelihood: -841.232822861522
## EM NMoE: Iteration: 3 | log-likelihood: -840.961897977121
## EM NMoE: Iteration: 4 | log-likelihood: -840.328493820929
## EM NMoE: Iteration: 5 | log-likelihood: -838.805868340979
## EM NMoE: Iteration: 6 | log-likelihood: -835.284322144359
## EM NMoE: Iteration: 7 | log-likelihood: -827.81480465452
## EM NMoE: Iteration: 8 | log-likelihood: -814.053515931741
## EM NMoE: Iteration: 9 | log-likelihood: -793.354676108963
## EM NMoE: Iteration: 10 | log-likelihood: -769.747643371433
## EM NMoE: Iteration: 11 | log-likelihood: -750.801287680501
## EM NMoE: Iteration: 12 | log-likelihood: -740.110229279826
## EM NMoE: Iteration: 13 | log-likelihood: -735.105474317739
## EM NMoE: Iteration: 14 | log-likelihood: -732.631433195022
## EM NMoE: Iteration: 15 | log-likelihood: -731.225497976821
## EM NMoE: Iteration: 16 | log-likelihood: -730.327369034398
## EM NMoE: Iteration: 17 | log-likelihood: -729.70461805428
## EM NMoE: Iteration: 18 | log-likelihood: -729.251408045268
## EM NMoE: Iteration: 19 | log-likelihood: -728.91300801255
## EM NMoE: Iteration: 20 | log-likelihood: -728.656099114937
## EM NMoE: Iteration: 21 | log-likelihood: -728.457965274363
## EM NMoE: Iteration: 22 | log-likelihood: -728.302505169266
## EM NMoE: Iteration: 23 | log-likelihood: -728.17827167698
## EM NMoE: Iteration: 24 | log-likelihood: -728.077143731534
## EM NMoE: Iteration: 25 | log-likelihood: -727.993340557231
## EM NMoE: Iteration: 26 | log-likelihood: -727.922708059608
## EM NMoE: Iteration: 27 | log-likelihood: -727.862220571622
## EM NMoE: Iteration: 28 | log-likelihood: -727.80964020916
## EM NMoE: Iteration: 29 | log-likelihood: -727.763285356876
## EM NMoE: Iteration: 30 | log-likelihood: -727.72187264597
## EM NMoE: Iteration: 31 | log-likelihood: -727.684408007886
## EM NMoE: Iteration: 32 | log-likelihood: -727.650110610075
## EM NMoE: Iteration: 33 | log-likelihood: -727.618359453292
## EM NMoE: Iteration: 34 | log-likelihood: -727.588653524551
## EM NMoE: Iteration: 35 | log-likelihood: -727.560585266958
## EM NMoE: Iteration: 36 | log-likelihood: -727.533819663819
## EM NMoE: Iteration: 37 | log-likelihood: -727.508079238599
## EM NMoE: Iteration: 38 | log-likelihood: -727.483132999833
## EM NMoE: Iteration: 39 | log-likelihood: -727.458788309494
## EM NMoE: Iteration: 40 | log-likelihood: -727.434884943844
## EM NMoE: Iteration: 41 | log-likelihood: -727.411290809236
## EM NMoE: Iteration: 42 | log-likelihood: -727.387898902839
## EM NMoE: Iteration: 43 | log-likelihood: -727.364625190746
## EM NMoE: Iteration: 44 | log-likelihood: -727.341407128219
## EM NMoE: Iteration: 45 | log-likelihood: -727.318202580231
## EM NMoE: Iteration: 46 | log-likelihood: -727.294988924302
## EM NMoE: Iteration: 47 | log-likelihood: -727.271762139712
## EM NMoE: Iteration: 48 | log-likelihood: -727.248535713999
## EM NMoE: Iteration: 49 | log-likelihood: -727.225339233752
## EM NMoE: Iteration: 50 | log-likelihood: -727.202216573926
## EM NMoE: Iteration: 51 | log-likelihood: -727.179223656928
## EM NMoE: Iteration: 52 | log-likelihood: -727.156425451874
## EM NMoE: Iteration: 53 | log-likelihood: -727.133894125782
## EM NMoE: Iteration: 54 | log-likelihood: -727.111704849592
## EM NMoE: Iteration: 55 | log-likelihood: -727.089933066924
## EM NMoE: Iteration: 56 | log-likelihood: -727.0686515925
## EM NMoE: Iteration: 57 | log-likelihood: -727.047928065085
## EM NMoE: Iteration: 58 | log-likelihood: -727.027822881704
## EM NMoE: Iteration: 59 | log-likelihood: -727.008387991342
## EM NMoE: Iteration: 60 | log-likelihood: -726.989665033376
## EM NMoE: Iteration: 61 | log-likelihood: -726.971685930922
## EM NMoE: Iteration: 62 | log-likelihood: -726.954472676238
## EM NMoE: Iteration: 63 | log-likelihood: -726.938037841439
## EM NMoE: Iteration: 64 | log-likelihood: -726.92238538057
## EM NMoE: Iteration: 65 | log-likelihood: -726.907511620225
## EM NMoE: Iteration: 66 | log-likelihood: -726.893406346164
## EM NMoE: Iteration: 67 | log-likelihood: -726.880053909298
## EM NMoE: Iteration: 68 | log-likelihood: -726.867434292648
## EM NMoE: Iteration: 69 | log-likelihood: -726.855524098823
## EM NMoE: Iteration: 70 | log-likelihood: -726.844297433495
## EM NMoE: Iteration: 71 | log-likelihood: -726.833726673385
## EM NMoE: Iteration: 72 | log-likelihood: -726.823783117087
## EM NMoE: Iteration: 73 | log-likelihood: -726.814437523958
## EM NMoE: Iteration: 74 | log-likelihood: -726.805660550544
## EM NMoE: Iteration: 75 | log-likelihood: -726.797423096332
## EM NMoE: Iteration: 76 | log-likelihood: -726.789696571368
## EM NMoE: Iteration: 77 | log-likelihood: -726.78245309806
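The last lines of the log are consistent with a stopping rule based on the relative change in log-likelihood between consecutive iterations. That rule is an assumption inferred from the printed values, not something this document states; the sketch below simply checks it against iterations 75 to 77.

```r
threshold <- 1e-5

# Log-likelihood values copied from the log above
ll <- c(`75` = -726.797423096332,
        `76` = -726.789696571368,
        `77` = -726.78245309806)

# Relative change between consecutive iterations (assumed criterion)
rel_change <- abs(diff(ll) / ll[-length(ll)])
rel_change

# TRUE marks an iteration at which the assumed rule would stop
rel_change < threshold
```

Under this assumed rule the change into iteration 76 is still above the threshold, while the change into iteration 77 falls below it, which matches the point at which the log ends.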

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df       AIC       BIC       ICL
##       -726.7825  8 -734.7825 -751.6409 -774.2446
## 
## Clustering table (Number of observations in each expert):
## 
##   1   2 
## 281 219 
## 
## Regression coefficients:
## 
##     Beta(k = 1) Beta(k = 2)
## 1    0.07962311   0.3293571
## X^1 -2.34468274   2.9456271
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##       1.065111      1.057344
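The AIC and BIC values in the summary appear to follow a "larger is better" convention, namely the log-likelihood minus a penalty. The formulas below are inferred from the printed numbers and should be read as an assumption; n = 500 is the sample size used above, and the variable names are illustrative.

```r
loglik <- -726.78245309806  # final log-likelihood from the EM log
df     <- 8                 # number of free parameters reported by the summary
n      <- 500               # sample size

aic <- loglik - df               # assumed convention: log-likelihood minus df
bic <- loglik - df * log(n) / 2  # assumed convention: log-likelihood minus df * log(n) / 2

round(c(AIC = aic, BIC = bic), 4)
```

Both values reproduce the figures in the summary table to the printed precision. The ICL additionally depends on the posterior memberships, so it is not recomputed here.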

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")

Application to a real dataset

Load data

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-5
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: 48.3135498211327
## EM NMoE: Iteration: 2 | log-likelihood: 48.7021394575936
## EM NMoE: Iteration: 3 | log-likelihood: 49.1773653245368
## EM NMoE: Iteration: 4 | log-likelihood: 50.3595103831193
## EM NMoE: Iteration: 5 | log-likelihood: 53.3225276945388
## EM NMoE: Iteration: 6 | log-likelihood: 59.2059736964644
## EM NMoE: Iteration: 7 | log-likelihood: 66.5084561908942
## EM NMoE: Iteration: 8 | log-likelihood: 71.6698294742357
## EM NMoE: Iteration: 9 | log-likelihood: 74.450447584389
## EM NMoE: Iteration: 10 | log-likelihood: 76.2965362459835
## EM NMoE: Iteration: 11 | log-likelihood: 78.0652973676674
## EM NMoE: Iteration: 12 | log-likelihood: 80.1011870778738
## EM NMoE: Iteration: 13 | log-likelihood: 82.7086208649667
## EM NMoE: Iteration: 14 | log-likelihood: 86.3168780134924
## EM NMoE: Iteration: 15 | log-likelihood: 90.9036306741408
## EM NMoE: Iteration: 16 | log-likelihood: 94.438419556814
## EM NMoE: Iteration: 17 | log-likelihood: 95.7833599069938
## EM NMoE: Iteration: 18 | log-likelihood: 96.2275872383811
## EM NMoE: Iteration: 19 | log-likelihood: 96.414530924196
## EM NMoE: Iteration: 20 | log-likelihood: 96.5274452761453
## EM NMoE: Iteration: 21 | log-likelihood: 96.6211691466595
## EM NMoE: Iteration: 22 | log-likelihood: 96.7139572600177
## EM NMoE: Iteration: 23 | log-likelihood: 96.8124961429116
## EM NMoE: Iteration: 24 | log-likelihood: 96.9190778749052
## EM NMoE: Iteration: 25 | log-likelihood: 97.0337223799033
## EM NMoE: Iteration: 26 | log-likelihood: 97.1548714540442
## EM NMoE: Iteration: 27 | log-likelihood: 97.2798101999998
## EM NMoE: Iteration: 28 | log-likelihood: 97.4052037031475
## EM NMoE: Iteration: 29 | log-likelihood: 97.5278434072014
## EM NMoE: Iteration: 30 | log-likelihood: 97.645464421148
## EM NMoE: Iteration: 31 | log-likelihood: 97.7574153181095
## EM NMoE: Iteration: 32 | log-likelihood: 97.8649317789619
## EM NMoE: Iteration: 33 | log-likelihood: 97.9708810941274
## EM NMoE: Iteration: 34 | log-likelihood: 98.079015703324
## EM NMoE: Iteration: 35 | log-likelihood: 98.1929989454507
## EM NMoE: Iteration: 36 | log-likelihood: 98.3155508217666
## EM NMoE: Iteration: 37 | log-likelihood: 98.4480385580903
## EM NMoE: Iteration: 38 | log-likelihood: 98.5906609977787
## EM NMoE: Iteration: 39 | log-likelihood: 98.7431113492518
## EM NMoE: Iteration: 40 | log-likelihood: 98.9053222779114
## EM NMoE: Iteration: 41 | log-likelihood: 99.0780646772237
## EM NMoE: Iteration: 42 | log-likelihood: 99.2632556955678
## EM NMoE: Iteration: 43 | log-likelihood: 99.4641640564058
## EM NMoE: Iteration: 44 | log-likelihood: 99.685779862926
## EM NMoE: Iteration: 45 | log-likelihood: 99.935591947143
## EM NMoE: Iteration: 46 | log-likelihood: 100.224916392958
## EM NMoE: Iteration: 47 | log-likelihood: 100.570636347252
## EM NMoE: Iteration: 48 | log-likelihood: 100.995459308499
## EM NMoE: Iteration: 49 | log-likelihood: 101.515795348348
## EM NMoE: Iteration: 50 | log-likelihood: 102.082569523463
## EM NMoE: Iteration: 51 | log-likelihood: 102.537226050965
## EM NMoE: Iteration: 52 | log-likelihood: 102.688703507615
## EM NMoE: Iteration: 53 | log-likelihood: 102.719133334263
## EM NMoE: Iteration: 54 | log-likelihood: 102.721229161696
## EM NMoE: Iteration: 55 | log-likelihood: 102.72187714441

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df      AIC      BIC      ICL
##        102.7219  8 94.72188 83.07126 83.17734
## 
## Clustering table (Number of observations in each expert):
## 
##  1  2 
## 84 52 
## 
## Regression coefficients:
## 
##       Beta(k = 1)  Beta(k = 2)
## 1   -12.667293919 -42.36199675
## X^1   0.006474808   0.02149263
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##     0.01352346     0.0119311
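One way to read the regression coefficients is to evaluate the two fitted expert lines directly. The sketch below uses the rounded coefficients printed above and computes the year at which the two lines cross. This is only a descriptive quantity, not the change point implied by the gating network, and the variable and function names are illustrative.

```r
b1 <- c(-12.667293919, 0.006474808)  # expert 1: intercept, slope per year
b2 <- c(-42.36199675, 0.02149263)    # expert 2: intercept, slope per year

# Mean anomaly predicted by one expert line at a given year
expert_mean <- function(b, year) b[1] + b[2] * year

# Year at which the two expert lines intersect
crossing <- (b1[1] - b2[1]) / (b2[2] - b1[2])
crossing

# Slopes expressed per decade rather than per year
10 * c(expert1 = b1[2], expert2 = b2[2])
```

The second expert has a slope roughly three times that of the first, and with these rounded coefficients the two lines intersect in the late 1970s.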

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")