r - Extracting data used to make a smooth plot in mgcv -

this thread couple of years ago describes how extract data used plot smooth components of fitted gam model. works, when there 1 smooth variable. i've got more 1 smooth variable, , unfortunately can extract smooths last of series. here example:

library(mgcv) = rnorm(100) b = runif(100) y = a*b/(a+b)  mod = gam(y~s(a)+s(b)) summary(mod)  plotdata <- list() trace(mgcv:::plot.gam, at=list(c(25,3,3,3)),          #this gets location plot.gam calls plot.mgcv.smooth (see ?trace)         #plot.mgcv.smooth function actual plotting ,         #we assign main argument global workspace         #so can work later.....         quote({                     #browser()                     plotdata <<- c(plotdata, pd[[i]])                 })) plot(mod,pages=1) plotdata

i'm trying estimated smooth functions both a , b, list plotdata gives me estimates b. i've looked guts of plot.gam function, , i'm having difficult time understanding how works. if has solved problem, i'd grateful.

updated answer mgcv >= 1.8-6

as of version 1.8-6 of mgcv, plot.gam() returns plotting data invisibly (from changelog):

plot.gam silently returns list of plotting data, advanced users (fabian scheipl) produce custimized plot.

as such, , using mod example shown below in original answer, 1 can do

> plotdata <- plot(mod, pages = 1) > str(plotdata) list of 2  $ :list of 11   ..$ x      : num [1:100] -2.45 -2.41 -2.36 -2.31 -2.27 ...   ..$ scale  : logi true   ..$ se     : num [1:100] 4.23 3.8 3.4 3.05 2.74 ...   ..$ raw    : num [1:100] -0.8969 0.1848 1.5878 -1.1304 -0.0803 ...   ..$ xlab   : chr "a"   ..$ ylab   : chr "s(a,7.21)"   ..$ main   : null   ..$ se.mult: num 2   ..$ xlim   : num [1:2] -2.45 2.09   ..$ fit    : num [1:100, 1] -0.251 -0.242 -0.234 -0.228 -0.224 ...   ..$ plot.me: logi true  $ :list of 11   ..$ x      : num [1:100] 0.0126 0.0225 0.0324 0.0422 0.0521 ...   ..$ scale  : logi true   ..$ se     : num [1:100] 1.25 1.22 1.18 1.15 1.11 ...   ..$ raw    : num [1:100] 0.859 0.645 0.603 0.972 0.377 ...   ..$ xlab   : chr "b"   ..$ ylab   : chr "s(b,1.25)"   ..$ main   : null   ..$ se.mult: num 2   ..$ xlim   : num [1:2] 0.0126 0.9906   ..$ fit    : num [1:100, 1] -0.83 -0.818 -0.806 -0.794 -0.782 ...   ..$ plot.me: logi true

the data therein can used custom plots etc.

the original answer below still contains useful code generating same sort of data used generate these plots.

original answer

there couple of ways easily, , both involve predicting model on range of covariates. trick hold 1 variable @ value (say sample mean) whilst varying other on range.

the 2 methods involve:

predicting fitted responses data, including intercept , model terms (with other covariates held @ fixed values), or
predict model above, return contributions of each term

the second of these closer (if not what) plot.gam does.

here code works example , implements above ideas.

library("mgcv") set.seed(2) <- rnorm(100) b <- runif(100) y <- a*b/(a+b) dat <- data.frame(y = y, = a, b = b)  mod <- gam(y~s(a)+s(b), data = dat)

now produce prediction data

pdat <- with(dat,              data.frame(a = c(seq(min(a), max(a), length = 100),                               rep(mean(a), 100)),                         b = c(rep(mean(b), 100),                               seq(min(b), max(b), length = 100))))

predict fitted responses model new data

this bullet 1 above

pred <- predict(mod, pdat, type = "response", se.fit = true)  > lapply(pred, head) $fit         1         2         3         4         5         6  0.5842966 0.5929591 0.6008068 0.6070248 0.6108644 0.6118970   $se.fit        1        2        3        4        5        6  2.158220 1.947661 1.753051 1.579777 1.433241 1.318022

you can plot $fit against covariate in pdat - though remember have predictions holding b constant holding a constant, need first 100 rows when plotting fits against a or second 100 rows against b. example, first add fitted , upper , lower confidence interval data data frame of prediction data

pdat <- transform(pdat, fitted = pred$fit) pdat <- transform(pdat, upper = fitted + (1.96 * pred$se.fit),                         lower = fitted - (1.96 * pred$se.fit))

then plot smooths using rows 1:100 variable a , 101:200 variable b

layout(matrix(1:2, ncol = 2)) ## plot 1 want <- 1:100 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ a, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ a, data = pdat, subset = want, lty = "dashed") lines(lower ~ a, data = pdat, subset = want, lty = "dashed") ## plot 2 want <- 101:200 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ b, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ b, data = pdat, subset = want, lty = "dashed") lines(lower ~ b, data = pdat, subset = want, lty = "dashed") layout(1)

this produces

enter image description here

if want common y-axis scale delete both ylim lines above, replacing first with:

ylim <- with(pdat, range(fitted, upper, lower))

predict contributions fitted values individual smooth terms

the idea in 2 above done in same way, ask type = "terms".

pred2 <- predict(mod, pdat, type = "terms", se.fit = true)

this returns matrix $fit , $se.fit

> lapply(pred2, head) $fit         s(a)       s(b) 1 -0.2509313 -0.1058385 2 -0.2422688 -0.1058385 3 -0.2344211 -0.1058385 4 -0.2282031 -0.1058385 5 -0.2243635 -0.1058385 6 -0.2233309 -0.1058385  $se.fit       s(a)      s(b) 1 2.115990 0.1880968 2 1.901272 0.1880968 3 1.701945 0.1880968 4 1.523536 0.1880968 5 1.371776 0.1880968 6 1.251803 0.1880968

just plot relevant column $fit matrix against same covariate pdat, again using first or second set of 100 rows. again, example

pdat <- transform(pdat, fitted = c(pred2$fit[1:100, 1],                                     pred2$fit[101:200, 2])) pdat <- transform(pdat, upper = fitted + (1.96 * c(pred2$se.fit[1:100, 1],                                                     pred2$se.fit[101:200, 2])),                         lower = fitted - (1.96 * c(pred2$se.fit[1:100, 1],                                                     pred2$se.fit[101:200, 2])))

then plot smooths using rows 1:100 variable a , 101:200 variable b

layout(matrix(1:2, ncol = 2)) ## plot 1 want <- 1:100 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ a, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ a, data = pdat, subset = want, lty = "dashed") lines(lower ~ a, data = pdat, subset = want, lty = "dashed") ## plot 2 want <- 101:200 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ b, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ b, data = pdat, subset = want, lty = "dashed") lines(lower ~ b, data = pdat, subset = want, lty = "dashed") layout(1)

this produces

enter image description here

notice subtle difference here between plot , 1 produced earlier. first plot include both effect of intercept term and contribution mean of b. in second plot, value of smoother a shown.

Search This Blog

Bready