r - Extracting data used to make a smooth plot in mgcv -
this thread couple of years ago describes how extract data used plot smooth components of fitted gam model. works, when there 1 smooth variable. i've got more 1 smooth variable, , unfortunately can extract smooths last of series. here example:
library(mgcv) = rnorm(100) b = runif(100) y = a*b/(a+b) mod = gam(y~s(a)+s(b)) summary(mod) plotdata <- list() trace(mgcv:::plot.gam, at=list(c(25,3,3,3)), #this gets location plot.gam calls plot.mgcv.smooth (see ?trace) #plot.mgcv.smooth function actual plotting , #we assign main argument global workspace #so can work later..... quote({ #browser() plotdata <<- c(plotdata, pd[[i]]) })) plot(mod,pages=1) plotdata i'm trying estimated smooth functions both a , b, list plotdata gives me estimates b. i've looked guts of plot.gam function, , i'm having difficult time understanding how works. if has solved problem, i'd grateful.
updated answer mgcv >= 1.8-6
as of version 1.8-6 of mgcv, plot.gam() returns plotting data invisibly (from changelog):
- plot.gam silently returns list of plotting data, advanced users (fabian scheipl) produce custimized plot.
as such, , using mod example shown below in original answer, 1 can do
> plotdata <- plot(mod, pages = 1) > str(plotdata) list of 2 $ :list of 11 ..$ x : num [1:100] -2.45 -2.41 -2.36 -2.31 -2.27 ... ..$ scale : logi true ..$ se : num [1:100] 4.23 3.8 3.4 3.05 2.74 ... ..$ raw : num [1:100] -0.8969 0.1848 1.5878 -1.1304 -0.0803 ... ..$ xlab : chr "a" ..$ ylab : chr "s(a,7.21)" ..$ main : null ..$ se.mult: num 2 ..$ xlim : num [1:2] -2.45 2.09 ..$ fit : num [1:100, 1] -0.251 -0.242 -0.234 -0.228 -0.224 ... ..$ plot.me: logi true $ :list of 11 ..$ x : num [1:100] 0.0126 0.0225 0.0324 0.0422 0.0521 ... ..$ scale : logi true ..$ se : num [1:100] 1.25 1.22 1.18 1.15 1.11 ... ..$ raw : num [1:100] 0.859 0.645 0.603 0.972 0.377 ... ..$ xlab : chr "b" ..$ ylab : chr "s(b,1.25)" ..$ main : null ..$ se.mult: num 2 ..$ xlim : num [1:2] 0.0126 0.9906 ..$ fit : num [1:100, 1] -0.83 -0.818 -0.806 -0.794 -0.782 ... ..$ plot.me: logi true the data therein can used custom plots etc.
the original answer below still contains useful code generating same sort of data used generate these plots.
original answer
there couple of ways easily, , both involve predicting model on range of covariates. trick hold 1 variable @ value (say sample mean) whilst varying other on range.
the 2 methods involve:
- predicting fitted responses data, including intercept , model terms (with other covariates held @ fixed values), or
- predict model above, return contributions of each term
the second of these closer (if not what) plot.gam does.
here code works example , implements above ideas.
library("mgcv") set.seed(2) <- rnorm(100) b <- runif(100) y <- a*b/(a+b) dat <- data.frame(y = y, = a, b = b) mod <- gam(y~s(a)+s(b), data = dat) now produce prediction data
pdat <- with(dat, data.frame(a = c(seq(min(a), max(a), length = 100), rep(mean(a), 100)), b = c(rep(mean(b), 100), seq(min(b), max(b), length = 100)))) predict fitted responses model new data
this bullet 1 above
pred <- predict(mod, pdat, type = "response", se.fit = true) > lapply(pred, head) $fit 1 2 3 4 5 6 0.5842966 0.5929591 0.6008068 0.6070248 0.6108644 0.6118970 $se.fit 1 2 3 4 5 6 2.158220 1.947661 1.753051 1.579777 1.433241 1.318022 you can plot $fit against covariate in pdat - though remember have predictions holding b constant holding a constant, need first 100 rows when plotting fits against a or second 100 rows against b. example, first add fitted , upper , lower confidence interval data data frame of prediction data
pdat <- transform(pdat, fitted = pred$fit) pdat <- transform(pdat, upper = fitted + (1.96 * pred$se.fit), lower = fitted - (1.96 * pred$se.fit)) then plot smooths using rows 1:100 variable a , 101:200 variable b
layout(matrix(1:2, ncol = 2)) ## plot 1 want <- 1:100 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ a, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ a, data = pdat, subset = want, lty = "dashed") lines(lower ~ a, data = pdat, subset = want, lty = "dashed") ## plot 2 want <- 101:200 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ b, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ b, data = pdat, subset = want, lty = "dashed") lines(lower ~ b, data = pdat, subset = want, lty = "dashed") layout(1) this produces

if want common y-axis scale delete both ylim lines above, replacing first with:
ylim <- with(pdat, range(fitted, upper, lower)) predict contributions fitted values individual smooth terms
the idea in 2 above done in same way, ask type = "terms".
pred2 <- predict(mod, pdat, type = "terms", se.fit = true) this returns matrix $fit , $se.fit
> lapply(pred2, head) $fit s(a) s(b) 1 -0.2509313 -0.1058385 2 -0.2422688 -0.1058385 3 -0.2344211 -0.1058385 4 -0.2282031 -0.1058385 5 -0.2243635 -0.1058385 6 -0.2233309 -0.1058385 $se.fit s(a) s(b) 1 2.115990 0.1880968 2 1.901272 0.1880968 3 1.701945 0.1880968 4 1.523536 0.1880968 5 1.371776 0.1880968 6 1.251803 0.1880968 just plot relevant column $fit matrix against same covariate pdat, again using first or second set of 100 rows. again, example
pdat <- transform(pdat, fitted = c(pred2$fit[1:100, 1], pred2$fit[101:200, 2])) pdat <- transform(pdat, upper = fitted + (1.96 * c(pred2$se.fit[1:100, 1], pred2$se.fit[101:200, 2])), lower = fitted - (1.96 * c(pred2$se.fit[1:100, 1], pred2$se.fit[101:200, 2]))) then plot smooths using rows 1:100 variable a , 101:200 variable b
layout(matrix(1:2, ncol = 2)) ## plot 1 want <- 1:100 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ a, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ a, data = pdat, subset = want, lty = "dashed") lines(lower ~ a, data = pdat, subset = want, lty = "dashed") ## plot 2 want <- 101:200 ylim <- with(pdat, range(fitted[want], upper[want], lower[want])) plot(fitted ~ b, data = pdat, subset = want, type = "l", ylim = ylim) lines(upper ~ b, data = pdat, subset = want, lty = "dashed") lines(lower ~ b, data = pdat, subset = want, lty = "dashed") layout(1) this produces

notice subtle difference here between plot , 1 produced earlier. first plot include both effect of intercept term and contribution mean of b. in second plot, value of smoother a shown.
Comments
Post a Comment