mgcv: how to specify interaction between smooth and factor? part II
mgcv: how to specify interaction between smooth and factor? part II
There is a question with the same title, however, I hope this question is also of interest [and personally I'd like to know the answer].
First, I will specify a dataset with one continuous and one factor covariate:
set.seed(1)
n <- 50
u1 <- sample(c(1,2), n, replace = TRUE)
u1 <- factor(u1)
u2 <- runif(n)
data <- data.frame(u1, u2)
I don't want to run a gam
model, but only to create the design matrix. I have considered 2 ways.
gam
First,
a<-mgcv::s(u2,k=5,bs="ps",by=u1)
b<- mgcv::smoothCon(a,data=data,absorb.cons=TRUE)
However,
b[[1]]$X[u1==2,]
consists of only 0's.
Second,
a<- mgcv::s(u1,u2,k=5,bs="ad")
b<- mgcv::smoothCon(a,data=data,absorb.cons=TRUE)
However, it gives me an error message. Error in Summary.factor(c(1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, :
‘min’ not meaningful for factors
Error in Summary.factor(c(1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, :
‘min’ not meaningful for factors
How can this problem be resolved? I'd like to have the design matrix that models a smooth term separately for each level of factor u1
[binary, as here, or otherwise].
u1
1 Answer
1
I don't want to run a gam
model, but only to create the design matrix.
gam
I'd like to have the design matrix that models a smooth term separately for each level of factor u1
.
u1
library(mgcv)
set.seed(1)
n <- 50
u1 <- sample(c(1,2), n, replace = TRUE)
u1 <- factor(u1)
u2 <- runif(n)
data <- data.frame(u1, u2)
a <- s(u2, k = 5, bs = "ps", by = u1)
b <- smoothCon(a, data = data, absorb.cons = TRUE)
smoothCon
produces a list of length nlevels(u1)
for this factor "by" smooth. A Dummy matrix is created for u1
(with two columns in this example), and each column is multiplied to the design matrix of s(u2, l = 5, bs = "ps")
. So, b[[1]]$X
is the design matrix for the first level, and b[[2]]$X
is the one for the second level. You shouldn't be surprised that b[[1]]$X[u1 == 2]
gives all zeros, as that is the design matrix for u1 == 1
. You want to cbind
those separate matrices together:
smoothCon
nlevels(u1)
u1
s(u2, l = 5, bs = "ps")
b[[1]]$X
b[[2]]$X
b[[1]]$X[u1 == 2]
u1 == 1
cbind
cbind(b[[1]]$X, b[[2]]$X)
Note that, because you have set absorb.cons = TRUE
, the spline function is centered. I.e., mean of each level is constrained (not necessarily at 0 but is fixed at some value). A practical effect is that you then need to put u1
in the model otherwise you can not model the group mean. In mgcv
, you have to specify
absorb.cons = TRUE
u1
mgcv
s(u2, k = 5, bs = 'ps', by = u1) + u1
This is also mentioned in the Q & A you referenced: mgcv: how to specify interaction between smooth and factor?. In your application outside mgcv
you need be aware of this as well. So a complete model matrix should be
mgcv
cbind(b[[1]]$X, b[[2]]$X, model.matrix(~ u1))
Thanks for contributing an answer to Stack Overflow!
But avoid …
To learn more, see our tips on writing great answers.
Required, but never shown
Required, but never shown
By clicking "Post Your Answer", you agree to our terms of service, privacy policy and cookie policy