Using the weighted.mean function, inside an lapply function, with data.table

Using the weighted.mean function, inside an lapply function, with data.table



I have the following dataset:


# A tibble: 450 x 546
matchcode idstd year country wt region income industry sector ownership exporter c201 c202 c203a c203b c203c c203d c2041 c2042 c205a c205b1 c205b2 c205b3 c205b4 c205b5 c205b6 c205b7
<chr+lbl> <dbl> <dbl> <chr+l> <dbl> <dbl+> <dbl+> <dbl+lb> <dbl+> <dbl+lbl> <dbl+lb> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+> <dbl+> <dbl+> <dbl+> <dbl+> <dbl+> <dbl+>
1 "BGD 200~ 2474 2002 Bangla~ 0.7 6 1 3 1 2 1 1994 2 100 0 0 NA 2 NA NA NA NA NA NA NA NA NA
2 "BGD 200~ 2717 2002 Bangla~ 0.9 6 1 2 1 2 2 1986 4 100 0 0 NA 2 NA NA NA NA NA NA NA NA NA
3 "BGD 200~ 2410 2002 Bangla~ NA 6 1 3 1 2 1 1999 4 100 0 0 NA 2 NA NA NA NA NA NA NA NA NA
4 "BRA 200~ 14917 2003 Brazil~ NA 4 2 8 1 2 2 1984 2 100 0 0 0 2 NA 50 1 NA NA NA NA NA NA
5 "BRA 200~ 14546 2003 Brazil~ 1.1 4 2 2 1 2 2 1976 2 100 0 0 0 2 NA 50 1 NA NA NA NA NA NA
6 "BRA 200~ 14709 2003 Brazil~ NA 4 2 3 1 2 2 1990 2 100 0 0 0 2 NA 100 NA 1 NA NA NA NA NA
7 "KHM 200~ 16475 2003 Cambod~ NA 2 1 20 2 2 2 1999 2 100 0 0 0 2 NA 100 NA NA NA 1 NA NA NA
8 "KHM 200~ 16298 2003 Cambod~ NA 2 1 4 3 2 2 1993 4 100 0 0 0 2 NA 100 1 NA NA NA NA NA NA
9 "KHM 200~ 16036 2003 Cambod~ 0.5 2 1 21 2 2 2 1997 2 100 0 0 0 2 NA 100 NA 1 NA NA NA NA NA
10 "CHN 200~ 17862 2002 China2~ 1.2 2 2 18 2 2 2 1993 3 49 0 51 NA NA NA NA NA NA NA NA NA NA NA



and I am using the following data.table solution to create country level data from observational data:


cols = sapply(df, is.numeric) #
cols = names(cols)[cols]
dfclevel = df[, lapply(.SD, mean, na.rm=TRUE), .SDcols = cols, by=matchcode]



Although the code works well, my dataset has weights for some observations, which I want to incorporate into my code. I have been thinking about how to do this but I cannot figure it out. Would it be possible to do write a function to add to the data.table solution? Something like:


weights


dfclevel = df[, lapply(.SD, wfunc, na.rm=TRUE), .SDcols = cols,]

wfunc <- function(x,y) # x = df, y=weights
for (i in nrow(df$weights) {
if (df$weights[i] == !is.na)
df[,i] <- df[,i]*df$weights[i]



Or maybe I am even overthinking this?



EDIT: Based on the comments below I tried:


dfclevel= df[, lapply(.SD, weighted.mean(x, as.vector(wt), na.rm=TRUE), na.rm=TRUE), .SDcols = cols, by=matchcode]



weighted.mean



It gives me the error:


Error in weighted.mean(x, as.vector(wt), na.rm = TRUE) :
object 'x' not found



How do I specify that x should be a column of the df?


x


df



I tried this but it did not work:


dfclevel= df[, lapply(.SD, lapply(weighted.mean(x, as.vector(wt), na.rm=TRUE)), na.rm=TRUE), .SDcols = cols, by=matchcode]
Error in match.fun(FUN) : argument "FUN" is missing, with no default






do you mean weighted.mean?

– chinsoon12
Sep 13 '18 at 10:12


weighted.mean






That is way easier than anything I could have hoped for. It becomes dfclevel = df[, lapply(.SD, weighted.mean(x, df$weight, na.rm=TRUE), na.rm=TRUE), .SDcols = cols, by=matchcode] then right?

– Tom
Sep 13 '18 at 10:18



dfclevel = df[, lapply(.SD, weighted.mean(x, df$weight, na.rm=TRUE), na.rm=TRUE), .SDcols = cols, by=matchcode]






maybe u don't need the df$ inside weighted.mean?

– chinsoon12
Sep 13 '18 at 10:23


df$


weighted.mean






@chinsoon12 Thank you for your comments! I think I got quite a bit further (I updated my post), but I am still not completely there. Would you mind taking another look?

– Tom
Sep 13 '18 at 11:16






lapply(.SD, weighted.mean, wt, na.rm = TRUE)

– Hugh
Sep 13 '18 at 11:33


lapply(.SD, weighted.mean, wt, na.rm = TRUE)




0



Thanks for contributing an answer to Stack Overflow!



But avoid



To learn more, see our tips on writing great answers.



Required, but never shown



Required, but never shown




By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

Edmonton

Crossroads (UK TV series)