Filter from a dataframe and create another dataframe using r

I have two data frames as following:

df <- data.frame(month = c("mazda", "yamaha"), april = c(11,12), may = c(14,15)) df_whole <- data.frame(month = c("car", "bikes"), april = c(.1,.2), may = c(.5,.2))

What I want to do is create a row called total_car which would be a product of car and mazda. However if df_whole doesnt have row car I would still like to create a row total_car which would mazda*0.The output I want to get to is as below. How could I calculate something like this in R

total_car

car

mazda

df_whole

car

total_car

mazda*0

df_car <- data.frame(month = c("mazda", "yamaha","total"), april = c(11,12,1.1), may = c(14,15,7))

How do I know that mazda is multiplied by car and not by bikes? What happens to the bikes row of df_whole? What do you mean by "if df_whole doesnt have row car"? What's the general format of df_whole?

– Maurits Evers
Sep 17 '18 at 21:21

mazda

car

bikes

df_whole

car

df_whole

I want to show three tables one is df ,second being df_whole and third being df_car .The dataframedf_whole is being derived from some other table which always might not have the row car. So if car doesnt exist I would still like to create df_whole and multiply mazda with 0 . I would like filter df_whole %>% filter(df_whole$month =="car")

– SNT
Sep 17 '18 at 21:47

df

df_whole

df_car

df_whole

car

mazda

0

df_whole %>% filter(df_whole$month =="car")

1 Answer
1

It sounds like a sql query would work better to join your tables and get the summary data.

Here's a tidyverse solution (note the inner_join(), which is sql):

tidyverse

inner_join()

library(dplyr) library(tidyr) df <- data_frame(month = c("mazda", "yamaha"), april = c(11,12), may = c(14,15)) df_whole <- data_frame(month = c("car", "bikes"), april = c(.1,.2), may = c(.5,.2)) # adds car row if missing - there is probably a better way to handle the missing data if (!"car" %in% df_whole$month) df_whole <- bind_rows(df_whole, data_frame(month = "car")) # converts NAs to 0 df_whole[is.na(df_whole)] <- 0 # convert to long format df_long <- df %>% rename("brand" = "month") %>% gather(key = "month", value = "val", april:may) df_whole_long <- df_whole %>% rename("type" = "month") %>% gather(key = "month", value = "val", april:may) # calcualte the multiplication dat <- inner_join(df_long, df_whole_long, by = "month") %>% # combine dfs filter(brand == "mazda", type == "car") %>% # filter out key rows mutate(total = val.x * val.y) # do multiplication and add into new column # reformat and append df_car <- bind_rows(df, bind_cols(data_frame(month = "total"), dat %>% select(month, total) %>% spread(month, total)))

df_car is:

month april may 1 mazda 11.0 14 2 yamaha 12.0 15 3 total 1.1 7

I used tidyverse data frames (data_frame() rather than data.frame()) so strings weren't converted to factors (the same can be achieved by adding stringsAsFactors = F to data.frame()) and I converted your data to long format, which would probably be a better structure for your database anyway.

tidyverse

data_frame()

data.frame()

stringsAsFactors = F

data.frame()

Thanks for contributing an answer to Stack Overflow!

But avoid …

To learn more, see our tips on writing great answers.

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service, privacy policy and cookie policy

搜尋此網誌

Dfyjkt