How to print boxplots for each row of a dataset in R?
Here is some example data. What should I do in R if I want to draw a separate boxplot of Stock1, Stock2, Stock3 and Stock4 for each of Day1, Day2, ..., Day6?
head(StockExample)
X1 Stock1 Stock2 Stock3 Stock4
1 Day1 185.74 1.47 1605 95.05
2 Day2 184.26 1.56 1580 97.49
3 Day3 162.21 1.39 1490 88.57
4 Day4 159.04 1.43 1520 85.55
5 Day5 164.87 1.42 1550 92.04
6 Day6 162.72 1.36 1525 91.70
So that is 6 boxplots, one per day, each built from the values of Stock1 to Stock4 on that day. I hope that makes sense. Also, can I do this using the apply function?
I tried looking for an answer to this but couldn't get it right. I'd appreciate any help. Many thanks!
4 Answers
A good practice to develop would be to 'gather' the stock columns. Something like the following should work. (But I've not tested this.)
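library(tidyverse)  # attaches tidyr, dplyr, stringr and ggplot2
StockExample %>%
  # reshape to long format: one row per day/stock combination
  tidyr::gather(key = "Stock", value = "value", -X1) %>%
  dplyr::mutate(day = stringr::str_replace(X1, "Day", "") %>% as.numeric()) %>%
  dplyr::mutate(Stock = stringr::str_replace(Stock, "Stock", "") %>% as.numeric()) %>%
  # day is numeric, so group by it; otherwise ggplot2 draws a single box
  ggplot(aes(x = day, y = value, group = day)) +
  geom_boxplot()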
(The code above uses scoping, packagename::functionname, to indicate which tidyverse package each function comes from.)
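In current tidyr releases, gather() is marked as superseded in favour of pivot_longer(). The following is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later and ggplot2 are installed. The data frame is rebuilt from the values in the question, and X1 is kept as a discrete label, so no grouping aesthetic is needed.

```r
library(tidyr)
library(ggplot2)

# Data rebuilt from the question
StockExample <- data.frame(
  X1     = paste0("Day", 1:6),
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87, 162.72),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42, 1.36),
  Stock3 = c(1605, 1580, 1490, 1520, 1550, 1525),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04, 91.70)
)

# One row per day/stock combination
long <- pivot_longer(
  StockExample,
  cols      = -X1,
  names_to  = "Stock",
  values_to = "value"
)

# X1 is discrete, so ggplot2 draws one box per day
p <- ggplot(long, aes(x = X1, y = value)) +
  geom_boxplot()
print(p)
```

Each box is built from only four values (one per stock), so it is better read as a compact summary of the row than as a distribution.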
There are several ways of doing this. Both of the ones shown here require the data to be in long format.
To reformat the data I will use the function melt from the package reshape2.
long <- reshape2::melt(StockExample, id.var = "X1")
Now the graphs.
First, using base R graphics.
boxplot(value ~ X1, long)
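If reshape2 is not available, the same long layout can be sketched in base R with stack(). This is an illustration under assumptions rather than part of the answer above: the data frame is rebuilt from the question, and the columns are named values and ind because that is what stack() returns. The four stocks also sit on very different scales (roughly 1.4 to 1600), so the sketch passes log = "y" to boxplot() to keep the boxes from being dominated by Stock3.

```r
# Data rebuilt from the question
StockExample <- data.frame(
  X1     = paste0("Day", 1:6),
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87, 162.72),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42, 1.36),
  Stock3 = c(1605, 1580, 1490, 1520, 1550, 1525),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04, 91.70)
)

# stack() concatenates the stock columns one after another, so the
# day labels have to be repeated once per stock column
long <- stack(StockExample[, -1])
long$X1 <- rep(StockExample$X1, times = ncol(StockExample) - 1)

# One box per day, drawn on a logarithmic y axis
boxplot(values ~ X1, data = long, log = "y")
```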
And second, with the package ggplot2.
library(ggplot2)
ggplot(long, aes(X1, value)) +
geom_boxplot()
Data.
StockExample <-
structure(list(X1 = structure(1:6, .Label = c("Day1", "Day2",
"Day3", "Day4", "Day5", "Day6"), class = "factor"), Stock1 = c(185.74,
184.26, 162.21, 159.04, 164.87, 162.72), Stock2 = c(1.47, 1.56,
1.39, 1.43, 1.42, 1.36), Stock3 = c(1605L, 1580L, 1490L, 1520L,
1550L, 1525L), Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04,
91.7)), class = "data.frame", row.names = c("1", "2", "3", "4",
"5", "6"))
You can get the data into long format, and then it's straightforward with ggstatsplot:
# needed libraries
library(tidyverse)
library(ggstatsplot)
# provided data sample
df <- read.table(
text = "Row Day Stock1 Stock2 Stock3 Stock4
1 Day1 185.74 1.47 1605 95.05
2 Day2 184.26 1.56 1580 97.49
3 Day3 162.21 1.39 1490 88.57
4 Day4 159.04 1.43 1520 85.55
5 Day5 164.87 1.42 1550 92.04
6 Day6 162.72 1.36 1525 91.70",
header = TRUE
) %>%
tibble::as_tibble()  # as_data_frame() is deprecated in current tibble releases
# converting to long format
(
df_long <- df %>%
tidyr::gather(
data = .,
key = "stock type",
value = "stock value",
Stock1:Stock4
)
)
#> # A tibble: 24 x 4
#> Row Day `stock type` `stock value`
#> <int> <fct> <chr> <dbl>
#> 1 1 Day1 Stock1 186.
#> 2 2 Day2 Stock1 184.
#> 3 3 Day3 Stock1 162.
#> 4 4 Day4 Stock1 159.
#> 5 5 Day5 Stock1 165.
#> 6 6 Day6 Stock1 163.
#> 7 1 Day1 Stock2 1.47
#> 8 2 Day2 Stock2 1.56
#> 9 3 Day3 Stock2 1.39
#> 10 4 Day4 Stock2 1.43
#> # ... with 14 more rows
# plot
ggstatsplot::ggbetweenstats(
data = df_long,
x = Day,
y = `stock value`,
plot.type = "box"
)
Created on 2018-08-26 by the reprex package (v0.2.0.9000).
The simplest solution is probably splitting the data.frame by row:
byDay <- split(StockExample[,-1], StockExample$X1)
Then convert all of those into numeric format:
byDay <- lapply(byDay, as.numeric)
And then simply call boxplot on it:
boxplot(byDay)
Or with everything in one line:
boxplot(lapply(split(StockExample[,-1], StockExample$X1), as.numeric))
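On the question of whether apply can be used: one possible sketch is to convert the numeric columns to a matrix and work on its rows. boxplot() draws one box per column of a matrix, so transposing gives one box per day, and apply() over the rows can compute the five-number summary that each box is based on. The names values and day_stats are illustrative, and the data frame is rebuilt from the question.

```r
# Data rebuilt from the question
StockExample <- data.frame(
  X1     = paste0("Day", 1:6),
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87, 162.72),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42, 1.36),
  Stock3 = c(1605, 1580, 1490, 1520, 1550, 1525),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04, 91.70)
)

# Numeric matrix with one row per day
values <- as.matrix(StockExample[, -1])
rownames(values) <- StockExample$X1

# boxplot() on a matrix draws one box per column,
# so transpose to get one box per day
boxplot(t(values))

# apply() over rows (MARGIN = 1): five-number summary for each day
day_stats <- apply(values, 1, fivenum)
day_stats
```

The result of apply() here is a matrix with one column per day, which is handy for checking the plotted boxes against the raw numbers.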
Your solution works, except you should add colour = X1 in aes. Here is the data:
structure(list(X1 = structure(1:6, .Label = c("Day1", "Day2", "Day3", "Day4", "Day5", "Day6"), class = "factor"), Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87, 162.72), Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42, 1.36), Stock3 = c(1605L, 1580L, 1490L, 1520L, 1550L, 1525L), Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04, 91.7)), class = "data.frame", row.names = c(NA, -6L))
I was just about to post an answer, but you were faster. – JBGruber, Aug 26 at 15:47