Sometimes your sampling rate is too high; group_every() lets you
down-sample by creating "bins" that can subsequently be summarised. When binning by n
(a number of observations), the data need to be regularly sampled; if there are gaps in
time, the bin duration will differ. Works well with calculate_summary() for movement data.
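To see why gaps matter when binning by observation count, here is a minimal base-R sketch. It builds the bins by hand with integer division and is only an illustration of the caveat, not the implementation of group_every(); the helper bin_duration() and the simulated timestamps are assumptions made for this example.

```r
# Hypothetical illustration: fixed-size bins built by hand, NOT group_every() itself.
n <- 30                                                 # observations per bin
time_regular <- seq(0, by = 1 / 30, length.out = 300)   # 30 Hz, no gaps
# Drop 60 samples (2 seconds) from the middle of a bin to simulate a recording gap
time_gappy <- time_regular[-(46:105)]

# Assumed helper: time span covered by each bin of n consecutive observations
bin_duration <- function(time, n) {
  bin <- (seq_along(time) - 1) %/% n                    # 0-based bin index
  tapply(time, bin, function(t) max(t) - min(t))
}

dur_regular <- bin_duration(time_regular, n)            # one value per bin
dur_gappy <- bin_duration(time_gappy, n)
```

With regular sampling every bin spans 29/30 of a second, whereas the bin that straddles the gap spans about 2.97 seconds even though it still holds 30 observations. Binning by time avoids this because bin edges are defined by the timestamps rather than by row position.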
Examples
## Group by every 5 seconds
df_time <- data.frame(
  time = seq(from = 0.02, to = 100, by = 1/30), # time at 30Hz, slightly offset
  y = rnorm(3000))                              # random numbers
df_time |>
  group_every(seconds = 5) |>          # group for every 5 seconds
  dplyr::summarise(time = min(time),   # summarise for time and y
                   mean_y = mean(y)) |>
  dplyr::mutate(time = floor(time))    # floor to get the round second number
#> # A tibble: 20 × 3
#>      bin  time   mean_y
#>    <dbl> <dbl>    <dbl>
#>  1     0     0  0.0554
#>  2     1     5  0.0484
#>  3     2    10  0.0538
#>  4     3    15 -0.0803
#>  5     4    20 -0.126
#>  6     5    25 -0.0140
#>  7     6    30  0.0443
#>  8     7    35 -0.189
#>  9     8    40 -0.0825
#> 10     9    45  0.120
#> 11    10    50  0.0276
#> 12    11    55  0.124
#> 13    12    60 -0.0545
#> 14    13    65  0.0156
#> 15    14    70 -0.0550
#> 16    15    75  0.0348
#> 17    16    80  0.0331
#> 18    17    85  0.0543
#> 19    18    90  0.00880
#> 20    19    95  0.158
## Group every n observations
df <- data.frame(
  x = seq_len(1000),                   # observation index 1..1000
  y = rnorm(1000))                     # random numbers
df |>
  group_every(n = 30) |>               # group every 30 observations together
  dplyr::summarise(mean_x = mean(x),   # summarise x and y per bin
                   mean_y = mean(y))
#> # A tibble: 34 × 3
#>      bin mean_x   mean_y
#>    <dbl>  <dbl>    <dbl>
#>  1     1   15.5  0.0405
#>  2     2   45.5 -0.226
#>  3     3   75.5 -0.0563
#>  4     4  106.   0.189
#>  5     5  136.  -0.0854
#>  6     6  166.   0.0903
#>  7     7  196.   0.0856
#>  8     8  226.  -0.136
#>  9     9  256.  -0.0732
#> 10    10  286.   0.00451
#> # ℹ 24 more rows