When the sampling rate is higher than you need, group_every() lets you
down-sample by grouping observations into "bins" that can subsequently be summarised.
When using n, the data need to be regularly sampled; if there are gaps in time, the bin durations will differ.
Works well with calculate_summary() for movement data.
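The caveat about regular sampling can be illustrated with a small base R sketch. It does not call group_every(); the helper bin_duration() and the integer-division binning rule are assumptions made for illustration, not the package's implementation.

```r
# Illustrative stand-in for binning by n observations (hypothetical helper,
# not the package's code): every n consecutive rows share one bin.
bin_duration <- function(time, n) {
  bin <- (seq_along(time) - 1L) %/% n                # 0-based bin index
  tapply(time, bin, function(t) max(t) - min(t))     # time spanned by each bin
}

time <- seq(0, by = 1/30, length.out = 300)          # 10 s sampled regularly at 30 Hz
time_gappy <- time[-(100:159)]                       # drop 60 samples (2 s) to create a gap

regular <- bin_duration(time, n = 30)                # all bins span the same duration
gappy   <- bin_duration(time_gappy, n = 30)          # the bin straddling the gap is longer
```

With regular sampling every bin of 30 observations spans 29/30 of a second. In the gappy series the fourth bin straddles the missing samples and spans 89/30 seconds (about 2.97 s), even though it still holds exactly 30 observations.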
Examples
## Group by every 5 seconds
df_time <- data.frame(
  time = seq(from = 0.02, to = 100, by = 1/30), # time at 30Hz, slightly offset
  y = rnorm(3000)) # random numbers
df_time |>
  group_every(seconds = 5) |> # group for every 5 seconds
  dplyr::summarise(time = min(time), # summarise for time and y
                   mean_y = mean(y)) |>
  dplyr::mutate(time = floor(time)) # floor to get the round second number
#> # A tibble: 20 × 3
#> bin time mean_y
#> <dbl> <dbl> <dbl>
#> 1 0 0 -0.0669
#> 2 1 5 -0.128
#> 3 2 10 0.000244
#> 4 3 15 0.0100
#> 5 4 20 -0.131
#> 6 5 25 -0.161
#> 7 6 30 0.0575
#> 8 7 35 0.0127
#> 9 8 40 0.166
#> 10 9 45 -0.0199
#> 11 10 50 -0.0209
#> 12 11 55 -0.0108
#> 13 12 60 0.0113
#> 14 13 65 -0.0000138
#> 15 14 70 -0.00843
#> 16 15 75 0.129
#> 17 16 80 0.0124
#> 18 17 85 0.0706
#> 19 18 90 0.0656
#> 20 19 95 0.0229
# Group every n observations
df <- data.frame(
  x = 1:1000,
  y = rnorm(1000))
df |>
  group_every(n = 30) |> # group every 30 observations together
  dplyr::summarise(mean_x = mean(x),
                   mean_y = mean(y))
#> # A tibble: 34 × 3
#> bin mean_x mean_y
#> <dbl> <dbl> <dbl>
#> 1 1 15.5 -0.100
#> 2 2 45.5 -0.0723
#> 3 3 75.5 -0.232
#> 4 4 106. -0.379
#> 5 5 136. -0.243
#> 6 6 166. -0.0330
#> 7 7 196. -0.174
#> 8 8 226. -0.187
#> 9 9 256. 0.0321
#> 10 10 286. -0.0391
#> # ℹ 24 more rows
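
The 34 rows in the output above follow from splitting 1000 observations into groups of 30: 33 full bins plus a final, shorter one. The base R sketch below reproduces that arithmetic. The binning rule is an assumed stand-in chosen to match the printed bin numbers and mean_x values, not necessarily how group_every() computes them.

```r
# Assumed stand-in for n-based binning (not the package's code):
# observations 1-30 fall in bin 1, 31-60 in bin 2, and so on.
x <- 1:1000
bin <- (seq_along(x) - 1L) %/% 30L + 1L

n_bins    <- max(bin)               # number of bins
bin_sizes <- table(bin)             # observations per bin
mean_x    <- tapply(x, bin, mean)   # per-bin mean of x
```

Under this rule the last bin holds only the 10 remaining observations (991 to 1000), so summaries for that bin rest on fewer data points than the others.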