Get first element of nested data
2019-06-17
I was cleaning some data today and found myself wanting to grab just the first row of a set of rows belonging to a certain group. Here’s how to do that using some tidyverse packages in R.
library(tidyr); packageVersion("tidyr")
#> [1] '0.8.3'
library(dplyr); packageVersion("dplyr")
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
#> [1] '0.8.0.1'
library(purrr); packageVersion("purrr")
#> [1] '0.3.2'

iris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(sample = map(data, function(x) head(x, 1))) %>%
  select(-data) %>%
  unnest()
#> # A tibble: 3 x 5
#>   Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>   <fct>             <dbl>       <dbl>        <dbl>       <dbl>
#> 1 setosa              5.1         3.5          1.4         0.2
#> 2 versicolor          7           3.2          4.7         1.4
#> 3 virginica           6.3         3.3          6           2.5
Created on 2019-06-18 by the reprex package (v0.2.1)
So here, I’ve taken the ubiquitous iris dataset, grouped the rows by species, nested each group into its own data frame, and kept the first row of each one. You can probably do a random sample in the map() function instead of head() for stratified sampling.
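
Here is a rough sketch of that stratified-sampling idea. I have not tested it against the package versions shown above, so treat it as a starting point. The sample size n_per_group and the seed are values I made up for illustration, and I swapped head() for dplyr's sample_n() (newer dplyr releases supersede it with slice_sample()). The ungroup() call and naming the column in unnest() are there so the same pipeline should also behave on newer tidyr versions.

```r
library(tidyr)
library(dplyr)
library(purrr)

set.seed(42)      # arbitrary seed so the random draw is repeatable
n_per_group <- 5  # hypothetical number of rows to draw per species

stratified <- iris %>%
  group_by(Species) %>%
  nest() %>%
  ungroup() %>%   # drop any grouping so later steps see a plain tibble
  # draw n_per_group random rows from each nested data frame
  mutate(sample = map(data, function(x) sample_n(x, n_per_group))) %>%
  select(-data) %>%
  unnest(sample)  # name the list-column explicitly

# each species should contribute the same number of rows
table(stratified$Species)
```

If the groups are different sizes and you want a proportional sample instead, sample_frac() can take the place of sample_n() in the same spot.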