
Build a tibble in R from various different counts

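I have a fairly simple question: if you have a raw dataset and you answer a question by filtering that dataset and counting values, how do you build a data frame/tibble of your answers?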

    # load the packages
    library(easypackages)
    packages("tidyverse","readxl","sf","tmaptools","tmap","lubridate",
             "lwgeom","Cairo","nngeo","purrr","scales", "ggthemes","janitor")
    
    polls<-st_as_sf(read.csv(url("https://www.caerphilly.gov.uk/CaerphillyDocs/FOI/Datasets_polling_stations_csv.aspx")),
                    coords = c("Easting","Northing"),crs = 27700)%>%
      mutate(date = sample(seq(as.Date('2020/01/01'), as.Date('2020/05/31'), by="day"), 147))
    
    test_stack<-polls%>%st_join(polls%>%st_buffer(dist=1000),join=st_within)%>%
      filter(Ballot.Box.Polling.Station.x!=Ballot.Box.Polling.Station.y)%>%
      add_count(Ballot.Box.Polling.Station.x)%>%
      rename(number_of_neighbours = n)%>%
      mutate(interval_date = date.x-date.y)%>%
      subset(select = -c(6:8,10,11,13:18))
    ## appending the two commented steps below to the pipeline will summarise
    ## the data so that only number of neighbours is returned:
    ##   %>% distinct(Ballot.Box.Polling.Station.x,number_of_neighbours,date.x)%>%
    ##   filter(number_of_neighbours >=2)
    
    polls%>%mutate(id = as.numeric(row_number()))%>% mutate(thing = case_when(id %% 2 == 0 ~ "stuff",
                                                                              id %% 2 !=0 ~ "type"))->polls 


 polls%>%filter(thing=="stuff"& Polling.District.Code =="AC")%>%count()

 polls%>%filter(thing == "type" & Polling.District.Code =="IA")%>%count()

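How do you build a data frame where the row names are meaningful and the column holds the calculated values?

So something like

    row name    value
    stuff AC    1
    type IA     1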

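It sounds like you want to group_by thing and Polling.District.Code, then summarize by counting the length of each group. If you want the summary data frame to drop the geometry column, you need to use st_set_geometry(NULL):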

 polls %>% 
   group_by(thing, Polling.District.Code) %>% 
   summarize(count = length(thing), .groups = "keep") %>%
   st_set_geometry(NULL)
#> # A tibble: 147 x 3
#> # Groups:   thing, Polling.District.Code [147]
#>    thing Polling.District.Code count
#>  * <chr> <chr>                 <int>
#>  1 stuff AC                        1
#>  2 stuff AE                        1
#>  3 stuff BB1                       1
#>  4 stuff CA1                       1
#>  5 stuff CB1                       1
#>  6 stuff CC                        1
#>  7 stuff CE                        1
#>  8 stuff DA2                       1
#>  9 stuff DB1                       1
#> 10 stuff DB3                       1
#> # ... with 137 more rows
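
As a minimal sketch of the same idea on plain data, the snippet below uses a small made-up tibble (toy_polls and its values are assumptions, not the real polling data) and dplyr's count(), which is shorthand for grouping and then counting rows. The row_name column is a hypothetical label, pasted together to mimic the layout asked for in the question.

```r
library(dplyr)

# hypothetical stand-in for the polls data, without a geometry column
toy_polls <- tibble(
  thing = c("stuff", "type", "stuff", "type", "stuff"),
  Polling.District.Code = c("AC", "IA", "AC", "IA", "AE")
)

# count() returns one row per combination of the listed columns
answer <- toy_polls %>%
  count(thing, Polling.District.Code, name = "value") %>%
  # build a readable label from the two grouping columns
  mutate(row_name = paste(thing, Polling.District.Code)) %>%
  select(row_name, value)

answer
```

Keeping the label as an ordinary column rather than as true row names fits tibbles, which do not keep row names.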

Alternatively, if you want to keep the geometry, use:

 polls %>% 
   group_by(thing, Polling.District.Code) %>% 
   summarize(count = length(thing), .groups = "keep")
#> Simple feature collection with 147 features and 3 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 310399 ymin: 186331 xmax: 325960 ymax: 207788
#> projected CRS:  OSGB 1936 / British National Grid
#> # A tibble: 147 x 4
#> # Groups:   thing, Polling.District.Code [147]
#>    thing Polling.District.Code count        geometry
#>    <chr> <chr>                 <int>     <POINT [m]>
#>  1 stuff AC                        1 (311777 206968)
#>  2 stuff AE                        1 (311734 206047)
#>  3 stuff BB1                       1 (310577 205577)
#>  4 stuff CA1                       1 (314777 202748)
#>  5 stuff CB1                       1 (314777 202748)
#>  6 stuff CC                        1 (314622 203396)
#>  7 stuff CE                        1 (315255 201843)
#>  8 stuff DA2                       1 (315780 200318)
#>  9 stuff DB1                       1 (314693 199774)
#> 10 stuff DB3                       1 (315034 199159)
#> # ... with 137 more rows
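
The difference between the two variants can be sketched on a tiny sf object. The points and the name toy_sf below are made up for illustration. The sketch also uses st_drop_geometry(), which sf provides as an alternative spelling of st_set_geometry(NULL), and n() in place of length(thing).

```r
library(dplyr)
library(sf)

# hypothetical points in British National Grid coordinates (EPSG:27700)
toy_sf <- st_as_sf(
  tibble(
    thing = c("stuff", "stuff", "type"),
    Polling.District.Code = c("AC", "AC", "IA"),
    Easting = c(311777, 311734, 310577),
    Northing = c(206968, 206047, 205577)
  ),
  coords = c("Easting", "Northing"), crs = 27700
)

# summarising an sf object keeps a geometry column, with the
# geometries of each group combined into one
with_geometry <- toy_sf %>%
  group_by(thing, Polling.District.Code) %>%
  summarize(count = n(), .groups = "drop")

# dropping the geometry leaves only the attribute table
without_geometry <- st_drop_geometry(with_geometry)
```

If only the counts are needed, dropping the geometry before summarising should avoid the work of combining geometries per group.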

I think the answer is bind_rows:

polls%>%filter(thing=="stuff"& Polling.District.Code =="AC")%>%count()->a
polls%>%filter(thing == "type" & Polling.District.Code =="IA")%>%count()->b

bind_rows(a,b)->c
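
To get meaningful row labels out of this approach, one option is to pass bind_rows() a named list together with its .id argument. The sketch below assumes a plain made-up tibble (toy_polls) rather than the sf object from the question, and the labels "stuff AC" and "type IA" are chosen by hand. The result is stored in answers rather than c, so that it does not share a name with base R's c() function.

```r
library(dplyr)

# hypothetical stand-in for the polls data
toy_polls <- tibble(
  thing = c("stuff", "type", "stuff"),
  Polling.District.Code = c("AC", "IA", "AE")
)

a <- toy_polls %>%
  filter(thing == "stuff" & Polling.District.Code == "AC") %>%
  count()
b <- toy_polls %>%
  filter(thing == "type" & Polling.District.Code == "IA") %>%
  count()

# the list names become the values of the row_name column
answers <- bind_rows(list("stuff AC" = a, "type IA" = b), .id = "row_name")

answers
```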
