Groups in R (housing data)

I have a housing dataset that is grouped by property code. It indicates who owns each property in any given year. The dataset includes the year, the property code, and the owner's name. It also includes a binary variable called "change" that indicates whether the owner of the property has changed.

I want to loop through each property group to find where the owner changes (change = 1). Whenever there is a change in owner, I want to record it in a new dataset in which one column holds the name of the old owner and the other holds the name of the new owner.

The aim is to eventually analyse whether the gender or ethnicity of the owner changes. I am using the packages wru and gender, and plan to compare the old owner with the new one once both have been identified.

I'm very new to R and would love it if someone could guide me through this.

library(dplyr)

# Keep the four relevant columns and give them readable names
property_changes <- dataset[, c(1, 2, 3, 10)]
colnames(property_changes) <- c("year", "change", "property", "name")
# Group by the bare column name rather than property_changes$property
groups <- group_by(property_changes, property)

Sample of the dataset

Welcome to R. I strongly suggest you look at the "dplyr" package and its recode() function. Instead of looping, we can create a new column that holds "yes" or "no" depending on whether the ownership of the property has changed, and then filter on that column to pull only the rows where the ownership changed. I created a simple example to illustrate.

library(dplyr)

# Toy data: change = 1 marks the first year of a new owner
year <- seq(2000, 2009, 1)
change <- c(0, 0, 0, 0, 1, 0, 1, 0, 0, 0)
owner <- c("bob", "bob", "bob", "bob", "alice", "alice", "lisa", "lisa", "lisa", "lisa")
# stringsAsFactors = FALSE keeps owner as character on every R version
prop <- data.frame(year, change, owner, stringsAsFactors = FALSE)

prop %>%
  # Use the bare column name; a quoted "owner" would group by a constant string
  group_by(owner) %>%
  mutate(change_in_ownership = recode(change,
                                      `0` = "no",
                                      `1` = "yes")) %>%
  filter(change_in_ownership == "yes")

After filtering, my output is:

   year change owner change_in_ownership
  <dbl>  <dbl> <chr> <chr>              
1  2004      1 alice yes                
2  2006      1 lisa  yes                
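
The filter above returns only the new owner on each change row. To get the old and new owner side by side, as the question asks, one possible approach is to look up the previous row's owner within each property using dplyr's lag(). The sketch below is only an illustration under assumptions: the toy data, the `property` codes "A1" and "B2", and the column names `old_owner` and `new_owner` are all made up here, and it assumes there is one row per property per year.

```r
library(dplyr)

# Hypothetical toy data: two properties, one row per property per year
prop2 <- data.frame(
  property = rep(c("A1", "B2"), each = 5),
  year     = rep(2000:2004, times = 2),
  change   = c(0, 0, 1, 0, 0,  0, 0, 0, 1, 0),
  owner    = c("bob", "bob", "alice", "alice", "alice",
               "lisa", "lisa", "lisa", "tom", "tom"),
  stringsAsFactors = FALSE
)

owner_changes <- prop2 %>%
  arrange(property, year) %>%                  # lag() relies on row order
  group_by(property) %>%                       # never look across properties
  mutate(old_owner = dplyr::lag(owner)) %>%    # owner in the previous year
  ungroup() %>%
  filter(change == 1) %>%                      # keep only the change rows
  select(property, year, old_owner, new_owner = owner)

print(owner_changes)
```

With this toy data the result has two rows: bob to alice for A1 in 2002, and lisa to tom for B2 in 2003. Grouping by property matters here, because without it the first row of one property would pick up the last owner of the previous property. The `old_owner` and `new_owner` columns could then be passed to whatever name-based classification you run afterwards.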
