
How to group columns in 1 row and gather information in R?

I have this data.frame:

# Variable 1        Variable 2    Business(spec)
# Name Business     Macaco SA     NA
# ID Business       10            NA
# Type Business     Fishing       NA
# Name Business     Simio SA      NA
# ID Business       20            NA
# Type Business     Other         Transport

This was an Excel file that I imported into R. Variable 2 holds a value chosen from a list of specific business types. If the type of business is not in the list, respondents must write their type of business in Business(spec), so this variable can have different values. I want to reshape the data like this:

# Name Business    ID Business    Type Business    Business(spec)
# Macaco SA        10             Fishing          NA
# Simio SA         20             Other            Transport

I tried pivot_wider(data, names_from = "Variable 1", values_from = "Variable 2"), but it didn't work. How can I do this?

Thanks!

If you can group the rows together (is it always 3 rows per business?), then you have a better chance.

I had to munge the data to use it more easily, so variable names and values are slightly different.

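library(tidyr)  # provides pivot_wider()

# Note: read.table() sanitizes the header "Business(spec)" to "Business.spec."
dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text="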
Variable1         Variable2     Business(spec)
Name_Business     Macaco_SA     NA
ID_Business       10            NA
Type_Business     Fishing       NA
Name_Business     Simio_SA      NA
ID_Business       20            NA
Type_Business     Other         Transport")

dat$grp <- cumsum(dat$Variable1 == "Name_Business")
dat
#       Variable1 Variable2 Business.spec. grp
# 1 Name_Business Macaco_SA           <NA>   1
# 2   ID_Business        10           <NA>   1
# 3 Type_Business   Fishing           <NA>   1
# 4 Name_Business  Simio_SA           <NA>   2
# 5   ID_Business        20           <NA>   2
# 6 Type_Business     Other      Transport   2
# id_cols = "grp" uses only grp as the row identifier, so Business.spec. is not carried over
pivot_wider(dat, id_cols = "grp", names_from = "Variable1", values_from = "Variable2")
# # A tibble: 2 x 4
#     grp Name_Business ID_Business Type_Business
#   <int> <chr>         <chr>       <chr>        
# 1     1 Macaco_SA     10          Fishing      
# 2     2 Simio_SA      20          Other        
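
The result of that pivot_wider() call leaves out the Business(spec) column that the question asks for. One possible way to keep it, sketched here and not part of the original answer, is to spread the non-missing Business.spec. value across each group with tidyr::fill() and then use it as a second id column. The sketch assumes dplyr is available, that each business has at most one non-NA Business.spec. value, and it rebuilds the sample data with data.frame() so it runs on its own.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt by hand (same values as the munged data in the answer)
dat <- data.frame(
  Variable1 = rep(c("Name_Business", "ID_Business", "Type_Business"), times = 2),
  Variable2 = c("Macaco_SA", "10", "Fishing", "Simio_SA", "20", "Other"),
  Business.spec. = c(NA, NA, NA, NA, NA, "Transport"),
  stringsAsFactors = FALSE
)

wide <- dat %>%
  # A new group starts at every "Name_Business" row
  mutate(grp = cumsum(Variable1 == "Name_Business")) %>%
  group_by(grp) %>%
  # Copy the single non-NA Business.spec. value (if any) to every row of its group
  fill(Business.spec., .direction = "downup") %>%
  ungroup() %>%
  # Business.spec. is now constant within a group, so it can act as an id column
  pivot_wider(
    id_cols = c(grp, Business.spec.),
    names_from = Variable1,
    values_from = Variable2
  ) %>%
  # Put the columns in the order shown in the question
  select(Name_Business, ID_Business, Type_Business, Business.spec.)

wide
```

If a business could have several different Business.spec. values, this would produce more than one row for that business, so the assumption is worth checking on real data.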

This assumes that "Name_Business" is always the first row within a grouping of rows, and that everything immediately after it belongs to that business.
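
Since the approach depends on that row layout, it may be worth verifying it before reshaping. The following is a minimal base R sketch of such a check. The names expected_keys and layout_ok are hypothetical and introduced only for illustration; the check confirms that no rows precede the first "Name_Business" and that every group contains each expected key exactly once.

```r
# Sample data rebuilt by hand (same keys and values as the munged data in the answer)
dat <- data.frame(
  Variable1 = rep(c("Name_Business", "ID_Business", "Type_Business"), times = 2),
  Variable2 = c("Macaco_SA", "10", "Fishing", "Simio_SA", "20", "Other"),
  stringsAsFactors = FALSE
)

# Keys every business is expected to have (assumption based on the sample data)
expected_keys <- c("Name_Business", "ID_Business", "Type_Business")

# Same grouping rule as in the answer: a new group starts at each "Name_Business"
grp <- cumsum(dat$Variable1 == "Name_Business")

# Count how often each expected key appears in each group
counts <- table(grp, factor(dat$Variable1, levels = expected_keys))

layout_ok <- all(grp > 0) &&                          # no rows before the first name
  all(dat$Variable1 %in% expected_keys) &&            # no unexpected keys
  all(counts == 1)                                    # each key exactly once per group

layout_ok
```

When layout_ok is FALSE, pivot_wider() may return list-columns or missing cells, so inspecting counts first can show which group is incomplete or duplicated.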
