
Loop to clean up table where observations are stored as column

I have a table that stores observations in column x and the corresponding variable names in column y, as shown below.

I am trying to write an R loop to create a matrix where each observation is a row and each variable is a column.

The problem is that not all observations have all the variables.

Original data:

x        y
Apple    Fruit
Austria  Origin
Summer   Season
Orange   Fruit
Spain    Origin
Pear     Fruit
Tomato   Fruit
Italy    Origin
Summer   Season

Desired output:

Fruit    Origin   Season
Apple    Austria  Summer
Orange   Spain
Pear
Tomato   Italy    Summer

My thinking so far (pseudo R code):

df_old <- data.frame( x = c( "Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer" ),
                      y = c( "Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season" ) )

df_new <- data.frame( matrix( ncol = 3, nrow = 0 ) )
colnames( df_new ) <- c( "Fruit", "Origin", "Season")

for ( i in seq_along( df_old ) ) {
  if ( y == "Fruit" ) {
    # add new row
    df_new$Fruit <- df_old$x
  } else if ( y == "Origin" ) {
    df_new$Origin <- df_old$x
  } else ( y == "Season" ) {
    df_new$Season <- df_old$x
  }
}

Thank you for helping.

Here is a solution based on your idea of using a for loop. It assumes that each observation starts with a "Fruit" entry, optionally followed by its "Origin" and then its "Season".

df_old <- data.frame( x = c( "Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer" ),
                      y = c( "Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season" ),
                      stringsAsFactors = FALSE )

# One row per "Fruit" entry, one column per variable; missing values stay NA
df_new <- as.data.frame( matrix( NA, nrow = sum( df_old$y == "Fruit" ), ncol = length( unique( df_old$y ) ) ) )
names( df_new ) <- c( "Fruit", "Origin", "Season" )

j <- 0  # index of the current row in df_new
for ( i in seq_len( nrow( df_old ) ) ) {
  if ( df_old$y[i] == "Fruit" ) {
    # Each "Fruit" entry starts a new observation
    j <- j + 1
    df_new$Fruit[j] <- df_old$x[i]
    # An "Origin", if present, directly follows the fruit.
    # %in% returns FALSE (not NA) when i + 1 is past the last row.
    has_origin <- df_old$y[i + 1] %in% "Origin"
    if ( has_origin ) df_new$Origin[j] <- df_old$x[i + 1]
    # A "Season", if present, follows the origin, or the fruit when there is no origin
    k <- if ( has_origin ) i + 2 else i + 1
    if ( df_old$y[k] %in% "Season" ) df_new$Season[j] <- df_old$x[k]
  }
}
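
For comparison, the same reshaping can probably be done without an explicit loop. The following is only a sketch in base R, not part of the original answer. It assumes that every observation begins with a "Fruit" row, so a running count of "Fruit" rows identifies which observation each row belongs to. The names obs, vars and wide are illustrative.

```r
df_old <- data.frame(
  x = c("Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer"),
  y = c("Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season"),
  stringsAsFactors = FALSE
)

# Assumption: each observation starts with a "Fruit" row, so the cumulative
# count of "Fruit" rows gives the observation index of every row.
obs <- cumsum(df_old$y == "Fruit")

# Variable names in order of first appearance: "Fruit", "Origin", "Season"
vars <- unique(df_old$y)

# All-NA character matrix: one row per observation, one column per variable
wide <- matrix(NA_character_, nrow = max(obs), ncol = length(vars),
               dimnames = list(NULL, vars))

# Fill cells by (row, column) index pairs; variables that never occur
# for an observation simply stay NA
wide[cbind(obs, match(df_old$y, vars))] <- df_old$x

df_new <- as.data.frame(wide, stringsAsFactors = FALSE)
print(df_new)
```

Unlike the look-ahead loop, this sketch does not depend on the order of "Origin" and "Season" within an observation, only on "Fruit" coming first.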


