Loop to clean up table where observations are stored as column
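I have a table that stores observed values in column `x` and the corresponding variable names in column `y`, as shown below.
I am trying to write an R loop that builds a matrix in which each observation is a row and each variable is a column.
The problem is that not every observation contains every variable.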
Original data:
x | y |
---|---|
Apple | Fruit |
Austria | Origin |
Summer | Season |
Orange | Fruit |
Spain | Origin |
Pear | Fruit |
Tomato | Fruit |
Italy | Origin |
Summer | Season |
Desired output:
Fruit | Origin | Season |
---|---|---|
Apple | Austria | Summer |
Orange | Spain | |
Pear | | |
Tomato | Italy | Summer |
My idea so far (pseudo-R code):
df_old <- data.frame( x = c( "Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer" ),
                      y = c( "Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season" ) )
df_new <- data.frame( matrix( ncol = 3, nrow = 0 ) )
colnames( df_new ) <- c( "Fruit", "Origin", "Season")
for ( i in seq_along( df_old ) ) {
  if ( y == "Fruit" ) {
    # add new row
    df_new$Fruit <- df_old$x
  } else if ( y == "Origin" ) {
    df_new$Origin <- df_old$x
  } else ( y == "Season" ) {
    df_new$Season <- df_old$x
  }
}
Thanks for your help.
Here is a solution based on your idea of using a for loop.
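df_old <- data.frame(
  x = c("Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer"),
  y = c("Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season"),
  stringsAsFactors = FALSE
)
# One row per "Fruit" entry, one column per distinct variable name; all cells start as NA.
df_new <- as.data.frame(matrix(NA, nrow = sum(df_old$y == "Fruit"), ncol = length(unique(df_old$y))))
names(df_new) <- c("Fruit", "Origin", "Season")
j <- 0  # index of the output row currently being filled
for (i in seq_len(nrow(df_old))) {
  if (df_old$y[i] == "Fruit") {
    # Every "Fruit" row starts a new observation, i.e. a new row in df_new.
    j <- j + 1
    df_new$Fruit[j] <- df_old$x[i]
    # "Origin", if present, is expected directly after "Fruit".
    # isTRUE() avoids an error when i + 1 runs past the last row (the comparison is then NA).
    has_origin <- isTRUE(df_old$y[i + 1] == "Origin")
    if (has_origin) df_new$Origin[j] <- df_old$x[i + 1]
    # "Season" follows "Origin" when there is one, otherwise it follows "Fruit" directly.
    # Looking only at that one position keeps us from picking up the next observation's season.
    k <- if (has_origin) i + 2 else i + 1
    if (isTRUE(df_old$y[k] == "Season")) df_new$Season[j] <- df_old$x[k]
  }
}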
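The loop above relies on the variables appearing in a fixed order after each "Fruit" row. As an alternative, here is a minimal base-R sketch without an explicit loop. It assumes only that every observation starts with a "Fruit" row; the names `obs_id`, `vars`, `wide` and `df_wide` are illustrative and not part of the original question.

```r
df_old <- data.frame(
  x = c("Apple", "Austria", "Summer", "Orange", "Spain", "Pear", "Tomato", "Italy", "Summer"),
  y = c("Fruit", "Origin", "Season", "Fruit", "Origin", "Fruit", "Fruit", "Origin", "Season"),
  stringsAsFactors = FALSE
)

# Assumption: each observation begins with a "Fruit" row, so a running count
# of "Fruit" rows tells us which observation every row belongs to.
obs_id <- cumsum(df_old$y == "Fruit")

# Column order follows the first appearance of each variable name in y.
vars <- unique(df_old$y)

# Start from an all-NA character matrix and fill it using (row, column) index pairs.
wide <- matrix(NA_character_, nrow = max(obs_id), ncol = length(vars),
               dimnames = list(NULL, vars))
wide[cbind(obs_id, match(df_old$y, vars))] <- df_old$x

df_wide <- as.data.frame(wide, stringsAsFactors = FALSE)
print(df_wide)
```

With the sample data this should give four rows, with `NA` wherever an observation lacks a variable (for example, Pear has neither an origin nor a season). Because each value is placed by its own variable name, the order of "Origin" and "Season" within an observation does not matter here.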