How to fix "invalid factor level"?

Question

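I can't get a mean calculation to run. Here is my code:

I have successfully tried the factor(data$Date) function. The shell reports that it consists of 890 entries with 51 levels.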

   data <- read.table("R/DATA.csv", sep = ";", header = TRUE, dec = ",")
   View(data)
   colnames(data)[1] <- "Date"
   eau <- data$"Tension"
   eaucalculee <- ( 0.000616 * eau - 0.1671) * 100
   data["Eau"] <- eaucalculee
   tata <- data.frame("Aucun","Augmentation","Interception")

   tata[1,1]<-mean(data$Eau[data$Date == levels(factor(data$Date))[1]& 
   data$Traitement == "Aucun"])

I want the first row of the first column of the data frame to be filled with the mean, but instead I get this error message:

   In `[<-.factor`(`*tmp*`, iseq, value = 8.6692) :
   invalid factor level, NA generated

Could you please help me?

You can find the CSV file here: https://drive.google.com/file/d/1zbA25vajouQ4MiUF72hbeV8qP9wlMqB9/view?usp=sharing

Thank you very much.

Answer 1

tata is a data.frame of factors, and you are trying to insert a number into it. Try:

tata <- data.frame("Aucun", "Augmentation", "Interception", stringsAsFactors = FALSE)
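
A minimal sketch of the difference, using only the three strings from the question. Note that stringsAsFactors is set explicitly in both calls: since R 4.0.0 the default is FALSE, so the warning only shows up on older versions or when factors are requested. The names tata_factor and tata_chr are illustrative, not from the question.

```r
# Factor columns (the default before R 4.0.0): the number is not a known level
tata_factor <- data.frame("Aucun", "Augmentation", "Interception",
                          stringsAsFactors = TRUE)
class(tata_factor[[1]])        # the column is a factor
tata_factor[1, 1] <- 8.6692    # warning: invalid factor level, NA generated
is.na(tata_factor[1, 1])       # the cell now holds NA

# Character columns: the assignment goes through without a warning
tata_chr <- data.frame("Aucun", "Augmentation", "Interception",
                       stringsAsFactors = FALSE)
tata_chr[1, 1] <- 8.6692
class(tata_chr[[1]])           # still character: the number is stored as text
```

The character version avoids the warning, but the value ends up as text rather than as a number, which is worth keeping in mind if further arithmetic is planned on that column.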

Answer 2

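I'm not sure the line tata <- data.frame("Aucun","Augmentation","Interception") does what you expect. If you inspect its result with View(tata), you will see a data frame with one record and 3 columns, whose values are the 3 strings (converted to factors, as @s-brunel said). The column names are inferred from the values (X.Aucun., etc.). I think you rather wanted to create a data frame whose column names are the given strings.

Suggested code, with comments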
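
To see the shape described above without opening the viewer, a short sketch like the following may help. The second data frame, tata_named, is a hypothetical illustration of "strings as column names" and is not part of the original question.

```r
# Unnamed arguments become VALUES; the column names are derived from them
tata <- data.frame("Aucun", "Augmentation", "Interception")
dim(tata)     # one row, three columns
names(tata)   # names built from the values, such as X.Aucun.

# Hypothetical alternative: named arguments become COLUMN NAMES,
# here with a numeric NA placeholder in each column
tata_named <- data.frame(Aucun = NA_real_,
                         Augmentation = NA_real_,
                         Interception = NA_real_)
tata_named[1, "Aucun"] <- 8.6692   # numeric column, so no factor warning
```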

data <- read.table("R/DATA.csv", sep = ";", header = TRUE, dec = ",")

# The following is useless since first column is already named Date
# colnames(data)[1] <- "Date"

# No need to create your intermediate variables eau and eaucalculee: you can 
# do it directly with the data frame columns
data$Eau <- ( 0.000616 * data$Tension - 0.1671) * 100

# No need to create your tata data frame before filling its actual content, you
# can do it directly
tata <- data.frame(
  Aucun = mean(data$Eau[
    data$Date == levels(factor(data$Date))[1] & data$Traitement == "Aucun"
    ])
  )
# Placeholders: replace your_formula_here with the matching mean() expression
tata$Augmentation <- your_formula_here
tata$Interception <- your_formula_here

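Note 1: The simplest way to refer to a data frame column is with $, and you don't need any double quotes. You can also use [[ with double quotes (equivalent), but be aware that [ returns a data frame with a single column: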

class(data$Date)
# [1] "factor"
class(data[["Date"]])
# [1] "factor"
class(data["Date"])
# [1] "data.frame"
class(data[ , "Date"])
# [1] "factor"

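Note 2: Trying to reverse-engineer your question, maybe you want to compute the mean of Eau for each combination of Date and Traitement. In that case, I suggest dplyr and tidyr from the awesome tidyverse package: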

# install.packages("tidyverse") # if you don't already have it
library(tidyverse)

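# Inside mutate() columns can be referenced by name, without data$
data <- data %>% 
  mutate(Eau = (0.000616 * Tension - 0.1671) * 100)

# Average the Eau column (capital E) for each Date/Traitement pair
tata_vertical <- data %>% 
  group_by(Date, Traitement) %>% 
  summarise(mean_eau = mean(Eau))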
View(tata_vertical)

tata <- tata_vertical %>% spread(Traitement, mean_eau)
View(tata)
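
If installing the tidyverse is not an option, a comparable Date by Traitement table of means can be sketched in base R with tapply(). This is an alternative to the dplyr/tidyr pipeline above, not a rewrite of it, and the toy data below is invented for illustration.

```r
# Toy data standing in for DATA.csv (values are invented for illustration)
traitements <- c("Aucun", "Augmentation", "Interception")
data <- data.frame(
  Date       = rep(c("01/05/2019", "02/05/2019"), each = 6),
  Traitement = rep(rep(traitements, each = 2), times = 2),
  Tension    = c(400, 420, 450, 470, 500, 520,
                 410, 430, 460, 480, 510, 530)
)
data$Eau <- (0.000616 * data$Tension - 0.1671) * 100

# Mean of Eau for every Date/Traitement combination:
# rows are dates, columns are treatments
tata_matrix <- tapply(data$Eau, list(data$Date, data$Traitement), mean)
tata_matrix

# Look up a single cell by row and column name
tata_matrix["01/05/2019", "Aucun"]
```

The result is a matrix rather than a data frame; as.data.frame(tata_matrix) converts it if a data frame is needed downstream.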

There is plenty of documentation at https://www.tidyverse.org/learn/

Answer 1
0 2019-05-17 09:08:14

Answer 2
0 accepted 2019-05-17 10:10:40
