Reading two-column data with read.table results in three columns

I am using read.table() to get data from a web page. The data table has two columns and comes from NIST. My code follows, including the URL in case you want to preview the data:

    options(digits = 12)
    theURL <- "https://www.itl.nist.gov/div898/strd/anova/AtmWtAg.dat"
    AgData <- read.table(theURL, header = TRUE , skip = 59)
    # Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
    # line 1 did not have 3 elements
    AgData$Instrument = as.factor(AgData$Instrument)
    fitAgData = aov(AgWt ~ Instrument, data=AgData)

I have inserted the error message as a comment in the code at the point where it occurs.

Other answers on Stack Exchange seem to deal with missing values causing this error. The data on this site is complete, so I'm not sure what is causing the error.

So far, I have fiddled with the skip = value, supplied the column names as a read.table argument, and added fill = TRUE to read.table. The last of these produced a data table with three columns, one of them containing only NA values. Since the table includes column names, I have set header = TRUE.

Somehow read.table() thinks there are three columns, and I don't see a way to tell it there are two.

After the skipped rows, the start of the file looked like this to me:

    Data:
          Instrument           AgWt
              1            107.8681568
              1            107.8681465
              1            107.8681572
              1            107.8681785

In fact, the data looks like this:

    Data:  Instrument           AgWt
               1            107.8681568
               1            107.8681465
               1            107.8681572
               1            107.8681785

read.table() treats "Data:" as the name of the first column. It scans the header line, finds three fields, and therefore expects three columns; when it reaches the data rows it sees only two, hence the error. Try skipping that line as well and then naming the columns manually.
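A minimal sketch of that suggestion, using only the four data rows quoted above so that it runs without network access. The names `sample_lines`, `n_fields`, and `AgSample` are made up for illustration. The commented-out call at the end assumes the header sits on line 60 of the NIST file, which follows from `skip = 59` landing on it in the question.

```r
# The header line and four data rows quoted in the question, as they appear
# once the first 59 lines of the file have been skipped.
sample_lines <- c(
  "Data:  Instrument           AgWt",
  "           1            107.8681568",
  "           1            107.8681465",
  "           1            107.8681572",
  "           1            107.8681785"
)

# Diagnose: count the whitespace-separated fields on each line.
# The header has one more field than the data rows.
con <- textConnection(sample_lines)
n_fields <- count.fields(con)
close(con)
print(n_fields)

# Fix: skip the header line too and supply the two column names by hand.
AgSample <- read.table(
  text = sample_lines,
  skip = 1,
  header = FALSE,
  col.names = c("Instrument", "AgWt")
)
str(AgSample)

# Assumed equivalent for the real file (one more skipped line than before):
# AgData <- read.table(theURL, skip = 60, header = FALSE,
#                      col.names = c("Instrument", "AgWt"))
```

With the columns named this way, the remaining lines of the original code (`as.factor()` and `aov()`) should work unchanged on the full data set.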
