
Set column types for csv with read.csv.ffdf

I am using a payments dataset from Austin Texas Open Data. I am trying to load the data with the following code:

library(ff)

asd <- read.table.ffdf(file = "~/Downloads/Fiscal_Year_2010_eCheckbook_Payments.csv", first.rows = 100, next.rows = 50, FUN = "read.csv", VERBOSE = TRUE)

This produces the following error:

read.table.ffdf 301..Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : scan() expected 'an integer', got '7AHM'

This happens on the 339th line of the CSV file, in the 5th column of the dataset. I think the reason is that every earlier value in the 5th column is an integer, whereas this one is a string. The actual type of the column should be string.
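The chunk label in the error (301..) fits this explanation: with 100 rows in the first chunk and 50 in each later one, line 339 falls in the chunk that starts at row 301, and the column types used for later chunks appear to be the ones inferred from the first chunk. The failure itself can be reproduced in base R without ff. The sketch below uses made-up two-column data (not the Austin file) and assumes only that the column is being read as integer:

```r
# Hypothetical sample: the "code" column looks numeric until the last row.
csv_text <- "id,code\n1,100\n2,200\n3,7AHM\n"

# Forcing the column to integer makes scan() fail on the text value.
res <- tryCatch(
  read.csv(text = csv_text, colClasses = c("integer", "integer")),
  error = function(e) conditionMessage(e)
)
print(res)

# Declaring the column as character lets every row through.
ok <- read.csv(text = csv_text, colClasses = c("integer", "character"))
str(ok)
```

Here `res` holds the error message from scan(), while `ok` is a three-row data frame whose `code` column is character.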

So I would like to know whether there is a way to set the column types when reading the file.

Below are the types for all the columns, as a vector:

c("character","integer","integer","character","character", "character","character","character","character","character","integer","character","character","character","character","character","character","character","integer","character","character","character","character","character","integer","integer","integer","character","character","character","character","double","character","integer")

You can also find the type of each column in the description of the dataset.

Please also keep in mind that I am very new to this library. I practically just found out about it today.

You may need to convert the column types yourself. The following is only an example that may help:

data <- transform(
  data,
  age=as.integer(age),
  sex=as.factor(sex),
  cp=as.factor(cp),
  trestbps=as.integer(trestbps),
  choi=as.integer(choi),
  fbs=as.factor(fbs),
  restecg=as.factor(restecg),
  thalach=as.integer(thalach),
  exang=as.factor(exang),
  oldpeak=as.numeric(oldpeak),
  slope=as.factor(slope),
  ca=as.factor(ca),
  thai=as.factor(thai),
  num=as.factor(num)
)
sapply(data, class)
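Because the error in the question occurs while the file is still being read, converting columns afterwards may not be enough on its own. One possible alternative is to declare the types up front through colClasses, which read.csv.ffdf forwards to read.csv. The sketch below is not tested against the Austin file: it writes a small hypothetical CSV to a temporary path, and it assumes that ff has no character storage mode, so text columns are declared as "factor" and "double" is written as "numeric".

```r
library(ff)

# Hypothetical sample file standing in for the real CSV: the "code" column
# looks like an integer until a later row.
path <- tempfile(fileext = ".csv")
writeLines(
  c("id,code,amount", "1,100,1.5", "2,200,2.5", "3,7AHM,3.5", "4,300,4.5"),
  path
)

# Types as they would be written for an ordinary data frame.
wanted <- c("integer", "character", "double")

# Assumption: text columns must be factors in an ffdf, and colClasses
# expects "numeric" rather than "double".
ff_types <- wanted
ff_types[wanted == "character"] <- "factor"
ff_types[wanted == "double"] <- "numeric"

# Read in small chunks so the text value arrives after the first chunk.
dat <- read.csv.ffdf(
  file = path,
  first.rows = 2,
  next.rows = 2,
  colClasses = ff_types
)

# Pull the ffdf into memory and inspect the resulting column classes.
sapply(dat[, ], class)
```

For the file in the question, the same substitution would be applied to the 34-element type vector and the result passed as colClasses alongside the original file, first.rows and next.rows arguments.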
