
R uploads numeric value into boolean column - why?

I want to read an Excel file into R as a data frame. It is a large file with a lot of numbers and some #NV values.

The import works well for the majority of columns (in total, there are 4,000 columns). But for some columns, R changes the values to "TRUE" or "FALSE", creating a boolean column.

I don't want that, since all of the columns are supposed to be numeric.

Do you know why R does that?

It would really help if you provided code snippets, because there are many different Excel-to-data-frame libraries, methods, and behaviors.

But assuming that you are using readxl, the read_excel function has a guess_max parameter for this kind of case. guess_max is 1000 by default.

Try df <- read_excel(path = filepath, sheet = sheet_name, guess_max = 100000)

Since a data frame cannot hold different data types in the same column, read_excel has to scan your Excel file and guess what data type each column should be before actually filling the data frame. If a column happens to have only NA values in the first 1000 rows, read_excel will assume it is a column of booleans, and all values encountered in later rows will be cast accordingly. So if you set guess_max to something huge, you make read_excel slower, but it might avoid the casting of numerics to booleans.
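To see the effect for yourself, here is a minimal sketch. It assumes the readxl and writexl packages are installed. The data frame, the column names id and x, and the temporary file are made up for illustration. It writes a sheet where x is empty for the first 1,200 rows, then reads it back three ways. The last call uses col_types = "numeric", which is not mentioned above: it skips guessing and declares every column numeric, which may suit your case since all columns are supposed to be numeric.

```r
# Assumes the readxl and writexl packages are installed.
library(readxl)
library(writexl)

# Hypothetical data: x is empty for the first 1,200 rows and numeric afterwards.
demo <- data.frame(
  id = seq_len(1500),
  x  = c(rep(NA_real_, 1200), seq(0.5, by = 0.5, length.out = 300))
)
path <- tempfile(fileext = ".xlsx")
write_xlsx(demo, path)

# Default guess_max (1000): only empty cells of x are inspected while guessing.
# Warnings are suppressed in case the coercion of later values is reported.
default_read <- suppressWarnings(read_excel(path))

# Larger guess_max: the guess now covers rows that actually contain numbers.
wide_read <- read_excel(path, guess_max = 2000)

# Alternative: skip guessing and declare every column numeric.
# A single type string is recycled across all columns.
forced_read <- read_excel(path, col_types = "numeric")

# Compare the resulting type of x for each of the three reads.
sapply(
  list(default = default_read, wide = wide_read, forced = forced_read),
  function(d) class(d$x)
)
```

With the default settings x should come back as logical, while the other two reads should give a numeric column. If your real file behaves the same way, either raising guess_max past the first row that holds a number or setting col_types should fix it.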
