How to convert all column data types to numeric and character dynamically?
I convert my column data types manually:
data[,'particles'] <- as.numeric(as.character(data[,'particles']))
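This is not ideal, because the data may evolve and I can't be sure which species will show up. For example, they could be "nox", "no2", "co", "so2", "pm10", and so on.
Is there any way to convert them automatically?
My current dataset: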
structure(list(particles = structure(c(1L, 3L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 6L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 4L, 4L,
4L, 3L, 3L, 3L, 3L, 5L, 6L, 5L, 3L), .Label = c("1", "11", "1.1",
"2", "2.1", "3.1"), class = "factor"), humidity = structure(c(4L,
7L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 6L, 1L, 1L, 1L,
5L, NA, NA, NA, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0.1",
"1", "1.1", "1.3", "21", "2.1", "3"), class = "factor"), timestamp = c(1468833354929,
1468833365186, 1468833378458, 1468833538213, 1468833538416, 1468833538613,
1468833538810, 1468833538986, 1468833539172, 1468833539358, 1468833539539,
1468833554592, 1468833559059, 1468833562357, 1468833566225, 1468833573486,
1468840019118, 1468840024950, 1469029568849, 1469029584243, 1469029590530,
1469029622391, 1469029623598, 1469245154003, 1469245156533, 1469245156815,
1469245157123, 1469245162358, 1469245165911, 1469245170178, 1469245173788
), date = structure(c(1468833354.929, 1468833365.186, 1468833378.458,
1468833538.213, 1468833538.416, 1468833538.613, 1468833538.81,
1468833538.986, 1468833539.172, 1468833539.358, 1468833539.539,
1468833554.592, 1468833559.059, 1468833562.357, 1468833566.225,
1468833573.486, 1468840019.118, 1468840024.95, 1469029568.849,
1469029584.243, 1469029590.53, 1469029622.391, 1469029623.598,
1469245154.003, 1469245156.533, 1469245156.815, 1469245157.123,
1469245162.358, 1469245165.911, 1469245170.178, 1469245173.788
), class = c("POSIXct", "POSIXt"), tzone = "Asia/Singapore")), .Names = c("particles",
"humidity", "timestamp", "date"), row.names = c(NA, -31L), class = "data.frame")
It has the columns particles, humidity, timestamp, and date.
Another option is mutate_if() from dplyr, which lets you operate on the columns for which a predicate returns TRUE:
library(dplyr)
df %>%
  mutate_if(is.factor, ~ as.numeric(as.character(.)))  # formula lambda; funs() is deprecated in recent dplyr
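Note: this approach also works for your follow-up question.
If you don't know in advance which columns need to be converted, you can extract that information from the data frame as follows: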
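In dplyr 1.0.0 and later, the same idea is usually written with across() and where(). The following is a minimal sketch, assuming a recent dplyr is installed; the small data frame `df` here is a hypothetical stand-in for the question's data, not the original dataset.

```r
library(dplyr)

# Hypothetical stand-in for the question's data (values are made up)
df <- data.frame(
  particles = factor(c("1", "1.1", "2.1")),
  humidity  = factor(c("1.3", "3", NA)),
  timestamp = c(1468833354929, 1468833365186, 1468833378458)
)

# Convert every factor column through its labels; other columns are untouched
converted <- df %>%
  mutate(across(where(is.factor), ~ as.numeric(as.character(.x))))

str(converted)
```

Going through as.character() first matters: calling as.numeric() directly on a factor returns its integer codes, not the values shown in its labels.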
vec <- sapply(dat, is.factor)
This gives:
> vec
particles  humidity timestamp      date
     TRUE      TRUE     FALSE     FALSE
You can then use this vector to subset the data frame and run lapply over that subset:
# notation option one:
dat[, vec] <- lapply(dat[, vec], function(x) as.numeric(as.character(x)))
# notation option two:
dat[vec] <- lapply(dat[vec], function(x) as.numeric(as.character(x)))
If you want to detect both factor and character columns, you can use:
sapply(dat, function(x) is.factor(x)|is.character(x))
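Putting the detection and the conversion together, a minimal base R sketch could look like this. The data frame `dat` below is a hypothetical stand-in with one factor, one character, and one numeric column.

```r
# Hypothetical stand-in data (values are made up)
dat <- data.frame(
  particles = factor(c("1", "1.1", "2.1")),
  humidity  = c("1.3", "3", NA),
  timestamp = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# TRUE for every column that is a factor or a character vector
vec <- sapply(dat, function(x) is.factor(x) || is.character(x))

# Convert through character so factors yield their labels, not their codes
dat[vec] <- lapply(dat[vec], function(x) as.numeric(as.character(x)))

sapply(dat, class)
```

Any entry that cannot be parsed as a number becomes NA with a warning, so this is only safe when the detected columns really hold numeric text.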
We can use data.table:
library(data.table)
setDT(df)[, lapply(.SD, function(x) if(is.factor(x)) as.numeric(as.character(x)) else x)]
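The expression above returns a converted table that still has to be assigned to something. If the goal is to update the table in place, one possible variant uses `:=` with `.SDcols`. This is a sketch under the assumption that data.table is installed; `dt` and `fcols` are illustrative names, and the data is made up.

```r
library(data.table)

# Hypothetical stand-in data (values are made up)
dt <- data.table(
  particles = factor(c("1", "1.1", "2.1")),
  humidity  = factor(c("1.3", "3", NA)),
  timestamp = c(1, 2, 3)
)

# Names of the factor columns, detected at run time
fcols <- names(which(sapply(dt, is.factor)))

# Overwrite only those columns, by reference
dt[, (fcols) := lapply(.SD, function(x) as.numeric(as.character(x))),
   .SDcols = fcols]

str(dt)
```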
I think the best option is apply.
You can do:
newD<-apply(data[,"names"], 2,function(x) as.numeric(as.character(x)))
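Here "names" is a placeholder for all the variables you want to convert. The 2 passed as the second argument tells apply to run function(x) over every column of the first argument (use 1 to go row by row instead). You can save the result as a new dataset or overwrite the old one with: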
data[,"names"]<-apply....
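One detail worth knowing with this approach: apply() converts its input to a matrix and returns a matrix, not a data frame. A small sketch, with a hypothetical `cols` vector standing in for the "names" placeholder and made-up data:

```r
# Hypothetical stand-in data (values are made up)
data <- data.frame(
  particles = factor(c("1", "1.1")),
  humidity  = factor(c("3", "0.1"))
)

# Stand-in for the "names" placeholder in the answer
cols <- c("particles", "humidity")

# apply() coerces the selected columns to a matrix before calling the function
newD <- apply(data[, cols], 2, function(x) as.numeric(as.character(x)))
is.matrix(newD)   # the result is a numeric matrix, not a data frame

# Assigning the matrix back replaces the original columns
data[, cols] <- newD
sapply(data, class)
```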
Using lapply:
cols <- c("particles", "nox", ...)
data[,cols] <- lapply(data[,cols], function(x) as.numeric(as.character(x)))
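Since the question says the set of species may change, a hard-coded `cols` vector can name columns that are not present, which makes the subsetting fail. One way to guard against that, sketched here with hypothetical names (`wanted`) and made-up data, is to intersect the wanted names with the columns that actually exist:

```r
# Hypothetical stand-in data (values are made up)
data <- data.frame(
  particles = factor(c("1", "1.1", "2.1")),
  humidity  = factor(c("1.3", "3", "0.1"))
)

# Candidate species; only some of them exist in this stand-in data
wanted <- c("particles", "nox", "no2")

# Keep only the names that are real columns
cols <- intersect(wanted, names(data))

data[cols] <- lapply(data[cols], function(x) as.numeric(as.character(x)))

sapply(data, class)
```

Columns not listed in `wanted`, such as humidity here, are left as they were.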