Convert delimited string to numeric vector in dataframe
This is such a basic question, I'm embarrassed to ask.
Let's say I have a dataframe full of columns which contain data of the following form:
test <-"3000,9843,9291,2161,3458,2347,22925,55836,2890,2824,2848,2805,2808,2775,2760,2706,2727,2688,2727,2658,2654,2588"
I want to convert this to a numeric vector, which I have done like so:
test <- as.numeric(unlist(strsplit(test, split=",")))
I now want to do the same for a large dataframe that has a column full of such strings, converting each one into its numeric vector equivalent:
mutate(data,
converted = as.numeric(unlist(strsplit(badColumn, split=","))),
)
This doesn't work, presumably because it converts the entire column into a single numeric vector and then tries to fit that vector into the column:
Error in mutate_impl(.data, dots) : Column `converted` must be length 20 (the number of rows) or one, not 1274
How do I do this?
This might help:
library(dplyr)  # provides mutate()
library(purrr)  # provides map()
mutate(data, converted = map(badColumn, function(txt) as.numeric(unlist(strsplit(txt, split = ",")))))
What you get is a list column which contains the numeric vectors.
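To illustrate what a list column looks like in practice, here is a minimal base R sketch. The two-row data frame and the names `id`, `first_row` and `n_values` are made up for illustration, and `lapply` stands in for `map` so that no packages are needed.

```r
# Hypothetical two-row example, not taken from the question
data <- data.frame(id = 1:2,
                   badColumn = c("10,20,30", "1,2"),
                   stringsAsFactors = FALSE)

# Build the list column: one numeric vector per row
data$converted <- lapply(strsplit(data$badColumn, ",", fixed = TRUE), as.numeric)

# [[ ]] extracts the numeric vector stored in a given row
first_row <- data$converted[[1]]

# lengths() reports how many values each row holds
n_values <- lengths(data$converted)
```

Each cell of `converted` can hold a vector of a different length, which is why a plain numeric column cannot be used here.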
Base R
A=c(as.numeric(strsplit(test,',')[[1]]))
A
[1] 3000 9843 9291 2161 3458 2347 22925 55836 2890 2824 2848 2805 2808 2775 2760 2706 2727 2688 2727 2658 2654 2588
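# Base R: split each row's string separately
df$NEw2 = lapply(df$NEw, function(x) as.numeric(strsplit(x, ',')[[1]]))
# dplyr equivalent: strsplit is vectorised over the column, so each row gets its own vector
df %>% mutate(NEw2 = lapply(strsplit(NEw, ','), as.numeric))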
Here's some sample data that reproduces your error:
data <- data.frame(a = 1:3,
badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"),
stringsAsFactors = FALSE)
Here's the error:
library(tidyverse)
mutate(data, converted = as.numeric(unlist(strsplit(badColumn, split=","))))
# Error in mutate_impl(.data, dots) :
# Column `converted` must be length 3 (the number of rows) or one, not 18
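The length mismatch behind that message can be checked directly. The sketch below uses only base R on the same sample data; `pieces` and `flat` are names introduced here for illustration.

```r
data <- data.frame(a = 1:3,
                   badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"),
                   stringsAsFactors = FALSE)

# strsplit returns a list with one character vector per row
pieces <- strsplit(data$badColumn, split = ",")

# unlist flattens that list, discarding the row boundaries
flat <- as.numeric(unlist(pieces))

length(pieces)  # 3, one element per row
length(flat)    # 18, i.e. 5 + 6 + 7 values in total
nrow(data)      # 3
```

The list from `strsplit` already has the right length for a column; it is the `unlist` step that produces 18 values where 3 are expected.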
A straightforward way would be to just use strsplit on the entire column, and lapply(..., as.numeric) to convert the resulting list values from character vectors to numeric vectors.
x <- mutate(data, converted = lapply(strsplit(badColumn, ",", TRUE), as.numeric))
str(x)
# 'data.frame': 3 obs. of 3 variables:
# $ a : int 1 2 3
# $ badColumn: chr "10,20,30,40,50" "1,2,3,4,5,6" "9,8,7,6,5,4,3"
# $ converted:List of 3
# ..$ : num 10 20 30 40 50
# ..$ : num 1 2 3 4 5 6
# ..$ : num 9 8 7 6 5 4 3
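Once the list column exists, it can be summarised per row or flattened into a long table. The following is a rough base R sketch under the assumption that one of those is the eventual goal; `row_sums` and `long` are illustrative names, and the column is assigned with `$<-` rather than `mutate` so that no packages are required.

```r
data <- data.frame(a = 1:3,
                   badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"),
                   stringsAsFactors = FALSE)
data$converted <- lapply(strsplit(data$badColumn, ",", fixed = TRUE), as.numeric)

# One summary value per row; vapply checks that each result is a single number
row_sums <- vapply(data$converted, sum, numeric(1))

# Long format: repeat each row's key once per value, then flatten the values
long <- data.frame(a = rep(data$a, lengths(data$converted)),
                   value = unlist(data$converted))
```

Here `row_sums` has one entry per original row, while `long` has one row per individual number, with `a` identifying the row it came from.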