Convert delimited string to numeric vector in dataframe

This is such a basic question, I'm embarrassed to ask.

Let's say I have a dataframe whose columns contain data of the following form:

test <- "3000,9843,9291,2161,3458,2347,22925,55836,2890,2824,2848,2805,2808,2775,2760,2706,2727,2688,2727,2658,2654,2588"

I want to convert this to a numeric vector, which I have done like so:

test <- as.numeric(unlist(strsplit(test, split=",")))

I now want to do the same for a large dataframe: a column full of such strings should become a column holding the equivalent numeric vector for each row:

mutate(data,
  converted = as.numeric(unlist(strsplit(badColumn, split=","))),
)

This doesn't work, presumably because it converts the entire column into one long numeric vector and then tries to fit that vector into the new column:

Error in mutate_impl(.data, dots) : Column `converted` must be length 20 (the number of rows) or one, not 1274

How do I do this?

This might help:

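library(dplyr)  # for mutate()
library(purrr)  # for map()

# map() applies the conversion to each string separately and returns a list
mutate(data, converted = map(badColumn, function(txt) as.numeric(unlist(strsplit(txt, split = ",")))))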

What you get is a list column in which each element is the numeric vector for that row.
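To make the idea of a list column more concrete, here is a minimal base R sketch of how such a column can be built and then read back. The data frame below is illustrative sample data with the same shape as in the question, and the names `second_row`, `row_sums` and `n_values` are made up for this example.

```r
# Illustrative sample data; values are assumptions, not the asker's real data
data <- data.frame(a = 1:3,
                   badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"),
                   stringsAsFactors = FALSE)

# Base R counterpart of the map() call: one numeric vector per row
data$converted <- lapply(strsplit(data$badColumn, ",", fixed = TRUE), as.numeric)

# [[ ]] extracts the numeric vector stored in a single row
second_row <- data$converted[[2]]

# sapply() reduces each row's vector to a single value
row_sums <- sapply(data$converted, sum)

# lengths() reports how many values each row holds
n_values <- lengths(data$converted)
```

Because every element is an ordinary numeric vector, the usual vector functions can be applied row by row with `lapply()` or `sapply()`.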

Base R

A <- as.numeric(strsplit(test, ",")[[1]])

A
[1]  3000  9843  9291  2161  3458  2347 22925 55836  2890  2824  2848  2805  2808  2775  2760  2706  2727  2688  2727  2658  2654  2588


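# Apply the same conversion to every element of a character column
df$NEw2 <- lapply(df$NEw, function(x) as.numeric(strsplit(x, ",")[[1]]))

# dplyr version: rowwise() is needed so that [[1]] refers to the current row;
# without it, the first row's vector would be recycled into every row
df %>% rowwise() %>% mutate(NEw2 = list(as.numeric(strsplit(NEw, ",")[[1]])))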

Here's some sample data that reproduces your error:

data <- data.frame(a = 1:3, 
                   badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"), 
                   stringsAsFactors = FALSE)

Here's the error:

library(tidyverse)
mutate(data, converted = as.numeric(unlist(strsplit(badColumn, split=","))))
# Error in mutate_impl(.data, dots) : 
#   Column `converted` must be length 3 (the number of rows) or one, not 18
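The 18 in the message is simply the total number of values across all three strings (5 + 6 + 7). A small base R sketch, using the same strings as the sample data, shows where the row structure gets lost; the variable name `pieces` is made up for this illustration.

```r
badColumn <- c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3")

# strsplit() keeps one list element per input string
pieces <- strsplit(badColumn, ",", fixed = TRUE)

n_rows   <- length(pieces)          # one element per row
per_row  <- lengths(pieces)         # number of values in each row
n_values <- length(unlist(pieces))  # unlist() flattens everything into one vector
```

The list returned by `strsplit()` already has the right length for a column; it is the `unlist()` call that turns it into a single vector of 18 values, which `mutate()` cannot fit into 3 rows.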

A straightforward way is to call strsplit on the entire column, which returns a list with one character vector per row, and then use lapply with as.numeric to convert each of those character vectors to a numeric vector.

x <- mutate(data, converted = lapply(strsplit(badColumn, ",", fixed = TRUE), as.numeric))
str(x)
# 'data.frame': 3 obs. of  3 variables:
#  $ a        : int  1 2 3
#  $ badColumn: chr  "10,20,30,40,50" "1,2,3,4,5,6" "9,8,7,6,5,4,3"
#  $ converted:List of 3
#   ..$ : num  10 20 30 40 50
#   ..$ : num  1 2 3 4 5 6
#   ..$ : num  9 8 7 6 5 4 3
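If a list column turns out to be awkward for later steps, one possible alternative is a long layout with one value per row. The following is only a sketch in base R under that assumption; it is not part of the answer above, and the names `converted`, `long` and `value` are made up for the example.

```r
data <- data.frame(a = 1:3,
                   badColumn = c("10,20,30,40,50", "1,2,3,4,5,6", "9,8,7,6,5,4,3"),
                   stringsAsFactors = FALSE)

# One numeric vector per row, as in the answer above
converted <- lapply(strsplit(data$badColumn, ",", fixed = TRUE), as.numeric)

# Repeat each row's key once per value, then flatten the values alongside it
long <- data.frame(a = rep(data$a, times = lengths(converted)),
                   value = unlist(converted))
```

Here `rep()` with `times = lengths(converted)` keeps each value paired with the row it came from, so flattening with `unlist()` no longer loses that information.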
