How to split a column in a data frame into two columns by numbers and letters

EDIT for clarity: I have a data frame called "dat" with 3 columns: "Trial.Type", "Affect", and "Reaction.Time".

The first three rows:

Trial.Type  Affect  Reaction.Time
Aa, 1       0       1231
Aa, 2       1       1241
Ha, 1       1       1112

I am wondering if there is a way to split the column "Trial.Type" so that "Aa" and "1" end up in two columns, Trial.Type and Intensity respectively, resulting in a data frame with 4 columns.

Many thanks for any help. I had trouble finding an answer to this question; my apologies if it is a duplicate!

You can just use read.csv on your "column".

> dat <- c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")
> read.csv(text = dat, header=FALSE, col.names = c("Trial.Type", "Intensity"))
  Trial.Type Intensity
1         Aa         1
2         Aa         2
3         Ha         1
4        Hpa         8

Replace "dat" with your column (for example, mydf$Trial.Type, or possibly as.character(mydf$Trial.Type) if the column is a factor).
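To end up with the 4-column data frame asked for in the question, the parsed columns still need to be joined back to the other columns. Here is a minimal sketch of that step, assuming a data frame built from the three rows shown in the question; the names parts and result are only illustrative, and strip.white = TRUE is added here to drop the space after the comma.

```r
# Hypothetical data frame mirroring the three rows shown in the question
dat <- data.frame(Trial.Type = c("Aa, 1", "Aa, 2", "Ha, 1"),
                  Affect = c(0, 1, 1),
                  Reaction.Time = c(1231, 1241, 1112),
                  stringsAsFactors = FALSE)

# Parse the combined column as if it were the text of a small CSV file
parts <- read.csv(text = as.character(dat$Trial.Type), header = FALSE,
                  col.names = c("Trial.Type", "Intensity"),
                  strip.white = TRUE, stringsAsFactors = FALSE)

# Put the two new columns in front of the remaining original columns
result <- cbind(parts, dat[setdiff(names(dat), "Trial.Type")])
result
```

The as.character() call only matters if Trial.Type is stored as a factor; it is harmless otherwise.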

You can also check out my "splitstackshape" package, particularly the concat.split group of functions.


For the benefit of the OP, here's a reproducible example and a solution using my "splitstackshape" package. Of course, this can be done with base R too, using the approach I described above (or using one of the strsplit approaches mentioned here).

First, some sample data with one column that is a factor and another that is character:

mydf <- data.frame(A = factor(c("1, Z", "2, Y", "3, X", "4, W")),
                   B = c("11, ZZ", "22, YY", "33, XX", "44, WW"),
                   C = c(123, 234, 345, 456), stringsAsFactors = FALSE)
mydf
#      A      B   C
# 1 1, Z 11, ZZ 123
# 2 2, Y 22, YY 234
# 3 3, X 33, XX 345
# 4 4, W 44, WW 456
str(mydf)
# 'data.frame':  4 obs. of  3 variables:
#  $ A: Factor w/ 4 levels "1, Z","2, Y",..: 1 2 3 4
#  $ B: chr  "11, ZZ" "22, YY" "33, XX" "44, WW"
#  $ C: num  123 234 345 456

Second, load the package and explore the options:

library(splitstackshape)
## Split a factor column
concat.split(mydf, split.col = "A", sep = ",")
#      A      B   C A_1 A_2
# 1 1, Z 11, ZZ 123   1   Z
# 2 2, Y 22, YY 234   2   Y
# 3 3, X 33, XX 345   3   X
# 4 4, W 44, WW 456   4   W

## Split a character column
concat.split(mydf, split.col = "B", sep = ",")
#      A      B   C B_1 B_2
# 1 1, Z 11, ZZ 123  11  ZZ
# 2 2, Y 22, YY 234  22  YY
# 3 3, X 33, XX 345  33  XX
# 4 4, W 44, WW 456  44  WW

## Split two columns in one go
concat.split.multiple(mydf, split.cols = c("A", "B"), seps = ",")
#     C A_1 A_2 B_1 B_2
# 1 123   1   Z  11  ZZ
# 2 234   2   Y  22  YY
# 3 345   3   X  33  XX
# 4 456   4   W  44  WW

You can do this with strsplit:

dat = c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")
spl = strsplit(dat, ", ")
data.frame(Trial.Type = unlist(lapply(spl, "[", 1)),
           Intensity = as.numeric(unlist(lapply(spl, "[", 2))))
#   Trial.Type Intensity
# 1         Aa         1
# 2         Aa         2
# 3         Ha         1
# 4        Hpa         8
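Since the question is really about separating the letters from the numbers, a pattern-based variant may also be worth a look. This is only a sketch, and it assumes every value contains just ASCII letters, digits, and separator characters; the name split_dat is illustrative.

```r
dat <- c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")

# Drop everything that is not a letter to get the trial type,
# and everything that is not a digit to get the intensity
split_dat <- data.frame(Trial.Type = gsub("[^A-Za-z]", "", dat),
                        Intensity = as.numeric(gsub("[^0-9]", "", dat)),
                        stringsAsFactors = FALSE)
split_dat
```

Unlike strsplit, this does not depend on the exact separator, but it would merge pieces if a value had letters or digits in more than one place.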

You could use these instructions:

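datNames <- names(dat)
# Split each value on ", " and reshape the pieces into a two-column matrix
# (as.character() guards against Trial.Type being a factor)
pieces <- strsplit(as.character(dat$Trial.Type), ", ")
dat <- cbind(dat, t(matrix(unlist(pieces), ncol = dim(dat)[1])))
names(dat) <- c(datNames, "Trial.type2", "Intensity")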
