How to split a column in a data frame into two columns by numbers and letters
EDIT for clarity: I have a data frame called "dat" with 3 columns: "Trial.Type", "Affect", and "Reaction.Time".
Trial.Type  Affect  Reaction.Time
Aa, 1       0       1231
Aa, 2       1       1241
Ha, 1       1       1112
I am wondering if there is a way to split the column "Trial.Type" so that "Aa" and "1" end up in two columns, Trial.Type and Intensity respectively, resulting in a data frame with 4 columns.
Many thanks for any help. I had trouble finding an answer to this question; my apologies if it is a repeat!
You can just use read.csv on your "column".
> dat <- c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")
> read.csv(text = dat, header=FALSE, col.names = c("Trial.Type", "Intensity"))
  Trial.Type Intensity
1         Aa         1
2         Aa         2
3         Ha         1
4        Hpa         8
Replace "dat" with your column name (for example, mydf$Trial.Type, or possibly as.character(mydf$Trial.Type) if the column is a factor).
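To show how this applies to a whole data frame rather than a bare vector, here is a minimal base R sketch. It assumes the data frame looks like the one in the question; the sample values and the names `parts` and `result` are illustrative and not from the original post.

```r
# Sample data modelled on the question (values are illustrative).
dat <- data.frame(
  Trial.Type    = c("Aa, 1", "Aa, 2", "Ha, 1"),
  Affect        = c(0, 1, 1),
  Reaction.Time = c(1231, 1241, 1112),
  stringsAsFactors = FALSE
)

# Parse the combined column as if it were a small CSV file.
# as.character() guards against the column being a factor;
# strip.white = TRUE drops the space that follows each comma.
parts <- read.csv(text = as.character(dat$Trial.Type), header = FALSE,
                  col.names = c("Trial.Type", "Intensity"),
                  strip.white = TRUE, stringsAsFactors = FALSE)

# Drop the original combined column and attach the two new ones.
result <- cbind(parts, dat[setdiff(names(dat), "Trial.Type")])
result
```

The result has the four columns asked for: Trial.Type, Intensity, Affect, and Reaction.Time.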
You can also check out my "splitstackshape" package, particularly the concat.split group of functions.
For the benefit of the OP, here's a reproducible example and a solution using my "splitstackshape" package. Of course, this can be done with base R too, using the approach I described above (or using one of the strsplit approaches mentioned here).
First, some sample data with one column that is character and another that is factor:
mydf <- data.frame(A = factor(c("1, Z", "2, Y", "3, X", "4, W")),
B = c("11, ZZ", "22, YY", "33, XX", "44, WW"),
C = c(123, 234, 345, 456), stringsAsFactors = FALSE)
mydf
#      A      B   C
# 1 1, Z 11, ZZ 123
# 2 2, Y 22, YY 234
# 3 3, X 33, XX 345
# 4 4, W 44, WW 456
str(mydf)
# 'data.frame': 4 obs. of 3 variables:
# $ A: Factor w/ 4 levels "1, Z","2, Y",..: 1 2 3 4
# $ B: chr "11, ZZ" "22, YY" "33, XX" "44, WW"
# $ C: num 123 234 345 456
Second, load the package and explore the options:
library(splitstackshape)
## Split a factor column
concat.split(mydf, split.col = "A", sep = ",")
#      A      B   C A_1 A_2
# 1 1, Z 11, ZZ 123   1   Z
# 2 2, Y 22, YY 234   2   Y
# 3 3, X 33, XX 345   3   X
# 4 4, W 44, WW 456   4   W
## Split a character column
concat.split(mydf, split.col = "B", sep = ",")
#      A      B   C B_1 B_2
# 1 1, Z 11, ZZ 123  11  ZZ
# 2 2, Y 22, YY 234  22  YY
# 3 3, X 33, XX 345  33  XX
# 4 4, W 44, WW 456  44  WW
## Split two columns in one go
concat.split.multiple(mydf, split.cols = c("A", "B"), seps = ",")
#     C A_1 A_2 B_1 B_2
# 1 123   1   Z  11  ZZ
# 2 234   2   Y  22  YY
# 3 345   3   X  33  XX
# 4 456   4   W  44  WW
You can do this with strsplit:
dat = c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")
spl = strsplit(dat, ", ")
data.frame(Trial.Type = unlist(lapply(spl, "[", 1)),
Intensity = as.numeric(unlist(lapply(spl, "[", 2))))
#   Trial.Type Intensity
# 1         Aa         1
# 2         Aa         2
# 3         Ha         1
# 4        Hpa         8
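Since the question title asks about splitting "by numbers and letters", a delimiter-free variant may also be useful. The following is only a sketch, assuming every value consists of a run of letters, an optional separator, and a run of digits; the names `x`, `letters_part`, `digits_part`, and `out` are made up for illustration.

```r
x <- c("Aa, 1", "Aa, 2", "Ha, 1", "Hpa, 8")

# Keep only the alphabetic characters for the trial type ...
letters_part <- gsub("[^[:alpha:]]", "", x)

# ... and only the digits for the intensity, converted to numeric.
digits_part <- as.numeric(gsub("[^[:digit:]]", "", x))

out <- data.frame(Trial.Type = letters_part,
                  Intensity  = digits_part,
                  stringsAsFactors = FALSE)
out
```

This does not depend on the exact separator (", " versus "," or a space), but it would merge the pieces incorrectly if letters and digits were interleaved, so strsplit remains the safer choice when the delimiter is consistent.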
You could use these instructions:
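# Remember the original column names, then bind the split pieces on as two new columns.
datNames <- names(dat)
# as.character() guards against a factor column; cbind(dat, ...) keeps the existing columns.
dat <- cbind(dat, t(matrix(unlist(strsplit(as.character(dat$Trial.Type), ", ")), ncol = dim(dat)[1])))
names(dat) <- c(datNames, "Trial.Type2", "Intensity")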