Coercing String to Vector

I'm trying to create a calculator that multiplies permutations written in cycle notation (the process is described in this post, for anyone unfamiliar: https://math.stackexchange.com/questions/31763/multiplication-in-permutation-groups-written-in-cyclic-notation ). Although I know this would be easier to do with Python or something else, I wanted to practice writing code in R since it is relatively new to me.

My game plan is to take an input such as "(1 2 3)(2 4 1)" and split it into two separate lists or vectors. However, I am having trouble getting started. From my understanding of character functions (which I researched here: https://www.statmethods.net/management/functions.html ), I will ultimately have to use grep() to find the points where ")(" occurs in my string and split from there. However, grep only takes vectors for its argument, so I am trying to coerce my string into a vector. In researching this problem, I have mostly seen people suggest as.integer(unlist(str_split())). This doesn't work for me because, when I split, not every piece is an integer and those values become NA, as seen in this example.

    library(tidyverse)
    x <- "(1 2 3)(2 4 1)"
    x <- as.integer(unlist(str_split(x, " ")))  # pieces such as "(1" and "3)(2" are not integers, so they become NA
    x

Is there an alternative way to turn a string into a vector when there are not just integers involved? I also realize that the way I am trying to split up the two permutations is very roundabout, but from the character functions I researched this seems like the only way. If there are other functions that would make this easier, please let me know.

Thank you!

Comments in the code.

x <- "(1 2 3)(2 4 1)"

out1 <- strsplit(x, split = ")(", fixed = TRUE)[[1]] # split on close and open bracket
out2 <- gsub("[()]", replacement = "", out1) # remove brackets
out3 <- strsplit(out2, " ") # tease out numbers between spaces
lapply(out3, as.integer)

[[1]]
[1] 1 2 3

[[2]]
[1] 2 4 1
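
The three steps above can be wrapped into a small helper. This is only a minimal sketch: the name `parse_cycles` is hypothetical (it does not appear in the original answer), and it assumes well-formed input in which every cycle is enclosed in parentheses and entries are separated by spaces.

```r
# Hypothetical helper (an illustration, not part of the original answer):
# parse a string such as "(1 2 3)(2 4 1)" into a list of integer vectors.
parse_cycles <- function(x) {
  cycles <- strsplit(x, split = ")(", fixed = TRUE)[[1]]  # split between cycles
  cycles <- gsub("[()]", "", cycles)                      # drop the outer brackets
  lapply(strsplit(trimws(cycles), " +"), as.integer)      # split on spaces, convert
}

parse_cycles("(1 2 3)(2 4 1)")
```

Because each cycle is converted separately, cycles of different lengths, such as "(1 2)(3 4 5)", should be handled as well.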

There aren't really any scalars in R. Single values like 1, TRUE, and "a" are all 1-element vectors. grep(pattern, x) will work fine on your original string. As a starting point for getting towards your desired goal, I would suggest splitting the groups using:

> str_extract_all(x, "\\([0-9 ]+\\)")
[[1]]
[1] "(1 2 3)" "(2 4 1)"
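
To illustrate the point about 1-element vectors, here is a short base R check. One detail is an addition rather than part of the answer: since ")(" consists of regex metacharacters, the sketch passes `fixed = TRUE` so the pattern is matched literally. The variable names `found` and `pos` are illustrative.

```r
x <- "(1 2 3)(2 4 1)"

is.character(x)  # TRUE: a single string is already a character vector
length(x)        # 1: it has exactly one element

# fixed = TRUE matches ")(" literally rather than as a regular expression
found <- grepl(")(", x, fixed = TRUE)
pos <- as.integer(regexpr(")(", x, fixed = TRUE))  # index of the first ")("

found
pos
```

So no coercion is needed before calling grep-style functions; the string can be passed as it is.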

If we need to split the string while keeping the brackets:

strsplit(x, "(?<=\\))(?=\\()", perl = TRUE)[[1]]
#[1] "(1 2 3)" "(2 4 1)"

Or we can use the convenient wrapper from qdapRegex:

library(qdapRegex)
ex_round(x, include.marker = TRUE)[[1]]
#[1] "(1 2 3)" "(2 4 1)"

Alternative: using library(magrittr)

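library(magrittr)  # provides %>%
x <- "(1 2 3)(2 4 1)"

# Rewrite the string as R source, list(c(...),c(...)), then evaluate it
x %>%
  gsub("^\\(", "c(", .) %>% gsub("\\)\\(", "),c(", .) %>%
  gsub("(?=\\s\\d)", ", ", ., perl = TRUE) %>%
  paste0("list(", ., ")") %>% {eval(parse(text = .))}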

Result:

# [[1]]
# [1] 1 2 3
# 
# [[2]]
# [1] 2 4 1
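
Since eval(parse(text = ...)) runs whatever code the string happens to contain, a variant that never evaluates the input may be preferable when the string is typed in by a user. The following is a sketch in base R under that assumption; the variable names `groups` and `cycles` are illustrative.

```r
x <- "(1 2 3)(2 4 1)"

# One string per parenthesised group, e.g. "(1 2 3)"
groups <- regmatches(x, gregexpr("\\([^()]*\\)", x))[[1]]

# Pull the runs of digits out of each group and convert them to integers
cycles <- lapply(regmatches(groups, gregexpr("[0-9]+", groups)), as.integer)
cycles
```

Matching runs of digits rather than splitting on single spaces also tolerates multi-digit entries and irregular spacing.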

You could use chartr with read.table:

read.table(text= chartr("()"," \n",x))
#   V1 V2 V3
# 1  1  2  3
# 2  2  4  1
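
The call above returns a data frame and relies on every cycle having the same number of entries. If a list of integer vectors is wanted instead, something along these lines might work. It is a sketch only: `cycles_from_table` is a hypothetical name, and `fill = TRUE` is an added argument that pads shorter rows with NA, which the helper then drops. Note that read.table infers the column count from the first five lines, so a much longer cycle appearing later in a long input may not be read as intended.

```r
# Hypothetical helper (not from the original answer): read the cycles with
# read.table, then return them as a list of integer vectors.
cycles_from_table <- function(x) {
  # ")" becomes a newline and "(" a space, so each cycle is one row of text
  df <- read.table(text = chartr("()", " \n", x), fill = TRUE)
  lapply(seq_len(nrow(df)), function(i) {
    row <- unname(unlist(df[i, ]))   # one row as a plain vector
    as.integer(row[!is.na(row)])     # drop the NA padding added by fill = TRUE
  })
}

cycles_from_table("(1 2 3)(2 4 1)")
```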
