
R: How to match a forward-slash in a regular expression?

How do I match a forward slash / in a regular expression in R?

As demonstrated in the example below, I am trying to search for .csv files in a subdirectory, and my attempts to use a literal / are failing. I'm looking for a modification to my regex in base R, not a function that does this for me.

Example subdirectory

# Create subdirectory in current working directory with two .csv files
# - remember to delete these later or they'll stay in your current working directory!
dir.create(path = "example")
write.csv(data.frame(x1 = letters), file = "example/example1.csv")
write.csv(data.frame(x2 = 1:20), file = "example/example2.csv")

Get relative paths of all .csv files in the example subdirectory

# This works for the example, but could mistakenly return paths to other files based on:
# (a) file name: foo/example1.csv
# (b) subdirectory name: example_wrong/foo.csv
list.files(pattern = "example.*csv", recursive = TRUE)
#> [1] "example/example1.csv" "example/example2.csv"

# This fixes issue (a) but doesn't fix issue (b)
list.files(pattern = "^example.*?\\.csv$", recursive = TRUE)
#> [1] "example/example1.csv" "example/example2.csv"

# Adding / to the end of `example` guarantees we get the correct subdirectory

# Doesn't work: / is special regex and not escaped
list.files(pattern = "^example/.*?\\.csv$", recursive = TRUE)

# Doesn't work: escapes / but throws error
list.files(pattern = "^example\/.*?\\.csv$", recursive = TRUE)

# Doesn't work: even with the \\ escaping in R!
list.files(pattern = "^example\\/.*?\\.csv$", recursive = TRUE)

Some of the solutions above work with regex tools but not in R. I've checked SO for solutions (most related below) but none seem to apply:

Escaping a forward slash in a regular expression

Regex string does not start or end (or both) with forward slash

Reading multiple csv files from a folder with R using regex

The pattern argument is matched only against file (or directory) names, not the full paths they sit on (even when recursive and full.names are set to TRUE). That's why your last approach doesn't work even though it is the correct way to match / in a regular expression. You can get the correct file names by specifying path and setting full.names to TRUE.

list.files(path = "example", pattern = "\\.csv$", full.names = TRUE)
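If you do want a regex that includes the directory separator, one option is to list every file first and then filter the returned relative paths yourself. The following is only a minimal sketch of that idea, not part of the original answer: the temporary directory, the extra example_wrong folder, and the variable names are assumptions made for illustration.

```r
# Work in a throwaway directory so nothing is left in the current working directory.
tmp <- tempfile("slash_demo_")
dir.create(tmp)
old_wd <- setwd(tmp)

# Recreate the example layout, plus a similarly named directory (hypothetical).
dir.create("example")
dir.create("example_wrong")
write.csv(data.frame(x1 = letters), file = "example/example1.csv")
write.csv(data.frame(x2 = 1:20), file = "example/example2.csv")
write.csv(data.frame(x3 = 1:3), file = "example_wrong/foo.csv")

# list.files() applies `pattern` to file names only, so list everything first...
all_paths <- list.files(recursive = TRUE)

# ...then match the regex against the full relative paths.
# A plain "/" is used here without any escaping.
csv_paths <- grep("^example/.*\\.csv$", all_paths, value = TRUE)
print(csv_paths)

# Clean up: restore the working directory and delete the temporary files.
setwd(old_wd)
unlink(tmp, recursive = TRUE)
```

Because the regex is applied to the whole relative path, the example_wrong/foo.csv file is excluded by the literal / that follows example.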
