
Match and replace string using REGEX in SQLite?

I have a table called personal_websessions that contains data in the following format:

 id_no | website_link 
 1     | google.com 
 2     | stackoverflow.com 
 3     | msn.com 

You can create this table using the following SQL commands:

CREATE TABLE personal_websessions(id_no INTEGER PRIMARY KEY, website_link TEXT);
INSERT INTO personal_websessions VALUES(1, 'google.com'), (2, 'stackoverflow.com '), (3, 'msn.com ');

I would like to perform a find and replace using regex.

If the website_link column contains 'msn.com', 'msnnews.com', etc. (that is, any value with msn in it), I would like to find 'msn' and replace it with the string 'toast'; any value that does not contain msn should be left as it is. In the example above, google.com and stackoverflow.com stay the same.

I know that the regex will be of the form (msn), a group to match on, but I do not know how to write a regex match in SQLite.

Essentially, this is the desired output:

 id_no | website_link 
 1     | google.com 
 2     | stackoverflow.com 
 3     | toast

I am currently using SQLite, and I know that I will have to use the REPLACE function, as it can find a pattern and then provide a replacement. However, in this link they do not use any regex to match the words; they just define them literally.

I am really just trying to find out how to use a regex pattern to find and replace values in SQLite.

I am using an RSQLite connection, if that helps.


What you are describing sounds like filtering using like:

update personal_websessions
    set website_link = 'toast'
    where website_link like 'msn%';

In your examples, the "msn" is at the beginning, so I've arranged the like pattern to match that. If you really do mean "msn" anywhere, then the pattern should be '%msn%'.

The function replace() really has nothing to do with this problem. If you want to change the underlying data, then update is the operative command.
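To make the difference between the two like patterns concrete, here is a minimal sketch. It uses Python's standard sqlite3 module instead of RSQLite purely so it is self-contained; the SQL is the same. The rows 'msnnews.com' and 'news.msn.com' are hypothetical additions for illustration, the values are trimmed of the trailing spaces in the question's INSERT, and the helper ids_matching is an assumed name, not part of any library.

```python
import sqlite3

# In-memory database with the question's rows plus two hypothetical extras
# ('msnnews.com', 'news.msn.com') added only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE personal_websessions(id_no INTEGER PRIMARY KEY, website_link TEXT)"
)
conn.executemany(
    "INSERT INTO personal_websessions VALUES (?, ?)",
    [
        (1, "google.com"),
        (2, "stackoverflow.com"),
        (3, "msn.com"),
        (4, "msnnews.com"),
        (5, "news.msn.com"),
    ],
)


def ids_matching(pattern):
    """Return the id_no values whose website_link matches a LIKE pattern."""
    rows = conn.execute(
        "SELECT id_no FROM personal_websessions "
        "WHERE website_link LIKE ? ORDER BY id_no",
        (pattern,),
    )
    return [row[0] for row in rows]


prefix_ids = ids_matching("msn%")     # "msn" at the start of the value only
anywhere_ids = ids_matching("%msn%")  # "msn" anywhere in the value

# Apply the prefix version as an UPDATE, as in the statement above.
conn.execute(
    "UPDATE personal_websessions SET website_link = 'toast' "
    "WHERE website_link LIKE 'msn%'"
)
updated = conn.execute(
    "SELECT id_no, website_link FROM personal_websessions ORDER BY id_no"
).fetchall()

print(prefix_ids, anywhere_ids)
print(updated)
```

With the prefix pattern, 'news.msn.com' is left untouched; switching the UPDATE to '%msn%' would overwrite it as well, so the choice depends on what "contains msn" is meant to cover.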


If you don't want to change the data but just want a select, then use a case expression:

select pw.id_no,
       (case when pw.website_link like 'msn%' 
             then 'toast'
             else pw.website_link
        end) as website_link
from personal_websessions pw;

You can use SQLite's regexp function, but only after it has been registered on the connection. With RSQLite, RSQLite::initRegExp() does that.

con0 <- DBI::dbConnect(RSQLite::SQLite())
DBI::dbExecute(con0, "CREATE TABLE personal_websessions(id_no INTEGER PRIMARY KEY, website_link TEXT)")
# [1] 0
DBI::dbExecute(con0, "INSERT INTO personal_websessions VALUES(1, 'google.com'), (2, 'stackoverflow.com '), (3, 'msn.com ')")
# [1] 3
DBI::dbExecute(con0, "INSERT INTO personal_websessions VALUES(4, 'msnnews.com')")
# [1] 1
DBI::dbGetQuery(con0, "select * from personal_websessions where website_link like 'msn%'")
#   id_no website_link
# 1     3     msn.com 
# 2     4  msnnews.com
DBI::dbGetQuery(con0, "select * from personal_websessions where website_link regexp '\\bmsn\\b'")
# Error: no such function: regexp
# register a REGEXP implementation on this connection, then retry
RSQLite::initRegExp(con0)
DBI::dbGetQuery(con0, "select * from personal_websessions where website_link regexp '\\bmsn\\b'")
#   id_no website_link
# 1     3     msn.com 

To replace "msn" with "toast" within the string (as a substring replacement), though, SQLite does not currently have native support for regex replacement (short of icu_replace.c, found here).

If you are confident that you will not find "msn" multiple times in one string (e.g., "msnnews.msn.com"), though, you can find with a regex (as above) and then use the non-regex replace. Continuing the above example:

DBI::dbGetQuery(con0, "
  select id_no, replace(website_link,'msn','toast') as website_link
  from personal_websessions
  where website_link regexp '\\bmsn\\b'")
#   id_no website_link
# 1     3   toast.com 

And if you need all rows with just that portion replaced, then a union would work:

DBI::dbGetQuery(con0, "
  select id_no, replace(website_link,'msn','toast') as website_link
  from personal_websessions
  where website_link regexp '\\bmsn\\b'
  union
  select id_no, website_link
  from personal_websessions
  where not website_link regexp '\\bmsn\\b' ")
#   id_no       website_link
# 1     1         google.com
# 2     2 stackoverflow.com 
# 3     3         toast.com 
# 4     4        msnnews.com
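When "msn" can occur more than once in a value, or the word boundaries should apply to the replacement as well as the match, one possible workaround is an application-defined SQL function. The sketch below uses Python's standard sqlite3 module rather than RSQLite, because its create_function call makes this easy to show. The function name regexp_replace and its argument order are hypothetical choices, not something SQLite provides, and the row 'msnnews.msn.com' is taken from the example string above.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")


def regexp(pattern, value):
    """Back the REGEXP operator: SQLite calls regexp(pattern, value)."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


def regexp_replace(value, pattern, replacement):
    """Hypothetical helper: regex substitution over one column value."""
    if value is None:
        return None
    return re.sub(pattern, replacement, value)


# Register both functions on this connection only.
conn.create_function("regexp", 2, regexp)
conn.create_function("regexp_replace", 3, regexp_replace)

conn.execute(
    "CREATE TABLE personal_websessions(id_no INTEGER PRIMARY KEY, website_link TEXT)"
)
conn.executemany(
    "INSERT INTO personal_websessions VALUES (?, ?)",
    [
        (1, "google.com"),
        (2, "stackoverflow.com"),
        (3, "msn.com"),
        (4, "msnnews.com"),
        (5, "msnnews.msn.com"),
    ],
)

# Rows where "msn" appears as a whole word.
matched = [
    row[0]
    for row in conn.execute(
        r"SELECT id_no FROM personal_websessions "
        r"WHERE website_link REGEXP '\bmsn\b' ORDER BY id_no"
    )
]

# All rows, with only whole-word "msn" replaced; no union is needed.
rows = conn.execute(
    r"""
    SELECT id_no, regexp_replace(website_link, '\bmsn\b', 'toast') AS website_link
    FROM personal_websessions
    ORDER BY id_no
    """
).fetchall()

print(matched)
print(rows)
```

Because the substitution itself is regex-based here, 'msnnews.msn.com' becomes 'msnnews.toast.com', whereas the plain replace() would also rewrite the "msn" inside "msnnews".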

