
Regex to match comma-separated strings containing comma-formatted decimals

I have comma-separated strings like this one:

"Assistência 24hs com Guincho s/limite de km, 2o. Guincho 100 km no mesmo evento, Pacote de Benefícios HDI, Táxi sem Franquia, Serviços Residenciais, 7 dias de Carro Reserva quando Terceiro (sem ar cond), 7 dias de Carro Reserva, Vidros com franquia de R$ 260,00."

I want to split the string on commas, but some numbers in the string use a comma as the decimal separator (for example, 260,00), and I don't want a split to happen there.

You could split on a comma followed by a space:

>>> s.split(", ")
['Assist\xc3\xaancia 24hs com Guincho s/limite de km',
 '2o. Guincho 100 km no mesmo evento',
 'Pacote de Benef\xc3\xadcios HDI',
 'T\xc3\xa1xi sem Franquia',
 'Servi\xc3\xa7os Residenciais',
 '7 dias de Carro Reserva quando Terceiro (sem ar cond)',
 '7 dias de Carro Reserva',
 'Vidros com franquia de R$ 260,00.']

Note that this removes both the comma and the space that follows it from the resulting strings.

You're walking on thin ice here. From your example, it seems that using ", " (comma-space) as the field separator would work. Most people would opt to quote the strings or use a different delimiter (pipe, tab, \x1F, etc.).

This seems very fragile to me, and it could easily break later on. If you have any influence over the format of the data you are given, have that conversation first.
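If the producer of the data can be changed, the two options mentioned above might look like the following minimal sketch. It assumes Python 3 and the standard csv module; the `fields` list is a hypothetical sample built from the question's text, not the real data feed.

```python
import csv
import io

# Hypothetical field values; the last one contains a decimal comma.
fields = [
    "Táxi sem Franquia",
    "Serviços Residenciais",
    "Vidros com franquia de R$ 260,00.",
]

# Producer side: csv.writer quotes any field that contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(fields)
line = buffer.getvalue()

# Consumer side: csv.reader honours the quotes, so "260,00" stays in one field.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)

# Alternative: join on a delimiter that cannot occur in the text,
# here the ASCII unit separator (\x1f).
joined = "\x1f".join(fields)
restored = joined.split("\x1f")
print(restored)
```

Either approach removes the ambiguity at the source, so the consumer no longer has to guess which commas are separators.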

The following avoids the fragility that @dsz pointed out.

import re

txt = '''Assistência 24hs com Guincho s/limite de km, 2o. Guincho 100 km no mesmo evento, Pacote de Benefícios HDI, Táxi sem Franquia, Serviços Residenciais, 7 dias de Carro Reserva quando Terceiro (sem ar cond), 7 dias de Carro
Reserva, Vidros com franquia de R$ 260,00.'''

# Split on a comma that is not immediately followed by a digit, and drop any
# whitespace after it; the decimal comma in "260,00" is left alone.
re.split(r",(?!\d)\s*", txt)

Output:

['Assist\xc3\xaancia 24hs com Guincho s/limite de km',
 '2o. Guincho 100 km no mesmo evento',
 'Pacote de Benef\xc3\xadcios HDI',
 'T\xc3\xa1xi sem Franquia',
 'Servi\xc3\xa7os Residenciais',
 '7 dias de Carro Reserva quando Terceiro (sem ar cond)',
 '7 dias de Carro\nReserva',
 'Vidros com franquia de R$ 260,00.']
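A slightly stricter variant can be sketched with lookarounds: a comma counts as a field separator unless it sits directly between two digits. This is only an illustration under that assumption. The names `FIELD_SEP` and `split_fields` and the shortened `sample` string are hypothetical, and the example assumes Python 3.

```python
import re

# A comma is a separator if it is NOT preceded by a digit, or NOT followed
# by a digit. A comma between two digits (as in "260,00") fails both
# alternatives and is therefore kept. Whitespace after the comma is consumed.
FIELD_SEP = re.compile(r"(?<!\d),\s*|,(?!\d)\s*")

def split_fields(text):
    """Hypothetical helper: split text on field-separating commas only."""
    return [part.strip() for part in FIELD_SEP.split(text)]

# Shortened sample: one separator with a space, one without, one decimal comma.
sample = "Guincho 100 km, 7 dias de Carro Reserva,Vidros com franquia de R$ 260,00."
print(split_fields(sample))
```

Unlike splitting on ", ", this does not depend on a space following each separator. It would still misread a separator written with no space between two digit-bounded fields (for example "100,200"), so quoting at the source remains the more robust fix.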
