
Regular expression A 'or' B

I have the following lines:

  9 (1224) Starting item export: IPM.Appointment, Zomverbanden (wielen) monteren, 2,61 B, John \Calendar, E:\tmp\John Kn
  9 (1224) Starting item export: IPM.Appointment,  [JK], 7,97 KB, John Knappers\Calendar, E:
  9 (1224) Starting item export: IPM.Appointment, Niet op kantoor (Auto), 1,66 GB, John \Calendar, E:\tmp\John .
  9 (1224) Starting item export: IPM.Appointment, Bespip / Tobias , 9,13 KB, John \Calendar, E:\tmp\John K
  9 (1224) Starting item export: IPM.Appointment, Q-ware el / Mehan [JK], 8,01 MB, \Calendar, E:\tmp\J

How can I write a pattern that matches these sizes (bytes, kilobytes, megabytes, and so on)?

I have tried:

res = re.findall(r'(\d*,\d* KB)|(\d*,\d* MB) | (\d*,\d* B)| (\d*,\d* GB)', i)

but it returns a list of 4-tuples (one slot per capture group), whereas I need only one item for each line:

2,61 B
7,97 KB
1,66 GB
9,13 KB
8,01 MB

You can shorten it by putting the units in a single non-capturing group, so that the whole match is returned instead of a tuple of groups:

\d+,\d* (?:KB|MB|B|GB)
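To illustrate the difference, here is a minimal sketch that runs both the pattern from the question and the shorter one through re.findall. The two sample lines are abbreviated from the question; the variable names are only illustrative.

```python
import re

# Two sample lines abbreviated from the question.
lines = [
    r"9 (1224) Starting item export: IPM.Appointment, Zomverbanden (wielen) monteren, 2,61 B, John \Calendar",
    r"9 (1224) Starting item export: IPM.Appointment,  [JK], 7,97 KB, John Knappers\Calendar, E:",
]

# Four capturing groups: findall returns one 4-tuple per match,
# with empty strings for the groups that did not take part.
original = r'(\d*,\d* KB)|(\d*,\d* MB) | (\d*,\d* B)| (\d*,\d* GB)'

# One non-capturing group: findall returns the matched text itself.
shorter = r'\d+,\d* (?:KB|MB|B|GB)'

with_tuples = [re.findall(original, line) for line in lines]
plain = [re.findall(shorter, line) for line in lines]

print(with_tuples)
print(plain)
```

With no capturing groups in the pattern, re.findall yields plain strings, which is the shape the question asks for.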


You could make it even shorter by using a character class for the optional unit prefix:

\d+,\d* [KMG]?B
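As a rough sketch of how this could be used to get exactly one item per line, the pattern can be compiled once and applied with search. The lines are taken from the question and cut off after the size field for brevity; SIZE and sizes are illustrative names.

```python
import re

# [KMG]? makes the prefix optional, so plain "B" matches too.
SIZE = re.compile(r'\d+,\d* [KMG]?B')

# Lines from the question, truncated after the size field.
lines = [
    "9 (1224) Starting item export: IPM.Appointment, Zomverbanden (wielen) monteren, 2,61 B,",
    "9 (1224) Starting item export: IPM.Appointment,  [JK], 7,97 KB,",
    "9 (1224) Starting item export: IPM.Appointment, Niet op kantoor (Auto), 1,66 GB,",
    "9 (1224) Starting item export: IPM.Appointment, Bespip / Tobias , 9,13 KB,",
    "9 (1224) Starting item export: IPM.Appointment, Q-ware el / Mehan [JK], 8,01 MB,",
]

sizes = []
for line in lines:
    match = SIZE.search(line)  # first size on the line, or None
    if match:
        sizes.append(match.group())

print(sizes)
```

Using search rather than findall returns at most one match per line, assuming each line contains a single size field as in the sample data.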


This of course assumes there is a single space between the number and the unit. Instead of the space you could use [ \t]+, which allows multiple spaces or even tabs.
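A small sketch of the whitespace-tolerant variant follows. The inputs are hypothetical, since the data in the question only ever has one space between number and unit.

```python
import re

strict = re.compile(r'\d+,\d* [KMG]?B')         # exactly one space
flexible = re.compile(r'\d+,\d*[ \t]+[KMG]?B')  # one or more spaces or tabs

# Hypothetical inputs, not taken from the question.
samples = ["7,97 KB", "7,97   KB", "7,97\tKB"]

strict_ok = [bool(strict.fullmatch(s)) for s in samples]
flexible_ok = [bool(flexible.fullmatch(s)) for s in samples]

print(strict_ok)
print(flexible_ok)
```

[ \t]+ is used rather than \s+ so that the match cannot run across a line break.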

If 10 KB (without the decimal part) is also valid, you could make the decimal part optional:

\d+(?:,\d+)? [KMG]?B
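One caveat when using this with re.findall in Python: if the optional decimal part is written as a capturing group, findall returns the contents of that group rather than the whole match. The sketch below compares the two forms on a hypothetical input string.

```python
import re

# Hypothetical input mixing sizes with and without a decimal part.
text = "10 KB, 7,97 KB, 512 B"

capturing = r'\d+(,\d+)? [KMG]?B'        # findall returns only the group
non_capturing = r'\d+(?:,\d+)? [KMG]?B'  # findall returns the whole match

group_only = re.findall(capturing, text)
whole_match = re.findall(non_capturing, text)

print(group_only)
print(whole_match)
```

The non-capturing form (?:...) keeps the grouping for the ? quantifier without changing what findall returns.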
