
Python: extract 3 words before and 3 words after a specific list of words with a regex

I need to use Python to extract the 3 words before and the 3 words after each word in a specific list, from a string like this one:

Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, 2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, Windows Phone 8.1, Bianco [Germania]

At the moment I'm using this regex, without success:

((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})

https://regex101.com/r/yN6iI0/1

The words I need are:

  • Display
  • Fotocamera
  • RAM
  • Processore
  • Memoria

Your regex did not work because \s+ requires at least one whitespace character, but there is none between RAM and the comma that follows it. Either use a * quantifier, or just remove it and use ``

(?i)((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})

See demo
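To check the explanation above, here is a minimal Python sketch that runs both the question's pattern and the corrected one against the sample title. The variable names are illustrative, and the question's pattern is compiled with re.I on the assumption that the regex101 demo had the case-insensitive flag switched on.

```python
import re

# Sample title from the question.
text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

# Pattern from the question: "ram" must be followed by whitespace.
original = re.compile(r"((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})", re.I)

# Corrected pattern: word boundaries around RAM, optional whitespace after it.
fixed = re.compile(r"(?i)((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})")

original_match = original.search(text)   # no match: "RAM" is followed by ","
fixed_match = fixed.search(text)
before, after = fixed_match.group(1), fixed_match.group(2)

print(original_match)
print(repr(before), repr(after))
```

Note that with the corrected pattern the comma right after RAM is matched by \S+ as the first of the three following "words", so the trailing group holds only two real words here.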

I added \b (word boundary) to make sure we match RAM, not RAMBUS.

Mind the re.I modifier (or use the inline version, (?i), at the beginning of the pattern).
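A small sketch of what the word boundary and the case-insensitive flag do, using made-up sample strings (they are not from the question):

```python
import re

# Same pattern written twice: once with the flag argument, once inline.
with_flag = re.compile(r"\bRAM\b", re.IGNORECASE)
inline = re.compile(r"(?i)\bRAM\b")

# Hypothetical test strings for illustration only.
samples = ["2GB RAM, Processore", "2gb ram", "RAMBUS module"]

flag_hits = [bool(with_flag.search(s)) for s in samples]
inline_hits = [bool(inline.search(s)) for s in samples]

print(flag_hits)
print(inline_hits)
```

Both variants should behave the same way: the first two strings match, while RAMBUS is rejected because there is no word boundary between M and B.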

Other patterns can be formed in a similar way: just replace RAM with the other words from your list.
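One way to do that replacement in Python is to build the pattern per keyword. This is only a sketch: context_pattern and contexts are hypothetical names, and re.escape is added as a precaution in case a keyword ever contains regex metacharacters.

```python
import re

text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

keywords = ["Display", "Fotocamera", "RAM", "Processore", "Memoria"]


def context_pattern(word):
    """Build the answer's pattern with `word` in place of RAM (hypothetical helper)."""
    return re.compile(
        r"((?:\S+\s+){0,3})\b" + re.escape(word) + r"\b\s*((?:\S+\s+){0,3})",
        re.IGNORECASE,
    )


# Search once per keyword so each one gets its own surrounding context.
contexts = {}
for word in keywords:
    match = context_pattern(word).search(text)
    if match:
        contexts[word] = (match.group(1).strip(), match.group(2).strip())

for word, (before, after) in contexts.items():
    print(f"{word}: before={before!r} after={after!r}")
```

Searching once per keyword keeps the contexts independent, so a keyword can still appear inside another keyword's surrounding words.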

((?:[\S,]+\s+){0,3})ram,?\s+((?:[\S,]+\s*){0,3})

                       ^^

Just add an optional comma (,?) after ram. See demo.

https://regex101.com/r/yN6iI0/4

Finally, you can use this for the whole list:

((?:[\S,]+\s+){0,3})(?:ram|Display|Fotocamera|RAM|Processore|Memoria),?\s+((?:[\S,]+\s*){0,3})
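For what it's worth, here is a rough Python sketch of the combined pattern run with re.finditer. It departs from the pattern above in two labeled ways: the keyword group is made capturing so the matched word can be reported, and re.IGNORECASE replaces the separate ram/RAM alternatives. The names alternation, combined and results are illustrative.

```python
import re

text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

keywords = ["Display", "Fotocamera", "RAM", "Processore", "Memoria"]

# Build the alternation from the list (assumption: keywords are plain words).
alternation = "|".join(re.escape(k) for k in keywords)

combined = re.compile(
    r"((?:[\S,]+\s+){0,3})(" + alternation + r"),?\s+((?:[\S,]+\s*){0,3})",
    re.IGNORECASE,
)

# Each result is (words before, keyword, words after).
results = [
    (m.group(1).strip(), m.group(2), m.group(3).strip())
    for m in combined.finditer(text)
]

for before, keyword, after in results:
    print(f"{keyword}: before={before!r} after={after!r}")
```

On the sample title this reports only Display, RAM and Memoria. Fotocamera and Processore are consumed as the trailing context of the previous match, and Memoria ends up with no leading context for the same reason, because matches returned by finditer cannot overlap. If every keyword needs its own full context, searching once per keyword avoids this.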
