
Regex to find text after last occurrence of character till another one

I am looking for a regular expression that extracts text starting at " including: " and ending with the line that follows the last occurrence of "\n*" or "\n•", up to the next "\n". In other words, the match should end at the first occurrence of "\n" right after the last occurrence of "\n*" or "\n•". I have tried this demo, but it doesn't work the way I want: I would like to include the final sentence up to "guidance.\n". I am using Python and I am trying to extract this into a new column called "Skills" in my pandas DataFrame. The "Job Description" column holds the source text.

df["Skills"]=df["Job description"].str.extract("including:((?:.)*\\n[*|•])")

You might use

(?s)\bincluding:(.*\\n[*•]).*?\\n(?![*•])
  • (?s) Inline modifier to make the dot match a newline
  • \bincluding: Match including: preceded by a word boundary
  • ( Capture group 1
    • .*\\n[*•] Match till the last occurrence of \n followed by either * or •
  • ) Close group 1
  • .*?\\n Match till the first occurrence of \n
  • (?![*•]) Negative lookahead, assert that this \n is not directly followed by * or •

Regex demo
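
To make the breakdown concrete, here is a minimal sketch using Python's built-in re module. The sample text and the variable names are made up for illustration; it assumes the description stores "\n" as two literal characters (a backslash followed by n), which is the case this first pattern targets.

```python
import re

# Hypothetical sample in which "\n" is stored as two literal characters
# (a backslash followed by "n"), not as a real line break.
text = r"Must have skills including:\n* Python\n* SQL\n• Works with minimal guidance.\nApply today."

# In a raw string, \\n matches a literal backslash followed by "n".
pattern = r"(?s)\bincluding:(.*\\n[*•]).*?\\n(?![*•])"

match = re.search(pattern, text)
whole_match = match.group(0)  # everything the pattern consumed
group_one = match.group(1)    # only the part inside the parentheses

print(whole_match)
print(group_one)
```

In this sketch the whole match runs through "guidance.\n", while group 1 stops right after the last bullet marker, because the group is closed before the final .*?\\n part.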

Or when \n in the text is a real newline rather than the two literal characters \ and n

(?s)\bincluding:(.*\n[*•]).*?\n(?![*•])

Regex demo

For example

df["Skills"] = df["Job description"].str.extract(r"(?s)\bincluding:(.*\n[*•]).*?\n(?![*•])")
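
One thing to keep in mind: Series.str.extract returns the contents of the capture groups, not the whole match, so with this pattern the new column would end at the last bullet marker. The sketch below compares group 1 of the pattern above with a variant that moves the closing parenthesis so the last bullet's text is captured as well. The variant is an assumption about the desired output, not part of the original answer, and the sample text and variable names are hypothetical. It uses the built-in re module so it runs without pandas.

```python
import re

# Hypothetical job description with real newlines.
text = "Must have skills including:\n* Python\n* SQL\n• Works with minimal guidance.\nApply today."

# Pattern from the answer: group 1 closes right after the last bullet marker.
answer_pattern = r"(?s)\bincluding:(.*\n[*•]).*?\n(?![*•])"

# Assumed variant: the lazy .*? is moved inside the group so the last
# bullet's own text is captured too (the closing newline stays outside).
variant_pattern = r"(?s)\bincluding:(.*\n[*•].*?)\n(?![*•])"

answer_group = re.search(answer_pattern, text).group(1)
variant_group = re.search(variant_pattern, text).group(1)

print(repr(answer_group))
print(repr(variant_group))
```

If the variant matches what you need, the same pattern string can be passed to df["Job description"].str.extract(...) in place of the one above. Both patterns require a newline after the last bullet line, so a description that ends directly after the last bullet would not match.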


