
Regex capture data between String and \n character in Python

I am learning Python and want to capture the data between 'NUMBER:' and \n in a string like this:

NUMBER: 3741733552\n556644

The number after the newline character is variable, so I cannot rely on it for the capture.

    re.search(r'NUMBER:(.*?)[\n]', string_data).group(1)

I tried the code above, but it does not work. Please help me capture that number. Thank you.

Edit:

I have a string "NAME: KHAN NASEEM\n\n22972 LAHSER RD\n\n..." on which I used the following code:

    name = re.search(r'NAME:\s*(.+)', string_data) 

but the output I got is "KHAN NASEEM\n\n22972 LAHSER RD\n\n...". I want KHAN NASEEM only.

Here \n is the two literal characters backslash and n, not an actual newline.

You can try this:

import re
s = "NUMBER: 3741733552\n556644"
# Raw string for the pattern, so \s and \n reach the regex engine unchanged
final_data = re.findall(r'NUMBER:\s*(.*?)\n', s)
print(final_data)

Output:

['3741733552']
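
If the value is known to consist only of digits (an assumption about the data, not something the question guarantees), a narrower pattern avoids depending on whatever follows the number. A minimal sketch; `first_number` and the variable names are illustrative only:

```python
import re

real_newline = "NUMBER: 3741733552\n556644"     # contains an actual newline
literal_chars = r"NUMBER: 3741733552\n556644"   # contains a backslash followed by n

def first_number(text):
    # \d+ stops at the first non-digit, so the terminator does not matter
    m = re.search(r'NUMBER:\s*(\d+)', text)
    # re.search returns None when nothing matches, so guard before .group()
    return m.group(1) if m else None

print(first_number(real_newline))
print(first_number(literal_chars))
```

Both calls print 3741733552, whether the separator is a real newline or the two characters backslash and n.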

Below is my solution. It is short and easy to read. You could make it more elaborate, but I prefer to keep things simple. I hope this helps!

>>> import re
>>> num = 'NUMBER: 3741733552\n556644'
>>> search = re.search(r'([0-9].*)', num).group(0)
>>> print(search)
3741733552
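
This relies on `.` not matching a newline by default, so `.*` stops at the end of the first line. The sketch below, with illustrative variable names, shows two cases where that no longer holds: when `re.DOTALL` is set, and when the data contains the characters backslash and n instead of a real newline.

```python
import re

pattern = r'[0-9].*'

real = 'NUMBER: 3741733552\n556644'      # actual newline
literal = r'NUMBER: 3741733552\n556644'  # backslash followed by n

line_only = re.search(pattern, real).group(0)
across = re.search(pattern, real, re.DOTALL).group(0)   # . now matches newlines too
literal_match = re.search(pattern, literal).group(0)    # no newline to stop at

print(repr(line_only))      # '3741733552'
print(repr(across))         # '3741733552\n556644'
print(repr(literal_match))  # '3741733552\\n556644'
```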

If you are trying to get all characters after NAME: up to a backslash followed by the letter n, use

\bNAME:\s*(.+?)(?:\\n|$)

See the regex demo.

Details

  • \b - a word boundary
  • NAME: - the literal NAME: substring
  • \s* - zero or more whitespace characters
  • (.+?) - Group 1: one or more characters other than line break characters, as few as possible
  • (?:\\n|$) - either a backslash followed by n, or the end of the string

Below is the Python demo:

import re
s = r'NAME: KHAN NASEEM\n\n22972 LAHSER RD\n\n...' # Note r'' prefix: all \ are literal backslashes here!
m = re.search(r'\bNAME:\s*(.+?)(?:\\n|$)', s)
if m:
    print(m.group(1)) # => KHAN NASEEM
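
Note that this pattern finds nothing if the data holds real newlines instead, because `$` without `re.MULTILINE` does not match in the middle of the string. When it is unclear which form the data takes, one option is to accept either terminator. The following is only a sketch; `extract_field` is a hypothetical helper, not part of the `re` module:

```python
import re

def extract_field(label, text):
    """Return the text after `label:` up to the first literal or real newline."""
    # (?:\\n|\n|$): a backslash followed by n, a real newline, or end of string
    pattern = r'\b' + re.escape(label) + r':\s*(.+?)(?:\\n|\n|$)'
    m = re.search(pattern, text)
    return m.group(1) if m else None

literal = r'NAME: KHAN NASEEM\n\n22972 LAHSER RD\n\n...'  # backslash + n
real = 'NAME: KHAN NASEEM\n\n22972 LAHSER RD\n\n...'      # actual newlines

print(extract_field('NAME', literal))
print(extract_field('NAME', real))
print(extract_field('NUMBER', r'NUMBER: 3741733552\n556644'))
```

The first two calls print KHAN NASEEM and the third prints 3741733552.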

NOTE: You should check how the text is fetched from the DB into Python. The \n sequences should actually be newlines. Once that is fixed, you will just have to use

r'\bNAME:\s*(.+)'

This matches NAME: as a whole word, then zero or more whitespace characters, and Group 1 captures one or more characters other than line break characters, as many as possible (i.e. the rest of the line).
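
If the source of the text cannot be fixed, one possible workaround (not part of the advice above) is to convert the backslash-n sequences into real newlines before matching. This sketch assumes every backslash-n pair in the data was meant to be a newline, which may not hold for all inputs:

```python
import re

raw = r'NAME: KHAN NASEEM\n\n22972 LAHSER RD\n\n...'  # backslash + n, as fetched

# Assumption: every backslash-n pair stands for a line break
fixed = raw.replace('\\n', '\n')

m = re.search(r'\bNAME:\s*(.+)', fixed)
name = m.group(1) if m else None
print(name)  # KHAN NASEEM
```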
