
Using regex to identify characters and digits in Python

I have phone numbers that might look like:

927-6847
611-6701p3715ou264-5435
869-6289fillemichelinemoisan
613-5000p4238soirou570-9639cel

and so on...

I want to identify and break them into:

9276847
6116701
2645435
8696289
6135000
5709639

Strings to store somewhere else:

611-6701p3715ou264-5435
869-6289fillemichelinemoisan
613-5000p4238soirou570-9639cel

When there is a p between digits, the number after p is an extension: get the number before p and save the whole string somewhere else. When there is ou, another number starts after it. When there is cel or any other random string, get the number part and save the whole string somewhere else.

Edit: This is what I have tried:

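import re

phNumber = '928-4612cel'
# if the string is not digits only, strip the punctuation and keep everything before the first letter
if not re.match(r'^[\d]*$', phNumber):
    res = re.match(r"(.*?)[a-z]", re.sub(r'[^\d\w]', '', phNumber)).group(1)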

I also want to handle the other cases and identify which of the strings had extra characters before the regex chopped them off.

First, let me confirm your request:

  1. Find every number with the pattern "xxx-xxxx", where x is any digit from 0 to 9, and save it in the form "xxxxxxx".
  2. If the text contains any random string (letters), save the whole string.

import re

# list of all the strings to test
EXAMPLE = [
    "927-6847",
    "9276847",
    "927.6847",
    "611-6701p3715ou264-5435",
    "6116701p3715ou264-5435",
    "869-6289fillemichelinemoisan",
    "869.6289fillemichelinemoisan",
    "8696289fillemichelinemoisan",
    "613-5000p4238soirou570-9639cel",
]

def save_phone_number(test_string,output_file_name):
    number_to_save = []

    # regex pattern for "xxx-xxxx", where each x is a digit
    regex_pattern = r"[0-9]{3}-[0-9]{4}"
    phone_numbers = re.findall(regex_pattern,test_string)

    # remove the "-"
    for item in phone_numbers:
        number_to_save.append(item.replace("-",""))

    # save to file
    with open(output_file_name,"a") as file_object:
        for item in number_to_save:
            file_object.write(item+"\n")

def save_somewhere_else(test_string,output_file_name):
    # the regex pattern matches any string that contains at least one letter
    # (.*) means any characters, of any length
    # [a-zA-Z] means one lowercase or uppercase letter
    regex_pattern = r"(.*)[a-zA-Z](.*)"
    if re.match(regex_pattern,test_string) is not None:
        with open(output_file_name,"a") as file_object:
            file_object.write(test_string+"\n")

if __name__ == "__main__":

    phone_number_file = "phone_number.txt"
    somewhere_file = "somewhere.txt"

    for each_string in EXAMPLE:
        save_phone_number(each_string,phone_number_file)
        save_somewhere_else(each_string,somewhere_file)
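
If you would rather keep the results in memory and apply the "p" and "ou" rules from the question explicitly, something like the following might work. It is only a sketch under a few assumptions that are mine, not from the question: the name parse_entry is made up, the separator may be "-", "." or missing, an extension is always "p" followed by digits directly after a number, and "ou" only counts as a separator when a digit follows it.

```python
import re

# Assumed pattern: three digits, an optional "-" or ".", then four digits.
# The lookarounds stop it from matching inside a longer run of digits.
PHONE_RE = re.compile(r"(?<!\d)(\d{3})[-.]?(\d{4})(?!\d)")


def parse_entry(text):
    """Return (numbers, needs_review) for one raw entry.

    numbers      -- seven-digit strings, one per phone number found
    needs_review -- True when the entry holds anything besides one bare number
    """
    numbers = []
    # "ou" followed by a digit starts another phone number.
    for part in re.split(r"ou(?=\d)", text):
        # "p" plus digits right after a digit is an extension: drop it.
        main = re.sub(r"(?<=\d)p\d+", "", part)
        match = PHONE_RE.search(main)
        if match:
            numbers.append(match.group(1) + match.group(2))
    # Anything beyond a single bare number should be stored somewhere else.
    needs_review = PHONE_RE.fullmatch(text) is None
    return numbers, needs_review


if __name__ == "__main__":
    for entry in ["927-6847", "611-6701p3715ou264-5435"]:
        print(entry, parse_entry(entry))
```

For "611-6701p3715ou264-5435" this returns (['6116701', '2645435'], True), and for "927-6847" it returns (['9276847'], False), so the flag tells you which strings had extra characters. You can then write the numbers and the flagged strings to files the same way as above.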


 