Using regular expressions to identify characters and numbers in Python

Question

I have phone numbers that might look like:

927-6847
611-6701p3715ou264-5435
869-6289fillemichelinemoisan
613-5000p4238soirou570-9639cel

and so on...

I want to identify them and break them into:

Strings to store somewhere else:

611-6701p3715ou264-5435
869-6289fillemichelinemoisan
613-5000p4238soirou570-9639cel

When there is a p between digits, the number after the p is an extension: get the number before the p and save the whole string somewhere else.
When there is ou, another number starts after it.
When there is cel or any other random string, get the number part and save the whole string somewhere else.

Edit: This is what I have tried:

import re

phNumber = '928-4612cel'
if not re.match(r'^[\d]*$', phNumber):
    res = re.match(r"(.*?)[a-z]", re.sub(r'[^\d\w]', '', phNumber)).group(1)

I want to handle these cases and identify which strings contained extra characters before the regex chopped them off.

Answer 1

First, let me confirm your request:

Find every number matching the pattern "xxx-xxxx", where x is any digit from 0 to 9, and save it in the form "xxxxxxx".
If the text contains any random string, save the whole string.

import re

# list of all the strings to test
EXAMPLE = [
    "927-6847",
    "9276847",
    "927.6847",
    "611-6701p3715ou264-5435",
    "6116701p3715ou264-5435",
    "869-6289fillemichelinemoisan",
    "869.6289fillemichelinemoisan",
    "8696289fillemichelinemoisan",
    "613-5000p4238soirou570-9639cel",
]

def save_phone_number(test_string,output_file_name):
    number_to_save = []

    # regex pattern "xxx-xxxx" where x is a digit; strings without the
    # hyphen (e.g. "9276847" or "927.6847") are not matched by this pattern
    regex_pattern = r"[0-9]{3}-[0-9]{4}"
    phone_numbers = re.findall(regex_pattern,test_string)

    # remove the "-"
    for item in phone_numbers:
        number_to_save.append(item.replace("-",""))

    # save to file
    with open(output_file_name,"a") as file_object:
        for item in number_to_save:
            file_object.write(item+"\n")

def save_somewhere_else(test_string,output_file_name):
    # save the whole string if it contains any letter;
    # [a-zA-Z] matches a single lowercase or uppercase letter
    regex_pattern = r"[a-zA-Z]"
    if re.search(regex_pattern,test_string) is not None:
        with open(output_file_name,"a") as file_object:
            file_object.write(test_string+"\n")

if __name__ == "__main__":

    phone_number_file = "phone_number.txt"
    somewhere_file = "somewhere.txt"

    for each_string in EXAMPLE:
        save_phone_number(each_string,phone_number_file)
        save_somewhere_else(each_string,somewhere_file)
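
If you also need the extension after p, or numbers written without the hyphen, something along these lines might work. This is only a sketch under a few assumptions that the question does not confirm: a number is 3 digits, an optional "-" or "." separator, then 4 digits; an extension is "p" followed immediately by digits; and any string containing a letter should be flagged for saving elsewhere. The names NUMBER_RE and parse_phone_string are made up for illustration. The "ou" separator is not handled explicitly; finditer simply picks up the next number wherever it starts.

```python
import re

# Assumed format: 3 digits, optional "-" or "." separator, 4 digits,
# optionally followed by "p" and an extension of one or more digits.
NUMBER_RE = re.compile(r"(\d{3})[-.]?(\d{4})(?:p(\d+))?")


def parse_phone_string(text):
    """Return (numbers, needs_review) for one raw string.

    numbers is a list of (seven_digit_number, extension_or_None) tuples.
    needs_review is True when the string contains any letter, i.e. when
    the whole string should also be saved somewhere else.
    """
    numbers = []
    for match in NUMBER_RE.finditer(text):
        # join the two digit groups to drop the separator
        numbers.append((match.group(1) + match.group(2), match.group(3)))
    needs_review = re.search(r"[a-zA-Z]", text) is not None
    return numbers, needs_review


if __name__ == "__main__":
    for raw in ["927-6847", "611-6701p3715ou264-5435"]:
        print(raw, parse_phone_string(raw))
```

Returning the flag together with the numbers lets the caller decide where to write each piece, instead of scanning the same string twice.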
