
How to change a specific substring in a concise way in Python?

I want to clean up this dataset:

df = pd.DataFrame({'address':['123 AB 45 CD','123 AB 45TH CD','123 AB 12ND CD','123 AB 12 CD','123 AB 12TH CD']})

into:

df = pd.DataFrame({'address':['123 AB 45TH CD','123 AB 45TH CD','123 AB 12ND CD','123 AB 12ND CD','123 AB 12ND CD']})

So far I have only worked out part of it, and in a tedious way:

def action(name):
    middle = name.split(" ")[2]
    if middle.isnumeric():
        if int(middle[-1]) == 1:
            name = name.replace(middle, middle+'st')
        elif int(middle[-1]) == 2:
            name = name.replace(middle, middle+'nd')
        elif int(middle[-1]) == 3:
            name = name.replace(middle, middle+'rd')
        else:
            name = name.replace(middle, middle+'th')
    return name
    
df['street'] = df['address'].apply(action)

Can someone rewrite this more concisely and also add the condition that turns 12TH into 12ND? I think

ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])

and a lambda function might help, but as a newcomer I can't figure it out. Thanks!
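For readers puzzling over that one-liner, here is a minimal sketch of what the slice does, written out as a plain function. The string "tsnrhtdd" interleaves the four suffixes, so slicing it with [k::4] picks characters k and k+4: k=0 gives "th", k=1 "st", k=2 "nd", k=3 "rd". The name `ordinal_suffix` is made up for illustration and is not part of the original post.

```python
def ordinal_suffix(n: int) -> str:
    """Return the English ordinal suffix for a non-negative integer."""
    # A tens digit of 1 (11, 12, 13, 111, 212, ...) always takes "th".
    if n // 10 % 10 == 1:
        return "th"
    # Otherwise the last digit decides: 1 -> st, 2 -> nd, 3 -> rd, rest -> th.
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


# The one-liner from the question, kept for comparison.
ordinal = lambda n: "%d%s" % (n, "tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])

# Both spellings agree on every value checked here.
assert all(ordinal(n) == f"{n}{ordinal_suffix(n)}" for n in range(1000))
```

Under standard English rules this gives 12th, not 12nd, so the 12ND in the expected output above would need a special case of its own.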

Maybe use a regular expression to get more consistent output?

import re
import pandas

# Your dataset
df = pandas.DataFrame(
    {
        "address": [
            "123 AB 45 CD",
            "123 AB 45TH CD",
            "123 AB 12ND CD",
            "123 AB 12 CD",
            "123 AB 12TH CD",
        ] + [
            f"123 AB {i} CD" for i in range(1, 100)  # Generate all options from 1 to 99
        ]
    }
)


def action(name):

    # Pick the name apart
    match = re.search(
        r"^(?P<before>\d+\s\w+)\s(?P<number>\d+)(?P<extension>\w*)\s(?P<after>\w+)$",
        name,
    )

    # Impossible to do anything, return as is with a warning
    if match is None:
        print("Impossible to parse", name)
        return name

    # Extract the elements of the name
    before = match.group("before")
    number = int(match.group("number"))
    extension = match.group("extension")
    after = match.group("after")

    # Check if the name already has an extension
    if extension != "":
        return name

    # Return the formatted string (using your ordinal function)
    return f"{before} {number}{'TSNRHTDD'[(number//10%10!=1)*(number%10<4)*number%10::4]} {after}"


df["street"] = df["address"].apply(action)
print(df["street"].to_list())

Output:

['123 AB 45TH CD',
 '123 AB 45TH CD',
 '123 AB 12ND CD',
 '123 AB 12TH CD',
 '123 AB 12TH CD',
 '123 AB 1ST CD',
 '123 AB 2ND CD',
 '123 AB 3RD CD',
 '123 AB 4TH CD',
 '123 AB 5TH CD',
 '123 AB 6TH CD',
 '123 AB 7TH CD',
 '123 AB 8TH CD',
 '123 AB 9TH CD',
 '123 AB 10TH CD',
 '123 AB 11TH CD',
 '123 AB 12TH CD',
 '123 AB 13TH CD',
 '123 AB 14TH CD',
 '123 AB 15TH CD',
 '123 AB 16TH CD',
 '123 AB 17TH CD',
 '123 AB 18TH CD',
 '123 AB 19TH CD',
 '123 AB 20TH CD',
 '123 AB 21ST CD',
 '123 AB 22ND CD',
 '123 AB 23RD CD',
 '123 AB 24TH CD',
 '123 AB 25TH CD',
 '123 AB 26TH CD',
 '123 AB 27TH CD',
 '123 AB 28TH CD',
 '123 AB 29TH CD',
 '123 AB 30TH CD',
 '123 AB 31ST CD',
 '123 AB 32ND CD',
 '123 AB 33RD CD',
 '123 AB 34TH CD',
 '123 AB 35TH CD',
 '123 AB 36TH CD',
 '123 AB 37TH CD',
 '123 AB 38TH CD',
 '123 AB 39TH CD',
 '123 AB 40TH CD',
 '123 AB 41ST CD',
 '123 AB 42ND CD',
 '123 AB 43RD CD',
 '123 AB 44TH CD',
 '123 AB 45TH CD',
 '123 AB 46TH CD',
 '123 AB 47TH CD',
 '123 AB 48TH CD',
 '123 AB 49TH CD',
 '123 AB 50TH CD',
 '123 AB 51ST CD',
 '123 AB 52ND CD',
 '123 AB 53RD CD',
 '123 AB 54TH CD',
 '123 AB 55TH CD',
 '123 AB 56TH CD',
 '123 AB 57TH CD',
 '123 AB 58TH CD',
 '123 AB 59TH CD',
 '123 AB 60TH CD',
 '123 AB 61ST CD',
 '123 AB 62ND CD',
 '123 AB 63RD CD',
 '123 AB 64TH CD',
 '123 AB 65TH CD',
 '123 AB 66TH CD',
 '123 AB 67TH CD',
 '123 AB 68TH CD',
 '123 AB 69TH CD',
 '123 AB 70TH CD',
 '123 AB 71ST CD',
 '123 AB 72ND CD',
 '123 AB 73RD CD',
 '123 AB 74TH CD',
 '123 AB 75TH CD',
 '123 AB 76TH CD',
 '123 AB 77TH CD',
 '123 AB 78TH CD',
 '123 AB 79TH CD',
 '123 AB 80TH CD',
 '123 AB 81ST CD',
 '123 AB 82ND CD',
 '123 AB 83RD CD',
 '123 AB 84TH CD',
 '123 AB 85TH CD',
 '123 AB 86TH CD',
 '123 AB 87TH CD',
 '123 AB 88TH CD',
 '123 AB 89TH CD',
 '123 AB 90TH CD',
 '123 AB 91ST CD',
 '123 AB 92ND CD',
 '123 AB 93RD CD',
 '123 AB 94TH CD',
 '123 AB 95TH CD',
 '123 AB 96TH CD',
 '123 AB 97TH CD',
 '123 AB 98TH CD',
 '123 AB 99TH CD']
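
Note that the function above returns any address that already carries a suffix unchanged, which is why 12ND survives in the output. If the goal is to make every suffix consistent, one possible approach is to drop whatever suffix is present and recompute it. The sketch below assumes standard English ordinals are wanted (so 12 becomes 12TH, not the 12ND shown in the question's expected output) and that an existing suffix is one of ST, ND, RD or TH. The names `PATTERN`, `_fix` and `normalize_ordinals` are hypothetical.

```python
import re

# Same address shape as in the answer: "<digits> <word> <digits>[suffix] <word>".
# An existing ST/ND/RD/TH suffix is matched but not captured, so it is discarded.
PATTERN = re.compile(
    r"^(\d+\s\w+\s)(\d+)(?:ST|ND|RD|TH)?(\s\w+)$",
    re.IGNORECASE,
)


def _fix(match: re.Match) -> str:
    """Rebuild the address with a freshly computed ordinal suffix."""
    digits = match.group(2)
    number = int(digits)
    suffix = "TSNRHTDD"[(number//10%10!=1)*(number%10<4)*number%10::4]
    return f"{match.group(1)}{digits}{suffix}{match.group(3)}"


def normalize_ordinals(address: str) -> str:
    # Addresses that do not match the pattern are returned unchanged.
    return PATTERN.sub(_fix, address)


addresses = [
    "123 AB 45 CD",
    "123 AB 45TH CD",
    "123 AB 12ND CD",
    "123 AB 12 CD",
    "123 AB 12TH CD",
]
print([normalize_ordinals(a) for a in addresses])
# ['123 AB 45TH CD', '123 AB 45TH CD', '123 AB 12TH CD', '123 AB 12TH CD', '123 AB 12TH CD']
```

With the DataFrame from the question, this could be applied as df["address"].map(normalize_ordinals).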


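Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.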

 