
Extract numbers with fewer than 4 digits from a list, and add a string to the beginning and end of each number

I have a file that contains several numbers.

If a number is fewer than 4 digits long, I need to extract it, pad it with leading zeros to 4 digits, add a suffix, and then append it to a master list.

For example,

DF = [1, 23, 333, 4444]

should become

DF = ['0001.hk', '0023.hk', '0333.hk', '4444.hk']

The code below works and accomplishes the above task.

import pandas as pd

# All tickers should be stored here for further processing.
Master_List = []

def prework1():
    file = 'Path/to/document'
    tickers = []
    read = pd.read_csv(file, names =['IB_Symbol', 'Description', 'Symbol', 
    'Currency'])
    tickers = read['Symbol'].tolist()

    ticker_list = []

    for ticker in tickers:
        if len(ticker) == 1:
            ticker_list.append(ticker)

    ticker_list1 = []

    for ticker in ticker_list:
        string = '000'
        string1 = '.hk'
        tickers1 = [string + ticker + string1]
        ticker_list1.append(tickers1)

    ticker_list2 = []

    for sublist in ticker_list1:
        for item in sublist:
            ticker_list2.append(item)

    return ticker_list2



def prework2():
    file = 'Path/to/document'
    tickers = []
    read = pd.read_csv(file, names =['IB_Symbol', 'Description', 'Symbol', 'Currency'])
    tickers = read['Symbol'].tolist()

    ticker_list = []

    for ticker in tickers:
        if len(ticker) == 2:
            ticker_list.append(ticker)

    ticker_list1 = []

    for ticker in ticker_list:
        string = '00'
        string1 = '.hk'
        tickers1 = [string + ticker + string1]
        ticker_list1.append(tickers1)

    ticker_list3 = []

    for sublist in ticker_list1:
        for item in sublist:
            ticker_list3.append(item)

    return ticker_list3


def prework3():
    file = 'Path/to/document'
    tickers = []
    read = pd.read_csv(file, names =['IB_Symbol', 'Description', 'Symbol', 
    'Currency'])
    tickers = read['Symbol'].tolist()

    ticker_list = []

    for ticker in tickers:
        if len(ticker) == 3:
            ticker_list.append(ticker)

    ticker_list1 = []

    for ticker in ticker_list:
        string = '0'
        string1 = '.hk'
        tickers1 = [string + ticker + string1]
        ticker_list1.append(tickers1)

    ticker_list4 = []

    for sublist in ticker_list1:
        for item in sublist:
            ticker_list4.append(item)

    return ticker_list4



test1 = prework1()
test2 = prework2()
test3 = prework3()

print(test1)
print(test2)
print(test3)

There are a couple of issues with the above approach.

The code above gives me 3 lists, but the result should be a single list so that I can do some further processing.

Also, it looks weird and repetitive. It does what is intended, but is there a way to make it a bit nicer?

I appreciate any help!

The simplest way to do this would be:

>>> result = [f'{i:04}.hk' for i in DF]
>>> result
['0001.hk', '0023.hk', '0333.hk', '4444.hk']

Read more about format strings in PEP 498, the document that introduced them.
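The `04` format spec above is aimed at integers. If the values arrive as strings (the question calls `len()` on its `Symbol` values, which suggests they might), the `04` shorthand does not reliably left-pad them, so a spec that spells out fill character, alignment, and width is safer. A minimal sketch, with sample lists that are assumptions for illustration only:

```python
# Hypothetical sample values; the question's real data comes from a CSV file.
int_symbols = [1, 23, 333, 4444]
str_symbols = ["1", "23", "333", "4444"]

# For ints, the spec "04" zero-pads to a minimum width of 4.
from_ints = [f"{n:04}.hk" for n in int_symbols]

# For strings, state the fill ("0"), alignment (">"), and width (4) explicitly.
from_strs = [f"{s:0>4}.hk" for s in str_symbols]

print(from_ints)
print(from_strs)
```

Both comprehensions produce the same padded list, so either form can be used depending on the type of the input values.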

You could use zfill in a list comprehension:

DF = [1, 23, 333, 4444]


def fill(lst, end='.hk'):
    return [s.zfill(4) + end for s in map(str, lst)]


print(fill(DF))

Output

['0001.hk', '0023.hk', '0333.hk', '4444.hk']

The above list comprehension is equivalent to:

def fill(lst, end='.hk'):
    result = []
    for s in map(str, lst):
        result.append(s.zfill(4) + end)
    return result

From the documentation for zfill:

Return a copy of the string left filled with ASCII '0' digits to make a string of length width.

So when the code calls s.zfill(4), it prepends '0' to the string until the string has length 4.
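Two details of `str.zfill` may be worth checking before relying on it for ticker data: it never shortens a string that is already longer than the requested width, and it inserts the zeros after a leading sign rather than before it. A small sketch, using made-up inputs:

```python
# Inputs chosen only to illustrate edge cases; they are not from the question.
longer = "12345".zfill(4)   # already wider than 4, so it is returned unchanged
signed = "-5".zfill(4)      # zeros go between the sign and the digits
empty = "".zfill(4)         # an empty string becomes all zeros

print(longer, signed, empty)
```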

Python strings have a zfill() method that adds zeros to the front of a string until it reaches the length you want; a string that is already long enough is returned unchanged:

>>> '23'.zfill(4)
'0023'
>>> '1234'.zfill(4)
'1234'

So you could just do:

>>> DF = [ 1, 23, 333, 4444]
>>> D = [ str(i).zfill(4) + '.hk' for i in DF ]
>>> D
['0001.hk', '0023.hk', '0333.hk', '4444.hk']
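To connect this back to the question, the three `prework` functions could collapse into one function that reads the `Symbol` column once and returns a single list. The sketch below uses the standard-library `csv` module instead of pandas so that it runs without extra dependencies; the in-memory sample data, the `load_tickers` name, and its parameters are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical stand-in for the file in the question. The column names
# follow the `names=` list passed to pd.read_csv there.
SAMPLE_CSV = """\
IB1,Company A,1,HKD
IB2,Company B,23,HKD
IB3,Company C,333,HKD
IB4,Company D,4444,HKD
"""

FIELDS = ["IB_Symbol", "Description", "Symbol", "Currency"]


def load_tickers(handle, width=4, suffix=".hk"):
    """Return one list of zero-padded tickers from the Symbol column."""
    reader = csv.DictReader(handle, fieldnames=FIELDS)
    return [row["Symbol"].strip().zfill(width) + suffix for row in reader]


master_list = load_tickers(io.StringIO(SAMPLE_CSV))
print(master_list)
```

With a real file, the `io.StringIO(...)` argument would be replaced by an open file handle, for example `open(path, newline="")`.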

A cool one-liner could be:

list(map(lambda x: ('000' + str(x) + '.hk')[-7:], ls))

This adds '000' to the beginning of each number and the suffix to the end. Then it keeps only the last seven characters, which cuts off any surplus leading zeros (since the suffix is always the same, the final string is always seven characters long, provided the number has at most four digits). For example:

333 ==> 000333.hk ==> 0333.hk

In case you are not familiar with the map function, it applies a function to each element of an iterable, so here it applies the lambda to each number in the list.

Here's a mini example for you to try:

ls = [1, 23, 333, 4444]
print(list(map(lambda x: ('000' + str(x) + '.hk')[-7:], ls)))
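One caveat with the slicing trick is that it always keeps exactly seven characters, so a number with more than four digits would lose its leading digits, whereas `zfill` leaves such a number intact. The comparison below is a sketch; the helper names and the extra five-digit value are assumptions added for illustration.

```python
def pad_by_slicing(n):
    # Prepend zeros, add the suffix, then keep the last seven characters.
    return ("000" + str(n) + ".hk")[-7:]


def pad_by_zfill(n):
    # Pad to at least four digits, then add the suffix.
    return str(n).zfill(4) + ".hk"


values = [1, 23, 333, 4444, 12345]  # 12345 is an added edge case
sliced = [pad_by_slicing(n) for n in values]
filled = [pad_by_zfill(n) for n in values]

print(sliced)
print(filled)
```

The two approaches agree for numbers of up to four digits and differ only on the five-digit value, where slicing yields `2345.hk` and `zfill` yields `12345.hk`.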
