Check if a string in a list is between two other strings in a list?

import pandas as pd
    
nameBank = ["John Doe", "Jane Doe", "Patrick Star", "Spongebob Squarepants"]
phoneList = []
nameList = []

list1 = ["1234567890", "John doe", "Not a NAME/USELESS FILLERINFO",  "2345678901", "jane doe", "Not a NAME/USELESS FILLERINFO", "Not a NAME/USELESS FILLERINFO", "3456789012", "4567890123", "5678901234", "patrick star", "6789012345"]

df = pd.DataFrame({'Phone Number': phoneList, 'Name': nameList})
df.to_csv('results.csv', index=False, encoding='utf-8')
print(df)

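What I want to do is retrieve every phone number from list1 and put it into phoneList.

From there, I want to check whether a name from nameBank appears after the current phone number and before the next phone number in the list.

If a phone number is followed by a name, I want to append that name to nameList; if it is not, I want to append "No Name Found" to nameList, so that the two lists line up like columns in an Excel sheet.

For example, the phone number 1234567890 has the matching name John Doe between it and the next phone number. The second phone number is followed by the name Jane Doe, so when the two lists are used to build a table with pandas, the entries correspond. The third phone number, 3456789012, has no name between itself and the next phone number in the list, so I want "No Name Found" appended to nameList for it.

Basically, the output table would look like this: [chart example]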
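Written out as plain lists, the result described above would presumably be the following. This is a reconstruction from the text, not the original chart, and the entries for the last three phone numbers are inferred from list1 rather than stated explicitly in the question:

```python
# Expected contents of the two lists, reconstructed from the description
# (an illustration only; the original chart image is not reproduced here).
phoneList = ["1234567890", "2345678901", "3456789012",
             "4567890123", "5678901234", "6789012345"]
nameList = ["John Doe", "Jane Doe", "No Name Found",
            "No Name Found", "Patrick Star", "No Name Found"]

# Each phone number lines up with the name at the same index.
for phone, name in zip(phoneList, nameList):
    print(phone, name)
```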

So you want to parse list1 into a Series:

list1 = ["1234567890", "John doe", "Not a NAME/USELESS FILLERINFO",  "2345678901", "jane doe", "Not a NAME/USELESS FILLERINFO", "Not a NAME/USELESS FILLERINFO", "3456789012", "4567890123", "5678901234", "patrick star", "6789012345"]


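import re
import pandas as pd

# Raw string so that \d is not treated as a string escape.
num = re.compile(r'\d{10}')
output = {}
for i, item in enumerate(list1):
    if not num.match(item):
        continue  # skip anything that does not start with ten digits
    # Take the next element as the name unless it is missing or another phone number.
    has_name = i + 1 < len(list1) and not num.match(list1[i + 1])
    output[item] = list1[i + 1] if has_name else 'not found'

series = pd.Series(output)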

Output:

1234567890        John doe
2345678901        jane doe
3456789012       not found
4567890123       not found
5678901234    patrick star
6789012345       not found
dtype: object

import pandas as pd

nameBank = ["John Doe", "Jane Doe", "Patrick Star", "Spongebob Squarepants"]
list1 = ["1234567890", "John doe", "Not a NAME/USELESS FILLERINFO",  "2345678901", "jane doe", "Not a NAME/USELESS FILLERINFO", "Not a NAME/USELESS FILLERINFO", "3456789012", "4567890123", "5678901234", "patrick star", "6789012345"]

data = []
for index, elem in enumerate(list1):
    if elem.isnumeric():
        if (len(list1) - 1) > index:
            if list1[index+1].casefold() in map(str.casefold, nameBank):
                data.append([elem,list1[index+1].title()])
            else:
                data.append([elem, 'No Name Found'])
        else:
            data.append([elem, 'No Name Found'])
 
df = pd.DataFrame(data, columns=['Phone Number', 'Name'])
# df.to_csv('results.csv', index=False, encoding='utf-8')
print(df)

output:

  Phone Number           Name
0   1234567890       John Doe
1   2345678901       Jane Doe
2   3456789012  No Name Found
3   4567890123  No Name Found
4   5678901234   Patrick Star
5   6789012345  No Name Found

import re
import pandas as pd

list1 = ["1234567890", "John doe", "Not a NAME/USELESS FILLERINFO",  "2345678901", "jane doe", "Not a NAME/USELESS FILLERINFO", "Not a NAME/USELESS FILLERINFO", "3456789012", "4567890123", "5678901234", "patrick star", "6789012345"]
nameBank = ["John Doe", "Jane Doe", "Patrick Star", "Spongebob Squarepants"]

def mapList(list1):
    output = []
    for index, item in enumerate(list1):
        # Raw strings keep \d from being read as a string escape.
        if re.match(r"^\d{10}", item):
            # Use any one condition
            # if index < len(list1) - 1 and list1[index + 1] in nameBank:
            if index < len(list1) - 1 and not re.match(r"^\d{10}", list1[index + 1]):
                output.append([item, list1[index + 1]])
            else:
                output.append([item, 'No Name Found'])
    return output


df = pd.DataFrame(mapList(list1), columns=['Phone Number', 'Name'])      
print(df)

Output:

  Phone Number           Name
0   1234567890       John doe
1   2345678901       jane doe
2   3456789012  No Name Found
3   4567890123  No Name Found
4   5678901234   patrick star
5   6789012345  No Name Found
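
The answers above only inspect the element immediately after each phone number. If a name may appear anywhere between one phone number and the next, as the question's wording suggests, a single pass that remembers the current phone number could work. The following is a minimal sketch, not taken from any of the answers: pair_phones_with_names, PHONE, and known_names are hypothetical names, and it assumes a phone number is exactly ten digits (hence re.fullmatch rather than re.match, which would also accept longer strings that merely start with ten digits).

```python
import re

nameBank = ["John Doe", "Jane Doe", "Patrick Star", "Spongebob Squarepants"]
list1 = ["1234567890", "John doe", "Not a NAME/USELESS FILLERINFO", "2345678901",
         "jane doe", "Not a NAME/USELESS FILLERINFO", "Not a NAME/USELESS FILLERINFO",
         "3456789012", "4567890123", "5678901234", "patrick star", "6789012345"]

# Assumption: a phone number is a string of exactly ten digits.
PHONE = re.compile(r"\d{10}")
# Map each casefolded name to its canonical spelling in nameBank.
known_names = {name.casefold(): name for name in nameBank}


def pair_phones_with_names(items):
    """Pair each phone number with the first known name before the next phone number."""
    rows = []
    for item in items:
        if PHONE.fullmatch(item):
            # A new phone number starts a new row with the placeholder.
            rows.append([item, "No Name Found"])
        elif rows and rows[-1][1] == "No Name Found" and item.casefold() in known_names:
            # The first known name after the current phone number fills that row.
            rows[-1][1] = known_names[item.casefold()]
    return rows


rows = pair_phones_with_names(list1)
for phone, name in rows:
    print(phone, name)
```

The resulting rows can be passed to pd.DataFrame(rows, columns=['Phone Number', 'Name']) in the same way as in the answers above. Looking names up through a casefolded dictionary also returns the spelling stored in nameBank, so no call to title() is needed.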
