
Python: Remove a portion of a string from a list of strings

I am using xlrd to pull a column out of an Excel worksheet and turn it into a list.

import xlrd

# Open the first worksheet of the workbook.
sheet = xlrd.open_workbook("HEENT.xlsx").sheet_by_index(0)
med_name = []
for row in sheet.col(2):
    med_name.append(row)
med_school = []
for row in sheet.col(3):
    med_school.append(row)
print(med_school)

Below is a snippet of that list, med_school:

[text:'University of San Francisco', 
text: 'Harvard University', 
text:'Class of 2016, University of Maryland School of Medicine', 
text:'Class of 2015, Johns Hopkins University School of Medicine', 
text:'Class of 2014, Raymond and Ruth Perelman School of Medicine at the
University of Pennsylvania']

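I want to remove "text:'Class of 2014, " from each string in the list. I tried a list comprehension, but got an attribute error: 'Cell' object has no attribute 'strip'. Does anyone know a way to build a list that contains only the medical school names, without the class year and the word "text"?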

xlrd does not hand you strings; it hands you instances of a class called Cell. The value attribute of each cell holds the string you see.
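A minimal sketch of the difference between a cell object and its string. FakeCell is a hypothetical stand-in (not part of xlrd) so the example runs without a workbook; with real data the cells would come from sheet.col(3), and xlrd's sheet.col_values(3) should give the plain values directly.

```python
from collections import namedtuple

# Hypothetical stand-in for an xlrd Cell, used only so this sketch runs
# without xlrd or a workbook. Real cells would come from sheet.col(3).
FakeCell = namedtuple("FakeCell", ["value"])

cells = [
    FakeCell("University of San Francisco"),
    FakeCell("  Class of 2016, University of Maryland School of Medicine  "),
]

# The cell object itself is not a string, so it has no str methods.
has_strip = hasattr(cells[0], "strip")

# Pull the plain strings out first, then use str methods on them.
names = [cell.value.strip() for cell in cells]
print(has_strip)
print(names)
```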

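To simply modify them in place:

for cell in med_school:
    # Keep everything after the first 15 characters.
    cell.value = cell.value[15:]

This removes the first 15 characters ("Class of 2014, "). Note that it does so for every cell, whether or not the cell has that prefix. Alternatively, you can use other methods, such as splitting the string on ", " or using a regular expression.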
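The string-splitting alternative could look roughly like the sketch below. The helper name drop_class_prefix and the sample strings are illustrative assumptions; it uses str.partition to split on the first ", " and only does so when the string starts with "Class of ".

```python
def drop_class_prefix(name):
    """Return the text after the first ", " when name starts with "Class of "."""
    if name.startswith("Class of "):
        _prefix, sep, rest = name.partition(", ")
        if sep:  # the separator was found
            return rest
    return name


# Sample values, as they would appear in cell.value.
samples = [
    "Harvard University",
    "Class of 2016, University of Maryland School of Medicine",
]

cleaned = [drop_class_prefix(s) for s in samples]
print(cleaned)
```

Checking for the prefix first means names without a class year pass through unchanged, even if they happen to contain a comma.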

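The point here is that you should not work directly on the items of the med_school list, but on their .value attribute, or extract the values into something else you can work with.

For example, to get all the text values with the prefix removed:

values = [cell.value[15:] for cell in med_school]

Or use a regular expression substitution, which changes only the values that contain the offending data:

import re

values = [re.sub(r"^Class of \d{4}, ", "", cell.value) for cell in med_school]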
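To see why the substitution is safer than the fixed slice, here is a small comparison on plain strings. The sample values are taken from the question's list; the variable names are illustrative.

```python
import re

# Matches only a leading "Class of YYYY, " prefix.
CLASS_PREFIX = re.compile(r"^Class of \d{4}, ")

values = [
    "Harvard University",
    "Class of 2015, Johns Hopkins University School of Medicine",
]

# A fixed slice cuts 15 characters from every value, prefixed or not.
sliced = [v[15:] for v in values]

# The regex leaves values without the prefix untouched.
substituted = [CLASS_PREFIX.sub("", v) for v in values]

print(sliced)
print(substituted)
```

The slice turns "Harvard University" into "ity", while the regex version returns it unchanged.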

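Cut the head off each string at the given separator. First check that the separator ", " is present, which tells us the string has a "Class of ..." prefix.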

med_school = ["text:'Class of 2016, University of Maryland School of Medicine'",
              "text:'Class of 2015, Johns Hopkins University School of Medicine'",
              "text:'Class of 2014, Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania'",
              "text:'Class of 1989, Rush Medical School / Knox College'",
              "text:'Bernie\'s Back-Alley School of Black-Market Techniques'"
             ]

school_name = []
for name in med_school:
    # These are plain strings here, so there is no .value to read.
    if ", " in name:
        # Drop everything up to and including ", ", plus the closing quote.
        cut = name.index(", ")
        name = name[cut+2:-1]
    else:
        # Drop the leading "text:'" and the closing quote.
        name = name[6:-1]
    school_name.append(name)

print(school_name)

Output (with extra line breaks for readability):

['University of Maryland School of Medicine',
 'Johns Hopkins University School of Medicine',
 'Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania',
 'Rush Medical School / Knox College',
 "Bernie's Back-Alley School of Black-Market Techniques"]

You can also write the loop as a list comprehension:

school_name = [name[name.index(", ")+2:-1]
                   if ", " in name
                   else name[6:-1]
               for name in med_school]
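Testing for ", " alone would also cut a school name that contains a comma but no class year. A hedged alternative, assuming the inputs really are strings of the form text:'...', is a single regular expression that strips the wrapper and an optional "Class of YYYY, " prefix. The helper name school_from_repr and the second sample are illustrative, not from the original thread.

```python
import re

# "text:'" wrapper, optional "Class of YYYY, " prefix, then the school name
# up to the final closing quote.
PATTERN = re.compile(r"^text:\s*'(?:Class of \d{4}, )?(.*)'$")


def school_from_repr(text):
    """Return the school name, or the input unchanged if it does not match."""
    match = PATTERN.match(text)
    return match.group(1) if match else text


samples = [
    "text:'Class of 2016, University of Maryland School of Medicine'",
    "text:'University of California, San Francisco'",
    "text:'Bernie's Back-Alley School of Black-Market Techniques'",
]

school_name = [school_from_repr(s) for s in samples]
print(school_name)
```

If the data are still Cell objects, reading cell.value avoids the text:'...' wrapper entirely, so only the class-year prefix needs handling.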

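Change the for row in sheet.col(2) loop so that it reads each cell's .value. This drops the cell type (the "text:" prefix) and gives you the actual value. Note that .value belongs to each Cell, not to the list that sheet.col(2) returns. Do this:

results = []
for cell in sheet.col(2):
    print(cell.value)
    results.append(cell.value)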
