Slicing a substring from the string in each row of a column using Python

Question

I am an absolute beginner. I am having trouble slicing strings from an Excel file using Python. My Excel file contains the following information:

Column 1:

ordercode   
PMC11-AA1L1FAVWJA   
PMC21-AA1A1CBVXJA   
PMP11-AA1L1FAWJJ    
PMP21-AA1A1FBWJJ    
PMP23-AA1A1FA3EJ+JA
PTP31B-AA3D1HGBVXJ  
PTC31B-AA3D1CGBWBJA 
PTP33B-AA3D1HGB1JJ

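I want to slice the string in the "ordercode" column at a different position depending on whether it contains "PMC11"/"PMC21"/"PMP21"/"PMP11"/"PMP23"/"PTP31B"/"PTP33B"/"PTC31B", and save the result in a new column "pressurerange". In Excel I used the following formula and it works fine: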

=IF(OR(ISNUMBER(SEARCH("PMC11",A2)),ISNUMBER(SEARCH("PMC21",A2)),ISNUMBER(SEARCH("PMP11",A2)),ISNUMBER(SEARCH("PMP21",A2)),ISNUMBER(SEARCH("PMP23",A2))),MID(A2,11,2),MID(A2,12,2))

In Python I used the code below, but it does not work correctly.

Python code:

import pandas as pd
#Assigning the worksheet to file
file="Stratification_worksheet.xlsx"
#Loading the spreadsheet 
data= pd.ExcelFile(file)
#sheetname
print(data.sheet_names)
#loading the sheetname to df1
df=data.parse("Auftrag")
print(df)

#creating a new column preessurerange and slicing the pressure range from order code

for index,row in df.iterrows():
    if "PMC11" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(10,12)
    elif "PMC21" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(10,12)
    elif "PMP11" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(10,12)
    elif "PMP21" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(10,12)
    elif "PMP23" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(10,12)
    elif "PTP31B" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(11,13)
    elif "PTP33B" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(11,13)
    elif "PTC31B" in df.loc[index,"ordercode"]:
        df["pressurerange"]=df["ordercode"].str.slice(11,13)
    else:
        df["pressurerange"]="NONE"
    print(df.loc[:,["pressurerange"]])
    break

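What it does here is check the first IF condition and then slice the string at position (10,12) for every row. I know the mistake is in the line below, but I don't know the exact code to use instead.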

=df["pressurerange"]=df["ordercode"].str.slice(10,12)

Answer 1

A general solution that also works for values that match none of the prefixes, returning NaN for them.

I believe you need numpy.select with boolean masks created by str.startswith:

import numpy as np

L1 = ["PMC11","PMC21","PMP21","PMP11","PMP23"]
L2 = ["PTP31B","PTP33B","PTC31B"]
m1 = df["ordercode"].str.startswith(tuple(L1))
m2 = df["ordercode"].str.startswith(tuple(L2))

a = df["ordercode"].str.slice(10,12)
b = df["ordercode"].str.slice(11,13)

df["pressurerange"] = np.select([m1, m2], [a, b], default=np.nan)
print (df)
             ordercode pressurerange
0    PMC11-AA1L1FAVWJA            1F
1    PMC21-AA1A1CBVXJA            1C
2     PMP11-AA1L1FAWJJ            1F
3     PMP21-AA1A1FBWJJ            1F
4  PMP23-AA1A1FA3EJ+JA            1F
5   PTP31B-AA3D1HGBVXJ            1H
6  PTC31B-AA3D1CGBWBJA            1C
7   PTP33B-AA3D1HGB1JJ            1H
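
To try the mask-based approach without the Excel file, here is a minimal sketch that builds a few sample rows inline. The row "XYZ99-0000000000" is a hypothetical code, not part of the original data, added only to show what happens when neither mask matches; the names short_prefixes and long_prefixes are also just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ordercode": [
    "PMC11-AA1L1FAVWJA",
    "PTP31B-AA3D1HGBVXJ",
    "XYZ99-0000000000",  # hypothetical code with an unknown prefix
]})

# str.startswith accepts a tuple of prefixes and returns a boolean mask
short_prefixes = ("PMC11", "PMC21", "PMP21", "PMP11", "PMP23")
long_prefixes = ("PTP31B", "PTP33B", "PTC31B")
m1 = df["ordercode"].str.startswith(short_prefixes)
m2 = df["ordercode"].str.startswith(long_prefixes)

# For each row, take the slice belonging to the first mask that is True;
# rows matching neither mask get the default value
df["pressurerange"] = np.select(
    [m1, m2],
    [df["ordercode"].str.slice(10, 12), df["ordercode"].str.slice(11, 13)],
    default=np.nan,
)
print(df)
```

Unlike the loop in the question, the masks are evaluated per row, so each row gets the slice that belongs to its own prefix.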

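If every value contains a -, the solution can be simplified: use str.split, select the second element of each resulting list with str[1], and finally select characters 5-6 with str[4:6] or Series.str.slice: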

df["pressurerange"] = df['ordercode'].str.split('-', n=1).str[1].str[4:6]
#alternative solution
#df["pressurerange"] = df['ordercode'].str.split('-', n=1).str[1].str.slice(4,6)
print (df)
             ordercode pressurerange
0    PMC11-AA1L1FAVWJA            1F
1    PMC21-AA1A1CBVXJA            1C
2     PMP11-AA1L1FAWJJ            1F
3     PMP21-AA1A1FBWJJ            1F
4  PMP23-AA1A1FA3EJ+JA            1F
5   PTP31B-AA3D1HGBVXJ            1H
6  PTC31B-AA3D1CGBWBJA            1C
7   PTP33B-AA3D1HGB1JJ            1H
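
As a quick check of the split-based version, the sketch below runs on inline sample data instead of the Excel file. The value "NOHYPHEN" is hypothetical and is included only to show the missing-hyphen case: as far as I know, .str[1] gives NaN when the split list has no second element, so that row ends up as NaN instead of raising an error.

```python
import pandas as pd

codes = pd.Series([
    "PMC21-AA1A1CBVXJA",
    "PMP23-AA1A1FA3EJ+JA",
    "PTC31B-AA3D1CGBWBJA",
    "NOHYPHEN",  # hypothetical value without a hyphen
])

# Split once on the first hyphen and keep the part after it
after_hyphen = codes.str.split("-", n=1).str[1]

# Characters 5-6 of that part (zero-based positions 4 and 5)
pressure = after_hyphen.str[4:6]
print(pressure)
```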

Answer 2

Python gives you far more options than Excel. If you have a string code = "PMC21-AA1A1CBVXJA", you can write

pressurerange, rest = code.split("-")

Now you have the part before the - and the part after it. I'll let you figure out how to use this in your workflow.

(Note: if the rest part can contain additional hyphens, use code.split("-", 1) to limit the split to a single match.)
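
One possible way to plug this into a DataFrame is sketched below. The helper name pressure_range is hypothetical, and the [4:6] offset is an assumption carried over from the Excel formula in the question (MID(A2,11,2) for the five-character prefixes), not something this answer states.

```python
import pandas as pd

def pressure_range(code):
    """Return characters 5-6 of the part after the first hyphen, or None."""
    parts = code.split("-", 1)  # split at most once
    if len(parts) < 2 or len(parts[1]) < 6:
        return None  # no hyphen, or too short to hold the two characters
    return parts[1][4:6]

df = pd.DataFrame({"ordercode": ["PMC21-AA1A1CBVXJA", "PTP33B-AA3D1HGB1JJ"]})

# Apply the plain-Python helper to every value in the column
df["pressurerange"] = df["ordercode"].map(pressure_range)
print(df)
```

A plain function with Series.map is usually slower than the vectorised .str methods, but it may be easier to read and extend for a beginner.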

Answer 3

I would use split:

string = 'PMC11-AA1L1FAVWJA'
pressure_range, columns = string.split('-', 1)
column = columns[4:6]
