How to extract a specific number of characters from a DataFrame column in Python

Question

Like the LEFT formula in Excel, I want to extract a number of characters from the Policy no. column based on the Insurer column, for example:

If the insurer is HDFC, extract only the first 10 characters of the string; if the insurer is Tata, extract only the first 7 characters.

How can I achieve this in Python?

Insurer	Policy no.	Expected OutPut
Hdfc	4509242332	4509242332
Tata	tatadigitNational	tatadig
Hdfc	09082323ab12sd	09082323ab
Hdfc	nolanheroman	nolanherom
Tata	97543007356	9754300
Tata	pqrsequence2o202	pqrsequ
Tata	987654321	9876543

Answer 1

What you can do is define a function that does the comparison and apply it row by row to create a new column (Expected OutPut in your example).

def f(row): 
    val = str(row['Policy no.'])
    return val[:10] if row['Insurer'] == "Hdfc" else val[:7]

df['Expected OutPut'] = df.apply(f, axis=1)
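For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same function. It assumes the policy numbers are stored as strings; the `str()` call keeps it working if some of them are numeric.

```python
import pandas as pd

# Sample data copied from the question (policy numbers assumed to be strings).
df = pd.DataFrame({
    "Insurer": ["Hdfc", "Tata", "Hdfc", "Hdfc", "Tata", "Tata", "Tata"],
    "Policy no.": [
        "4509242332", "tatadigitNational", "09082323ab12sd", "nolanheroman",
        "97543007356", "pqrsequence2o202", "987654321",
    ],
})

def f(row):
    # Convert to str so numeric policy numbers can be sliced too.
    val = str(row["Policy no."])
    # First 10 characters for Hdfc, first 7 for every other insurer.
    return val[:10] if row["Insurer"] == "Hdfc" else val[:7]

df["Expected OutPut"] = df.apply(f, axis=1)
print(df)
```

Note that the comparison is case-sensitive and that any insurer other than "Hdfc" falls into the 7-character branch.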

Answer 2

You can try np.select:

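import numpy as np

df['out'] = np.select(
    # conditions, compared case-insensitively
    [df['Insurer'].str.lower().eq('hdfc'),
     df['Insurer'].str.lower().eq('tata')],
    # value to use for each condition
    [df['Policy no.'].str[:10],
     df['Policy no.'].str[:7]],
    # fallback for any other insurer: keep the full policy number
    df['Policy no.']
    )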

print(df)

  Insurer         Policy no. Expected OutPut         out
0    Hdfc         4509242332      4509242332  4509242332
1    Tata  tatadigitNational         tatadig     tatadig
2    Hdfc     09082323ab12sd      09082323ab  09082323ab
3    Hdfc       nolanheroman      nolanherom  nolanherom
4    Tata        97543007356         9754300     9754300
5    Tata   pqrsequence2o202         pqrsequ     pqrsequ
6    Tata          987654321         9876543     9876543

Answer 3

One possible solution is:

df['temp'] = df['Insurer'].map({'Hdfc':10, 'Tata':7})
df['Expected Output'] = df.apply(lambda x: x['Policy no.'][:x['temp']], axis=1)

Output:

  Insurer         Policy no. Expected Output  temp
0    Hdfc         4509242332      4509242332    10
1    Tata  tatadigitNational         tatadig     7
2    Hdfc     09082323ab12sd      09082323ab    10
3    Hdfc       nolanheroman      nolanherom    10
4    Tata        97543007356         9754300     7
5    Tata   pqrsequence2o202         pqrsequ     7
6    Tata          987654321         9876543     7
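A variation on the same lookup idea, sketched without the helper `temp` column. The name `PREFIX_LENGTHS`, the function `left_by_insurer`, and the extra "Other" row are hypothetical and only for illustration. It assumes you want case-insensitive matching and that insurers missing from the lookup should keep their full policy number.

```python
import pandas as pd

# Hypothetical lookup of prefix lengths, keyed by lower-cased insurer name.
PREFIX_LENGTHS = {"hdfc": 10, "tata": 7}

def left_by_insurer(frame):
    """Return the leading characters of 'Policy no.' chosen per insurer."""
    lengths = frame["Insurer"].str.lower().map(PREFIX_LENGTHS)
    policies = frame["Policy no."].astype(str)
    # Insurers missing from the lookup map to NaN; keep their value whole.
    return [p if pd.isna(n) else p[:int(n)] for p, n in zip(policies, lengths)]

# Small illustrative frame; the "Other" row is made up to show the fallback.
sample = pd.DataFrame({
    "Insurer": ["Hdfc", "TATA", "Other"],
    "Policy no.": ["09082323ab12sd", "97543007356", "abc123456789"],
})
sample["Expected Output"] = left_by_insurer(sample)
print(sample)
```

Adding another insurer then only means adding one entry to the dictionary.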

Answer 4

Another solution:

df['Expected OutPut'] = df.apply(lambda x: x['Policy no.'][0:10] if x['Insurer']=='Hdfc' else x['Policy no.'][0:7], axis = 1)
print(df)

  Insurer         Policy no. Expected OutPut
0    Hdfc         4509242332      4509242332
1    Tata  tatadigitNational         tatadig
2    Hdfc     09082323ab12sd      09082323ab
3    Hdfc       nolanheroman      nolanherom
4    Tata        97543007356         9754300
5    Tata   pqrsequence2o202         pqrsequ
6    Tata          987654321         9876543

Answer 5

A vectorized solution using pandas string slicing:

df['Expected Output'] = df['Policy no.'].str[:10]
mask = df['Insurer'].eq('Tata')
df.loc[mask, 'Expected Output'] = df.loc[mask, 'Expected Output'].str[:7]

which gives us the expected output:

df
  
  Insurer         Policy no. Expected Output
0    Hdfc         4509242332      4509242332
1    Tata  tatadigitNational         tatadig
2    Hdfc     09082323ab12sd      09082323ab
3    Hdfc       nolanheroman      nolanherom
4    Tata        97543007356         9754300
5    Tata   pqrsequence2o202         pqrsequ
6    Tata          987654321         9876543

Answer 6

You could try this:


import numpy as np

# the column is named 'Policy no.' (with the trailing dot)
df['Expected Output'] = np.where(df['Insurer'] == 'Hdfc', df['Policy no.'].str[:10], df['Policy no.'].str[:7])
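
One caveat with the `.str` accessor: it raises an error if the policy column holds numbers rather than strings. A minimal sketch of a workaround, assuming the column may be numeric, is to cast it with `astype(str)` first. The two policy numbers below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose policy numbers were read in as integers.
df = pd.DataFrame({
    "Insurer": ["Hdfc", "Tata"],
    "Policy no.": [450924233299, 97543007356],
})

# Cast once so that .str slicing works on every row.
policy = df["Policy no."].astype(str)

df["Expected Output"] = np.where(
    df["Insurer"].eq("Hdfc"),  # condition
    policy.str[:10],           # first 10 characters for Hdfc
    policy.str[:7],            # first 7 characters otherwise
)
print(df)
```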

Answer 7

You can do this by converting the DataFrame to an array and then iterating through every row:

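import pandas as pd

array = df.to_numpy()

for row in array:
    # assuming you only have 2 columns: row[0] is Insurer, row[1] is Policy no.
    if row[0] == 'Tata':
        # keep the first 7 characters of the policy number
        row[1] = row[1][:7]
    elif row[0] == 'Hdfc':
        # keep the first 10 characters of the policy number
        row[1] = row[1][:10]

# now, convert the array back to a DataFrame
result = pd.DataFrame(array, columns=['Insurer', 'Policy no.'])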


Answer 1	2	2022-08-01 15:35:47
Answer 2	1	2022-08-01 15:37:14
Answer 3	0	2022-08-01 15:38:40
Answer 4	0	2022-08-01 15:48:58
Answer 5	0	2022-08-01 15:54:34
Answer 6	0	2022-08-01 15:56:30
Answer 7	-1	2022-08-01 15:38:38
