
How to extract a specific number of characters from a DataFrame column in Python

Like the LEFT formula in Excel, I want to extract a number of characters from the Policy no. column, where the number depends on the Insurer column.

If the insurer is Hdfc, extract only the first 10 characters of the string; if the insurer is Tata, extract only the first 7 characters.

How can I achieve this in Python?

Insurer  Policy no.         Expected OutPut
Hdfc     4509242332         4509242332
Tata     tatadigitNational  tatadig
Hdfc     09082323ab12sd     09082323ab
Hdfc     nolanheroman       nolanherom
Tata     97543007356        9754300
Tata     pqrsequence2o202   pqrsequ
Tata     987654321          9876543
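
To try the answers that follow, the sample table can be rebuilt as a DataFrame. This is only a sketch: it assumes every policy number is stored as a string, which the question does not state explicitly.

```python
import pandas as pd

# Sample data copied from the question; all policy numbers are kept as
# strings (an assumption, the real column may hold mixed types).
df = pd.DataFrame({
    "Insurer": ["Hdfc", "Tata", "Hdfc", "Hdfc", "Tata", "Tata", "Tata"],
    "Policy no.": [
        "4509242332", "tatadigitNational", "09082323ab12sd", "nolanheroman",
        "97543007356", "pqrsequence2o202", "987654321",
    ],
    "Expected OutPut": [
        "4509242332", "tatadig", "09082323ab", "nolanherom",
        "9754300", "pqrsequ", "9876543",
    ],
})
print(df)
```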

You can define a function that does the comparison and apply it row by row to create a new column (Expected OutPut in your example).

def f(row): 
    val = str(row['Policy no.'])
    return val[:10] if row['Insurer'] == "Hdfc" else val[:7]

df['Expected OutPut'] = df.apply(f, axis=1)

You can try np.select:

import numpy as np

insurer = df['Insurer'].str.lower()  # case-insensitive comparison
df['out'] = np.select(
    [insurer.eq('hdfc'), insurer.eq('tata')],
    [df['Policy no.'].str[:10], df['Policy no.'].str[:7]],
    default=df['Policy no.'],  # other insurers keep the full value
)
print(df)

  Insurer         Policy no. Expected OutPut         out
0    Hdfc         4509242332      4509242332  4509242332
1    Tata  tatadigitNational         tatadig     tatadig
2    Hdfc     09082323ab12sd      09082323ab  09082323ab
3    Hdfc       nolanheroman      nolanherom  nolanherom
4    Tata        97543007356         9754300     9754300
5    Tata   pqrsequence2o202         pqrsequ     pqrsequ
6    Tata          987654321         9876543     9876543

One possible solution is:

df['temp'] = df['Insurer'].map({'Hdfc':10, 'Tata':7})
df['Expected Output'] = df.apply(lambda x: x['Policy no.'][:x['temp']], axis=1)

Output:

  Insurer         Policy no. Expected Output  temp
0    Hdfc         4509242332      4509242332    10
1    Tata  tatadigitNational         tatadig     7
2    Hdfc     09082323ab12sd      09082323ab    10
3    Hdfc       nolanheroman      nolanherom    10
4    Tata        97543007356         9754300     7
5    Tata   pqrsequence2o202         pqrsequ     7
6    Tata          987654321         9876543     7
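
The mapping idea above can be extended to more insurers without a helper column. The sketch below is one possible variant, not the answerer's code: the `LENGTHS` table and the `left_by_insurer` function are hypothetical names, and leaving the value untouched for unlisted insurers is an assumed choice.

```python
import pandas as pd

# Hypothetical lookup table: insurer name (lower-cased) -> characters to keep
LENGTHS = {"hdfc": 10, "tata": 7}

def left_by_insurer(df, lengths=LENGTHS):
    """Return the leading characters of 'Policy no.' according to 'Insurer'."""
    # Normalise the insurer name so that 'HDFC', 'Hdfc' and ' hdfc ' all match
    keys = df["Insurer"].str.strip().str.lower()
    n = keys.map(lengths)  # NaN where the insurer is not in the table
    policies = df["Policy no."].astype(str)
    return pd.Series(
        # Unlisted insurers keep the full policy number (assumed behaviour)
        [p if pd.isna(k) else p[:int(k)] for p, k in zip(policies, n)],
        index=df.index,
    )

sample = pd.DataFrame({
    "Insurer": ["Hdfc", "Tata", "Other"],
    "Policy no.": ["09082323ab12sd", "tatadigitNational", "abc123456789xyz"],
})
sample["Expected Output"] = left_by_insurer(sample)
print(sample)
```

Supporting a new insurer then only requires one more entry in the dictionary.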

Another solution:

df['Expected OutPut'] = df.apply(lambda x: x['Policy no.'][0:10] if x['Insurer']=='Hdfc' else x['Policy no.'][0:7], axis = 1)
print(df)

  Insurer         Policy no. Expected OutPut
0    Hdfc         4509242332      4509242332
1    Tata  tatadigitNational         tatadig
2    Hdfc     09082323ab12sd      09082323ab
3    Hdfc       nolanheroman      nolanherom
4    Tata        97543007356         9754300
5    Tata   pqrsequence2o202         pqrsequ
6    Tata          987654321         9876543

A vectorized solution using pandas string slicing:

# Start with 10 characters for every row, then shorten the Tata rows to 7
df['Expected Output'] = df['Policy no.'].str[:10]
is_tata = df['Insurer'].eq('Tata')
df.loc[is_tata, 'Expected Output'] = df.loc[is_tata, 'Policy no.'].str[:7]

which gives us the expected output:

df
  
  Insurer         Policy no. Expected Output
0    Hdfc         4509242332      4509242332
1    Tata  tatadigitNational         tatadig
2    Hdfc     09082323ab12sd      09082323ab
3    Hdfc       nolanheroman      nolanherom
4    Tata        97543007356         9754300
5    Tata   pqrsequence2o202         pqrsequ
6    Tata          987654321         9876543

You could try this:

import numpy as np

df['Expected Output'] = np.where(df['Insurer'] == 'Hdfc', df['Policy no.'].str[:10], df['Policy no.'].str[:7])
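
One caveat with the `.str` accessor: if the policy column holds real integers (for example after reading from Excel), `.str[:10]` yields NaN for those entries in an object column and raises an error on a purely numeric column. A minimal sketch of a workaround, assuming it is acceptable to treat every policy number as text:

```python
import numpy as np
import pandas as pd

# Small illustrative frame with a mixed int/str policy column (assumed data)
df = pd.DataFrame({
    "Insurer": ["Hdfc", "Tata"],
    "Policy no.": [4509242332, "tatadigitNational"],
})

# Convert to text first so that slicing works for every row
policy = df["Policy no."].astype(str)
df["Expected Output"] = np.where(
    df["Insurer"].eq("Hdfc"), policy.str[:10], policy.str[:7]
)
print(df)
```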

You can also do this by converting the DataFrame to an array and then iterating through every row:

import pandas as pd

# Work on the two relevant columns only: Insurer (index 0), Policy no. (index 1)
array = df[['Insurer', 'Policy no.']].to_numpy()

for row in array:
    # Keep 7 characters for Tata and 10 for any other insurer (Hdfc here)
    n = 7 if row[0] == 'Tata' else 10
    # Slice the string in the Policy column
    row[1] = str(row[1])[:n]

# Now convert the array back to a DataFrame
result = pd.DataFrame(array, columns=['Insurer', 'Policy no.'])
    
