Splitting and appending strings in Python

Question

I have strings that look like this:

'Census Tract 201, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020100'

I want to take the state number 01, the county number 001, and the tract 020100 and make a new string 01001020100. How do I achieve this in Python?

All of these strings are in a pandas DataFrame, so I need to apply this method across all of the rows. They are all of type string, as I said above.

To provide more context, here is all my code:

import pandas as pd
import numpy as np
import re

df = pd.read_csv('all_data.csv')

column_of_interest = df['Location+Type']

column_of_interest.head()

print(type(column_of_interest[0][0]))

<class 'str'>

find_census = lambda text: text.split('state:')[1].split('>')[0] + text.split('county:')[1].split('>')[0] + text.split('tract:')[1].split('>')[0]
column_of_interest['GEOID'] = column_of_interest.apply(lambda x: find_census(x['Location+Type']))

and I am getting this error for the lambda:

     1 find_census = lambda text: text.split('state:')[1].split('>')[0] + text.split('county:')[1].split('>')[0] + text.split('tract:')[1].split('>')[0]
----> 2 column_of_interest['GEOID'] = column_of_interest.apply(lambda x: find_census(x['Location+Type']))

TypeError: string indices must be integers

Answer 1

To achieve your goal, you can use a regular expression. Since you seem to be a beginner, though, here is a simpler approach based on the split method:

census = 'Census Tract 201, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020100'

state = census.split('state:')[1].split('>')[0]
county = census.split('county:')[1].split('>')[0]
tract = census.split('tract:')[1].split('>')[0]
result = state + county + tract

print(result) # 01001020100

Update: using a lambda expression to generate the desired output

find_census = lambda text: text.split('state:')[1].split('>')[0] + text.split('county:')[1].split('>')[0] + text.split('tract:')[1].split('>')[0]

# to use the above lambda expression
print(find_census(census)) # 01001020100
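
To run this over every row of a DataFrame, one option is to pass the function straight to Series.apply. The TypeError in the question most likely arises because column_of_interest is already a single column (a Series), so each x handed to the lambda is a plain string, and x['Location+Type'] then tries to index that string with another string. A minimal sketch, assuming the column is named 'Location+Type' as in the question; the small DataFrame stands in for the CSV, and its second row is invented sample data:

```python
import pandas as pd

def find_census(text):
    """Concatenate the state, county and tract codes found in text."""
    state = text.split('state:')[1].split('>')[0]
    county = text.split('county:')[1].split('>')[0]
    tract = text.split('tract:')[1].split('>')[0]
    return state + county + tract

# Stand-in for pd.read_csv('all_data.csv'); the second row is invented sample data.
df = pd.DataFrame({
    'Location+Type': [
        'Census Tract 201, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020100',
        'Census Tract 202, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020200',
    ]
})

# Each element of the Series is already a string, so apply the function directly
# and store the result as a new column on the DataFrame (not on the Series).
df['GEOID'] = df['Location+Type'].apply(find_census)

print(df['GEOID'].tolist())  # ['01001020100', '01001020200']
```

Assigning to df['GEOID'] rather than column_of_interest['GEOID'] keeps the new values as a proper column alongside the original text.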

Answer 2

Assuming your text follows the pattern you have given, you can use regular expressions to get the result.

Here \d matches a single digit (so \d+ matches one or more digits) and \s matches a whitespace character.

import re

s = 'Census Tract 201, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020100'
# Raw string so the backslashes in \d and \s reach the regex engine unchanged
m = re.search(r"state:(\d+)>\scounty:(\d+)>\stract:(\d+)", s)
''.join(m.groups())

Output

'01001020100'
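
If some rows might not follow the pattern, re.search returns None and calling m.groups() raises an AttributeError. A minimal sketch of a guarded variant follows; the helper name extract_geoid, the use of \s* instead of \s, and the choice to return None for non-matching text are assumptions made for illustration:

```python
import re

# Compile once so the pattern can be reused across many strings.
# \s* tolerates zero or more whitespace characters after each '>'.
GEOID_PATTERN = re.compile(r"state:(\d+)>\s*county:(\d+)>\s*tract:(\d+)")

def extract_geoid(text):
    """Return state + county + tract as one string, or None if text does not match."""
    m = GEOID_PATTERN.search(text)
    if m is None:
        return None
    return ''.join(m.groups())

s = 'Census Tract 201, Autauga County, Alabama: Summary level: 140, state:01> county:001> tract:020100'
print(extract_geoid(s))                # 01001020100
print(extract_geoid('no codes here'))  # None
```

A function like this can be passed to Series.apply on the text column, and rows that do not match then show up as missing values rather than stopping the whole run.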

2 answers

Answer 1
1 Accepted 2019-11-20 23:26:32

Answer 2
0 2019-11-20 22:50:34
