Converting string to int in Pandas column

I have a .csv with US Congress biographical data that I read into a pandas DataFrame:

df = pd.read_csv('congress100.csv', delimiter = ';', names = ['Name', 'Position', 'Party', 'State', 'Congress'], header = 0)

My dataframe looks like this:

0                   'ACKERMAN, Gary Leonard'        'Representative'    'Democrat'  'NY'  '100(1987-1988)'
1                  'ADAMS, Brockman (Brock)'               'Senator'    'Democrat'  'WA'  '100(1987-1988)'
2                   'AKAKA, Daniel Kahikina'        'Representative'    'Democrat'  'HI'  '100(1987-1988)'
3    'ALEXANDER, William Vollie (Bill), Jr.'        'Representative'    'Democrat'  'AR'  '100(1987-1988)'
4                  'ANDERSON, Glenn Malcolm'        'Representative'    'Democrat'  'CA'  '100(1987-1988)'
5                   'ANDREWS, Michael Allen'        'Representative'    'Democrat'  'TX'  '100(1987-1988)'
6                          'ANNUNZIO, Frank'        'Representative'    'Democrat'  'IL'  '100(1987-1988)'
7             'ANTHONY, Beryl Franklin, Jr.'        'Representative'    'Democrat'  'AR'  '100(1987-1988)'
8                  'APPLEGATE, Douglas Earl'        'Representative'    'Democrat'  'OH'  '100(1987-1988)'
9            'ARCHER, William Reynolds, Jr.'        'Representative'  'Republican'  'TX'  '100(1987-1988)'
10                    'ARMEY, Richard Keith'        'Representative'  'Republican'  'TX'  '100(1987-1988)'

I want to convert the data in the 'Congress' column to an integer. Right now, I first convert it to a simpler string:

df['Congress'] = df['Congress'].str.replace(r'100\(1987-1988\)', '1987')

This works. But then I try to convert that simpler string to an integer:

df['Congress'] = df['Congress'].pd.to_numeric(errors='ignore')

I am getting an error:

AttributeError: 'Series' object has no attribute 'pd'

Please help me resolve this error and simplify my code.

pd.to_numeric is a top-level pandas function, not a Series method, so you need to pass the column to it like this:

import pandas as pd

df = pd.DataFrame(data=[str(i + 1980) for i in range(10)], columns=['Congress'])
df['Congress'] = pd.to_numeric(df['Congress'], errors='ignore')
print(df)

The code above is only a toy example. In your own code you just need to change this line:

df['Congress'] = df['Congress'].pd.to_numeric(errors='ignore')

to:

df['Congress'] = pd.to_numeric(df['Congress'], errors='ignore')
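Since the question also asks how to simplify the code, here is a minimal sketch that pulls the starting year out of the original string and converts it in one pass. The sample rows and the regular expression are assumptions based on the '100(1987-1988)' format shown in the question. It uses errors='coerce' rather than errors='ignore': with 'ignore', a value that cannot be parsed leaves the column as strings without any warning, and recent pandas releases mark that option as deprecated.

```python
import pandas as pd

# Hypothetical sample mirroring the 'Congress' column from the question
df = pd.DataFrame({'Congress': ['100(1987-1988)', '100(1987-1988)', '101(1989-1990)']})

# Capture the first four-digit year that follows the opening parenthesis
start_year = df['Congress'].str.extract(r'\((\d{4})-', expand=False)

# Convert to numbers; anything unparseable becomes NaN instead of staying a string
df['Congress'] = pd.to_numeric(start_year, errors='coerce')
print(df['Congress'].tolist())
```

Because the pattern captures whatever year appears in each row, this should also cover files that contain more than one Congress, which the hard-coded replace in the question would not.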

Here is one more way to do it. It only works if every value in the column consists solely of digits:

 df['Congress'] = df['Congress'].astype(int)
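To illustrate the caveat, the sketch below shows astype(int) failing on a value that is not purely digits, followed by one possible fallback. The sample values and the variable names (s, failed, cleaned) are made up for illustration.

```python
import pandas as pd

# Hypothetical column with one value that is not purely digits
s = pd.Series(['1987', '1989', 'n/a'])

try:
    s.astype(int)
    failed = False
except ValueError:
    failed = True  # 'n/a' cannot be parsed as an integer

# Coerce bad values to NaN, then switch to the nullable integer dtype
# so the valid years stay whole numbers and the bad value shows as <NA>
cleaned = pd.to_numeric(s, errors='coerce').astype('Int64')
print(cleaned)
```

The nullable 'Int64' dtype (capital I) is used here because a plain int column cannot hold missing values.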
