How can I manipulate a DataFrame - Python

I am reading some CSV files and, unfortunately, I get values shown with ' ' around them or values ending with .0. Is it possible to remove these from the DataFrame?

This is the data I'm getting:

['100002134511', '100002087058', '100002087114', '100002087082', '100002087074', '100002087072', '100002087070', '100002087068', '100002087148', '100002087149', '100002087151', 'ESZ1', 'NQZ1', 'IKZ1', 'OEZ1', 'UBZ1', 'G Z1', 'FVZ1', 'BTSZ1', 'TYZ1', 'JBZ1', 'OATZ1', 'DUZ1', 'UXYZ1', 'YMZ1', 'L M4', 'EDU3', 'SFIH3', 'L H3', 'EDH6', 'EDZ4', 'EDZ5', 'EDZ1', 'L U3', 'EDU4', 'SFIU1', 'EDH3', 'EDU5', 'EDM2', 'EDH4', 'EDZ3', 'EDM5', 'L H2', 'L M3', 'EDH2', 'EDM6', 'SFIM4', 'L M5', 'SFIZ3', 'EDM3', 'ERH2', 'L M2', 'L U4', 'EDZ2', 'L Z3', 'L U2', 'SFIH4', 'L H4', 'ERM2', 'EDH5', 'SFIZ2', 'EDU2', 'SFIH2', 'L Z2', 'L H5', 'EDM4', 'SFIZ1', 'SFIU2', 'SFIM3', 'ERH3', 'EDU6', 'L Z1', 'SFIU3', 'ERU2', 'L U5', 'SFIU4', 'L Z4', 'ERU3', 'ERZ1', 'SFIM2', 'ERV1', 'EDZ6', 'EDH7', 'ERM3', 'ERM4', 'ERH4', 'ERZ3', 'ERZ2', 'ERU4']

I tried to solve it with replace(), but it didn't work:

# Drop any blank fields and duplicates
nan_value = float("NaN")
df_position.replace("", nan_value, inplace=True)
df_position.dropna(subset=["SecurityReference"], inplace=True)
df_position.drop_duplicates(subset=["SecurityReference"], inplace=True)

df_tradeCash.replace("", nan_value, inplace=True)
df_tradeCash.dropna(subset=["MurexSecurityReference"], inplace=True)
df_tradeCash.drop_duplicates(subset=["MurexSecurityReference"], inplace=True)

# Get values
tradePositionList = df_position["SecurityReference"].tolist()  # 34076
tradeCashList = df_tradeCash["MurexSecurityReference"].tolist()  # 35777
securitylist = tradePositionList + tradeCashList

# remove .0 and ''
str_list = [str(i).replace(".0", "") for i in securitylist if i != ""]
new_list = [str(i).replace('', "") for i in str_list]
print(new_list)

Any ideas on how I can get these values without the ' '?

Thank you all.

Try this one.

lst=['100002111020','', '100002114960', '100002118038', '100002118341', '100002118723', '100002124056', '100002124472', '100002125623', '100002132063', '100002133259', '100002140470', '100002142166', '100002145213', '100002145655', '100002147566', '100002147568', '100002149569', '100002149570', '100002153436', '100002155722', '100002156059', '100002156610', '100002160798', '100002167870', '100002167871', '100002172281', '100002173832', '100002173833', '100002173834', '100002175111', '100002178288', 100001385479.0, 100001419963.0, 100001465490.0, 100001475101.0, 100001481123.0, 100001499246.0, 100001519126.0, 100001526718.0, 100001540507.0, 100001547351.0]

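# Convert each element to a string, drop empty strings, and cut a trailing '.0'.
# Only the suffix is cut: str.replace('.0', '') would also remove '.0' in the middle of a value.
str_list = [str(i)[:-2] if str(i).endswith('.0') else str(i) for i in lst if i != '']

# Then convert each element to int (this works only if every value is numeric)
int_list = [int(i) for i in str_list]

print(int_list)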

[ Output ]

   [100002111020, 100002114960, 100002118038, 100002118341, 100002118723, 100002124056, 100002124472, 100002125623, 100002132063, 100002133259, 100002140470, 100002142166, 100002145213, 100002145655, 100002147566, 100002147568, 100002149569, 100002149570, 100002153436, 100002155722, 100002156059, 100002156610, 100002160798, 100002167870, 100002167871, 100002172281, 100002173832, 100002173833, 100002173834, 100002175111, 100002178288, 100001385479, 100001419963, 100001465490, 100001475101, 100001481123, 100001499246, 100001519126, 100001526718, 100001540507, 100001547351]
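The list in the question also contains tickers such as 'ESZ1' and 'G Z1', which int() cannot convert, so here is a minimal sketch that keeps everything as strings instead. The helper name clean_reference and the sample values are made up for illustration. It also shows that the ' ' characters are not part of the data: print() on a list shows the repr of each string, quotes included, while joining the strings prints the bare values. If the .0 comes from pandas parsing the column as float, passing dtype=str to read_csv may avoid it at the source, though I have not checked that against your files.

```python
def clean_reference(value):
    """Return value as a string without a trailing '.0', or None if it is blank."""
    # NaN is the only value that is not equal to itself
    if value is None or value != value:
        return None
    text = str(value).strip()
    if text == "":
        return None
    # Cut only a trailing '.0'; str.replace would also hit '.0' in the middle
    if text.endswith(".0"):
        text = text[:-2]
    return text


# Hypothetical sample mixing the kinds of values shown in the question
raw_values = ["100002134511", 100001385479.0, "", "ESZ1", "G Z1", float("nan")]

cleaned = [c for c in map(clean_reference, raw_values) if c is not None]

# Printing a list shows each string's repr, quotes included
print(cleaned)
# Joining the strings prints the bare values, without quotes
print(", ".join(cleaned))
```

The same helper could be applied to tradePositionList + tradeCashList from the question in place of the two replace() comprehensions.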
