Trying to remove the trailing .0 while printing integers from a pandas dataframe

There's a problem that's been eating at me for the past few days, and I haven't been able to find a solution on SO or anywhere else. Please bear in mind that I'm still learning Python. What I'm trying to do is remove the trailing '.0' from two columns of a pandas DataFrame.

import datetime

import pandas as pd
import sqlalchemy
from sqlalchemy import text

engine = sqlalchemy.create_engine(url, client_encoding='utf8')

def user_history_summary(userid=198):
    connection = engine.connect()
    start_date = datetime.datetime(2016,8,6)
    end_date = start_date+ datetime.timedelta(days=14)
    last_date=datetime.datetime.now()
    # Counts for the first 14-day window
    result = connection.execute(text(
        "SELECT u.id as userid,CASE WHEN h.receiver_user_id = u.id AND h.sender_user_id IS NOT NULL THEN 'Received' WHEN h.sender_user_id=u.id THEN 'Given' ELSE NULL END AS Type, h.sentiment as sentiment, h.context as context,'{0}' as time_period,COUNT(*) as value"
        " FROM \"User\" u, \"HoorahTransaction\" h"
        " WHERE (u.id= h.receiver_user_id OR u.id=h.sender_user_id) AND sentiment in ('+','-') AND h.created>'{0}' AND h.created<'{1}'"
        " group by userid,type,sentiment,context".format(start_date,end_date)))
    answer= result.fetchall()
    totalReceived= pd.DataFrame(answer,columns=["userId","Type","Sentiment","Context","TimePeriod","Value"])
    counter=0
    # Repeat the query for each following 14-day window up to now
    while start_date<last_date:
        counter = counter + 1
        start_date = start_date+ datetime.timedelta(days=14)
        end_date = end_date+ datetime.timedelta(days=14)
        result = connection.execute(text(
            "SELECT u.id as userid,CASE WHEN h.receiver_user_id = u.id AND h.sender_user_id IS NOT NULL THEN 'Received' WHEN h.sender_user_id=u.id THEN 'Given' ELSE NULL END AS Type, h.sentiment as sentiment, h.context as context,'{0}' as time_period,COUNT(*) as value"
            " FROM \"User\" u, \"HoorahTransaction\" h"
            " WHERE (u.id= h.receiver_user_id OR u.id=h.sender_user_id) AND sentiment in ('+','-') AND h.created>'{0}' AND h.created<'{1}'"
            " group by userid,type,sentiment,context".format(start_date,end_date)))
        answer= result.fetchall()
        df=pd.DataFrame(answer,columns=["userId","Type","Sentiment","Context","TimePeriod","Value"])
        totalReceived= totalReceived.append(df,ignore_index=True)
    return totalReceived

totalReceived = user_history_summary()
print(totalReceived)

Below is the output DataFrame that I'm seeing:

      userId      Type Sentiment   Context           TimePeriod  Value
0     204.0  Received         +      work  2016-08-06 00:00:00    1.0
1     208.0     Given         +      work  2016-08-06 00:00:00    5.0
2     220.0  Received         +      work  2016-08-06 00:00:00    3.0
3     199.0  Received         +      work  2016-08-06 00:00:00    2.0
4     218.0     Given         +      work  2016-08-06 00:00:00    2.0
5     199.0     Given         -      work  2016-08-06 00:00:00    1.0
6     210.0     Given         +      work  2016-08-06 00:00:00    3.0
7     200.0  Received         +      work  2016-08-06 00:00:00    8.0
8     207.0     Given         -      work  2016-08-06 00:00:00    1.0
9     206.0     Given         +      work  2016-08-06 00:00:00    6.0
10    198.0  Received         +      work  2016-08-06 00:00:00   34.0
11    212.0     Given         +      work  2016-08-06 00:00:00    1.0

I need to remove the trailing '.0' from the 'userId' and 'Value' columns. Both of the database columns these values are taken from are integer columns.

You can just convert the columns to the int data type. It looks like they are currently being stored as float64.

for column in ['userId', 'Value']:
    totalReceived[column] = totalReceived[column].astype(int)
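
As a rough, self-contained sketch of that conversion: the small DataFrame below is made-up sample data standing in for the real query result, not the asker's actual rows. It also shows one possible fallback, the nullable "Int64" dtype, for the case where a column contains missing values, since a plain astype(int) raises an error on NaN. Whether missing values occur in the real data is an assumption here.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for totalReceived: integer data stored as float64.
totalReceived = pd.DataFrame({
    "userId": [204.0, 208.0, 220.0],
    "Type": ["Received", "Given", "Received"],
    "Value": [1.0, 5.0, 3.0],
})

# No missing values: a plain cast to int drops the trailing ".0".
as_int = totalReceived.astype({"userId": int, "Value": int})

# With a missing value, astype(int) would raise, so the nullable
# "Int64" extension dtype (capital I) is used instead.
with_gap = totalReceived.copy()
with_gap.loc[1, "Value"] = np.nan
with_gap = with_gap.astype({"userId": "Int64", "Value": "Int64"})

print(as_int)
print(with_gap)
```

Passing a dict to astype converts only the named columns and leaves the others untouched, which does the same job as the loop above in a single call.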
