
How to copy the values of one column of a DataFrame to a column of another DataFrame in pandas?

While copying the values of a string column one by one into a column of another DataFrame, I got output in which every value is wrapped in square brackets:

chk.at[index,'StartLocation1'] = chkn['StartLocation1'].values
chk.at[index,'EndLocation1'] = chkn['EndLocation1'].values

0        [Petrol Pump-Ramji Ambedkar Nagar]
1                           [V Enterprises]
2                                   [Baola]
3                         [Dharmajyot-Vapi]
4    [KINGSTON TOWER VASAI Dominos-(THANE)]
Name: StartLocation1, dtype: object

So I tried to remove the square brackets with this:

chk['EndLocation1'].str.strip('[]').astype(str)

0      nan
1      nan
2      nan
3      nan
4      nan

But I get only nan values. Please help!

Here is my whole code:

chk['StartLocation1'] = ''
chk['EndLocation1'] = ''

for index, row in chk.iterrows():
    start = row.StartTime
    end = row.EndTime
    reg = row.RegistrationNo

    query = "SELECT TOP 1 RegistrationNo, GPSDateTime, Location  FROM GPSEventsDataCurrentWeek where GPSDateTime Between 'start_date' and 'end_date' and RegistrationNo = 'reg' and GroundSpeed > 0 ORDER BY GPSDateTime ASC"

    query = query.replace('start_date', start.strftime('%m/%d/%Y %H:%M:%S'))
    query = query.replace('end_date', end.strftime('%m/%d/%Y %H:%M:%S'))
    query = query.replace('reg', str(reg))

    chk1 = pd.read_sql(query, con=engine)
    

    chk1 = chk1.rename({'Location': 'StartLocation1','GPSDateTime': 'StartTime'}, axis=1)
   

    query2 = "SELECT TOP 1 RegistrationNo, GPSDateTime, Location  FROM GPSEventsDataCurrentWeek where GPSDateTime Between 'start_date' and 'end_date' and RegistrationNo = 'reg' and GroundSpeed > 0 ORDER BY GPSDateTime DESC"

    query2 = query2.replace('start_date', start.strftime('%m/%d/%Y %H:%M:%S'))
    query2 = query2.replace('end_date', end.strftime('%m/%d/%Y %H:%M:%S'))
    query2 = query2.replace('reg', str(reg))

    chk2 = pd.read_sql(query2, con=engine)
    

    chk2 = chk2.rename({'Location': 'EndLocation1','GPSDateTime': 'EndTime'}, axis=1)
  
    chkn = pd.merge(chk1,chk2, on = ['RegistrationNo'], how = 'outer')
    print(chkn[['StartLocation1','EndLocation1']])
    

    chk.at[index,'StartLocation1'] = chkn['StartLocation1'].values
    chk.at[index,'EndLocation1'] = chkn['EndLocation1'].values

And this is my DataFrame chk:

   Companyid  RegistrationNo        Date  Hour  Value  RunningDuration            StartTime              EndTime
0      236.0   MH-01-CJ-3571  2020-09-01   0.0   True         00:42:00  2020-09-01 00:08:00  2020-09-01 00:59:00
1      236.0   MH-01-CV-7460  2020-09-01   0.0   True         00:49:00  2020-09-01 00:09:00  2020-09-01 00:58:00
2      654.0   MH-04-JK-4102  2020-09-01   0.0   True         00:03:00  2020-09-01 00:11:00  2020-09-01 00:24:00
3      654.0    DN-09-R-9421  2020-09-01   0.0   True         00:02:00  2020-09-01 00:24:00  2020-09-01 00:54:00
4      236.0   MH-01-CV-7456  2020-09-01   0.0   True         00:04:00  2020-09-01 00:38:00  2020-09-01 00:42:00

One possible solution is to take the single value out of each list, for example by joining the elements of each list (see "join list of lists in python"). chkn['StartLocation1'].values returns a NumPy ndarray, so each value you assign is an array of length 1, which is where the square brackets come from. Note also that the pandas documentation recommends using to_numpy() instead of .values ( https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html )
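A minimal sketch of the join idea, under some assumptions: the column is rebuilt here by hand with length-1 Python lists standing in for the length-1 arrays, and the ", " separator is an arbitrary choice. It also shows why the .str.strip attempt gives nan: the .str methods only act on cells that are real strings.

```python
import pandas as pd

# Assumption: length-1 lists stand in for the length-1 arrays that .values put in each cell.
chk = pd.DataFrame({
    "StartLocation1": [
        ["Petrol Pump-Ramji Ambedkar Nagar"],
        ["V Enterprises"],
        ["Baola"],
    ]
})

# The cells are not strings, so .str.strip has nothing to strip and yields NaN.
stripped = chk["StartLocation1"].str.strip("[]")
print(stripped.isna().all())

# Joining the elements of each cell produces plain strings.
chk["StartLocation1"] = chk["StartLocation1"].map(
    lambda cell: ", ".join(map(str, cell))
)
print(chk["StartLocation1"].tolist())
```

With a single element per cell the separator never appears, so each cell ends up holding just the location name.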

Anyway, I think that assigning the columns that way is not the best way to do it.
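If the loop is kept, one way to avoid the brackets in the first place might be to assign a scalar rather than the whole array, for example with .iloc[0]. The sketch below is only an illustration: fake_lookup is a hypothetical stand-in for the two pd.read_sql calls and the merge, and the end locations are made-up placeholders.

```python
import pandas as pd

# Hypothetical sample data; the end locations are placeholders, not real values.
SAMPLE = {
    "MH-01-CJ-3571": ("Petrol Pump-Ramji Ambedkar Nagar", "End A"),
    "MH-01-CV-7460": ("V Enterprises", "End B"),
}

def fake_lookup(reg):
    """Hypothetical stand-in for the two SQL queries plus pd.merge (chkn)."""
    rows = [(reg, *SAMPLE[reg])] if reg in SAMPLE else []
    return pd.DataFrame(
        rows, columns=["RegistrationNo", "StartLocation1", "EndLocation1"]
    )

chk = pd.DataFrame(
    {"RegistrationNo": ["MH-01-CJ-3571", "MH-01-CV-7460", "MH-04-JK-4102"]}
)
chk["StartLocation1"] = ""
chk["EndLocation1"] = ""

for index, row in chk.iterrows():
    chkn = fake_lookup(row.RegistrationNo)
    if chkn.empty:
        continue  # no matching GPS row: leave the empty string in place
    # .iloc[0] picks one scalar, so the cell holds a string, not an array.
    chk.at[index, "StartLocation1"] = chkn["StartLocation1"].iloc[0]
    chk.at[index, "EndLocation1"] = chkn["EndLocation1"].iloc[0]

print(chk)
```

The check on chkn.empty matters because a TOP 1 query can return no rows, in which case .iloc[0] would raise an IndexError.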
