简体   繁体   中英

How to convert dataframe to string in Python

We're working on a python project, where we're using SQLalchemy to retrieve data from our MySQL database. We have succeeded in establishing a connection to the database:

engine = create_engine("mysql://xxx:xxx@localhost/xxx")
conn = engine.connect()

When we are connected, we are retrieving first and last timestamp, so we can set up a range and an interval of 20 minutes. We are doing so by using following line of code:

lastLpnTime = pd.read_sql('SELECT MAX(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values.tolist()

We then want to round down the timestamp for firstLpnTime , so the minutes will be round down to either xx.00, xx.20, xx:40, so for example 15:09:25 will round down to 15:00:00

firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')

And then we set up a range between firstLpnTime and lastLpnTime and an interval of 20 min

index = pd.date_range(start=firstLpnTime.min(), end=lastLpnTime.max(), freq='20T')

However... we are receiving an error for this line of code:

firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')

Error message is saying:

TypeError: arg must be a string, datetime, list, tuple, 1-d array, or Series

Variable firstLpnTime has an astype(str) , so it should be a string, so we don't get why it's not working. Maybe the datatype is not changing to string?

The output, when I print firstLpnTime , is: [['2019-07-26 15:09:25']]

It looks like your firstLpnTime is nested within an array.

So, you can try taking first index from this variable and then casting it using pandas.

firstLpnTime = pd.to_datetime(firstLpnTime[0]).round('20T')

Then, firstLpnTime should give this result:

DatetimeIndex(['2019-07-26 15:00:00'], dtype='datetime64[ns]', freq=None)

I hope it solves your problem.

In order to avoid value nested within array, you can use DataFrame directly as shown in below code:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql://xxxx:xxxx@localhost:3306/test')
conn = engine.connect()

lastLpnTime = pd.read_sql('SELECT MAX(timestamp) as max_time FROM 
     test.test_table WHERE topic = "lpn1";', conn).iloc[0]['max_time']
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) as min_time FROM 
     test.test_table WHERE topic = "lpn1"', conn).iloc[0]['min_time']

firstLpnTime = firstLpnTime.round('20T')
firstLpnTime

This will provide the expected output in Pandas Timestamp class.

Timestamp('2019-07-26 15:00:00')

Hope it helps! :)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM