Requirement: Pull API data in 10-minute date ranges, so that each request returns a small batch instead of one bulk response that causes CPU/RAM issues, and performance is better.
Example: the date range is start_date = '2022-06-05' and end_date = '2022-06-06'.
So when I loop over the API, the start and end timestamps of each request should be as follows:
start_dt:2022-06-05 00:00:00 end_dt:2022-06-05 00:10:00
start_dt:2022-06-05 00:10:00 end_dt:2022-06-05 00:20:00
start_dt:2022-06-05 00:20:00 end_dt:2022-06-05 00:30:00
start_dt:2022-06-05 00:30:00 end_dt:2022-06-05 00:40:00
try:
    batch_start_dt = datetime.strptime(date_run, '%Y-%m-%d')
    batch_end_dt = datetime.strptime(date_run, '%Y-%m-%d') + timedelta(days=1)
    timedelta_index = pd.date_range(start=batch_start_dt, end=batch_end_dt, freq='10T').to_series()
    for index, value in timedelta_index.iteritems():
        start_dt = index.to_pydatetime()
        end_dt = start_dt + pd.Timedelta("20T")
        start_dt2 = start_dt.strftime('%Y-%m-%d %H:%M:%S')
        end_dt2 = end_dt.strftime('%Y-%m-%d %H:%M:%S')
        request_url = ("https://%s:443/api/ChangeList?fromDate=%sZ&endDate=%sZ&timeBufferInSecs=1200&firstLevelFieldsOnly=true"
                       % (api_url, start_dt2.replace(" ", "T"), end_dt2.replace(" ", "T")))
        client.request("GET", request_url, headers=headers)
        response = client.getresponse()
        data = response.read().decode("UTF-8").strip("")
        json_results = json.loads(data)
        json_arr.extend(json_results)
        start_dt += delta
        end_dt += delta
Error:
Traceback (most recent call last):
File "/opt/airflow/dags/date_api.py", line 165, in execute
File "/opt/airflow/dags/date_api.py", line 129, in call_api
json_results = json.loads(data)
File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
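The message "Expecting value: line 1 column 1 (char 0)" means json.loads was handed text that does not start with a JSON value, most often an empty string or a non-JSON error page. A minimal sketch of a guard that could sit in front of json.loads follows. The helper name parse_json_body is hypothetical, and treating an empty body as "no changes in this window" is an assumption about this API, not documented behavior.

```python
import json


def parse_json_body(status, raw_bytes):
    """Decode an HTTP response body as JSON.

    Hypothetical helper: 'status' is the HTTP status code and 'raw_bytes'
    the bytes returned by response.read().
    """
    text = raw_bytes.decode("utf-8").strip()

    # Surface HTTP errors explicitly instead of failing inside json.loads.
    if status != 200:
        raise RuntimeError("API returned HTTP %s: %r" % (status, text[:200]))

    # Assumption: an empty body means the window contained no records.
    if not text:
        return []

    return json.loads(text)
```

With http.client this might be called as parse_json_body(response.status, response.read()), and the result passed to json_arr.extend.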
You could use pandas.date_range here:
import pandas as pd
idx = pd.date_range(start_date, end_date, freq='10min')  # Timestamps spaced 10 minutes apart
for t in idx:
    # ...
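To get the start/end pairs listed in the question, each timestamp has to be paired with the one that follows it. Here is a minimal sketch that does this with only the standard library's datetime module rather than pandas (pd.date_range would give the same edges). The function name ten_minute_windows is hypothetical, and the trailing "Z" simply mirrors the URL format used in the question.

```python
from datetime import datetime, timedelta


def ten_minute_windows(date_run, minutes=10):
    """Yield (start, end) timestamp strings covering one full day.

    Hypothetical helper: 'date_run' is a 'YYYY-MM-DD' string as in the question.
    """
    start = datetime.strptime(date_run, "%Y-%m-%d")
    day_end = start + timedelta(days=1)
    step = timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"

    while start < day_end:
        # Clamp the last window so it never runs past the end of the day.
        end = min(start + step, day_end)
        yield start.strftime(fmt), end.strftime(fmt)
        start = end


for start_dt, end_dt in ten_minute_windows("2022-06-05"):
    # Build the request URL from start_dt and end_dt here.
    pass
```

Each window ends exactly where the next one begins, so a full day produces 144 non-overlapping requests.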