I need to go through each day from 1 January 2015 up until 5 February 2020.
The following script gets me the month-end dates for each month up until 5 February 2020:
import pandas as pd

# freq="M" yields month-end dates, so 2020-02-05 itself is not in the range.
dates = pd.date_range(start='20150101', end='20200205', freq='M').strftime("%Y%m%d")
print(dates)
Result:
Index(['20150131', '20150228', '20150331', '20150430', '20150531', '20150630',
'20150731', '20150831', '20150930', '20151031', '20151130', '20151231',
'20160131', '20160229', '20160331', '20160430', '20160531', '20160630',
'20160731', '20160831', '20160930', '20161031', '20161130', '20161231',
'20170131', '20170228', '20170331', '20170430', '20170531', '20170630',
'20170731', '20170831', '20170930', '20171031', '20171130', '20171231',
'20180131', '20180228', '20180331', '20180430', '20180531', '20180630',
'20180731', '20180831', '20180930', '20181031', '20181130', '20181231',
'20190131', '20190228', '20190331', '20190430', '20190531', '20190630',
'20190731', '20190831', '20190930', '20191031', '20191130', '20191231',
'20200131'],
dtype='object')
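Since `freq="M"` only produces month ends, the final partial period (1–5 February 2020) is missing. A small sketch of one way to build per-month (start, end) date pairs with the last period clipped at 2020-02-05 (the pair-building helper here is my own, not part of the original scripts):

```python
import pandas as pd

# Month starts from Jan 2015 through Feb 2020; clip the last end date at 2020-02-05.
end_cap = pd.Timestamp("2020-02-05")
starts = pd.date_range(start="2015-01-01", end=end_cap, freq="MS")
pairs = [
    (s.strftime("%Y%m%d"), min(s + pd.offsets.MonthEnd(0), end_cap).strftime("%Y%m%d"))
    for s in starts
]
print(pairs[0], pairs[-1])  # ('20150101', '20150131') ('20200201', '20200205')
```

Each pair can then serve as the `startDate`/`endDate` of one API request, assuming the API accepts arbitrary ranges within a month.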
The following script scrapes wind speed for each day in January 2015. In my main I specify the API key, start date and end date, which are used in the URL; I believe this is where the merge of the two scripts could take place.
import pandas as pd
import requests
import warnings
from urllib3.exceptions import InsecureRequestWarning

headers = {
    'scheme': 'https',
    'accept': 'application/json, text/plain, */*',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-GB,en;q=0.9,en-US;q=0.8,da;q=0.7',
    'origin': 'https://www.wunderground.com',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'cross-site',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36'
}

# Here I get the relevant data, being the dates and wind speed, and return the
# daily mean wind speed as a separate series called dkk.
def get_data(response):
    df = pd.DataFrame(response.json()["observations"])
    df["time"] = pd.to_datetime(df["valid_time_gmt"], unit='s')
    dkk = df.groupby(df["time"].dt.date)["wspd"].mean()
    return dkk

if __name__ == "__main__":
    api_key = "xxxxxx"
    start_date = "20150101"
    end_date = "20150131"
    urls = [
        "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json?apiKey="
        + api_key + "&units=e&startDate=" + start_date + "&endDate=" + end_date
    ]
    # Here I append the data to a dataframe, transpose it and store it in
    # df_transposed, which results in the output below.
    df = pd.DataFrame()
    for url in urls:
        warnings.simplefilter('ignore', InsecureRequestWarning)
        res = requests.get(url, headers=headers, verify=False)
        data = get_data(res)
        # DataFrame.append is deprecated; pd.concat is the current equivalent.
        df = pd.concat([df, data.to_frame().T])
    df_transposed = df.T
    print(df_transposed)
Results:
wspd
2015-01-01 24.333333
2015-01-02 18.696970
...
2015-01-30 12.121212
2015-01-31 21.575758
The question is: I need the wind speed from 1 January 2015 to 5 February 2020. How can I best combine my scripts to get the desired output, a two-column dataframe with dates in one column and wind speed (wspd) in the other?
The desired output:
wspd
2015-01-01 24.333333
2015-01-02 18.696970
2015-01-03 8.454545
2015-01-04 10.363636
2015-01-05 11.333333
...
2020-02-04 13.5
2020-02-05 7.1
The wspd for the last two dates can be seen here:
https://www.wunderground.com/history/monthly/gb/darlington/EGNV/date/2020-2
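One way to combine the two scripts is to generate one URL per month and concatenate the per-month results. A sketch only, reusing the `get_data` and `headers` from the script above; the actual fetch lines are commented out since they need a real API key and network access:

```python
import pandas as pd

api_key = "xxxxxx"  # placeholder, as in the question
end_cap = pd.Timestamp("2020-02-05")
urls = []
for s in pd.date_range("2015-01-01", end_cap, freq="MS"):
    # Month end, clipped so the final request stops at 2020-02-05.
    e = min(s + pd.offsets.MonthEnd(0), end_cap)
    urls.append(
        "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json"
        f"?apiKey={api_key}&units=e&startDate={s:%Y%m%d}&endDate={e:%Y%m%d}"
    )

# Fetch each month with the existing get_data/headers and stack the results:
# frames = [get_data(requests.get(u, headers=headers, verify=False)) for u in urls]
# result = pd.concat(frames).to_frame("wspd")
print(len(urls))  # 62 monthly requests
```

Stacking the per-month daily series with `pd.concat` directly yields the desired two-column layout (date index plus a `wspd` column) without transposing.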
Use Series.where:

s = df_transposed.index.to_series()
df_transposed = df_transposed.where((s >= '2015-01-01') & (s <= '2020-02-05'), 'XXX')

EDIT

s = df_transposed.index.to_series()
df_transposed = df_transposed.where((s >= pd.to_datetime('2015-01-01')) &
                                    (s <= pd.to_datetime('2020-02-05')), 'XXX')
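To see what the where call does, here is a toy frame standing in for df_transposed (the values are made up). Rows outside the range are replaced; I use NaN here rather than the 'XXX' string to keep the column numeric, and reshape the mask to (n, 1) so it matches the single-column frame:

```python
import pandas as pd

# Toy single-column frame with a datetime index (hypothetical values).
df_t = pd.DataFrame(
    {"wspd": [5.0, 18.7, 7.1, 3.2]},
    index=pd.to_datetime(["2014-12-31", "2015-01-01", "2020-02-05", "2020-02-06"]),
)
s = df_t.index.to_series()
mask = (s >= pd.to_datetime("2015-01-01")) & (s <= pd.to_datetime("2020-02-05"))
# Shape the boolean mask (n, 1) so it matches the frame's shape exactly.
out = df_t.where(mask.to_numpy()[:, None])  # out-of-range rows become NaN
```

If you want to drop the out-of-range rows instead of masking them, plain boolean indexing, `df_t[mask.to_numpy()]`, does that directly.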