I have Python code that pulls data from a third-party API. Below is the code.
for sub in sublocation_ids:
    city_num_int = sub['id']
    city_num_str = str(city_num_int)
    city_name = sub['name']
    filter_text_new = filter_text.format(city_num_str)
    data = json.dumps({"filters": [filter_text_new], "sort_by": "fb_tw_and_li", "size": 200, "from": 1580491663000, "to": 1588184960000, "content_type": "stories"})
    r = requests.post(url=api_endpoint, data=data).json()
    if r['articles'] != empty_list:
        articles_list = r["articles"]
        time.sleep(5)
        articles_list_normalized = json_normalize(articles_list)
        df = articles_list_normalized
        df['publication_timestamp'] = pd.to_datetime(df['publication_timestamp'])
        df['publication_timestamp'] = df['publication_timestamp'].apply(lambda x: x.strftime('%Y-%m-%d'))
        df['citystate'] = city_name
        df = df.drop('has_video', axis=1)
        df.to_excel(writer, sheet_name=city_name)
writer.save()
Here city_num_int = sub['id'] is a unique ID for each city. The API returns a "videos" column for a few cities but not for the others. I want to get rid of that "videos" column before the data is written to the Excel file.
I was able to drop the "has_video" column using df.drop, as that column is present in every city's data pull. But how do I conditionally drop the "videos" column, given that it is only present for a few cities?
You can tell DataFrame.drop to ignore labels that do not exist:
df = df.drop(['videos'], axis=1, errors='ignore')
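As a rough illustration of how this fits the question, both columns can be dropped in one call, and the same call works whether or not "videos" is present. The two small frames and the helper name drop_video_columns below are made up for the example; they only stand in for the API responses.

```python
import pandas as pd

# Hypothetical stand-ins for two city responses: one with "videos", one without.
with_videos = pd.DataFrame({"title": ["a"], "has_video": [True], "videos": [["v1"]]})
without_videos = pd.DataFrame({"title": ["b"], "has_video": [False]})

def drop_video_columns(df):
    # errors="ignore" silently skips any listed label that is not in the frame.
    return df.drop(columns=["has_video", "videos"], errors="ignore")

cleaned_a = drop_video_columns(with_videos)
cleaned_b = drop_video_columns(without_videos)
print(list(cleaned_a.columns))
print(list(cleaned_b.columns))
```

In the loop from the question, this would replace the existing df.drop line just before df.to_excel.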
Another way is to first check whether the column is present in the DataFrame, and only then delete it.
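A minimal sketch of that check, assuming df is the normalized frame from the question (the sample data here is invented for illustration):

```python
import pandas as pd

# Invented sample frame standing in for one city's normalized articles.
df = pd.DataFrame({"title": ["a", "b"], "videos": [["v1"], []]})

# Only drop the column when the API actually returned it for this city.
if "videos" in df.columns:
    df = df.drop(columns="videos")

print(list(df.columns))
```

The explicit check makes the intent visible at the call site, at the cost of one extra line compared with errors='ignore'.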
Ref: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html
You can use a list comprehension on the column names to keep everything except "videos":
cols_to_keep = [c for c in df.columns if c != "videos"]
df = df[cols_to_keep]