plt.scatter plot turns out blank
The code below returns a blank plot in Python:
# import libraries
import pandas as pd
import os
import matplotlib.pyplot as plt
import numpy as np
os.chdir('file path')
# import data files (raw strings keep the backslashes in Windows paths literal)
activity = pd.read_csv(r'file path\dailyActivity_merged.csv')
intensity = pd.read_csv(r'file path\hourlyIntensities_merged.csv')
steps = pd.read_csv(r'file path\hourlySteps_merged.csv')
sleep = pd.read_csv(r'file path\sleepDay_merged.csv')
# ActivityDate in activity df only includes dates (no time). Rename it Dates
activity = activity.rename(columns={'ActivityDate': 'Dates'})
# ActivityHour in intensity df and steps df includes date-time. Split date-time column into dates and times in intensity. Drop the date-time column
intensity['Dates'] = pd.to_datetime(intensity['ActivityHour']).dt.date
intensity['Times'] = pd.to_datetime(intensity['ActivityHour']).dt.time
intensity = intensity.drop(columns=['ActivityHour'])
# split date-time column into dates and times in steps. Drop the date-time column
steps['Dates'] = pd.to_datetime(steps['ActivityHour']).dt.date
steps['Times'] = pd.to_datetime(steps['ActivityHour']).dt.time
steps = steps.drop(columns=['ActivityHour'])
# split date-time column into dates and times in sleep. Drop the date-time column
sleep['Dates'] = pd.to_datetime(sleep['SleepDate']).dt.date
sleep['Times'] = pd.to_datetime(sleep['SleepDate']).dt.time
sleep = sleep.drop(columns=['SleepDate', 'TotalSleepRecords'])
# add a column & calculate time_awake_in_bed before falling asleep
sleep['time_awake_in_bed'] = sleep['TotalTimeInBed'] - sleep['TotalMinutesAsleep']
# merge activity and sleep
merge_keys = ['Id', 'Dates']  # renamed from "list" to avoid shadowing the built-in
activity_sleep = sleep.merge(activity, on=merge_keys, how='outer')
# plot relation between calories used daily vs how long it takes users to fall asleep
plt.scatter(activity_sleep['time_awake_in_bed'], activity_sleep['Calories'], s=20, c='b', marker='o')
plt.axis([0, 200, 0, 5000])
plt.show()
NOTE: max(Calories) = 4900 and min(Calories) = 0; max(time_awake_in_bed) = 150 and min(time_awake_in_bed) = 0.
Please let me know how I can get a scatter plot out of this. Thank you in advance for any help.
The same variables from the same data frame work perfectly with geom_point() in R.
I found where the problem was. As @Redox and @cheersmate mentioned in the comments, the data frame that I created by merging included NaN values. I fixed this by merging only on 'Id'. Then I could create a scatter plot:
merge_keys = ['Id']  # renamed from "list" to avoid shadowing the built-in
activity_sleep = sleep.merge(activity, on=merge_keys, how='outer')
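To see why the original two-key merge left nothing to plot, a small check along these lines may help. It is a minimal sketch on made-up data: the frames `sleep_demo` and `activity_demo` and all their values are hypothetical, not the real CSV files. It uses the `indicator=True` option of `merge` to report where each merged row came from, then counts the rows that have both plotted values.

```python
import datetime as dt
import pandas as pd

# Hypothetical stand-ins for the real data frames (all values are invented).
# As in the question, the sleep dates are datetime.date objects, while the
# activity dates stay as the text that read_csv returns.
sleep_demo = pd.DataFrame({
    'Id': [1, 1, 2],
    'Dates': [dt.date(2016, 4, 12), dt.date(2016, 4, 13), dt.date(2016, 4, 12)],
    'time_awake_in_bed': [20, 35, 10],
})
activity_demo = pd.DataFrame({
    'Id': [1, 1, 2],
    'Dates': pd.Series(['4/12/2016', '4/13/2016', '4/12/2016'], dtype=object),
    'Calories': [1985, 1797, 2100],
})

merged = sleep_demo.merge(activity_demo, on=['Id', 'Dates'],
                          how='outer', indicator=True)

# '_merge' tells whether a row matched in both frames or came from one only.
print(merged['_merge'].value_counts())

# Only rows with both values present can be drawn as points.
plottable = merged.dropna(subset=['time_awake_in_bed', 'Calories'])
print(len(plottable), 'of', len(merged), 'rows have both values')
```

If the count of rows with both values is zero, every row is missing one of the two plotted columns and the scatter plot has no points to draw.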
The column "Dates" is not a good one to merge on, as in each data frame the same dates are repeated in multiple rows. I also noticed that I get the same plot whether I use an outer or an inner merge. Thank you.
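A possible alternative, offered as a hedged sketch rather than the fix used above: in the question's code the activity 'Dates' column is only renamed, so it keeps the text values read from the CSV, while the sleep 'Dates' column holds datetime.date objects. If that type mismatch is what stopped the keys from matching (an assumption, since the real files are not shown), parsing both columns the same way lets the merge use 'Id' and 'Dates' together. The frames, date formats, and values below are invented for illustration.

```python
import pandas as pd

# Invented sample rows standing in for the real CSV contents.
activity_demo = pd.DataFrame({
    'Id': [1, 1, 2],
    'ActivityDate': ['4/12/2016', '4/13/2016', '4/12/2016'],
    'Calories': [1985, 1797, 2100],
})
sleep_demo = pd.DataFrame({
    'Id': [1, 1, 2],
    'SleepDate': ['4/12/2016 12:00:00 AM', '4/13/2016 12:00:00 AM',
                  '4/12/2016 12:00:00 AM'],
    'TotalMinutesAsleep': [327, 384, 400],
    'TotalTimeInBed': [346, 407, 410],
})

# Parse both date columns the same way so the merge keys share one type.
activity_demo['Dates'] = pd.to_datetime(activity_demo['ActivityDate']).dt.date
sleep_demo['Dates'] = pd.to_datetime(sleep_demo['SleepDate']).dt.date
sleep_demo['time_awake_in_bed'] = (sleep_demo['TotalTimeInBed']
                                   - sleep_demo['TotalMinutesAsleep'])

# An inner merge keeps only the Id/date pairs present in both frames.
same_day = sleep_demo.merge(activity_demo, on=['Id', 'Dates'], how='inner')
print(same_day[['Id', 'Dates', 'time_awake_in_bed', 'Calories']])
```

Merged this way, each sleep record is paired with the calories for the same user and day. Merging on 'Id' alone pairs every sleep row of a user with every activity row of that user, which also yields points but mixes different days.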