plt.scatter plot turns out blank

The code below returns a blank plot in Python:

# import libraries
import pandas as pd
import os
import matplotlib.pyplot as plt
import numpy as np

os.chdir('file path')

# import data files
# raw strings keep the backslashes in the Windows paths from being read as escape sequences
activity = pd.read_csv(r'file path\dailyActivity_merged.csv')
intensity = pd.read_csv(r'file path\hourlyIntensities_merged.csv')
steps = pd.read_csv(r'file path\hourlySteps_merged.csv')
sleep = pd.read_csv(r'file path\sleepDay_merged.csv')

# ActivityDate in activity df only includes dates (no time). Rename it Dates
activity = activity.rename(columns={'ActivityDate': 'Dates'})

# ActivityHour in intensity df and steps df includes date-time. Split date-time column into dates and times in intensity. Drop the date-time column

intensity['Dates'] = pd.to_datetime(intensity['ActivityHour']).dt.date
intensity['Times'] = pd.to_datetime(intensity['ActivityHour']).dt.time
intensity = intensity.drop(columns=['ActivityHour'])

# split date-time column into dates and times in steps. Drop the date-time column

steps['Dates'] = pd.to_datetime(steps['ActivityHour']).dt.date
steps['Times'] = pd.to_datetime(steps['ActivityHour']).dt.time
steps = steps.drop(columns=['ActivityHour'])

# split date-time column into dates and times in sleep. Drop the date-time column

sleep['Dates'] = pd.to_datetime(sleep['SleepDate']).dt.date
sleep['Times'] = pd.to_datetime(sleep['SleepDate']).dt.time
sleep = sleep.drop(columns=['SleepDate', 'TotalSleepRecords'])

# add a column & calculate time_awake_in_bed before falling asleep

sleep['time_awake_in_bed'] = sleep['TotalTimeInBed'] - sleep['TotalMinutesAsleep']

# merge activity and sleep
merge_keys = ['Id', 'Dates']  # avoid shadowing the built-in name `list`
activity_sleep = sleep.merge(activity,
                on = merge_keys,
                how = 'outer')

# plot relation between calories used daily vs how long it takes users to fall asleep

plt.scatter(activity_sleep['time_awake_in_bed'], activity_sleep['Calories'], s=20, c='b', marker='o')
plt.axis([0, 200, 0, 5000])
plt.show()

NOTE: max(Calories) = 4900 and min(Calories) = 0. max(time_awake_in_bed) = 150 and min(time_awake_in_bed) = 0.

Please let me know how I can get a scatter plot out of this. Thank you in advance for any help.

The same variables from the same data frame work perfectly with geom_point() in R.

I found where the problem was. As @Redox and @cheersmate mentioned in the comments, the data frame that I created by merging included NaN values. I fixed this by merging only on 'Id'. Then I could create a scatter plot:

merge_keys = ['Id']  # avoid shadowing the built-in name `list`
activity_sleep = sleep.merge(activity,
                on = merge_keys,
                how = 'outer')
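
As a quick way to confirm the NaN diagnosis, the sketch below counts how many rows can actually be drawn. plt.scatter does not draw a point whose x or y value is NaN, so a frame in which every row is missing one of the two columns gives an empty plot. The helper name plottable_rows and the small frame are made up for illustration; they are not part of the original data.

```python
import numpy as np
import pandas as pd


def plottable_rows(df, x, y):
    """Return only the rows where both plotted columns hold a value."""
    return df.dropna(subset=[x, y])


# Made-up frame shaped like an outer merge in which no keys matched:
# every row comes from one side only, so the other side's column is NaN.
merged = pd.DataFrame({
    'time_awake_in_bed': [20.0, 35.0, np.nan, np.nan],
    'Calories': [np.nan, np.nan, 1985.0, 1797.0],
})

# NaN count per column, then the number of rows scatter could draw.
print(merged[['time_awake_in_bed', 'Calories']].isna().sum())
print(len(plottable_rows(merged, 'time_awake_in_bed', 'Calories')))
```

If the second number is zero for the real activity_sleep frame, the blank plot comes from the merge rather than from the plt.scatter call or the axis limits.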

The column "Dates" is not a good one to merge on, because in each data frame the same dates are repeated in multiple rows. I also noticed that I get the same plot whether I use an outer or an inner merge. Thank you.
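
A possible alternative, offered only as a sketch: in the question's code the "Dates" column of sleep is converted to date objects, while the renamed "Dates" column of activity is left as it was read from the CSV. Assuming that column was read as text, the two key columns would never compare equal, which by itself would explain why the two-key merge produced only NaN rows. The values and the '%m/%d/%Y' format below are invented for illustration and may not match the real files.

```python
import pandas as pd

# Made-up miniature frames standing in for the real CSV data.
sleep = pd.DataFrame({
    'Id': [1, 1],
    # date objects, as produced by .dt.date in the question's code
    'Dates': pd.to_datetime(pd.Series(['4/12/2016', '4/13/2016']),
                            format='%m/%d/%Y').dt.date,
    'time_awake_in_bed': [20, 35],
})
activity = pd.DataFrame({
    'Id': [1, 1],
    'Dates': ['4/12/2016', '4/13/2016'],  # plain strings (assumed)
    'Calories': [1985, 1797],
})

# Strings never equal date objects, so no key pair matches.
unmatched = sleep.merge(activity, on=['Id', 'Dates'], how='inner')

# Convert the activity dates to the same type, then merge on both keys.
activity_fixed = activity.assign(
    Dates=pd.to_datetime(activity['Dates'], format='%m/%d/%Y').dt.date
)
matched = sleep.merge(activity_fixed, on=['Id', 'Dates'], how='inner')

# Merging on 'Id' alone pairs every sleep row with every activity row
# of the same user, regardless of date.
by_id_only = sleep.merge(activity, on='Id', how='inner')

print(len(unmatched), len(matched), len(by_id_only))
```

In this toy data the merge on 'Id' alone returns more rows than either input, because each night of sleep is paired with every day of activity for that user. Whether that pairing is what the plot should show depends on the question being asked of the data.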
