Using matplotlib/pandas/python, I cannot visualize data as values per 30mins and per days
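I am analyzing a CSV file with Matplotlib / Python.
After importing the CSV file, I successfully plotted a graph and visualized the energy consumption per 30 minutes with the following code. (Thanks! Using Matplotlib, visualize CSV data)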
import csv
import datetime

import numpy as np
from matplotlib import pylab as plt
from matplotlib import style

style.use('ggplot')

filename = 'total_watt.csv'
date = []
number = []

# Read (timestamp, value) pairs; on Python 3 the csv module expects text mode with newline=''
with open(filename, newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in csvreader:
        if len(row) == 2:
            date.append(row[0])
            number.append(row[1])

# Convert the value strings to floats so they land on a numeric axis
number = np.array(number, dtype=float)

# Parse the timestamp strings into datetime objects
for ii in range(len(date)):
    date[ii] = datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')

plt.plot(date, number)
plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
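But the problem is that I cannot visualize the energy consumption per day...
------------ Edit (thanks, Florian!) ------------
I installed pandas and added pandas code to my script.
My code now looks like this: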
from matplotlib import style
from matplotlib import pylab as plt
import numpy as np
import pandas as pd
style.use('ggplot')
filename='total_watt.csv'
date=[]
number=[]
import csv
with open(filename, 'rb') as csvfile:
    df = pd.read_csv('total_watt.csv', parse_dates=[0], index_col=[0])
    df.resample('1D', how='sum')
    for row in df:
        if len(row) == 2 :
            date.append(row[0])
            number.append(row[1])
number=np.array(number)
import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')
plt.plot(date,number)
plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
When I run this code, I get no errors, but nothing is drawn in my plot. How can I solve this?
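A likely reason the plot stays empty can be reproduced in a few lines. This is only a sketch: the two inline rows are a hypothetical stand-in for total_watt.csv, and the column names `timestamp` and `value` are assumptions, since the real file's header is not shown. It illustrates two things: iterating over a DataFrame yields column labels rather than rows, and `resample` returns a new object instead of modifying `df` in place.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for total_watt.csv (assumed to have no header row).
sample = io.StringIO(
    "2011-04-18 16:52:00,152.684299188514\n"
    "2011-04-18 17:22:00,327.579073188405\n"
)
df = pd.read_csv(sample, header=None, names=["timestamp", "value"],
                 parse_dates=[0], index_col=[0])

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [item for item in df]
print(labels)

# No label has length 2, so a loop guarded by `len(row) == 2` appends nothing.
matched = [item for item in df if len(item) == 2]
print(matched)

# resample(...).sum() returns a new DataFrame; df itself is left unchanged.
daily = df.resample("1D").sum()
print(len(df), len(daily))
```

With both lists left empty, `plt.plot(date, number)` has nothing to draw, which would match the symptom described above.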
Using pandas and its resample function can make your life easier.
import io
import pandas as pd

# Sample data; the timestamp and value columns are separated by two or more spaces
content = '''timestamp  value
2011-04-18 16:52:00  152.684299188514
2011-04-18 17:22:00  327.579073188405
2011-04-18 17:52:00  156.826945856169
2011-04-18 18:22:00  330.202764488018
2011-04-18 18:52:00  1118.60404324133
2011-04-18 19:22:00  243.972250782998
2011-04-18 19:52:00  852.88815851216
2011-04-18 20:22:00  491.859992982456
2011-04-18 20:52:00  466.738983617709
2011-04-18 21:22:00  659.670303375527
2011-04-18 21:52:00  576.304871428571
2011-04-18 22:22:00  2497.20620579196
2011-04-18 22:52:00  2790.20392088608
2011-04-18 23:22:00  1092.20906629318
2011-04-18 23:52:00  825.994417375886
2011-04-19 00:22:00  2397.16672089666
2011-04-19 00:52:00  1411.66659265233
2011-04-19 01:22:00  2379.18391111111
2011-04-19 01:52:00  841.224212511672
2011-04-19 02:22:00  471.520308479532
2011-04-19 02:52:00  1189.78122544232
2011-04-19 03:22:00  343.7574197609
2011-04-19 03:52:00  336.486834795322
2011-04-19 04:22:00  541.401434220355
2011-04-19 04:52:00  316.106452883263
2011-04-19 05:22:00  502.502274561404
2011-04-19 05:52:00  314.832323976608
'''

# Split on runs of 2+ whitespace so the single space inside each timestamp is kept
df = pd.read_table(io.StringIO(content), sep=r'\s{2,}', parse_dates=[0], index_col=[0], engine='python')
See the documentation here: http://pandas-docs.github.io/pandas-docs-travis/
df = df.resample('30min').sum()  # older pandas releases wrote this as resample('30min', how='sum')
Out[496]:
value
timestamp
2011-04-18 16:30:00 152.684299
2011-04-18 17:00:00 327.579073
2011-04-18 17:30:00 156.826946
2011-04-18 18:00:00 330.202764
2011-04-18 18:30:00 1118.604043
2011-04-18 19:00:00 243.972251
2011-04-18 19:30:00 852.888159
2011-04-18 20:00:00 491.859993
2011-04-18 20:30:00 466.738984
2011-04-18 21:00:00 659.670303
2011-04-18 21:30:00 576.304871
2011-04-18 22:00:00 2497.206206
2011-04-18 22:30:00 2790.203921
2011-04-18 23:00:00 1092.209066
2011-04-18 23:30:00 825.994417
2011-04-19 00:00:00 2397.166721
2011-04-19 00:30:00 1411.666593
2011-04-19 01:00:00 2379.183911
2011-04-19 01:30:00 841.224213
2011-04-19 02:00:00 471.520308
2011-04-19 02:30:00 1189.781225
2011-04-19 03:00:00 343.757420
2011-04-19 03:30:00 336.486835
2011-04-19 04:00:00 541.401434
2011-04-19 04:30:00 316.106453
2011-04-19 05:00:00 502.502275
2011-04-19 05:30:00 314.832324
df = df.resample('1D').sum()  # older pandas releases wrote this as resample('1D', how='sum')
Out[497]:
                   value
timestamp
2011-04-18  12582.945297
2011-04-19  11045.629711
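To get from the daily totals to an actual figure, the resampled frame can be handed to matplotlib directly. The sketch below is one possible way to do that, not the only one. It assumes a comma-separated file with no header row, uses four rows from the sample above as inline data, and the column names and the output file name `daily_total.png` are illustrative assumptions. The `Agg` backend is chosen only so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show() instead
import matplotlib.pyplot as plt
import pandas as pd

# Four rows spanning two days, standing in for total_watt.csv (assumed headerless).
sample = io.StringIO(
    "2011-04-18 23:22:00,1092.20906629318\n"
    "2011-04-18 23:52:00,825.994417375886\n"
    "2011-04-19 00:22:00,2397.16672089666\n"
    "2011-04-19 00:52:00,1411.66659265233\n"
)
df = pd.read_csv(sample, header=None, names=["timestamp", "value"],
                 parse_dates=[0], index_col=[0])

# Sum the readings per calendar day and keep the result in a new variable.
daily = df.resample("1D").sum()

# Plot one point per day; the DatetimeIndex supplies the x values.
fig, ax = plt.subplots()
ax.plot(daily.index, daily["value"], marker="o")
ax.set_title("Daily total")
ax.set_xlabel("Day")
ax.set_ylabel("Sum of values")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("daily_total.png")  # hypothetical output name
```

For the 30-minute view, the same plotting code should work with `df.resample("30min").sum()` in place of the daily resample.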
Hope this helps!