简体   繁体   中英

Error message : x and y must have same first dimension, but have shapes (23,) and (1,)

I have an excel file and I wanna plot a specific range of rows in a specific column I could get the access to the column and rows that I wanted but whenever I run my code this error message turns up

x and y must have same first dimension, but have shapes (23,) and (1,) the first row of the consumption sheet is :

在此处输入图片说明

import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt
import math

start_date = "2017-07-24 00:00:00"
end_date = "2019-03-09 23:00:00"

data_generation2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp")

data_consumption2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp", sheetname="Consumption")

data_generation = data_generation2.loc[start_date:end_date] 
data_consumption = data_consumption2.loc[start_date:end_date] 
data_consumption2["Timestamp"] = 
pd.to_datetime(data_consumption2["Timestamp"], format="%d/%m/%y %H%M:%S")
df_to_plot = data_consumption2.loc[(data_consumption2.Timestamp >= start_date) & (data_consumption2.Timestamp < end_date)]
df_to_plot.set_index(df_to_plot.Timestamp)
df_to_plot = df_to_plot[df_to_plot.columns[1:]]  

I wanna the output to be the plot of the first 720 rows in the column 2053G with respect to the time column.

I will try to give you some tips.

Once you have read your data, you should transform the date column (string) to a datetime format. The to_datetime function do the job (doc) .

Then, you need to filter by date . As you have done, you can use loc to do that.

Now, we want to plot the dataframe. Pandas module provides a plot method (doc) . Let's try to use it. By default, it will plot all columns. In our case, we want to plot consumption over date. One way to do is to transform the date columns as an index. The set_index method offer you this possibily (doc) .

Once it's done, you are ready to make the plot !

Here is an exemple assuming having the following dataframe:

  • First column is the date (dd/MM/YYYY HH:MM:ss)
  • The next twenty columns are the electricity consumption per house
import pandas as pd
import matplotlib.pyplot as plt

start_date = "2019-01-01 00:00:00"
end_date = "2019-01-02 00:00:00"

data_generation2 = pd.read_excel("your_path")

# transform date column (type string) to datetime type
data_generation2["Date"] = pd.to_datetime(data_generation2["Date"], format="%d/%m/%y %H%M:%S")
print(data_generation2)
#                      Date    House1    House2    House3    House4  ...   House17   House18   House19   House20
# 0   2019-01-01 00: 00: 00  5.087228  6.416912  0.997723  4.094832  ...  0.462354  5.105688  7.504992  4.688377
# 1   2019-01-01 00: 01: 00  5.343098  3.205145  8.874018  0.567703  ...  5.593759  3.584948  5.503817  7.027469
# 2   2019-01-01 00: 02: 00  5.894551  6.411896  3.208524  6.582306  ...  9.149066  5.485656  0.820700  6.391030
# 3   2019-01-01 00: 03: 00  5.674840  6.558950  3.043355  6.759310  ...  2.270176  3.513524  1.833294  9.889432
# 4   2019-01-01 00: 04: 00  9.022593  4.088398  6.082084  8.095797  ...  9.270887  4.947194  2.505969  7.396191
# 5   2019-01-01 00: 05: 00  9.217690  1.950666  5.684551  7.782337  ...  6.990156  1.430377  2.269080  7.343031
#                       ...                                                              ...
# 715 2019-01-30 01: 09: 00  6.892563  0.553806  0.744839  3.151984  ...  9.418145  2.575782  0.801205  4.290434
# 716 2019-01-30 02: 00: 00  0.894091  8.872986  8.445927  2.781972  ...  0.832707  4.264321  6.972225  4.155494
# 717 2019-01-30 02: 01: 00  0.079396  5.270916  0.452875  5.248232  ...  5.331906  4.026245  6.598595  1.637325
# 718 2019-01-30 02: 02: 00  5.287946  4.745940  8.479177  7.713943  ...  0.642650  9.735159  2.499371  7.824247
# 719 2019-01-30 02: 03: 00  1.376087  6.262784  5.621571  4.937615  ...  0.801079  5.506654  6.878042  2.430062
# [720 rows x 23 columns]

# Filter according the date interval
df_to_plot = data_generation2.loc[(data_generation2.Date >= start_date) & (data_generation2.Date < end_date)]

# Transform the columns of date as index
df_to_plot.set_index(df_to_plot.Date)
df_to_plot = df_to_plot[df_to_plot.columns[1:]] # Remove date columns
print(df_to_plot)
# [720 rows x 21 columns]
#       House1    House2    House3  ...   House17   House18   House19   House20
# 0   5.087228  6.416912  0.997723  ...  0.462354  5.105688  7.504992  4.688377
# 1   5.343098  3.205145  8.874018  ...  5.593759  3.584948  5.503817  7.027469
# 2   5.894551  6.411896  3.208524  ...  9.149066  5.485656  0.820700  6.391030
# 3   5.674840  6.558950  3.043355  ...  2.270176  3.513524  1.833294  9.889432
# 4   9.022593  4.088398  6.082084  ...  9.270887  4.947194  2.505969  7.396191
# 5   9.217690  1.950666  5.684551  ...  6.990156  1.430377  2.269080  7.343031
# ...
# 20  8.908322  7.452523  9.046442  ...  9.786092  6.370266  5.635464  8.560888
# 21  7.666228  1.294614  2.034965  ...  0.091431  3.210088  9.038868  1.492585
# 22  8.266986  6.209395  9.689687  ...  6.500988  6.133358  2.237199  7.289748
# 23  2.687560  6.852061  5.136214  ...  0.815944  8.267767  0.244232  7.705181
# [24 rows x 20 columns]



# create a figure (2 x 2)
fig, axes = plt.subplots(nrows=2, ncols=2)

# Daily chart
ax = df_to_plot.plot(ax=axes[0, 0], title="Daily consumption", legend=False)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")

# Cumulative chart
ax = df_to_plot.sort_index().cumsum().plot(ax=axes[0, 1], title="Cumuative consumption")
ax.set_xlabel("Days")
ax.set_ylabel("Cumulative consumption")
ax.legend(ncol=3)

# Histogram chart house 1
ax = df_to_plot.House1.plot(ax=axes[1, 0], kind="bar",
                       title="House 1 daily consumption", rot=1)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")

# Pie chart
df_to_plot.sum().sort_values().plot(
    ax=axes[1, 1], kind="pie", title="Pie chart consumption")

# Main title
fig.suptitle("From " + start_date +' to ' +end_date + ' summary')
plt.show()

在此处输入图片说明

The main task you have to do is to rebuilt your dataframe. I can not to it for you. Once it's done, the previous code can give you some ideas.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM