Python Pandas - Can Dataframe have multiple indexes?

I have a dataset in CSV which I read with:

df = pd.read_csv(requestfile, header=[0,1], parse_dates= [0])

The resulting DataFrame has 8760 rows (default index 0..8759) and looks like this:

                   time output direct diffuse temperature
                 UTC     kW  kW/m2   kW/m2       deg C
0    2014-01-01 00:00:00  0.000  0.000   0.000       1.495
1    2014-01-01 01:00:00  0.000  0.000   0.000       1.543
2    2014-01-01 02:00:00  0.000  0.000   0.000       1.517

Now I want to process it with gsee (https://github.com/renewables-ninja/gsee), specifically gsee.pv.run_plant_model, but I receive the following error:

File "C:\Data\Solar\gsee-master\gsee\trigon.py", line 183, in aperture_irradiance
sunrise_set_times = sun_rise_set_times(direct.index, coords)

File "C:\Data\Solar\gsee-master\gsee\trigon.py", line 56, in sun_rise_set_times
dtindex = pd.DatetimeIndex(datetime_index.to_series().map(pd.Timestamp.date).unique())

File "C:\Users\XX\Anaconda3\lib\site-packages\pandas\core\series.py", line 2177, in map
new_values = map_f(values, arg)

File "pandas\src\inference.pyx", line 1207, in pandas.lib.map_infer (pandas\lib.c:66124)
TypeError: descriptor 'date' requires a 'datetime.datetime' object but received a 'int'
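The traceback suggests that the index is being passed through pd.Timestamp.date, so integer labels would not work there. A minimal sketch for checking what kind of index a frame carries before passing it on; the two small frames and the helper name has_datetime_index are made up for illustration and are not part of gsee or of my data:

```python
import pandas as pd

# Hypothetical two-row frame with the default integer index.
df_default = pd.DataFrame({"direct": [0.0, 0.1]})

# The same values, indexed by timestamps instead.
df_time = pd.DataFrame(
    {"direct": [0.0, 0.1]},
    index=pd.to_datetime(["2014-01-01 00:00:00", "2014-01-01 01:00:00"]),
)


def has_datetime_index(frame):
    """Return True if the frame's index is a DatetimeIndex (illustrative helper)."""
    return isinstance(frame.index, pd.DatetimeIndex)


print(has_datetime_index(df_default))
print(has_datetime_index(df_time))
```

Only the second frame passes the check, which matches the complaint in the TypeError about receiving an 'int'.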

I assumed the fault was in my default integer index, so I changed the CSV reading to use the 'time' column as the index:

df = pd.read_csv(requestfile, header=[0,1], index_col=0, parse_dates= [0])

time                output direct diffuse temperature
UTC                     kW  kW/m2   kW/m2       deg C
2014-01-01 00:00:00  0.000  0.000   0.000       1.495
2014-01-01 01:00:00  0.000  0.000   0.000       1.543

Now I get the following error:

File "C:\Users\XX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 402, in _init_dict
return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)

File "C:\Users\XX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 5398, in _arrays_to_mgr
index = extract_index(arrays)

File "C:\Users\XX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 5437, in extract_index
raise ValueError('If using all scalar values, you must pass'

ValueError: If using all scalar values, you must pass an index
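In general, pandas raises this ValueError when a DataFrame is built from a dict whose values are all scalars, because it has no way to infer a length. The traceback does not show which dict is being built at this point, so the following is only a sketch of the generic situation, with made-up keys a and b:

```python
import pandas as pd

# A dict of plain scalars gives pandas no length to work with.
try:
    pd.DataFrame({"a": 1, "b": 2})
    message = None
except ValueError as exc:
    message = str(exc)

print(message)

# Supplying an index tells pandas how many rows to create.
ok = pd.DataFrame({"a": 1, "b": 2}, index=[0])
print(ok.shape)
```

So the error may say more about what the values looked like at that point (scalars rather than Series) than about which index I chose.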

If I understand correctly, the first error occurs because my index is plain integers [0..8759] when it should be datetimes, and the second occurs because my index is now datetimes and

index = extract_index(arrays)

no longer finds the original index [0..8759]. Or have I completely misunderstood the scalar value error? Is it possible for a DataFrame to have two indexes, one [0..8759] and the other the 'time' column? How would I do this with pd.read_csv, or by some other method?
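For reference, pandas offers two ways to keep both the row number and the time. A rough sketch on a made-up three-row frame with flat column names; the column name row is my own choice:

```python
import pandas as pd

times = pd.to_datetime(
    ["2014-01-01 00:00:00", "2014-01-01 01:00:00", "2014-01-01 02:00:00"]
)
df = pd.DataFrame({"time": times, "temperature": [1.495, 1.543, 1.517]})

# Option 1: keep the old row number as an ordinary column named "row"
# and use the time column as the only index.
by_time = df.rename_axis("row").reset_index().set_index("time")

# Option 2: a two-level MultiIndex of (row number, time).
both = df.set_index("time", append=True)

print(type(by_time.index).__name__)
print(both.index.nlevels)
```

Note that a MultiIndex is not a DatetimeIndex, so code that expects timestamps in the index would probably still fail with option 2; option 1 looks like the safer choice in that case.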

If it helps, I also do the following with the DataFrame. These columns are used by the run_plant_model function, but because of some beginner mistake they do not show up when I display df:

df.global_horizontal = df.direct + df.diffuse
df.diffuse_fraction = df.diffuse / df.global_horizontal
df.diffuse_fraction = df.diffuse_fraction.fillna(0)

EDIT: I have now added these columns to the DataFrame properly. It had no effect on the error.
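The beginner mistake is that assigning to a new attribute name does not create a column; pandas only sets a Python attribute on the object. A small sketch of the difference, assuming flat column names (my real frame had two header rows at this point) and made-up values:

```python
import warnings

import pandas as pd

base = pd.DataFrame({"direct": [0.0, 0.2], "diffuse": [0.0, 0.1]})

# Attribute assignment: pandas warns and sets a plain attribute, not a column.
wrong = base.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    wrong.global_horizontal = wrong.direct + wrong.diffuse

# Bracket assignment: this actually adds the columns.
right = base.copy()
right["global_horizontal"] = right["direct"] + right["diffuse"]
right["diffuse_fraction"] = (
    right["diffuse"] / right["global_horizontal"]
).fillna(0)  # 0/0 gives NaN, which is replaced by 0

print(list(wrong.columns))
print(list(right.columns))
```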

Function call:

gsee.pv.run_plant_model(df, site.coords, angle, azimuth, tracking, 
                        capacity, technology, system_loss, 
                        angles=None, include_raw_data=False)    

I believe the original question might have been bad:

C:\Users\XX\Anaconda3\lib\site-packages\pandas\indexes\base.py:2683: RuntimeWarning: Cannot compare type 'Timestamp' with type 'str', sort order is undefined for incomparable objects
return this.join(other, how=how, return_indexers=return_indexers)

So I have 'str' where I should have 'Timestamp'?

OK, I found the error; the original question was indeed bad.

Solution:

df = pd.read_csv(requestfile, index_col=[0], parse_dates=[0], skiprows=[1])

I dropped the header argument and told read_csv to skip the row containing the units (strings). The problem was that one of the functions was trying to compare 'Timestamp' values with the unit row ('str').
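To check the effect of this call without the original file, here is a self-contained sketch that reads the first three rows of my data from an in-memory string. It assumes the file is comma-separated, which the excerpt above does not show:

```python
import io

import pandas as pd

# First rows of the data, assuming a comma-separated layout.
csv_text = """time,output,direct,diffuse,temperature
UTC,kW,kW/m2,kW/m2,deg C
2014-01-01 00:00:00,0.000,0.000,0.000,1.495
2014-01-01 01:00:00,0.000,0.000,0.000,1.543
2014-01-01 02:00:00,0.000,0.000,0.000,1.517
"""

# skiprows=[1] drops the units row (second line of the file), so the
# header is the single row of names and every data column stays numeric.
df = pd.read_csv(
    io.StringIO(csv_text), index_col=[0], parse_dates=[0], skiprows=[1]
)

print(type(df.index).__name__)
print(list(df.columns))
print(df.dtypes)
```

With the units row gone, the index holds only timestamps and the column labels are plain strings, so nothing is left that needs comparing against 'UTC' or 'kW'.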
