How to merge a list containing data and datetime64[ns] with a pandas dataframe with datetime64[ns] index

I want to read two columns, S1_max and S2_max, from a dataframe data. Wherever a value is present in the S1_max column, I want to check that each S1_max is succeeded by a corresponding S2_max signal. If so, I calculate the time delta between the S1_max and S2_max signals. This result is then stored, together with the datetime64[ns] index of the S2_max value, in a separate dict d, which is appended to a list delta_data. How can I add this result to my already existing data dataframe at the corresponding datetime64[ns] index?

This is how I create delta_data:

# time between each S2 global maxima: 86 ns/samp freq 200 = 0.43 ns
# Check that each S1 peak is succeeded by a corresponding S2 peak and calculate the time delta:
delta_data = []
diff_S1 = 0
diff_S2 = 0
i = 0
while (i + diff_S1 + 1 < len(peak_indexes_S1)) and (i + diff_S2 < len(peak_indexes_S2)):
    # Find the next S2 peak after the current S1 peak
    while df["S2"].index[peak_indexes_S2[i + diff_S2]] < df["S1"].index[peak_indexes_S1[i + diff_S1]]:
        diff_S2 += 1

    # Skip S1 peaks that are followed by another S1 peak before the S2 peak
    while df["S1"].index[peak_indexes_S1[i + diff_S1 + 1]] < df["S2"].index[peak_indexes_S2[i + diff_S2]]:
        diff_S1 += 1

    i_peak_S2 = peak_indexes_S2[i + diff_S2]
    i_peak_S1 = peak_indexes_S1[i + diff_S1]

    d = {}
    # Note: .microseconds is only the microsecond component of the Timedelta
    d["td"] = (df["S2"].index[i_peak_S2] - df["S1"].index[i_peak_S1]).microseconds
    d["time"] = df["S2"].index[i_peak_S2]
    delta_data.append(d)

    i += 1

time_delta = pd.DataFrame(delta_data)

delta_data printed out (as the time_delta dataframe):

         td                    time
0    355000 2019-08-07 13:06:31.010
1    355000 2019-08-07 13:06:31.850
2    355000 2019-08-07 13:06:32.695

This is my data dataframe:

                               l1        l2        l3        l4        S1        S2    S2_max    S1_max
2019-08-07 13:11:21.485  0.572720  0.353433  0.701320  1.418840  4.939690  2.858326  2.858326       NaN
2019-08-07 13:11:21.490  0.572807  0.353526  0.701593  1.419052  4.939804  2.854604       NaN  4.939804

This dataframe is created by:

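import pandas as pd

data = pd.read_csv('file.txt')
data.columns = ['l1','l2','l3','l4','S1','S2']
with open('file.txt') as f:
    nbrMeasurments = sum(1 for line in f)
# One row every 5 ms ("5L" in older pandas; "L" is the former alias for milliseconds)
data.index = pd.date_range('2019-08-07 13:06:30', periods=nbrMeasurments-1, freq="5ms")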

I have tried DataFrame.combine_first and append.

Also, the same problem occurs when trying to add another dataframe to data. The datetime index of this dataframe has no millisecond component:

                     S3   S4 
Date                                       
2019-08-07 13:06:30         111          61
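
The mismatch between a per-second index and the 5 ms index of data can be illustrated with a small sketch. It assumes that each per-second row should be repeated until the next one arrives, which may or may not be the intent; the frame name other and all values are made up for illustration:

```python
import pandas as pd

# Stand-in for `data`: rows every 5 ms (values are made up).
data = pd.DataFrame(
    {"S1": [0, 1, 2, 3, 4, 5]},
    index=pd.date_range("2019-08-07 13:06:30.990", periods=6, freq="5ms"),
)

# Hypothetical second frame with whole-second timestamps only.
other = pd.DataFrame(
    {"S3": [111, 112], "S4": [61, 62]},
    index=pd.DatetimeIndex(
        ["2019-08-07 13:06:30", "2019-08-07 13:06:31"], name="Date"
    ),
)

# A plain index join only fills rows whose timestamp matches exactly.
exact = data.join(other)

# Reindexing onto data's index with forward fill repeats each
# per-second row until the next per-second timestamp.
filled = data.join(other.reindex(data.index, method="ffill"))

print(exact)
print(filled)
```

With the plain join, only the row at 13:06:31.000 receives values; with the forward-filled reindex every row does.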

As far as I understand, you are trying to append another column to an existing DataFrame.

Here is how to do it:

df1 = pd.DataFrame({'names':['bla', 'blah', 'blahh'], 'values':[1,2,3]})
df2_to_concat = pd.DataFrame({'put_me_as_a_new_column':['row1', 'row2', 'row3']})

pd.concat([df1.reset_index(drop=True), df2_to_concat.reset_index(drop=True)], axis=1)

The reset_index(drop=True) calls give both frames the same default integer index, so concat pairs the rows by position and you don't get NaNs from mismatched index labels or leftover index columns.
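
For the data in the question, the rows need to be matched by timestamp rather than by position, so an index-aligned join may be closer to what is wanted. A minimal sketch, assuming time_delta has the td and time columns shown in the question and that each time value occurs exactly in the index of data; the small frames below are made-up stand-ins:

```python
import pandas as pd

# Stand-in for `data`: a 5 ms DatetimeIndex as in the question (values are made up).
data = pd.DataFrame(
    {"S1": [4.9, 4.8, 4.7, 4.6], "S2": [2.8, 2.9, 2.7, 2.6]},
    index=pd.date_range("2019-08-07 13:06:31.000", periods=4, freq="5ms"),
)

# Stand-in for `time_delta`: one row per matched S1/S2 pair.
time_delta = pd.DataFrame(
    {"td": [355000], "time": [pd.Timestamp("2019-08-07 13:06:31.010")]}
)

# Move the timestamps into the index, then join on the index.
# Rows of `data` without a matching timestamp get NaN in the new column.
merged = data.join(time_delta.set_index("time"))
print(merged)
```

Because join is a left join by default, data keeps all of its rows and td is filled only where the timestamps coincide.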
