Repeat each row in a DataFrame a different number of times according to the difference between two values in the Time column
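The time resolution here is 1 second, and I want to convert it to 10 ms.
I want to change the time resolution in this table from 1 second to 10 milliseconds by taking the time difference between consecutive rows, multiplying it by 100, and repeating each row that many times.
For example: Row[n] will be repeated (Time[n+1] - Time[n]) * 100 times.
When time = 2 s (third row), we have a certain combination of values that stays constant until the next row at time = 22 s (fourth row). The time difference here is 20 s, so I want the third row to be repeated 20 * 100 times.
Row[2] will be repeated (22 - 2) * 100 times.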
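As a quick check of that arithmetic, here is a minimal sketch. Only the 2 s and 22 s time stamps come from the example above; the other values, and the choice to give the last row a count of 0, are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical time stamps in seconds; only 2 and 22 come from the example above.
time_s = pd.Series([0, 1, 2, 22])

# Copies needed per row at 10 ms resolution: (Time[n+1] - Time[n]) * 100.
# The last row has no successor, so its count is set to 0 here (an assumption).
repeats = (time_s.shift(-1) - time_s).mul(100).fillna(0).astype(int)

print(repeats.tolist())  # [100, 100, 2000, 0]
```

The third entry is the (22 - 2) * 100 = 2000 repetitions described for Row[2].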
import pandas as pd
# DataFrames from the Excel sheets
excel_data_Outputs_df = pd.read_excel(".xlsx", sheet_name='Outputs')
excel_data_Inputs_df = pd.read_excel("xlsx", sheet_name='Inputs')
# Exclude all-zero columns
excel_data_Outputs_df = excel_data_Outputs_df.loc[:, (excel_data_Outputs_df != 0).any(axis=0)]
excel_data_Inputs_df = excel_data_Inputs_df.loc[:, (excel_data_Inputs_df != 0).any(axis=0)]
# Get the time difference to the next row and convert it to a count of 10 ms steps
shifted = excel_data_Inputs_df.Time.shift(-1)
excel_data_Inputs_df.Time = (shifted - excel_data_Inputs_df.Time) * 100
# The last row has no next row (NaN), so its count becomes 0
excel_data_Inputs_df['Time'] = excel_data_Inputs_df['Time'].fillna(0)
excel_data_Inputs_df.Time = excel_data_Inputs_df.Time.astype(int)
# Repeat each row by the count now stored in Time
newexcel_data_Inputs_df = excel_data_Inputs_df.loc[excel_data_Inputs_df.index.repeat(excel_data_Inputs_df.Time)].reset_index(drop=True)
print(newexcel_data_Inputs_df)
print(excel_data_Outputs_df)
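Create another column to hold the difference between the column values, to be used as the repeat count, and then do the following: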
import pandas as pd
# Sample dataframe
df = pd.DataFrame({
'id' : ['a', 'b', 'c', 'd'],
'col1' : [4, 5, 6, 7],
'col2' : [3, 2, 4, 3]
})
# Create a new column to hold the difference in column values
# i.e. the number of times each row must be repeated.
df['times'] = df.col1 - df.col2
# create the finalDf with repeated rows
finalDf = df.loc[df.index.repeat(df.times)].reset_index(drop=True)
print(finalDf.head())
The output of the print statement is as follows:
  id  col1  col2  times
0  a     4     3      1
1  b     5     2      3
2  b     5     2      3
3  b     5     2      3
4  c     6     4      2
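Applied to the Time column from the question, the same idea might look like the sketch below. The sample data, the `signal` column, and the helper columns `src` and `Time_ms` are all hypothetical; the sketch also assumes you want to keep the original time and rebuild a 10 ms time axis, which the question does not state explicitly.

```python
import pandas as pd

# Hypothetical input: Time in seconds plus one signal column (names are assumptions).
df = pd.DataFrame({
    'Time': [0, 1, 2, 22],
    'signal': [10, 11, 12, 13],
})

# Number of 10 ms steps until the next row; the last row gets 0 repeats.
df['times'] = (df['Time'].shift(-1) - df['Time']).mul(100).fillna(0).astype(int)

# Repeat each row and keep the original row label in a helper column 'src'.
expanded = df.loc[df.index.repeat(df['times'])].rename_axis('src').reset_index()

# Position of each copy within its original row: 0, 1, 2, ...
step = expanded.groupby('src').cumcount()

# Rebuild the time axis in integer milliseconds to avoid floating-point drift.
expanded['Time_ms'] = expanded['Time'] * 1000 + step * 10
expanded = expanded.drop(columns=['src', 'times'])

print(len(expanded))  # 2200 rows: 100 + 100 + 2000
print(expanded.head())
```

Keeping the repeat count in its own column, as the answer does, leaves the original Time values intact, so the new time stamps can be derived from them afterwards.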