Repeat each row in a DataFrame a different number of times according to the difference between consecutive values in the Time column
The time resolution here is 1 second; I want to convert it to 10 ms.
I want to change the time resolution in that table from 1 s to 10 ms by taking the difference in time between each row and the next, multiplying it by 100, and replicating each row that number of times.
For example, Row[n] will be repeated (Time[n+1] - Time[n]) * 100 times.
When Time = 2 s (third row), we have a certain combination of values that stays the same until the next row, where Time = 22 s (fourth row). The difference in time here is 20 s, so I want the third row to be repeated 20 * 100 times.
Row[2] will be repeated (22 - 2) * 100 = 2000 times.
import pandas as pd
# Dataframes from the Excel sheets
excel_data_Outputs_df = pd.read_excel(".xlsx", sheet_name='Outputs')
excel_data_Inputs_df = pd.read_excel("xlsx", sheet_name='Inputs')
# Exclude all-zero columns
excel_data_Outputs_df = excel_data_Outputs_df.loc[:, (excel_data_Outputs_df != 0).any(axis=0)]
excel_data_Inputs_df = excel_data_Inputs_df.loc[:, (excel_data_Inputs_df != 0).any(axis=0)]
# Get the time difference to the next row and convert it to 10 ms steps
shifted = excel_data_Inputs_df.Time.shift(-1)
excel_data_Inputs_df.Time = (shifted - excel_data_Inputs_df.Time) * 100
# The last row has no next row, so its repeat count is set to 0
excel_data_Inputs_df['Time'] = excel_data_Inputs_df['Time'].fillna(0)
excel_data_Inputs_df.Time = excel_data_Inputs_df.Time.astype(int)
# Repeat Rows
newexcel_data_Inputs_df = excel_data_Inputs_df.loc[excel_data_Inputs_df.index.repeat(excel_data_Inputs_df.Time)].reset_index(drop=True)
print(newexcel_data_Inputs_df)
print(excel_data_Outputs_df)
Create another column to hold the difference between the column values, to use as the repeat count, and then do the operation like this:
import pandas as pd
# Sample dataframe
df = pd.DataFrame({
    'id': ['a', 'b', 'c', 'd'],
    'col1': [4, 5, 6, 7],
    'col2': [3, 2, 4, 3]
})
# Create a new column to hold the difference in column values,
# i.e. the number of times each row must be repeated.
df['times'] = df.col1 - df.col2
# create the finalDf with repeated rows
finalDf = df.loc[df.index.repeat(df.times)].reset_index(drop=True)
print(finalDf.head())
The output of the print statement looks like:
  id  col1  col2  times
0  a     4     3      1
1  b     5     2      3
2  b     5     2      3
3  b     5     2      3
4  c     6     4      2
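The same idea can be applied to the Time column from the question. The following is only a minimal sketch under some assumptions: the sample values and the `signal` column are made up for illustration, the last row (which has no following row) is kept once rather than dropped, and the Time column is rebuilt on a 10 ms grid after repeating, which the question does not explicitly ask for.

```python
import pandas as pd

# Hypothetical sample data: 'Time' is in seconds, 'signal' is an invented value column.
df = pd.DataFrame({
    "Time": [0, 1, 2, 22],
    "signal": [10, 11, 12, 13],
})

STEPS_PER_SECOND = 100  # 1 s / 10 ms

# Number of 10 ms steps each row covers: (next Time - this Time) * 100.
# Assumption: the last row has no successor, so it is kept exactly once.
repeats = (
    df["Time"].shift(-1).sub(df["Time"])
    .mul(STEPS_PER_SECOND)
    .fillna(1)
    .round()
    .astype(int)
)

# Repeat each row by its count; the original index labels are kept for now.
expanded = df.loc[df.index.repeat(repeats)].copy()

# Position of each copy within its original row (0, 1, 2, ...),
# used to advance Time in 10 ms increments.
step = expanded.groupby(level=0).cumcount().to_numpy()
expanded["Time"] = expanded["Time"].to_numpy() + step / STEPS_PER_SECOND

expanded = expanded.reset_index(drop=True)
print(expanded.head())
print(len(expanded))
```

With these sample values the third row (Time = 2) is repeated (22 - 2) * 100 = 2000 times. If the last row should be dropped instead, as in the question's code, `fillna(0)` can be used in place of `fillna(1)`.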