How can I delete the lines in between with Python?

Question

I have a weird file format:

###########################################################
# Name of file#
# stuff[hh:mm:ss:ms] stuff[num] stuff[num] stuff[] stuff[]#
###########################################################
00:00:00.000 -1000 -1000 0.000001 20
00:00:00.001 -1000 -1000 0.000001 20
00:00:00.002 -1000 -1000 0.000001 20
00:00:00.003 -1000 -1000 0.000001 20
00:00:00.004 -1000 -1000 0.000001 20
00:00:00.005 -1000 -1000 0.000001 20
00:00:00.006 -1000 -1000 0.000001 20
00:00:00.007 -1000 -1000 0.000001 20

The problem is that I only need the information every 2 seconds, which means I need to drop the 1999 lines in between (the separator is actually \t). What is the best way to do that? I would also like the numbers to be stored as numbers, not strings.

import pandas as pd

df = pd.read_csv('file.txt', sep="\t",
                 names=("time", "num1", "num2", "num3", "num4"), skiprows=4)
df["abs_time"] = df.index * 1e-3

I had to define the time differently. I already have the code for that; I just need to save it properly.

def get_sec(time_str):
    m, s, ss = time_str.split(':')
    return int(m) * 60 + int(s) + 0.01*int(ss)

Any help is much appreciated.

Answer 1

Since you need data every 2 seconds, select the rows whose seconds value is even and whose timestamp ends in ".000" (you could choose odd seconds as well). This assumes no data is missing.

def is_select(time_str):
    # Keep timestamps on a whole second (".000") whose seconds field is even.
    return time_str.endswith(".000") and int(time_str[6:8]) % 2 == 0

df['even_seconds'] = df.apply(lambda x: is_select(x["time"]), axis=1)
select_data = df[df.even_seconds]

x["time"][6:8] gives you the seconds field (adjust the indices to match your format).

Of course, you can modify the lambda function to select other data.
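The same selection can also be written without a row-wise apply. The following is a minimal sketch, assuming the "time" column holds hh:mm:ss.mmm strings; the small DataFrame is made-up sample data, not the real file.

```python
import pandas as pd

# Toy frame standing in for the real data; "time" holds strings.
df = pd.DataFrame({
    "time": ["00:00:00.000", "00:00:00.001", "00:00:01.000",
             "00:00:02.000", "00:00:02.500", "00:00:04.000"],
    "num1": [-1000] * 6,
})

# Characters 6-7 are the seconds field of an hh:mm:ss.mmm string.
seconds = df["time"].str[6:8].astype(int)
on_whole_second = df["time"].str.endswith(".000")

# Keep rows that fall exactly on an even second.
select_data = df[on_whole_second & (seconds % 2 == 0)]
print(select_data["time"].tolist())
```

Working on whole columns through the .str accessor avoids calling a Python function once per row, which tends to matter on files with millions of lines.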

Answer 2

You can use the skiprows parameter to read only the odd (or even) rows. From the documentation:

If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].

Here is an example csv:

#
#
#
#
A,B
1,1
2,2
3,3
4,4

Then you can do:

pd.read_csv('test.csv', skiprows=lambda x: True if x < 4 or x%2 == 1 else False)

Output:

   A  B
0  2  2
1  4  4

As you can see, you can read only the odd or only the even lines, and so get one row every 2 seconds. Note, though, that this assumes:

You are using the latest pandas version, 0.20.2
Your data is consecutive, i.e. one row per second
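For the file in the question, which has one row per millisecond rather than one per second, the same idea can be adapted to keep one row out of every 2000. This is only a sketch under stated assumptions: 4 header lines, no missing rows, and an in-memory sample in place of the real file. HEADER_LINES, STEP and skip_row are names made up for the illustration.

```python
import io
import pandas as pd

HEADER_LINES = 4   # comment lines at the top of the file (assumed)
STEP = 2000        # one row per millisecond assumed, so 2000 rows = 2 s

def skip_row(i):
    """Return True if file line index i should be skipped."""
    return i < HEADER_LINES or (i - HEADER_LINES) % STEP != 0

# In-memory stand-in for the real file: 4 header lines, then 4001 data rows.
lines = ["####", "# Name of file#", "# stuff#", "####"]
lines += [f"00:00:{n // 1000:02d}.{n % 1000:03d}\t-1000\t-1000\t0.000001\t20"
          for n in range(4001)]
sample = io.StringIO("\n".join(lines))

df = pd.read_csv(sample, sep="\t", header=None, skiprows=skip_row,
                 names=("time", "num1", "num2", "num3", "num4"))
print(df["time"].tolist())
print(df.dtypes)
```

Because the skipped lines are never parsed into the DataFrame, this keeps memory use low on large files, and read_csv infers numeric dtypes for the number columns, so they are stored as numbers rather than strings.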

Answer 3

Take the cumulative sum of the milliseconds and check whether it is a multiple of 2000, assuming the first column holds strings.

vector_bool = (df[df.columns[0]]
               .apply(lambda x: x.split(".")[-1])
               .astype(int)
               .cumsum()
               .apply(lambda x: x % 2000 == 0))

Then keep only the rows where the mask is True.

df_clean = df[vector_bool]
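A related way to build such a mask is to convert each timestamp to elapsed milliseconds and test that value directly. The sketch below assumes hh:mm:ss.mmm strings that pd.to_timedelta can parse; the sample rows and the names elapsed and elapsed_ms are illustrative only.

```python
import pandas as pd

# Made-up sample rows; the real data would come from read_csv.
df = pd.DataFrame({"time": ["00:00:00.000", "00:00:00.001", "00:00:01.999",
                            "00:00:02.000", "00:01:00.000", "00:01:01.000"]})

# Parse the strings into timedeltas, then count whole milliseconds.
elapsed = pd.to_timedelta(df["time"])
elapsed_ms = elapsed // pd.Timedelta(milliseconds=1)

# True where the elapsed time is an exact multiple of 2 seconds.
vector_bool = elapsed_ms % 2000 == 0
df_clean = df[vector_bool].copy()

# Store the time as a plain number of seconds instead of a string.
df_clean["abs_time"] = elapsed[vector_bool].dt.total_seconds()
print(df_clean)
```

Since the mask depends only on each row's own timestamp, it does not rely on the rows being consecutive.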

3 answers

Answer 1
1 2017-06-29 08:08:45

Answer 2
1 accepted 2017-06-29 08:11:25

Answer 3
0 2017-06-29 08:56:49
