How to read specific lines in a file with numbers in Python?
I am writing a script to calculate the average and standard deviation of some measurements that I have. I would like to read the file and have the script select the data that I want.
Let's say I have the table below:
(1 2 3 4;
4 x x x;
4 x x x;
4 x x x;
4 x x x)
Now I want the script to select all the values that are under 1, then all the values under 2, and so on, so which values I import depends on the value in the first line.
You want to use the enumerate() function.
    with open(filename, 'r') as file_object:
        for line_number, line in enumerate(file_object):
            if line_number in list_of_line_numbers:
                do_stuff_to(line)
Here list_of_line_numbers is a list containing the numbers of the lines you want to take. Note that enumerate() starts counting at 0 by default, so the first line of the file is line 0. This approach also has the advantage of not loading the entire file into memory, in case you are working with something big.
More info on the enumerate function:
https://docs.python.org/3/library/functions.html#enumerate
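A self-contained version of the enumerate() approach might look like the sketch below. The helper name read_selected_lines and the sample data are hypothetical, chosen only for illustration; io.StringIO stands in for a real file, and the wanted line numbers are zero-based because enumerate() counts from 0 by default.

```python
import io


def read_selected_lines(file_object, wanted_line_numbers):
    """Return the lines whose zero-based index is in wanted_line_numbers."""
    wanted = set(wanted_line_numbers)  # a set makes the membership test cheap
    selected = []
    for line_number, line in enumerate(file_object):
        if line_number in wanted:
            selected.append(line.rstrip("\n"))
    return selected


# io.StringIO is a stand-in for open(filename, 'r') so the sketch runs as-is.
sample = io.StringIO("1 2 3 4\n4 5 6 7\n4 8 9 10\n")
selected_lines = read_selected_lines(sample, [0, 2])
print(selected_lines)
```

Because the file is iterated line by line, only the selected lines are kept in memory.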
If your data set is not too large, I would consider using a pandas.DataFrame from the pandas data-wrangling library:
pandas.DataFrame(two_dimensional_array_like_object)
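As a rough illustration of that constructor call, a list of lists is one kind of two-dimensional array-like object it accepts. The variable names rows and df_from_rows are made up for this sketch.

```python
import pandas as pd

# A list of lists: each inner list becomes one row of the DataFrame.
rows = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
df_from_rows = pd.DataFrame(rows)

# With no labels given, rows and columns are numbered from 0.
print(df_from_rows.shape)
print(list(df_from_rows.columns))
```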
If you have a CSV file (example.csv) that looks like:
1,2,3
2,3,4
3,4,5
Importing this into a pandas.DataFrame:
    In[7]: import pandas as pd
    In[8]: df = pd.read_csv('example.csv', header=None)  # header=None: the file has no header row
    In[9]: print(df)
       0  1  2
    0  1  2  3
    1  2  3  4
    2  3  4  5
Now you have an extremely functional object (df) that has many built-in methods for data wrangling.
To perform your intended slicing:
In[10]: df_copy = df.loc[df[0]==2, :] # select rows that have the number 2 in the first column and make a copy
In[11]: print(df_copy) # print selected rows
       0  1  2
    1  2  3  4
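For the original goal of averaging the values under each number in the first line, one possible sketch is to let pandas treat that first line as the column header instead. This is an assumption about how the file is laid out, not something stated in the question. The sample text reuses the example.csv contents above, and io.StringIO stands in for the real file.

```python
import io

import pandas as pd

# Same contents as example.csv above; the first row is now used as labels.
csv_text = "1,2,3\n2,3,4\n3,4,5\n"

# header=0 tells read_csv to take the column names from the first row.
df = pd.read_csv(io.StringIO(csv_text), header=0)

# Labels read from a header row are strings, so select with "1", not 1.
values_under_1 = df["1"]

mean_under_1 = values_under_1.mean()
std_under_1 = values_under_1.std()  # sample standard deviation (ddof=1)
print(mean_under_1, std_under_1)
```

Calling df.mean() and df.std() without selecting a column returns the statistics for every column at once.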