Conditionally extracting numbers from a Python list

Question

I have a list of numbers, and I want to be able to say, for example, that the average distance between numbers < 50 is 12.5.

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        assert x.strip().split()
        positions.append(x)

position_list= []

for x in positions:
    if x < 50:
        position_list.append(x)

print np.mean[position_list]

This doesn't work. I think it is because when I print the positions list I get 20,40,45,60,80, so it is not treating the numbers as individual numbers and hence cannot test whether x < 50. What am I doing wrong?

EDIT: it looks like the data is actually made of lines like:

467,1977,3751,4013,5752,6406,6446,7362,7585,8285,8624,8741,9143,9304,11879,13197,13460,14401,14785,15117,22264,23714,24294,24534,26053,26959,27714,29462,35342,36538,36612,37031,39093,42281,42967,43945

Answer 1

There are several things wrong with your code:

you do not convert the values read from the file to an int or float;
you use np.mean[..] instead of np.mean(..), and np.mean is not subscriptable.
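
Both problems can be reproduced in isolation. The sketch below assumes Python 3 and uses a hypothetical stand-in function `mean` instead of `np.mean` so that it runs without numpy; subscripting any function fails in the same way.

```python
def mean(values):
    """Hypothetical stand-in for np.mean, used only for illustration."""
    return sum(values) / len(values)

line = "20\n"  # a line read from a file is a str, not a number

# Problem 1: comparing a str with an int.
try:
    line < 50
    str_compare_error = None
except TypeError as exc:
    str_compare_error = exc  # Python 3 refuses to order str and int

# Problem 2: square brackets subscript the function instead of calling it.
try:
    mean[[20, 40, 45]]
    subscript_error = None
except TypeError as exc:
    subscript_error = exc  # functions are not subscriptable

# Converting first and calling with parentheses works.
value = int(line)            # int() ignores surrounding whitespace
result = mean([20, 40, 45])
```

Under Python 2, which the `print` statement in the question suggests, the comparison does not raise: CPython 2 orders numbers before strings, so `x < 50` is always false for a string `x` and the filtered list stays empty.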

The solution is:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        assert x.strip().split()
        positions.append(int(x))

position_list= [x for x in positions if x < 50]

print(np.mean(position_list))

EDIT

Based on your comments, however, it looks like you feed in a comma-separated list:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        positions += (int(i) for i in x.strip().split(','))  # split on commas, not whitespace

position_list= [x for x in positions if x < 50]

print(np.mean(position_list))

Or:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        for i in x.strip().split(','):  # split on commas, not whitespace
            positions.append(int(i))

position_list= [x for x in positions if x < 50]

print(np.mean(position_list))

You can also, as @Jean-FrançoisFabre says, use the sum and divide by the number of items, so:

from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        for i in x.strip().split(','):  # split on commas, not whitespace
            positions.append(int(i))

position_list= [x for x in positions if x < 50]

print(sum(position_list) / float(len(position_list)))  # float() avoids integer division in Python 2

In that case you do not have to import numpy.
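
One edge case is worth guarding against: in the sample line from the question's EDIT the smallest value is 467, so nothing is below 50 and dividing by `len(position_list)` would raise ZeroDivisionError. A minimal sketch of a guard, where `mean_below` is a hypothetical helper name and not part of the original code:

```python
def mean_below(values, threshold):
    """Return the mean of the values below threshold, or None if there are none."""
    selected = [v for v in values if v < threshold]
    if not selected:
        return None  # nothing qualifies, so there is no mean to report
    # float() keeps Python 2 from truncating the division
    return sum(selected) / float(len(selected))

print(mean_below([20, 40, 45, 60, 80], 50))  # (20 + 40 + 45) / 3 = 35.0
print(mean_below([467, 1977, 3751], 50))     # no value below 50: None
```

Returning None is only one possible convention; raising an explicit error may suit a script better.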

Answer 2

There are a few errors in your code, as other answers pointed out, but I feel I should rewrite it for you in a much cleaner way:

with open(pos_file) as f:
    positions = [int(x) for line in f for x in line.strip().split(',') if int(x) < 50]

print(sum(positions)/len(positions))

you don't need numpy to compute a mean; this isn't rocket science
the assert statement is useless. If a line is empty, split() with no arguments returns an empty list, which is not a problem for the list comprehension. (With split(','), an empty line yields [''], so blank lines would have to be skipped before calling int().)
the added double loop allows reading several integers located on the same line
no memory is wasted storing numbers when you only want to keep the lowest ones
I took advantage of your feedback on one of the answers to figure out that the list is comma separated. Now I realize that the csv module could have been used.
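
The points above can be sketched with an explicit loop that tolerates blank lines and converts each field only once. Here `read_below` is a hypothetical helper name, and `io.StringIO` stands in for the opened file so the example is self-contained:

```python
import io

def read_below(lines, threshold):
    """Collect integers below threshold from comma-separated lines."""
    result = []
    for line in lines:
        for field in line.split(','):
            field = field.strip()
            if not field:        # blank line or trailing comma
                continue
            number = int(field)  # convert once, then test
            if number < threshold:
                result.append(number)
    return result

# In-memory sample with a blank line in the middle.
sample = io.StringIO(u"20,40,45\n\n60,80\n")
positions = read_below(sample, 50)
print(positions)  # [20, 40, 45]
```

The comprehension in the answer is shorter; the loop form trades brevity for a single `int()` call per field and an obvious place to skip empty fields.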

So, the csv solution:

import csv
with open(pos_file) as f:
    cr = csv.reader(f)
    positions = [int(x) for row in cr for x in row if int(x) < 50]

print(sum(positions)/len(positions))
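
The snippets above compute the mean of the values themselves. The question's example (an average distance of 12.5 for the numbers below 50 in 20,40,45,60,80) reads more like the mean gap between consecutive values. If that interpretation is right, a sketch could look like the following; `mean_gap` is a hypothetical helper and the input is assumed to be in ascending order:

```python
def mean_gap(values, threshold):
    """Mean difference between consecutive values below threshold.

    Assumes values are in ascending order. Returns None when fewer
    than two values qualify, since no gap exists in that case.
    """
    selected = [v for v in values if v < threshold]
    if len(selected) < 2:
        return None
    # Pair each value with its successor and take the differences.
    gaps = [b - a for a, b in zip(selected, selected[1:])]
    return sum(gaps) / float(len(gaps))

print(mean_gap([20, 40, 45, 60, 80], 50))  # gaps are 20 and 5, so 12.5
```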

Answer 1
1 2017-01-19 21:22:48

Answer 2
0 Accepted
