
Conditionally extract numbers from a Python list

I have a list of numbers like:

20
40
45
60
80

I want to be able to say, for example, that the average distance between numbers < 50 is 12.5.

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        assert x.strip().split()
        positions.append(x)

position_list= []

for x in positions:
    if x < 50:
        position_list.append(x)

print np.mean[position_list]

This doesn't work. I think it is because when I print the positions list I get 20,40,45,60,80, so the numbers are not being treated as individual numbers and the test x < 50 cannot work. What am I doing wrong?

EDIT: it looks like the data is actually made of lines like:

467,1977,3751,4013,5752,6406,6446,7362,7585,8285,8624,8741,9143,9304,11879,13197,13460,14401,14785,15117,22264,23714,24294,24534,26053,26959,27714,29462,35342,36538,36612,37031,39093,42281,42967,43945

There are several things wrong with your code:

  • you never convert the strings read from the file to int or float, so x < 50 compares a string with a number;
  • you use np.mean[..] instead of np.mean(..), and np.mean is not subscriptable.
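Both points can be reproduced in isolation. The following is a minimal Python 3 sketch, not the asker's script: the in-memory `raw_lines` list stands in for the input file, and `statistics.mean` from the standard library is used as a stand-in for `np.mean` (an assumption made only to avoid the numpy dependency; subscripting any function object fails in the same way).

```python
import statistics

# Stand-in for iterating over the file: each item is a string with a newline.
raw_lines = ["20\n", "40\n", "45\n", "60\n", "80\n"]

# int() tolerates surrounding whitespace, so the trailing newline is harmless.
numbers = [int(line) for line in raw_lines]
below_50 = [n for n in numbers if n < 50]

# Square brackets try to subscript the function object instead of calling it.
try:
    statistics.mean[below_50]
    subscript_error = None
except TypeError as exc:
    subscript_error = exc

# Parentheses call the function.
mean_below_50 = statistics.mean(below_50)
print(below_50, mean_below_50, type(subscript_error).__name__)
```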

The solution is:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        assert x.strip().split()
        positions.append(int(x))

position_list= [x for x in positions if x < 50]

print np.mean(position_list)

EDIT

Based on your comments, however, it looks like you feed it a comma-separated list:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        positions += (int(i) for i in x.strip().split(','))

position_list= [x for x in positions if x < 50]

print np.mean(position_list)

Or:

import numpy as np
from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        for i in x.strip().split(','):
            positions.append(int(i))

position_list= [x for x in positions if x < 50]

print np.mean(position_list)

You can also, as @Jean-FrançoisFabre says, take the sum and divide it by the number of items:

from sys import argv
script, pos_file, output = argv
positions = []
with open(pos_file) as f:
    for x in f:
        for i in x.strip().split(','):
            positions.append(int(i))

position_list= [x for x in positions if x < 50]

# float() avoids integer division under Python 2
print sum(position_list) / float(len(position_list))

In that case you do not have to import numpy.
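One case the sum-and-divide approach does not cover is a file with no value below the threshold, where `len(position_list)` is 0 and the division raises `ZeroDivisionError`. A minimal Python 3 sketch of a guard follows; the function name `mean_below` and the choice to return `None` are illustrative assumptions, and it assumes every line holds at least one comma-separated integer.

```python
def mean_below(lines, threshold):
    """Return the mean of the integers below threshold, or None if there are none."""
    values = [
        int(field)
        for line in lines
        for field in line.strip().split(",")
    ]
    selected = [v for v in values if v < threshold]
    if not selected:
        return None  # nothing qualifies, so there is no mean to report
    return sum(selected) / len(selected)  # true division in Python 3


sample = ["20,40,45\n", "60,80\n"]  # stand-in for the lines of the input file
result = mean_below(sample, 50)
print(result)
```

Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.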

There are a few errors in your code, which the other answers have pointed out, but I feel I should rewrite it for you in a much cleaner way:

with open(pos_file) as f:
    positions = [int(x) for line in f for x in line.strip().split(',') if int(x) < 50]

print(sum(positions)/len(positions))

  • you don't need numpy to compute a mean; this isn't rocket science
  • the assert statement is useless. If a line is empty, split() with no argument returns an empty list, which is not a problem for the list comprehension. (With an explicit separator, split(',') returns [''] for an empty line instead, so blank lines would have to be skipped before calling int().)
  • the added double loop allows reading several integers located on the same line
  • no memory is wasted storing numbers when you only want to keep the lowest ones
  • I took advantage of your feedback on one of the answers to figure out that the list is comma-separated. Now I realize that the csv module could have been used.
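The behavior of `split()` on blank lines can be checked directly. This is a small Python 3 sketch with made-up in-memory lines rather than the real file; the `if field` guard is one possible way to skip blank lines, not part of the original answer.

```python
blank = "\n".strip()

no_arg_fields = blank.split()     # whitespace splitting drops empty strings
comma_fields = blank.split(",")   # an explicit separator keeps one empty string

lines = ["20,40\n", "\n", "45,60\n"]  # hypothetical input with a blank line
low_values = [
    int(field)
    for line in lines
    for field in line.strip().split(",")
    if field and int(field) < 50  # `field and` skips the empty string
]
print(no_arg_fields, comma_fields, low_values)
```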

So, the csv solution:

import csv
with open(pos_file) as f:
    cr = csv.reader(f)
    positions = [int(x) for row in cr for x in row if int(x) < 50]

print(sum(positions)/len(positions))
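Note that the snippets above print the mean of the values below 50, while the question's example (12.5 for the sample data) describes the mean distance between neighbouring values. If that second reading is what is wanted, it could be sketched as follows in Python 3. The `io.StringIO` object is a stand-in for the real file, and sorting the values before taking differences is an assumption about what "distance" means here.

```python
import csv
import io

# Stand-in for the real file; the sample values come from the question.
sample_file = io.StringIO("20,40,45,60,80\n")

low_values = sorted(
    int(field)
    for row in csv.reader(sample_file)
    for field in row
    if int(field) < 50
)

# Difference between each value and its successor.
gaps = [b - a for a, b in zip(low_values, low_values[1:])]

# Fewer than two values means there is no gap to average.
mean_gap = sum(gaps) / len(gaps) if gaps else None
print(mean_gap)
```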
