
Help with an if else loop in Python

Hi, here is my problem. I have a program that calculates the averages of data in columns. Example:

Bob
1
2
3

The output is:

Bob
2

Some of the data has 'NA's. So for Joe:

Joe
NA
NA
NA

I want this output to be NA,

so I wrote an if else loop.

The problem is that it doesn't execute the second part of the loop and just prints out one NA. Any suggestions?

Here is my program:

with open('C://achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    numRows = 0
    sums = [0] * len(columns)

    numRowsPerColumn = [0] * len(columns) # this figures out the number of columns

    for line in f:
        # Skip empty lines since I was getting that error before
        if not line.strip():
            continue

        values = line.split(" ")
        for i in xrange(len(values)):
            try: # this is the whole strings to math numbers things
                sums[i] += float(values[i])
                numRowsPerColumn[i] += 1
            except ValueError:
                continue 

    with open('c://chipdone.txt', 'w') as ouf:
        for i in xrange(len(columns)):
           if numRowsPerColumn[i] ==0 :
               print 'NA' 
           else:
               print>>ouf, columns[i], sums[i] / numRowsPerColumn[i] # this is the average calculator

The file looks like this:

Joe Bob Sam
1 2 NA
2 4 NA
3 NA NA
1 1  NA

And the final output is the names and the averages:

Joe Bob Sam 
1.5 1.5 NA

OK, I tried Roger's suggestion and now I have this error:

Traceback (most recent call last):
  File "C:/avy14.py", line 5, in <module>
    for line in f:
ValueError: I/O operation on closed file

Here is the new code:

with open('C://achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    sums = [0] * len(columns)
    rows = 0
for line in f:
    line = line.strip()
    if not line:
        continue

    rows += 1
    for col, v in enumerate(line.split()):
        if sums[col] is not None:
            if v == "NA":
                sums[col] = None
            else:
                sums[col] += int(v)

with open("c:/chipdone.txt", "w") as out:
    for name, sum in zip(columns, sums):
        print >>out, name,
        if sum is None:
            print >>out, "NA"
        else:
            print >>out, sum / rows

with open("c:/achip.txt", "rU") as f:
  columns = f.readline().strip().split()
  sums = [0.0] * len(columns)
  row_counts = [0] * len(columns)

  for line in f:
    line = line.strip()
    if not line:
      continue

    for col, v in enumerate(line.split()):
      if v != "NA":
        sums[col] += int(v)
        row_counts[col] += 1

with open("c:/chipdone.txt", "w") as out:
  for name, sum, rows in zip(columns, sums, row_counts):
    print >>out, name,
    if rows == 0:
      print >>out, "NA"
    else:
      print >>out, sum / rows

I'd also use the no-parameter version of split when getting the column names (it allows you to have multiple space separators).
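To illustrate the difference, here is a small Python 3 sketch using the last data row of the sample file from the question, which contains a double space. The variable names are only for illustration.

```python
# Last data row of the sample file; note the two spaces before "NA".
row = "1 1  NA"

# split(" ") splits on every single space, so the double space
# produces an empty string in the result.
with_arg = row.split(" ")

# split() with no argument splits on runs of whitespace and never
# returns empty strings.
no_arg = row.split()

print(with_arg)  # ['1', '1', '', 'NA']
print(no_arg)    # ['1', '1', 'NA']
```

With the empty string in the list, every value after it lands in the wrong column index, which is why the no-parameter form is the safer choice for this file.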

Regarding your edit to include an input/output sample: I kept your original format, and my output would be:

Joe 1.75
Bob 2.33333333333
Sam NA

This format is 3 rows of (ColumnName, Avg) columns, but you can change the output if you want, of course. :)
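If the layout from the question is preferred (names on one line, averages on the next), one possible way to produce it is sketched below in Python 3. The helper name column_averages, the in-memory sample list, and the two-decimal formatting are assumptions made for this illustration, not part of the code above.

```python
def column_averages(lines):
    """Return (names, averages); an average is None for a column with no numbers."""
    rows = [line.split() for line in lines if line.strip()]
    names, data = rows[0], rows[1:]
    sums = [0.0] * len(names)
    counts = [0] * len(names)
    for row in data:
        for col, value in enumerate(row):
            if value != "NA":
                sums[col] += float(value)
                counts[col] += 1
    # A column that never saw a number gets None instead of a division by zero.
    averages = [s / c if c else None for s, c in zip(sums, counts)]
    return names, averages


# Sample data from the question, held in memory instead of read from a file.
sample = ["Joe Bob Sam", "1 2 NA", "2 4 NA", "3 NA NA", "1 1  NA"]
names, averages = column_averages(sample)

# Names on the first line, averages (or NA) on the second.
header_line = " ".join(names)
average_line = " ".join("NA" if a is None else f"{a:.2f}" for a in averages)
print(header_line)   # Joe Bob Sam
print(average_line)  # 1.75 2.33 NA
```

To write to a file instead, the two lines can be passed to print with the file= argument inside a with open(...) block.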

Using numpy:

import numpy as np

with open('achip.txt') as f:
    names=f.readline().split()
    arr=np.genfromtxt(f)

print(arr)
# [[  1.   2.  NaN]
#  [  2.   4.  NaN]
#  [  3.  NaN  NaN]
#  [  1.   1.  NaN]]

print(names)
# ['Joe', 'Bob', 'Sam']

print(np.ma.mean(np.ma.masked_invalid(arr),axis=0))
# [1.75 2.33333333333 --]

Using your original code, I would add one filtering step and edit the print statement:

with open(r'C:\achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    numRows = 0
    sums = [0] * len(columns)

    numRowsPerColumn = [0] * len(columns) # this figures out the number of columns

    for line in f:
        # Skip empty lines since I was getting that error before
        if not line.strip():
            continue

        values = line.split(" ")

        ### This removes any '' elements caused by having two or more spaces
        ### in a row, as in the last line of your example chip file above.
        ### (Popping from the list while iterating over it can skip elements.)
        values = [v for v in values if v != '']
        ### (End of Addition)

        for i in xrange(len(values)):
            try: # this is the whole strings to math numbers things
                sums[i] += float(values[i])
                numRowsPerColumn[i] += 1
            except ValueError:
                continue 

    with open('c://chipdone.txt', 'w') as ouf:
        for i in xrange(len(columns)):
           if numRowsPerColumn[i] ==0 :
               print>>ouf, columns[i], 'NA' #Just add the extra parts
           else:
               print>>ouf, columns[i], sums[i] / numRowsPerColumn[i]

This solution also gives the same result in Roger's format, not your intended format.

The solution below is cleaner and has fewer lines of code:

import pandas as pd

# read the file into a DataFrame using read_csv
df = pd.read_csv('C://achip.txt', sep=r"\s+")

# compute the average of each column
avg = df.mean()

# save computed average to output file
avg.to_csv("c:/chipdone.txt")

The key to the simplicity of this solution is the way the input text file is read into a DataFrame. Pandas read_csv allows you to use regular expressions for the sep/delimiter argument. In this case, we used the "\s+" regex pattern to handle one or more spaces between columns.

Once the data is in a DataFrame, computing the average and saving it to a file can be done with straightforward pandas functions.
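As a hedged illustration of those steps, the Python 3 sketch below runs the same calls on the question's sample data held in an io.StringIO object instead of a file on disk. The na_rep="NA" and header=False arguments are additions assumed here so that an all-NA column is written as NA, as the question asks. Without na_rep, a missing average is written as an empty field.

```python
import io

import pandas as pd

# In-memory stand-in for achip.txt, using the sample data from the question.
sample = io.StringIO("Joe Bob Sam\n1 2 NA\n2 4 NA\n3 NA NA\n1 1  NA\n")

# A raw string keeps the backslash in the regex intact; "NA" is parsed as NaN.
df = pd.read_csv(sample, sep=r"\s+")

# mean() skips NaN values; a column that is entirely NaN averages to NaN.
avg = df.mean()

# With no path given, to_csv returns the CSV text. na_rep controls how NaN
# is written, and header=False leaves out the header row.
csv_text = avg.to_csv(na_rep="NA", header=False)
print(csv_text)
```

Passing a file path as the first argument to to_csv writes the same text to disk.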
