Python: column-wise average from a file
I have a file whose content looks like:
A B 2 4
C D 1 2
A D 3 4
A D 1 2
A B 4 7
and so on.
My objective is to get the final output as below:
A B 3 5.5
C D 1 2
A D 2 3
That is, for each unique combination of the first two columns, the result should be the column-wise average of the other two columns of the file. I tried using loops, but that only increased the complexity of the program. Is there another way to achieve this?
Sample code:
with open(r"C:\Users\priya\Desktop\test.txt") as f:
    content = f.readlines()
content = [x.split() for x in content]
for i in range(len(content)):
    valueofa = [content[i][2]]
    valueofb = [content[i][3]]
    for j in range(i+1, len(content)):
        if content[i][0] == content[j][0] and content[i][1] == content[j][1]:
            valueofa.append(content[j][2])
            valueofb.append(content[j][3])
I intended to then take the average of both lists by index.
You can store each combination of letters as a tuple key in a dictionary, collect the values for that key, and then average at the end, e.g.:
d = {}
with open(r"C:\Users\priya\Desktop\test.txt") as f:
    for line in f:
        a, b, x, y = line.split()
        d.setdefault((a, b), []).append((int(x), int(y)))
for (a, b), v in d.items():
    xs, ys = zip(*v)
    print("{} {} {:g} {:g}".format(a, b, sum(xs)/len(v), sum(ys)/len(v)))
Output:
A B 3 5.5
C D 1 2
A D 2 3
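The dictionary approach above keeps every row in memory until the end. A minimal sketch of a variant that stores only running totals per key is shown here; the function name `columnwise_average` and the inline sample rows are assumptions made for illustration and are not part of the original answer. It also parses the values with `float` instead of `int`, on the assumption that the file might contain non-integer numbers.

```python
from collections import defaultdict


def columnwise_average(lines):
    """Average the two numeric columns for each unique (first, second) key.

    `lines` is any iterable of whitespace-separated rows such as "A B 2 4".
    Hypothetical helper, named for illustration only.
    """
    # key -> [sum of third column, sum of fourth column, row count]
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        a, b, x, y = line.split()
        entry = totals[(a, b)]
        entry[0] += float(x)
        entry[1] += float(y)
        entry[2] += 1
    # Dicts preserve insertion order on Python 3.7+, so keys come out
    # in order of first appearance.
    return {key: (sx / n, sy / n) for key, (sx, sy, n) in totals.items()}


# Inline copy of the sample rows from the question
sample = ["A B 2 4", "C D 1 2", "A D 3 4", "A D 1 2", "A B 4 7"]
averages = columnwise_average(sample)
for (a, b), (mx, my) in averages.items():
    print("{} {} {:g} {:g}".format(a, b, mx, my))
```

Since the function only iterates over its argument, an open file object can be passed in place of the `sample` list.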
If you can use pandas, it will be much simpler:
import pandas as pd
df = pd.read_csv(r"C:\Users\priya\Desktop\test.txt", sep=r"\s+", names=['A','B','C','D'])  # the file is whitespace-separated
df
   A  B  C  D
0  A  B  2  4
1  C  D  1  2
2  A  D  3  4
3  A  D  1  2
4  A  B  4  7
df.groupby(['A','B']).mean().reset_index()
   A  B    C    D
0  A  B  3.0  5.5
1  A  D  2.0  3.0
2  C  D  1.0  2.0
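Note that `groupby` sorts the group keys by default, so the rows come out as A B, A D, C D rather than in the order requested in the question. A self-contained sketch that preserves the order of first appearance might look like the following; it reads the sample rows from an in-memory `io.StringIO` buffer (an assumption made so the example runs without the file) and passes `sep=r"\s+"` because the data is separated by whitespace rather than commas.

```python
import io

import pandas as pd

# Inline copy of the sample rows so the sketch runs without the file
sample = io.StringIO("A B 2 4\nC D 1 2\nA D 3 4\nA D 1 2\nA B 4 7\n")

df = pd.read_csv(sample, sep=r"\s+", header=None, names=["A", "B", "C", "D"])

# sort=False keeps the groups in order of first appearance;
# as_index=False keeps A and B as ordinary columns instead of an index
result = df.groupby(["A", "B"], sort=False, as_index=False).mean()
print(result)
```

With `as_index=False` the separate `reset_index()` call is no longer needed.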