
In Python, with a list of extracted values from file, how to find sum of column of float values?

I am having trouble finding the sum of a column of values extracted from a txt file using Python 3. The file contains data that looks something like:

  • 1800 -0.19 -0.11
  • 1801 -0.1 -0.17
  • 1802 -0.2 -0.11
  • .
  • . [cont.]
  • .
  • 1900 -0.15 -.15

How can I find the sum of the second and third columns separately?

First, I opened the file using

with open('file.txt') as f:
    for line in f:
        column_2 = line.split()
        b = float(column_2[-2])

        print(b)

I was able to print the second column of values successfully. However, after I added:

print(sum(b))

the code would no longer run.

When I run the code, I am given the error:

'int' object is not callable

There are two things wrong with your code.

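Firstly, you have assigned an integer to the name sum at some point, which hides the builtin function. Do not use builtin names for your variables.

Run del sum (or restart the interpreter) to remove that variable, and the builtin sum will be available again.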
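To see where the error message comes from, here is a minimal sketch that reproduces it. The function name shadowing_error_message is made up for this illustration, and the shadowing is kept inside the function so the builtin sum is untouched everywhere else.

```python
def shadowing_error_message():
    # A variable named like the builtin hides the builtin in this scope.
    sum = 0
    try:
        # This now tries to call the integer 0, not the builtin function.
        sum([-0.19, -0.1, -0.2])
    except TypeError as exc:
        return str(exc)
    return None


message = shadowing_error_message()
print(message)
```

The printed message is the same 'int' object is not callable text as in the question.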

However, your logic is still wrong: b only ever holds the most recent value, so sum(b) is applied to a single float rather than to the whole column. You need to store or accumulate all the values you have seen.

With the way you are doing things, the best method would be using an accumulator variable:

b_sum = 0.0

with open('file.txt') as f:
    for line in f:
        column_2_value = float(line.split()[-2])
        b_sum += column_2_value

print(b_sum)
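The same accumulator idea extends to both columns at once. The following is only a sketch: the helper name sum_second_and_third and the in-memory sample lines are assumptions made for illustration, and it assumes whitespace-separated columns as in the question. An open file object can be passed in place of the list.

```python
def sum_second_and_third(lines):
    """Accumulate the second and third whitespace-separated columns."""
    second_total = 0.0
    third_total = 0.0
    for line in lines:
        fields = line.split()
        if not fields:
            # Ignore blank lines, e.g. a trailing newline at end of file.
            continue
        second_total += float(fields[1])
        third_total += float(fields[2])
    return second_total, third_total


# Sample rows copied from the question; a real run would iterate over the file.
sample_lines = ["1800 -0.19 -0.11\n", "1801 -0.1 -0.17\n", "1802 -0.2 -0.11\n"]
second_total, third_total = sum_second_and_third(sample_lines)
print(round(second_total, 2), round(third_total, 2))
```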

A better approach, IMO, is to load the entire data structure into memory:

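with open('Global_Temperature_Data_File.txt') as f:
    # Skip blank lines: a single empty row would make zip(*data) return nothing
    data = [row.split() for row in f if row.strip()]

# zip(*data) turns the list of rows into a list of columns
transposed_data = list(zip(*data))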

This will turn transposed_data into a list of tuples, where each tuple holds one column. The values are still strings at this point.

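You can then convert each column to float and sum it:

b_sum = sum(float(value) for value in transposed_data[1])  # second column
c_sum = sum(float(value) for value in transposed_data[2])  # third column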
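Here is a small self-contained sketch of the transposition on the sample rows from the question, using an in-memory string instead of the file (the names sample_text and column_sums are assumptions for this illustration). It also shows why blank lines need to be filtered out: zip stops at the shortest row, so one empty row leaves no columns at all.

```python
sample_text = "1800 -0.19 -0.11\n1801 -0.1 -0.17\n1802 -0.2 -0.11\n"

# One list of string fields per non-blank line.
rows = [line.split() for line in sample_text.splitlines() if line.strip()]

# Transpose: each tuple now holds one column.
columns = list(zip(*rows))

# Skip the year column, convert the rest to float, and sum each one.
column_sums = [sum(float(value) for value in column) for column in columns[1:]]
print([round(total, 2) for total in column_sums])

# Pitfall: an empty row truncates every column to length zero.
columns_with_blank_row = list(zip(*(rows + [[]])))
print(columns_with_blank_row)
```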

How about reading the lines from the file, using split() to get your columns, and then computing the sum with sum()? Obviously, this will only work if all your lines have the same structure.

with open('Global_Temperature_Data_File.txt') as f:
    lines = f.readlines()


second_column_sum = sum(float(line.split()[1]) for line in lines if line.strip())
third_column_sum = sum(float(line.split()[2]) for line in lines if line.strip())

*Here I assumed the columns are separated by whitespace and that a leading - is a minus sign, as in your sample data. If - is actually the separator and the values are positive, use line.split('-') instead.

If you added your print statement inside the loop, it will not work as you're hoping. sum needs an iterable, like a list, and b is a single float, so sum(b) will fail (though when I tried to replicate your error I got a slightly different one!).
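A quick sketch of what happens when the builtin sum is given a single float, which is probably the "slightly different" error mentioned above. The value of b is taken from the sample data; the variable names are just for illustration.

```python
b = -0.15  # a single float, like the last value left in b after the loop

try:
    sum(b)
except TypeError as exc:
    error_text = str(exc)

print(error_text)

# Wrapping the value in a list gives sum something to iterate over.
total = sum([b])
print(total)
```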

I suggest you change your code to have b = 0 before your for loop, and change the b = ... line to b += ... . This will sum up the values as you go through the loop.

Note that you might have problems if some rows in your dataset are not numbers...
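One possible way to guard against such rows is to catch the conversion error and skip the line. This is only a sketch: the helper name sum_column, the skip-and-count policy, and the sample rows (including the header and the n/a entry) are assumptions, not something from the original data.

```python
def sum_column(lines, index):
    """Sum one column, skipping rows whose value is missing or not a number."""
    total = 0.0
    skipped = 0
    for line in lines:
        fields = line.split()
        try:
            total += float(fields[index])
        except (ValueError, IndexError):
            # ValueError: the text is not a number; IndexError: the row is too short.
            skipped += 1
    return total, skipped


# Hypothetical input with a header, a blank line, and a bad value.
messy_lines = ["Year A B", "1800 -0.19 -0.11", "", "1801 n/a -0.17", "1802 -0.2 -0.11"]
total, skipped = sum_column(messy_lines, 1)
print(round(total, 2), skipped)
```

Counting the skipped rows makes it easier to notice when more of the file was ignored than expected.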
