
Printing the same values from a list without duplicates

I have a text file that looks something like this, but much bigger (it spans 1993 to 2013):

2001:3.342
2001:2.345
2001:2.211
2001:4.31
2002:3.23
2002:2.231
2002:2.12
2003:1.23
2003:2.32
2003:2.12

On the left we have years and on the right their sales. What I'm trying to do is take each year and find the mean of its sales. I know the way I've written my code is the long route, so I'm wondering if there is a quicker way for the future.

What I've done:

for line in year_value:
    x, y = line.split(":")
    if x == "2001":
        value.append(float(y))
        avg_value01 = sum(value) / len(value)
    print("The average sales in 2001 were : ", avg_value01)

I had to do this for 20 different years

I'm wondering if there's a quicker way to do this, preferably using vanilla Python, as I'm only a beginner and want to get used to the basics.

The following should work for you (it uses a dict to hold the data for each year):

from collections import defaultdict
data = defaultdict(list)
with open('test.txt') as f:
  for line in f:
    year,value = line.split(':')
    data[year].append(float(value))
for year,sales in data.items():
  print(f'{year} -> {sum(sales) / len(sales)}')

test.txt

2001:3.342
2001:2.345
2001:2.211
2001:4.31
2002:3.23
2002:2.231
2002:2.12
2003:1.23
2003:2.32
2003:2.12

output

2001 -> 3.0519999999999996
2002 -> 2.527
2003 -> 1.89
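
The long tail on 3.0519999999999996 is ordinary binary floating-point rounding, not a fault in the averaging. A minimal sketch of tidying it up for display, assuming three decimal places are enough (that precision is my choice, not something the question asks for):

```python
# The 2001 values from test.txt, hard-coded so the snippet runs on its own.
sales_2001 = [3.342, 2.345, 2.211, 4.31]

average = sum(sales_2001) / len(sales_2001)

# The :.3f format spec rounds only the text that is printed;
# the float stored in `average` is left untouched.
formatted = f"{average:.3f}"

print(f"2001 -> {formatted}")  # 2001 -> 3.052
```

Rounding at print time rather than when storing keeps the full precision available for any later arithmetic.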

As has been pointed out in the comments, the condition x == x will always be True, since you don't progress to the next line (n+1, say) while you are still on line n.

One approach would be to convert these pairings to a dictionary, where each key is a year and its value is a list of the sales figures for that year.

year_values_dict = {}
for line in year_value:
    year, value = line.split(':')
    # Convert to float so the values can be summed later.
    year_values_dict.setdefault(year, []).append(float(value))

for year, values in year_values_dict.items():
    print(f"The average sales in {year} were: {sum(values) / len(values)}")

The setdefault method returns the value stored under the key given as its first argument (year) if that key already exists; if not, it creates the key with the second argument (an empty list) as its value and returns that. Either way, you get back a list to which you can append value.
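
To see both branches of setdefault in isolation, here is a small sketch; the names groups, first and second are illustrative only and do not appear in the code above:

```python
groups = {}

# Key missing: setdefault stores the default list under "2001" and returns it.
first = groups.setdefault("2001", [])
first.append(3.342)

# Key present: the default argument is ignored and the stored list is returned.
second = groups.setdefault("2001", [])
second.append(2.345)

print(groups)           # {'2001': [3.342, 2.345]}
print(first is second)  # True: both names refer to the same list object
```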

In the second loop, you iterate through each key-value pair and print the average of the values.

NOTE:

  • If you are using Python < 3.7, dictionary keys are not guaranteed to be ordered (CPython 3.6 keeps insertion order, but only as an implementation detail). From 3.7 on, they are kept in the order in which they were inserted.

  • I use f-strings in the second loop, which are only available in Python >= 3.6. You can use str.format() or %-formatting instead, if you want.
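
For older interpreters, both notes can be handled together. A rough sketch, assuming the dictionary has already been built (year_values and report are stand-in names, and the data is hard-coded out of order on purpose):

```python
# Stand-in for the dictionary built in the first loop.
year_values = {
    "2002": [3.23, 2.231, 2.12],
    "2001": [3.342, 2.345, 2.211, 4.31],
}

report = []
# sorted() gives a predictable order even where dict order is not guaranteed.
for year in sorted(year_values):
    values = year_values[year]
    average = sum(values) / len(values)
    # str.format() works on Python versions that predate f-strings.
    report.append("The average sales in {} were: {:.3f}".format(year, average))

print("\n".join(report))
```

Sorting the keys also helps if the input file itself is not in year order.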

I've been meaning to try this for a while: writing a Mean aggregator class.

This copies balderman's solution but keeps only a running sum and count for each year (constant space) instead of a list of every value (linear space). A convenient mean() method makes the "main" code cleaner.

from collections import defaultdict

class Mean:
    def __init__(self):
        self._sum = 0
        self._size = 0
    def add(self, value):
        self._sum += value
        self._size += 1
    def mean(self):
        return self._sum / self._size

data = defaultdict(Mean)
with open('test.txt') as f:
  for line in f:
    year,value = line.split(':')
    data[year].add(float(value))
for year,sales in data.items():
  print(f'{year} -> {sales.mean()}')

Output:

2001 -> 3.0519999999999996
2002 -> 2.527
2003 -> 1.89

You can read the lines of the file and create a dictionary that associates each year with a list of the values encountered for it. Once you have that, it is easy to calculate the mean of each list using the fmean() function in the statistics module (available in Python 3.8 and later).

text = '''\
2001:3.342
2001:2.345
2001:2.211
2001:4.31
2002:3.23
2002:2.231
2002:2.12
2003:1.23
2003:2.32
2003:2.12
'''

from statistics import fmean

yearly = {}
for line in text.splitlines():
    year, value = line.split(':')
    # Keys are int years; values are float sales figures.
    yearly.setdefault(int(year), []).append(float(value))

for year, data in yearly.items():
    avg = fmean(data)
    print(f'The average sales in {year} were {avg:.3f}')
