
Python: compare one list's index to another, append second list value to first list

I have a .csv file as follows (snippet).

Country,Year,GDP ($US),Population
Angola,2002,11431738368,10760510
Angola,2005,32810672128,11706954
Antigua and Barbuda,2002,714677760,67448
Antigua and Barbuda,2005,875751360,68722
Argentina,2002,1.02E+11,38331121
Argentina,2005,1.83E+11,39537943
Armenia,2002,2376335104,3013818
Armenia,2005,4902779392,2982904
...

I need to find the five countries with the lowest GDP/Pop for 2002, then find their corresponding GDP/Pop values in 2005, then compute the difference and the percent difference. Some records have blank GDP or Population values; I omit those.

So far I have used:

import csv
import operator

data = open('file.csv')
read_data = csv.reader(data)

thisthing = []
for line in read_data:
#find 2002 GDP/Pop, omit blanks, append to list
    if line[7] == '2002' and line[8] != ' ' and line[9] != ' ':
        thisthing.append([line[0], (float(line[8])/(int(line[9])))])

thisthing.sort(key=operator.itemgetter(1))

This produces a list which prints line by line as follows (Country, GDP/Pop):

['Burma (Myanmar)', 69.07171351277908]
['Burundi', 89.45864552423431]
['Congo (Dem. Rep.)', 99.23033109735835]
['Ethiopia', 109.33326343550823]
['Eritrea', 142.8576737907048]
['Guinea-Bissau', 151.110429668747]
['Afghanistan', 159.7524117568956]
['Malawi', 159.7614709537829]
['Sierra Leone', 174.6506490278577]

I now want to iterate back through 'read_data', using the country names in 'thisthing' as a condition along with my blank-prevention condition

and line[8] != ' ' and line[9] != ' ':

to select the 2005 GDP/Pop and append it to 'thisthing'.

I have no idea where to begin doing that, and I have been stuck here for about a week now. Any help would be most appreciated.

Try this:

import csv

data_2002 = {}
data_2005 = {}

thisthing = [["country", "2002%", "2005%"]]

with open('file.csv') as data:
    read_data = csv.reader(data)
    for line in read_data:
        try:
            # GDP per capita; blank or non-numeric fields raise ValueError
            gdp = float(line[8]) / int(line[9])
        except (ValueError, ZeroDivisionError, IndexError):
            print(line)  # report the skipped row (this includes the header)
            continue
        if line[7] == '2002':
            data_2002[line[0]] = gdp
        elif line[7] == '2005':
            data_2005[line[0]] = gdp

for country in data_2002:
    # keep only countries that have a value for both years
    if country in data_2005:
        thisthing.append([country, data_2002[country], data_2005[country]])

print(thisthing)

Using this as read_data:

[['Country', 'Year', 'GDP ($US)', 'Population'],
 ['Angola', '2002', '11431738368', '10760510'],
 ['Angola', '2005', '32810672128', '11706954'],
 ['Antigua and Barbuda', '2002', '714677760', '67448'],
 ['Antigua and Barbuda', '2005', '875751360', '68722'],
 ['Argentina', '2002', '1.02E+11', '38331121'],
 ['Argentina', '2005', '1.83E+11', '39537943'],
 ['Armenia', '2002', '2376335104', '3013818'],
 ['Armenia', '2005', '4902779392', '2982904']]

We don't want the first line:

read_data = read_data[1:]

If you use a csv.reader object for read_data, do:

next(read_data)
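
Both ways of skipping the header can be tried side by side. The following is a minimal sketch, assuming the four-column layout of the snippet; `io.StringIO` and the `sample` text are stand-ins for the real file and are not part of the original code.

```python
import csv
import io

# Stand-in for open('file.csv'); the rows are copied from the question's snippet.
sample = io.StringIO(
    "Country,Year,GDP ($US),Population\n"
    "Angola,2002,11431738368,10760510\n"
    "Angola,2005,32810672128,11706954\n"
)

# Option 1: read everything into a list, then slice off the header.
rows = list(csv.reader(sample))
without_header = rows[1:]

# Option 2: keep the reader lazy and consume the header with next().
sample.seek(0)
reader = csv.reader(sample)
header = next(reader)
remaining = list(reader)

print(header)
print(without_header == remaining)
```

A list can be looped over as often as needed, whereas a csv.reader is exhausted after one pass. That matters if you plan to go through the rows a second time.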

Actually, the loop below is robust enough to iterate over all lines, because we skip any line that raises an exception when a string cannot be converted into a number, such as the header fields 'GDP ($US)' and 'Population'. It is still good practice to show our intention to skip the first line, because we all know: explicit is better than implicit.

We use a defaultdict so that we don't have to test whether a country already has an entry before inserting its first year:
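
To see which rows get skipped this way, here is a small sketch. The helper name `gdp_per_capita` is hypothetical and only wraps the same conversion and the same two exceptions for illustration.

```python
def gdp_per_capita(gdp_text, population_text):
    """Return GDP per capita, or None if either field cannot be used."""
    try:
        return float(gdp_text) / float(population_text)
    except (ValueError, ZeroDivisionError):
        # ValueError: header text or a blank field; ZeroDivisionError: population 0
        return None

print(gdp_per_capita('GDP ($US)', 'Population'))  # header row
print(gdp_per_capita('', '10760510'))             # blank GDP field
print(gdp_per_capita('1.02E+11', '38331121'))     # scientific notation converts fine
```

The first two calls return None. The third returns a number, because float() accepts strings such as '1.02E+11'.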

import collections
data = collections.defaultdict(dict)

for line in read_data:
    try:
        gdp = float(line[2]) / float(line[3])
    # Make sure this exception catches what you want.
    except (ValueError, ZeroDivisionError):
        continue
    data[line[0]][line[1]] = gdp

Now we get this for data:

{'Angola': {'2002': 1062.3788619684383, '2005': 2802.6651619200006},
 'Antigua and Barbuda': {'2002': 10595.981496856837,
                         '2005': 12743.391635866245},
 'Argentina': {'2002': 2661.023140961622, '2005': 4628.465370593508},
 'Armenia': {'2002': 788.4799626254804, '2005': 1643.6262756025671}}
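
For comparison, this sketch shows the test that defaultdict(dict) saves. The names `plain` and `nested` are made up for the illustration.

```python
import collections

plain = {}
nested = collections.defaultdict(dict)

# With a plain dict the inner dict must exist before the first insert.
if 'Angola' not in plain:
    plain['Angola'] = {}
plain['Angola']['2002'] = 1062.3788619684383

# defaultdict(dict) creates the inner dict on first access.
nested['Angola']['2002'] = 1062.3788619684383

print(plain == nested)
```

One thing to keep in mind: merely reading a missing key, as in nested['Nowhere'], also inserts an empty dict. For lookups that should not modify the data, use `in` or `.get()`.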

We need to rearrange this to get to your list:

list_data = []
for key, value in data.items():
    list_data.append([key] + [value[year] for year in sorted(value.keys())])

Result:

[['Antigua and Barbuda', 10595.981496856837, 12743.391635866245],
 ['Argentina', 2661.023140961622, 4628.465370593508],
 ['Angola', 1062.3788619684383, 2802.6651619200006],
 ['Armenia', 788.4799626254804, 1643.6262756025671]]

This solution works for any number of years and puts them in chronological order.

EDIT

As it turns out, the data contains more than two years. If you don't want all years, change the last section to include only the years you explicitly want:

list_data = []
for key, value in data.items():
    list_data.append([key] + [value[year] for year in ('2002', '2005')])

EDIT2

A small modification for the case where a year is missing, as requested by the OP:

list_data = []
for key, value in data.items():
    list_data.append([key] + [value.get(year, 0) for year in ('2002', '2005')])

This puts in 0 if the year is missing. Use any other suitable value to indicate missing values.

EDIT3

Another variation, as requested by the OP. Do not append anything if there is no value:
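
As one possible alternative marker, None cannot be mistaken for a real value of 0. A small sketch, where 'Hypothetica' is a made-up country with no 2005 record and `data` is a hand-written sample rather than the full dataset:

```python
data = {
    'Angola': {'2002': 1062.3788619684383, '2005': 2802.6651619200006},
    'Hypothetica': {'2002': 500.0},  # hypothetical entry, 2005 is missing
}

list_data = []
for key, value in data.items():
    # .get(year) without a default returns None for a missing year
    list_data.append([key] + [value.get(year) for year in ('2002', '2005')])

for row in list_data:
    print(row)
```

Later code can then test `row[2] is None` before doing arithmetic on the 2005 column.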

list_data = []
for key, value in data.items():
    list_data.append([key] + [value.get(year) for year in ('2002', '2005')
                              if value.get(year) is not None])
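
The question also asks for the five lowest 2002 values together with the difference and percent difference to 2005. A possible sketch of that last step follows. It assumes `data` has the nested shape shown above; the function name `lowest_with_change` and its parameters are hypothetical. It also assumes that countries lacking either year are dropped before ranking, which is a choice rather than something stated in the question.

```python
# Sample of the nested dict; values copied from the output shown earlier.
data = {
    'Angola': {'2002': 1062.3788619684383, '2005': 2802.6651619200006},
    'Antigua and Barbuda': {'2002': 10595.981496856837,
                            '2005': 12743.391635866245},
    'Argentina': {'2002': 2661.023140961622, '2005': 4628.465370593508},
    'Armenia': {'2002': 788.4799626254804, '2005': 1643.6262756025671},
}

def lowest_with_change(data, base='2002', later='2005', n=5):
    """Rank countries by base-year value and report the change to the later year."""
    # Only countries with a value for both years can be compared.
    complete = [(country, years[base], years[later])
                for country, years in data.items()
                if base in years and later in years]
    complete.sort(key=lambda row: row[1])
    result = []
    for country, old, new in complete[:n]:
        diff = new - old
        # Guard against a base value of 0, where a percentage is undefined.
        percent = diff / old * 100 if old else None
        result.append([country, old, new, diff, percent])
    return result

for row in lowest_with_change(data):
    print(row)
```

Each printed row holds the country, the 2002 value, the 2005 value, the difference and the percent difference, ordered from the lowest 2002 value upward.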


