
How to connect a list of lists to another small list based on index position?

I have to take a list of lists and another list and create a relationship between them. For context, I am not allowed to use pandas, dictionaries, or anything like that.

I have opened two files and written two functions. The first function reads the file containing each year and the 12 job-creation values for that year, and turns it into a list of lists. The second file contains the presidents' names, the years they served, and their party; the second function is meant to turn that file's data into a list.

I have to match each president's party and term with the years they served, and once each president is matched to rows in the list of lists, I have to average the values.

What I need help with is merging the lists so that each president ends up in a list together with their years and the data for those years. For example: [James Earl Carter, 1979-1981, Democrat, 14090, 14135, etc...] covering all the years he served.

Here is an example of both files:

1979,14090,14135,14152,14191,14221,14239,14288,14328,14422,14484,14532,14559
1980,14624,14747,14754,14795,14827,14784,14861,14870,14824,14900,14903,14946
1981,14969,14981,14987,14985,14971,14963,14993,15007,14971,15028,15073,15075
1982,15056,15056,15050,15075,15132,15207,15299,15328,15403,15463,15515,15538
1983,15611,15671,15731,15797,15834,15852,15901,15891,15819,15858,15894,15911
1984,15937,15947,15956,15977,15990,16045,16150,16229,16128,16136,16173,16180
1985,16201,16226,16296,16583,16454,16441,16418,16410,16330,16386,16391,16373
1986,16360,16346,16292,16260,16198,16159,16175,16110,16031,16069,16078,16073
1987,16041,16011,16024,16010,16003,16016,15890,15930,15923,15956,15977,15981
1988,16023,16004,16005,15990,16005,16020,16011,16016,16042,15986,15997,16008
1989,16010,16025,16030,16075,16103,16127,16172,16224,16255,16274,16311,16282

James Earl Carter, 1979-1981, Democrat
Ronald Wilson Reagan, 1981-1989, Republican

And here is the code I have so far:

def avg():
    file = open("government_employment_Windows.txt")
    my_list = []

    for line in file:
        line.strip()
        line = line.split(',')
        line[1] = int(line[1])
        line[2] = int(line[2])
        line[3] = int(line[3])
        line[4] = int(line[4])
        line[5] = int(line[5])
        line[6] = int(line[6])
        line[7] = int(line[7])
        line[8] = int(line[8])
        line[9] = int(line[9])
        line[10] = int(line[10])
        line[11] = int(line[11])
        line[12] = int(line[12])

        my_list.append(line)
    file.close()
    return my_list


def pres(list_of_lists):
    p_file = open("presidents_Windows.txt")
    print(list_of_lists)

    for line in p_file:
        line = line.strip()
        line = line.split(',')
        line[1] = line[1].strip()
        line[2] = line[2].strip()
    print(line)

    for small_list in list_of_lists:
        if line[0] in small_list:
            small_list.append(line[0])
    print(list_of_lists)
list_of_lists = avg()
pres(list_of_lists)

The final result should look like:

Government employement average per month:
    Republican: 18562
    Democrat: 19599


Government Employment by President:
          First Month    Last Month    Difference
Carter:   14090          14946
Reagan:   14969          16008
Bush:     16010          17347
Clinton:  17365          19466
Bush:     19450          21546
Obama:    21538          22266
Trump:    22264          21902

Because this is a homework problem, I will not solve the entire problem for you; however, I will go over how to merge the lists based on the years a president served. You can take the output below, format it however you like, and compute the average quite easily.

To start, let's say you have the jobs and presidents in their respective lists (after you clean the data from the files), like so:

jobs = [
    [1979,14090,14135,14152,14191,14221,14239,14288,14328,14422,14484,14532,14559],
    [1980,14624,14747,14754,14795,14827,14784,14861,14870,14824,14900,14903,14946],
    [1981,14969,14981,14987,14985,14971,14963,14993,15007,14971,15028,15073,15075],
    [1982,15056,15056,15050,15075,15132,15207,15299,15328,15403,15463,15515,15538],
    [1983,15611,15671,15731,15797,15834,15852,15901,15891,15819,15858,15894,15911],
    [1984,15937,15947,15956,15977,15990,16045,16150,16229,16128,16136,16173,16180],
    [1985,16201,16226,16296,16583,16454,16441,16418,16410,16330,16386,16391,16373],
    [1986,16360,16346,16292,16260,16198,16159,16175,16110,16031,16069,16078,16073],
    [1987,16041,16011,16024,16010,16003,16016,15890,15930,15923,15956,15977,15981],
    [1988,16023,16004,16005,15990,16005,16020,16011,16016,16042,15986,15997,16008],
    [1989,16010,16025,16030,16075,16103,16127,16172,16224,16255,16274,16311,16282]
]

pres = [
    ["James Earl Carter", "1979-1981", "Democrat"],
    ["Ronald Wilson Reagan", "1981-1989", "Republican"]
]

For demonstration purposes, let's define carter in a separate variable:

carter = pres[0]

To merge this data, you can define two functions. The first turns the "YYYY-YYYY" string into a list of integer years, which you can use to select the job rows for each year the president served. The second actually merges the lists together.

def to_range(years):
    r = [int(i) for i in years.split("-")]
    return list(range(r[0], r[1] + 1))

The above function takes a string as input and expects it in the "YYYY-YYYY" form. For example, if we want the years that Carter served, we can say:

to_range(carter[1])

which outputs:

[1979, 1980, 1981]

Now that we have all of the years that Carter served, we can move on to the second function. If you want a one-liner, you can use this rather long and ugly list comprehension:

def get_all_jobs(year_range, jobs):
    return [n[i] for n in jobs if n[0] in year_range for i in range(1, len(n))]

If you're looking for a "prettier" alternative, using extend will also work:

def get_all_jobs(year_range, jobs):
    all_jobs = []
    for n in jobs:
        if n[0] in year_range:
            all_jobs.extend(n[1:])
    return all_jobs

The above function creates one merged list. If we call the function like this, passing in Carter's range:

carter_years = to_range(carter[1])
carter_jobs = get_all_jobs(carter_years, jobs)

we get:

[14090, 14135, 14152, 14191, 14221, 14239, 14288, 14328, 14422, 14484, 14532, 14559, 14624, 14747, 14754, 14795, 14827, 14784, 14861, 14870, 14824, 14900, 14903, 14946, 14969, 14981, 14987, 14985, 14971, 14963, 14993, 15007, 14971, 15028, 15073, 15075]
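
From a merged list like this one, the summary figures in the question follow directly. Here is a minimal sketch, assuming the list is in chronological order; the helper name summarize is hypothetical and not part of the answer above:

```python
def summarize(monthly_values):
    """Return (first, last, difference, average) for a chronological list of monthly values."""
    first = monthly_values[0]
    last = monthly_values[-1]
    average = sum(monthly_values) / len(monthly_values)
    return first, last, last - first, average


# Shortened sample: the first three and last three values of the merged list above
sample = [14090, 14135, 14152, 15028, 15073, 15075]
first, last, difference, average = summarize(sample)
print(first, last, difference, round(average))
```

One thing to watch: to_range includes the end year, so 1981 would be counted for both Carter and Reagan. The expected output in the question (Carter's last month is 14946, Reagan's first is 14969) suggests the handover year belongs to the incoming president, so you may need to leave the final year out of the range.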

Here's an example to get you started:

from io import StringIO

# sample data, these behave like a file,
# as if you did government_employment = open('filename.txt')
f_government_employment = StringIO("""1979,14090,14135,14152,14191,14221,14239,14288,14328,14422,14484,14532,14559
1980,14624,14747,14754,14795,14827,14784,14861,14870,14824,14900,14903,14946
1981,14969,14981,14987,14985,14971,14963,14993,15007,14971,15028,15073,15075
1982,15056,15056,15050,15075,15132,15207,15299,15328,15403,15463,15515,15538
1983,15611,15671,15731,15797,15834,15852,15901,15891,15819,15858,15894,15911
1984,15937,15947,15956,15977,15990,16045,16150,16229,16128,16136,16173,16180
1985,16201,16226,16296,16583,16454,16441,16418,16410,16330,16386,16391,16373
1986,16360,16346,16292,16260,16198,16159,16175,16110,16031,16069,16078,16073
1987,16041,16011,16024,16010,16003,16016,15890,15930,15923,15956,15977,15981
1988,16023,16004,16005,15990,16005,16020,16011,16016,16042,15986,15997,16008
1989,16010,16025,16030,16075,16103,16127,16172,16224,16255,16274,16311,16282""")

f_presidents = StringIO("""James Earl Carter, 1979-1981, Democrat
Ronald Wilson Reagan, 1981-1989, Republican""")


# ideally, you'd use classes for something like this, but since the exercise appears
# to be about doing it with basic data structures, just using functions and lists:

def read_employment(f):
    # reads the employment 'file' into a list of lists that pairs up a year and a
    # list of months of employment
    for line in f:
        line = [int(x) for x in line.strip().split(',')]
        # you'd use tuples, but since you want lists only
        yield [line[0], line[1:]]


def get_employment(employment, year, month):
    # given an iterable (like a list) created from the read_employment generator,
    # gets the employment for a specific year and month combo
    for e in employment:
        if e[0] == year:
            # month-1, to have months from 1-12
            return e[1][month-1]


def read_presidents(f):
    for line in f:
        line = line.split(',')
        term = [int(x) for x in line[1].strip().split('-')]
        yield [line[0].strip(), term, line[2].strip()]


def get_term_months(presidents, president):
    for p in presidents:
        if president in p[0]:
            for year in range(p[1][0], p[1][1]):
                for month in range(1, 13):
                    # skip January of the first year, the previous president is still in office
                    if month != 1 or year != p[1][0]:
                        yield [year, month]
            # only January of the last year
            yield [p[1][1], 1]


def main():
    government_employment = list(read_employment(f_government_employment))
    # for example, showing the government employment for March, 1981
    print(get_employment(government_employment, 1981, 3))

    presidents = list(read_presidents(f_presidents))
    # for example, showing all the months for Carter
    print(list(get_term_months(presidents, 'Carter')))

    # now, you can do things like compute the average government employment
    # during the Carter presidency
    values = [
        get_employment(government_employment, year_month[0], year_month[1])
        for year_month in get_term_months(presidents, 'Carter')
    ]
    print(f'Average government employment during the Carter presidency: {sum(values)/len(values)}')


if __name__ == '__main__':
    main()

Mind you, there are still some ugly parts in here. For example, if you used tuples as well as lists, something like this:

    values = [
        get_employment(government_employment, year_month[0], year_month[1])
        for year_month in get_term_months(presidents, 'Carter')
    ]

would look like this (assuming get_term_months now yields tuples):

    values = [
        get_employment(government_employment, year, month)
        for year, month in get_term_months(presidents, 'Carter')
    ]
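
To make the tuple version concrete, here is a minimal, self-contained sketch. It assumes a simplified, hypothetical term_months(start_year, end_year) that takes the term boundaries directly instead of searching the presidents list:

```python
def term_months(start_year, end_year):
    """Yield (year, month) tuples from February of start_year through January of end_year."""
    for year in range(start_year, end_year):
        for month in range(1, 13):
            # skip January of the first year; the previous president is still in office
            if month != 1 or year != start_year:
                yield (year, month)
    # only January of the final year
    yield (end_year, 1)


months = list(term_months(1979, 1981))
# tuple unpacking in the loop header replaces year_month[0] / year_month[1]
labels = [f"{year}-{month:02d}" for year, month in months]
print(labels[0], labels[-1], len(labels))
```

Unpacking works on two-element lists as well, so the real gain from tuples is signalling that each (year, month) pair is a fixed-size record rather than a growable sequence.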

This is well beyond the scope of your question, but to give you a sense: using classes would let you put things together in an even nicer way. For example, this is what the employment part starts to look like:

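class Employment:
    def __init__(self, f):
        # one row per line of the file: [year, month_1, ..., month_12]
        self.data = [[int(x) for x in line.strip().split(',')] for line in f]

    def get_employment(self, year, month):
        for e in self.data:
            if e[0] == year:
                # index 0 holds the year, so months 1-12 sit at indices 1-12
                return e[month]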


government_employment = Employment(f_government_employment)
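
As a rough illustration of where this could go, here is a self-contained sketch of a class along the same lines. The name EmploymentTable and the average method are assumptions made for this example, not something from the exercise:

```python
from io import StringIO


class EmploymentTable:
    """Hypothetical variant of the class above with an extra averaging method."""

    def __init__(self, f):
        # each row is [year, month_1, ..., month_12]; blank lines are skipped
        self.data = [
            [int(x) for x in line.strip().split(',')]
            for line in f
            if line.strip()
        ]

    def get_employment(self, year, month):
        for row in self.data:
            if row[0] == year:
                return row[month]
        return None  # year not present

    def average(self, year):
        # mean of the monthly values for one year (assumed helper)
        for row in self.data:
            if row[0] == year:
                return sum(row[1:]) / len(row[1:])
        return None


sample = StringIO(
    "1979,14090,14135,14152,14191,14221,14239,14288,14328,14422,14484,14532,14559\n"
)
table = EmploymentTable(sample)
print(table.get_employment(1979, 3))
print(table.average(1979))
```

Keeping the data and the lookups in one object means callers no longer have to pass the employment list around to every function.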
