
How do I alphabetize a file in Python?

I am trying to get a list of presidents alphabetized by last name, even though the file it is drawn from currently lists first name, last name, date in office, and date out of office.

Here is what I have; any help on what I need to do with it would be appreciated. I have searched around for answers, and most of them are beyond my level of understanding. I feel like I am missing something small. I tried to break them all out into a list and then sort them, but I could not get it to work, so this is where I started from.

INPUT_FILE = 'presidents.txt'
OUTPUT_FILE = 'president_NEW.txt'
OUTPUT_FILE2 = 'president_NEW2.txt'

def main():
  infile = open(INPUT_FILE)
  outfile = open(OUTPUT_FILE, 'w')
  outfile2 = open(OUTPUT_FILE2,'w')

  stuff = infile.readline()

  while stuff:
    stuff = stuff.rstrip()
    data = stuff.split('\t')

    president_First = data[1]
    president_Last = data[0]
    start_date = data[2]
    end_date = data[3]

    sentence = '%s %s was president from %s to %s' % \
              (president_First,president_Last,start_date,end_date)
    sentence2 = '%s %s was president from %s to %s' % \
               (president_Last,president_First,start_date, end_date)

    outfile2.write(sentence2+ '\n')
    outfile.write(sentence + '\n')

    stuff = infile.readline()

  infile.close()
  outfile.close()
  outfile2.close()

main()

What you should do is put the presidents in a list, sort that list, and then print out the resulting list.

Before your for loop, add:

presidents = []

Put this code inside the for loop, after you pull out the names/dates:

president = (last_name, first_name, start_date, end_date)
presidents.append(president)

After the for loop:

presidents.sort() # because we put last_name first above
# it will sort by last_name

Then print it out:

for president in presidents:
    last_name, first_name, start_date, end_date = president
    string1 = "..."
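
Put together, the steps above might look like the following sketch. It assumes a tab-separated file with the last name in the first column, as in the question's code; the function name `sort_presidents` and the sample rows are made up for illustration and stand in for the real presidents.txt.

```python
def sort_presidents(lines):
    """Return (last, first, start, end) tuples sorted by last name."""
    presidents = []
    for line in lines:
        # Assumed layout: last name, first name, start date, end date (tab-separated)
        last_name, first_name, start_date, end_date = line.rstrip('\n').split('\t')
        presidents.append((last_name, first_name, start_date, end_date))
    # Tuples compare element by element, so last_name is compared first
    presidents.sort()
    return presidents


# Hypothetical sample rows standing in for the lines of presidents.txt
sample = [
    "Washington\tGeorge\t1789\t1797\n",
    "Adams\tJohn\t1797\t1801\n",
    "Jefferson\tThomas\t1801\t1809\n",
]

for last_name, first_name, start_date, end_date in sort_presidents(sample):
    print('%s %s was president from %s to %s'
          % (first_name, last_name, start_date, end_date))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.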

It sounds like you tried to break them out into a list. If you had trouble with that, show us the code that resulted from that attempt. It was the right way to approach the problem.

Other comments:

Just a couple of points where your code could be simpler. Feel free to ignore or use them as you want:

president_First=data[1]
president_Last= data[0]
start_date=data[2]
end_date=data[3]

can be written as:

president_Last, president_First, start_date, end_date = data


stuff=infile.readline()

And

while stuff:
    stuff=stuff.rstrip()
    data=stuff.split('\t')
    ...
    stuff = infile.readline()

can be written as:

for stuff in infile:
    ...

#!/usr/bin/env python

# this sounds like a homework problem, but ...

def main():
    # input
    with open('presidents.txt', 'r') as fi:
        # read and parse (the question's file is tab-separated)
        presidents = [[x.strip() for x in line.split('\t')] for line in fi]
    # sort on the second field (index 1); Python 3 takes key= instead of cmp=
    presidents = sorted(presidents, key=lambda x: x[1])
    # output
    with open('presidents_out.txt', 'w') as fo:
        for pres in presidents:
            print("president %s %s was president %s %s" % tuple(pres), file=fo)

if __name__ == '__main__':
    main()

I tried to break them all out into a list, and then sort them

What do you mean by "them"?

Breaking up the line into a list of items is a good start: that means you treat the data as a set of values (one of which is the last name) rather than just a string. However, just sorting that list is no use; Python will take the 4 strings from the line (the first name, last name, etc.) and put them in order among themselves.
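
As a small illustration of why sorting one line's fields does not help, consider the sketch below. The sample record is invented and assumes the first-name-first layout described in the question.

```python
# One line of the file, split into its four fields
record = "George\tWashington\t1789\t1797".split('\t')

# Sorting this list only reorders the fields of a single president
print(sorted(record))
```

The result mixes dates and names together, which says nothing about how this president should be ordered relative to the others.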

What you want to do is have a list of those lists, and sort it by last name.

Python's lists provide a sort method that sorts them. When you apply it to the list of president-info-lists, it will sort those. But the default sorting for lists will compare them item-wise (first item first, then second item if the first items were equal, etc.). You want to compare by last name, which is the second element in your sublists. (That is, element 1; remember, we start counting list elements from 0.)
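
A quick sketch of that default behaviour, using a few invented sample rows in first-name-first order:

```python
rows = [
    ["John", "Adams", "1797", "1801"],
    ["George", "Washington", "1789", "1797"],
    ["Andrew", "Jackson", "1829", "1837"],
]

# With no key, lists are compared item by item, so item 0 (the first name) decides
rows.sort()
for row in rows:
    print(row)
```

The rows come out ordered by first name (Andrew, George, John), not by last name.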

Fortunately, it is easy to give Python more specific instructions for sorting. We can pass the sort method a key argument, which is a function that "translates" each item into the value we want to sort by. Yes, in Python everything is an object, including functions, so there is no problem passing a function as a parameter. We want to sort "by last name", so we would pass a function that accepts a president-info-list and returns the last name (i.e., element [1]).
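
Written by hand, such a key function could look like the sketch below. The helper name `last_name` and the sample rows are hypothetical, and the last name is assumed to be item 1 of each record.

```python
def last_name(record):
    # Hypothetical helper: assumes the last name is item 1 of each record
    return record[1]


rows = [
    ["George", "Washington", "1789", "1797"],
    ["John", "Adams", "1797", "1801"],
    ["Andrew", "Jackson", "1829", "1837"],
]

# Pass the function itself (no parentheses); sort calls it once per record
rows.sort(key=last_name)
for row in rows:
    print(row[1], row[0])
```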

Fortunately, this is Python, and "batteries are included"; we don't even have to write that function ourselves. We are given a magical tool that creates functions that return the nth element of a sequence (which is what we want here). It's called itemgetter (because it makes a function that gets the nth item of a sequence; "item" is the more usual Python terminology, while "element" is a more general CS term), and it lives in the operator module.
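
A brief sketch of what itemgetter produces, again with invented sample rows. The second call, which breaks ties on the first name, goes beyond what the question asks and is shown only as an optional extra.

```python
from operator import itemgetter

# itemgetter(1) builds a function equivalent to: lambda record: record[1]
get_last = itemgetter(1)
print(get_last(["George", "Washington", "1789", "1797"]))

rows = [
    ["George", "Washington", "1789", "1797"],
    ["John Quincy", "Adams", "1825", "1829"],
    ["John", "Adams", "1797", "1801"],
]

# With several indices, itemgetter returns a tuple: last name, then first name
rows.sort(key=itemgetter(1, 0))
for row in rows:
    print(row)
```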

By the way, there are also much neater ways to handle the file opening/closing, and we don't need to write an explicit loop to handle reading the file: we can iterate directly over the file (for line in file: gives us the lines of the file in turn, one each time through the loop), and that means we can just use a list comprehension (look them up).

import operator

def main():
  # We'll set up 'infile' to refer to the opened input file, making sure it is automatically
  # closed once we're done with it. We do that with a 'with' block; we're "done with the file"
  # at the end of the block.
  with open(INPUT_FILE) as infile:
    # We want the split, rstripped line for each line in the infile, which is spelled:
    data = [line.rstrip().split('\t') for line in infile]

  # Now we re-arrange that data. We want to sort the data, using an item-getter for
  # item 1 (the last name) as the sort-key. That is spelled:
  data.sort(key=operator.itemgetter(1))

  # The output file must be opened for writing ('w'); the default mode is read-only.
  with open(OUTPUT_FILE, 'w') as outfile:
    # Let's say we want to write the formatted string for each line in the data.
    # Now we're taking action instead of calculating a result, so we don't want
    # a list comprehension any more - so we iterate over the items of the sorted data:
    for item in data:
      # The item already contains all the values we want to interpolate into the string,
      # in the right order; '%' needs a tuple rather than a list, so we convert it first:
      outfile.write('%s %s was president from %s to %s\n' % tuple(item))

I did get this working with Karl's help above, although I had to edit the code to get it to work for me, due to some errors I was getting. I eliminated those and ended up with this.

import operator

INPUT_FILE = 'presidents.txt'

OUTPUT_FILE2 = 'president_NEW2.txt'

def main():
    with open(INPUT_FILE) as infile:
        data = [line.rstrip().split('\t') for line in infile]

    # In this file the last name is the first column (item 0)
    data.sort(key=operator.itemgetter(0))

    outfile = open(OUTPUT_FILE2, 'w')

    for item in data:
        last = item[0]
        first = item[1]
        start = item[2]
        end = item[3]

        outfile.write('%s %s was president from %s to %s\n' % (last, first, start, end))

    outfile.close()

main()
