
Sort a file and get the first three words alphabetically in Python

So I'm trying to print the first three words alphabetically from a text file. I know I need to sort the file first so it's in alphabetical order, and I tried this:

def top_three_by_alphabet(fileName):
 lines = f.readlines()
 lines.sort()
 print ('The first three words alphabetically in the file are' +
 str(line[:3])
print top_three_by_alphabet(f)

With this I'm getting a syntax error on my final print statement.

You don't need to call print on the function, because the function does the printing itself.

It looks like you're missing a closing parenthesis on your "print" function, and a typo on lines/line.

def top_three_by_alphabet(fileName):
    # Read the non-blank lines, stripped of surrounding whitespace
    with open(fileName) as f:
        lines = [line.strip() for line in f if not line.isspace()]
    lines.sort()
    print('The first three words alphabetically in the file are ' +
          str(lines[:3]))

top_three_by_alphabet('yourfilename')

You open a ( after the first print statement but never close it.

You should also pass the file name to the function: print top_three_by_alphabet('yourfilename') instead of print top_three_by_alphabet(f)
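If you do want to print the result at the call site, one option is to have the function return the list instead of printing it. Otherwise the outer print only shows None. This is a minimal sketch, not the asker's code: the name top_three_lines and the throwaway demo file are made up for illustration.

```python
import os
import tempfile


def top_three_lines(file_name):
    """Return the first three non-blank lines of the file in sorted order."""
    with open(file_name) as handle:
        lines = [line.strip() for line in handle if not line.isspace()]
    return sorted(lines)[:3]


# Demo with a throwaway file (hypothetical content) so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("pear\napple\n\nfig\nbanana\n")
    demo_path = tmp.name

result = top_three_lines(demo_path)
print("The first three lines alphabetically are " + str(result))
os.remove(demo_path)
```

Returning the value keeps the function reusable, and the caller decides whether to print it.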

Small tutorial

How to read a file, line by line, using "utf8" encoding:

import io

filename = "path/to/file.txt"
with io.open(filename, mode="r", encoding="utf8") as fd:
    for line in fd:
        line = line.rstrip()  # drop the newline
        print(line)

How to sort a collection (iterable):

coll = ["acb", "bac", "cab", "fac", "dac", "gag", "abc"]

# using a sorted iterator:
for item in sorted(coll):
    print(item, end=" ")
print()
# -> abc acb bac cab dac fac gag

# sorting a list
coll.sort()
print(coll)
# -> ['abc', 'acb', 'bac', 'cab', 'dac', 'fac', 'gag']

# using a criterion
coll.sort(key=lambda i: i[::-1])
print(coll)
# -> ['cab', 'acb', 'bac', 'dac', 'fac', 'abc', 'gag']
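A criterion that often matters for "alphabetical" order is case. By default, strings are compared by code point, so uppercase letters sort before lowercase ones. A small sketch with a made-up word list, using str.lower as the key:

```python
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Default ordering compares code points, so uppercase letters come first.
default_order = sorted(words)

# key=str.lower compares lowercased copies; the original strings are kept.
case_insensitive = sorted(words, key=str.lower)

print(default_order)
# -> ['Apple', 'Banana', 'apple', 'banana', 'cherry']
print(case_insensitive)
# -> ['Apple', 'apple', 'banana', 'Banana', 'cherry']
```

The sort is stable, so words whose keys are equal keep their original relative order.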

Putting it all together:

with io.open(filename, mode="r", encoding="utf8") as fd:
    for line in sorted(fd):
        line = line.rstrip()  # drop the newline
        print(line)

Splitting the words:

# The lazy way:
text = "The quick, brown, fox jumps *over* the lazy dog"
parts = text.split()
first_three_words = " ".join(parts[:3])
print(first_three_words)
# -> The quick, brown,

# using RegEx
import re

text = "The quick, brown, fox jumps *over* the lazy dog"
split_words = re.compile(r"\W+").split  # raw string, so \W is not read as a string escape
parts = split_words(text)
first_three_words = " ".join(parts[:3])
print(first_three_words)
# -> The quick brown
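The pieces above can be combined to answer the original question: iterate over the lines, split each one into words, sort, and keep the first three. This is only a sketch. The function name first_three_words is made up, and an in-memory io.StringIO stands in for the opened file so the example runs without a real path.

```python
import io
import re

WORD_SPLIT = re.compile(r"\W+")


def first_three_words(stream):
    """Return the three alphabetically smallest words found in a text stream."""
    words = []
    for line in stream:
        # re.split can yield empty strings at line edges, so filter them out.
        words.extend(part for part in WORD_SPLIT.split(line) if part)
    return sorted(words)[:3]


# Stand-in for: io.open(filename, mode="r", encoding="utf8")
sample = io.StringIO(u"The quick, brown, fox\njumps *over* the lazy dog\n")
top_words = first_three_words(sample)
print(top_words)
# -> ['The', 'brown', 'dog']
```

Note that "The" comes first because uppercase letters sort before lowercase ones in the default ordering.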

Your code will give you the first three lines. To get the first three words, you should do something like this:

def top_three_by_alphabet(file):
    data = file.read()    # read the whole file as one string
    data = data.split()   # split on any whitespace to isolate every word
    data.sort()
    print('The first three words alphabetically in the file are ' + str(data[:3]))

with open('yourfilename') as file:   # placeholder file name
    top_three_by_alphabet(file)

Hope it's helpful!

