How to get just the first word of every line of a file using Python?

I'm a beginner and not sure how to phrase this question, so let me explain. I am writing a Somali dictionary as a text file, and it contains many words with their meanings. I want to copy only the words, without their meanings, into another text file so that I have a plain vocabulary list. Is there a way to do that? Example: " abaabid m.dh eeg abaab². ld ababid. ld abaab¹, abaabis." I have hundreds of these entries, and from this one I want to pick only the word "abaabid", and likewise for the rest. How can I automate this in Python instead of copying and pasting manually all day? Please stop asking me to post the code as text: I don't know how to write the code, which is why I'm asking. The screenshot shows the text file with the words and their meanings.

If you just want a script to read the dictionary entries and then write the words into a separate file, try something like this:


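def get_words(filename='Somali Dictionary.txt'):
    # Read the dictionary and keep the first word of each non-blank line.
    with open(filename, 'r', encoding='utf-8') as f:
        # split() with no argument ignores leading whitespace;
        # "if line.strip()" skips empty and whitespace-only lines.
        words = [line.split()[0] for line in f if line.strip()]
    return words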

def write_words(lines, filename='Somali Words.txt'):
    # Write one word per line; the with block closes the file automatically.
    with open(filename, 'w', encoding='utf-8') as f:
        for line in lines:
            f.write(line)
            f.write('\n')

Example usage:

words = get_words()
write_words(words)

Or, alternatively:

if __name__ == '__main__':
    words = get_words()
    write_words(words)

To get the first word of every line, loop over the file and split each line:

f = open('file.txt', 'r')
for line in f:
    print(line.split(' ')[0])

or

with open('convert.txt', 'r') as f:
    for line in f:
        print(line.split(' ')[0])
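One caveat with split(' '): if a line begins with a space, as the sample entry in the question appears to, the first element is an empty string rather than the word. Calling split() with no argument splits on any run of whitespace and ignores leading whitespace. A minimal sketch of the difference; the helper name first_word is only for illustration:

```python
def first_word(line):
    """Return the first whitespace-delimited token of line, or None if the line is blank."""
    parts = line.split()  # no argument: split on any whitespace, ignore leading/trailing
    return parts[0] if parts else None


sample = " abaabid m.dh eeg abaab². ld ababid."
print(repr(sample.split(' ')[0]))  # '' because the line starts with a space
print(first_word(sample))          # abaabid
```

Returning None for blank lines also avoids the IndexError that indexing [0] on an empty list would raise.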

If the console shows an error such as "UnicodeDecodeError: 'charmap' codec can't decode ...", you can fix it by passing encoding='utf-8' to open(). (I'm using a .txt file saved as UTF-8.) This is how to add it to your code:

with open('convert.txt', 'r', encoding='utf-8') as f:
    for line in f:
        print(line.split(' ')[0])
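Since the question asks for the words to end up in a separate file rather than on the console, the same loop can write its output instead of printing it. The following is a sketch under a few assumptions: the function name extract_first_words and the file names are placeholders, and both files are treated as UTF-8.

```python
def extract_first_words(src_path, dst_path, encoding="utf-8"):
    """Copy the first word of each non-blank line in src_path to dst_path.

    Returns the number of words written.
    """
    count = 0
    with open(src_path, "r", encoding=encoding) as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in src:           # read one line at a time
            parts = line.split()   # split on any whitespace
            if not parts:          # skip empty or whitespace-only lines
                continue
            dst.write(parts[0] + "\n")
            count += 1
    return count


# Example call with hypothetical file names:
# extract_first_words("convert.txt", "words.txt")
```

Reading line by line keeps memory use small even for a large dictionary file, and the with block closes both files when the loop finishes.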
