How can I print elements from lines of a text file that my code selects based on its inclusion of a particular string?

I am new to Python, and I am trying to figure out a problem. I have 2 text files. The first one contains a word (I started with one word for simplicity); I read that word, assign it to a string variable, then look for this word in tens of thousands of lines in my 2nd text file. This part I have completed. Now onto my issue.

The 2nd text file contains 4 columns. To keep things simple, I'll give an example below:

Alpha 100 200 thewordiamlookingforisapple
Beta 200 300 thewordiamnotlookingforispear
Gamma 300 400 onceagainapple
Theta 400 500 onceagainapple
Omega 500 600 andonceagainpear

Let's say that I am looking for the string "apple" and lines 1, 3 and 4 contain it. Now I want to print the 1st, 2nd, and 3rd columns of the associated lines.

My code so far is this:

def word_match(File, String):
    wordnumber = 0
    listOfAssociatedWords = []
    with open(File, 'r') as read_obj:
        for line in read_obj:
            wordnumber += 1
            if String in line:
                listOfAssociatedWords.append((wordnumber, line.rstrip()))

    return listOfAssociatedWords
#------------------------------------------------------------------------------
# Read the search word; strip() removes any trailing newline from the file.
with open("/Directory/firstfilename", "r") as firstfile:
    String = firstfile.read().strip()
#------------------------------------------------------------------------------
matched_words = word_match("/Directory/secondfilename", String)
print('Total Matched Words : ', len(matched_words))
for elem in matched_words:
    print('Word Number = ', elem[0], ' :: Line = ', elem[1])
Current Output:
('Total Matched Words : ', 3)
('Word Number = ', 1, ' :: Line = ', 'Alpha 100 200 thewordiamlookingforisapple')
('Word Number = ', 3, ' :: Line = ', 'Gamma 300 400 onceagainapple')
('Word Number = ', 4, ' :: Line = ', 'Theta 400 500 onceagainapple')


Desired Output:
Alpha 100 200
Gamma 300 400
Theta 400 500

I think you want this:

def word_match(File, String):
    wordnumber = 0
    listOfAssociatedWords = []
    with open(File, 'r') as read_obj:
        for line in read_obj:
            wordnumber += 1
            if String in line:
                listOfAssociatedWords.append(line.split()[:3])

    return listOfAssociatedWords
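With that change each entry is a list of three strings, so the printing loop has to join them back together to get the desired output. Here is a minimal sketch of the whole idea, assuming the columns are separated by whitespace. The function name `first_three_columns` and the in-memory `sample_lines` list are only for illustration; with a real file you would pass the open file object instead of the list.

```python
def first_three_columns(lines, needle):
    """Return the first three whitespace-separated fields of every line containing needle."""
    results = []
    for line in lines:
        if needle in line:
            # split() with no arguments splits on any run of whitespace
            results.append(" ".join(line.split()[:3]))
    return results


# Sample data copied from the question (stands in for the second file).
sample_lines = [
    "Alpha 100 200 thewordiamlookingforisapple",
    "Beta 200 300 thewordiamnotlookingforispear",
    "Gamma 300 400 onceagainapple",
    "Theta 400 500 onceagainapple",
    "Omega 500 600 andonceagainpear",
]

for row in first_three_columns(sample_lines, "apple"):
    print(row)
```

Note that `needle in line` checks the whole line, not just the 4th column. If the word could also appear in the first column, compare against `line.split()[3]` instead.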

Another easy way to do this is to use pandas. You can read your file into a pandas dataframe. This way, if you later want to add more complexity to the logic, it would be fairly easy. Using pandas you can achieve the expected results as:

import pandas as pd

initial_word = 'apple'

sample_dict = {'col1': ['Alpha', 'Beta', 'Gamma', 'Theta', 'Omega'], 'col2': [100, 200, 300, 400, 500],
                'col3': [200, 300, 400, 500, 600],
                'col4': ['thewordiamlookingforisapple', 'thewordiamnotlookingforispear', 'onceagainapple', 'onceagainapple', 'andonceagainpear']}
df = pd.DataFrame(data=sample_dict)

print(df)
new_df = df[df['col4'].str.contains(initial_word)]
new_df = new_df.drop(columns='col4')  # positional axis argument was removed in pandas 2.0
print(new_df)

The output would look like (for df):

    col1  col2  col3                           col4
0  Alpha   100   200    thewordiamlookingforisapple
1   Beta   200   300  thewordiamnotlookingforispear
2  Gamma   300   400                 onceagainapple
3  Theta   400   500                 onceagainapple
4  Omega   500   600               andonceagainpear

And for new_df:

    col1  col2  col3
0  Alpha   100   200
2  Gamma   300   400
3  Theta   400   500

You can read the txt file and convert it to a pandas dataframe initially.
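As a rough sketch of that reading step, assuming the file is whitespace-delimited with no header row: `pd.read_csv` accepts `sep=r"\s+"` for this. The column names `col1` to `col4` are the same made-up names used above, and `io.StringIO` stands in for the real file so the example is self-contained; for an actual file you would pass its path instead.

```python
import io

import pandas as pd

# Sample data copied from the question (stands in for the second file).
sample_text = """Alpha 100 200 thewordiamlookingforisapple
Beta 200 300 thewordiamnotlookingforispear
Gamma 300 400 onceagainapple
Theta 400 500 onceagainapple
Omega 500 600 andonceagainpear
"""

# sep=r"\s+" splits on any run of whitespace; header=None because the
# file has no header line, so the (hypothetical) names are supplied here.
df = pd.read_csv(
    io.StringIO(sample_text),
    sep=r"\s+",
    header=None,
    names=["col1", "col2", "col3", "col4"],
)

# regex=False treats the search word as a plain substring.
matches = df[df["col4"].str.contains("apple", regex=False)]
result = matches[["col1", "col2", "col3"]]

print(result.to_string(index=False, header=False))
```

Selecting the three wanted columns directly avoids the separate `drop` step.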
