How can I print elements from lines of a text file that my code selects based on its inclusion of a particular string?
I am new to Python and am trying to solve a problem. I have two text files. The first contains a single word (I started with one word for simplicity); I read that word, assign it to a string variable, and then look for it in the tens of thousands of lines of my second text file. This part works. Now onto my issue.
The second text file contains four columns. To keep things simple, here is an example:
Alpha 100 200 thewordiamlookingforisapple
Beta 200 300 thewordiamnotlookingforispear
Gamma 300 400 onceagainapple
Theta 400 500 onceagainapple
Omega 500 600 andonceagainpear
Let's say that I am looking for the string "apple", and lines 1, 3 and 4 contain it. Now I want to print the 1st, 2nd, and 3rd columns of those lines.
My code so far is this:
def word_match(File, String):
    wordnumber = 0
    listOfAssociatedWords = []
    with open(File, 'r') as read_obj:
        for line in read_obj:
            wordnumber += 1
            if String in line:
                listOfAssociatedWords.append((wordnumber, line.rstrip()))
    return listOfAssociatedWords
#------------------------------------------------------------------------------
firstfile = open("/Directory/firstfilename", "r")
String = firstfile.read().strip()  # strip surrounding whitespace such as a trailing newline
firstfile.close()
#------------------------------------------------------------------------------
matched_words = word_match("/Directory/secondfilename", String)
print('Total Matched Words : ', len(matched_words))
for elem in matched_words:
    print('Word Number = ', elem[0], ' :: Line = ', elem[1])
Current Output:
('Total Matched Words : ', 3)
('Word Number = ', 1, ' :: Line = ', 'Alpha 100 200 thewordiamlookingforisapple')
('Word Number = ', 3, ' :: Line = ', 'Gamma 300 400 onceagainapple')
('Word Number = ', 4, ' :: Line = ', 'Theta 400 500 onceagainapple')
Desired Output:
Alpha 100 200
Gamma 300 400
Theta 400 500
I think you want this:
def word_match(File, String):
    wordnumber = 0
    listOfAssociatedWords = []
    with open(File, 'r') as read_obj:
        for line in read_obj:
            wordnumber += 1
            if String in line:
                listOfAssociatedWords.append(line.split()[:3])
    return listOfAssociatedWords
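To get exactly the desired output, each returned list still has to be joined back into a line when printing. Here is a minimal sketch of that, assuming the columns are separated by whitespace. The function name `matching_columns`, the `n_columns` parameter, and the temporary sample file are made up for illustration and are not part of the original code:

```python
import os
import tempfile

# Sample data copied from the question; stands in for the second text file.
SAMPLE = """Alpha 100 200 thewordiamlookingforisapple
Beta 200 300 thewordiamnotlookingforispear
Gamma 300 400 onceagainapple
Theta 400 500 onceagainapple
Omega 500 600 andonceagainpear
"""


def matching_columns(path, word, n_columns=3):
    """Return the first n_columns fields of every line that contains word."""
    rows = []
    with open(path, "r") as handle:
        for line in handle:
            if word in line:
                # split() with no arguments splits on any run of whitespace
                rows.append(line.split()[:n_columns])
    return rows


# Write the sample data to a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(SAMPLE)
    sample_path = tmp.name

rows = matching_columns(sample_path, "apple")
os.remove(sample_path)

# Join each list of fields back into a single space-separated line.
for row in rows:
    print(" ".join(row))
```

Note that `word in line` matches the word anywhere in the line, including the first three columns. If only the fourth column should be searched, testing `word in line.split()[3]` instead would be one option, provided every line has at least four fields.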
Another easy way to do this is to use pandas. You can read your file into a pandas DataFrame, which makes it fairly easy to add more complex logic later. With pandas you can get the expected result like this:
import pandas as pd

initial_word = 'apple'
# Sample data standing in for the contents of the second text file
sample_dict = {'col1': ['Alpha', 'Beta', 'Gamma', 'Theta', 'Omega'], 'col2': [100, 200, 300, 400, 500],
               'col3': [200, 300, 400, 500, 600],
               'col4': ['thewordiamlookingforisapple', 'thewordiamnotlookingforispear', 'onceagainapple', 'onceagainapple', 'andonceagainpear']}
df = pd.DataFrame(data=sample_dict)
print(df)
# Keep only the rows whose fourth column contains the word
new_df = df[df['col4'].str.contains(initial_word)]
# Use the keyword form: the positional axis in drop('col4', 1) was removed in pandas 2.0
new_df = new_df.drop(columns='col4')
print(new_df)
The output would look like (for df):
col1 col2 col3 col4
0 Alpha 100 200 thewordiamlookingforisapple
1 Beta 200 300 thewordiamnotlookingforispear
2 Gamma 300 400 onceagainapple
3 Theta 400 500 onceagainapple
4 Omega 500 600 andonceagainpear
And for new_df:
col1 col2 col3
0 Alpha 100 200
2 Gamma 300 400
3 Theta 400 500
You can read the text file and convert it to a pandas DataFrame first.
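As a rough sketch of that reading step, `pd.read_csv` can parse whitespace-separated columns directly. This assumes the file has no header row and exactly four whitespace-separated columns; the column names `col1` to `col4` are just labels carried over from the example above, and the in-memory `sample_text` stands in for the real file path:

```python
import io

import pandas as pd

# Stand-in for the second text file; pass the real file path to read_csv instead.
sample_text = """Alpha 100 200 thewordiamlookingforisapple
Beta 200 300 thewordiamnotlookingforispear
Gamma 300 400 onceagainapple
Theta 400 500 onceagainapple
Omega 500 600 andonceagainpear
"""

df = pd.read_csv(
    io.StringIO(sample_text),
    sep=r"\s+",      # split on any run of whitespace
    header=None,     # the file has no header row
    names=["col1", "col2", "col3", "col4"],
)

# regex=False treats the search word as a plain substring, not a pattern.
matches = df[df["col4"].str.contains("apple", regex=False)]
result = matches[["col1", "col2", "col3"]]

# Print the matching rows without the index or the column names.
print(result.to_string(index=False, header=False))
```

Selecting the three wanted columns by name avoids the separate `drop` call, and `to_string(index=False, header=False)` leaves out the index and header that the plain `print(new_df)` shows.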