Copy selected lines from one file to another

I am trying to write a Python program that searches a txt file for user-specified words and copies the lines containing those words into another file.

The user will also have the option to exclude words.

(E.g., suppose the user searches for the word "exception" and wants to exclude the word "abc"; the code will then copy only the lines that contain "exception" but not "abc".)

All of this is to be done from the command prompt.

The input would be:

file.py test.txt(input file) test_mod.txt(output file) -e abc(exclude word denoted by -e) -s exception(search word denoted by -s)

The user should also be able to enter multiple exclude words and multiple search words.

So far, I have got it working with this input format:

file.py test.txt test_mod.txt abc exception

This excludes the word "abc" and searches for "exception".

But I don't know how to:

  1. Include multiple search words and exclude words
  2. Denote them with -e and -s

I have seen the argparse and getopt tutorials, but there's no tutorial on this specific topic.

Could somebody please help me by modifying my code or writing a new one?

Here's my code as of now:

#/Python33

import sys


def main():
    try:
        f1 = open(sys.argv[1], 'r')   # input file: first command-line argument
        found = False
        user_input1 = sys.argv[3]     # the word which is to be excluded
        user_input2 = sys.argv[4]     # the word which is to be included

        if sys.argv[1] == sys.argv[2]:
            f1.close()
            sys.exit('\nERROR!!\nThe two file names cannot be the same.')

        if user_input1 == user_input2:
            f1.close()
            sys.exit('\nERROR!!\nThe word to be excluded and the word to be included cannot be the same.')

        f2 = open(sys.argv[2], 'a')   # output file: opened once, in append mode
        for line in f1:
            # copy the line only if it has the search word but not the excluded word
            if user_input2 in line and user_input1 not in line:
                f2.write(line)
                found = True
        f2.close()
        f1.close()

        if not found:
            print("ERROR: The Word couldn't be found.")

    except IOError:
        print('\nIO error or wrong file name.')
    except IndexError:
        print('\nYou must enter 5 parameters.')   # fewer than 5 inputs is not allowed
    except SystemExit as e:                       # handles sys.exit()
        sys.exit(e)


if __name__ == '__main__':
    main()

Thanks man. That really helped me understand the logic. But I'm new to Python, so I'm still having some issues. Whenever I run it, it copies the lines containing the words specified by -s, but it does not exclude the words specified by -e. What am I doing wrong? Here's my code now:

#/Python33

# Takes a text file, finds a word, and writes each line containing that word but not a
# second word specified by the user. If both are present, the line is not written.

import sys
import argparse


def main():
    try:
        parser = argparse.ArgumentParser(description='Copies selected lines from files')
        parser.add_argument('input_file')
        parser.add_argument('output_file')
        parser.add_argument('-e', action="append")
        parser.add_argument('-s', action="append")
        args = parser.parse_args('test.txt, test_mod.txt, -e , -s exception'.split())

        user_input1 = args.e    # the words which are to be excluded
        user_input2 = args.s    # the words which are to be included

        def include_exclude(input_file, output_file, exclusion_list=[], inclusion_list=[]):
            with open(output_file, 'w') as fo:
                with open(input_file, 'r') as fi:
                    for line in fi:
                        inclusion_words_in_line = map(lambda x: x in line, inclusion_list)
                        exclusion_words_in_line = map(lambda x: x in line, exclusion_list)
                        if any(inclusion_words_in_line) and not any(exclusion_words_in_line):
                            fo.write(line)

        if user_input1 != user_input2:
            include_exclude('test.txt', 'test_mod.txt', user_input1, user_input2)
            print("hello")

        if user_input1 == user_input2:
            sys.exit('\nERROR!!\nThe word to be excluded and the word to be included cannot be the same.')

    except IOError:
        print('\nIO error or wrong file name.')
    except IndexError:
        print('\nYou must enter 5 parameters.')
    except SystemExit as e:
        sys.exit(e)


if __name__ == '__main__':
    main()
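
A likely culprit in the code above, as far as one can tell from reading it: parse_args is given a hard-coded string instead of the real command line, and splitting that string on whitespace leaves a lone "," as the only -e value. The file names passed to include_exclude are hard-coded too. The sketch below first shows what the hard-coded string parses to, then one possible way to read the real arguments; build_parser and the argv parameter of main are names introduced here for illustration only, not part of the original post.

```python
import argparse


def build_parser():
    # Same arguments as in the code above.
    parser = argparse.ArgumentParser(description='Copies selected lines from files')
    parser.add_argument('input_file')
    parser.add_argument('output_file')
    parser.add_argument('-e', action='append')
    parser.add_argument('-s', action='append')
    return parser


# The hard-coded string from the code above, split on whitespace:
hard_coded = 'test.txt, test_mod.txt, -e , -s exception'.split()
args = build_parser().parse_args(hard_coded)
print(args.input_file)  # test.txt,  (note the trailing comma)
print(args.e)           # [',']      (the only "exclude word" is a comma)


def main(argv=None):
    # With argv=None, argparse reads the real command line (sys.argv[1:]);
    # a list can be passed instead when testing.
    parsed = build_parser().parse_args(argv)
    # "or []" turns a missing -e or -s (None) into an empty list.
    return parsed.input_file, parsed.output_file, parsed.s or [], parsed.e or []
```

The four values returned by this hypothetical main could then be handed to include_exclude in place of the hard-coded 'test.txt' and 'test_mod.txt', keeping in mind that the function in the code above takes exclusion_list before inclusion_list.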

I think this does what you want:

In [1]: import argparse

In [2]: parser = argparse.ArgumentParser(description='foo baaar')

In [3]: parser.add_argument('input_file')
Out[3]: _StoreAction(option_strings=[], dest='input_file', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None)

In [4]: parser.add_argument('output_file')
Out[4]: _StoreAction(option_strings=[], dest='output_file', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None)

In [5]: parser.add_argument('-e', action="append")
Out[5]: _AppendAction(option_strings=['-e'], dest='e', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None)

In [6]: parser.add_argument('-s', action="append")
Out[6]: _AppendAction(option_strings=['-s'], dest='s', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None)

In [7]: parser.parse_args('foo1.txt foo2.txt -e abc -e def -s xyz -s pqr'.split())
Out[7]: Namespace(e=['abc', 'def'], input_file='foo1.txt', output_file='foo2.txt', s=['xyz', 'pqr'])

If you just call parser.parse_args(), it will parse the arguments passed to your script, but the above is handy for testing. Note how multiple search and exclude words are specified using multiple -s and -e flags. By passing action="append" to the add_argument method, the arguments after -s and -e are added to a list in the namespace returned by parser.parse_args. This should address your questions 1 and 2.

Here's an example of how you can access the values in a nice way:

In [11]: args = parser.parse_args('foo1.txt foo2.txt -e abc -e def -s xyz -s pqr'.split())

In [12]: args.e
Out[12]: ['abc', 'def']
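
One thing to watch for when accessing the values this way: if a flag is never passed, an action="append" argument stays at its default of None, which a later loop over the word list would trip on. Below is a minimal sketch of one way to guard against that. The long option names --exclude and --search are assumptions made for illustration (they are not in the original answer); default and required are standard add_argument parameters.

```python
import argparse

parser = argparse.ArgumentParser(description='Copies selected lines from files')
parser.add_argument('input_file')
parser.add_argument('output_file')
# default=[] yields an empty list instead of None when no -e is given.
parser.add_argument('-e', '--exclude', action='append', default=[],
                    help='word to exclude (may be repeated)')
# required=True makes argparse reject a command line without any -s.
parser.add_argument('-s', '--search', action='append', required=True,
                    help='word to search for (may be repeated)')

args = parser.parse_args(['foo1.txt', 'foo2.txt', '-s', 'xyz', '--search', 'pqr'])
print(args.search)   # ['xyz', 'pqr']
print(args.exclude)  # []
```

Because a long option is present, argparse names the attributes after it (args.search and args.exclude rather than args.s and args.e), which tends to read better at the call site.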

I used the argparse docs; the documentation for the add_argument method is especially useful.

EDIT: here's a function that does the inclusion/exclusion logic:

def include_exclude(input_file, output_file, inclusion_list, exclusion_list=()):
    # Copy each line that contains any inclusion word and no exclusion word.
    with open(output_file, 'w') as fo:
        with open(input_file, 'r') as fi:
            for line in fi:
                # one True/False per word: does the word occur in this line?
                inclusion_words_in_line = map(lambda x: x in line, inclusion_list)
                exclusion_words_in_line = map(lambda x: x in line, exclusion_list)
                if any(inclusion_words_in_line) and not any(exclusion_words_in_line):
                    fo.write(line)

The with statement ensures that the files are properly closed even if something goes wrong (see the doc). You could of course use the same open/close code you already have instead. My code doesn't include any error handling; I'll leave that as an exercise for the reader. In the main for loop, I loop over all the lines in the input file. Then I look at all the words in inclusion_list and check whether they occur in the line. The map function is IMHO an elegant way of doing this: it takes (for example) the words in inclusion_list and produces another sequence by applying the function lambda x: x in line to each item of inclusion_list. The function just returns True if its input (a word from inclusion_list) appears in the line, so you end up with a sequence of True/False items (a list in Python 2; a lazy iterator in Python 3, which any accepts just as well). Brief example:

In [22]: line = "foo bar"

In [23]: words = ['foo', 'barz']

In [24]: list(map(lambda x: x in line, words))
Out[24]: [True, False]

Now I apply the any function to check whether any of the items in inclusion_words_in_line are True, and whether none (not any) of the items in exclusion_words_in_line are True. If that's the case, the line is written to the output file. If you want to ensure that all of the words in inclusion_list appear on the line, rather than any (this wasn't clear to me from your problem description), you can use the all function instead.
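
The any/all choice described above can be sketched as a small per-line test. This is only an illustration under stated assumptions: line_matches and its require_all switch are hypothetical names, not part of the function shown earlier.

```python
def line_matches(line, inclusion_list, exclusion_list=(), require_all=False):
    """Return True if the line should be copied.

    require_all=False: at least one inclusion word must occur in the line.
    require_all=True:  every inclusion word must occur in the line.
    In both cases, no exclusion word may occur in the line.
    """
    combine = all if require_all else any
    has_wanted = combine(word in line for word in inclusion_list)
    has_unwanted = any(word in line for word in exclusion_list)
    return has_wanted and not has_unwanted


print(line_matches("exception: timeout", ['exception', 'error']))                    # True
print(line_matches("exception: timeout", ['exception', 'error'], require_all=True))  # False
print(line_matches("exception in abc", ['exception'], ['abc']))                      # False
```

Note that all returns True for an empty sequence, so with require_all=True and no search words every line without an exclusion word would be copied; whether that is desirable depends on the use case.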

Note that you can quite easily solve the above with for loops that iterate over inclusion_list and exclusion_list and check whether the items are there; there's no need to use map and any.
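
For comparison, the plain for-loop approach mentioned above might look roughly like this. It is a sketch meant to behave like the map/any version; the name include_exclude_loops is made up here to keep the two variants apart.

```python
def include_exclude_loops(input_file, output_file, inclusion_list, exclusion_list=()):
    """Copy lines containing any inclusion word and no exclusion word."""
    with open(input_file, 'r') as fi, open(output_file, 'w') as fo:
        for line in fi:
            # Is at least one search word present in this line?
            wanted = False
            for word in inclusion_list:
                if word in line:
                    wanted = True
                    break

            # Is at least one excluded word present in this line?
            unwanted = False
            for word in exclusion_list:
                if word in line:
                    unwanted = True
                    break

            if wanted and not unwanted:
                fo.write(line)
```

The break statements stop scanning as soon as one match is found, which mirrors the short-circuiting that any performs.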
