
Python: find strings in a txt file and write them to a new file in a specific format

I want my Python program to search a text file for specific portions of strings and write them into a new text file.

I have a txt file in this form. Input-Text-File:

… different text
… different text
… different text
*
    <http://webadress.com/test.jpg>
    *Part ID:* 1234567
    *Design ID:* 54321
    *Part Name:* Test Object x2
    *Category:* Objects
    *Colour:* Yellow
    … different text
    … different text
    … different text

  *
    <http://webadress.com/test2.jpg>
    *Part ID:* 1234566
    *Design ID:* 54322
    *Part Name:* Test Object v4
    *Category:* Objects
    *Colour:* Red
    ... different text
    … different text
    … different text
  *

And so on…

I want to extract the following information in the following form.

Output-Text-File:

[http; Part ID; Design ID; Part Name; Category; Colour]
[webadress.com/test.jpg; 1234567; 54321; Test Object x2; Objects; Yellow]
[webadress.com/test2.jpg; 1234566; 54322; Test Object v4; Objects; Red]

Can you help me, please?

I'll try to give some general advice. Since both your input format and your output format seem to be proprietary (or at least non-standard) formats, as opposed to XML, JSON, YAML, or even CSV, there is nothing you can do other than implement these formats yourself.

I'd start by parsing the input file into Python objects. It looks like your input file contains multiple data sets, where each data set represents the same type of data, just with different values. Define a class that represents this data type. Each instance of this class can conveniently store one data set of the input data. Parse the input file (use Python's powerful string methods or even regular expressions to collect the information) and create instances of your data class on the fly. You'll end up with a list of objects containing your input data.
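A minimal sketch of this parsing step could look like the following. The `Part` class, its attribute names, the `LABELS` mapping, and the `parse_parts` function are all hypothetical names chosen for illustration. The sketch also assumes that every data set begins with a `<http://...>` line and that fields always have the form `*Label:* value`, as in the sample input; a real file may need more tolerant patterns.

```python
import re
from dataclasses import dataclass

# Assumption: a field line looks like "*Part ID:* 1234567".
FIELD_RE = re.compile(r"^\*(?P<label>[^:*]+):\*\s*(?P<value>.*)$")
# Assumption: a data set starts with a line like "<http://host/path>".
URL_RE = re.compile(r"^<https?://(?P<url>[^>]+)>$")


@dataclass
class Part:
    """One data set from the input file (hypothetical class)."""
    url: str = ""
    part_id: str = ""
    design_id: str = ""
    name: str = ""
    category: str = ""
    colour: str = ""


# Maps the labels used in the input file to Part attribute names.
LABELS = {
    "Part ID": "part_id",
    "Design ID": "design_id",
    "Part Name": "name",
    "Category": "category",
    "Colour": "colour",
}


def parse_parts(text):
    """Return a list with one Part per URL line found in text."""
    parts = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        url_match = URL_RE.match(line)
        if url_match:
            # A URL line opens a new data set.
            current = Part(url=url_match.group("url"))
            parts.append(current)
            continue
        field_match = FIELD_RE.match(line)
        if field_match and current is not None:
            attr = LABELS.get(field_match.group("label").strip())
            if attr:  # labels not listed in LABELS are ignored
                setattr(current, attr, field_match.group("value").strip())
    return parts
```

Lines that match neither pattern (the "different text" lines and the lone `*` separators) are simply skipped, so the surrounding text does not need to be understood by the parser.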

In a second step, iterate through that list of objects and write your output file in the desired format. Again, you will probably make heavy use of Python's string manipulation, formatting, and construction methods.
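The writing step might be sketched as follows. The names `COLUMNS`, `format_row`, `format_lines`, and `write_parts` are hypothetical, and the attribute names (`url`, `part_id`, and so on) are assumptions about how the parsed objects are laid out; any object exposing those attributes would work.

```python
# Pairs of (header text in the output file, attribute name on the object).
# The attribute names are assumptions, not part of the original formats.
COLUMNS = [
    ("http", "url"),
    ("Part ID", "part_id"),
    ("Design ID", "design_id"),
    ("Part Name", "name"),
    ("Category", "category"),
    ("Colour", "colour"),
]


def format_row(values):
    """Join values into one output line such as "[a; b; c]"."""
    return "[" + "; ".join(values) + "]"


def format_lines(parts):
    """Return the header line followed by one line per object."""
    lines = [format_row(header for header, _ in COLUMNS)]
    for part in parts:
        lines.append(format_row(str(getattr(part, attr)) for _, attr in COLUMNS))
    return lines


def write_parts(parts, path):
    """Write the formatted lines to the file at path."""
    with open(path, "w", encoding="utf-8") as out:
        for line in format_lines(parts):
            out.write(line + "\n")
```

Keeping the formatting (`format_lines`) separate from the file access (`write_parts`) makes the output format easy to check without touching the disk.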

This kind of abstraction will help you isolate the different components of the problem and keep the solution understandable, clean, and maintainable.


 