I want to delete certain strings from a text file in Python

Question

I am very new to Python. I have a text file containing certain strings, which I am logging to a CSV file using pandas.

I want to remove the strings that start with specific characters from the file. Please let me know how I can do this.

The text file contains strings like this:

 "<S t='a' s='3'/>SetRTEConfig,Done,<S t='s' c='IgnoreCase' s='5'/>{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,"v1.22.000",Passed....

and so on. I need to delete everything that starts with <S t='a' and keep only the data that starts with <S t='s'.

Answer 1

You can do something along these lines:

goodlines = []
with open('textfile.txt', 'r') as fp:
    for line in fp:
        # keep only the lines that begin with the <S t='s' tag
        if line.startswith("<S t='s'"):
            goodlines.append(line)

If you need more advanced pattern matching, you can use regular expressions.
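As a rough sketch of the same idea wrapped in a reusable function: the helper name keep_lines_with_prefix and the two sample lines are made up for illustration (shortened from the question's data), and stripping leading spaces and quotes is an assumption based on the sample line starting with ` "`.

```python
def keep_lines_with_prefix(lines, prefix="<S t='s'"):
    """Return only the lines that begin with `prefix`.

    Leading spaces, tabs and double quotes are ignored when checking,
    which is an assumption based on the sample in the question.
    """
    kept = []
    for line in lines:
        if line.lstrip(' \t"').startswith(prefix):
            kept.append(line)
    return kept


# Hypothetical input: one tag per line, shortened from the question's sample.
sample = [
    "<S t='a' s='3'/>SetRTEConfig,Done,\n",
    "<S t='s' c='IgnoreCase' s='5'/>{GetMcbVersion}LogAndReportLastVersion,\n",
]

print(keep_lines_with_prefix(sample))
```

Note that this only helps if each tag sits on its own line. If several tags share one line, as in the question's sample, the line has to be split first.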

Answer 2

Solution using regex:

import re
contents = "<S t='a' s='3'/>SetRTEConfig,Done,<S t='s' c='IgnoreCase'" \
           " s='5'/>{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,\"v1.22.000\",Passed...."

pattern = re.compile(r'<S t=\'s\'(.*)')
result = pattern.findall(contents)
print(result)

OUTPUT:

[' c=\'IgnoreCase\' s=\'5\'/>{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,"v1.22.000",Passed....']
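One caveat with the greedy (.*) above: if another <S t='a' tag follows later in the same string, its text is captured too. A possible variant, assuming every tag is self-closing (ends in />) and using made-up placeholder data (A, B, C, D), stops each match at the next tag with a lookahead:

```python
import re

# Hypothetical input with alternating 'a' and 's' tags on one line.
text = "<S t='a' s='3'/>A,<S t='s' s='5'/>B,<S t='a' s='3'/>C,<S t='s' s='5'/>D"

# Match an <S t='s' .../> tag, then lazily capture everything up to
# the next <S t='...' tag or the end of the string.
pattern = re.compile(r"<S t='s'[^>]*/>(.*?)(?=<S t='|$)", re.DOTALL)

result = pattern.findall(text)
print(result)
```

This is only a sketch; like any regex over tag-like text, it breaks if the data itself contains something that looks like a tag.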

Answer 3

I think the plain Python solution here is much better than using a regex, but since you asked for it, you could do something like this for a regex solution:

>>> import re
>>> s = ''' "<S t='a' s='3'/>SetRTEConfig,Done,<S t='s' c='IgnoreCase' s='5'/>{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,"v1.22.000",Passed....'''
>>> re.findall(r"<S t='s'.+?>([^<]+)", s)
['{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,"v1.22.000",Passed....']
# assuming you want only the text after the tag, not the tag itself

Or, directly into pandas:

>>> from io import StringIO; import pandas as pd; import re  # on Python 2: from StringIO import StringIO
>>> pd.read_csv(StringIO('\n'.join(re.findall(r"<S t='s'.+?>([^<]+)", s))), sep=",")
# this looks off because the sample is a single row with no header, so pandas treats it as the header
Empty DataFrame
Columns: [{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion, v1.22.000, Passed....]
Index: []

However, the above is quite crude -- it won't work, for example, if there is a "<" in the CSV data. To reiterate: I would use the plain Python example. It will be much simpler to use and more flexible when you run into more conditions.
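For a string like the one in the question, where both tags sit on the same line, a regex-free version might look like the sketch below. The function name segments_with_tag is hypothetical, and it assumes every tag begins with the literal text "<S "; it has the same weakness as the regex if that text appears inside the data.

```python
def segments_with_tag(text, keep="<S t='s'", tag_start="<S "):
    """Split `text` at every tag and return the segments whose tag starts with `keep`."""
    pieces = text.split(tag_start)
    kept = []
    # pieces[0] is whatever precedes the first tag, so skip it.
    for piece in pieces[1:]:
        segment = tag_start + piece  # put the tag opener back
        if segment.startswith(keep):
            kept.append(segment)
    return kept


text = ("<S t='a' s='3'/>SetRTEConfig,Done,<S t='s' c='IgnoreCase' s='5'/>"
        "{LogUutCurrentVersions}{GetMcbVersion}LogAndReportLastVersion,"
        "\"v1.22.000\",Passed....")

kept = segments_with_tag(text)
print(kept)
```

Each kept segment still includes its tag. Dropping the tag would be one more step, for example splitting each segment once on "/>".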
