I have a text file that appears to be both space and pipe delimited.
Test Codes
ABCBBA 3 -1189.59 | ABCCHOICE 1 22.56 | ABCELECT 31 13516.72 | ABCFED 14 9070.74
ABCHMOBLUE 38 13183.27 | DCMCDNY 1 8.86 | ABCMEDHMO 7 6189.83 | ABCMEDPPO 17 6730.53
What I need to pull out is any code that starts with D and the corresponding value. So using the example above, my desired output would be:
Code Total
DCMCDNY 8.86
When I use:
for index, line in enumerate(lines):
    if "Test Codes" in line:
        print(re.split(r'\s{2,}',lines[index+2].lstrip()))
    if "Test Codes" in line:
        print(re.split(r'\s{2,}',lines[index+3].lstrip()))
I get the below output:
['ABCBBA', '3', '-1189.59', '|', 'ABCCHOICE', '1', '22.56', '|', 'ABCELECT', '31', '13516.72', '|', 'ABCFED', '14', '9070.74']
['ABCHMOBLUE', '38', '13183.27', '|', 'DCMCDNY', '1', '8.86', '|', 'ABCMEDHMO', '7', '6189.83', '|', 'ABCMEDPPO', '17', '6730.53']
However, I'm not sure if this is the most scalable approach or how I can pull the code and value from the list.
I would begin by splitting each line on the "|" character.
candidates = {}  # Maps each matching code to its total
lines = data_file.readlines()  # data_file is an already-open file object
for line in lines:
    # Split the line into records on the pipe character
    split_pipe = line.split("|")
    # Process each record from the split for whitespace delimiters
    for candidate in split_pipe:
        stripped = candidate.strip()  # Removes leading and trailing whitespace
        # Check whether the code starts with the letter "D"
        if stripped.startswith("D"):
            split_space = stripped.split()  # Splits on any run of whitespace
            candidates[split_space[0]] = split_space[-1]
From your example data above, this code will give you an output of
{'DCMCDNY': '8.86'}
Now, while this at least gives you the desired output, it may not be the most scalable for large data. Hopefully it sparks some ideas for you to improve it and make it meet your needs!
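If it helps, here is a minimal self-contained sketch of the same split-on-pipe idea wrapped in a function. The name codes_with_prefix and its prefix parameter are my own, not anything from your code, and it assumes every record has exactly three whitespace-separated fields (code, count, total). It accepts any iterable of lines, so you could pass an open file object directly and avoid loading the whole file with readlines().

```python
from typing import Dict, Iterable


def codes_with_prefix(lines: Iterable[str], prefix: str = "D") -> Dict[str, str]:
    """Return {code: total} for each pipe-separated record whose code starts with prefix.

    Hypothetical helper: assumes each record is "CODE COUNT TOTAL".
    """
    found = {}
    for line in lines:
        for record in line.split("|"):
            fields = record.split()  # split on any run of whitespace
            # Skip headers, blank lines, and anything that is not a 3-field record
            if len(fields) == 3 and fields[0].startswith(prefix):
                found[fields[0]] = fields[-1]
    return found


# Sample lines taken from the question
sample = [
    "Test Codes",
    "ABCBBA 3 -1189.59 | ABCCHOICE 1 22.56 | ABCELECT 31 13516.72 | ABCFED 14 9070.74",
    "ABCHMOBLUE 38 13183.27 | DCMCDNY 1 8.86 | ABCMEDHMO 7 6189.83 | ABCMEDPPO 17 6730.53",
]
result = codes_with_prefix(sample)
print(result)  # {'DCMCDNY': '8.86'}
```

The totals stay as strings here; wrap fields[-1] in float() if you need to do arithmetic on them.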