I have a txt file to parse that looks like:
--- What kind of submission is this? ---
Sold Property
--- State? ---
Los Angeles
...
and I need to store the values that follow the --- --- tags in variables. It works with all those if statements, but I was wondering whether it is possible to refactor the huge number of ifs into some structure (e.g. a dictionary), and then easily write that to the output file.
Here's something I made:
"""Open a file to read"""
for line in res:
    if "Instagram Usernames" in line:
        usernames = next(res)
    if "Date" in line:
        date = next(res)
    if "Address" in line:
        address = next(res)
    if "Neighborhood" in line:
        market = next(res)
    if "State" in line:
        city = next(res)
    if "Asset" in line:
        as_type = next(res)
    if "Sale Price" in line:
        price = next(res)
        if "," in price:
            price = price.replace(',', '')
        if "$" in price:
            price = price.replace('$', '')
    if "Square" in line:
        sf = next(res)
        if "," in sf:
            sf = sf.replace(',', '')
        if "$" in sf:
            sf = sf.replace('$', '')
    if "Buyer" in line:
        buyer = next(res)
    if "Seller" in line:
        seller = next(res)
    if "Broker" in line:
        brokers = next(res)
    if "Notes" in line:
        notes = next(res)
"""Write to output file"""
fin.write("IMAGE: @" + usernames)
fin.write("DATE: " + date)
fin.write("ADDRESS: " + address)
fin.write("MARKET: " + market)
fin.write("CITY: " + city)
# compare against both spellings; `x == "a" or "b"` is always true
if as_type in ("Multi Family", "Multi Family\n"):
    fin.write("ASSET TYPE: Multifamily\n")
else:
    fin.write("ASSET TYPE: " + as_type)
fin.write("PRICE: $" + price)
if sf in bad_symb:
    fin.write("SF: N/A\n")
    fin.write("PPSF: N/A\n")
else:
    fin.write("SF: " + sf)
    fin.write("PPSF: $" + "{0:.2f}\n".format(float(price) / float(sf)))
fin.write("BUYER: " + buyer)
fin.write("SELLER: " + seller)
fin.write("BROKERS: " + brokers + "\n")
if notes != "\n":
    fin.write("NOTES: " + notes + "\n")
fin.write(footer_sale(market, buyer, seller))
Any help would be appreciated, thanks in advance!
When I have a sequence of items like this, I like to set up a small data structure that specifies what I'm looking for and, if I find it, where it should go.
def strip_currency(s):
    """Function to strip currency and commas from a real number string"""
    return s.replace('$', '').replace(',', '')

# mapping of data labels to attribute/key names
label_attr_map = (
    ('Instagram Usernames', 'usernames'),
    ('Date', 'date'),
    ('Address', 'address'),
    ('Neighborhood', 'market'),
    ('State', 'city'),  # <-- copy-paste bug?
    ('Asset', 'as_type'),
    ('Sale Price', 'price', strip_currency),
    ('Square', 'sf', strip_currency),
    ('Buyer', 'buyer'),
    ('Seller', 'seller'),
    ('Broker', 'broker'),
    ('Notes', 'notes'),
)

# populate data dict with values from file, as defined in the label_attr_map
data = {}
for line in file:
    # find any matching label, or just go on to the next line
    match_spec = next((spec for spec in label_attr_map if spec[0] in line), None)
    if match_spec is None:
        continue

    # found a label, now extract the next line, and transform it if necessary
    key = match_spec[1]
    data[key] = next(file)
    if len(match_spec) > 2:
        transform_fn = match_spec[2]
        data[key] = transform_fn(data[key])
Now your label-to-attribute mapping is easier to verify, and your cascade of 'if's is just a single next expression.
To write the output, just access the different items in the data dict.
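As a rough sketch of that write step: the table below is a hypothetical layout (the name OUTPUT_FIELDS and the function format_record are made up for illustration), and it assumes the values in data still carry the trailing newline left by next(file). Only the plain "label: value" fields are covered.

```python
# Hypothetical output layout: (output label, key in data, prefix before the value).
OUTPUT_FIELDS = (
    ("IMAGE", "usernames", "@"),
    ("DATE", "date", ""),
    ("ADDRESS", "address", ""),
    ("MARKET", "market", ""),
    ("CITY", "city", ""),
    ("PRICE", "price", "$"),
    ("BUYER", "buyer", ""),
    ("SELLER", "seller", ""),
)

def format_record(data):
    """Return the output lines for every listed field that is present in data."""
    lines = []
    for label, key, prefix in OUTPUT_FIELDS:
        if key in data:
            # values read with next(file) still end in a newline, so strip it first
            lines.append("{}: {}{}\n".format(label, prefix, data[key].strip()))
    return lines

# made-up sample values, standing in for a parsed file
sample = {"usernames": "someuser\n", "price": "1250000\n", "city": "Los Angeles\n"}
output_lines = format_record(sample)
```

With an open output file, fin.writelines(format_record(data)) would then write these lines in one call. The special cases from the question (asset type, SF/PPSF, notes) would probably stay as explicit branches.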
You could use a dictionary, with everything in-between the dashes being the key and the next line being the corresponding value.
As we are not using a loop, we first read the contents of the file and split them into lines:
res = res.read().split("\n")
The next line produces the dictionary. res[::2] chooses every second item in res, starting with the first item (all lines with ---), and res[1::2] chooses every second item, starting with the second item (all lines with information). zip pairs each --- line with the information line that follows it.
Now we choose the lines with --- as the key for each entry in the dictionary and the information lines as the values (key: value). As you probably don't want to include the dashes, we strip them and the spaces from the beginning and the end with .strip("- "):
x = {key.strip("- "): value for key, value in zip(res[::2], res[1::2])}
Now you can easily index x to get the desired information, which will also simplify writing to your output file.
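A small check of the pairing, using the two sample records from the question. It assumes the file strictly alternates label lines and value lines. zip walks both slices in step, which is what a one-to-one pairing needs. Two nested for clauses in the comprehension would instead leave every key holding the last value.

```python
# The two records shown in the question, as they would appear in the file.
text = (
    "--- What kind of submission is this? ---\n"
    "Sold Property\n"
    "--- State? ---\n"
    "Los Angeles\n"
)

lines = text.splitlines()

# lines[::2] are the label lines, lines[1::2] the value lines that follow them
pairs = zip(lines[::2], lines[1::2])

# strip("- ") removes dashes and spaces from both ends of each label
x = {key.strip("- "): value for key, value in pairs}

print(x["State?"])
```

Note that the keys keep any punctuation inside the label, such as the question mark in "State?".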
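Define a lambda that, given a search string, returns the line following each line that contains it, taken from the list of all line strings (line_list). Note that it returns a list of matches, not a single string: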
search_func = lambda search_str : [line_list[line_list.index(line)+1] for line in line_list[:-1] if search_str in line]
Put the variable names as keys and the corresponding search strings as values in another dictionary:
all_vars_search_dict = {'usernames' : "Instagram Usernames" , 'date' : "Date", 'address' : "Address", 'market' : "Neighborhood", 'city' : "State",...}
Now create another dictionary that calls the previous function to get the values you're searching for:
all_vals = {k: search_func(all_vars_search_dict[k]) for k in all_vars_search_dict}
While writing to the output file, you can just iterate over this dictionary.
Note: as written, this does not handle the keywords "Square" and "Sale Price", presumably because their values also need the '$' and ',' characters removed after the lookup.
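One way those two fields could still fit this style is a clean-up step after the lookup. The sketch below is an assumption, not part of the approach above: find_value, search_strings and needs_cleanup are made-up names, the sample lines are invented, and the lookup uses enumerate instead of list.index so that duplicate lines do not confuse it.

```python
# Invented sample lines; the real labels in the file may differ.
line_list = [
    "--- Sale Price ---",
    "$1,250,000",
    "--- Square Feet ---",
    "2,500",
    "--- State? ---",
    "Los Angeles",
]

def find_value(search_str, lines):
    """Return the line after the first line containing search_str, or None."""
    for i, line in enumerate(lines[:-1]):
        if search_str in line:
            return lines[i + 1]
    return None

search_strings = {"price": "Sale Price", "sf": "Square", "city": "State"}
needs_cleanup = {"price", "sf"}  # fields whose values carry '$' or ','

all_vals = {}
for key, search_str in search_strings.items():
    value = find_value(search_str, line_list)
    if value is not None and key in needs_cleanup:
        value = value.replace("$", "").replace(",", "")
    all_vals[key] = value
```

Here each entry of all_vals is a single string (or None when the label is missing) rather than a list of matches.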