I need to format the multi-line string shown below in Python. I've tried several approaches, but none of them worked well.
AMAZON
IPHONE: 700
SAMSUNG: 600
=============
WALMART
IPHONE: 699
===========
ALIBABA
SONY: 500
The data above represents online stores and the price of each mobile brand they sell. I need to add these to a database, so it should look like this:
-------------------
AMAZON | IPHONE | 700
-------------------
AMAZON | SAMSUNG | 600
-------------------
WALMART | IPHONE | 699
-------------------
ALIBABA | SONY | 500
-------------------
I need to format the text above and store it in a database table.
What have I tried? I tried splitting the lines and building a dictionary, something close to JSON, but it didn't work well: it only picks up one line. If there is an easier approach, please share it. Please help me with this!
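I made some assumptions: the first non-empty line of each block is the vendor name, blocks are separated by a line containing "===", and every other line has the form "PRODUCT: PRICE".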
Working code:
# named "text" rather than "str" so the built-in str type is not shadowed
text = """
AMAZON
IPHONE: 700
SAMSUNG: 600
=============
WALMART
IPHONE: 699
===========
ALIBABA
SONY: 500
"""
new_entry = True
print("-------------------")
for line in text.split("\n"):
    if not line.strip():
        # skip empty lines
        continue
    elif new_entry:
        # assuming the first entry of each block is always the vendor name
        vendor = line.strip()
        new_entry = False
    elif "===" in line:
        # a separator line means the next non-empty line is a vendor name
        new_entry = True
    else:
        # product line of the form "NAME: PRICE"
        product = line.split(":")
        print("{} | {} | {}".format(vendor, product[0].strip(), product[1].strip()))
        print("-------------------")
Output is:
-------------------
AMAZON | IPHONE | 700
-------------------
AMAZON | SAMSUNG | 600
-------------------
WALMART | IPHONE | 699
-------------------
ALIBABA | SONY | 500
-------------------
Alternative approach: the vendor name could also be detected as a text line that does not contain a colon.
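A minimal sketch of that alternative, assuming every product line contains a colon, separator lines consist only of "=" characters, and any other non-empty line is a vendor name. The function name parse_by_colon and the variable sample are hypothetical names chosen for this illustration.

```python
def parse_by_colon(text):
    """Return (vendor, product, price) tuples parsed from the store listing."""
    rows = []
    vendor = None
    for raw in text.splitlines():
        line = raw.strip()
        # skip empty lines and separator lines made only of "=" characters
        if not line or set(line) == {"="}:
            continue
        if ":" in line:
            # a line with a colon is a product line: "NAME: PRICE"
            if vendor is None:
                raise ValueError("product line found before any vendor name")
            product, price = line.split(":", 1)
            rows.append((vendor, product.strip(), price.strip()))
        else:
            # a text line without a colon is treated as the vendor name
            vendor = line
    return rows


sample = """
AMAZON
IPHONE: 700
SAMSUNG: 600
=============
WALMART
IPHONE: 699
===========
ALIBABA
SONY: 500
"""

for row in parse_by_colon(sample):
    print(" | ".join(row))
```

With this approach the separator lines are no longer needed to find the vendor, so a missing "=====" line between two stores would not break the parsing.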
The answer submitted by @scito is adequate, but I am adding mine just in case. You can use regex; the following is a working example:
import re

strng = """
AMAZON
IPHONE: 700
SAMSUNG: 600
=============
WALMART
IPHONE: 699
===========
ALIBABA
SONY: 500
======
"""
multistrng = strng.split("\n")  # get each line separated by \n
market_re = re.compile(r"([a-zA-Z]+)")  # regex to find market name
phone_re = re.compile(r"([a-zA-Z]+):\s(\d+)")  # regex to find phone and its price
js = []  # list to hold all data found
for line in multistrng:
    phone = phone_re.findall(line)  # if line contains phone and its price
    if phone:
        js[-1].append(phone[0])  # add phone to the most recently found marketplace
        continue
    market = market_re.findall(line)
    if market:  # if line contains marketplace name
        js.append([market[0]])
    # empty lines and "=====" separators match neither regex and are ignored
# now you have the data in a structured manner, you can print it or add it to the database
for market in js:
    for product in market[1:]:
        print("---------------------")
        print("{} | {} | {}".format(market[0], product[0], product[1]))
print("---------------------")
output:
---------------------
AMAZON | IPHONE | 700
---------------------
AMAZON | SAMSUNG | 600
---------------------
WALMART | IPHONE | 699
---------------------
ALIBABA | SONY | 500
---------------------
The data is stored in the js list. If you iterate over js, the first element of each sub-list is the marketplace and the rest are the products for that marketplace.
[['AMAZON', ('IPHONE', '700'), ('SAMSUNG', '600')], ['WALMART', ('IPHONE', '699')], ['ALIBABA', ('SONY', '500')]]
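Since the goal is to store the rows in a database table, here is a minimal sketch of that last step using Python's built-in sqlite3 module. It assumes the js structure shown above. The table name prices, its column names, and the in-memory database are assumptions made for illustration, not something the question specifies.

```python
import sqlite3

# the structure produced by the regex example above
js = [
    ["AMAZON", ("IPHONE", "700"), ("SAMSUNG", "600")],
    ["WALMART", ("IPHONE", "699")],
    ["ALIBABA", ("SONY", "500")],
]

# flatten into one (store, brand, price) tuple per product
rows = [
    (market[0], brand, int(price))
    for market in js
    for brand, price in market[1:]
]

# ":memory:" keeps the example self-contained; use a file path for a real database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (store TEXT, brand TEXT, price INTEGER)")
# parameterized insert: one row per tuple in rows
conn.executemany("INSERT INTO prices (store, brand, price) VALUES (?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT store, brand, price FROM prices ORDER BY rowid"
).fetchall()
for row in stored:
    print(row)
conn.close()
```

The "?" placeholders let sqlite3 handle quoting, which is safer than building the INSERT statement with string formatting.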