I have a list of items that are structured similarly to this:
[{'Condition': '2013 Yamaha FJR 1300',
'Date': '2018-02-28 11:30',
'Description': ['\n ',
'\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
'\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
'\n',
'\n '],
'Images': [],
'Latitude': '35.599694',
'Location': ' (Asheville)',
'Longitude': '-82.628866',
'Price': '$7500',
'Title': '2013 Yamaha FJR 1300',
'Url': 'https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html',
'_id': {'$oid': '5a96dbee6f9ca5410cc9ed98'}},
{'Condition': '2014 Honda Accord Sedan',
'Date': '2018-02-28 11:24',
'Description': ['\n ',
'\n2014 Honda Accord Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n',
'\n '],
'Images': ['https://images.craigslist.org/00b0b_gNOi9VtqAy3_600x450.jpg',
'https://images.craigslist.org/00a0a_gs2eKxUlQho_600x450.jpg',
'https://images.craigslist.org/00l0l_lPmE8ML0zcb_600x450.jpg',
'https://images.craigslist.org/00x0x_bS9gCuxM7ID_600x450.jpg',
'https://images.craigslist.org/01010_dTS4DnHjVWW_600x450.jpg',
'https://images.craigslist.org/00w0w_70D0xeDKa7d_600x450.jpg',
'https://images.craigslist.org/00606_4SUFT4ZCbmO_600x450.jpg',
'https://images.craigslist.org/00k0k_1AQ7kVbviPN_600x450.jpg',
'https://images.craigslist.org/00d0d_3STBecGHaXD_600x450.jpg',
'https://images.craigslist.org/01717_guG6n90XfQt_600x450.jpg',
'https://images.craigslist.org/00h0h_8be8866trLr_600x450.jpg',
'https://images.craigslist.org/00B0B_gaQQvQHlARl_600x450.jpg',
'https://images.craigslist.org/00b0b_ih84Nskx5xj_600x450.jpg',
'https://images.craigslist.org/01616_aveWbY1HQvr_600x450.jpg',
'https://images.craigslist.org/00x0x_Fflsg0wwsK_600x450.jpg',
'https://images.craigslist.org/00b0b_6FBg7KV8HYv_600x450.jpg',
'https://images.craigslist.org/00J0J_3vd5Ip3mQ5S_600x450.jpg',
'https://images.craigslist.org/00L0L_loNV2CrnnLn_600x450.jpg',
'https://images.craigslist.org/00K0K_fh8oSEa9fKn_600x450.jpg',
'https://images.craigslist.org/00r0r_8P0SjsOgNd5_600x450.jpg',
'https://images.craigslist.org/00k0k_ZY0ywNmKkr_600x450.jpg',
'https://images.craigslist.org/00y0y_7Gie7XD8uuH_600x450.jpg',
'https://images.craigslist.org/00c0c_2nVDzLJhnYi_600x450.jpg',
'https://images.craigslist.org/00202_7k10eK3bxMn_600x450.jpg'],
'Latitude': '35.039000',
'Location': ' (Cowpens)',
'Longitude': '-81.822000',
'Price': '$10995',
'Title': '2014 Honda Accord White 41k',
'Url': 'https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html',
'_id': {'$oid': '5a96dbf16f9ca5410cc9ed99'}}]
When I run the following code:
wanted_keys = ['Title', 'Location', 'Price', 'Description', 'Url', 'Latitude', 'Longitude']
for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        lines = str(item[k]).split()
        split_lines = [line.replace('\n', '').strip() for line in lines]
        print("{}".format(' '.join(split_lines) + '\t'))
    print('\n')
I get an output of:
2013 Yamaha FJR 1300
(Asheville)
$7500
['\n ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n ']
https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html
35.599694
-82.628866
2014 Honda Accord White 41k
(Cowpens)
$10995
['\n ', '\n2014 Honda Accord Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....', '\n', '\n', '\n', '\n', '\n', '\n', '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n ']
https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html
35.039000
-81.822000
I know I'm close, but I'm struggling to work out how to write my for loop so that it removes the extra whitespace characters from the Description values while keeping the structure of the output I already have.
line.strip() doesn't modify line in place; it returns the modified value, so the way you are using it won't affect line in any way.
You probably mean:
split_lines = [line.strip() for line in lines]
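To illustrate the point about return values, here is a minimal sketch. The sample string is made up for the demonstration; only str.strip() itself comes from the answer above.

```python
# Python strings are immutable, so strip() can only return a new string.
line = "\n  2013 Yamaha FJR 1300  "

line.strip()             # the result is discarded; line is unchanged
unchanged = line

line = line.strip()      # rebind the name to the stripped copy
print(repr(unchanged))
print(repr(line))
```

The same applies to replace() and the other str methods: keep the return value, either by assigning it or by using it directly inside the list comprehension.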
>>> desc = ['\n ',
... '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
... '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
... '\n',
... '\n ']
Before:
>>> desc
['\n ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n ']
Apply replace() and strip():
[x.replace('\n', '').strip() for x in desc ]
After:
['', '2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '', '']
If I understand you correctly, you can replace the newline character with an empty string and then strip the surrounding whitespace:
[x.replace('\n', '').strip() for x in desc ]
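The comprehension still leaves empty strings in the list where an element held only whitespace. If a single line of text is wanted, one possible follow-up (a sketch, not part of the original answer) is to drop the empty entries and join the rest. The desc list here is an abbreviated version of the sample data, and the names stripped, non_empty and description are illustrative.

```python
# Abbreviated copy of the sample Description list from the question.
desc = ['\n ',
        '\n2013 Yamaha FJR 1300 Sport Touring, very clean.',
        '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
        '\n',
        '\n ']

# strip() already treats '\n' as whitespace, so it handles the leading
# newlines on its own here.
stripped = [x.strip() for x in desc]

# Empty strings are falsy, so this filter drops them.
non_empty = [x for x in stripped if x]

description = ' '.join(non_empty)
print(description)
```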
This gave me the correct output:
for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        if k == 'Description':
            lines = str(''.join(item[k])).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            split_lines = ' '.join(split_lines)
            print(split_lines)
        else:
            lines = str(item[k]).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            print("{}".format(' '.join(split_lines) + '\t'))
    print('\n')
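The reason the Description key needs special handling is that calling str() on a list returns its repr, in which each newline shows up as the two characters backslash and n, so replace('\n', '') finds nothing to remove. Joining the list first avoids that. As a possible simplification (a sketch under that assumption, not the code from the thread), both branches can share one helper; clean_value is a hypothetical name, and the item below is a trimmed copy of the first sample record.

```python
def clean_value(value):
    """Return value as one line of text with whitespace runs collapsed."""
    if isinstance(value, list):
        # Join list elements first so str() is never applied to the list.
        value = ' '.join(value)
    # split() with no arguments splits on any whitespace, including '\n',
    # and discards empty pieces.
    return ' '.join(str(value).split())


wanted_keys = ['Title', 'Location', 'Price', 'Description']

# Trimmed copy of the first record from the question.
item = {
    'Title': '2013 Yamaha FJR 1300',
    'Location': ' (Asheville)',
    'Price': '$7500',
    'Description': ['\n ', '\n2013 Yamaha FJR 1300 Sport Touring', '\n', '\n '],
}

for k in wanted_keys:
    print(clean_value(item[k]))
```

Because split() already drops newlines and surrounding spaces, the separate replace() and strip() calls are not needed in this version.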