繁体   English   中英

从列表中删除不需要的字符

[英]Removing Unwanted Characters From List

我有一个结构与此类似的项目列表:

[{'Condition': '2013 Yamaha FJR 1300',
 'Date': '2018-02-28 11:30',
 'Description': ['\n        ',
  '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
  '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
  '\n',
  '\n    '],
 'Images': [],
 'Latitude': '35.599694',
 'Location': ' (Asheville)',
 'Longitude': '-82.628866',
 'Price': '$7500',
 'Title': '2013 Yamaha FJR 1300',
 'Url': 'https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html',
 '_id': {'$oid': '5a96dbee6f9ca5410cc9ed98'}},

{'Condition': '2014 Honda Accord Sedan',
 'Date': '2018-02-28 11:24',
 'Description': ['\n        ',
  '\n2014 Honda Accord  Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone  Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n    '],
 'Images': ['https://images.craigslist.org/00b0b_gNOi9VtqAy3_600x450.jpg',
  'https://images.craigslist.org/00a0a_gs2eKxUlQho_600x450.jpg',
  'https://images.craigslist.org/00l0l_lPmE8ML0zcb_600x450.jpg',
  'https://images.craigslist.org/00x0x_bS9gCuxM7ID_600x450.jpg',
  'https://images.craigslist.org/01010_dTS4DnHjVWW_600x450.jpg',
  'https://images.craigslist.org/00w0w_70D0xeDKa7d_600x450.jpg',
  'https://images.craigslist.org/00606_4SUFT4ZCbmO_600x450.jpg',
  'https://images.craigslist.org/00k0k_1AQ7kVbviPN_600x450.jpg',
  'https://images.craigslist.org/00d0d_3STBecGHaXD_600x450.jpg',
  'https://images.craigslist.org/01717_guG6n90XfQt_600x450.jpg',
  'https://images.craigslist.org/00h0h_8be8866trLr_600x450.jpg',
  'https://images.craigslist.org/00B0B_gaQQvQHlARl_600x450.jpg',
  'https://images.craigslist.org/00b0b_ih84Nskx5xj_600x450.jpg',
  'https://images.craigslist.org/01616_aveWbY1HQvr_600x450.jpg',
  'https://images.craigslist.org/00x0x_Fflsg0wwsK_600x450.jpg',
  'https://images.craigslist.org/00b0b_6FBg7KV8HYv_600x450.jpg',
  'https://images.craigslist.org/00J0J_3vd5Ip3mQ5S_600x450.jpg',
  'https://images.craigslist.org/00L0L_loNV2CrnnLn_600x450.jpg',
  'https://images.craigslist.org/00K0K_fh8oSEa9fKn_600x450.jpg',
  'https://images.craigslist.org/00r0r_8P0SjsOgNd5_600x450.jpg',
  'https://images.craigslist.org/00k0k_ZY0ywNmKkr_600x450.jpg',
  'https://images.craigslist.org/00y0y_7Gie7XD8uuH_600x450.jpg',
  'https://images.craigslist.org/00c0c_2nVDzLJhnYi_600x450.jpg',
  'https://images.craigslist.org/00202_7k10eK3bxMn_600x450.jpg'],
 'Latitude': '35.039000',
 'Location': ' (Cowpens)',
 'Longitude': '-81.822000',
 'Price': '$10995',
 'Title': '2014 Honda Accord  White  41k',
 'Url': 'https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html',
 '_id': {'$oid': '5a96dbf16f9ca5410cc9ed99'}}]

当我运行以下代码时:

wanted_keys = ['Title', 'Location', 'Price', 'Description', 'Url', 'Latitude', 'Longitude'] 
for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        lines = str(item[k]).split()
        split_lines = [line.replace('\n', '').strip() for line in lines]
        print("{}".format(' '.join(split_lines) + '\t'))
    print('\n')

我得到一个输出:

2013 Yamaha FJR 1300    
(Asheville) 
$7500   
['\n ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n ']    
https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html 
35.599694   
-82.628866  


2014 Honda Accord White 41k 
(Cowpens)   
$10995  
['\n ', '\n2014 Honda Accord Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....', '\n', '\n', '\n', '\n', '\n', '\n', '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n '] 
https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html  
35.039000   
-81.822000

我知道我已经接近了,但我正在努力确定如何编写我的 for 循环以删除 Description 值中的额外空白字符,同时仍然保持我已经拥有的输出结构?

line.strip()不会就地修改line - 它返回修改后的值,因此您使用它的方式不会以任何方式影响line

你可能的意思是:

split_lines = [line.strip() for line in lines]
>>> desc = ['\n        ',
...   '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
...   '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
...   '\n',
...   '\n    ']

前:

>>> desc
['\n        ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n    ']

应用 replace() 和 strip()

[x.replace('\n', '').strip() for x in desc ]

后:

['', '2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '', '']

如果我理解正确,您可以用空字符串替换换行符,然后删除周围的空格

 [x.replace('\n', '').strip() for x in desc ]

这给了我正确的输出:

for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        if k == 'Description':
            lines = str(''.join(item[k])).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            split_lines = ' '.join(split_lines)
            print(split_lines)
        else:
            lines = str(item[k]).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            print("{}".format(' '.join(split_lines) + '\t'))       
    print('\n')

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM