Remove all punctuation from string

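I'm currently working with a pandas DataFrame and trying to extract values from a column made up of lists of strings, but I'm stuck on how to keep only the text I want.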

Here is what one of the lists looks like:

["{'BusinessAcceptsCreditCards': 'True'",
 "'RestaurantsPriceRange2': '2'",
 "'ByAppointmentOnly': 'False'",
 "'BikeParking': 'False'",
 '\'BusinessParking\': "{\'garage\': False',
 "'street': True",
 "'validated': False",
 "'lot': False",
 '\'valet\': False}"}']

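The attribute is to the left of the colon and the corresponding value is to the right. Is there a way to go through this list, strip all the punctuation from each string, and keep only the text of each attribute and its value?

My idea was to split on the colon first, using the following code: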

txt = df_business['attributes'][2]
y = txt.split(", ")
y
y1 = y[0].split(":")
y1
y1[1].strip()

But with the code above, I only get the following result:

Attribute = "{'BusinessAcceptsCreditCards'"
Value = "'True'"

The result I want is:

Attribute = "BusinessAcceptsCreditCards"
Value = "True"
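One way to get from the current result to the desired one is to strip the unwanted characters from both ends of each half after splitting on the colon. The following is only a minimal sketch: the `items` list is copied from the example above, while the `clean` helper and the set of characters in `JUNK` are assumptions made for illustration.

```python
# Fragments copied from the example list in the question
items = [
    "{'BusinessAcceptsCreditCards': 'True'",
    "'RestaurantsPriceRange2': '2'",
    "'ByAppointmentOnly': 'False'",
    "'BikeParking': 'False'",
    '\'BusinessParking\': "{\'garage\': False',
    "'street': True",
    "'validated': False",
    "'lot': False",
    '\'valet\': False}"}',
]

# Characters to remove from both ends of each piece (assumed set)
JUNK = "{}'\" "


def clean(fragment):
    """Split on the first colon, then strip braces, quotes and spaces."""
    attribute, _, value = fragment.partition(":")
    return attribute.strip(JUNK), value.strip(JUNK)


pairs = [clean(item) for item in items]
print(pairs[0])  # → ('BusinessAcceptsCreditCards', 'True')
```

Note that `str.strip` only removes characters at the ends, so the fragment that opens the nested `BusinessParking` dictionary contains two colons and still comes out with leftover punctuation in its value. Parsing the whole string instead of its comma-separated pieces avoids that problem.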

Sample dataframe:

{'business_id': {0: '6iYb2HFDywm3zjuRg0shjw',
  1: 'tCbdrRPZA0oiIYSmHG3J0w',
  2: 'bvN78flM8NLprQ1a1y5dRg',
  3: 'oaepsyvc0J17qwi8cfrOWg',
  4: 'PE9uqAjdw0E4-8mjGl3wVA',
  5: 'D4JtQNTI4X3KcbzacDJsMw',
  6: 't35jsh9YnMtttm69UCp7gw',
  7: 'jFYIsSb7r1QeESVUnXPHBw',
  8: 'N3_Gs3DnX4k9SgpwJxdEfw'},
 'name': {0: 'Oskar Blues Taproom',
  1: 'Flying Elephants at PDX',
  2: 'The Reclaimory',
  3: 'Great Clips',
  4: 'Crossfit Terminus',
  5: 'Bob Likes Thai Food',
  6: 'Escott Orthodontics',
  7: 'Boxwood Biscuit',
  8: 'Lane Wells Jewelry Repair'},
 'address': {0: '921 Pearl St',
  1: '7000 NE Airport Way',
  2: '4720 Hawthorne Ave',
  3: '2566 Enterprise Rd',
  4: '1046 Memorial Dr SE',
  5: '3755 Main St',
  6: '2511 Edgewater Dr',
  7: '740 S High St',
  8: '7801 N Lamar Blvd, Ste A140'},
 'city': {0: 'Boulder',
  1: 'Portland',
  2: 'Portland',
  3: 'Orange City',
  4: 'Atlanta',
  5: 'Vancouver',
  6: 'Orlando',
  7: 'Columbus',
  8: 'Austin'},
 'state': {0: 'CO',
  1: 'OR',
  2: 'OR',
  3: 'FL',
  4: 'GA',
  5: 'BC',
  6: 'FL',
  7: 'OH',
  8: 'TX'},
 'postal_code': {0: '80302',
  1: '97218',
  2: '97214',
  3: '32763',
  4: '30316',
  5: 'V5V',
  6: '32804',
  7: '43206',
  8: '78752'},
 'latitude': {0: 40.0175444,
  1: 45.5889058992,
  2: 45.5119069956,
  3: 28.9144823,
  4: 33.7470274,
  5: 49.2513423,
  6: 28.573998,
  7: 39.947006523,
  8: 30.346169},
 'longitude': {0: -105.2833481,
  1: -122.5933307507,
  2: -122.6136928797,
  3: -81.2959787,
  4: -84.3534244,
  5: -123.101333,
  6: -81.3892841,
  7: -82.997471,
  8: -97.711458},
 'stars': {0: 4.0,
  1: 4.0,
  2: 4.5,
  3: 3.0,
  4: 4.0,
  5: 3.5,
  6: 4.5,
  7: 4.5,
  8: 5.0},
 'review_count': {0: 86,
  1: 126,
  2: 13,
  3: 8,
  4: 14,
  5: 169,
  6: 7,
  7: 11,
  8: 30},
 'is_open': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1},
 'attributes': {0: '{\'RestaurantsTableService\': \'True\', \'WiFi\': "u\'free\'", \'BikeParking\': \'True\', \'BusinessParking\': "{\'garage\': False, \'street\': True, \'validated\': False, \'lot\': False, \'valet\': False}", \'BusinessAcceptsCreditCards\': \'True\', \'RestaurantsReservations\': \'False\', \'WheelchairAccessible\': \'True\', \'Caters\': \'True\', \'OutdoorSeating\': \'True\', \'RestaurantsGoodForGroups\': \'True\', \'HappyHour\': \'True\', \'BusinessAcceptsBitcoin\': \'False\', \'RestaurantsPriceRange2\': \'2\', \'Ambience\': "{\'touristy\': False, \'hipster\': False, \'romantic\': False, \'divey\': False, \'intimate\': False, \'trendy\': False, \'upscale\': False, \'classy\': False, \'casual\': True}", \'HasTV\': \'True\', \'Alcohol\': "\'beer_and_wine\'", \'GoodForMeal\': "{\'dessert\': False, \'latenight\': False, \'lunch\': False, \'dinner\': False, \'brunch\': False, \'breakfast\': False}", \'DogsAllowed\': \'False\', \'RestaurantsTakeOut\': \'True\', \'NoiseLevel\': "u\'average\'", \'RestaurantsAttire\': "\'casual\'", \'RestaurantsDelivery\': \'None\'}',
  1: '{\'RestaurantsTakeOut\': \'True\', \'RestaurantsAttire\': "u\'casual\'", \'GoodForKids\': \'True\', \'BikeParking\': \'False\', \'OutdoorSeating\': \'False\', \'Ambience\': "{\'romantic\': False, \'intimate\': False, \'touristy\': False, \'hipster\': False, \'divey\': False, \'classy\': False, \'trendy\': False, \'upscale\': False, \'casual\': True}", \'Caters\': \'True\', \'RestaurantsReservations\': \'False\', \'RestaurantsDelivery\': \'False\', \'HasTV\': \'False\', \'RestaurantsGoodForGroups\': \'False\', \'BusinessAcceptsCreditCards\': \'True\', \'NoiseLevel\': "u\'average\'", \'ByAppointmentOnly\': \'False\', \'RestaurantsPriceRange2\': \'2\', \'WiFi\': "u\'free\'", \'BusinessParking\': "{\'garage\': True, \'street\': False, \'validated\': False, \'lot\': False, \'valet\': False}", \'Alcohol\': "u\'beer_and_wine\'", \'GoodForMeal\': "{\'dessert\': False, \'latenight\': False, \'lunch\': True, \'dinner\': False, \'brunch\': False, \'breakfast\': True}"}',
  2: '{\'BusinessAcceptsCreditCards\': \'True\', \'RestaurantsPriceRange2\': \'2\', \'ByAppointmentOnly\': \'False\', \'BikeParking\': \'False\', \'BusinessParking\': "{\'garage\': False, \'street\': True, \'validated\': False, \'lot\': False, \'valet\': False}"}',
  3: "{'RestaurantsPriceRange2': '1', 'BusinessAcceptsCreditCards': 'True', 'GoodForKids': 'True', 'ByAppointmentOnly': 'False'}",
  4: '{\'GoodForKids\': \'False\', \'BusinessParking\': "{\'garage\': False, \'street\': False, \'validated\': False, \'lot\': False, \'valet\': False}", \'BusinessAcceptsCreditCards\': \'True\'}',
  5: '{\'GoodForKids\': \'True\', \'Alcohol\': "u\'none\'", \'RestaurantsGoodForGroups\': \'True\', \'RestaurantsReservations\': \'True\', \'BusinessParking\': "{\'garage\': False, \'street\': True, \'validated\': False, \'lot\': False, \'valet\': False}", \'RestaurantsAttire\': "u\'casual\'", \'BikeParking\': \'True\', \'RestaurantsPriceRange2\': \'2\', \'HasTV\': \'False\', \'NoiseLevel\': "u\'average\'", \'WiFi\': "u\'no\'", \'RestaurantsTakeOut\': \'True\', \'Caters\': \'False\', \'OutdoorSeating\': \'False\', \'Ambience\': "{\'romantic\': False, \'intimate\': False, \'classy\': False, \'hipster\': False, \'divey\': False, \'touristy\': False, \'trendy\': False, \'upscale\': False, \'casual\': True}", \'GoodForMeal\': "{\'dessert\': False, \'latenight\': False, \'lunch\': True, \'dinner\': True, \'brunch\': False, \'breakfast\': False}", \'DogsAllowed\': \'False\', \'RestaurantsDelivery\': \'True\'}',
  6: "{'AcceptsInsurance': 'True', 'BusinessAcceptsCreditCards': 'True', 'ByAppointmentOnly': 'True'}",
  7: nan,
  8: '{\'RestaurantsPriceRange2\': \'1\', \'ByAppointmentOnly\': \'False\', \'BusinessParking\': "{\'garage\': False, \'street\': False, \'validated\': False, \'lot\': True, \'valet\': False}", \'BusinessAcceptsCreditCards\': \'True\', \'DogsAllowed\': \'True\', \'RestaurantsDelivery\': \'None\', \'BusinessAcceptsBitcoin\': \'False\', \'BikeParking\': \'True\', \'RestaurantsTakeOut\': \'None\', \'WheelchairAccessible\': \'True\'}'},
 'categories': {0: 'Gastropubs, Food, Beer Gardens, Restaurants, Bars, American (Traditional), Beer Bar, Nightlife, Breweries',
  1: 'Salad, Soup, Sandwiches, Delis, Restaurants, Cafes, Vegetarian',
  2: 'Antiques, Fashion, Used, Vintage & Consignment, Shopping, Furniture Stores, Home & Garden',
  3: 'Beauty & Spas, Hair Salons',
  4: 'Gyms, Active Life, Interval Training Gyms, Fitness & Instruction',
  5: 'Restaurants, Thai',
  6: 'Dentists, Health & Medical, Orthodontists',
  7: 'Breakfast & Brunch, Restaurants',
  8: 'Shopping, Jewelry Repair, Appraisal Services, Local Services, Jewelry, Engraving, Gold Buyers'},
 'hours': {0: "{'Monday': '11:0-23:0', 'Tuesday': '11:0-23:0', 'Wednesday': '11:0-23:0', 'Thursday': '11:0-23:0', 'Friday': '11:0-23:0', 'Saturday': '11:0-23:0', 'Sunday': '11:0-23:0'}",
  1: "{'Monday': '5:0-18:0', 'Tuesday': '5:0-17:0', 'Wednesday': '5:0-18:0', 'Thursday': '5:0-18:0', 'Friday': '5:0-18:0', 'Saturday': '5:0-18:0', 'Sunday': '5:0-18:0'}",
  2: "{'Thursday': '11:0-18:0', 'Friday': '11:0-18:0', 'Saturday': '11:0-18:0', 'Sunday': '11:0-18:0'}",
  3: nan,
  4: "{'Monday': '16:0-19:0', 'Tuesday': '16:0-19:0', 'Wednesday': '16:0-19:0', 'Thursday': '16:0-19:0', 'Friday': '16:0-19:0', 'Saturday': '9:0-11:0'}",
  5: "{'Monday': '17:0-21:0', 'Tuesday': '17:0-21:0', 'Wednesday': '17:0-21:0', 'Thursday': '17:0-21:0', 'Friday': '17:0-21:0', 'Saturday': '17:0-21:0', 'Sunday': '17:0-21:0'}",
  6: "{'Monday': '0:0-0:0', 'Tuesday': '8:0-17:30', 'Wednesday': '8:0-17:30', 'Thursday': '8:0-17:30', 'Friday': '8:0-17:30'}",
  7: "{'Saturday': '8:0-14:0', 'Sunday': '8:0-14:0'}",
  8: "{'Monday': '12:15-17:0', 'Tuesday': '12:15-17:0', 'Wednesday': '12:15-17:0', 'Thursday': '12:15-17:0', 'Friday': '12:15-17:0'}"}}

I want to count how many times True and False occur in each restaurant's attributes.

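You can join all the elements of each list and count matches of the patterns r'\bTrue\b' / r'\bFalse\b' (\b denotes a word boundary; the r prefix keeps Python from reading \b as a backspace character):

# Join each row's fragments into one string (NaN becomes an empty string)
s = df['attributes'].fillna('').apply(''.join)
# Raw strings keep \b as a regex word boundary
df['nb_True'] = s.str.count(r'\bTrue\b')
df['nb_False'] = s.str.count(r'\bFalse\b')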

output:

>>> df[['nb_True', 'nb_False']]
   nb_True  nb_False
0       12        21
1        8        23
2        2         6
3        2         1
4        1         6
5       10        20
6        3         0
7        0         0
8        5         6
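As an alternative to pattern counting, the attribute strings in the sample dataframe look like Python dictionary literals whose values are themselves quoted literals, so they can be parsed with `ast.literal_eval` from the standard library. The sketch below is an assumption-laden illustration, not part of the original answer: the helper names `parse` and `count_bools` are hypothetical, and `row` is the string at index 2 of the sample data.

```python
import ast

# The 'attributes' string at index 2 of the sample dataframe
row = '{\'BusinessAcceptsCreditCards\': \'True\', \'RestaurantsPriceRange2\': \'2\', \'ByAppointmentOnly\': \'False\', \'BikeParking\': \'False\', \'BusinessParking\': "{\'garage\': False, \'street\': True, \'validated\': False, \'lot\': False, \'valet\': False}"}'


def parse(value):
    """Recursively turn nested literal strings into Python objects."""
    if isinstance(value, str):
        try:
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # plain text such as 'free' stays a string
        return parse(parsed)
    if isinstance(value, dict):
        return {key: parse(item) for key, item in value.items()}
    return value


def count_bools(obj):
    """Return (number of True values, number of False values) in obj."""
    if isinstance(obj, bool):
        return (1, 0) if obj else (0, 1)
    if isinstance(obj, dict):
        counts = [count_bools(item) for item in obj.values()]
        return sum(c[0] for c in counts), sum(c[1] for c in counts)
    return 0, 0


attributes = parse(row)
print(count_bools(attributes))  # → (2, 6)
```

The result agrees with row 2 of the output above. On the dataframe itself, the same helpers could be applied to the non-null entries of the column; missing values (NaN) are floats, which `count_bools` counts as (0, 0).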
