
I want to pick the first 4 words of each row of a column and, based on the value, assign a new value to another newly created column using Python

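A picture of the first 5 rows of my dataset is given below. What I want to do is create a new column named "Parking Type" and assign it the values "Meter", "Ticket" and "Other" based on another column named "Sign". The "Sign" column holds strings; some of them contain MTR, some contain TKT, and some contain neither. So I want to put the value "Meter" in the "Parking Type" column only if the row's "Sign" value contains the string "MTR", and so on. I am doing something like this: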

pSignInfringe['Parking Type'] = pSignInfringe.Sign.apply(lambda x: "Meter" if x == "1P MTR M-SAT 7:30-19:30" or x == "1/2P MTR SAT 7:30-1930" else "Ticket")

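But then it would need far too many or statements. Is there a better way to do this? I am new to Python, so I apologize if this is a beginner question. The dataframe is as follows: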

,Area Name,Street Name,Between Street 1,Between Street 2,Side Of Street,Street Marker,Arrival Time,Departure Time,Duration of Parking Event (in seconds),Sign,In Violation?,Street ID,Device ID,Month Number
8,City Square,FLINDERS STREET,SWANSTON STREET,RUSSELL STREET,3,1630N,2012-05-19 18:20:01,2012-05-19 19:19:58,3597,1/2P MTR SAT 7:30-1930,1,670,1123,5
10,Chinatown,RUSSELL STREET,Lt BOURKE STREET,BOURKE STREET,2,770E,2012-02-25 18:30:31,2012-02-25 21:02:36,9125,2P DIS M-SUN 0:00-23:59,1,1221,504,2
11,Princes Theatre,LONSDALE STREET,RUSSELL STREET,EXHIBITION STREET,1,C2858,2011-11-17 09:00:00,2011-11-17 10:41:06,6066,1P MTR M-SAT 7:30-19:30,1,894,1996,11
15,Southbank,COVENTRY STREET,DODDS STREET,WELLS STREET,4,9317S,2012-02-20 13:50:40,2012-02-20 16:33:33,9773,2P TKT A M-F 7:30-18:30,1,547,4054,2
28,Queensberry,VICTORIA STREET,KING STREET,HAWKE STREET,3,7642N,2012-02-15 11:32:34,2012-02-15 12:09:35,2221,1/4P M-SAT 7:30-18:30,1,1381,4001,2
30,Rialto,COLLINS STREET,KING STREET,WILLIAM STREET,3,2066N,2012-09-03 09:24:51,2012-09-03 10:45:41,4850,1/2P M-SAT 7:30-19:30,1,528,1290,9
45,Victoria Market,FRANKLIN STREET,QUEEN STREET,ELIZABETH STREET,1,C6628,2011-11-11 17:42:32,2011-11-11 19:50:44,7692,2P MTR M-SAT 7:30-20:30,1,681,2812,11
53,Hardware,LONSDALE STREET,QUEEN STREET,ELIZABETH STREET,1,C2942,2012-05-05 13:17:55,2012-05-05 14:59:35,6100,1P MTR M-SAT 7:30-19:30,1,894,2019,5
55,Hyatt,EXHIBITION STREET,Lt COLLINS STREET,COLLINS STREET,1,C364,2011-01-11 08:11:48,2011-01-11 16:48:39,31011,1P MTR M-SAT 7:30-19:30,1,647,243,1
56,Banks,QUEEN STREET,FLINDERS LANE,FLINDERS STREET,5,975W,2012-03-03 12:53:27,2012-03-03 14:06:27,4380,1P MTR M-SAT 7:30-19:30,1,1171,693,3

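If the desired value of "Parking Type" depends only on the presence of "MTR", you may find this better. It covers every case where MTR appears in the Sign field, without hard-coding all the possible values.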

pSignInfringe['Parking Type'] = pSignInfringe.Sign.apply(lambda x: "Meter" if 'MTR' in x else "Ticket")
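Since the question asks for three values ("Meter", "Ticket" and "Other"), the same substring check can be extended with a small named function. This is only a minimal sketch: the helper name classify_sign is made up for illustration, the sample frame is built from a few Sign values shown in the question, and treating non-string values (such as NaN) as "Other" is an assumption rather than something the question states.

```python
import pandas as pd

# Small sample built from Sign values shown in the question.
pSignInfringe = pd.DataFrame({
    "Sign": [
        "1/2P MTR SAT 7:30-1930",
        "2P TKT A M-F 7:30-18:30",
        "1/4P M-SAT 7:30-18:30",
    ]
})

def classify_sign(sign):
    """Hypothetical helper: map a Sign string to a parking type."""
    if not isinstance(sign, str):
        return "Other"  # assumption: missing values count as "Other"
    if "MTR" in sign:
        return "Meter"
    if "TKT" in sign:
        return "Ticket"
    return "Other"

# apply calls classify_sign once per row of the Sign column.
pSignInfringe["Parking Type"] = pSignInfringe["Sign"].apply(classify_sign)
print(pSignInfringe)
```

A named function keeps the three-way logic readable, where a nested conditional inside a lambda would quickly become hard to follow.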

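Try a for loop. You could also use a list comprehension, but I don't recommend it since you are just starting out with Python.

I wrote some code based on your description.

Take a look and let me know if it works.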

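# Collect one label per row, then assign the whole list as the new column.
parking_types = []
for item in pSignInfringe['Sign']:
    if "MTR" in item:
        parking_types.append("Meter")
    else:
        parking_types.append("Ticket")
pSignInfringe['Parking Type'] = parking_types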

You can use .str.contains, which returns a boolean Series with the same index as the DataFrame, and then use that Series as an indexer:

pSignInfringe.loc[
    pSignInfringe.Sign.str.contains('MTR'),
    'Parking Type'] = 'Meter'

Note that .str.contains treats its pattern as a regular expression by default.

This way you avoid a generic apply call, which makes your code faster.
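The vectorized approach can also cover all three values from the question in one pass. The sketch below is one possible way to do it, not the answerer's own code: it combines .str.contains with numpy.select, passes regex=False so that "MTR" and "TKT" are matched as plain substrings, and passes na=False so that missing Sign values fall through to "Other" (that last choice is an assumption). The sample data is made up from Sign values in the question.

```python
import numpy as np
import pandas as pd

# Small sample built from Sign values shown in the question, plus one missing value.
pSignInfringe = pd.DataFrame({
    "Sign": [
        "1P MTR M-SAT 7:30-19:30",
        "2P TKT A M-F 7:30-18:30",
        "2P DIS M-SUN 0:00-23:59",
        None,
    ]
})

sign = pSignInfringe["Sign"]

# One boolean mask per category; the first matching mask wins.
conditions = [
    sign.str.contains("MTR", regex=False, na=False),
    sign.str.contains("TKT", regex=False, na=False),
]
choices = ["Meter", "Ticket"]

# Rows matching neither mask get the default value.
pSignInfringe["Parking Type"] = np.select(conditions, choices, default="Other")
print(pSignInfringe)
```

Unlike the .loc assignment, which fills only the matching rows, this gives every row a value.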

