
Create Boolean Columns in Pandas Dataframe using Dictionary

I am using a network trace dataset, and have loaded the initial data into a pandas dataframe, which looks like this:

[Image: initial dataframe]

I have created a Python dict that maps common port numbers to application names, like this:

port_dict = {80: 'http', 20: 'ftp', 21: 'ftp'}

I want to add columns to my dataframe, one for each unique value in port_dict. If either sport or dport holds a port that maps to that name, the new column should be True for that row, and False otherwise, like this:

[Image: modified dataframe]

In the above picture, the column https should have True as the sport is 443 .

How would I go about accomplishing this?

Try this out. Series.map looks up every port in the dictionary in one vectorized call, which should be faster than a row-by-row loop. pandas.get_dummies then turns the resulting column of service names into one indicator column per distinct name; I convert those to bool and combine the sport and dport results with or ( | ), so a column is True when the service was on either port.

import pandas as pd

# One-hot encode the service found on each port, then OR the two results
services = pd.get_dummies(df['sport'].map(port_dict)).astype(bool) | pd.get_dummies(df['dport'].map(port_dict)).astype(bool)

df[services.columns] = services

In [166]: df.head()
Out[166]: 
   dport  sport    ftp   http
0      1      1  False  False
1     80      2  False   True
2      2     80  False   True
3      3     20   True  False
4      1      1  False  False
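
One thing the approach above leaves open is what happens when a service shows up on only one side, or not at all: get_dummies only creates columns for names it actually sees. The following is a minimal sketch of one way to guarantee a column for every name in port_dict, by reindexing both indicator frames to the same column list before combining them. The helper name port_flags, the sample data, and the extra 443 entry are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# 443 -> 'https' is added here for illustration only
port_dict = {80: 'http', 20: 'ftp', 21: 'ftp', 443: 'https'}
df = pd.DataFrame({'dport': [1, 80, 2, 3, 1], 'sport': [1, 2, 80, 20, 443]})

# Every service name that should become a column, in a stable order
service_names = sorted(set(port_dict.values()))

def port_flags(ports):
    """Return one boolean column per service name for a Series of ports."""
    dummies = pd.get_dummies(ports.map(port_dict))
    # reindex adds all-False columns for services that never occur in `ports`
    return dummies.reindex(columns=service_names, fill_value=False).astype(bool)

# Both frames now share the same columns, so | needs no alignment
services = port_flags(df['sport']) | port_flags(df['dport'])
df = df.join(services)
```

Because both frames are forced onto the same columns, the result does not depend on how pandas aligns mismatched columns during the | operation.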

Alternatively, I would suggest keeping a single service column: if sport or dport is one of the port_dict keys, the matching service name is written to that column, and False otherwise:

import pandas as pd

port_dict = {80: 'http', 20: 'ftp', 21: 'ftp'}

# a small example dataframe
df = pd.DataFrame(data={'sport': [1, 2, 80, 20], 'dport': [1, 80, 2, 3]})

found = []
for i in df.index:
    # prefer a match on sport, then on dport; False when neither port is known
    found_service = port_dict.get(df.at[i, 'sport'], False) or port_dict.get(df.at[i, 'dport'], False)
    found.append(found_service)
df['service'] = found

>>> df
   sport  dport service
0      1      1   False
1      2     80    http
2     80      2    http
3     20      3     ftp
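
If the boolean columns from the question are still wanted, the same "is the port one of the known keys" test can be written without a row loop. This is a rough sketch of a different technique from the one above: it inverts port_dict into a service-to-ports mapping (the name ports_by_service is made up for this example) and uses Series.isin on both port columns.

```python
from collections import defaultdict

import pandas as pd

port_dict = {80: 'http', 20: 'ftp', 21: 'ftp'}
df = pd.DataFrame({'sport': [1, 2, 80, 20], 'dport': [1, 80, 2, 3]})

# Invert the mapping: service name -> list of ports that belong to it
ports_by_service = defaultdict(list)
for port, name in port_dict.items():
    ports_by_service[name].append(port)

# One boolean column per service: True if either port belongs to it
for name, ports in ports_by_service.items():
    df[name] = df['sport'].isin(ports) | df['dport'].isin(ports)
```

This creates a column for every service in the dictionary, whether or not it appears in the data.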
