How to split pandas data frame by many criteria

Question

I have ~150,000 rows of data detailing email bounces by domain, email template, bounce type and the count of each by day. It is formatted like the below:

+--------+-------------+-----------------+-------+---------+-------+
|   t    | bounce_type |    source_ip    |  tid  |  emld   | count |
+--------+-------------+-----------------+-------+---------+-------+
| 1/1/15 | hard        | 199.122.255.142 | 10033 | aol.com |     4 |
+--------+-------------+-----------------+-------+---------+-------+

What is the easiest way to select only rows with an emld of "aol.com", bounce type of "hard", from all source ips and all tids? Is this something I would create a function for and pass the dataframe through, or is there a simpler operation to filter the data by these criteria?

Answer 1

An easy way is to use a boolean mask. Supposing your DataFrame is called df, it would look something like this:

masked = (df['emld'] == 'aol.com') & (df['bounce_type'] == 'hard')
# then the result will be
df[masked]

Shorthand version in one line:

df[(df['emld'] == 'aol.com') & (df['bounce_type'] == 'hard')]
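If you want to try this on something small first, here is a minimal self-contained sketch. The sample rows are made up for illustration (only the first one matches the row shown in the question), and the column names are taken from the question's table. The parentheses around each comparison matter, because & binds more tightly than == in Python.

```python
import pandas as pd

# Hypothetical sample data; only the first row comes from the question.
df = pd.DataFrame({
    "t": ["1/1/15", "1/1/15", "1/2/15", "1/2/15"],
    "bounce_type": ["hard", "soft", "hard", "hard"],
    "source_ip": ["199.122.255.142", "199.122.255.142",
                  "199.122.255.143", "199.122.255.142"],
    "tid": [10033, 10033, 10034, 10035],
    "emld": ["aol.com", "aol.com", "gmail.com", "aol.com"],
    "count": [4, 2, 7, 1],
})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (df["emld"] == "aol.com") & (df["bounce_type"] == "hard")

# Indexing with the boolean Series keeps only the rows where it is True.
result = df[mask]
print(result)
```

No function is needed for this; the mask is just a column of True/False values, so you can build it once and reuse it.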

To return just the source_ip and tid columns:

df[masked][['source_ip', 'tid']]

Or,

df[(df['emld'] == 'aol.com') & (df['bounce_type'] == 'hard')][['source_ip', 'tid']]
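As an alternative sketch, .loc can select the rows and the columns in a single step, and DataFrame.query lets you write the same condition as a string. The sample data below is hypothetical, and the column names (including tid) follow the table in the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "t": ["1/1/15", "1/1/15", "1/2/15"],
    "bounce_type": ["hard", "soft", "hard"],
    "source_ip": ["199.122.255.142", "199.122.255.142", "199.122.255.143"],
    "tid": [10033, 10033, 10034],
    "emld": ["aol.com", "aol.com", "gmail.com"],
    "count": [4, 2, 7],
})

mask = (df["emld"] == "aol.com") & (df["bounce_type"] == "hard")

# .loc takes the row mask and the column list together,
# which avoids chaining two separate indexing operations.
subset = df.loc[mask, ["source_ip", "tid"]]

# query() expresses the same row filter as a string expression.
same_rows = df.query("emld == 'aol.com' and bounce_type == 'hard'")

print(subset)
print(same_rows)
```

Both forms return the same rows as the mask approach; which one to use is mostly a matter of taste.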

Hope this helps.
