I have a dataframe with a column of tuples (df.row_col) that I need to search using a list of tuples. If a tuple from the list is present in the dataframe column, I want to return that row and add a new column to the dataframe. I tried the loop below, but I'm not sure whether I can iterate over the list like this. I'd much appreciate the help!
data_tuples = [
    (7, 45),
    (13, 34),
    (17, 51),
    (17, 52),
    (17, 53),
    (17, 54),
    (17, 55),
    (18, 50),
]
Dataframe to search:
   index  farm  layer  row  column  Qmax  row_col
0      1     1      3    7      36   0.0  (7, 36)
1      2     1      3    7      37   0.0  (7, 37)
2      3     1      3    8      35   0.0  (8, 35)
3      4     1      3    8      36   0.0  (8, 36)
4      5     1      3    8      37   0.0  (8, 37)
for tup in data_tuples:
    new_df = df[df["row_col"].apply(lambda x: True if tup in x else False)]
return new_df
You can use Series.map(...) to accomplish what you're trying to do. First, you can create a boolean mask (a column of True/False) based on whether each tuple is present in data_tuples or not:
tuple_present_in_list = df["row_col"].map(lambda x: x in data_tuples)
Then, you can filter your original DataFrame down to just those rows (if that's what you're trying to do):
new_df = df[tuple_present_in_list]
The key thing here is that .map() applies your logic to a single column (which is a pandas Series), checking each "row_col" value to see whether it is in your tuple list.
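As a minimal, self-contained sketch of the two steps above: the frame here is a cut-down, hypothetical rebuild of the sample data (only the row and column fields), and (8, 36) has been added to the list purely so that one row matches. Converting the list to a set is an optional tweak rather than part of the answer itself; it keeps each membership test cheap when the list is long.

```python
import pandas as pd

# Hypothetical, reduced version of the sample frame from the question.
df = pd.DataFrame({
    "row": [7, 7, 8, 8, 8],
    "column": [36, 37, 35, 36, 37],
})
# Build the tuple column from the two integer columns.
df["row_col"] = list(zip(df["row"], df["column"]))

# Assumption: (8, 36) is appended only so that the example has a match.
data_tuples = [(7, 45), (13, 34), (8, 36)]
lookup = set(data_tuples)  # optional: set membership avoids scanning the list

# Step 1: boolean mask, one True/False per row.
tuple_present_in_list = df["row_col"].map(lambda x: x in lookup)

# Step 2: keep only the rows where the mask is True.
new_df = df[tuple_present_in_list]
```

Note that the lambda tests the whole tuple x against the collection (x in lookup), not the other way round as in the question's loop.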
Here's another answer about the difference between apply and map: Difference between map, applymap and apply methods in Pandas
And here's the pandas documentation for .map(): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html
isin lets you check whether each value is in a list (or other iterable) object. For example, if you have the following:
data_tuples = [
    (8, 36),
    (7, 37),
]
df
+----+-----+---------+--------+---------+-------+----------+--------+-----------+
| | a | index | farm | layer | row | column | Qmax | row_col |
|----+-----+---------+--------+---------+-------+----------+--------+-----------|
| 0 | 0 | 1 | 1 | 3 | 7 | 36 | 0 | (7, 36) |
| 1 | 1 | 2 | 1 | 3 | 7 | 37 | 0 | (7, 37) |
| 2 | 2 | 3 | 1 | 3 | 8 | 35 | 0 | (8, 35) |
| 3 | 3 | 4 | 1 | 3 | 8 | 36 | 0 | (8, 36) |
| 4 | 4 | 5 | 1 | 3 | 8 | 37 | 0 | (8, 37) |
+----+-----+---------+--------+---------+-------+----------+--------+-----------+
Then we can use the isin function:
df[df["row_col"].isin(data_tuples)]
+----+-----+---------+--------+---------+-------+----------+--------+-----------+
| | a | index | farm | layer | row | column | Qmax | row_col |
|----+-----+---------+--------+---------+-------+----------+--------+-----------|
| 1 | 1 | 2 | 1 | 3 | 7 | 37 | 0 | (7, 37) |
| 3 | 3 | 4 | 1 | 3 | 8 | 36 | 0 | (8, 36) |
+----+-----+---------+--------+---------+-------+----------+--------+-----------+
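Since the question also asks for a new column, one possible way to combine that with isin is to store the mask on the DataFrame before filtering. This is only a sketch under assumptions: the column name in_list is made up for illustration, and the frame is a reduced rebuild of the sample data with just the row and column fields.

```python
import pandas as pd

# Hypothetical, reduced version of the sample frame.
df = pd.DataFrame({
    "row": [7, 7, 8, 8, 8],
    "column": [36, 37, 35, 36, 37],
})
df["row_col"] = list(zip(df["row"], df["column"]))

data_tuples = [
    (8, 36),
    (7, 37),
]

# New boolean column: True where row_col appears in data_tuples.
# "in_list" is an illustrative name, not something from the original post.
df["in_list"] = df["row_col"].isin(data_tuples)

# Rows that matched; .copy() avoids modifying a view of df later on.
matches = df[df["in_list"]].copy()
```

Keeping the mask as a column leaves the full DataFrame intact, so the matching rows can be selected later without recomputing the lookup.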