I have a dataset:
id1 id2 val
first10 second5 10
second3 first19 14
first2 second7 8
first10 second10 1
second8 first22 9
I want to swap values in columns id1 and id2 to have in id1 only values containing "first" and in id2 only values containing "second". So desired result is:
id1 id2 val
first10 second5 10
first19 second3 14
first2 second7 8
first10 second10 1
first22 second8 9
How could I do that? i know about str.contains() but how should condition look like?
here is one way do it, using np.where by checking if the first id starts with second, then swap, else keep as is
df['id1'],df['id2']=np.where(df['id1'].str.startswith('second'),
[df['id2'],df['id1']],
[df['id1'],df['id2']]
)
df
id1 id2 val
0 first10 second5 10
1 first19 second3 14
2 first2 second7 8
3 first10 second10 1
4 first22 second8 9
Maybe not the most elegant solution, but you can achieve it, transforming DataFrame in a list of dict and then making it a DataFrame again. For example:
df_list = []
for i in range(df.shape[0]):
row = dict(df.iloc[i])
new_row = {}
for key in row:
val = row[key]
if type(val) == str and val.startswith('f'):
new_row['id1'] = val
elif type(val) == str and val.startswith('s'):
new_row['id2'] = val
else:
new_row['val'] = val
df_list.append(new_row)
new_df = pd.DataFrame(df_list)
My suggestion is to define a function that can take a row of the dataframe and swap the id1
and id2
values in the condition required, and then apply it to the dataframe with df.apply(func, axis=1)
.
def conditional_swap(row):
if row["id1"].startswith("second"):
row["id1"], row["id2"] = row["id2"], row["id1"]
return row
df.apply(conditional_swap, axis=1)
( axis=1
ensures that it is applied row-wise, not column-wise)
Hope this helps!
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.