I have a csv file separated by tabs:
I need only to focus in the two first columns and find, for example, if the pair AB appears in the document again as BA and print AB if the BA appears. The same for the rest of pairs.
For the example proposed the output is: · AB & CD
dic ={}
import sys
import os
import pandas as pd
import numpy as np
import csv
colnames = ['col1', 'col2', 'col3', 'col4', 'col5']
data = pd.read_csv('koko.csv', names=colnames, delimiter='\t')
col1 = data.col1.tolist()
col2 = data.col2.tolist()
dataset = list(zip(col1,col2))
for a,b in dataset:
if (a,b) and (b,a) in dataset:
dic [a] = b
print (dic)
output = {'A': 'B', 'B': 'A', 'D': 'C', 'C':'D'}
How can I avoid duplicated (or swapped) results in the dictionary?
Does this work?:
import pandas as pd
import numpy as np
col_1 = ['A', 'B', 'C', 'B', 'D']
col_2 = ['B', 'C', 'D', 'A', 'C']
df = pd.DataFrame(np.column_stack([col_1,col_2]), columns = ['Col1', 'Col2'])
df['combined'] = list(zip(df['Col1'], df['Col2']))
final_set = set(tuple(sorted(t)) for t in df['combined'])
final_set looks like this:
{('C', 'D'), ('A', 'B'), ('B', 'C')}
The output contains more than AB and CD because of the second row that has BC
The below should work,
example df used:
df = pd.DataFrame({'Col1' : ['A','C','D','B','D','A'], 'Col2' : ['B','D','C','A','C','B']})
This is the function I used:
temp = df[['Col1','Col2']].apply(lambda row: sorted(row), axis = 1)
print(temp[['Col1','Col2']].drop_duplicates())
useful links:
checking if a string is in alphabetical order in python
Difference between map, applymap and apply methods in Pandas
Here is one way.
df = pd.DataFrame({'Col1' : ['A','C','D','B','D','A','E'],
'Col2' : ['B','D','C','A','C','B','F']})
df = df.drop_duplicates()\
.apply(sorted, axis=1)\
.loc[df.duplicated(subset=['Col1', 'Col2'], keep=False)]\
.drop_duplicates()
# Col1 Col2
# 0 A B
# 1 C D
Explanation
The steps are:
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.