How to convert multiple rows into a single row for the same ID using pandas

Question

I have a text file in the format below. It has no header. Each unique ID has four rows, and I need to combine them into a single row per ID using pandas. For example, if the file has 8 rows, the output should have 2 rows.

    xyz,name,,,12345
    2nd street,add,,,12345
    xyx@mail.com,email,,,12345
    575xxx5678,contact,,,12345

Expected output:

xyz,name,,,12345,2nd street,add,,,12345,xyx@mail.com,email,,,12345,575xxx5678,contact,,,12345

Here the unique ID is 12345. Can anyone help me resolve this? Thanks in advance.

Answer 1

Suppose you have this file.csv:

www,contact,,,99999
xyz,name,,,12345
2nd street,add,,,12345
xyx@mail.com,email,,,12345
575xxx5678,contact,,,12345
qqq,contact,,,99999

To read it into pandas:

import pandas as pd
# The file has no header row, so pass the column names explicitly.
df = pd.read_csv("file.csv", names=["col1", "col2", "col3", "col4", "ID"])
print(df)

Prints:

           col1     col2  col3  col4     ID
0           www  contact   NaN   NaN  99999
1           xyz     name   NaN   NaN  12345
2    2nd street      add   NaN   NaN  12345
3  xyx@mail.com    email   NaN   NaN  12345
4    575xxx5678  contact   NaN   NaN  12345
5           qqq  contact   NaN   NaN  99999
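
Before reshaping, it can help to check how many rows each ID has: the question assumes four rows per ID, while this sample file has only two rows for 99999. A minimal sketch, assuming an in-memory copy of the sample data (the `io.StringIO` buffer named `sample` is only a stand-in for file.csv):

```python
import io

import pandas as pd

# In-memory stand-in for file.csv (same sample rows as above).
sample = io.StringIO(
    "www,contact,,,99999\n"
    "xyz,name,,,12345\n"
    "2nd street,add,,,12345\n"
    "xyx@mail.com,email,,,12345\n"
    "575xxx5678,contact,,,12345\n"
    "qqq,contact,,,99999\n"
)
df = pd.read_csv(sample, names=["col1", "col2", "col3", "col4", "ID"])

# Number of rows per ID; groups of different sizes lead to
# output rows of different lengths after flattening.
rows_per_id = df.groupby("ID").size()
print(rows_per_id)
```

Any ID whose count differs from the others will produce a shorter or longer line once the rows are combined.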

Then to convert it to your desired output:

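flat = (
    # Copy the ID column so it survives the groupby (the original becomes the index).
    df.assign(ID2=df["ID"])
    .groupby("ID")
    .agg(list)
    # Interleave the per-column lists back into row order and flatten them.
    .apply(lambda row: [v for values in zip(*row) for v in values], axis=1)
)

# Stack the flattened lists into a frame and write it without header or index.
pd.DataFrame(flat.tolist()).to_csv("output.txt", sep=",", header=False, index=False)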

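This creates output.txt. The 99999 group has only two rows, so its line is padded with empty fields, and the padded columns are stored as floats, which is why some IDs appear as 12345.0: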

xyz,name,,,12345,2nd street,add,,,12345,xyx@mail.com,email,,,12345.0,575xxx5678,contact,,,12345.0
www,contact,,,99999,qqq,contact,,,99999,,,,,,,,,,
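
If the 12345.0 values and the trailing empty fields are unwanted, one possible variation is to read every field as text and write each group's values with the standard csv module instead of building a second DataFrame. This is only a sketch under a few assumptions: `dtype=str` and `keep_default_na=False` are used to keep the fields exactly as written, `sort=False` keeps IDs in order of first appearance, and the in-memory `sample` and `out` buffers are stand-ins for file.csv and output.txt.

```python
import csv
import io

import pandas as pd

# In-memory stand-in for file.csv (same sample rows as in the answer).
sample = io.StringIO(
    "www,contact,,,99999\n"
    "xyz,name,,,12345\n"
    "2nd street,add,,,12345\n"
    "xyx@mail.com,email,,,12345\n"
    "575xxx5678,contact,,,12345\n"
    "qqq,contact,,,99999\n"
)

# Read everything as text so IDs stay "12345" and empty fields stay "".
df = pd.read_csv(
    sample,
    names=["col1", "col2", "col3", "col4", "ID"],
    dtype=str,
    keep_default_na=False,
)

out = io.StringIO()  # stand-in for open("output.txt", "w", newline="")
writer = csv.writer(out, lineterminator="\n")

# One output line per ID: concatenate that ID's rows, left to right.
for _, group in df.groupby("ID", sort=False):
    flat_row = [value for row in group.itertuples(index=False) for value in row]
    writer.writerow(flat_row)

print(out.getvalue())
```

With the sample file this writes one line per ID with no padding, and the IDs appear in the order they are first seen rather than sorted.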
