Python pandas read_csv: merge every two columns and read as a dataframe

Question

I'm a beginner in Python and pandas, trying to figure out how to read from a CSV in a particular way.

My datafile

01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G .......
01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G .......
...
...
...

So I have 100,000 columns in this file and I want to merge every two columns into one. But the merging needs to occur after the first 6 columns. I would prefer to do this while reading the file, if possible, instead of manipulating this huge datafile afterwards.

Desired outcome

01 AAA1234 AAA32452 AAA123123 0 -9 CC AA TG AG .......
01 AAA1334 AAA12452 AAA125123 1 -9 CA TG TG TG .......
...
...
...

That will result in a dataframe with half the columns. My datafile has no column names; the names reside in a different CSV, but that is another subject.

I'd appreciate a solution, thanks in advance!

Answer 1

Separate the data frame initially. I created one for experimental purposes.
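The experimental frame itself is not shown in this answer. A minimal stand-in, assuming a whitespace-delimited file with no header row as in the question, might look like the sketch below. The sample rows are copied from the question without the elided tail, and the names `sample`, `raw`, `meta` and `rest` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample in the question's layout: 6 leading columns,
# followed by the columns that should be merged in pairs.
sample = io.StringIO(
    "01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G\n"
    "01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G\n"
)

# header=None because the file has no column names;
# dtype=str keeps values such as "01" and "-9" exactly as written.
raw = pd.read_csv(sample, sep=r"\s+", header=None, dtype=str)

# Separate the frame: the first 6 columns are left alone,
# the remaining columns are the ones to merge two at a time.
meta = raw.iloc[:, :6]
rest = raw.iloc[:, 6:]
```

Here `rest` is the part that would be passed to a pair-merging function, and `meta` can be joined back on afterwards.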

Then I defined a function and passed in the dataframe that needed manipulation as an argument:

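import pandas as pd

def columns_joiner(data):
    new_data = pd.DataFrame()
    # Step through the columns two at a time. This range covers every pair;
    # change it if you only want to merge some of the columns.
    for i in range(0, data.shape[1] - 1, 2):
        # Positional access, so it works whatever the column labels are.
        # Adding two string columns concatenates them element-wise.
        ser = data.iloc[:, i] + data.iloc[:, i + 1]
        new_data = pd.concat([new_data, ser], axis=1)

    return new_data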

I don't think this is an efficient solution, but it worked for me.
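For a file as wide as the one in the question, a vectorized variant may be worth trying. The sketch below is an alternative to the loop, not the method used in this answer: it reads the file, then adds the even-positioned and odd-positioned columns in one step with array slicing. The function name `read_and_merge_pairs` and the parameter `n_fixed` are hypothetical, and it assumes whitespace-delimited input with no header, no missing values, and an even number of columns after the fixed ones. As far as I know `read_csv` has no option to merge columns during parsing, so the merge still happens right after reading.

```python
import io

import pandas as pd


def read_and_merge_pairs(source, n_fixed=6):
    """Read a whitespace-delimited file and merge every two columns after the first n_fixed."""
    raw = pd.read_csv(source, sep=r"\s+", header=None, dtype=str)
    fixed = raw.iloc[:, :n_fixed]
    # Object array of Python strings, so "+" means string concatenation.
    values = raw.iloc[:, n_fixed:].to_numpy(dtype=object)
    if values.shape[1] % 2:
        raise ValueError("expected an even number of columns after the fixed ones")
    # Columns 0, 2, 4, ... plus columns 1, 3, 5, ..., element by element.
    merged = pd.DataFrame(values[:, 0::2] + values[:, 1::2], index=raw.index)
    # ignore_index=True renumbers the resulting columns 0..n-1.
    return pd.concat([fixed, merged], axis=1, ignore_index=True)


# Usage with the two sample rows from the question (elided tail left out).
sample = io.StringIO(
    "01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G\n"
    "01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G\n"
)
result = read_and_merge_pairs(sample)
```

If the file does not fit in memory in one go, the same merge could be applied per chunk by passing `chunksize` to `read_csv` and concatenating the processed chunks.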

Solution 1
0 2021-03-31 13:30:37
