Python pandas read_csv: merge every two columns and read them as a DataFrame
I'm a beginner in Python and pandas, and I'm trying to figure out how to read a CSV file in a particular way.
My data file:
01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G .......
01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G .......
...
...
...
The file has 100,000 columns, and I want to merge every two columns into one. The merging should only start after the first 6 columns, which stay as they are. If possible, I would prefer to do this while reading the file rather than manipulating this huge data file afterwards.
Desired outcome:
01 AAA1234 AAA32452 AAA123123 0 -9 CC AA TG AG .......
01 AAA1334 AAA12452 AAA125123 1 -9 CA TG TG TG .......
...
...
...
The result is a DataFrame with roughly half as many columns. My data file has no column names; the names live in a separate CSV, but that is another subject.
I'd appreciate a solution. Thanks in advance!
Separate the data frame first, so that the columns to be merged are on their own. I created one for experimental purposes.
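For illustration, the separation step might look like the sketch below. It assumes a whitespace-delimited file with no header row; the two sample rows are copied from the question, and the names `fixed` and `to_merge` are hypothetical.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same layout as the question's file.
sample = io.StringIO(
    "01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G\n"
    "01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G\n"
)

# header=None because the file has no column names; dtype=str keeps values
# such as "01" as text, so columns can be joined by string concatenation.
df = pd.read_csv(sample, sep=r"\s+", header=None, dtype=str)

fixed = df.iloc[:, :6]            # first 6 columns, left untouched
to_merge = df.iloc[:, 6:].copy()  # remaining columns, to be merged in pairs

# Relabel the columns 0..n-1 so they can be addressed from zero.
to_merge.columns = range(to_merge.shape[1])
```

Reading everything as strings is a deliberate choice here: if pandas inferred numeric types, `+` would add numbers instead of joining text.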
Then I defined a function and passed the data frame that needed manipulation to it as an argument:
import pandas as pd

def columns_joiner(data):
    # Join each pair of adjacent columns (0+1, 2+3, ...) by string concatenation.
    # Assumes an even number of columns that all hold strings.
    pairs = []
    for i in range(0, data.shape[1] - 1, 2):  # stop early so column i + 1 always exists
        pairs.append(data.iloc[:, i] + data.iloc[:, i + 1])
    return pd.concat(pairs, axis=1, ignore_index=True)
I don't think this is an efficient solution, but it worked for me.
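If the column-by-column loop turns out to be too slow for 100,000 columns, one possible alternative is to join all the pairs at once with NumPy slicing. The following is only a sketch under assumptions: the file is whitespace-delimited, every value is read as a string, and the number of columns after the first 6 is even. The helper name `merge_pairs` and its `n_fixed` parameter are hypothetical.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same layout as the question's file.
sample = io.StringIO(
    "01 AAA1234 AAA32452 AAA123123 0 -9 C C A A T G A G\n"
    "01 AAA1334 AAA12452 AAA125123 1 -9 C A T G T G T G\n"
)
df = pd.read_csv(sample, sep=r"\s+", header=None, dtype=str)


def merge_pairs(frame, n_fixed=6):
    """Keep the first n_fixed columns and join the rest in adjacent pairs."""
    fixed = frame.iloc[:, :n_fixed]
    values = frame.iloc[:, n_fixed:].to_numpy(dtype=object)
    if values.shape[1] % 2:
        raise ValueError("expected an even number of columns to merge")
    # On object arrays of Python strings, + concatenates element-wise:
    # even-positioned columns are joined with the odd-positioned ones after them.
    joined = values[:, 0::2] + values[:, 1::2]
    merged = pd.DataFrame(joined, index=frame.index)
    # ignore_index=True relabels the resulting columns 0..n-1.
    return pd.concat([fixed, merged], axis=1, ignore_index=True)


result = merge_pairs(df)
print(result)
```

This still merges after reading rather than during it. For a file too large to hold in memory, the same function could be applied to each piece returned by `pd.read_csv(..., chunksize=...)`, which splits the file by rows.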