How to read multiple lines from csv into a single dataframe row with pandas
I have a file that has a comment on the first line, followed by two lines with the header names split across them, and a third line with the index names. The file looks like this:
# 3 5 <-- this is a comment indicating how many rows and column are matrix data
head1 head2 head3
head4 head5
idx1 idx2 idx3
1.1 1.2 1.3
1.4 1.5
2.1 2.2 2.3
2.4 2.5
3.1 3.2 3.3
3.4 3.5
How can I read the file with pandas in order to have a dataframe that looks like this?
head1 head2 head3 head4 head5
idx1 1.1 1.2 1.3 1.4 1.5
idx2 2.1 2.2 2.3 2.4 2.5
idx3 3.1 3.2 3.3 3.4 3.5
You can use the skiprows keyword of read_csv to create one data frame that contains all the 3-value lines (by skipping the 2-value ones), and then create another data frame that contains all the 2-value lines. Note that you can specify the header row with the header keyword.
So you can parse the csv file into two different data frames which you can concatenate later on.
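To make the behaviour of skiprows and header concrete, here is a minimal sketch on a made-up sample (the `sample` string and the names `wide` and `narrow` are illustrative assumptions, not part of the question's file). It assumes whitespace-separated values and a single header line:

```python
import io

import pandas as pd

# Hypothetical sample: one header line, then alternating 3-value and 2-value lines.
sample = "a b c\n1 2 3\n4 5\n6 7 8\n9 10\n"

# The callable receives the 0-based line number; returning True skips that line.
# Line 0 (the header) is kept and the even-numbered lines after it are skipped.
wide = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",
    header=0,  # the first line that survives skiprows supplies the column names
    skiprows=lambda i: i > 0 and i % 2 == 0,
)

# For the 2-value lines, skip the header as well and read without column names.
narrow = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",
    header=None,
    skiprows=lambda i: i == 0 or i % 2 == 1,
)

print(wide)
print(narrow)
```

Note that the header position is counted after skipping, so header=0 refers to the first line that was not skipped.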
As an example (assuming the 3-value lines fall on even line numbers and the 2-value lines on odd line numbers):
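import pandas as pd
# skiprows accepts a callable that gets the 0-based line number; True means "skip this line"
df3 = pd.read_csv(..., skiprows=lambda x: x % 2 == 1)  # keeps the even lines (3 values)
df2 = pd.read_csv(..., skiprows=lambda x: x % 2 == 0)  # keeps the odd lines (2 values)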
Then you can use concat with axis=1 to join the two data frames side by side into a single one (the default, axis=0, would stack them on top of each other instead):
df = pd.concat((df3, df2), axis=1)
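Putting the pieces together for the file in the question, the following is one possible end-to-end sketch rather than a definitive solution. It assumes the layout is exactly as shown (one comment line, two header lines, one index line, then alternating 3-value and 2-value lines) and that values are whitespace-separated. The `raw` string stands in for the file contents, and `N_PREAMBLE`, `columns` and `index` are names introduced here for illustration:

```python
import io

import pandas as pd

# Stand-in for the file contents from the question.
raw = """# 3 5 <-- this is a comment indicating how many rows and column are matrix data
head1 head2 head3
head4 head5
idx1 idx2 idx3
1.1 1.2 1.3
1.4 1.5
2.1 2.2 2.3
2.4 2.5
3.1 3.2 3.3
3.4 3.5
"""

# Assumption: the first four lines are comment, header part 1, header part 2, index.
N_PREAMBLE = 4
lines = raw.splitlines()
columns = lines[1].split() + lines[2].split()
index = lines[3].split()

# Data starts on line 4, so the 3-value lines are even and the 2-value lines are odd.
df3 = pd.read_csv(
    io.StringIO(raw), sep=r"\s+", header=None,
    skiprows=lambda i: i < N_PREAMBLE or i % 2 == 1,
)
df2 = pd.read_csv(
    io.StringIO(raw), sep=r"\s+", header=None,
    skiprows=lambda i: i < N_PREAMBLE or i % 2 == 0,
)

# Join side by side; ignore_index=True renumbers the columns 0..4 before renaming.
df = pd.concat((df3, df2), axis=1, ignore_index=True)
df.columns = columns
df.index = index

print(df)
```

The header and index names are parsed by hand here because they do not follow the alternating pattern of the data lines. With a real file, the same callables can be passed to read_csv with the file path in place of io.StringIO(raw).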