Pandas column header

I have multiple Excel files that contain similar information, and I want to load them into Python as a list with pandas. Every file contains a table with the data I need, but the table starts at a different row in each file: row 5 in some, row 10 or 23 in others, and so on, because the titles above the table change from one file to the next. So the starting row cannot be a constant, BUT the headers of the table are always the same. Can I tell pandas "take all data under the row with 'specific name'"? Or do I have to give the right index every time?

Thanks! Have a good day at work!

Edit: To make it clearer, this is how pandas reads my dataframe (based on the Excel file):

[image: dataframe example]

As you can see, row 2 (in this example) is the row with "Draw, Back#, Horse, Rider, ..." and this is my "key row". I want my df to start under this row so I can use all the data below it to make my folders. As I said, this row is the same in all the Excel files, but it sits at a different row in each one.

Since the Excel files all have the same column headers, you could create a filter to "take all data under the row with 'specific name'".

This can be done using df['col_name'] == 'specific name', which returns a boolean Series of True / False values.

mask = df['col_name'] == 'specific name'

After that, apply the mask (the Series of True / False values) to the dataframe with df.loc, which keeps only the rows where the value is True.

df.loc[mask]

For example:

import pandas as pd

df = pd.DataFrame({'col_name': ['nothing', 'specific name', 'specific name', 'blah', 'blah', 'blah'],
                   'col_2': ['blah blah', 'blah', 'blah blah', 'blah', 'blah', 'blah']})
print(df)

        col_name      col_2
0        nothing  blah blah
1  specific name       blah
2  specific name  blah blah
3           blah       blah
4           blah       blah
5           blah       blah

# Boolean mask: True where col_name equals 'specific name'
mask = df['col_name'] == 'specific name'
print(df.loc[mask])

        col_name      col_2
1  specific name       blah
2  specific name  blah blah
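
The comparison above keeps only the matching rows. To get what the question asks for, the rows below the first match, the same comparison can be used to find where that match sits. This is only a minimal sketch on the toy frame from the example; the names `mask`, `first_match` and `below` are chosen here for illustration.

```python
import pandas as pd

# Same toy frame as in the example above.
df = pd.DataFrame({
    'col_name': ['nothing', 'specific name', 'specific name', 'blah', 'blah', 'blah'],
    'col_2': ['blah blah', 'blah', 'blah blah', 'blah', 'blah', 'blah'],
})

# True where col_name equals the key value.
mask = df['col_name'] == 'specific name'

if mask.any():
    # argmax on a boolean array gives the position of the first True.
    first_match = int(mask.to_numpy().argmax())
    # Everything strictly below the first matching row.
    below = df.iloc[first_match + 1:]
else:
    # No match: empty frame with the same columns.
    below = df.iloc[0:0]

print(below)
```

The `mask.any()` check matters because `argmax` returns 0 when nothing matches, which would otherwise look like a match in the first row.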

If you know the index, you can use the code below. Suppose the Excel sheet has its header in row 4 and the rest of the data from row 5 onward (this assumes excel_df was read with header=None, so that positions line up with sheet rows):

# Row 4 of the sheet is at position 3; take it as a flat list of column names
col_names = excel_df.iloc[3].tolist()

# Keep everything from row 5 (position 4) onward
df = excel_df.iloc[4:, :].copy()

# Use the header row as the column names
df.columns = col_names

df
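
The fixed positions 3 and 4 above can also be looked up at run time, since the header labels are the same in every file. The sketch below is one possible way to do that, not the answerer's code: `promote_header` is a hypothetical helper, and the small `raw` frame with made-up values stands in for what `pd.read_excel(path, header=None)` might return. Only the labels Draw, Back# and Horse come from the question.

```python
import pandas as pd


def promote_header(raw: pd.DataFrame, key: str) -> pd.DataFrame:
    """Use the first row containing `key` as the header; return the rows below it."""
    # True for each row in which at least one cell equals `key`.
    hits = raw.eq(key).any(axis=1).to_numpy()
    if not hits.any():
        raise ValueError(f"no row contains {key!r}")
    pos = int(hits.argmax())  # position of the first matching row
    table = raw.iloc[pos + 1:].reset_index(drop=True)
    table.columns = raw.iloc[pos].tolist()
    return table


# Stand-in for pd.read_excel(path, header=None); the values are invented.
raw = pd.DataFrame([
    ['Show title', None, None],
    [None, None, None],
    ['Draw', 'Back#', 'Horse'],
    [1, 101, 'Comet'],
    [2, 102, 'Blaze'],
])

table = promote_header(raw, 'Draw')
print(table)
```

With several files, the same helper could be applied to each frame in a loop, so the header position never has to be typed in by hand.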
