
How to extract specific data from an unorganized Excel file without columns

I've reached my limit and my hair is getting thinner. I really need your help.

1. Try

I'd like to extract the rows containing the specific words "Super Banana" from all the *.xlsx files in one folder.

Here is a picture of the file. [1]: https://i.stack.imgur.com/Cb3yD.png

But

2. Problem

  • These unorganized Excel files have no column headers. There are many files, and I can't add headers manually to all of them.

    I was looking for a way to extract:

  1. the row containing the keywords "Super Banana"

    or

  2. the rows before and after the row containing the keywords

https://i.stack.imgur.com/Cb3yD.png

  • The title in cell A1, "Monday Shopping List 2020", changes from file to file (Monday Shopping List 2020, Monday Shopping List 2021, ...)
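The two goals above (the matching row, or that row plus its neighbours) can be sketched without relying on column names at all. This is only a rough illustration, not a definitive solution: the helper name `rows_with_context`, its `context` parameter, and the small `sheet` frame are made up here, and `sheet` merely stands in for a file read with `pd.read_excel(path, header=None)`.

```python
import pandas as pd


def rows_with_context(df, keyword, context=1):
    """Return rows where any cell contains `keyword`, plus `context` rows on each side."""
    # Cast every cell to text so numbers and empty cells do not break the search.
    text = df.astype(str)
    # True for each row in which at least one cell contains the keyword.
    hit = text.apply(
        lambda col: col.str.contains(keyword, regex=False, na=False)
    ).any(axis=1)
    # Positions of the matching rows, widened by `context` rows on each side.
    positions = set()
    for pos in hit.to_numpy().nonzero()[0]:
        start = max(int(pos) - context, 0)
        stop = min(int(pos) + context, len(df) - 1)
        positions.update(range(start, stop + 1))
    return df.iloc[sorted(positions)]


# Hypothetical stand-in for one headerless sheet.
sheet = pd.DataFrame([
    ["Monday Shopping List 2020", None],
    ["Apple", 3],
    ["Super Banana", 5],
    ["Cherry", 7],
    ["Melon", 1],
])

print(rows_with_context(sheet, "Super Banana", context=0))  # only the matching row
print(rows_with_context(sheet, "Super Banana", context=1))  # plus the rows before and after
```

Because the search runs over every cell, it does not matter that the A1 title differs between files.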

3. My Code

import glob

import pandas as pd

# Collect every .xlsx file in the current folder
files = glob.glob('*.xlsx')
print(files)

for file in files:
    df = pd.read_excel(file).fillna(value=0)
    for row in df.values:
        # There is no column named '' to filter on, so this line fails with a KeyError
        data = df[df[''].str.contains('Super Banana', na=False)]
        data.to_excel('excel-data_find.xlsx', encoding='utf-8')
        print(data)
        print('Data was extracted')

Use the proper file path. Example: df = pd.read_excel('C:\\Users\\file.xlsx').fillna(value = 0)
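As a hedged sketch of how the path advice could be combined with the headerless layout: `pathlib` yields a full path for every file in a folder, and `header=None` tells pandas not to treat the first row as column names. The function name `extract_from_folder` and the `source_file` column are made up for illustration. The output name is the one from the question. Reading .xlsx files also assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd

OUTPUT_NAME = "excel-data_find.xlsx"  # output file name taken from the question


def extract_from_folder(folder, keyword):
    """Collect rows containing `keyword` from every .xlsx file in `folder`."""
    found = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        if path.name == OUTPUT_NAME:
            continue  # do not re-read a result file from an earlier run
        # header=None: the first row is data, and columns are labelled 0, 1, 2, ...
        df = pd.read_excel(path, header=None)
        mask = df.astype(str).apply(
            lambda col: col.str.contains(keyword, regex=False, na=False)
        ).any(axis=1)
        rows = df[mask].copy()
        rows.insert(0, "source_file", path.name)  # remember where each row came from
        found.append(rows)
    # One combined frame; empty if there were no files to read.
    return pd.concat(found, ignore_index=True) if found else pd.DataFrame()


if __name__ == "__main__":
    result = extract_from_folder(".", "Super Banana")
    if not result.empty:
        # Write once, after the loop, so later files do not overwrite earlier results.
        result.to_excel(OUTPUT_NAME, index=False)
    print(result)
```

Writing the combined result once after the loop avoids each file overwriting the output of the previous one, which the loop in the question would do.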
