Pandas: parse text data and align the columns based on conditions
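I have the text data below, and I need to parse it and split it into columns according to the following conditions:
Anything that starts with = should go under ENC_NAME.
For any line containing BladeSystem, the number at the end of the line should go under the OA_VERSION column.
Any line containing 1 HP should go under the VC_ACTIVE column.
Any line containing 2 HP should go under the VC_STDN column.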
========= enc1001 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1002 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1003 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1004 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1005 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1006 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1007 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1008 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.40
2 HP VC Flex-10/10D Module 4.40
========= enc1009 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2001 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2002 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2003 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2004 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2005 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2006 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2007 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2008 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2009 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2011 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2013 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3020 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc3021 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc3022 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc3026 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.45
2 HP VC Flex-10/10D Module 4.45
========= enc3027 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3028 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3029 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3030 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3031 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4021 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4023 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4024 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4025 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4026 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4027 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4028 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4029 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4030 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4031 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4032 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4033 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4034 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc6002 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6011 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6012 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6013 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6014 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6015 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6016 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc6017 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
========= enc7002 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
========= enc7003 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
========= enc7004 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
========= enc7009 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1010 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1011 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1012 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1013 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1014 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1015 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1016 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1017 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1018 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc1025 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc1026 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2010 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2012 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2014 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2015 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2016 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2018 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2019 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2020 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2021 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2022 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc2023 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3033 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3034 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc3036 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc4020 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4022 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.41
2 HP VC Flex-10/10D Module 4.41
========= enc4035 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc7005 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc7006 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC FlexFabric 10Gb/24-Port Module 4.50
2 HP VC FlexFabric 10Gb/24-Port Module 4.50
========= enc7007 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc7008 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8001 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc8017 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc8018 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc8019 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc8021 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc8022 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8023 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8024 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8025 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8026 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8027 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8028 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.62
2 HP VC Flex-10/10D Module 4.62
========= enc8033 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.40
2 HP VC Flex-10/10D Module 4.40
Expected output:
ENC_NAME OA_VERSION VC_ACTIVE VC_STDN
enc4031 4.85 4.50 4.50
enc4032 4.85 4.50 4.50
enc4033 4.85 4.50 4.50
enc4034 4.85 4.50 4.50
enc6002 4.60 NaN NaN
enc6011 4.60 NaN NaN
enc6012 4.60 NaN NaN
enc6013 4.60 NaN NaN
What I tried:
import pandas as pd

df = pd.read_csv("enc_list_sorted", names=["col1"])
df = df.col1.str.split(' ', expand=True)
df = df.drop(df.columns[[0, 2, 3, 4, 5, 6, 7, 8, 11]], axis=1)
df = df.rename(columns={1: 'ENC_NAME', 9: 'VC_VERSION', 10: 'OA_VERSION'})
print(df)
   ENC_NAME VC_VERSION OA_VERSION
0   enc1001       None       None
1                  KVM       4.85
2                 4.50       None
3                 4.50       None
4   enc1002       None       None
5                  KVM       4.85
6                 4.50       None
7                 4.50       None
8   enc1003       None       None
9                  KVM       4.85
10                4.50       None
11                4.50       None
12  enc1004       None       None
13                 KVM       4.85
14                4.50       None
15                4.50       None
Any help or ideas would be greatly appreciated.
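In my opinion, you should write your own parser instead. What you have can be seen as a form of so-called DSL, a domain-specific language. The grammar used here is fairly forgiving: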
import re

import pandas as pd
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor


class ENCVisitor(NodeVisitor):
    # A file is whitespace and/or blocks; a block is a header line,
    # an OA line, and up to two optional VC lines.
    grammar = Grammar(r"""
        content     = (ws / block)*
        block       = header oa_line vc_active? vc_stdn?
        header      = delim ws word ws delim nl
        oa_line     = ~"^(?=.*BladeSystem).+"m nl?
        vc_active   = ~"^(?=.*1 HP).+"m nl?
        vc_stdn     = ~"^(?=.*2 HP).+"m nl?
        word        = ~"\w+"
        delim       = ~"=+"
        ws          = ~"\s+"
        nl          = ~"[\n\r]+"
    """)

    # The version is the number at the very end of a line, e.g. "4.85".
    version_pattern = re.compile(r"\d+\.\d+$")

    def get_version(self, key, line):
        match = self.version_pattern.search(line)
        value = match.group(0) if match else None
        return {key: value}

    def generic_visit(self, node, visited_children):
        # Fallback for rules without a dedicated visit_* method.
        return visited_children or node

    def visit_header(self, node, visited_children):
        # Children are: delim, ws, word, ws, delim, nl.
        header = visited_children[2]
        return {"ENC_NAME": header.text}

    def visit_oa_line(self, node, visited_children):
        line, _ = visited_children
        return self.get_version("OA_VERSION", line.text)

    def visit_vc_active(self, node, visited_children):
        line, _ = visited_children
        return self.get_version("VC_ACTIVE", line.text)

    def visit_vc_stdn(self, node, visited_children):
        line, _ = visited_children
        return self.get_version("VC_STDN", line.text)

    def visit_block(self, node, visited_children):
        # Merge the dicts of the header and of every line that was present.
        dct = {}
        for child in visited_children:
            if isinstance(child, dict):
                dct.update(child)
            elif isinstance(child, list):
                # An optional rule that matched arrives wrapped in a list.
                dct.update(child[0])
        return dct

    def visit_content(self, node, visited_children):
        # Keep the blocks and drop the pure-whitespace matches.
        return [child[0] for child in visited_children if isinstance(child[0], dict)]


# Assumption: the raw text is stored in the file named in the question.
with open("enc_list_sorted") as fh:
    data = fh.read()

enc = ENCVisitor()
result = enc.parse(data)
df = pd.DataFrame(result)
print(df)
For your data, this yields
ENC_NAME OA_VERSION VC_ACTIVE VC_STDN
0 enc1001 4.85 4.50 4.50
1 enc1002 4.85 4.50 4.50
2 enc1003 4.85 4.50 4.50
3 enc1004 4.85 4.50 4.50
4 enc1005 4.85 4.50 4.50
.. ... ... ... ...
94 enc8025 4.85 4.62 4.62
95 enc8026 4.85 4.62 4.62
96 enc8027 4.85 4.62 4.62
97 enc8028 4.85 4.62 4.62
98 enc8033 4.85 4.40 4.40
[99 rows x 4 columns]
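Explanation: your input can be seen as a mini-language of its own, a so-called domain-specific language. Each block of information in the file consists of a header line, an OA_VERSION line, and two lines that may or may not be present (VC_ACTIVE and VC_STDN). The header line always starts and ends with ===.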
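To make that block structure concrete, here is a minimal sketch that expresses the same shape (header, mandatory OA line, two optional VC lines) as one standard-library regular expression. It is only an illustration of the structure, not the grammar-based approach itself; the helper `version_line` and the names `BLOCK` and `sample` are made up for this example.

```python
import re


def version_line(marker, name):
    # One line that contains `marker` and ends in a version number like 4.85.
    return rf"[^\n]*{re.escape(marker)}[^\n]*?(?P<{name}>\d+\.\d+)[ \t]*(?:\n|\Z)"


BLOCK = re.compile(
    r"^=+[ \t]+(?P<ENC_NAME>\w+)[ \t]+=+[ \t]*\n"        # header line
    + version_line("BladeSystem", "OA_VERSION")          # mandatory OA line
    + "(?:" + version_line("1 HP", "VC_ACTIVE") + ")?"   # optional active VC
    + "(?:" + version_line("2 HP", "VC_STDN") + ")?",    # optional standby VC
    re.MULTILINE,
)

sample = (
    "========= enc4034 =========\n"
    "1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85\n"
    "1 HP VC Flex-10/10D Module 4.50\n"
    "2 HP VC Flex-10/10D Module 4.50\n"
    "========= enc6002 =========\n"
    "1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60\n"
)

# Each match is one block; optional lines that are absent come back as None.
rows = [m.groupdict() for m in BLOCK.finditer(sample)]
print(rows)
```

A single regex like this gets hard to read as the format grows, which is one reason to prefer a named grammar.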
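All these building blocks form a grammar: the file/string consists of whitespace or any number of blocks. Internally, an abstract syntax tree (AST) is built, and to retrieve the information we need to "visit" each node. In the parser library I chose (the excellent parsimonious), this is done with the NodeVisitor class, and each node of the AST is visited through a method named after it. That is, if we call a rule "header", the method must be named "visit_header".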
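The naming convention can be pictured with a toy dispatcher. This is a simplified sketch of name-based dispatch in general, not parsimonious's actual code; the class `ToyVisitor` and its simplified method signatures are hypothetical.

```python
class ToyVisitor:
    """Toy stand-in that dispatches on a rule name, like visit_<rule> above."""

    def visit(self, name, text):
        # Look up "visit_<name>"; fall back to generic_visit if it is missing.
        method = getattr(self, f"visit_{name}", self.generic_visit)
        return method(text)

    def generic_visit(self, text):
        # Rules without a dedicated method contribute nothing here.
        return None

    def visit_header(self, text):
        # Strip the "=" delimiters and surrounding whitespace from the header.
        return {"ENC_NAME": text.strip("= \n")}


visitor = ToyVisitor()
header_result = visitor.visit("header", "========= enc1001 =========\n")
ws_result = visitor.visit("ws", "   ")
print(header_result, ws_result)
```

Renaming a rule in the grammar without renaming its method would silently route that rule to the generic fallback, so the two have to be kept in sync.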
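The result for each block is assembled in "visit_block" as a dictionary of all the information retrieved for that block. Finally, everything is fed into pandas.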
Of course, this is only a brief introduction; if you want to learn more about parsimonious, have a look at its GitHub repository.
As suggested in the comments, opening and parsing the file with pandas is not ideal.
Assuming your data is saved in a text file file.txt:
import pandas as pd

with open("file.txt") as file:
    lines = [l.rstrip("\n") for l in file]

row_temp = [None] * 4
row = None
out = []

for line in lines:
    if line.startswith("="):
        # A new header: store the previous enclosure's row, then start a fresh one.
        if row is not None:
            out.append(row)
        row = row_temp.copy()
        row[0] = line.replace("=", "").strip()
    if 'BladeSystem' in line:
        row[1] = line.split()[-1]
    if '1 HP' in line:
        row[2] = line.split()[-1]
    if '2 HP' in line:
        row[3] = line.split()[-1]

# The last enclosure has no following header to trigger the append above.
if row is not None:
    out.append(row)

col_names = ["ENC_NAME", "OA_VERSION", "VC_ACTIVE", "VC_STDN"]
df = pd.DataFrame(out, columns=col_names)
This returns the output you are looking for.
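As a variation on the same line-by-line idea, the sketch below wraps the loop in a function and builds one dictionary per enclosure, so it can be tried on in-memory text. The names `RULES` and `parse_enclosures` are hypothetical, and anchoring the `1 HP` / `2 HP` checks at the start of the line is an assumption about the format, made so that a line such as `11 HP ...` would not be mistaken for `1 HP`.

```python
import re

# Hypothetical mapping from a line pattern to its target column.
RULES = [
    (re.compile(r"BladeSystem"), "OA_VERSION"),
    (re.compile(r"^1\s+HP\b"), "VC_ACTIVE"),
    (re.compile(r"^2\s+HP\b"), "VC_STDN"),
]


def parse_enclosures(lines):
    """Return one dict per enclosure; columns with no matching line are omitted."""
    rows = []
    for line in lines:
        line = line.strip()
        if line.startswith("="):
            # Header line: start a new row named after the enclosure.
            rows.append({"ENC_NAME": line.strip("= ")})
        elif rows and line:
            for pattern, column in RULES:
                if pattern.search(line):
                    # The version is the last whitespace-separated field.
                    rows[-1][column] = line.split()[-1]
                    break
    return rows


sample = """\
========= enc4034 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85
1 HP VC Flex-10/10D Module 4.50
2 HP VC Flex-10/10D Module 4.50
========= enc6002 =========
1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60
"""
rows = parse_enclosures(sample.splitlines())
print(rows)
```

Passing the result to pd.DataFrame(rows, columns=col_names) should fill the keys that are missing from a row with NaN, which matches the desired output for enclosures without VC modules.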
You can try this:
import re

import numpy as np
import pandas as pd

with open(r'test1.txt', 'r') as file:
    txto = file.read()

data = []
# Group 1 captures each "=====" header line (raw string avoids escape warnings).
pattern1 = re.compile(r'(^=.+)\s.+$\n?', re.MULTILINE)
headers = re.findall(pattern1, txto)
lstlines = txto.split('\n')

# Every block except the last: the lines between one header and the next.
for ele1, ele2 in zip(headers, headers[1:]):
    row = lstlines[lstlines.index(ele1):lstlines.index(ele2)]
    OA_VERSION = [i for i in row if 'BladeSystem' in i]
    OA_VERSION = OA_VERSION[0].split()[-1] if len(OA_VERSION) > 0 else np.nan
    VC_ACTIVE = [i for i in row if '1 HP' in i]
    VC_ACTIVE = VC_ACTIVE[0].split()[-1] if len(VC_ACTIVE) > 0 else np.nan
    VC_STDN = [i for i in row if '2 HP' in i]
    VC_STDN = VC_STDN[0].split()[-1] if len(VC_STDN) > 0 else np.nan
    data.append([ele1.replace('=', '').strip(), OA_VERSION, VC_ACTIVE, VC_STDN])

# Last block: from the final header to the end of the file.
row = lstlines[lstlines.index(headers[-1]):]
OA_VERSION = [i for i in row if 'BladeSystem' in i]
OA_VERSION = OA_VERSION[0].split()[-1] if len(OA_VERSION) > 0 else np.nan
VC_ACTIVE = [i for i in row if '1 HP' in i]
VC_ACTIVE = VC_ACTIVE[0].split()[-1] if len(VC_ACTIVE) > 0 else np.nan
VC_STDN = [i for i in row if '2 HP' in i]
VC_STDN = VC_STDN[0].split()[-1] if len(VC_STDN) > 0 else np.nan
data.append([headers[-1].replace('=', '').strip(), OA_VERSION, VC_ACTIVE, VC_STDN])

# Create the dataframe (no trailing space in the 'ENC_NAME' column name).
df = pd.DataFrame(data, columns=['ENC_NAME', 'OA_VERSION', 'VC_ACTIVE', 'VC_STDN'])
print(df)
Output:
df
ENC_NAME OA_VERSION VC_ACTIVE VC_STDN
0 enc1001 4.85 4.50 4.50
1 enc1002 4.85 4.50 4.50
2 enc1003 4.85 4.50 4.50
3 enc1004 4.85 4.50 4.50
4 enc1005 4.85 4.50 4.50
.. ... ... ... ...
94 enc8025 4.85 4.62 4.62
95 enc8026 4.85 4.62 4.62
96 enc8027 4.85 4.62 4.62
97 enc8028 4.85 4.62 4.62
98 enc8033 4.85 4.40 4.40
[99 rows x 4 columns]
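The header-to-header slicing above can also be sketched with re.split, which cuts the text at every header in one pass and so avoids the repeated list lookups and the separate handling of the last block. This is a swapped-in alternative rather than the answer's own method; `split_blocks` and `last_field` are hypothetical helper names, and missing lines are reported as None here instead of np.nan.

```python
import re


def split_blocks(text):
    # With one capture group, re.split returns
    # [preamble, name1, body1, name2, body2, ...].
    parts = re.split(r"^=+[ \t]*(\w+)[ \t]*=+[ \t]*$", text, flags=re.MULTILINE)
    return list(zip(parts[1::2], parts[2::2]))


def last_field(body, marker):
    # Last whitespace-separated field of the first line containing `marker`.
    for line in body.splitlines():
        if marker in line:
            return line.split()[-1]
    return None


sample = (
    "========= enc4034 =========\n"
    "1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.85\n"
    "1 HP VC Flex-10/10D Module 4.50\n"
    "2 HP VC Flex-10/10D Module 4.50\n"
    "========= enc6002 =========\n"
    "1 BladeSystem c7000 DDR2 Onboard Administrator with KVM 4.60\n"
)

rows = [
    [name, last_field(body, "BladeSystem"), last_field(body, "1 HP"), last_field(body, "2 HP")]
    for name, body in split_blocks(sample)
]
print(rows)
```

The resulting list of lists can be passed to pd.DataFrame with the same four column names as above.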