
Eliminate or Simplify repetitive Python code

I am working on the following code, which uses the ElementTree and Pandas modules in Python:

import xml.etree.ElementTree as ET
import pandas as pd

file_xml = ET.parse('example1.xml')
rootXML = file_xml.getroot()

def transfor_data_atri(xml_path):
    # ET.parse expects a file path or file object (e.g. 'example1.xml'), not a root element
    file_xml = ET.parse(xml_path)
    data_XML = [
        {"Name": signal.attrib["Name"],
         # "Value": signal.attrib["Value"]
         "Value": int(signal.attrib["Value"].split(' ')[0])
         } for signal in file_xml.findall(".//Signal")
    ]
    
    signals_df = pd.DataFrame(data_XML)
    extract_name_value(signals_df)
    
def extract_name_value(signals_df):
    #print(signals_df)
    
    signal_ig_st = signals_df[signals_df.Name.isin(["Status"])]
    row_values_ig_st = signal_ig_st.T
    vector_ig_st = row_values_ig_st.iloc[[1]]
    
    signal_nav_DSP_rq = signals_df[signals_df.Name.isin(["SetDSP"])]
    row_values_nav_DSP_rq = signal_nav_DSP_rq.T
    vector_nav_DSP_rq = row_values_nav_DSP_rq.iloc[[1]]
    
    signal_HMI_st = signals_df[signals_df.Name.isin(["HMI"])]
    row_values_HMI_st = signal_HMI_st.T
    vector_HMI_st = row_values_HMI_st.iloc[[1]]
    
    signal_delay_ac = signals_df[signals_df.Name.isin(["Delay"])]
    row_values_delay_ac = signal_delay_ac.T
    vector_delay_ac = row_values_delay_ac.iloc[[1]]
    
    signal_DSP_st = signals_df[signals_df.Name.isin(["DSP"])]
    row_values_DSP = signal_DSP_st.T
    vector_DSP = row_values_DSP.iloc[[1]]
    
    print('1: ', vector_ig_st)
    print('2: ', vector_nav_DSP_rq)
    print('3: ', vector_HMI_st)
    print('4: ', vector_delay_ac)

The output of the first four print calls is fine, because it is what the client wants. However, I have to simplify or eliminate the repetitive code so that it can load and read any other XML file of the same type (link to long xml), not just example1.xml.

I tried the following code, but in every case the names are printed one at a time:

    signals_name = signals_df["Name"]
    # print(signals_name)
    row_signals_name = signals_name.T
    # print(row_signals_name)
    for i in row_signals_name:
        print(i)

As far as I can tell, Pandas stores a DataFrame as a multi-dimensional array (two-dimensional in my case), but I can't find how to walk through that array and collect the names in a variable like this:

    names_list = ['Status', 'SetDSP', 'HMI', 'Delay', 'DSP']

Instead of hardcoding the names, I want to obtain them programmatically in a variable within the Python code (or otherwise remove the repetition in this function). That way, when the client uses another XML file with the same structure but with names that are not hardcoded, the extract_name_value(signals_df) function can still read them without problems. Thank you very much in advance.

You can extract the repetitive part into its own function that takes the parts that change as arguments, for example:

def get_vector(data,name):
    step1  = data[data.Name.isin([name])]
    step2  = step1.T
    result = step2.iloc[[1]]
    return result
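
As a quick sanity check, here is a minimal sketch of what get_vector returns. The three-row DataFrame is made-up sample data (an assumption for illustration), not taken from the XML file in the question:

```python
import pandas as pd

def get_vector(data, name):
    step1 = data[data.Name.isin([name])]  # keep only the rows whose Name matches
    step2 = step1.T                       # transpose: rows become 'Name' and 'Value'
    result = step2.iloc[[1]]              # keep the second row, i.e. 'Value'
    return result

# Made-up sample data with the same two columns as signals_df
signals_df = pd.DataFrame({"Name": ["Status", "SetDSP", "HMI"],
                           "Value": [1, 0, 3]})

vector = get_vector(signals_df, "HMI")
print(vector)
```

The result is a one-row DataFrame whose row label is Value and whose single column keeps the original row index of the matching signal.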

This reduces your extract_name_value to several calls to that function:

def extract_name_value(signals_df):
    print('1: ', get_vector(signals_df, "Status"))
    print('2: ', get_vector(signals_df, "SetDSP"))
    ...

You can then generalize that further by also taking, as an argument, a list of the fields you're interested in, for example:

def extract_name_value(signals_df, fields):
    for n, name in enumerate(fields, 1):
        print(f'{n}: ', get_vector(signals_df, name))

Then call that function as extract_name_value(signals_df, ['Status', 'SetDSP', 'HMI', 'Delay', ...]).
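
If the names should not be hardcoded at all, one possible approach is to read them from the Name column of the DataFrame itself. The following is only a sketch under some assumptions: SAMPLE_XML and its attribute values are invented for illustration, load_signals is a hypothetical helper that mirrors the parsing code in the question, and making fields optional is my own variation on the function above.

```python
import io
import xml.etree.ElementTree as ET
import pandas as pd

# Invented sample input; the real files are assumed to use the same
# <Signal Name="..." Value="..."> layout as in the question.
SAMPLE_XML = b"""<Root>
  <Signal Name="Status" Value="1 on"/>
  <Signal Name="SetDSP" Value="0 off"/>
  <Signal Name="HMI" Value="3 ready"/>
</Root>"""

def load_signals(source):
    # Hypothetical helper: source is a file path or a file-like object
    tree = ET.parse(source)
    rows = [
        {"Name": signal.attrib["Name"],
         "Value": int(signal.attrib["Value"].split(" ")[0])}
        for signal in tree.findall(".//Signal")
    ]
    return pd.DataFrame(rows)

def get_vector(data, name):
    return data[data.Name.isin([name])].T.iloc[[1]]

def extract_name_value(signals_df, fields=None):
    # When no list is given, take the names from the data,
    # in order of first appearance.
    if fields is None:
        fields = signals_df["Name"].unique().tolist()
    for n, name in enumerate(fields, 1):
        print(f"{n}: ", get_vector(signals_df, name))
    return fields

signals_df = load_signals(io.BytesIO(SAMPLE_XML))
names_list = extract_name_value(signals_df)
```

Here names_list ends up holding whatever names the file contains, so a different XML file with the same structure should work without touching the code.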
