
Removing words from the list of sentences

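I have a list of channel names, and I want to remove certain words from them. I tried the approach from this discussion (removing words from a Python list), but it did not work for me. I have these: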

['Housekeeping.XTX_heater-0_Switch_Status'
 'Housekeeping.PDM_1__SW11_Status'
 'Housekeeping.Slim6_Imager-1_Switch_Status'
 'Power.BCM1_Battery_Cell_Temperature_degC'
 'Power.BCM2_Battery_Cell_Temperature_degC'
 'Power.BCR1__Battery_Discharge_Current_A'
 'Power.BCR0__Array_Temperature_degC'
 'Power.BCM0_Battery_Interface_Plate_Temp_degC'
 'Power.PDM_2__PDM_Current_A' 'Power.PDM_1__PDM_Temperature_degC'
 'Power.PDM_1__PDM_Current_A' 'Power.PDM_0__PDM_Temperature_degC'
 'Power.PDM_0__PDM_Current_A' 'Power.BCR2__BCR_Temperature_degC'
 'Power.BCR2__Battery_Discharge_Current_A'
 'Power.BCR2__Battery_Charge_Current_mA' 'Power.BCR2__Array_Voltage_V'
 'Power.BCR2__Array_Temperature_degC' 'Power.BCR2__Array_Current_mA'
 'Power.BCR1__BCR_Temperature_degC'
 'Power.BCR1__Battery_Charge_Current_mA' 'Power.BCR1__Array_Voltage_V'
 'Power.BCR1__Array_Temperature_degC' 'Power.BCR1__Array_Current_mA'
 'Power.BCR0__Overvoltage_Clamp_Current_A'
 'Power.BCR0__BCR_Temperature_degC' 'Power.BCR0__Battery_Voltage_V'
 'Power.BCR0__Battery_Charge_Current_mA' 'Power.BCR0__Array_Voltage_V'
 'Power.BCR0__Array_Current_mA' 'Thermal.WHL1_Measured_Current_mA'
 'Thermal.WHL0_Measured_Current_mA' 'Thermal.WHL1_IF_Temp_degC'
 'Thermal.WHL2_IF_Temp_degC'
 'Thermal.Prop_controller_-Y_panel__temperature_degC'
 'Thermal.WHL3_IF_Temp_degC' 'Thermal.WHL0_IF_Temp_degC'
 'Thermal.WHL3_Measured_Current_mA' 'Thermal.WHL2_Measured_Current_mA'
 'Thermal.SS1_Temperature_degC'
 'Thermal.Imager_flat_plate_EFF__temperature_degC'
 'Thermal.OBC_Temp_PPC750FL_degC' 'Thermal.OBC_Temp_PCB_degC'
 'Thermal.MTM-0_Temperature_degC' 'Thermal.AIM_Module_Temperature_degC'
 'Thermal.Sep_system_panel_-Z_+X__temperature_degC'
 'Thermal.OBDH_cardframe_-X_panel__temperature_degC'
 'Thermal.SS0_Temperature_degC' 'LIN.LIN_Failed_Nodes_Count'
 'LIN.LIN_BCM_Fail' 'LIN.LIN_Bus_Fail' 'LIN.LIN_Passive'
 'LIN.LIN_Master_1_State_Of_Health' 'LIN.LIN_Master_Up_Time'
 'LIN.LR_PA_Temperature_degC' 'LIN.My_IP_Packets' 'LIN.Switch_Error'
 'LIN.PA_Current_mA' 'LIN.S-Band_Power_Amplifier_ONOFF_State'
 'LIN.STRx0_Uplink_Reset_Count' 'LIN.STRx1_Uplink_Reset_Count'
 'LIN.Switch_Transaction_Fail_Count' 'LIN.Switch_Transaction_OK_Count'
 'LIN.TTC_0_Current_mA' 'LIN.TTC_1_Current_mA' 'LIN.TTC_Reset_Cause'
 'LIN.RSSI_dBm' 'LIN.TTC0_Temperature_degC' 'LIN.LIN_SPARE_STATUS'
 'LIN.LIN_Master_Reset' 'LIN.COUNT_FPGA_RX_STRx0' 'LIN.Lifetime_Cold_Boot'
 'LIN.Lifetime_Warm_Boot' 'LIN.LIN_Comms_Error_Count'
 'LIN.LIN_Node_Resets_Count' 'LIN.LIN_Bus_Reset'
 'LIN.LIN_Failed_Switches_Count' 'LIN.LIN_Master_0_State_Of_Health'
 'LIN.TTC1_Temperature_degC' 'LIN.UDP_Error_STRx0'
 'LIN.UDP_IPS_size_errors_STRx0' 'LIN.UDP_IPS_STRx0' 'LIN.UDP_Total_STRx0'
 'LIN.UDP_Valid_STRx0' 'LIN.UPD_IPS_errors_STRx0' 'LIN.Warm_Resets'
 'LIN.Cold_Resets' 'LIN.CAN_Reset_Count']

and I want to remove these parts of each name:

['Housekeeping.(including period)', 'Power.', 'Thermal.', 'LIN.']

The expected output is:

'XTX_heater-0_Switch_Status'
 'PDM_1__SW11_Status'
 'Slim6_Imager-1_Switch_Status'
 'BCM1_Battery_Cell_Temperature_degC'
 'BCM2_Battery_Cell_Temperature_degC'
 'BCR1__Battery_Discharge_Current_A'

and so on.

Let's say you have this:

import re

abc = ['Housekeeping.XTX_heater-0_Switch_Status',
       'Housekeeping.PDM_1__SW11_Status',
       'Housekeeping.Slim6_Imager-1_Switch_Status',
       'Power.BCM1_Battery_Cell_Temperature_degC']
stop = ['Housekeeping.', 'Power.', 'Thermal.', 'LIN.']
# re.escape makes each "." match a literal period instead of any character.
pattern = '|'.join(re.escape(s) for s in stop)
print([re.sub(pattern, '', x) for x in abc])

This comes from the link you provided; I tested it and it works. Give it a try.
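One thing to watch when a pattern is built by joining raw strings: an unescaped `.` matches any character, so names from the question such as 'LIN.LIN_Bus_Fail' lose more than the prefix. The following is a minimal sketch of that effect and of one possible fix; the variable names (`naive`, `strict`, `damaged`, `cleaned`) are illustrative and not from the original answer, and anchoring with `^` is an added assumption that only a leading prefix should be removed.

```python
import re

names = [
    'Housekeeping.PDM_1__SW11_Status',
    'LIN.LIN_Bus_Fail',
    'LIN.S-Band_Power_Amplifier_ONOFF_State',
]
prefixes = ['Housekeeping.', 'Power.', 'Thermal.', 'LIN.']

# Unescaped: "." matches any character, so "LIN_" and "Power_" are removed too.
naive = re.compile('|'.join(prefixes))
damaged = [naive.sub('', n) for n in names]
print(damaged)

# Escaped and anchored: only a literal prefix at the start of the name is removed.
strict = re.compile('^(?:' + '|'.join(re.escape(p) for p in prefixes) + ')')
cleaned = [strict.sub('', n) for n in names]
print(cleaned)
```

With the unescaped pattern the second name shrinks to 'Bus_Fail', while the escaped and anchored pattern leaves 'LIN_Bus_Fail'.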

It can also be solved without regular expressions:

new_list = [w.partition('.')[2] for w in old_list]
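Note that `partition('.')` drops everything up to the first period, whatever the prefix is. If only the four listed prefixes should be removed, a sketch along these lines might be closer to the intent; `strip_prefix` is a hypothetical helper and 'Unknown.Some_Channel' is a made-up name used only to show the difference.

```python
def strip_prefix(name, prefixes):
    """Return name without the first matching prefix; unchanged if none matches."""
    for prefix in prefixes:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name

prefixes = ('Housekeeping.', 'Power.', 'Thermal.', 'LIN.')
old_list = [
    'Housekeeping.XTX_heater-0_Switch_Status',
    'Power.BCM1_Battery_Cell_Temperature_degC',
    'LIN.LIN_Bus_Fail',
    'Unknown.Some_Channel',  # hypothetical name with an unlisted prefix
]
new_list = [strip_prefix(w, prefixes) for w in old_list]
print(new_list)
```

On Python 3.9 or later, `str.removeprefix` could replace the slicing inside the loop.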

Something like the following might work:

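import copy

def remove_bad_words(in_stryngs, bad_words):
    # Take the next bad word; once none are left, return the strings unchanged.
    bad_words = iter(bad_words)
    try:
        bad_word = next(bad_words)
    except StopIteration:
        return in_stryngs
    in_stryngs = iter(in_stryngs)
    out_strings = list()
    for stryng in in_stryngs:
        # Splitting on the bad word drops every occurrence of it.
        split_string = stryng.split(bad_word)
        # Recurse on the pieces with a copy of the iterator, which holds the remaining bad words.
        blah = remove_bad_words(split_string, copy.copy(bad_words))
        out_strings.append("".join(blah))
    return out_strings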

Here it is in use:

bad_words = ["hello", "world"]

channel_names = [
    "Nationahellol Broadcahellosting Company (NBC)",
    "worldCworldBworldS (formerly world known asworld the Columbia world Broadcasting System)",
    "the Americaworldworldn Broadcashelloting Company (ABC)",
    "the Fox Broadchelloasting Coworldworldmpany (Fox)",
    "the ChelloW Televiworldsion Network.",
    "public broadcworldasting serhellovice (PBS)"
]

clean_chanel_names = remove_bad_words(channel_names, bad_words)

print("\n".join(clean_chanel_names))

The output is:

National Broadcasting Company (NBC)
CBS (formerly  known as the Columbia  Broadcasting System)
the American Broadcasting Company (ABC)
the Fox Broadcasting Company (Fox)
the CW Television Network.
public broadcasting service (PBS)
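For comparison, a shorter way to get the same result on this sample is to call `str.replace` once per word. This is a sketch of a different technique, not the recursive function above, and `remove_words` is a hypothetical name. The two can differ in edge cases: for "wohellorld", removing "hello" creates a new "world", which the loop below then removes as well, while the recursive version leaves it in place.

```python
def remove_words(strings, words):
    """Remove every occurrence of each word from each string, one word at a time."""
    cleaned = []
    for s in strings:
        for word in words:
            s = s.replace(word, "")
        cleaned.append(s)
    return cleaned

sample = [
    "Nationahellol Broadcahellosting Company (NBC)",
    "the ChelloW Televiworldsion Network.",
]
result = remove_words(sample, ["hello", "world"])
print("\n".join(result))
```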
