Removing words from the list of sentences
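I have a list of channel names, and I want to remove certain words from those names. I tried the approach from this discussion (removing words from a Python list), but it did not work for me. I have these: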
['Housekeeping.XTX_heater-0_Switch_Status'
'Housekeeping.PDM_1__SW11_Status'
'Housekeeping.Slim6_Imager-1_Switch_Status'
'Power.BCM1_Battery_Cell_Temperature_degC'
'Power.BCM2_Battery_Cell_Temperature_degC'
'Power.BCR1__Battery_Discharge_Current_A'
'Power.BCR0__Array_Temperature_degC'
'Power.BCM0_Battery_Interface_Plate_Temp_degC'
'Power.PDM_2__PDM_Current_A' 'Power.PDM_1__PDM_Temperature_degC'
'Power.PDM_1__PDM_Current_A' 'Power.PDM_0__PDM_Temperature_degC'
'Power.PDM_0__PDM_Current_A' 'Power.BCR2__BCR_Temperature_degC'
'Power.BCR2__Battery_Discharge_Current_A'
'Power.BCR2__Battery_Charge_Current_mA' 'Power.BCR2__Array_Voltage_V'
'Power.BCR2__Array_Temperature_degC' 'Power.BCR2__Array_Current_mA'
'Power.BCR1__BCR_Temperature_degC'
'Power.BCR1__Battery_Charge_Current_mA' 'Power.BCR1__Array_Voltage_V'
'Power.BCR1__Array_Temperature_degC' 'Power.BCR1__Array_Current_mA'
'Power.BCR0__Overvoltage_Clamp_Current_A'
'Power.BCR0__BCR_Temperature_degC' 'Power.BCR0__Battery_Voltage_V'
'Power.BCR0__Battery_Charge_Current_mA' 'Power.BCR0__Array_Voltage_V'
'Power.BCR0__Array_Current_mA' 'Thermal.WHL1_Measured_Current_mA'
'Thermal.WHL0_Measured_Current_mA' 'Thermal.WHL1_IF_Temp_degC'
'Thermal.WHL2_IF_Temp_degC'
'Thermal.Prop_controller_-Y_panel__temperature_degC'
'Thermal.WHL3_IF_Temp_degC' 'Thermal.WHL0_IF_Temp_degC'
'Thermal.WHL3_Measured_Current_mA' 'Thermal.WHL2_Measured_Current_mA'
'Thermal.SS1_Temperature_degC'
'Thermal.Imager_flat_plate_EFF__temperature_degC'
'Thermal.OBC_Temp_PPC750FL_degC' 'Thermal.OBC_Temp_PCB_degC'
'Thermal.MTM-0_Temperature_degC' 'Thermal.AIM_Module_Temperature_degC'
'Thermal.Sep_system_panel_-Z_+X__temperature_degC'
'Thermal.OBDH_cardframe_-X_panel__temperature_degC'
'Thermal.SS0_Temperature_degC' 'LIN.LIN_Failed_Nodes_Count'
'LIN.LIN_BCM_Fail' 'LIN.LIN_Bus_Fail' 'LIN.LIN_Passive'
'LIN.LIN_Master_1_State_Of_Health' 'LIN.LIN_Master_Up_Time'
'LIN.LR_PA_Temperature_degC' 'LIN.My_IP_Packets' 'LIN.Switch_Error'
'LIN.PA_Current_mA' 'LIN.S-Band_Power_Amplifier_ONOFF_State'
'LIN.STRx0_Uplink_Reset_Count' 'LIN.STRx1_Uplink_Reset_Count'
'LIN.Switch_Transaction_Fail_Count' 'LIN.Switch_Transaction_OK_Count'
'LIN.TTC_0_Current_mA' 'LIN.TTC_1_Current_mA' 'LIN.TTC_Reset_Cause'
'LIN.RSSI_dBm' 'LIN.TTC0_Temperature_degC' 'LIN.LIN_SPARE_STATUS'
'LIN.LIN_Master_Reset' 'LIN.COUNT_FPGA_RX_STRx0' 'LIN.Lifetime_Cold_Boot'
'LIN.Lifetime_Warm_Boot' 'LIN.LIN_Comms_Error_Count'
'LIN.LIN_Node_Resets_Count' 'LIN.LIN_Bus_Reset'
'LIN.LIN_Failed_Switches_Count' 'LIN.LIN_Master_0_State_Of_Health'
'LIN.TTC1_Temperature_degC' 'LIN.UDP_Error_STRx0'
'LIN.UDP_IPS_size_errors_STRx0' 'LIN.UDP_IPS_STRx0' 'LIN.UDP_Total_STRx0'
'LIN.UDP_Valid_STRx0' 'LIN.UPD_IPS_errors_STRx0' 'LIN.Warm_Resets'
'LIN.Cold_Resets' 'LIN.CAN_Reset_Count']
and I want to remove these parts of the names:
['Housekeeping.(including period)', 'Power.', 'Thermal.', 'LIN.']
The expected output is:
'XTX_heater-0_Switch_Status'
'PDM_1__SW11_Status'
'Slim6_Imager-1_Switch_Status'
'BCM1_Battery_Cell_Temperature_degC'
'BCM2_Battery_Cell_Temperature_degC'
'BCR1__Battery_Discharge_Current_A'
and so on.
Let's say you have this:
import re

abc = ['Housekeeping.XTX_heater-0_Switch_Status',
       'Housekeeping.PDM_1__SW11_Status',
       'Housekeeping.Slim6_Imager-1_Switch_Status',
       'Power.BCM1_Battery_Cell_Temperature_degC']
stop = ['Housekeeping.', 'Power.', 'Thermal.', 'LIN.']
# Escape each prefix so '.' is matched literally, and anchor at the start of the name.
pattern = re.compile('^(?:' + '|'.join(map(re.escape, stop)) + ')')
print([pattern.sub('', x) for x in abc])
This is based on the link you provided. I tested it and it works, so give it a try.
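When a pattern like this is built from plain strings, escaping and anchoring make a real difference for this data. An unescaped `.` matches any character, so `Power.` also matches the `Power_` inside `LIN.S-Band_Power_Amplifier_ONOFF_State`. A minimal sketch comparing the two behaviours (the variable names `naive`, `strict` and `strict_pattern` are illustrative and not from the answer):

```python
import re

stop = ['Housekeeping.', 'Power.', 'Thermal.', 'LIN.']
name = 'LIN.S-Band_Power_Amplifier_ONOFF_State'

# Naive: '.' acts as a wildcard and a match may occur anywhere in the string.
naive = re.sub('|'.join(stop), '', name)

# Stricter: escape each prefix and only match at the start of the string.
strict_pattern = re.compile('^(?:' + '|'.join(map(re.escape, stop)) + ')')
strict = strict_pattern.sub('', name)

print(naive)   # S-Band_Amplifier_ONOFF_State
print(strict)  # S-Band_Power_Amplifier_ONOFF_State
```

The naive version silently drops `Power_` from the middle of the name, while the escaped and anchored version removes only the leading `LIN.`.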
It can also be solved without regular expressions:
new_list = [w.partition('.')[2] for w in old_list]
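Note that `partition('.')` splits at the first period, so it drops whatever comes before that period regardless of which prefix it is, and it yields an empty string for a name that contains no period at all. If only the four known prefixes should be removed, something along these lines might be safer (`strip_known_prefix` and `PREFIXES` are hypothetical names used only for this sketch):

```python
PREFIXES = ('Housekeeping.', 'Power.', 'Thermal.', 'LIN.')

def strip_known_prefix(name, prefixes=PREFIXES):
    """Return name without its leading prefix, or unchanged if none matches."""
    for prefix in prefixes:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name

# 'NoPrefixHere' is a made-up name to show the no-period case.
old_list = ['Power.PDM_2__PDM_Current_A', 'LIN.LIN_Passive', 'NoPrefixHere']
new_list = [strip_known_prefix(w) for w in old_list]
partitioned = [w.partition('.')[2] for w in old_list]

print(new_list)     # ['PDM_2__PDM_Current_A', 'LIN_Passive', 'NoPrefixHere']
print(partitioned)  # ['PDM_2__PDM_Current_A', 'LIN_Passive', '']
```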
Something like the following might work:
import copy

def remove_bad_words(in_stryngs, bad_words):
    # Take the next word to remove; with none left, return the pieces unchanged.
    bad_words = iter(bad_words)
    try:
        bad_word = next(bad_words)
    except StopIteration:
        return in_stryngs
    out_strings = []
    for stryng in in_stryngs:
        # Splitting on the word drops every occurrence of it.
        split_string = stryng.split(bad_word)
        # Strip the remaining words from each piece, then join the pieces again.
        blah = remove_bad_words(split_string, copy.copy(bad_words))
        out_strings.append("".join(blah))
    return out_strings
Here it is in use:
bad_words = ["hello", "world"]
channel_names = [
    "Nationahellol Broadcahellosting Company (NBC)",
    "worldCworldBworldS (formerly world known asworld the Columbia world Broadcasting System)",
    "the Americaworldworldn Broadcashelloting Company (ABC)",
    "the Fox Broadchelloasting Coworldworldmpany (Fox)",
    "the ChelloW Televiworldsion Network.",
    "public broadcworldasting serhellovice (PBS)"
]
clean_channel_names = remove_bad_words(channel_names, bad_words)
print("\n".join(clean_channel_names))
The output is:
National Broadcasting Company (NBC)
CBS (formerly  known as the Columbia  Broadcasting System)
the American Broadcasting Company (ABC)
the Fox Broadcasting Company (Fox)
the CW Television Network.
public broadcasting service (PBS)
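For comparison, the same kind of cleanup can be sketched with a single regular-expression pass instead of recursion. This is an alternative to the function above, not a restatement of it. `remove_words` is a hypothetical name, and the two approaches should agree as long as the words to remove do not overlap one another.

```python
import re

def remove_words(strings, words):
    """Delete every occurrence of each word in `words` from each string."""
    if not words:
        return list(strings)
    # Escape the words so they are matched literally, then remove them in one pass.
    pattern = re.compile('|'.join(re.escape(w) for w in words))
    return [pattern.sub('', s) for s in strings]

cleaned = remove_words(
    ["Nationahellol Broadcahellosting Company (NBC)",
     "the ChelloW Televiworldsion Network."],
    ["hello", "world"],
)
print("\n".join(cleaned))
```

A chain of `str.replace` calls would behave slightly differently: removing one word can bring the two halves of another together, so `"wohellorld"` would first become `"world"` and then be removed entirely. The split-and-join function and the single regex pass both leave `"world"` in place in that case.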