How to find the pattern "numberxnumber" in a string using a regular expression in Python

Question

I have a column that contains strings like the following:

DSP_campaign_region_market_MO_0_Device_Display_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0_DYN_FLTKG_010121-123121_SP_PID=111112220202043

DSP_campaign_region_market_0_Device_video_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0__PID=11172045203353_DYN_FLTKG_010121-123121_MP

I need to extract the creative sizes, such as 160x600 or 1x1, from strings like the ones shown above.

I basically split every string in the column on "_" and append the pieces to empty lists, which I then add as columns:

 campaign=[]
 dsp = []
 market=[]
 region =[]
 device_type=[]
 channel=[]
 creative = []
 for i in mapper['string_column']:
     i = str(i)
     i = i.split("_")
     dsp.append(i[0].replace("  ",''))
     campaign.append(i[1])
     region.append(i[2])
     market.append(i[3])
     device_type.append(i[5])
     channel.append(i[6])
     creative.append(i[13])  # problem line: index 13 is not always the size

However, because the strings are not named consistently, splitting on "_" puts 160x600 at i[13] for some strings and DSP Custom HH Ext for others.
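To make the mismatch concrete, here is a minimal sketch that splits the two sample strings above and looks at index 13. The names `samples` and `fields_at_13` are illustrative only and are not part of the original code.

```python
# The two sample strings from the question.
samples = [
    "DSP_campaign_region_market_MO_0_Device_Display_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0_DYN_FLTKG_010121-123121_SP_PID=111112220202043",
    "DSP_campaign_region_market_0_Device_video_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0__PID=11172045203353_DYN_FLTKG_010121-123121_MP",
]

# The first string carries an extra "MO" field, so every later field
# sits one position further right than in the second string.
fields_at_13 = [s.split("_")[13] for s in samples]
print(fields_at_13)
# ['DSP Custom HH Ext', '160x600']
```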

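So, is there a way to use a regular expression to identify the creative-size part of the string, such as 160X600, 1X1, or 720X90, instead of splitting the string?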

Answer 1

This can be solved with a regular expression, without splitting the original string. Something like this:

import re

texts = ["DSP_campaign_region_market_ MO_0_Device_Display_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0_DYN_FLTKG_010121-123121_SP_PID=111112220202043", "DSP_campaign_region_market_0_Device_video_Open Web_0_0_0_PROS_DSP Custom HH Ext_160x600_0__PID=11172045203353_DYN_FLTKG_010121-123121_MP"]

# Raw string, so the backslashes reach the regex engine unchanged
pattern = r"\d+x\d+"

for text in texts:
    occurrences = re.findall(pattern, text)
    for item in occurrences:
        print(item)

#> 160x600
#> 160x600
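The question also mentions sizes written with an uppercase X (160X600), and a row may contain no size at all. The sketch below is one possible way to handle both cases; it is not part of the original answer. `extract_creative_size` and `SIZE_RE` are hypothetical names, the sample rows are shortened made-up strings, and the requirement that the size fill a whole "_"-delimited field is an assumption about the naming scheme.

```python
import re

# Assumption: the size is a whole field, i.e. preceded by "_" (or the start
# of the string) and followed by "_" (or the end of the string).
SIZE_RE = re.compile(r"(?:^|_)(\d+x\d+)(?=_|$)", re.IGNORECASE)

def extract_creative_size(text):
    """Return the first creative size in text, lowercased, or None if absent."""
    match = SIZE_RE.search(str(text))
    return match.group(1).lower() if match else None

# Shortened, made-up rows for illustration only.
rows = [
    "DSP_campaign_PROS_DSP Custom HH Ext_160X600_0_SP",
    "DSP_campaign_PROS_1x1_0__PID=11172045203353",
    "DSP_campaign_PROS_no size here",
]

creative = [extract_creative_size(row) for row in rows]
print(creative)
# ['160x600', '1x1', None]
```

Returning None instead of raising keeps the result list aligned with the input rows, so it can still be added as a column. If the strings live in a pandas column, as in the question, the same pattern could probably be passed to `Series.str.extract` with `flags=re.IGNORECASE`; that is an assumption about the setup, not something shown in the answer.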

Answer 1
0 2021-02-08 10:20:30
