Getting NaN values from a for loop [python pandas]
I have a pandas DataFrame with a column CREATIVE_NAME, and I want to create a new column CREATIVE_SIZE by searching each name for specific substrings and putting the matching substring in the new column.
from re import search  # assumption: `search` in the original post is re.search

creative_size = []
for i in df['CREATIVE_NAME']:
    if search('320x480', i):
        creative_size.append('320x480')
    elif search('728x1024', i):
        creative_size.append('728x1024')
    elif search('320x50', i):
        creative_size.append('320x50')
    elif search('728x90', i):
        creative_size.append('728x90')
    elif search('300x250', i):
        creative_size.append('300x250')
    elif search('80x80', i):
        creative_size.append('80x80')
    elif search('1200x627', i):
        creative_size.append('1200x627')
    elif search('768x1024', i):
        creative_size.append('768x1024')
    elif search('320x420', i):
        creative_size.append('320x420')
    elif search('768x820', i):
        creative_size.append('768x820')
    else:
        creative_size.append('no creative size')

sizes = pd.Series(creative_size)
df.insert(column='creative_size', value=sizes, loc=0)
df['creative_size'].isna().sum()
output: 1579
I don't understand why I'm getting NaN values from the for loop: the if/elif/else chain appends exactly one value for every row, so nothing should be left out.
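A possible cause, which the post does not confirm: the loop does append one value per row, but pd.Series(creative_size) gets a fresh 0..n-1 index, and df.insert aligns a Series on index labels rather than on position. If df has any other index (for example after filtering rows), the labels that do not match come out as NaN. The sketch below reproduces this on made-up data; the index values and names are hypothetical.

```python
import pandas as pd

# Hypothetical data: a non-default index, as left behind by filtering or slicing.
df = pd.DataFrame(
    {"CREATIVE_NAME": ["a_320x480", "b_728x90", "c_unknown"]},
    index=[10, 11, 12],
)

# Stand-in for the list built by the loop: one label per row, no gaps.
labels = ["320x480", "728x90", "no creative size"]

# pd.Series(labels) is indexed 0..2; insert aligns on index labels,
# so nothing matches 10..12 and every row becomes NaN.
misaligned = df.copy()
misaligned.insert(loc=0, column="creative_size", value=pd.Series(labels))
nan_count = int(misaligned["creative_size"].isna().sum())

# Passing the plain list (or a Series built with index=df.index)
# assigns by position instead.
aligned = df.copy()
aligned.insert(loc=0, column="creative_size", value=labels)
ok_count = int(aligned["creative_size"].isna().sum())

print(nan_count, ok_count)
```

If this is what is happening, passing the list directly or building the Series with index=df.index should remove the NaN values without touching the loop.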
import pandas as pd

#### FOR TESTING ####
test_data_dict = {
    'CREATIVE_NAME': ['320x480', '728x1024', '1000x1000']
}
df = pd.DataFrame(data=test_data_dict)

#### Define a tuple of all creative sizes you want to check against
creative_sizes = ('320x480', '728x1024', '320x50', '728x90', '300x250',
                  '80x80', '1200x627', '768x1024', '320x420', '768x820')

###### Return the first known creative size that occurs as a substring of `c_name`
def get_creative_size(c_name):
    # c_name is the value of CREATIVE_NAME in the row
    result = [size for size in creative_sizes if size in c_name]
    if len(result) > 0:
        return result[0]
    else:
        return 'no creative size'

# The Series returned by apply carries df's own index, so values line up row by row
df['CREATIVE_SIZE'] = df['CREATIVE_NAME'].apply(get_creative_size)
print(df.head())
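As an alternative to apply, the same lookup can probably be done with a single vectorized call. This is a minimal sketch, not the answer's method: it assumes CREATIVE_NAME holds strings, and the sample names below are invented for illustration. It builds one regular expression from the size list and uses Series.str.extract, filling unmatched rows with the same fallback label.

```python
import re
import pandas as pd

creative_sizes = ('320x480', '728x1024', '320x50', '728x90', '300x250',
                  '80x80', '1200x627', '768x1024', '320x420', '768x820')

# Hypothetical creative names in which the size is embedded in a longer string.
df = pd.DataFrame({
    'CREATIVE_NAME': ['banner_320x480_v2', 'promo_728x1024', 'misc_1000x1000']
})

# One capture group containing every known size, escaped for safety.
pattern = '(' + '|'.join(re.escape(size) for size in creative_sizes) + ')'

# expand=False returns a Series that shares df's index; rows without a match are NaN.
df['CREATIVE_SIZE'] = (
    df['CREATIVE_NAME']
    .str.extract(pattern, expand=False)
    .fillna('no creative size')
)
print(df)
```

One difference worth noting: the regular expression returns the leftmost match in the name, whereas the list-based function returns the first size in list order, so the two can disagree when a name contains more than one known size.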