Check if there is a condition in each row in Python and assign to a new variable

I have a dataframe named unir in Python and I want to check whether a certain text pattern is present in a URL. If the pattern is present, I want to assign a value to a new variable; if it is not, I want to leave the variable blank.

A sample of my data is the following:

import pandas as pd

sample = [
"https://www.unir.net/revista/especiales/ley-de-factura-electronica.html",
"https://www.unir.net/revista/especiales/autoempleo/",
"https://www.unir.net/revista/",
"https://www.unir.net/revista/especiales/examen-acceso-abogacia.html",
"https://www.unir.net/revista/especiales/informe-pisa/",
"https://www.unir.net/revista/",
"https://www.unir.net/revista/especiales/dificultades-de-aprendizaje.html",
"https://www.unir.net/revista/especiales/informe-pisa/profesores-salarios.html",
"https://www.unir.net/revista/especiales/autoempleo/",
"https://www.unir.net/revista/evento/ii-festival-de-teatro-unir/",
"https://en.unir.net/revista/noticias/page/64/",
"https://www.unir.net/revista/especiales/autoempleo/",
"https://www.unir.net/revista/especiales/informe-pisa/profesores-salarios.html",
"https://www.unir.net/revista/"]

unir = pd.DataFrame(sample, columns=["url"])

I'm searching for the pattern "https://www.unir.net/revista/especiales" by doing the following:

for x in unir["url"]:
    if (unir["url"].str.contains("https://www.unir.net/revista/especiales")) is True:
        unir["arees"] = "Especiales"
    else:
        unir["arees"] = ""

But it only returns blanks.

I don't know what the problem is.

Thanks in advance.

In pandas it is best to avoid loops because they are slow; a vectorized solution with numpy.where is better:

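import numpy as np

# Boolean Series: True where the URL contains the pattern
mask = unir["url"].str.contains("https://www.unir.net/revista/especiales")
# Pick "Especiales" where mask is True, an empty string otherwise
unir["arees"] = np.where(mask, "Especiales", '')
print(unir)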
                                                  url       arees
0   https://www.unir.net/revista/especiales/ley-de...  Especiales
1   https://www.unir.net/revista/especiales/autoem...  Especiales
2                       https://www.unir.net/revista/            
3   https://www.unir.net/revista/especiales/examen...  Especiales
4   https://www.unir.net/revista/especiales/inform...  Especiales
5                       https://www.unir.net/revista/            
6   https://www.unir.net/revista/especiales/dificu...  Especiales
7   https://www.unir.net/revista/especiales/inform...  Especiales
8   https://www.unir.net/revista/especiales/autoem...  Especiales
9   https://www.unir.net/revista/evento/ii-festiva...            
10      https://en.unir.net/revista/noticias/page/64/            
11  https://www.unir.net/revista/especiales/autoem...  Especiales
12  https://www.unir.net/revista/especiales/inform...  Especiales
13                      https://www.unir.net/revista/            
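
As for why the original loop left every row blank: str.contains returns a whole boolean Series, and the "is True" test checks object identity against the single value True, so the condition never holds and the else branch overwrites the entire column on every pass. The sketch below illustrates this on a shortened three-row frame taken from the question's sample. The PREFIX constant and the series_is_true and literal_mask names are only illustrative. It also shows one possible variant, str.startswith, which matches the prefix literally; str.contains treats its pattern as a regular expression by default, so the "." characters in the URL would match any character.

```python
import numpy as np
import pandas as pd

# Shortened sample for illustration (a subset of the URLs in the question)
unir = pd.DataFrame(
    {
        "url": [
            "https://www.unir.net/revista/especiales/autoempleo/",
            "https://www.unir.net/revista/",
            "https://en.unir.net/revista/noticias/page/64/",
        ]
    }
)

PREFIX = "https://www.unir.net/revista/especiales"  # illustrative constant name

# str.contains returns a boolean Series with one value per row
mask = unir["url"].str.contains(PREFIX)

# A Series object is never the singleton True, so the original
# `if ... is True` always fell through to the else branch
series_is_true = mask is True

# Literal prefix match: the "." characters are not interpreted as regex
literal_mask = unir["url"].str.startswith(PREFIX)
unir["arees"] = np.where(literal_mask, "Especiales", "")

print(series_is_true)
print(unir["arees"].tolist())
```

If the pattern may appear anywhere in the URL rather than only at the start, passing regex=False to str.contains should give the same literal behaviour.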
