How to check if the first four characters of a column are 'http' or not?
I have a dataframe column like:
df['Web']
I just want to check whether the first four characters of df['Web'] are 'http' or not.
I don't want to check whether df['Web'] is in URL format.
And how can I use an if condition like this:
if firstfour == 'http':
    print("starts with http")
else:
    print("doesn't start with http")
You can use Python's str.startswith(). However, you should note that it will also match https.
You could use a regex to match http but not https.
import pandas as pd
df = pd.DataFrame({'Web': ['htt', 'http', 'https', 'www']})
df['match'] = df.Web.apply(lambda x: x.startswith('http'))
     Web  match
0    htt  False
1   http   True
2  https   True
3    www  False
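If the goal is literally to compare the first four characters, as the question puts it, slicing is another option. The following is a minimal sketch that reuses the sample data above; the column name first_four is only an illustrative assumption. Like startswith, this also treats https as a match.

```python
import pandas as pd

df = pd.DataFrame({'Web': ['htt', 'http', 'https', 'www']})

# .str[:4] slices each string to its first four characters
# (shorter strings such as 'htt' come back unchanged).
df['first_four'] = df['Web'].str[:4]

# Element-wise comparison produces a boolean column.
df['match'] = df['first_four'] == 'http'
print(df)
```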
Regex
df['match'] = df['Web'].str.match(r'^http(?!s)')
     Web  match
0    htt  False
1   http   True
2  https  False
3    www  False
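To tie this back to the if condition in the question: writing `if df['Web'].str.startswith('http'):` directly raises a ValueError, because the truth value of a whole Series is ambiguous. A rough sketch of two ways around that follows; the names mask and label are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Web': ['htt', 'http', 'https', 'www']})
mask = df['Web'].str.startswith('http')

# Vectorized equivalent of if/else: pick one label per row from the mask.
df['label'] = np.where(mask, 'starts with http', "doesn't start with http")

# A plain Python loop also works when a per-row side effect such as print is wanted.
for url in df['Web']:
    if url.startswith('http'):
        print(url, 'starts with http')
    else:
        print(url, "doesn't start with http")
```

The np.where form is usually preferable on large frames, since it avoids a Python-level loop.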
Use Series.str.startswith:
df['match'] = df.Web.str.startswith('http')
Or use Series.str.contains with ^ to anchor the match at the start of the string:
df['match'] = df.Web.str.contains('^http')
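One thing the examples above do not cover is missing values. Assuming the column may contain them, both methods accept an na argument that sets the result for those rows. The sketch below uses made-up sample URLs and a hypothetical match_re column.

```python
import pandas as pd

# Hypothetical data with a missing entry.
df = pd.DataFrame({'Web': ['http://a.example', None, 'www.b.example']})

# na=False makes missing values count as "no match" instead of propagating NaN.
df['match'] = df['Web'].str.startswith('http', na=False)
df['match_re'] = df['Web'].str.contains(r'^http', na=False)
print(df)
```

With na=False the result is a plain boolean column, so it can be used directly as a filter, for example df[df['match']].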