Regular expression to fetch the specific pattern
I need to fetch specific rules from a filter list and count how many rules there are in each category.
I have tried to fetch this type of rule from the filter list. The rule pattern is as follows:
"/example.com $script,domain=example.com"
The second, an exception rule, is:
"@@/example.com $script,domain=example.com"
The third rule, with a domain anchor, is:
"||example.com"
The fourth rule, with an anchor and a domain tag, is:
"||jizz.best^$popup,domain=vivo.sx"
The fifth one is:
"@@||pagead2.googlesyndication.com/pagead/js/adsbygoogle.js$script,domain=quebeccoupongratuit.com"
The sixth, with a domain restriction, is:
"example.com###examplebanner"
The seventh, without a domain restriction, is:
"###examplebanner"
The eighth is an exception rule for element hiding:
"example.com#@##examplebanner"
These are the different categories of rules, and I have to fetch each one separately.
# Count the rule categories handled so far in the filter list.
with open('1-19-16anti-adblock-killer-filters.txt', 'r') as a:
    text = a.read()

line_starts_with_2pipes_no_domain = 0
line_starts_with_2pipes_with_domain = 0
line_starts_with_2ats_with_domain = 0
line_with_domain = 0
for line in text.split("\n"):
    if line.startswith("||"):
        if ",domain" in line:
            line_starts_with_2pipes_with_domain += 1
        else:
            line_starts_with_2pipes_no_domain += 1
    elif line.startswith("@@") and ",domain" in line:
        line_starts_with_2ats_with_domain += 1
    elif ",domain" in line:
        line_with_domain += 1
    elif line.strip():
        # Any non-empty line that matched none of the categories above.
        print(f"No idea what to do with :{line}")
print("2pipes_no_group", line_starts_with_2pipes_no_domain)
print("2pipes_with_group", line_starts_with_2pipes_with_domain)
print("2@_with_group", line_starts_with_2ats_with_domain)
print("line_with_domain", line_with_domain)
I am now trying to fetch the 5th, 6th, 7th and 8th rules. Your response will be appreciated, thanks.
Your regex does not allow for the , before domain:
"\/[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+domain="
#                 ^^^^^^^^^^^^^ no , allowed
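To see the mismatch concretely, here is a minimal sketch that runs the regex above against the first sample rule from the question. The second pattern, `relaxed_pattern`, is only an assumed variant written for illustration: it lets any characters (including the space, `$` and `,`) appear before `,domain=`. It is not necessarily the pattern you want for a full filter list.

```python
import re

# First sample rule from the question.
rule = "/example.com $script,domain=example.com"

# The regex from the question: the second character class stops at the
# space after "com", so "domain=" can never follow it directly.
original_pattern = re.compile(r"\/[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+domain=")

# Hypothetical relaxed variant: allow anything up to ",domain=".
relaxed_pattern = re.compile(r"\/[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+.*?,domain=")

print(original_pattern.search(rule))       # None
print(bool(relaxed_pattern.search(rule)))  # True
```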
You can also simplify this a lot:
with open("easylist.txt") as f:
    print('Total rules with a domain tag =', f.read().count(",domain="))
This should tell you how often ',domain=' occurs. If your file is big, you can also count line by line:
domain_rule_count = 0
with open("easylist.txt") as f:
    for line in f:
        domain_rule_count += 1 if ",domain=" in line else 0
Edit after a question in the comments: you simply test for what you want:
text = """ some text
/example.com $script,domain=example.com
@@/example.com $script,domain=example.com
||example.com
||jizz.best^$popup,domain=vivo.sx
"""
line_starts_with_2pipes_no_domain = 0
line_starts_with_2pipes_with_domain = 0
line_starts_with_2ats_with_domain = 0
line_with_domain = 0
for line in text.split("\n"):
    if line.startswith("||"):
        if ",domain" in line:
            line_starts_with_2pipes_with_domain += 1
        else:
            line_starts_with_2pipes_no_domain += 1
    elif line.startswith("@@") and ",domain" in line:
        line_starts_with_2ats_with_domain += 1
    elif ",domain" in line:
        line_with_domain += 1
    elif line.strip():
        print(f"No idea what to do with '{line}'")
print("2pipes_no_group", line_starts_with_2pipes_no_domain)
print("2pipes_with_group", line_starts_with_2pipes_with_domain)
print("2@_with_group", line_starts_with_2ats_with_domain)
print("line_with_domain", line_with_domain)
Output:
No idea what to do with ' some text'
2pipes_no_group 1
2pipes_with_group 1
2@_with_group 1
line_with_domain 1
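The same "test for what you want" idea could be stretched to cover the 5th to 8th rules as well. The following is only a rough sketch: the function name `classify` and the category labels are made up for illustration, and it assumes that `#@#` marks an element hiding exception, that `##` marks an element hiding rule, and that lines starting with `!` or `[` are comments or headers to skip. Real filter lists contain more rule types than these eight, so anything unrecognised ends up under "other".

```python
from collections import Counter

def classify(line):
    """Return a (hypothetical) category label for one filter-list line."""
    rule = line.strip()
    if not rule or rule.startswith("!") or rule.startswith("["):
        return "ignored"  # blank lines, comments, list header
    # Check "#@#" before "##", because "#@##id" also contains "##".
    if "#@#" in rule:
        return "element_hiding_exception"        # 8th rule
    if "##" in rule:
        if rule.startswith("##"):
            return "element_hiding_no_domain"    # 7th rule
        return "element_hiding_with_domain"      # 6th rule
    has_domain = ",domain=" in rule
    if rule.startswith("@@||") and has_domain:
        return "exception_anchor_with_domain"    # 5th rule
    if rule.startswith("@@") and has_domain:
        return "exception_with_domain"           # 2nd rule
    if rule.startswith("||"):
        return "anchor_with_domain" if has_domain else "anchor_no_domain"
    if has_domain:
        return "plain_with_domain"               # 1st rule
    return "other"

# The eight sample rules from the question.
sample_rules = [
    "/example.com $script,domain=example.com",
    "@@/example.com $script,domain=example.com",
    "||example.com",
    "||jizz.best^$popup,domain=vivo.sx",
    "@@||pagead2.googlesyndication.com/pagead/js/adsbygoogle.js$script,domain=quebeccoupongratuit.com",
    "example.com###examplebanner",
    "###examplebanner",
    "example.com#@##examplebanner",
]

counts = Counter(classify(rule) for rule in sample_rules)
for category, n in sorted(counts.items()):
    print(category, n)
```

With the eight sample rules, each category is counted once. To run it on a file, you would feed the lines of the opened file to `classify` instead of `sample_rules`.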