How to get matches within a given range using a regular expression?
I'm stuck on my code to get all matches within a given range. My data sample is:
comment
0 [intj74, you're, whipping, people, is, a, grea...
1 [home, near, kcil2, meniaga, who, intj47, a, l...
2 [thematic, budget, kasi, smooth, sweep]
3 [budget, 2, intj69, most, people, think, of, e...
I want to get the following result (where the given range is intj1 to intj75):
comment
0 [intj74]
1 [intj47]
2 [nan]
3 [intj69]
My code is:
df.comment = df.comment.apply(lambda x: [t for t in x if t=='intj74'])
df.ix[df.comment.apply(len) == 0, 'comment'] = [[np.nan]]
I'm not sure how to use a regular expression to match the whole range instead of the single test t=='intj74'. Or is there any other way to do this?
Thanks in advance,
Pandas Python Newbie
You could replace
[t for t in x if t=='intj74']
with, e.g. (this needs import re),
[t for t in x if re.match('intj[0-9]+$', t)]
or even
[t for t in x if re.match('intj[0-9]+$', t)] or [np.nan]
which also handles the case where there are no matches (so that one wouldn't need to check for that explicitly using
df.ix[df.comment.apply(len) == 0, 'comment'] = [[np.nan]]
). The "trick" here is that an empty list evaluates to False, so the or in that case returns its right operand.
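Note that 'intj[0-9]+$' accepts any number after intj, not only 1 to 75. A minimal sketch of one way to enforce the range: capture the digits and compare them as an integer. The helper names in_range and keep_in_range are hypothetical, the rows are shortened from the question, the last row (intj76) is invented to show an out-of-range token, and math.nan stands in for np.nan so the sketch needs only the standard library.

```python
import math
import re

def in_range(token, lo=1, hi=75):
    """Hypothetical helper: True if token is 'intj<N>' with lo <= N <= hi."""
    m = re.fullmatch(r'intj([0-9]+)', token)
    return m is not None and lo <= int(m.group(1)) <= hi

def keep_in_range(tokens):
    # An empty list is falsy, so `or` falls back to [nan] when nothing matches.
    return [t for t in tokens if in_range(t)] or [math.nan]

rows = [
    ['intj74', "you're", 'whipping', 'people'],
    ['home', 'near', 'kcil2', 'meniaga', 'who', 'intj47'],
    ['thematic', 'budget', 'kasi', 'smooth', 'sweep'],
    ['budget', '2', 'intj76', 'most', 'people'],  # invented out-of-range row
]
results = [keep_in_range(r) for r in rows]
```

With the DataFrame from the question, the same function could presumably be applied as df.comment = df.comment.apply(keep_in_range).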
I am new to pandas as well. You might have initialized your DataFrame differently. Anyway, this is what I have:
import pandas as pd
data = {
'comment': [
"intj74, you're, whipping, people, is, a",
"home, near, kcil2, meniaga, who, intj47, a",
"thematic, budget, kasi, smooth, sweep",
"budget, 2, intj69, most, people, think, of"
]
}
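df = pd.DataFrame(data)
# Extract the first 'intj<number>' token from each comment (NaN if there is none)
print(df.comment.str.extract(r'(intj\d+)'))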
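The pattern r'(intj\d+)' matches any number, so it does not by itself enforce the intj1 to intj75 range from the question. As a sketch, the range can be written into the regular expression itself. RANGE_PATTERN and first_in_range are hypothetical names, the pattern assumes no leading zeros, and the last sample string is invented to show out-of-range tokens. The check below uses only the re module.

```python
import re

# Assumed pattern: 'intj' followed by 1..75, bounded so 'intj76' or 'intj100' fail.
RANGE_PATTERN = r'\b(intj(?:7[0-5]|[1-6][0-9]|[1-9]))\b'

def first_in_range(text):
    """Return the first in-range token found in text, or None."""
    m = re.search(RANGE_PATTERN, text)
    return m.group(1) if m else None

samples = [
    "intj74, you're, whipping, people, is, a",
    "home, near, kcil2, meniaga, who, intj47, a",
    "thematic, budget, kasi, smooth, sweep",
    "budget, 2, intj69, most, people, think, of",
    "budget, intj76, intj100",  # invented: both tokens are out of range
]
found = [first_in_range(s) for s in samples]
```

Because RANGE_PATTERN contains a single capture group, it should also be usable as df.comment.str.extract(RANGE_PATTERN), which leaves NaN in rows without a match.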