24 Apr 2010

Python Tips - Regular Expressions

1 Non-greedy matching (a ? after the quantifier)
>>> re.findall(r"a(\d+?)", "a23b")
['2']
>>> re.findall(r"a(\d+)", "a23b")
['23']


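Compare with this case, where a literal b follows the group. The non-greedy quantifier still has to expand until b can match, so both patterns return the same result: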
>>> re.findall(r"a(\d+)b", "a23b")
['23']
>>> re.findall(r"a(\d+?)b", "a23b")
['23']
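
The difference shows up most clearly when the same delimiter occurs more than once. A minimal sketch, using a made-up sample string that is not from the original post:

```python
import re

# Hypothetical sample text with two <b>...</b> pairs.
text = "<b>one</b> and <b>two</b>"

# Greedy: .+ runs as far as it can, so the match ends at the LAST </b>.
greedy = re.findall(r"<b>(.+)</b>", text)

# Non-greedy: .+? stops at the first </b> that lets the match succeed.
lazy = re.findall(r"<b>(.+?)</b>", text)

print(greedy)  # ['one</b> and <b>two']
print(lazy)    # ['one', 'two']
```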

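2 For matching across multiple lines, add the re.S and re.M flags
re.S: . also matches the newline character; by default . matches any character except a newline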
>>> re.findall(r"a(\d+)b.+a(\d+)b", "a23b\na34b")
[]
>>> re.findall(r"a(\d+)b.+a(\d+)b", "a23b\na34b", re.S)
[('23', '34')]
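
The same effect can be had in a few equivalent ways. The sketch below reuses the string from the session above; the variable names are only for illustration:

```python
import re

text = "a23b\na34b"
pattern = r"a(\d+)b.+a(\d+)b"

# re.DOTALL is the long name for re.S.
by_flag = re.findall(pattern, text, re.DOTALL)

# The inline form (?s) at the start of the pattern sets the same flag.
by_inline = re.findall(r"(?s)" + pattern, text)

# Without any flag, the class [\s\S] matches every character, newline included.
by_class = re.findall(r"a(\d+)b[\s\S]+a(\d+)b", text)

print(by_flag, by_inline, by_class)
```

The inline form is handy when the pattern is passed to code that does not let you supply flags.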

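re.M: ^ and $ match at the start and end of every line; by default ^ matches only at the start of the string, and $ only at the end of the string (or just before a trailing newline)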
>>> re.findall(r"^a(\d+)b", "a23b\na34b")
['23']
>>> re.findall(r"^a(\d+)b", "a23b\na34b", re.M)
['23', '34']
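
The $ anchor behaves the same way. A small sketch, assuming a sample string with a trailing newline (added here for illustration):

```python
import re

text = "a23b\na34b\n"

# Default: $ matches only at the end of the string,
# or just before a newline that ends the string.
default_end = re.findall(r"a(\d+)b$", text)

# re.M (long name re.MULTILINE): $ also matches before every newline.
multiline_end = re.findall(r"a(\d+)b$", text, re.M)

# \A always means "start of the string" and is not affected by re.M.
start_only = re.findall(r"\Aa(\d+)b", text, re.M)

print(default_end)    # ['34']
print(multiline_end)  # ['23', '34']
print(start_only)     # ['23']
```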

However, if the pattern has no ^ anchor:
>>> re.findall(r"a(\d+)b", "a23b\na23b")
['23', '23']

So re.M is not needed when the pattern contains no ^ or $.
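
When a pattern needs both behaviours, the two flags can be combined with |. A minimal sketch with a made-up three-line string:

```python
import re

# Hypothetical input: two matching lines with an unrelated line in between.
text = "a23b\nnote\na34b"
pattern = r"^a(\d+)b.+^a(\d+)b"

# re.S lets .+ cross the newlines; re.M lets the second ^ match at a line start.
both = re.findall(pattern, text, re.S | re.M)

# With only one of the two flags the pattern cannot match.
only_s = re.findall(pattern, text, re.S)
only_m = re.findall(pattern, text, re.M)

print(both)    # [('23', '34')]
print(only_s)  # []
print(only_m)  # []
```
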
Source: http://www.juyimeng.com/python-multi-line-non-greedy-regular-expression-sample.html
