When trying to use groups for regular expression searches with findall in python, python wont work as in re.search:
Example with re.search
>>> vresearch = re.search(r"(<tag101>titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch.group(4) = " + str(vresearch.group(4)))
whatever is in .* will be returned
Example with re.findall
Without group:
>>> vresearch = re.findall(r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch[0] = " + str(vresearch[0]))
vresearch[0] = ('<tag101>Titel</tag101>', '\n', '<dd>', ".*", '</dd>')
Again with group:
>>> vresearch = re.findall(r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch[0].group() = " + str(vresearch[0].group()))
AttributeError: 'tuple' object has no attribute 'group'
Example2 with re.findall
>>> re.findall('ab(cde)fg(0123)', 'abcdefg0123 and again abcdefg0123')
[('cde', '0123'), ('cde', '0123')]
👉 Findall just returns the captured groups.
Python documentation
re.findall
(pattern, string, flags=0)Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
Source: https://docs.python.org/3/library/re.html
No comments:
Post a Comment