Python regex findall groups

When you use groups in a regular expression with re.findall in Python, the result does not behave like the match object returned by re.search:

Example with re.search

>>> vresearch = re.search(r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch.group(4) = " + str(vresearch.group(4)))
(prints whatever .* matched, i.e. the text between <dd> and </dd>)

Example with re.findall

Without calling .group():

>>> vresearch = re.findall(r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch[0] = " + str(vresearch[0]))
vresearch[0] = ('<tag101>Titel</tag101>', '\n', '<dd>', ".*", '</dd>')

Again, this time calling .group():

>>> vresearch = re.findall(r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)", str(i))
>>> print("vresearch[0].group() = " + str(vresearch[0].group()))
AttributeError: 'tuple' object has no attribute 'group'
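
Because each element of the list returned by re.findall is a plain tuple, the value that re.search exposes as group(4) can be read by indexing the tuple instead. A minimal sketch, assuming a made-up sample string in place of str(i) (the names sample, matches, and fourth_group are illustrative and not from the original example):

```python
import re

# Hypothetical sample text standing in for str(i) from the examples above.
sample = "<tag101>Titel</tag101>\n<dd>some value</dd>"

pattern = r"(<tag101>Titel</tag101>)(\n)(<dd>)(.*)(</dd>)"
matches = re.findall(pattern, sample)

# Each list element is a tuple holding the five captured groups.
first = matches[0]

# Group numbers start at 1, tuple indices at 0,
# so group 4 of re.search is tuple index 3 here.
fourth_group = first[3]
print("fourth group = " + fourth_group)
```

Note that the tuple holds only the captured groups; there is no equivalent of group(0), the full match.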

 

Example 2 with re.findall

>>> re.findall('ab(cde)fg(0123)', 'abcdefg0123 and again abcdefg0123')
[('cde', '0123'), ('cde', '0123')]

👉 When the pattern contains groups, re.findall returns only the captured groups (as strings or tuples), not match objects.
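
If the .group() interface is needed for every match, re.finditer is one option: it yields match objects like the one re.search returns. A short sketch reusing the pattern and string from Example 2 (the names text and results are illustrative):

```python
import re

text = "abcdefg0123 and again abcdefg0123"

# re.finditer yields one re.Match object per non-overlapping match,
# so group(0) (the full match) and the numbered groups are all available.
results = [
    (m.group(0), m.group(1), m.group(2))
    for m in re.finditer(r"ab(cde)fg(0123)", text)
]

for full, first, second in results:
    print(full, first, second)
```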

Python documentation

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

Source: https://docs.python.org/3/library/re.html
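
The three return shapes described in the documentation can be compared side by side. The sketch below reuses the string from Example 2; the variable names are illustrative, and the last line uses non-capturing groups, written (?:...), as one way to keep the grouping while getting the full match back:

```python
import re

text = "abcdefg0123 and again abcdefg0123"

# No groups: a list of the full matches.
no_groups = re.findall(r"abcdefg0123", text)

# Exactly one group: a list of strings, one per match.
one_group = re.findall(r"ab(cde)fg0123", text)

# More than one group: a list of tuples.
two_groups = re.findall(r"ab(cde)fg(0123)", text)

# Non-capturing groups do not count as groups for findall,
# so the full matches are returned again.
non_capturing = re.findall(r"ab(?:cde)fg(?:0123)", text)

print(no_groups)
print(one_group)
print(two_groups)
print(non_capturing)
```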


