### Description

The current e-mail regular expression cannot catch the latest (and greatest?) TLDs. There are many these days:
https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains
but scancode limits the TLD to 2, 3 or 4 characters:

```
def emails_regex():
    return re.compile('\\b[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b', re.IGNORECASE)
```
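To illustrate the problem, here is a minimal sketch that compares the pattern quoted above with a relaxed variant. The name `relaxed_emails_regex` and the upper bound of 63 are assumptions for illustration only (63 is the maximum length of a DNS label), not scancode's actual fix. Note that this sketch still would not match internationalized TLDs in their ASCII `xn--` form, since those contain digits and hyphens.

```python
import re


def current_emails_regex():
    # Pattern as quoted in this report: the TLD is limited to 2-4 letters.
    return re.compile(r'\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b', re.IGNORECASE)


def relaxed_emails_regex():
    # Hypothetical variant: allow TLDs of up to 63 letters
    # (63 is the DNS label length limit; the bound is an assumption).
    return re.compile(r'\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,63}\b', re.IGNORECASE)


text = 'Contact jane@example.technology or bob@example.com'

# The current pattern misses the address with the long TLD.
print(current_emails_regex().findall(text))

# The relaxed pattern finds both addresses.
print(relaxed_emails_regex().findall(text))
```

With the sample text, the current pattern only reports `bob@example.com`, while the relaxed one also reports `jane@example.technology`.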