regex-email | Regular expression for email | Regex library
kandi X-RAY | regex-email Summary
Regular expression for email
Community Discussions
Trending Discussions on regex-email
QUESTION
I'm doing the RegexOne regex tutorial and it has a question about writing a regular expression to remove unnecessary whitespace.
The solution provided in the tutorial is:
We can just skip all the starting and ending whitespace by not capturing it in a line. For example, the expression
^\s*(.*)\s*$
will catch only the content.
The setup for the question does indicate the use of the hat at the beginning and the dollar sign at the end, so it makes sense that this is the expression that they want:
We have previously seen how to match a full line of text using the hat ^ and the dollar sign $ respectively. When used in conjunction with the whitespace \s, you can easily skip all preceding and trailing spaces.
That said, using \S instead, I was able to come up with what seems like a simpler solution: (\S.*\S).
I've found this SO solution that matches the one in the tutorial (Regex Email - Ignore leading and trailing spaces?) and I've seen other guides that recommend the same format, but I'm struggling to find an explanation for why the \S is bad.
Additionally, this validates as correct in their tool... so, are there cases where this would not work as well as the provided solution? Or is the recommended version just a standard format?
ANSWER
Answered 2020-Jul-25 at 04:32
The tutorial's solution of ^\s*(.*)\s*$ is wrong. The capture group (.*) is greedy, so it expands as far as it can, all the way to the end of the line, and captures the trailing spaces too. Because the \s* that follows can match the empty string, the .* never has to backtrack, so that trailing \s* never consumes any characters.
https://regex101.com/r/584uVG/1
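To see the greedy capture outside of regex101, here is a minimal sketch in Python (my choice of language for illustration; the sample line and the names TUTORIAL and captured are made up for this example). It only shows that group 1 of the tutorial's pattern still carries the trailing spaces.

```python
import re

# The tutorial's pattern, exactly as quoted above.
TUTORIAL = re.compile(r"^\s*(.*)\s*$")

# Hypothetical sample line with three leading and three trailing spaces.
line = "   some content   "

match = TUTORIAL.match(line)
captured = match.group(1)

# The leading \s* eats the indentation, but the greedy (.*) runs to the
# end of the line, so the trailing \s* is left matching the empty string.
print(repr(captured))
```

The leading spaces are gone from the capture, but the trailing ones are not.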
Your solution is much better at actually matching only the non-whitespace content in the line, but there are a couple of odd cases in which it won't match the non-space characters in the middle. (\S.*\S) needs at least two non-whitespace characters to match, so it fails on a line whose content is a single character. The tutorial's (.*), by contrast, can capture a single character, and captures nothing at all if the input is composed entirely of whitespace.
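Those edge cases are easy to check with a short Python sketch. The helper capture and the sample strings are hypothetical, introduced only for this comparison; it returns group 1, or None when the pattern does not match at all.

```python
import re

ASKER = re.compile(r"(\S.*\S)")            # the pattern from the question
TUTORIAL = re.compile(r"^\s*(.*)\s*$")     # the tutorial's pattern


def capture(pattern, line):
    """Return group 1 of the first match in line, or None if no match."""
    m = pattern.search(line)
    return m.group(1) if m else None


# Two or more non-whitespace characters: the question's pattern trims both ends.
two_chars = capture(ASKER, "  a b  ")

# A single non-whitespace character: (\S.*\S) cannot match at all.
one_char_asker = capture(ASKER, "  a  ")

# The tutorial's pattern does match, but keeps the trailing spaces.
one_char_tutorial = capture(TUTORIAL, "  a  ")

# Whitespace-only input: the tutorial's pattern matches with an empty capture.
blank_tutorial = capture(TUTORIAL, "     ")

print(repr(two_chars), repr(one_char_asker),
      repr(one_char_tutorial), repr(blank_tutorial))
```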
But, given the problem description at your link:
Occasionally, you'll find yourself with a log file that has ill-formatted whitespace where lines are indented too much or not enough. One way to fix this is to use an editor's search a replace and a regular expression to extract the content of the lines without the extra whitespace.
From this, matching only the non-whitespace content (like you're doing) probably wouldn't remove the undesirable leading and trailing spaces. The tutorial is probably thinking to guide you towards a technique that can be used to match a whole line with a particular pattern, and then replace that line with only the captured group, like:
Match ^\s*(.*\S)\s*$, replace with $1: https://regex101.com/r/584uVG/2/
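The same match-and-replace can be sketched in Python, where the replacement $1 is spelled \1. The function name trim_line is an assumption for this example, and it handles one line at a time on purpose: \s also matches newlines, so running the pattern over a whole multi-line string could swallow line breaks.

```python
import re

# Corrected pattern: the \S inside the group forces the capture to end
# on a non-whitespace character, so the trailing \s* gets the spaces.
TRIM = re.compile(r"^\s*(.*\S)\s*$")


def trim_line(line: str) -> str:
    """Replace the whole line with group 1 (written $1 in many editors)."""
    return TRIM.sub(r"\1", line)


print(repr(trim_line("   indented too much   ")))
print(repr(trim_line("x")))
# A whitespace-only line has no \S to anchor on, so the pattern does not
# match and the line comes back unchanged.
print(repr(trim_line("    ")))
```

Unlike (\S.*\S), this pattern still handles a line with a single non-whitespace character, because .* is allowed to be empty.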
Your technique would work for this problem if you had a way to make a new text file containing only the captured groups (or all the full matches).
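One way that could look, as a rough Python sketch rather than anything the tutorial prescribes: collect group 1 from each line and join the results into new text. The function extract_content and the sample log are hypothetical.

```python
import re

CONTENT = re.compile(r"(\S.*\S)")  # the pattern from the question


def extract_content(text: str) -> str:
    """Build new text from the captured group of every matching line."""
    kept = []
    for line in text.splitlines():
        m = CONTENT.search(line)
        if m:  # lines with fewer than two non-whitespace chars are skipped
            kept.append(m.group(1))
    return "\n".join(kept)


# Hypothetical badly indented log, including one whitespace-only line.
log = "   first entry  \n\t\tsecond entry\n      \nthird entry   "

cleaned = extract_content(log)
print(cleaned)
```

Writing cleaned to a file would give the "new text file" described above. Note that whitespace-only lines and single-character lines are dropped, since the pattern never matches them.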
Community Discussions, Code Snippets contain sources that include Stack Exchange Network