I love regular expressions. Okay, I love the challenge of crafting regular expressions. I do not enjoy reading regular expressions that I have not created or, really, even the ones I do create. But give me a problem and tell me to make a regular expression to match things and I am all over it.
A co-worker wanted a regular expression to turn unlinked URLs in text into HTML links and to correct linked URLs that lacked a protocol into valid URLs. In other words, if "www.google.com" appeared in some text, it needed to be replaced with
<a href="http://www.google.com/">www.google.com</a> and
<a href="www.google.com">some link text</a> needed to turn into
<a href="http://www.google.com">some link text</a>
My first pass was a monster regular expression that handled both situations, but I couldn't get the replacement string to account for the fact that there was already link text in the invalid URL example. And I couldn't adequately cover the situation where there were attributes before the href attribute. So scrap that one.
This is what I came up with after separating it into two replacement passes. I share it with you both as a testament to my regular expression abilities (good or bad, you decide) and because this situation seems like one that might come up pretty frequently.
|Regular expression||Replacement string|
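To make the two-pass idea concrete, here is a rough Python sketch. It is not the pair of patterns from the table above, just one way the same split could look. The function names (fix_hrefs, link_bare_urls, linkify) are made up for illustration, and the second pass uses a replacement function rather than a plain replacement string. That is a different technique, chosen here because it lets existing links pass through untouched. It assumes reasonably well-formed HTML, with every anchor closed by </a>.

```python
import re

# Pass 1: an href value that starts with "www." gets "http://" prepended.
# [^>]*? allows other attributes to appear before href.
HREF_WITHOUT_PROTOCOL = re.compile(
    r'(<a\b[^>]*?\bhref=["\'])(www\.)', re.IGNORECASE
)

# Pass 2: match either a whole existing anchor (group 1) or a bare URL
# (group 2). Matching whole anchors first keeps URLs inside them from
# being linked a second time.
ANCHOR_OR_BARE_URL = re.compile(
    r'(<a\b[^>]*>.*?</a>)|((?:https?://|www\.)[^\s<>"\']+)',
    re.IGNORECASE | re.DOTALL,
)


def fix_hrefs(text):
    """Add a protocol to linked URLs that lack one."""
    return HREF_WITHOUT_PROTOCOL.sub(r'\1http://\2', text)


def _link(match):
    # Existing anchors are returned unchanged.
    if match.group(1):
        return match.group(1)
    url = match.group(2)
    has_protocol = url.lower().startswith(('http://', 'https://'))
    href = url if has_protocol else 'http://' + url
    return f'<a href="{href}">{url}</a>'


def link_bare_urls(text):
    """Wrap unlinked URLs in anchor tags."""
    return ANCHOR_OR_BARE_URL.sub(_link, text)


def linkify(text):
    """Run both passes: repair hrefs first, then link bare URLs."""
    return link_bare_urls(fix_hrefs(text))


print(linkify('see www.google.com now'))
# see <a href="http://www.google.com">www.google.com</a> now
print(linkify('<a class="x" href="www.google.com">some link text</a>'))
# <a class="x" href="http://www.google.com">some link text</a>
```

A couple of known gaps in this sketch: it does not add the trailing slash shown in the first example, and a URL followed directly by punctuation (say, a period at the end of a sentence) will swallow that punctuation into the link.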