That's Debatable

The GreaterDebater Blog

So you want to autolink urls...

by gabe on Feb 08, 2010

Finding a url inside a chunk of text is no easy task. Coding Horror and Daring Fireball have covered regular expressions for matching urls in a variety of circumstances. I went with the Coding Horror regular expression, mainly because the Daring Fireball version wasn't around when I was working on this. Using Python's regular expression library, the code to get all the urls in a block of text looks like this:

import re
# Group 1 captures the url; group 2 captures an optional trailing "> or </a>.
urlre = re.compile(r'(\(?https?://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|])(">|</a>)?')
urls = urlre.findall(html)

That pulls out all of the urls in a text and puts them in ...
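
To make the shape of that result concrete, here is a small sketch of what findall returns for this pattern. The sample string and the names sample, matches, and unlinked are made up for illustration and are not from the original code. Filtering on the second group is only one plausible use of it, on the assumption that a trailing "> or </a> means the url already sits inside a link.

```python
import re

# Same pattern as above, written as a raw string.
urlre = re.compile(r'(\(?https?://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|])(">|</a>)?')

# Hypothetical input: one bare url, one already inside an href, one in parentheses.
sample = 'Plain http://example.com/a, linked <a href="http://example.com/b">b</a> and (http://example.com/c).'

# With two groups in the pattern, findall returns a list of (url, trailer) tuples.
matches = urlre.findall(sample)
for url, trailer in matches:
    print(repr(url), repr(trailer))

# Assumption: a non-empty trailer marks a url that is already part of a link.
unlinked = [url for url, trailer in matches if not trailer]
print(unlinked)
```

Note that the trailing comma after the first url is left out of the match, while the parenthesized url comes back with its parentheses still attached, so those would need to be handled separately before building a link.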


Introducing That's Debatable

by gabe on Feb 01, 2010

Welcome to the first installment of That's Debatable, the official GreaterDebater blog. I guess this is the part where I talk about the sorts of things I intend to post here. For the time being, I'm planning to use this as a catchall blog for everything I want to write about. I suppose time will tell whether that's the best idea. On the GreaterDebater front, I'd like to blog about:

  • New and in-development features for GreaterDebater. Feedback is always appreciated.
  • Some of how GreaterDebater works and things I've learned along the way ...