Tuesday, May 8, 2007

Stripping HTML tags

I had to write a module to count the words on a web page. That meant stripping the HTML tags out of the page read from the URL. I googled briefly, but a problem this easy hardly seemed to need googling, so I came up with a simple solution of my own.

This is very rough test code written using BeautifulSoup.

http://lucky.umd.edu/code/strip-htmltag.py
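
For readers who just want the idea without following the link, here is a minimal sketch of the same task. It is not the linked script: it swaps BeautifulSoup for Python's standard-library html.parser so that it runs with no extra install, and the names TextExtractor, strip_tags, and word_count are made up for this illustration. It assumes the page has already been fetched into a string.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []      # text fragments found between tags
        self._skip_depth = 0   # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not script or style source.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Joining with a space keeps adjacent blocks such as <p>a</p><p>b</p>
        # apart, at the cost of splitting a word that is broken up by inline tags.
        return " ".join(self._chunks)


def strip_tags(html):
    """Return the text of an HTML string with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def word_count(html):
    """Count whitespace-separated words in the tag-stripped text."""
    return len(strip_tags(html).split())
```

Calling word_count on the HTML string of a page gives a rough count. Real pages are messier than this handles (hidden elements, comments in odd places, broken markup), which is where a more forgiving parser like BeautifulSoup earns its keep.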
