Strip HTML Tags Using Regular Expressions
Written by coregps on Tuesday, November 30th, 2004 in General.
In my current project, I needed a function that strips all HTML tags from a string. After a long time on Google, I found some good solutions and implemented my own version as a JavaBean. Along the way, I became more aware that regular expressions are an incredibly powerful tool, so I have decided to set aside some time to learn them properly.
package com.esurfer.common;

public class CommonUtil {
    /**
     * Function used to strip all HTML tags from strings using Regular Expressions
     * (.|\n) -> matches any character or a newline
     * *? -> 0 or more occurrences, matched non-greedily
     *
     * @param strHTML A string to be cleared of HTML tags
     * @return A string that has been filtered by the function
     */
    public String StripHTMLTags(String strHTML) {
        String sResult = "";
        // Check for null first; calling equals() on a null reference would throw.
        if ( (strHTML == null) || strHTML.equals("") ) {
            sResult = "";
        } else {
            // "\\n" in the Java string literal becomes \n in the regex.
            sResult = strHTML.replaceAll("<(.|\\n)*?>", "");
        }
        return sResult;
    }
}
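The comment above says that `*?` makes the search non-greedy. As a rough illustration of why that matters, here is a minimal sketch that runs the same pattern with and without the `?`. The class name `StripTagsDemo` and its two helper methods are made up for this example and are not part of `CommonUtil`.

```java
import java.util.regex.Pattern;

public class StripTagsDemo {
    // Non-greedy: each match stops at the first '>' that follows a '<'.
    private static final Pattern NON_GREEDY = Pattern.compile("<(.|\\n)*?>");
    // Greedy: a single match runs from the first '<' to the last '>' in the input.
    private static final Pattern GREEDY = Pattern.compile("<(.|\\n)*>");

    // Hypothetical helper: removes each tag individually.
    static String stripNonGreedy(String html) {
        return NON_GREEDY.matcher(html).replaceAll("");
    }

    // Hypothetical helper: shows what happens when the '?' is dropped.
    static String stripGreedy(String html) {
        return GREEDY.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Hello <b>world</b></p>";
        // Only the tags are removed; the text between them survives.
        System.out.println("[" + stripNonGreedy(html) + "]");
        // Everything between the first '<' and the last '>' is removed, text included.
        System.out.println("[" + stripGreedy(html) + "]");
        // The (.|\n) alternation lets a tag that spans two lines match as well.
        System.out.println("[" + stripNonGreedy("<a\nhref='x'>link</a>") + "]");
    }
}
```

With the greedy version the text between the tags is lost along with them, which is why the `?` is needed here.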
The following is the same function in JavaScript:
<script type="text/javascript">
function StripHTMLTags(s) {
    var strHTML = s;
    // The g flag replaces every tag; without it only the first match is removed.
    var regex = /<(.|\n)*?>/g;
    var result = strHTML.replace(regex, "");
    return result;
}
</script>










