To sanitize user content, use an HTML parser
If you allow any HTML at all in user-submitted content, it is especially important to sanitize that content by actually parsing the HTML and removing any tags or attributes you do not want to allow. If you fail to do so, your site may be vulnerable to XSS (cross-site scripting) attacks.
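As a rough illustration of what "parse, then filter" can look like, here is a minimal sketch that uses Python's standard-library `html.parser`. The allowlist (`ALLOWED`), the permitted URL schemes (`SAFE_SCHEMES`), and the names `AllowlistSanitizer` and `sanitize` are assumptions made up for this example; they are not taken from any particular library. It is a sketch of the idea, not a production-ready sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration: tag name -> attributes permitted on it.
ALLOWED = {"p": set(), "b": set(), "i": set(), "em": set(), "strong": set(), "a": {"href"}}
# Hypothetical policy: only absolute links with these schemes are kept.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags and attributes; all text is re-escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # drop the tag itself; its text content is still emitted
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # drops onclick, style, and anything else not listed
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: URLs (and, in this sketch, relative URLs)
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(sanitize('<p onclick="evil()">Hi <b>there</b><script>alert(1)</script></p>'))
```

The sketch uses an allowlist rather than a blocklist: anything not explicitly permitted is dropped, so an unfamiliar tag or attribute fails safe. In practice a maintained sanitization library is usually a better choice than hand-rolled code like this.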
Q: “But isn’t it overkill to parse the HTML? Can’t I use other techniques, such as regular expressions or simple string replacement, to filter out dangerous tags and attributes?” A: No, and I’ll explain why.
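To give a feel for the problem, here is a small sketch of one way a regex-based filter can go wrong. The function `naive_strip_scripts` and both payloads are hypothetical examples written for this illustration; they are not necessarily the cases the author has in mind.

```python
import re


def naive_strip_scripts(html_text):
    # Naive approach: delete anything that looks like a <script>...</script> block.
    pattern = r"<script.*?>.*?</script>"
    return re.sub(pattern, "", html_text, flags=re.IGNORECASE | re.DOTALL)


# Payload 1: fragments that reassemble into a script tag once the inner
# "<script></script>" pairs have been deleted by a single pass.
nested = "<scr<script></script>ipt>alert(1)</scr<script></script>ipt>"

# Payload 2: no <script> tag at all; the code lives in an event-handler attribute.
handler = "<img src=x onerror=alert(1)>"

print(naive_strip_scripts(nested))
print(naive_strip_scripts(handler))
```

In the first case the filter's own deletions produce a working `<script>` element; in the second, the pattern never matches, so the input passes through unchanged. Patching the pattern for each new trick tends to turn into an endless game of catch-up, which is the gap that parsing the markup and filtering the resulting tags and attributes is meant to close.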