xzg3
Member
744 Points
178 Posts
Strip and clean HTML, dangerous att and script tags, but allow certain tags?
Jan 19, 2007 04:05 PM
Hello,
This library sounds interesting. I am wondering if it can be configured to do something like the following:
I'd like to find a way to filter the input from web forms so that a small, explicitly defined subset of HTML tags and attributes on an allow list (white list) is permitted, and anything not on the list is excised.
Basically, I would like to be able to specify an "allow" list that might contain B, I, U, TABLE, TD, TR, and a large number of attributes, excluding, of course, onmouse* and on* in general.
I believe I could do this with Html Agility Pack: http://www.codeplex.com/htmlagilitypack. However, I also noticed that the examples for the library showed how even src= is a dangerous attribute, so that really stinks. Perhaps stripping explicitly dangerous tags first, and then _also_ running that output through the library would be the solution.
Even so, I wondered if anyone had written or come across something that is forward-only and does not parse the content into an object tree the way that Agility Pack does, since I'm not really concerned with well-formedness, just that absolutely no potentially destructive script or object tags or attributes get through.
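For what it's worth, the forward-only allow-list idea described above can be sketched with an event-based parser that never builds a tree. This is only a minimal illustration in Python using the standard html.parser module, not .NET and not the API of any library mentioned in this thread. The names ALLOWED_TAGS, ALLOWED_ATTRS, DROP_CONTENT, AllowListFilter, and clean_html are made up for the example, and the attribute list is an assumption.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow lists for illustration; adjust to your own needs.
ALLOWED_TAGS = {"b", "i", "u", "table", "tr", "td"}
ALLOWED_ATTRS = {"align", "colspan", "rowspan", "width"}
# Elements whose inner content is discarded along with the tag itself.
DROP_CONTENT = {"script", "style", "object"}


class AllowListFilter(HTMLParser):
    """Forward-only filter: re-emits allowed tags, drops everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            # Keep only allow-listed attributes; on* handlers, src, href,
            # and anything else not listed are silently dropped.
            kept = "".join(
                f' {name}="{escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED_ATTRS
            )
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never turn back into markup.
        if not self.skip_depth:
            self.out.append(escape(data))


def clean_html(fragment):
    """Return the fragment with only allow-listed tags and attributes."""
    parser = AllowListFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(clean_html('<b onmouseover="x()">hi</b><script>alert(1)</script>'))
# prints: <b>hi</b>
```

Because src and href are simply not on the attribute list, they are dropped; allowing them would need an extra check of the URL scheme. An unclosed script or object tag causes everything after it to be discarded, which errs on the safe side. The sketch has not been tested against real-world attack strings, so treat it as a starting point only.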
Someone gave me a link to a sample on 4GuysFromRolla, but the solution was not rigorous enough.
Thank you,
Josh
nightzeus
Member
111 Points
71 Posts
Re: Strip and clean HTML, dangerous att and script tags, but allow certain tags?
Jun 28, 2007 05:35 PM
If you find something, let me know. :)
I know how to filter manually; however, it's a pain in the azz, especially when I do not know EVERY dangerous tag and attribute. Variable-width encoding complicates things as well.
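On the encoding point: one well-known trick is an overlong UTF-8 sequence, where a character such as < is hidden in a multi-byte form that a naive byte-level filter will not match. A minimal Python sketch, assuming the form data arrives as raw bytes and is supposed to be UTF-8 (the helper name decode_form_input is made up for illustration): decode strictly before any tag filtering, so malformed input is rejected outright.

```python
def decode_form_input(raw: bytes) -> str:
    """Decode raw form bytes strictly as UTF-8 before any tag filtering.

    Hypothetical helper: raises UnicodeDecodeError on malformed input,
    including overlong sequences, instead of guessing at the meaning.
    """
    return raw.decode("utf-8", errors="strict")


# b"\xc0\xbc" is an overlong (invalid) two-byte encoding of "<".
try:
    decode_form_input(b"\xc0\xbcscript\xc0\xbe")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(rejected)
```

This only covers malformed UTF-8. Other encodings (UTF-7, for example) need the page's charset to be declared explicitly so the browser and the filter agree on how the bytes are read.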
AjaxNinja
Member
28 Points
12 Posts
Re: Strip and clean HTML, dangerous att and script tags, but allow certain tags?
Jul 28, 2007 12:36 AM
I was wondering the same thing actually... Does the XSS library or the tool mentioned by the OP allow for HTML whitelisting?
TATWORTH
All-Star
72415 Points
14017 Posts
MVP
Re: Strip and clean HTML, dangerous att and script tags, but allow certain tags?
Oct 27, 2008 04:46 AM
If you require a white-list function for checking HTML, look at the IsValidHtmlFragment function, part of the CommonData library at http://www.CodePlex.Com/CommonData
Click "Mark as Answer" on the post that helped you.
This earns you a point and marks your thread as Resolved so we will all know you have been helped.
FAQ on the correct forum http://forums.asp.net/p/1337412/2699239.aspx#2699239
Nordes
Member
12 Points
6 Posts
Re: Strip and clean HTML, dangerous att and script tags, but allow certain tags?
Nov 03, 2008 04:53 PM
I've found this link to the Anti-XSS project. You should take a look at this article; it does what you may want to do.
http://blogs.msdn.com/cisg/archive/2008/08/27/what-does-anti-xss-offer-for-html-sanitization.aspx
Blog: http://nordz.sauleil.com/