Strip and clean HTML, dangerous attributes and script tags, but allow certain tags?

Last post 07-27-2007 8:36 PM by AjaxNinja. 2 replies.

  • Strip and clean HTML, dangerous attributes and script tags, but allow certain tags?

    01-19-2007, 12:05 PM
    • xzg3
    • Joined on 02-28-2005, 11:33 AM
    • Posts 176

    Hello,

    This library sounds interesting. I am wondering if it can be configured to do something like the following:

    I'd like to find a way to filter the input from web forms so that a small, explicitly defined subset of HTML tags and attributes on an allow list (whitelist) gets through, and anything not on the list is excised.

    Basically, I would like to be able to specify an "allow" list that might contain B, I, U, TABLE, TD and TR, plus a large number of attributes, excluding, of course, onmouse* and on* handlers in general.

    I believe I could do this with Html Agility Pack: http://www.codeplex.com/htmlagilitypack. However, I also noticed that the examples for the library showed how even src= is a dangerous attribute, so that really stinks. Perhaps stripping explicitly dangerous tags first, and then _also_ running that output through the library would be the solution.
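
    To illustrate the src= concern: the danger usually lies in the URL scheme (javascript:, data: and so on) rather than in the attribute name. A minimal sketch follows, written in Python purely for illustration since the thread is about ASP.NET/C#. The helper name is_safe_url and the set of allowed schemes are assumptions, not part of any library mentioned here, and the value is assumed to be entity-decoded already.

```python
import re

# Assumption: only these schemes are acceptable in src/href values.
ALLOWED_SCHEMES = {"http", "https"}

# A URL scheme is a letter followed by letters, digits, "+", "." or "-".
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")


def is_safe_url(value: str) -> bool:
    """Hypothetical helper: accept relative URLs and allow-listed schemes only."""
    # Drop whitespace and control characters before checking, so that
    # inputs like "java\tscript:" cannot hide the scheme from the check.
    cleaned = "".join(ch for ch in value if ord(ch) > 0x20 and ord(ch) != 0x7F)
    match = _SCHEME_RE.match(cleaned)
    if match:
        return match.group(1).lower() in ALLOWED_SCHEMES
    # No well-formed scheme: treat as relative, but stay conservative and
    # reject anything with a colon before the first "/", "?" or "#".
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    return ":" not in head
```

    This only shows the shape of the check. It is deliberately conservative and is not a complete defence on its own.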
     
    Even so, I wondered if anyone had written or come across something that is forward-only and does not parse the content into an object tree the way that Agility Pack does, since I'm not really concerned with well-formedness, just that absolutely no potentially destructive script or object tags or attributes get through.
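
    As a rough sketch of what a forward-only allow-list filter could look like, here is one built on the event-based html.parser module from the Python standard library, which reports tags as it scans and never builds a tree. Python is used only for illustration (the thread is about ASP.NET/C#). The class name AllowListFilter, the function sanitize and the attribute list are assumptions made for this example. URL-bearing attributes such as src and href are simply left off the list.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "u", "table", "tr", "td"}
# Assumption: a few harmless, non-URL attributes. on*, style, src, href
# are dropped because they are not on this list.
ALLOWED_ATTRS = {"colspan", "rowspan", "align", "width"}
# Tags whose text content is discarded along with the tag itself.
DROP_WITH_CONTENT = {"script", "style"}


class AllowListFilter(HTMLParser):
    """Hypothetical forward-only filter: emits only allow-listed markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped; its text content is still kept
        kept = []
        for name, value in attrs:
            if name in ALLOWED_ATTRS:
                kept.append(' %s="%s"' % (name, escape(value or "", quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # Text arrives entity-decoded, so re-escape it on the way out.
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    parser = AllowListFilter()
    parser.feed(html_text)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.out)
```

    For example, sanitize('<b onclick="x()">hi</b><script>alert(1)</script>ok') drops the onclick attribute and the whole script element. Comments and declarations are dropped too, because the parser's default handlers ignore them. This is a sketch under the stated assumptions and has not been vetted as a production sanitizer.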

    Someone gave me a link to a sample on 4GuysFromRolla, but the solution was not rigorous enough.
     
    Thank you,

    Josh
    ASP.NET/C# Developer
  • Re: Strip and clean HTML, dangerous attributes and script tags, but allow certain tags?

    06-28-2007, 1:35 PM
    • nightzeus
    • Joined on 06-07-2004, 11:26 AM
    • Posts 13

    If you find something, let me know. :)

    I know how to filter manually; however, it's a pain in the azz, especially when I do not know EVERY dangerous tag and attribute. Variable-width encodings complicate things as well.

  • Re: Strip and clean HTML, dangerous attributes and script tags, but allow certain tags?

    07-27-2007, 8:36 PM
    • AjaxNinja
    • Joined on 07-25-2007, 8:43 PM
    • Nashville, TN
    • Posts 12

    I was wondering the same thing actually... Does the XSS library or the tool mentioned by the OP allow for HTML whitelisting?

Microsoft Communities