I'm talking to Phil about an operator overload that could do just this - it's a pretty sweeping change but I think it makes sense. So one of the two could happen, we could overload ! to encode everything, leaving = as unencoded, or the opposite.
Rails uses "h" as they prefer not to get in the way (it seems).
The error-message scenario you mention is a good one to consider, but please recognise this is just a special case. The protection you're proposing isn't going to help if, rather than using UpdateFrom(), the coder has user myObject.Value = Request.Form["something"],
nor will it help if they are echoing back something from the querystring, from a cookie, a user-agent string or whatever. These are more common cases.
In other words, this is a bigger problem and UpdateFrom()-with-encoding doesn't help, it just muddies the waters and prevents anyone from learning the rule that output should be encoded. I hate the idea of that confusion (not to mention the basic architectural
wrongness of it) - pre-encoding sounds like a disaster in the making. There are so many benefits to the encode-on-display method that I hope you don't let this odd special case get in the way. I appreciate your earlier points that you did propose to enable
an option not to encode, but what many of us have learned over the years is that the
only thing that matters is defaults - this isn't the open-source world where devs make intelligent choices on their own, remember!
Coming back to the idea of changing the meaning of <%= ... %>, which would actually solve the problem in a much bigger way, I've written up
some ideas in a blog post, including a demo implementation of the behaviour and mechanism for retaining the same MVC Toolkit syntax. Not sure what you meant about <pre> and <code> blocks - could you elaborate?
Is a change to <%= ... %> too radical a step, or does the MVC framework give you an opportunity to truly improve matters?
For all to know - I've been on an internal discussion for the last day on this, and this thread is being followed :). This discussion means something for all who feel you're not being heard :).
SteveSanderson
The protection you're proposing isn't going to help if, rather than using UpdateFrom()...
Yes - agreed.
SteveSanderson
pre-encoding sounds like a disaster in the making
Again - I understand this - and I'm suggesting we force anything on anyone. All I want at the end of the day is the most secure solution (primarily) that is as close to acceptable and reasonable as we can get. Security in some ways leads to more work.
SteveSanderson
only thing that matters is defaults - this isn't the open-source world where devs make intelligent choices on their own, remember!
Which is my point entirely. If you default to the secure side, you're safer. I understand (once again) the ramifications of that thought and please know it's part of the ongoing discussion. One solution offered is to not have a default and go with a 50/50
option: UpdateFromEncoded() and UpdateFromRaw() - naming TBD :).
SteveSanderson
Is a change to <%= ... %> too radical a step, or does the MVC framework give you an opportunity to truly improve matters?
Initially I was thinking this might be doable but it actually can be problematic. I've asked Phil to weigh in on it so I'll let him respond.
As people have said encoding just in UpdateFrom leaves many other scenarios and avenues open for attack so we need a broader solution.
When I argued for <%= %> encoding at MS last month I was told that this would likely be out of scope because it would require work on the ASPX engine.
I'm not sure what magic Steve has come up with on this SafeEncodingHelper but it looks like something we should be investigating if it avoids such changes.
The ideal scenario I put forth was that:
<%= %> should call a method such as Encode(string output) on ViewPage and that a new HtmlViewPage class becomes the default which overrides this to call HttpUtility.HtmlEncode.
The advantage is that when writing views to create non-HTML content you could use the same syntax with an alternate encoding mechanism, i.e
CsvViewPage that would implement an Encode method to escape double-quotes and commas etc.
I personally agree that incoming data should not be encoded automatically in general. I've worked on multiple systems in which the same data needed to be formatted for multiple outputs, not just HTML. From
my experience it's actually quite common.
Aside: One crazy idea is to have an encoded column and a raw column for each displayable field. I don't like it personally, but just tossing that out there.
So let's focus on the output situation.
The question that was asked is if we can override <%= %> to Html Encode output.
Right now, the ASP.NET page parser converts <%= "Foo" %> to a call to Response.Write("Foo"); I don't think we want to change Response.Write() to default to HTML encoding because that would cause the entire page template to be an unencoded mess since literals
within an aspx page are clumped together into Response.Write calls.
There are two approaches I can think of that might not require changes to the underlying ASP.NET forms engine (which would be a huge breaking change).
1. Use the approach that Steve (last name?)
does here. He effectively implements his own CSharpCodeCompiler to intercept the code generation phase I mentioned. Very neat approach.
2. Another theoretical approach (as in I think it would work, but I haven't tried it) is to hook in your own
PageParserFilter. Ideally, this approach might allow for introducing an alternative syntax for unencoded HTML such as the <%!= Html.TextBox(...) %>
The next question is whether the ASP.NET team should make this the default behavior for ASP.NET MVC?
From this thread, I get the vague sense that people here are saying yes! [;)]
One question for you:
Would it make more sense to not change the semantics of <%= and introduce <%!= as the encoded version? Or perhaps <%%= (since % is an encoding character and might make more sense semantically?) The reason I bring this up
is that <%= has a looooong history in multiple template engines, not just ASP.NET. For example, RoR does not automatically encode <%=.
What would cause the least surprise for developers?
As for whether we include it in ASP.NET MVC, I'm sure there are issues I haven't even thought about. I personally think it's worth considering if its possible without huge underlying changes. It does seem to adhere to the general security principle of exposing
the least surface area and requiring opt-in when exposing something unsafe.
ASP.NET MVC
Phil Haack (http://haacked.com/)
Senior Program Manager, Microsoft
I'm for changing <%= default behaviour to cause HTML encoding by default because I believe Microsoft should be secure by default which I thought was their stance since IIS6 - even if it does mean breaking or knowing things (i.e. switching on ASPX ability)
If they are worried about breaking existing solutions then switch it on in the web.config as default for new projects via the MVC VS templates.
How easy is it to make it contextually aware? I deally if <%= %> is filling in an attribute of a HTML tag it should end up calling HttpUtility.HtmlAttributeEncode rather than ending up at HttpUtility.HtmlEncode.
I covered
various ASP.NET HTML injection pitfalls at my blog. The proposed solution covers #1 and potentially could cover #2. Microsoft needs to get a patch out for #3 (a bug) and consider a patch for #4/5 (probably by design).
What would cause the least surprise for developers?
I understand the argument for making it the default (<%= %>), but I think the least amount of surprise will be introducing <%!= %> as encoding, since none of the other frameworks do encoding by default. would be a surprise for someone not to expect that.
It would be of both least surprise and of least use letting developers continue unaware and unhindered in developing web applications open to vulnerability.
If security is considered nothing more than an inconvenience by all means keep it quiet and out the way.
Personally security rates more highly than a 5 minute learning curve.
Good discussion guys - I think we're coming around to the gravity of the whole thing. I wrote a post about it, diving into a bit more of my position on this:
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 06:14 PM|LINK
I'm talking to Phil about an operator overload that could do just this - it's a pretty sweeping change but I think it makes sense. So one of the two could happen, we could overload ! to encode everything, leaving = as unencoded, or the opposite.
Rails uses "h" as they prefer not to get in the way (it seems).
Does anyone have a feeling on this?
SteveSanders...
Member
432 Points
119 Posts
Microsoft
Re: UpdateFrom and Encoding
Dec 19, 2007 06:19 PM|LINK
Whew, glad we got past all the name-calling!
The error-message scenario you mention is a good one to consider, but please recognise this is just a special case. The protection you're proposing isn't going to help if, rather than using UpdateFrom(), the coder has user myObject.Value = Request.Form["something"], nor will it help if they are echoing back something from the querystring, from a cookie, a user-agent string or whatever. These are more common cases.
In other words, this is a bigger problem and UpdateFrom()-with-encoding doesn't help, it just muddies the waters and prevents anyone from learning the rule that output should be encoded. I hate the idea of that confusion (not to mention the basic architectural wrongness of it) - pre-encoding sounds like a disaster in the making. There are so many benefits to the encode-on-display method that I hope you don't let this odd special case get in the way. I appreciate your earlier points that you did propose to enable an option not to encode, but what many of us have learned over the years is that the only thing that matters is defaults - this isn't the open-source world where devs make intelligent choices on their own, remember!
Coming back to the idea of changing the meaning of <%= ... %>, which would actually solve the problem in a much bigger way, I've written up some ideas in a blog post, including a demo implementation of the behaviour and mechanism for retaining the same MVC Toolkit syntax. Not sure what you meant about <pre> and <code> blocks - could you elaborate?
Is a change to <%= ... %> too radical a step, or does the MVC framework give you an opportunity to truly improve matters?
http://blog.codeville.net/
JeremyS
Member
506 Points
99 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 06:26 PM|LINK
My personal preference would be:
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 06:51 PM|LINK
For all to know - I've been on an internal discussion for the last day on this, and this thread is being followed :). This discussion means something for all who feel you're not being heard :).
Yes - agreed.
Again - I understand this - and I'm suggesting we force anything on anyone. All I want at the end of the day is the most secure solution (primarily) that is as close to acceptable and reasonable as we can get. Security in some ways leads to more work.
Which is my point entirely. If you default to the secure side, you're safer. I understand (once again) the ramifications of that thought and please know it's part of the ongoing discussion. One solution offered is to not have a default and go with a 50/50 option: UpdateFromEncoded() and UpdateFromRaw() - naming TBD :).
Initially I was thinking this might be doable but it actually can be problematic. I've asked Phil to weigh in on it so I'll let him respond.
damieng
Member
38 Points
16 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 06:52 PM|LINK
As people have said encoding just in UpdateFrom leaves many other scenarios and avenues open for attack so we need a broader solution.
When I argued for <%= %> encoding at MS last month I was told that this would likely be out of scope because it would require work on the ASPX engine.
I'm not sure what magic Steve has come up with on this SafeEncodingHelper but it looks like something we should be investigating if it avoids such changes.
The ideal scenario I put forth was that:
<%= %> should call a method such as Encode(string output) on ViewPage and that a new HtmlViewPage class becomes the default which overrides this to call HttpUtility.HtmlEncode.
The advantage is that when writing views to create non-HTML content you could use the same syntax with an alternate encoding mechanism, i.e
CsvViewPage that would implement an Encode method to escape double-quotes and commas etc.
Haacked
Contributor
6901 Points
412 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 07:23 PM|LINK
I personally agree that incoming data should not be encoded automatically in general. I've worked on multiple systems in which the same data needed to be formatted for multiple outputs, not just HTML. From my experience it's actually quite common.
So let's focus on the output situation.
The question that was asked is if we can override <%= %> to Html Encode output.
Right now, the ASP.NET page parser converts <%= "Foo" %> to a call to Response.Write("Foo"); I don't think we want to change Response.Write() to default to HTML encoding because that would cause the entire page template to be an unencoded mess since literals within an aspx page are clumped together into Response.Write calls.
There are two approaches I can think of that might not require changes to the underlying ASP.NET forms engine (which would be a huge breaking change).
1. Use the approach that Steve (last name?) does here. He effectively implements his own CSharpCodeCompiler to intercept the code generation phase I mentioned. Very neat approach.
2. Another theoretical approach (as in I think it would work, but I haven't tried it) is to hook in your own PageParserFilter. Ideally, this approach might allow for introducing an alternative syntax for unencoded HTML such as the <%!= Html.TextBox(...) %>
The next question is whether the ASP.NET team should make this the default behavior for ASP.NET MVC?
From this thread, I get the vague sense that people here are saying yes! [;)]
One question for you:
Would it make more sense to not change the semantics of <%= and introduce <%!= as the encoded version? Or perhaps <%%= (since % is an encoding character and might make more sense semantically?) The reason I bring this up is that <%= has a looooong history in multiple template engines, not just ASP.NET. For example, RoR does not automatically encode <%=.
What would cause the least surprise for developers?
As for whether we include it in ASP.NET MVC, I'm sure there are issues I haven't even thought about. I personally think it's worth considering if its possible without huge underlying changes. It does seem to adhere to the general security principle of exposing the least surface area and requiring opt-in when exposing something unsafe.
ASP.NET MVC
Senior Program Manager, Microsoft
What wouldn’t you do for a Klondike bar?
damieng
Member
38 Points
16 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 07:35 PM|LINK
I'm for changing <%= default behaviour to cause HTML encoding by default because I believe Microsoft should be secure by default which I thought was their stance since IIS6 - even if it does mean breaking or knowing things (i.e. switching on ASPX ability)
If they are worried about breaking existing solutions then switch it on in the web.config as default for new projects via the MVC VS templates.
How easy is it to make it contextually aware? I deally if <%= %> is filling in an attribute of a HTML tag it should end up calling HttpUtility.HtmlAttributeEncode rather than ending up at HttpUtility.HtmlEncode.
I covered various ASP.NET HTML injection pitfalls at my blog. The proposed solution covers #1 and potentially could cover #2. Microsoft needs to get a patch out for #3 (a bug) and consider a patch for #4/5 (probably by design).
[)amien
shinakuma
Member
378 Points
92 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 08:54 PM|LINK
I understand the argument for making it the default (<%= %>), but I think the least amount of surprise will be introducing <%!= %> as encoding, since none of the other frameworks do encoding by default. would be a surprise for someone not to expect that.
damieng
Member
38 Points
16 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 09:00 PM|LINK
It would be of both least surprise and of least use letting developers continue unaware and unhindered in developing web applications open to vulnerability.
If security is considered nothing more than an inconvenience by all means keep it quiet and out the way.
Personally security rates more highly than a 5 minute learning curve.
[)amien
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 09:47 PM|LINK
Good discussion guys - I think we're coming around to the gravity of the whole thing. I wrote a post about it, diving into a bit more of my position on this:
http://blog.wekeroad.com/2007/12/19/in-which-we-discuss-html-encoding/
I don't want anyone here to think that
The thing that we (and our larger developer audience) need to consider is the default stance - and then how to execute it. Let's keep this rolling...