nagir
Member
162 Points
184 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 10:07 PM|LINK
I think <%=o.Subject%> is not related in any way to the content of o.Subject. It is only the presentation of o.Subject. Nothing more.
If we force UI (HTML) related concerns onto o.Subject (via automatic encoding), the separation of concerns in MVC seems to be broken a bit.
Developers should control XSS problems themselves and manually encode <%=o.Subject%>. No single framework can handle all XSS problems, and any developer will still be able to emit unsafe HTML.
I propose having some kind of extension:
<%= o.Subject.ToHtmlText()%>
Advantages:
- Easy to use.
- Controllable (sometimes you simply should not encode).
Disadvantages:
Regards,
Dmitriy.
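For illustration, here is a minimal sketch of what the ToHtmlText() extension proposed in this post could look like. The class name HtmlTextExtensions is made up for the example, and it leans on WebUtility.HtmlEncode, which needs .NET 4.0 or later; on older framework versions, HttpUtility.HtmlEncode from System.Web would be the likely stand-in.

```csharp
using System;
using System.Net;

public static class HtmlTextExtensions
{
    // Hypothetical helper named after the proposal above; not a framework API.
    // Encodes characters that are significant in HTML (<, >, &, quotes).
    public static string ToHtmlText(this string value)
    {
        // Treat null as empty so callers always get a string back.
        return WebUtility.HtmlEncode(value ?? string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        string subject = "<b>Hello & welcome</b>";
        Console.WriteLine(subject.ToHtmlText());
        // prints: &lt;b&gt;Hello &amp; welcome&lt;/b&gt;
    }
}
```

In a view this would be written as <%= o.Subject.ToHtmlText() %>, provided the namespace holding the extension class is imported into the page.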
ben2005uk
Member
95 Points
23 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 11:25 PM|LINK
I've been thinking about this for a while now.
When I looked at Ruby on Rails and security, I was surprised to see that it didn't automatically encode the output. It was a perfect opportunity to start making websites secure by default, and sadly, even with h() being very simple to implement, in books it's still only discussed around chapter 6-7.
I'm for secure by default; however, it does come at the cost of developers maybe not being aware of the issues involved. Then again, those developers would also be the ones releasing insecure applications anyway. By being secure by default, we are protecting against those people, or should I say we are protecting users from those people.
I also don't think we should be against changing the way <%= works just because of the way it worked in the past or in other frameworks. If it's wrong, we should change it.
For me, I would like to see:
<%= Encoded %> <%!= Not Encoded %>
ASP.NET 2.0 auto-encodes for GridView etc. and no one seems to complain.
Oh, on a side note: HttpUtility.HtmlEncode is not the preferred way of securing your application, as it uses blacklisting instead of whitelisting. There are no reported problems in 2.0 (there were in 1.1), but Microsoft has released the AntiXSS Library as the preferred approach... :)
http://Blog.BenHall.me.uk
Haacked
Contributor
6901 Points
412 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 11:27 PM|LINK
@Damien
The developer does have to take some culpability in security as well. I don't think it's a fair assessment to assume not wanting to change the behavior of <%= is due to a lack of awareness or attention to security. It's not such a black and white decision.
We've had some interesting discussions internally sparked by this thread. One point I should bring up is that <%= %> is neither inherently secure nor insecure. It's a shorthand for <% Response.Write() %>. Should we automatically encode Response.Writes now? Certainly not.
Here's another thing to chew on. Suppose we change <%= to HTML-encode just for MVC. What happens when you have projects with both WebForms and MVC templates in the same project?
Inconsistency is just as bad (if not worse) for security as requiring opt-out. Now on some pages <%= automatically encodes, while on other pages <%= doesn't. That's not a good situation.
One might conclude that <%= should be a universal change across ASP.NET. While I don't take security lightly, I also don't take a massive breaking change across nearly every site that uses ASP.NET lightly either.
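Here are some ideas for improving the situation:
- You as a developer can use the approach that David wrote up to change <%= yourself.
- An extension method ToEncode() or ToHtmlEncode() is pretty nice, as it's discoverable and only two more key presses (with VS IntelliSense).
- Better code samples and more scrutiny of code samples. Samples should not include XSS flaws, and should carry disclaimers about not being used in production.
- More education.
- Studies. How often are XSS exploits discovered and exploited on ASP.NET? This is important data for this discussion.
- Declarative controls for MVC that do the right thing in regards to encoding.
- Good XSS analysis tools. After all, Encode() won't catch everything.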
After all, even if we did change <%=, we wouldn't really have prevented some very common XSS attack avenues. For example, sites that allow users to submit HTML and then display that HTML. Encoding doesn't help in those cases. Only HTML scrubbing, white-listed HTML tags, etc. do.
For example, we might consider a ToSafeHtml() extension method that doesn't encode HTML, but strips out common exploit tags. Or .ToSafeHtml(string[] allowedTags);
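As a rough sketch of what a ToSafeHtml(allowedTags) helper might look like (nothing here is an existing framework API, and the class name is invented): rather than stripping known-bad tags, this version encodes everything and then restores only bare, attribute-free tags whose names are on the allow list, which is a stricter technique than the tag stripping described above.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class SafeHtmlExtensions
{
    // Hypothetical sketch: encode the whole input, then re-enable only
    // allow-listed tags that carry no attributes.
    public static string ToSafeHtml(this string input, params string[] allowedTags)
    {
        string encoded = WebUtility.HtmlEncode(input ?? string.Empty);
        if (allowedTags == null) return encoded;

        foreach (string tag in allowedTags)
        {
            // Skip anything that is not a plain tag name such as "b" or "em".
            if (tag == null || !Regex.IsMatch(tag, @"\A[A-Za-z][A-Za-z0-9]*\z"))
                continue;

            // Matches the encoded form of <tag> or </tag>; tags with attributes never match.
            string pattern = "&lt;(/?)" + tag + "&gt;";
            encoded = Regex.Replace(encoded, pattern, "<${1}" + tag + ">", RegexOptions.IgnoreCase);
        }
        return encoded;
    }
}

public static class Program
{
    public static void Main()
    {
        string comment = "<b>bold</b> <script>alert(1)</script>";
        Console.WriteLine(comment.ToSafeHtml("b", "i"));
        // prints: <b>bold</b> &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```

One known gap in this sketch: it does not check that restored tags are balanced, so the output can be malformed HTML even though no disallowed markup gets through.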
In any case, this is a very interesting discussion and thinking about helping to prevent XSS is certainly on our radar. Ultimately though, I think providing consistency and not too much magic under the hood is required.
We could do all the magic in the world, and a bad developer will always be able to write a page with an exploit. By at least remaining consistent, good developers know what to expect and can apply tried and true practices for writing secure web pages without worrying about the framework circumventing their efforts.
Speculating, but I imagine that's why RoR and other frameworks chose to not encode <%=.
Phil Haack (http://haacked.com/)
Senior Program Manager, Microsoft
What wouldn’t you do for a Klondike bar?
abombss
Member
575 Points
164 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 11:40 PM|LINK
The more I read the thread and think about it, the more it makes sense not to change the default behavior.
Haacked: "The developer does have to take some culpability in security as well"
+1
My feeling is, keep it as is and don't encode any input. If Microsoft wanted to include an extension for encoding with <%!=, that's cool; otherwise I am sure MVC Contrib could pick up something like that too. This is something that should be consistent with other frameworks, not innovative or creative... changing this will just make things confusing.
Another good tool would be an HttpModule that could filter output and throw warnings for potential XSS attacks when in a test environment. Not perfect but certainly better than nothing.
P.S. If you are thinking about working some magic with the Asp.Net compiler think about this too.
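The output-filtering idea in this post could start from a check like the one sketched below. It is only the detection heuristic, not an HttpModule: a real module would still have to capture the rendered response and the request values (for example through a response filter), and the names XssWarningScanner and FindUnencodedEchoes are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class XssWarningScanner
{
    private static readonly char[] MarkupChars = { '<', '>', '"', '\'' };

    // Returns request values that contain markup characters and appear
    // verbatim (unencoded) in the rendered page. A heuristic for test runs only.
    public static List<string> FindUnencodedEchoes(string renderedHtml, IEnumerable<string> requestValues)
    {
        var suspects = new List<string>();
        foreach (string value in requestValues)
        {
            if (string.IsNullOrEmpty(value)) continue;

            bool hasMarkup = value.IndexOfAny(MarkupChars) >= 0;
            if (hasMarkup && renderedHtml.IndexOf(value, StringComparison.Ordinal) >= 0)
                suspects.Add(value);
        }
        return suspects;
    }
}

public static class Program
{
    public static void Main()
    {
        string page = "<p>Hello <script>alert(1)</script></p><p>&lt;b&gt;ok&lt;/b&gt;</p>";
        string[] inputs = { "<script>alert(1)</script>", "<b>ok</b>", "plain" };

        foreach (string suspect in XssWarningScanner.FindUnencodedEchoes(page, inputs))
            Console.WriteLine("Possible unencoded output: " + suspect);
        // prints: Possible unencoded output: <script>alert(1)</script>
    }
}
```

As the post says, this is not perfect: it only catches input echoed back unchanged in the same response, so stored or transformed input would slip past it.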
ben2005uk
Member
95 Points
23 Posts
Re: UpdateFrom and Encoding
Dec 20, 2007 12:04 AM|LINK
Phil, you make some very good points. No solution is fully secure (ValidateRequest has a number of holes), and I have to say I didn't think about WebForms + MVC in the same project, which does put a different view on the idea of changing <%=.
The solution needs to be simple and easy for the newbie to find. If it's an external method (like HtmlEncode), then they are unlikely to come across it unless they go looking at security, and I wonder how many developers bypass that topic. I have seen a number of Rails applications which have not used h() simply because the developers weren't aware of XSS, so even making it simple doesn't mean it will be used. This then comes down to your point about making sure all the samples are secure. Too many books publish security flaws in their samples.
I tried to ask the Rails team about their decision not to be secure by default. I forget the exact quote, but I think it was something like: it's up to the developers to make their applications secure and follow best practices; we don't want to enforce what they should and shouldn't do. (from memory)
http://Blog.BenHall.me.uk
damieng
Member
38 Points
16 Posts
Re: UpdateFrom and Encoding
Dec 20, 2007 12:54 AM|LINK
@Phil:
Absolutely the developer should be taking some responsibility but one oversight is all it takes. I've been encoding my HTML since '99 and have still missed it a few times in ASP and now in ASP.NET because of the src/img/href issue in HtmlControls.
There are so many changes between the MVC stack and WebForms, such as losing server-side controls, viewstate, postback and events, that adding encoding by default to <%= %> is going to be a footnote in comparison.
You're right, we shouldn't encode Response.Write, but I think we do need a strategy for building HTML from inside code for controls/helpers. I think the HtmlControls are a good start, although better constructors and that bug fix would be needed.
I think a solution such as Steve's, whereby you can specify an option in the web.config to enable it for both MVC and ASP.NET applications, is the way to go, switched on for MVC because it's new and it's the right thing to do. Whether it is switched on by default for new ASP.NET projects is something for the Microsoft teams to think about, but being able to switch it on for those is a good start.
I'm not sure ToEncode or ToHtmlEncode add anything other than encouraging people to write out HTML from code using strings rather than a recommended set of controls designed for building HTML. It's still easy to get things wrong, such as standards compliance, and the whole thing would fall under the 'Primitive Obsession' smell because we are thinking too much about the serialisation format (string) rather than a logical abstraction (DOM / HtmlControls).
Robust supplied .NET functionality for white-listing HTML would be fantastic, +1 on that.
I don't think we can draw from the RoR philosophy and do nothing. As RoR matures as a platform I think they are going to come back and regret that decision - I bet there are exploitable RoR systems out there already.
[)amien
SteveSanders...
Member
432 Points
119 Posts
Microsoft
Re: UpdateFrom and Encoding
Dec 20, 2007 04:03 PM|LINK
Thanks again to everyone putting effort into discussing and thinking this through.
I'd love it if you achieved "minimum keystrokes == maximum safety", so I have a strong preference that <%= ... %> should encode its output.
Yes, it is a breaking change, but ASP.NET has had breaking changes in the past and the world didn't end. When request validation was shipped as an automatic update in .NET 1.1, the first my last company knew about it was when the production web app started breaking. That was unpleasant, but they changed the web.config and got on with business. What you're proposing here is far less dramatic - this is a whole new platform which nobody even has in production yet. And you could enable changing <%= ... %> back to its old behaviour in web.config.
As Damien points out, this is nothing compared to the breaking of nearly all existing web controls, which has no workaround (which I totally understand and support). If this isn't your perfect opportunity to make a big improvement to security, I fear there will never be one.
What do you think about the idea of not actually forcing the developer to choose between encoding and not encoding - and letting the framework make the choice by default? It could be determined by the type of the evaluation result. If it's a plain old string, encode it, if it's a special variant on string (let's call it RawHtml), then don't encode it. The HTML helper methods would return RawHtml of course.
Sounds weird, but the key point is that it makes it very difficult for the developer to do the wrong thing. They just write <%= ... %> and the right thing happens, except in the rare case where they want to print HTML verbatim *without* using a HTML helper method, in which case they write <%= (RawHtml)value %>, or <%= value.toRawHtml() %> - a deliberately non-concise syntax to discourage unnecessary use. Even then it chose the safe default until they added the typecast. I know it's less obvious at first but it's very simple when you know it. This is what I implemented in my demo by the way. In my view this is more enlightened than NVelocity's style of forcing the developer to manually choose between $ and $! all the time (they're just one keystroke different, you can pick the wrong one without thinking about it).
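A minimal sketch of the type-based choice described here, under stated assumptions: RawHtml is the hypothetical marker type named in this post, and ViewOutput.Render is an invented stand-in for whatever <%= ... %> would call. This is not the code from the demo, just one way the idea could be expressed.

```csharp
using System;
using System.Net;

// Hypothetical marker type: wraps a string the developer has declared safe to emit verbatim.
public sealed class RawHtml
{
    private readonly string html;

    public RawHtml(string html) { this.html = html ?? string.Empty; }

    public override string ToString() { return html; }

    // Enables the deliberately explicit (RawHtml)value syntax.
    public static explicit operator RawHtml(string html) { return new RawHtml(html); }
}

public static class ViewOutput
{
    // Stand-in for what <%= ... %> would do: pick encoding from the value's runtime type.
    public static string Render(object value)
    {
        if (value == null) return string.Empty;

        RawHtml raw = value as RawHtml;
        if (raw != null) return raw.ToString();          // trusted markup passes through

        return WebUtility.HtmlEncode(value.ToString());  // everything else is encoded
    }
}

public static class Program
{
    public static void Main()
    {
        string userInput = "<i>hi</i>";
        Console.WriteLine(ViewOutput.Render(userInput));          // &lt;i&gt;hi&lt;/i&gt;
        Console.WriteLine(ViewOutput.Render((RawHtml)userInput)); // <i>hi</i>
    }
}
```

HTML helper methods would return RawHtml in this scheme, so their output passes through untouched while plain strings are encoded without the developer having to choose.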
But even if you go with the manual choice approach (with <%!= ... %>), this could be one of the best features of ASP.NET MVC.
http://blog.codeville.net/
ghotiman
Member
205 Points
57 Posts
Re: UpdateFrom and Encoding
Dec 20, 2007 04:57 PM|LINK
I'm not too fond of encoding the input by default, for all the reasons already mentioned. I like the idea of adding to <%=%>, but not changing it. As others have said, that would cause some confusion. If something like <%% = %%> did HTML encoding by default, it would not break the old model, and it would get used since it's easy. Also, it would be an opportunity to educate developers on XSS. Everyone would say to always use <%% = %%>, and when devs asked why, they could learn about XSS. Education on the matter is probably more important than the framework handling everything.
I also like the idea of having extension methods to handle the different encodings for HTML, XML, and JavaScript. They are discoverable and easy to use.
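To make the per-context idea concrete, here is a small sketch with one extension method per output format. The method and class names are hypothetical; each one simply wraps an existing .NET encoder, and HttpUtility.JavaScriptStringEncode assumes .NET 4.0 or later with a reference to System.Web.

```csharp
using System;
using System.Net;
using System.Security;
using System.Web;

public static class EncodingExtensions
{
    // For text placed into HTML markup.
    public static string ToHtml(this string value)
    {
        return WebUtility.HtmlEncode(value ?? string.Empty);
    }

    // For text placed into XML content or attribute values.
    public static string ToXml(this string value)
    {
        // SecurityElement.Escape replaces <, >, &, " and ' with XML entities.
        return SecurityElement.Escape(value ?? string.Empty);
    }

    // For text placed inside a JavaScript string literal.
    public static string ToJavaScript(this string value)
    {
        return HttpUtility.JavaScriptStringEncode(value ?? string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        string name = "Tom & \"Jerry\" </script>";
        Console.WriteLine(name.ToHtml());
        Console.WriteLine(name.ToXml());
        Console.WriteLine(name.ToJavaScript());
    }
}
```

The point of separate methods is that the right encoding depends on where the value lands, so a single catch-all encoder would be wrong for at least one of these contexts.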
SteveSanders...
Member
6 Points
3 Posts
Re: UpdateFrom and Encoding
Dec 20, 2007 05:15 PM|LINK
Ghotiman, I certainly agree that education is important. I'd be sad if we force people to type a big trainwreck of punctuation (<%% ... %%>) just to get the behaviour they should "always use". If other developers are as lazy as me, they're just going to press the fewest number of keys needed to say their project is finished, which means <%= ... %> ("seems to work fine").
ghotiman
Member
205 Points
57 Posts
Re: UpdateFrom and Encoding
Dec 20, 2007 07:22 PM|LINK
SteveSanderson1, I don't care too much what the syntax should be, just that it be different from the current <%= %>. If they change the behavior of <%= %>, then most of the developers who need to be educated on this will never know anything about it. Also, there will either be inconsistent behavior between WebForms and MVC, or old code will break. If there is a new syntax, it can serve as the talking point for XSS. The new syntax should be easy; I think everyone can agree on that. I just think it should be different.