Think about situation when Products are displayed on the web and in the desktop application.
If you will encode INPUT it will display really "incorrect" data in the desktop application.
For example, user edits product online and sets its name to "Visual Studio 2008 <XMAS EDITION>".
And the desktop application will display it as "Visual Studio 2008 <XMAS EDITION>".
So to aviod it, the DESTOP APP SHOULD ALSO DECODE the data which is nonsense unless its not a CMS system.
Gents I think you're misunderstanding me :). First - Damien thanks for weighing in here - much appreciated. Let's break this down a bit:
1) This isn't webforms anymore, and there are no server controls to auto-encode your raw text
2) I fully realize the implication of encoded text in the DB
3) I also fully realize the hole you create when unencoded text is sent to the browser. ScottGu just did it on his blog (yes, I know it was a demo, but how often do we work under the "this is just a quicky" thing)
4) I'm not *forcing* anyone to do anything :)
OK - that said, let me address your issues Steve...
SteveSanderson
As you're in the priviledged position of designing a Microsoft-endorsed framework
I have two jobs really - the first is this Toolkit which isn't necessarily "Microsoft-endorsed" - it's just plain Micrsoft. As such, security is top of mind.
SteveSanderson
If you proceed with the idea "we'll encode the data inbound", you are not just perpetuating ignorance about encoding it outbound, you're actually *enforcing inaction*. It means that developers learn like monkeys they must *not* encode the data they send, because
some black magic somewhere in the system does that and they have to trust it to avoid double-escaping
Monkey's and BlackMagic aside (are we in Oz?) - I hear what you're saying about double-escaping. It's a problem to be sure, and I don't think that giving developers a choice with a default to the secure side of things is *perpetuating ignorance*. If anything,
it's helping the "monkeys" keep from opening their site up to XSS.
SteveSanderson
sharing databases, legacy data storage, and interacting with external systems aren't 1% issues for us
Of course not - and I understand that. But 99% of the developers I need to support don't do this sort of thing. They make the sites that get targetted...
SteveSanderson
Have you done any benchmarking of this
String manipulation comes at a cost and even if it's minor; but this isn't my point - we do encode everything coming through the HtmlHelper bits.
I understand the anxiety, but I don't think you're grokking that I'm offering alternatives here so you can do what you like - specifically turning *off* encoding as-needed. I think *my* default stance here comes from the countless times I've built an application
that I've forgotten to encode - and I've paid for it. You've done it too - so has Damien. The issue here is what does the platform do by default - and I don't agree that I'm *perpetuating ignorance* as you put it.
So - more specifically, you'll be able to do this:
MyObject.UpdateFrom(Request.Form) which will encode everything or
MyObject.UpdateFromRaw(Request.Form)
The difference here is that you're selecting the less-secure method - the choice is yours, as it should be.
Finally - setting <%=%> up to auto-encode doesn't work either because of <pre> and <code> blocks.
If you HTML encode data into the database you force web developers into writing unsafe code because if they do HTML encode now it will be double-encoded and will display as a mess of " etc. all over the screen/output/XML/RSS.
Agreed and I'm not suggesting that this is done *always* - only by *choice* - your choice :).
I totally disagree with HTML-encoding data going *into* the database even as an option or choice.
If by your own measure they aren't going to reuse the database elsewhere then what is wrong with the MVC toolkit helpers just encoding on output to the page? The problem is entirely solved there and then.
By encoding values into the database you cause problems for yourself in the MVC toolkit development because your helpers won't know whether to encode or not on output and nor will anyone else.
You end up protecting a very singular scenario - poorly written sites that deal only with your encoded data. Any other type of data - generated, request objects, sessions, cookies etc. are not protected and well written websites crumble under the double-encoding
problem.
Developers ends up never knowing what to encode on output and what not to once down this path.
The problem is purely one of output and should be treated as such. Any hack to encode data going into the database is going to cause more grief and chaos than it solves.
1) This isn't webforms anymore, and there are no server controls to auto-encode your raw text
So why not extend the asp.net compiler to auto encode <%= %>, if its not webforms no backwards compatibility problems there? If people could make pre-compilers that add runat="server" for <asp: tags I am sure you could come up with something to wrap all
calls <%= %> in Html.Encode(). Or just switch to Boo, it can do it no problem.
robconery
2) I fully realize the implication of encoded text in the DB
No you don't or the community would not have responded the way it did.
robconery
3) I also fully realize the hole you create when unencoded text is sent to the browser. ScottGu just did it on his blog (yes, I know it was a demo, but how often do we work under the "this is just a quicky" thing)
Bad developers, bad testing, bad framework design, maybe just a bug? Don't make the same mistakes you did in webforms fix the real problem. Html Encoding
output not input.
robconery
4) I'm not *forcing* anyone to do anything :)
Yes you are, you are forcing EVERYONE on this thread to call UpdateFromRaw. You are forcing me to write fxcop rules and do code analysis ensuring that I always call UpdateFromRaw and never call UpdateFrom. You are forcing me to educate my developers
on using non standard method calls. Why would a dev know to use UpdateFromRaw that just sounds weird, whats raw? You are forcing me to write some precompiler process to replace all UpdateFrom to UpdateFromRaw because I have no good way of testing that my
database does not contain encoded data. You are forcing microsoft to upset a lot of people by not listening to the community from the begining. Why ask for feedback if you don't listen?
And you may be forcing me to stick with Monorail...
Wow... I need to stop sounding like Bellware [:)]... I guess I would need a little RoR speak to really sound like him.
This is a harsh response please excuse the tone. I appreciate the work the whole team is doing. I want this project to succeed, I just think this is the wrong solution and it is not solving the real problem.
I totally disagree with HTML-encoding data going *into* the database even as an option or choice.
Agreed - 100%. What I'm looking at is the issue where someone does a "myObject.UpdateFrom(Request.Form)" and then spins it back to the browser on error, validation, or...? You have to agree that there is a hole here that I need to consider.
damieng
You end up protecting a very singular scenario - poorly written sites that deal only with your encoded data.
Unfortunately that singular scenario dominates :(.
damieng
The problem is purely one of output and should be treated as such.
I just read your posts on this issue and I have to say you're violating your own rule:
damieng
Vulnerable code often looks like this:
myLabel.Text = Request.Form["Something"];
In this example you show how a developer will take a Request value and spin it back to the page. This is the hole I speak of.
So - I'd like to ask you guys - what is the problem with having two methods? UpdateFrom() and UpdateFromRaw()? From your responses I think you guys are positioning me as dictating "all things
must be encoded" when that's not the case. Would you rather it be UpdateFromEncoded()?
I'd like to focus on this rather than my ignorance if that's OK with you guys ;).
I'd like to focus on this rather than my ignorance if that's OK with you guys ;).
Sorry Rob wasn't trying to beat you up personally, I just feel strongly about this one.
My vote is drop UpdateFrom completely and make two methods, UpdateFromEncoded and UpdateFromDecoded. Keeping UpdateFrom with an unexpected side effect just makes the application more difficult to read, understand, and debug.
No you don't or the community would not have responded the way it did.
I'd like to see if we can avoid inflammatory remarks. You might be frustrated the things I'm mentioning here, but comments like these don't help resolve anything.
abombss
Wow... I need to stop sounding like Bellware
Yep.
abombss
This is a harsh response please excuse the tone. I appreciate the work the whole team is doing. I want this project to succeed, I just think this is the wrong solution and it is not solving the real problem.
Good to hear. Now if you can consider my point of view on this, and that I used to be YOU 2 months ago (before I go this job), we can come to a solution. In reading everything there, can I sum up your post in "I want UpdateFrom() to load unencoded text,
and UpdateFromEncoded() to encode the text" - is that right?
Again I'm going to ask you Good Fellas if you can help me with this :). If, by default, we do an UpdateFrom() that's not encoded, how can we avoid the problem where the user does this:
MyController(){
MyObject o=new MyObject;
o.UpdateFrom(Request.Form);//someone submitted some nasty script
try{
o.Save();//the DB barfed it back because o.Subject was too long
//it contains a <script XSS attack
//bombing our char limit and forcing the response back the browser - very common
}catch(Exception x){
//let the user know
string message="Can't save the data!";
//repopulate
RenderView("Edit",o); //
}
}
This scenario happens a lot, and hopefull the dev just sends the response back to the control using the toolkit:
<%=Html.TextBox("subject",o.Subject)%>
But, they may also output an error message:
<%=o.Subject%> is invalid!
And this would be a problem. I'd like your help with this - this is my conundrum.
This is why it would great to make switch in mvc that <%= wraps the output in HtmlEncode() . It would avoid this problem and force developers to think about when then don't their data encoded versus when it needs to be encoded.
nagir
Member
162 Points
184 Posts
Re: UpdateFrom and Encoding
Dec 18, 2007 11:14 PM|LINK
Rob, I do agree with Steve.
Think about situation when Products are displayed on the web and in the desktop application.
If you will encode INPUT it will display really "incorrect" data in the desktop application.
For example, user edits product online and sets its name to "Visual Studio 2008 <XMAS EDITION>".
And the desktop application will display it as "Visual Studio 2008 <XMAS EDITION>".
So to aviod it, the DESTOP APP SHOULD ALSO DECODE the data which is nonsense unless its not a CMS system.
Regards,
Dmitriy.
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 12:04 AM|LINK
Gents I think you're misunderstanding me :). First - Damien thanks for weighing in here - much appreciated. Let's break this down a bit:
1) This isn't webforms anymore, and there are no server controls to auto-encode your raw text
2) I fully realize the implication of encoded text in the DB
3) I also fully realize the hole you create when unencoded text is sent to the browser. ScottGu just did it on his blog (yes, I know it was a demo, but how often do we work under the "this is just a quicky" thing)
4) I'm not *forcing* anyone to do anything :)
OK - that said, let me address your issues Steve...
I have two jobs really - the first is this Toolkit which isn't necessarily "Microsoft-endorsed" - it's just plain Micrsoft. As such, security is top of mind.
Monkey's and BlackMagic aside (are we in Oz?) - I hear what you're saying about double-escaping. It's a problem to be sure, and I don't think that giving developers a choice with a default to the secure side of things is *perpetuating ignorance*. If anything, it's helping the "monkeys" keep from opening their site up to XSS.
Of course not - and I understand that. But 99% of the developers I need to support don't do this sort of thing. They make the sites that get targetted...
String manipulation comes at a cost and even if it's minor; but this isn't my point - we do encode everything coming through the HtmlHelper bits.
I understand the anxiety, but I don't think you're grokking that I'm offering alternatives here so you can do what you like - specifically turning *off* encoding as-needed. I think *my* default stance here comes from the countless times I've built an application that I've forgotten to encode - and I've paid for it. You've done it too - so has Damien. The issue here is what does the platform do by default - and I don't agree that I'm *perpetuating ignorance* as you put it.
So - more specifically, you'll be able to do this:
MyObject.UpdateFrom(Request.Form) which will encode everything or
MyObject.UpdateFromRaw(Request.Form)
The difference here is that you're selecting the less-secure method - the choice is yours, as it should be.
Finally - setting <%=%> up to auto-encode doesn't work either because of <pre> and <code> blocks.
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 12:06 AM|LINK
Agreed and I'm not suggesting that this is done *always* - only by *choice* - your choice :).
damieng
Member
38 Points
16 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 12:47 AM|LINK
I totally disagree with HTML-encoding data going *into* the database even as an option or choice.
If by your own measure they aren't going to reuse the database elsewhere then what is wrong with the MVC toolkit helpers just encoding on output to the page? The problem is entirely solved there and then.
By encoding values into the database you cause problems for yourself in the MVC toolkit development because your helpers won't know whether to encode or not on output and nor will anyone else.
You end up protecting a very singular scenario - poorly written sites that deal only with your encoded data. Any other type of data - generated, request objects, sessions, cookies etc. are not protected and well written websites crumble under the double-encoding problem.
Developers ends up never knowing what to encode on output and what not to once down this path.The problem is purely one of output and should be treated as such. Any hack to encode data going into the database is going to cause more grief and chaos than it solves.
abombss
Member
575 Points
164 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 02:15 AM|LINK
No, you are not listening to the community.
So why not extend the asp.net compiler to auto encode <%= %>, if its not webforms no backwards compatibility problems there? If people could make pre-compilers that add runat="server" for <asp: tags I am sure you could come up with something to wrap all calls <%= %> in Html.Encode(). Or just switch to Boo, it can do it no problem.
No you don't or the community would not have responded the way it did.
Bad developers, bad testing, bad framework design, maybe just a bug? Don't make the same mistakes you did in webforms fix the real problem. Html Encoding output not input.
Yes you are, you are forcing EVERYONE on this thread to call UpdateFromRaw. You are forcing me to write fxcop rules and do code analysis ensuring that I always call UpdateFromRaw and never call UpdateFrom. You are forcing me to educate my developers on using non standard method calls. Why would a dev know to use UpdateFromRaw that just sounds weird, whats raw? You are forcing me to write some precompiler process to replace all UpdateFrom to UpdateFromRaw because I have no good way of testing that my database does not contain encoded data. You are forcing microsoft to upset a lot of people by not listening to the community from the begining. Why ask for feedback if you don't listen?
And you may be forcing me to stick with Monorail...
Wow... I need to stop sounding like Bellware [:)]... I guess I would need a little RoR speak to really sound like him.
This is a harsh response please excuse the tone. I appreciate the work the whole team is doing. I want this project to succeed, I just think this is the wrong solution and it is not solving the real problem.
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 05:36 PM|LINK
Agreed - 100%. What I'm looking at is the issue where someone does a "myObject.UpdateFrom(Request.Form)" and then spins it back to the browser on error, validation, or...? You have to agree that there is a hole here that I need to consider.
Unfortunately that singular scenario dominates :(.
I just read your posts on this issue and I have to say you're violating your own rule:
abombss
Member
575 Points
164 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 05:47 PM|LINK
Sorry Rob wasn't trying to beat you up personally, I just feel strongly about this one.
My vote is drop UpdateFrom completely and make two methods, UpdateFromEncoded and UpdateFromDecoded. Keeping UpdateFrom with an unexpected side effect just makes the application more difficult to read, understand, and debug.
robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 05:51 PM|LINK
That's out of line.
I'd like to see if we can avoid inflammatory remarks. You might be frustrated the things I'm mentioning here, but comments like these don't help resolve anything.
Yep.
Good to hear. Now if you can consider my point of view on this, and that I used to be YOU 2 months ago (before I go this job), we can come to a solution. In reading everything there, can I sum up your post in "I want UpdateFrom() to load unencoded text, and UpdateFromEncoded() to encode the text" - is that right?
Again I'm going to ask you Good Fellas if you can help me with this :). If, by default, we do an UpdateFrom() that's not encoded, how can we avoid the problem where the user does this:
MyController(){ MyObject o=new MyObject; o.UpdateFrom(Request.Form);//someone submitted some nasty script try{ o.Save();//the DB barfed it back because o.Subject was too long //it contains a <script XSS attack //bombing our char limit and forcing the response back the browser - very common }catch(Exception x){ //let the user know string message="Can't save the data!"; //repopulate RenderView("Edit",o); // } }<%=Html.TextBox("subject",o.Subject)%>robconery
Participant
852 Points
195 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 05:54 PM|LINK
Passion's a good thing, and I'm learning quickly to toughen up now that I'm a Blue Badge. It's still weird when people tell call me Microsoft :).
\This is a great idea :).
abombss
Member
575 Points
164 Posts
Re: UpdateFrom and Encoding
Dec 19, 2007 06:04 PM|LINK
This is why it would great to make switch in mvc that <%= wraps the output in HtmlEncode() . It would avoid this problem and force developers to think about when then don't their data encoded versus when it needs to be encoded.