JByrd2007
Member
447 Points
93 Posts
Splitting strings with Pascal notation.
May 08, 2008 09:28 PM
I have an enum with values composed of words (e.g. "BackToBack") and know that I can use the ToString() method to convert the enum to a string representation. However, when displaying this to the user, I want it to appear as separate words (e.g. "Back To Back"). Basically, I want to do a Split, but I don't have any delimiter embedded (other than the fact that I want to split based on case).
This seems like it might be a good case for regular expressions, but I'm not sure (and I'm not fluent in them).
I definitely don't want to walk the string checking each character's case and building that way (yuck).
Has anyone done anything like this?
Ideas?
Thanks for your help!
Jon
string split "Pascal notation"
Jon
If I was able to help, please mark this post as Answer.
johram
All-Star
28531 Points
3567 Posts
Re: Splitting strings with Pascal notation.
May 08, 2008 09:48 PM
Google is your friend.
http://haacked.com/archive/2005/09/24/10334.aspx
Svante
All-Star
18347 Points
2300 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 07:19 AM
Why yuck? Someone has to do it... A regex can certainly do the job, but as you say, you're not fluent in these, and few others are, so it'll be a piece of magic sitting in your code that will be hard to maintain and debug. Remember that regardless of the level of abstraction, some code will have to walk the string and check each character's case...
The link provided by the other member seems to be a good compromise. Easy to read and maintain code, and you don't have to write it yourself.
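For comparison, here is a minimal sketch of the "walk the string" approach, which is not the code from the linked article. The class and method names are made up for illustration, and the rule it assumes is the simplest one: insert a space wherever an upper-case letter directly follows a lower-case letter.

```csharp
using System.Text;

// Hypothetical helper for illustration only.
public static class PascalCaseWalker
{
    // Inserts a space before each upper-case letter that directly follows
    // a lower-case letter, e.g. "BackToBack" becomes "Back To Back".
    public static string SplitWords(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        var result = new StringBuilder(input.Length + 8);
        for (int i = 0; i < input.Length; i++)
        {
            char current = input[i];

            // char.IsUpper and char.IsLower classify by Unicode category,
            // so this is not limited to the ASCII letters A-Z.
            if (i > 0 && char.IsUpper(current) && char.IsLower(input[i - 1]))
            {
                result.Append(' ');
            }

            result.Append(current);
        }

        return result.ToString();
    }
}
```

For an enum value this would be called as PascalCaseWalker.SplitWords(value.ToString()). Under this rule a run of capitals such as "OneTWO" stays together as "One TWO".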
AxCrypt - Free Open Source File Encryption & Online Password Manager - http://www.axantum.com
[Disclaimer: Code snippets usually uncompiled, beware typos.]
______
Don't forget to click "Mark as Answer" on the post(s) that helped you.
stmarti
Contributor
4963 Points
1036 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 08:40 AM
I would try regular expressions (if you don't know them, I highly recommend learning them; there are excellent tutorials on the web):
    string test = "PascalCaseSomethingMatchAlsoASingleCharacter";
    Response.Write( test + "<br />");
    System.Text.RegularExpressions.Regex pattern = new System.Text.RegularExpressions.Regex( "[A-Z][a-z]*" );
    System.Text.RegularExpressions.Match result = pattern.Match( test );
    while( result.Success )
    {
        Response.Write( result.Value + "<br />" );
        result = result.NextMatch( );
    }

Anyway, there could be problematic identifiers:
What about this: "thisstartwithcamelcaseWhatIsNow"? You can use this pattern for it, for example: "([a-z]+|[A-Z][a-z]*)"
And what about this: "ThisContainsSomeIDHowToTreatThis"? You can split it several ways: Some, ID, How or Some, IDH, ow [:D]
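One possible way to settle the acronym ambiguity is sketched below. It is only one interpretation, not something from the linked article: a run of capitals is treated as an acronym whose last capital belongs to the next word when a lower-case letter follows it. The class and method names are hypothetical. The pattern uses zero-width lookarounds, so replacing each match with a space inserts the space without consuming any letters.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class PascalCaseRegex
{
    // Two zero-width alternatives:
    //   1. between a lower-case letter and an upper-case letter ("e|I" in "SomeID")
    //   2. between two upper-case letters when the second one starts a word,
    //      i.e. is followed by a lower-case letter ("D|H" in "IDHow")
    private static readonly Regex WordBoundary = new Regex(
        @"(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})");

    public static string SplitWords(string input)
    {
        // Each match is empty, so Replace only inserts spaces.
        return WordBoundary.Replace(input, " ");
    }
}
```

With this rule "ThisContainsSomeIDHowToTreatThis" comes out as "This Contains Some ID How To Treat This", and the camel-case example "thisstartwithcamelcaseWhatIsNow" as "thisstartwithcamelcase What Is Now". It cannot know that "IDH" was meant as a single acronym; that still needs a word list or a human.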
johram
All-Star
28531 Points
3567 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 09:01 AM
Why not just copy-paste the code in the article I provided? You're gonna have a lot of trouble getting this right with a regexp. Even I, who tend to look for a regexp solution as the first choice, would avoid doing so in this case.
stmarti
Contributor
4963 Points
1036 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 10:20 AM
I was just trying to show the asker an alternative way.
For me a 3 line regexp solution is simpler and cleaner than the article (there is also a more capable regexp solution in the article's comments: http://secretgeek.net/progr_purga.asp)
rjcox
Contributor
7064 Points
1444 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 11:02 AM
That'll only handle ASCII, and it's too much work handling the results:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        // Matches a lower-case letter immediately followed by an upper-case letter.
        private static Regex LowThenUpRegex = new Regex(@"(\p{Ll})(\p{Lu})", RegexOptions.None);

        static string Convert(string input)
        {
            // Re-emit both captured letters with a space between them.
            return LowThenUpRegex.Replace(input, "$1 $2");
        }

        static void Main(string[] args)
        {
            string[] test = new[] {
                "one",
                "Two",
                "OneTwo",
                "oneTwo",
                "onetwo",
                "OneTWO",
                "OneTwoThreeFourFive",
                "ÁbcÈdfghÏjklmnØp",
            };
            foreach (string t in test)
            {
                Console.WriteLine("\"{0}\" => \"{1}\"", t, Convert(t));
            }
        }
    }

\p{Ll} matches a lower-case letter, based on Unicode data, so it works beyond the basic 26 ASCII letters. \p{Lu} similarly matches an upper-case letter. This can be seen in the last test case.
johram
All-Star
28531 Points
3567 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 11:08 AM
I totally agree with you. But the problem remains. You've gotta get the regexp right ;-)
stmarti
Contributor
4963 Points
1036 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 11:34 AM
That is it! A 2 line regexp solution [:D]
Svante
All-Star
18347 Points
2300 Posts
Re: Splitting strings with Pascal notation.
May 09, 2008 11:52 AM
Actually, it's a one line solution, and with no extra spaces either! Let's not waste these spaces and new lines, they are scarce I've heard ;-)
Reducing the number of lines is seldom equivalent to improving the code quality...
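Regular expressions are far too complicated and terse in their syntax to make for good code in most cases. They are also limited in their cultural awareness, and are not really suitable for handling human-generated input. Just see how hard it is to get a correct regex for this simple problem. The original code posted as a link is certainly longer, but at least 10 times as many developers are capable of maintaining that code for various upcoming requirements, perhaps:
If the string is in camel case, convert it as if it were Pascal case.
If the string contains known two-letter upper-case acronyms, such as ID or IP, handle these.
Handle the letter classification according to the current system or UI culture.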
I'm not saying you can't do this with regular expressions - I'm saying that fewer developers can, and they will require more time to do it, and even fewer will understand the result.
Write code firstly for other programmers to read, understand, test and maintain. Secondly, when proven by measurements, code specific parts for performance. Never code for elegance or terseness at the expense of clarity.
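To make the maintainability argument concrete, here is a rough sketch of how a plain loop might absorb follow-up requirements of that kind: camel-case input, known two-letter acronyms, and culture-aware handling. Everything in it is an assumption for illustration: the class name, the method name, and the acronym list are invented, and it is not the code from the linked article. Note that char.IsUpper and char.IsLower classify by Unicode category and take no culture, so only the casing of the first letter is culture-sensitive here.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only.
public static class DisplayNameFormatter
{
    // Assumed acronym list; a real application would supply its own.
    private static readonly HashSet<string> KnownAcronyms =
        new HashSet<string> { "ID", "IP" };

    public static string ToDisplayName(string input, CultureInfo culture)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        var result = new StringBuilder(input.Length + 8);
        bool afterAcronym = false;
        int i = 0;
        while (i < input.Length)
        {
            // A new word may start at the beginning, right after an acronym,
            // or right after a lower-case letter.
            bool atBoundary = i == 0 || afterAcronym || char.IsLower(input[i - 1]);

            // Known two-letter acronyms are copied as one unit.
            if (atBoundary && i + 2 <= input.Length
                && KnownAcronyms.Contains(input.Substring(i, 2)))
            {
                if (i > 0)
                {
                    result.Append(' ');
                }
                result.Append(input, i, 2);
                i += 2;
                afterAcronym = true;
                continue;
            }

            char current = input[i];
            if (i == 0)
            {
                // Treat camel case as Pascal case: upper-case the first letter
                // using the caller's culture.
                current = char.ToUpper(current, culture);
            }
            else if (atBoundary && char.IsUpper(current))
            {
                result.Append(' ');
            }

            result.Append(current);
            afterAcronym = false;
            i++;
        }

        return result.ToString();
    }
}
```

A call might look like DisplayNameFormatter.ToDisplayName("clientIPAddress", CultureInfo.CurrentUICulture), which under these assumptions yields "Client IP Address".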