Last post Apr 16, 2013 09:09 AM by JeffreyABecker
Apr 16, 2013 05:22 AM|ksureshh_pk|LINK
I have an HTML editor control in my web form. Users copy content from Word documents and paste it into the editor. Sometimes empty tags appear in the editor's HTML, and because of this I get a lot of blank space between the content. For example, <p> </p> gets added to the content.
Is there any way to parse the content and remove the tags that don't have any valid content inside them?
Apr 16, 2013 05:54 AM|akhleshchauhan|LINK
Apr 16, 2013 05:55 AM|binson143|LINK
Actually, you need to use a regular expression to remove the unwanted text. As far as HTML is concerned, your sample tag (<p> </p>) is valid.
Apr 16, 2013 06:03 AM|Nasser Malik|LINK
See same question
and see this discussion too
Apr 16, 2013 06:29 AM|smirnov|LINK
If you have text similar to
text1<p> </p><p> </p>
then you can use a regular expression to replace the matching text with new text:
Regex.Replace(input, pattern, replacement)
string pattern = "(<p> </p>(\r\n)?)+";
string newtext = Regex.Replace(input, pattern, "<br/>");
Hope this helps.
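The pattern above only matches a <p> that contains exactly one space. If the pasted content also has paragraphs holding several spaces, line breaks, or &nbsp; entities, a slightly looser pattern might help. This is only a sketch: the class name EmptyParagraphCleaner and the method Clean are made up for illustration, and the guess that Word pastes add attributes or &nbsp; to the empty <p> is an assumption, not something confirmed in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
static class EmptyParagraphCleaner
{
    // One or more consecutive <p> elements (with or without attributes)
    // whose body is only whitespace and/or &nbsp; entities.
    // "<p(\s[^>]*)?>" is used so that tags such as <pre> are not matched.
    private static readonly Regex EmptyParagraphs = new Regex(
        @"(<p(\s[^>]*)?>(\s|&nbsp;)*</p>\s*)+",
        RegexOptions.IgnoreCase);

    public static string Clean(string html)
    {
        // Each run of empty paragraphs collapses into a single line break.
        return EmptyParagraphs.Replace(html, "<br/>");
    }

    static void Main()
    {
        string input = "text1<p> </p>\r\n<p>&nbsp;</p>text2";
        Console.WriteLine(Clean(input)); // prints: text1<br/>text2
    }
}
```

If you want the empty paragraphs gone completely rather than turned into a line break, pass an empty string as the replacement instead of "<br/>".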
Apr 16, 2013 09:09 AM|jeffreyabecker|LINK
First, parsing HTML with regular expressions is evil; see: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
Second, there is a wonderful library out there for screen-scraping and manipulating HTML called Html Agility Pack, which you should use instead.
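A rough sketch of how that could look with Html Agility Pack, assuming the library is referenced in the project (for example through its NuGet package). The class name EmptyTagRemover and the method RemoveEmptyParagraphs are hypothetical, and the rule used here (a <p> counts as empty when it has no child elements and its text is only whitespace once entities are decoded) is my assumption about what "no valid content" means:

```csharp
using System.Linq;
using HtmlAgilityPack; // assumes the Html Agility Pack assembly is referenced

// Hypothetical helper, named for illustration only.
static class EmptyTagRemover
{
    public static string RemoveEmptyParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var paragraphs = doc.DocumentNode.SelectNodes("//p");
        if (paragraphs == null)
            return html;

        // Copy to a list first so nodes can be removed while iterating.
        foreach (var p in paragraphs.ToList())
        {
            // Decode entities so that &nbsp; is treated as whitespace.
            string text = HtmlEntity.DeEntitize(p.InnerText);

            // Keep paragraphs that wrap an element such as <img>, even without text.
            bool hasChildElements =
                p.ChildNodes.Any(n => n.NodeType == HtmlNodeType.Element);

            if (string.IsNullOrWhiteSpace(text) && !hasChildElements)
                p.Remove();
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

You would call RemoveEmptyParagraphs on the editor's content before saving it. The same loop can be pointed at other tags by changing the XPath expression, for example "//p|//span|//div".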