Regular Expression to change the font in html string
Jun 19, 2014 01:42 AM|bikka
Hi,
I need to change the font in an HTML string. For example, I have a string like the one below.
string html = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
(This is not the actual HTML string.)
I want to get the following result:
html = "<Style font-family: "Arial Unicode MS"; Size:10; font-family:Arial Unicode MS; font-family: Arial Unicode MS> 2222222 <FONT size=3 face="Arial Unicode MS"> </FONT> <FONT Size=3 face=Arial Unicode MS color=RED></FONT>";
Please help me with the appropriate regular expression.
Thanks
Bikka

Re: Regular Expression to change the font in html string
Jun 20, 2014 03:04 AM|Sam - MSFT
Hi,
Actually, the html string is rendered as you wrote above at runtime. If you put a breakpoint and check the string in the Data Visualizer at runtime, you will see it is rendered exactly as you want.
[Image not preserved: the string as shown in the debugger]
As you can see in the picture above.
If you simply want to replace the string in the html, you may use the following code:
string html = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
string newhtml = html.Replace("font-family:Verdana; font-family: Times New Roman", "font-family:Arial Unicode MS; font-family: Arial Unicode MS");
For more reference:
http://msdn.microsoft.com/en-us/library/fk49wtc1(v=vs.110).aspx
You may also use Regex. For more reference:
http://stackoverflow.com/questions/8143811/find-and-replace-content-within-string-c
Hope it helps!
Best Regards!

Re: Regular Expression to change the font in html string
Jun 25, 2014 02:48 AM|bikka
Hi Sam,
Thanks for spending time on this issue.
Actually, the font is not fixed. It can be different every time, and the sequence can be different too, e.g.
string html1 ="<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
string html2 ="<Style font-family: \"Comic Sans MS\"; Size:10; font-family:Verdana> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
string html3 ="<Style Size:10; font-family:Verdana> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
These are just 3 examples, so the final requirement is:
Find the font (e.g. Font-Family, Font-Face) and replace it with 'Arial Unicode MS'.
Hint: anything between Font-Family: and '>' or ';' is going to be the font, e.g. if the string is <style Font-Family:Verdana size:20> or <style Font-Family:Verdana, Comic Sans MS size:20> or <style size:20 Font-Family:"Verdana">
A hard-coded string find and replace would not work.
I have tried the following code:
' Lazily match everything from "font-family:" up to the first ';' or '>'
Dim reg1 As Regex = New Regex("font-family:(.)*?(;|>)", RegexOptions.IgnoreCase)
Dim matchFont As MatchCollection = reg1.Matches(htmlText, 0)
' Same idea for the face= attribute of <FONT> tags
Dim reg2 As Regex = New Regex("face=(.)*?(;|>)", RegexOptions.IgnoreCase)
Dim matchFace As MatchCollection = reg2.Matches(htmlText, 0)
' Replace each match, minus its trailing ';' or '>', with the new font
For Each v As Match In matchFont
    htmlText = htmlText.Replace(v.ToString().Substring(0, v.ToString().Length - 1), "font-family:'Arial Unicode MS'")
Next
For Each v As Match In matchFace
    htmlText = htmlText.Replace(v.ToString().Substring(0, v.ToString().Length - 1), "face='Arial Unicode MS'")
Next
This code works fine. Now I want to refactor the above code. I want to do it using one regular expression, so that I can avoid using the For Each loops.
Thanks
Bikka
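
One possible way to fold the two patterns and both loops into a single call is Regex.Replace with a MatchEvaluator. The sketch below is in C#, to match the sample strings in this thread. The class name FontRewriter, the method ReplaceFonts and the exact pattern are illustrative assumptions, not something posted by anyone above. It assumes a font value is either quoted, or unquoted and running up to the next ';', '>' or the next name= / name: pair, and it keeps whatever quoting the original value had.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the names and the pattern are assumptions for illustration.
public static class FontRewriter
{
    // key : "font-family:" or "face=" plus any whitespace after it
    // q   : the opening quote, when the value is quoted
    // Unquoted values are matched lazily and must be followed by ';', '>',
    // end of input, or whitespace plus the next "name=" / "name:" pair.
    private static readonly Regex FontPattern = new Regex(
        @"(?<key>\b(?:font-family\s*:|face\s*=)\s*)" +
        @"(?:(?<q>[""'])[^""']*\k<q>" +
        @"|[^;>""']+?(?=\s*(?:[;>]|\s[\w-]+\s*[=:]|$)))",
        RegexOptions.IgnoreCase);

    public static string ReplaceFonts(string html, string newFont = "Arial Unicode MS")
    {
        return FontPattern.Replace(html, m =>
        {
            // Empty string when the original value was unquoted.
            string quote = m.Groups["q"].Value;
            return m.Groups["key"].Value + quote + newFont + quote;
        });
    }

    public static void Main()
    {
        string html3 = "<Style Size:10; font-family:Verdana> 2222222 " +
                       "<FONT size=3 face=\"Times New Roman\"> </FONT> " +
                       "<FONT Size=3 face=Times New Roman color=RED></FONT>";
        Console.WriteLine(ReplaceFonts(html3));
    }
}
```

Unlike the two-regex version, the lookahead stops an unquoted value before the next attribute, so color=RED in the last FONT tag is left alone. This is still a regex over markup, not an HTML parser: a quoted value that contains the other kind of quote, or an unquoted font name followed by a stray word and '=' or ':', would need a stricter pattern.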