Hopefully
streamwriter and strange charactors
Apr 21, 2010 03:36 PM
I am taking in two HTML files and creating one out of them. To do this I open the first HTML file and skip its closing </body> and </html> tags, then open the second file and skip the corresponding opening tags, as well as its <style></style> section. I start a StreamWriter, write the lines out to it, and then close the StreamWriter. My problem is that the output file is filled with strange characters. I've tried opening the StreamWriter with different character sets as the third parameter, but all this does is change the characters to different strange characters. The input files say charset=windows-1252 at the top (and so do the output files, since I'm just reading lines in and writing them out, with the exceptions mentioned above).
Questions:
First, do you think I am properly approaching appending two .htm files together?
Second, how can I eliminate these strange characters?
Thanks!
Here is my (very rookie-ish) code. Suggestions welcome! (Name calling... not so much ;)
Dim myLine As String
Dim f As Integer = 0
Dim sr As New StreamReader(sourceFiles(f))
Dim sw As New StreamWriter(destinationFile, False, System.Text.Encoding.GetEncoding("Windows-1252"))
Do Until sr.Peek = -1
    myLine = sr.ReadLine()
    myLine = myLine.Replace("</html>", "")
    myLine = myLine.Replace("</body>", "")
    sw.WriteLine(myLine)
Loop
f = f + 1
Dim doWriteLine As Boolean = True
If (f < sourceFiles.Length) Then
    Dim myLine2 As String
    Dim sr2 As New StreamReader(sourceFiles(f))
    Do Until sr2.Peek = -1
        myLine2 = sr2.ReadLine()
        doWriteLine = True
        myLine2 = myLine2.Replace("<html>", "")
        myLine2 = myLine2.Replace("<head>", "")
        myLine2 = myLine2.Replace("<title>", "")
        myLine2 = myLine2.Replace("</title>", "")
        If myLine2.Contains("<meta") Then
            doWriteLine = False
        End If
        If myLine2.Contains("<style>") Then 'we need to go until we clean out the style part
            Dim stopStyle As Boolean = False
            Do Until stopStyle 'just keep reading through this until we get through the style part.
                myLine2 = sr2.ReadLine()
                If myLine2.Contains("</style>") Then
                    stopStyle = True
                End If
            Loop
        End If
        myLine2 = myLine2.Replace("</style>", "")
        If doWriteLine Then
            sw.WriteLine(myLine2)
        End If
    Loop
    sr2.Close()
End If
sr.Close()
sw.Close()
SatyaV
Re: streamwriter and strange charactors
Apr 21, 2010 05:45 PM
Can you try setting an Encoding for the StreamReader as well?
Hopefully
Re: streamwriter and strange charactors
Apr 21, 2010 06:14 PM
Setting the reader to the same encoding did the trick! Thanks!
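
For anyone landing here later, this is a minimal sketch of that fix, not the exact code from the thread. A StreamReader opened without an encoding decodes as UTF-8 by default, so Windows-1252 bytes above 0x7F get mangled before the writer ever sees them; passing the same Encoding to both reader and writer avoids that. The module name, the CopyLines helper, and the sample file names are hypothetical. The sketch assumes .NET Framework, where the Windows-1252 code page is available out of the box; on newer .NET versions the code page provider may need to be registered first.

```vbnet
Imports System.IO
Imports System.Text

Module EncodingSketch
    ' Hypothetical helper: copies a file line by line, decoding and
    ' encoding with the SAME code page on both the reader and the writer.
    Sub CopyLines(sourcePath As String, destinationPath As String, enc As Encoding)
        Using sr As New StreamReader(sourcePath, enc)
            Using sw As New StreamWriter(destinationPath, False, enc)
                Do Until sr.Peek() = -1
                    sw.WriteLine(sr.ReadLine())
                Loop
            End Using
        End Using
    End Sub

    Sub Main()
        ' Assumes .NET Framework, where this code page is built in.
        Dim cp1252 As Encoding = Encoding.GetEncoding("Windows-1252")

        ' Hypothetical sample input containing one non-ASCII character (e with acute accent).
        File.WriteAllText("sample-in.htm", "<p>caf" & ChrW(233) & "</p>", cp1252)

        CopyLines("sample-in.htm", "sample-out.htm", cp1252)

        ' Read the copy back with the same encoding to check it survived.
        Console.WriteLine(File.ReadAllText("sample-out.htm", cp1252))
    End Sub
End Module
```

The Using blocks close the reader and writer even if an exception is thrown partway through, which the explicit Close() calls in the original code would not do.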