Last post Jan 11, 2018 07:56 AM by Eric Du
Jan 10, 2018 06:05 PM|MikeT89|LINK
When I open and read the PDF file, everything looks fine. But when I read and parse that same file in code, a bunch of extra characters or tags appear, so when my code looks for a specific string, it doesn't find it.
When I open the pdf file I see this:
Membership ID: 1111111
But when I open and parse each line I get this:
MembershipMembership ID:ID: <<MemberId>>1111111
Can someone explain to me why those extra characters or tags are there? And how can I get rid of them, or account for them in my code, when I'm reading and parsing PDF files?
I am currently using the Aspose.PDF library.
Jan 11, 2018 07:56 AM|Eric Du|LINK
Based on your description, please check the following tutorials on using iTextSharp or another library to extract the data. Both tutorials include example code that you can test:
Read and Extract PDF Text in C# and VB.NET:
How to read PDF file using iTextSharp in ASP.NET:
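Whichever library does the extraction, one way to account for the extra characters is to clean each extracted line before searching it. The thread does not establish why the words are doubled or where `<<MemberId>>` comes from, so the following is only a rough sketch under two assumptions: that the unwanted tags always look like `<<Name>>`, and that a doubled word appears as the same text twice with no space in between. It is written in Python for brevity and works on plain strings only; `normalize_line` and `collapse_doubled` are hypothetical helper names, not part of Aspose.PDF or iTextSharp.

```python
import re

# Matches template-style placeholders such as <<MemberId>> (assumed tag shape).
PLACEHOLDER = re.compile(r"<<[^<>]*>>")


def collapse_doubled(token: str) -> str:
    """Return the first half of a token that is the same text twice in a row."""
    half, odd = divmod(len(token), 2)
    # Only touch tokens that contain a letter, so numeric IDs like 1212 survive.
    has_letter = any(c.isalpha() for c in token)
    if not odd and half and has_letter and token[:half] == token[half:]:
        return token[:half]
    return token


def normalize_line(line: str) -> str:
    """Hypothetical helper: clean one extracted line before string matching."""
    # Replace placeholders with a space so neighbouring words do not fuse.
    without_tags = PLACEHOLDER.sub(" ", line)
    # Splitting and re-joining also collapses runs of whitespace.
    return " ".join(collapse_doubled(t) for t in without_tags.split())


print(normalize_line("MembershipMembership ID:ID: <<MemberId>>1111111"))
# prints: Membership ID: 1111111
```

The halving step is a heuristic: a real word that happens to repeat itself (for example "bonbon") would also be shortened, so it is worth checking the output against a few known documents before relying on it.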