Last post Mar 18, 2021 07:24 AM by Mikesdotnetting
Mar 17, 2021 09:04 PM|valenciano8|LINK
Hi everyone, I'd like to know if there is a built-in way to do web scraping with ASP.NET Core (that is, fetching basic data from a website and using that data on my ASP.NET Core website). From what I've understood, web scraping basically works like this:
Fetch the HTML page
Analyse the HTML
Get the data that matches classnames, divs or whatever you've specified
Mar 17, 2021 09:30 PM|mgebhard|LINK
I'd like to know if there is a built-in way to do webscraping (like fetching basic data from a website and use that data on my ASP.NET Core website) with ASP.NET Core
This question is too vague to answer. .NET 5 (Core) has many APIs that can accomplish this task: HttpClient for fetching HTML over HTTP, and the XML libraries for querying the data. You can also search NuGet for third-party libraries.
The HTML Agility Pack is a common web scraping library that many forum members recommend.
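As a rough illustration of the three steps in the question (fetch, parse, select), here is a minimal sketch that combines HttpClient with the HTML Agility Pack. The class name ScrapeSketch, its method names, and the XPath used in the usage note are made up for this example, and it assumes the HtmlAgilityPack NuGet package is installed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

// Hypothetical helper class, named for illustration only.
public static class ScrapeSketch
{
    // Reuse one HttpClient instead of creating a new one per request.
    private static readonly HttpClient Http = new HttpClient();

    // Step 1: fetch the raw HTML of a page.
    public static Task<string> FetchHtmlAsync(string url) => Http.GetStringAsync(url);

    // Steps 2 and 3: parse the HTML and return the trimmed text
    // of every node that matches the given XPath expression.
    public static List<string> ExtractText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }

        // Decode HTML entities such as &amp; before trimming whitespace.
        return nodes.Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim()).ToList();
    }
}
```

You would call it along the lines of `var html = await ScrapeSketch.FetchHtmlAsync(url);` followed by `var titles = ScrapeSketch.ExtractText(html, "//h2[@class='title']");`. In an ASP.NET Core application the HttpClient would more typically come from IHttpClientFactory than from a static field.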
Mar 18, 2021 07:20 AM|YihuiSun|LINK
I found a tutorial on automated web scraping and data extraction using HTTP requests and web browsers that you can refer to.
The tutorial covers two ways to fetch and crawl data.
It also provides examples, which you can view by following the link above.
Mar 18, 2021 07:24 AM|Mikesdotnetting|LINK
I use AngleSharp, which enables you to query the HTML using standard CSS selectors. For example, to get the h1 content for this page, you would do this:
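using AngleSharp;

// The default loader lets AngleSharp download the page itself.
var config = Configuration.Default.WithDefaultLoader();
var address = "https://forums.asp.net/t/2175143.aspx?How+to+do+WebScraping+with+ASP+NET+Core+";
var document = await BrowsingContext.New(config).OpenAsync(address);
var heading = document.QuerySelector("h1#threadstatus");
var headingText = heading?.TextContent; // null if no element matches the selector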
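The same CSS-selector approach also works on markup you already have in memory, and QuerySelectorAll returns every match rather than just the first. The following is a minimal sketch, assuming the AngleSharp NuGet package is installed; the class name AngleSharpSketch and the method SelectTextAsync are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp; // assumption: the AngleSharp NuGet package is referenced

// Hypothetical helper class, named for illustration only.
public static class AngleSharpSketch
{
    // Parse an HTML string and return the trimmed text of
    // every element that matches the given CSS selector.
    public static async Task<List<string>> SelectTextAsync(string html, string cssSelector)
    {
        // No loader is needed because nothing is downloaded.
        var context = BrowsingContext.New(Configuration.Default);

        // Build the document from the supplied markup instead of requesting a URL.
        var document = await context.OpenAsync(req => req.Content(html));

        return document.QuerySelectorAll(cssSelector)
                       .Select(element => element.TextContent.Trim())
                       .ToList();
    }
}
```

For instance, `await AngleSharpSketch.SelectTextAsync(html, "li.item")` would collect the text of every li element carrying the class item, which corresponds to the "get the data that matches classnames" step in the original question.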