What is the best way to parse HTML in C#?


I'm looking for a library or method for parsing an HTML file that offers more HTML-specific features than generic XML parsing libraries do.

1/3/2010 8:29:36 AM

Html Agility Pack

This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually have to understand XPath or XSLT to use it, don't worry). It is a .NET code library that lets you parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents (or streams).
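To give a feel for the read/write DOM and XPath support described above, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup, the file names in it, and the `rel` value are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample input, deliberately malformed:
        // the <p> and <li> elements are never closed.
        string html =
            "<html><body><p>Links<ul>" +
            "<li><a href='a.html'>A</a>" +
            "<li><a href='b.html'>B</a>" +
            "</ul></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Read: select every anchor that has an href attribute via XPath.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                string href = link.GetAttributeValue("href", "");
                Console.WriteLine(href + " -> " + link.InnerText);

                // Write: the DOM is mutable, so attributes can be added or changed.
                link.SetAttributeValue("rel", "nofollow");
            }
        }

        // Serialize the modified document back to a string.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The null check matters because `SelectNodes` signals "no matches" with null, which is a common source of `NullReferenceException` when people first use the library. To parse a file on disk instead of a string, `doc.Load(path)` takes the place of `LoadHtml`.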

9/19/2008 8:05:44 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow