Can't Get Xpath Working With Html Agility Pack

I'm trying to scrape the 'Today's featured article' section on Wikipedia by getting the XPath value using Firebug and then pasting it into my code: string result = wc.DownloadString('h

Solution 1:

Firebug shows the XPath for the DOM that Firefox built from the HTML, which may or may not match the HTML the server actually sends. Also, the path from Firebug is absolute, so any small change to the page can break it.

An easier way is to look at the HTML itself: the p tag you are looking for is in a div with the id mp-tfa, so it's easier to make the XPath look for that div and then get the first p inside it.

Like this:

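using System;
using System.Net;
using HtmlAgilityPack;
var wc = new WebClient();
var doc = new HtmlDocument();
// Parse the HTML as the server sends it, not the DOM a browser builds from it.
doc.Load(wc.OpenRead("http://en.wikipedia.org/wiki/Main_Page"));
// Relative XPath: find the div by its id, then take the first p directly inside it.
var featuredArticle = doc.DocumentNode.SelectSingleNode("//div[@id='mp-tfa']/p");
Console.WriteLine(featuredArticle.InnerText);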
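
To illustrate why a path copied from a browser tool can fail, here is a minimal sketch that runs two XPath queries against a small, made-up HTML string rather than the real Wikipedia page. The markup and the absolute path are assumptions chosen for illustration: the path includes a tbody element of the kind browsers add to tables in their own DOM, while the HTML string has none.

```csharp
using System;
using HtmlAgilityPack;

// Made-up HTML standing in for what a server might send (assumption: no <tbody>).
const string html =
    "<html><body><table><tr><td>" +
    "<div id='mp-tfa'><p>Featured text</p></div>" +
    "</td></tr></table></body></html>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Absolute path in the style a browser tool reports, including a <tbody>
// that exists only in the browser's DOM, not in the HTML string above.
var absolute = doc.DocumentNode.SelectSingleNode("/html/body/table/tbody/tr/td/div/p");

// Relative path anchored on the id; it does not depend on the surrounding layout.
var relative = doc.DocumentNode.SelectSingleNode("//div[@id='mp-tfa']/p");

// SelectSingleNode returns null when nothing matches, so check before using the node.
Console.WriteLine(absolute == null ? "absolute path: no match" : absolute.InnerText);
Console.WriteLine(relative == null ? "relative path: no match" : relative.InnerText);
```

Under these assumptions the absolute path should find nothing, while the id-based path still matches. Checking for null before reading InnerText also avoids a NullReferenceException when the page layout changes.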

A good place to learn how to use XPath is w3schools.com.

Or you could use LINQ, though I feel XPath is a bit clearer.

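// Needs "using System.Linq;" at the top of the file.
// First throws if no div has the id; FirstOrDefault returns null if the div has no p.
var featuredArticle = doc.DocumentNode.Descendants("div")
    .First(n => n.Id == "mp-tfa")
    .Descendants("p").FirstOrDefault();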
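
If a missing div should not throw, one possible variation is sketched below. FirstParagraphText is a hypothetical helper written for this example, not part of Html Agility Pack, and the HTML string is made up so the snippet runs without a network call.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.LoadHtml("<html><body><div id='mp-tfa'><p>Featured text</p></div></body></html>");

Console.WriteLine(FirstParagraphText(doc, "mp-tfa") ?? "(not found)");
Console.WriteLine(FirstParagraphText(doc, "no-such-id") ?? "(not found)");

// Hypothetical helper: returns the text of the first p inside the div with
// the given id, or null when either the div or the p is missing.
static string FirstParagraphText(HtmlDocument document, string divId)
{
    var paragraph = document.DocumentNode.Descendants("div")
        .FirstOrDefault(n => n.Id == divId)   // null instead of an exception
        ?.Descendants("p")                    // skipped entirely when the div is null
        .FirstOrDefault();
    return paragraph?.InnerText;
}
```

The design choice here is simply to trade the exception from First for a null result, which mirrors how SelectSingleNode behaves when nothing matches.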