Using HtmlAgilityPack to Extract Text That Is Not Between Tags and Comes After a Specific Node

October 26, 2023

HTML code: CAR
Car is something you can drive.

C# code: HtmlAgilityPa

Solution 1:

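XPath is case-sensitive (see the related question "Is it possible to ignore case using xpath and c#?"), and the second phrase that contains 'Car' is not a child of a B element; it is a bare text node. You can make it work by selecting text nodes directly and lowercasing their content with translate() before comparing: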

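HtmlDocument doc = new HtmlWeb().Load("http://website.com/x.html");
// translate() lowercases each text node so the match on 'car' ignores case
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'car')]");
if (nodes != null) // SelectNodes returns null when nothing matches
{
    foreach (HtmlNode node in nodes)
    {
        Console.WriteLine(node.InnerText);
    }
}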

In a console application, it will output this:

CAR
Car is something you can drive.
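If the goal is only the loose text that comes after a specific node, rather than every text node containing the word, one option is to find the B element and walk its following siblings. The sketch below is a rough illustration, not the original poster's code: the markup in Main is a guess at the stripped HTML (a B element followed by a line break and bare text), and the names TextAfterNode and GetTextAfterBold are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class TextAfterNode
{
    // Hypothetical helper: returns the first non-empty text node that follows
    // a <b> element whose text equals `label` (ignoring case), or null if none.
    public static string GetTextAfterBold(string html, string label)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection bolds = doc.DocumentNode.SelectNodes("//b");
        if (bolds == null)
            return null; // SelectNodes returns null when nothing matches

        foreach (HtmlNode b in bolds)
        {
            // Compare in C# instead of XPath to sidestep XPath case-sensitivity
            if (!string.Equals(b.InnerText.Trim(), label, StringComparison.OrdinalIgnoreCase))
                continue;

            // Skip element siblings (such as <br>) and whitespace-only text nodes
            for (HtmlNode sib = b.NextSibling; sib != null; sib = sib.NextSibling)
            {
                if (sib.NodeType == HtmlNodeType.Text && sib.InnerText.Trim().Length > 0)
                    return HtmlEntity.DeEntitize(sib.InnerText).Trim();
            }
        }
        return null;
    }

    public static void Main()
    {
        // Assumed markup; the original HTML was not preserved in the post
        string html = "<div><b>CAR</b><br/>Car is something you can drive.</div>";
        Console.WriteLine(GetTextAfterBold(html, "car"));
    }
}
```

Walking NextSibling keeps the case-insensitive comparison in C# and stops at the first non-empty text node, so unrelated text further down the page that happens to contain the word is not picked up.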
