HTML Agility Pack - using XPath to get a single node - Object reference not set to an instance of an object
This is my first attempt to get an element value using HAP. I'm getting a null object error when I try to use InnerText.
The URL I'm scraping is http://www.mypivots.com/dailynotes/symbol/659/-1/e-mini-sp500-june-2013 and I'm trying to get the value for the current High from the Day Change Summary table.
My code is at the bottom. Firstly, I'd like to know if I'm going about this the right way. If so, is it simply that my XPath value is incorrect?
The XPath value was obtained using a utility I found called HtmlAgility Helper. The Firebug version of the XPath, below, gives the same error: /html/body/div[3]/div/table/tbody/tr[3]/td/table/tbody/tr[5]/td[3]
My code:
WebClient myPivotsWC = new WebClient();
string nodeValue;
string htmlCode = myPivotsWC.DownloadString("http://www.mypivots.com/dailynotes/symbol/659/-1/e-mini-sp500-june-2013");
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);
HtmlNode node = doc.DocumentNode.SelectSingleNode("/html[1]/body[1]/div[3]/div[1]/table[1]/tbody[1]/tr[3]/td[1]/table[1]/tbody[1]/tr[5]/td[3]");
nodeValue = (node.InnerText);
Thanks, Will.
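You can't rely on developer tools such as Firebug or Chrome to determine the XPath for the nodes you're after. The XPath given by such tools corresponds to the in-memory HTML DOM, while the Html Agility Pack only knows about the raw HTML sent back by the server.
What you need to do is look at what's actually sent (or just view source). You'll see there is no TBODY element, for example, so SelectSingleNode finds nothing, returns null, and reading InnerText on it throws. Instead, you want to find something discriminant in the markup and use XPath axes from there. Also, your XPath, even if it worked, would not be very resistant to changes in the document, so you need to find something more "stable" for the scraping to be more future-proof.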
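To illustrate the TBODY point, here is a minimal sketch. The markup and the value in it are made up for the example (they are not taken from the page in the question), and it assumes the HtmlAgilityPack package is referenced. The table is written without a TBODY, as in raw server HTML, so I would expect the browser-style path to find nothing and the second path to find the cell.

```csharp
using System;
using HtmlAgilityPack;

class TbodyDemo
{
    static void Main()
    {
        // Hypothetical raw markup: a table written without a TBODY element.
        const string html =
            "<html><body><table><tr><td>High</td><td>1650.25</td></tr></table></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Browser-style path that includes tbody, as a developer tool would report it.
        HtmlNode withTbody =
            doc.DocumentNode.SelectSingleNode("/html/body/table/tbody/tr/td[2]");

        // Same path without tbody, matching the markup as written.
        HtmlNode withoutTbody =
            doc.DocumentNode.SelectSingleNode("/html/body/table/tr/td[2]");

        // SelectSingleNode returns null when nothing matches, so check before use.
        Console.WriteLine(withTbody == null ? "no match" : withTbody.InnerText);
        Console.WriteLine(withoutTbody == null ? "no match" : withoutTbody.InnerText);
    }
}
```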
Here is some code that seems to work:
HtmlNode node = doc.DocumentNode.SelectSingleNode("//td[@class='dntablecell']//a[text()='high']/../../td[3]");
This is what it does:
- find a TD element with a CLASS attribute set to 'dntablecell'. The // token means the search is recursive in the XML hierarchy.
- find an A element whose text (inner text) equals 'high'.
- navigate two parents up (we'll get to the closest TR element)
- select the 3rd TD element from there
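The steps above can be tried against a small inline document. This is only a sketch: the markup, the numbers, and the class name TryXPath are invented for illustration, and the attribute value and link text simply mirror the ones used in the XPath above. XPath string comparisons are case-sensitive, so both must match the real page's markup exactly. The null check covers the case from the question, where no node matches.

```csharp
using System;
using HtmlAgilityPack;

class TryXPath
{
    static void Main()
    {
        // Hypothetical row shaped like the one the XPath targets:
        // a labelled first cell followed by value cells.
        const string html =
            "<table><tr>" +
            "<td class='dntablecell'><a href='#'>high</a></td>" +
            "<td>1640.00</td>" +
            "<td>1650.25</td>" +
            "</tr></table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // a -> parent td -> parent tr, then the third td of that row.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(
            "//td[@class='dntablecell']//a[text()='high']/../../td[3]");

        if (node == null)
        {
            // Nothing matched: report it instead of dereferencing null.
            Console.WriteLine("Node not found; check the XPath against the raw HTML.");
            return;
        }

        Console.WriteLine(node.InnerText.Trim());
    }
}
```

Checking for null before reading InnerText turns the "Object reference not set" exception into a message you can act on when the page layout changes.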