C# - Walking a WebBrowser control DOM - elements with both children and text


I'm trying to walk the DOM of a WebBrowser control using C#, performing some processing on each HtmlElement. (I'm doing some transformations on the DOM at the same time, but for this discussion assume I'm just trying to flatten the DOM by walking each node recursively.)

When I encounter something like:

<p>text <a href="http://www.example.com/">link</a> in middle of </p> 

I find an HtmlElement for the p tag (which contains the expected InnerText) and a child HtmlElement node corresponding to the a tag. The HtmlElement for the a tag also contains the expected inner text.

But I cannot find any structures or attributes related to the text before and after the a tag.

Is there a way to find the text before and after the a tag, other than the dreadful hack of comparing the InnerHtml property of the p tag with the OuterHtml property of the a tag?

Or is there a better way to walk the IE DOM?

To get at the text nodes in the DOM, QI (a type cast in C#) the parent element (HtmlElement.DomElement in Windows Forms) for mshtml.IHTMLDOMNode.

Then you can get the direct child nodes via IHTMLDOMNode.childNodes. Enumerate the IHTMLDOMNode.childNodes collection and check for node type 3 (text). If you want the text nodes in child elements as well, repeat for the type 1 (element) child nodes.
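The steps above could be sketched roughly as follows. This is a minimal, untested sketch that assumes the project references the Microsoft.mshtml interop assembly; the class name DomTextWalker and the method names GetTextNodes and Walk are hypothetical and chosen only for illustration.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;
using mshtml; // assumes a reference to the Microsoft.mshtml interop assembly

// Hypothetical helper class, not part of Windows Forms or mshtml.
static class DomTextWalker
{
    // Yields the text of every text node under 'element', in document order.
    public static IEnumerable<string> GetTextNodes(HtmlElement element)
    {
        // The "QI": cast the underlying COM object to IHTMLDOMNode.
        var root = element.DomElement as IHTMLDOMNode;
        if (root == null)
            yield break;

        foreach (var text in Walk(root))
            yield return text;
    }

    private static IEnumerable<string> Walk(IHTMLDOMNode node)
    {
        // childNodes is typed as object in the interop assembly.
        var children = node.childNodes as IHTMLDOMChildrenCollection;
        if (children == null)
            yield break;

        for (int i = 0; i < children.length; i++)
        {
            var child = children.item(i) as IHTMLDOMNode;
            if (child == null)
                continue;

            if (child.nodeType == 3)
            {
                // Type 3: a text node; nodeValue holds its text.
                yield return child.nodeValue as string ?? string.Empty;
            }
            else if (child.nodeType == 1)
            {
                // Type 1: an element; recurse to reach its text nodes.
                foreach (var text in Walk(child))
                    yield return text;
            }
        }
    }
}
```

For the p element in the question, calling DomTextWalker.GetTextNodes on its HtmlElement should yield the text before the link, the link text, and the text after it as separate strings. It is probably safest to call it only after the control's DocumentCompleted event has fired, so that the DOM is fully built.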

