regex - Remove HTML Tags in C -


In my program, I have downloaded a webpage with wget, and I want to extract the text on it into a string.

What should I do (if this is the right approach) to clear the HTML tags from the file so that I have only the text of the webpage?

I've never used regex in C, and I don't know if it is the right way or just trouble. Can you advise me of other alternatives, or libraries, that I can use? Or, if I should use regex, can you help me do this tag replacement in C?

sed -e 's/<[^>]\+>/ /g' file.html 

Thanks.

Regular expressions aren't suited to parsing HTML. As long as you have XHTML that's guaranteed to be valid XML, you can use an XML parser library to parse it.
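For comparison, here is a rough sketch of what the sed command from the question could look like in C using POSIX `<regex.h>`. The function name `strip_tags` is hypothetical, and this is only an illustration of the naive approach, not a recommendation: it replaces every match of `<[^>]+>` with a single space, so it will still mishandle comments, `<script>` and `<style>` contents, a `>` inside an attribute value, and character entities such as `&amp;`.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns a newly
 * malloc'd copy of `html` in which every match of <[^>]+> has been
 * replaced by one space. Returns NULL on failure. Caller frees. */
char *strip_tags(const char *html)
{
    assert(html != NULL);

    regex_t re;
    /* Same pattern as the sed command; with REG_EXTENDED the `+`
     * needs no backslash. */
    if (regcomp(&re, "<[^>]+>", REG_EXTENDED) != 0)
        return NULL;

    /* Every match is at least 3 characters long and is replaced by
     * 1 character, so the output never exceeds the input length. */
    char *out = malloc(strlen(html) + 1);
    if (out == NULL) {
        regfree(&re);
        return NULL;
    }

    const char *p = html;   /* read position in the input   */
    char *w = out;          /* write position in the output */
    regmatch_t m;

    while (regexec(&re, p, 1, &m, 0) == 0) {
        /* Copy the text before the tag, then emit one space. */
        memcpy(w, p, (size_t)m.rm_so);
        w += m.rm_so;
        *w++ = ' ';
        /* Continue scanning right after the tag. */
        p += m.rm_eo;
    }

    /* Copy whatever follows the last tag, including the terminator. */
    strcpy(w, p);

    regfree(&re);
    return out;
}
```

Calling `strip_tags("<p>Hello <b>world</b></p>")` yields the text with each tag turned into a space, and the result must be released with `free()`. For real pages, a proper HTML or XML parser library remains the safer choice, as noted above.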

