
Is There A Library For Extracting Data From An Html Page?

I would like to extract information from a web page. Unfortunately, the website (4chan) doesn't have a public API, as far as I know. What is a good library to extract specific

Solution 1:

What you are looking for is an HTML DOM parser.

This link to a previous question should help you out. Also check out this question.
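To give a rough idea of what parsing a page looks like in practice, here is a minimal sketch in Python using only the standard library's `html.parser` module. Note that this is an event-based parser rather than a full DOM tree, and the names `LinkExtractor` and `extract_links` are made up for this illustration. It assumes you already have the page's HTML as a string and want the target and text of each link.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) tuples
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) pairs found in the HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = '<p><a href="/a">First</a> and <a href="/b"><b>Second</b></a></p>'
    for href, text in extract_links(sample):
        print(href, text)
```

For messy real-world pages, a dedicated DOM library with selector support is usually more comfortable; the sketch above only shows the general shape of the approach.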


Solution 2:

That's right, there are many libraries for parsing HTML. For example, if you use Perl, you can use HTML::Parse.

If you just want a quick result and don't mind calling a system command, you can use:

lynx -dump http://4chan.org

or

links -dump http://4chan.org
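If you want to call one of these commands from a program rather than a shell, something along these lines might work. This is only a sketch in Python: `dump_command` and `dump_page` are hypothetical helper names, the 30-second timeout is an arbitrary choice, and it assumes `lynx` or `links` is installed and on your PATH.

```python
import shutil
import subprocess


def dump_command(url, browser="lynx"):
    """Build the argument list for a text-mode dump of `url`."""
    if browser not in ("lynx", "links"):
        raise ValueError(f"unsupported browser: {browser!r}")
    return [browser, "-dump", url]


def dump_page(url, browser="lynx", timeout=30):
    """Run the text browser and return the rendered page as plain text."""
    if shutil.which(browser) is None:
        raise FileNotFoundError(f"{browser} is not installed or not on PATH")
    result = subprocess.run(
        dump_command(url, browser),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output to str
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    print(dump_page("http://4chan.org"))
```

Passing the arguments as a list (instead of one shell string) avoids quoting problems if the URL contains special characters. The output is rendered text, so you still have to pick out the parts you need yourself.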
