cloudfire/jsoup
jsoup: Java HTML parser that makes sense of real-world HTML soup.

jsoup is a Java library for working with real-world HTML. It provides a convenient API
for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

jsoup implements the WHATWG HTML specification (http://whatwg.org/html), and parses HTML to the same DOM
as modern browsers do.

* parse HTML from a URL, file, or string
* find and extract data, using DOM traversal or CSS selectors
* manipulate the HTML elements, attributes, and text
* clean user-submitted content against a safe white-list, to prevent XSS
* output tidy HTML
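
The capabilities listed above can be sketched roughly as follows. This is a minimal illustration, not taken from the jsoup documentation: the class name `JsoupSketch`, the helper `firstLinkHref`, and the sample HTML are hypothetical, and it assumes the jsoup jar is on the classpath. It parses a string, selects a link with a CSS selector, reads its data, and modifies an attribute.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSketch {

    // Hypothetical helper: returns the href of the first link in the HTML,
    // or null if the HTML contains no <a href=...> element.
    static String firstLinkHref(String html) {
        Document doc = Jsoup.parse(html);            // parse HTML from a string
        Element link = doc.select("a[href]").first(); // find data with a CSS selector
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        // Sample input (an assumption for illustration); note the unclosed <p>.
        String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.";

        Document doc = Jsoup.parse(html);
        Element link = doc.select("a[href]").first();

        System.out.println(link.attr("href")); // prints "http://example.com/"
        System.out.println(link.text());       // prints "example"

        // Manipulate an attribute, then output the tidied body HTML.
        link.attr("rel", "nofollow");
        System.out.println(doc.body().html());
    }
}
```

The unclosed `<p>` in the sample is deliberate: the parser still builds a complete tree, so the selector finds the link without any extra handling.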

jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating
to invalid tag soup, and will create a sensible parse tree.

See http://jsoup.org/ for downloads and documentation.