jsoup - Tutorial #1: Setup and Basic Commands

in #utopian-io, 6 years ago (edited)


What Will You Learn?
In this tutorial you will learn about jsoup: what it is, its basic features, and how to set it up for development.

What is jsoup
jsoup is a Java-based library for working with HTML content. It provides a very convenient API to extract and manipulate data, using the best of DOM, CSS, and jQuery-like methods. It implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.
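
To make that description concrete, here is a minimal sketch of parsing an HTML string and querying it with a CSS selector. The class name `JsoupParseSketch`, the helper `linkTexts`, and the sample HTML are made up for illustration. `Jsoup.parse`, `Document.title`, `Document.select`, and `Element.text` are jsoup API calls. The sketch assumes the jsoup jar is already on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class JsoupParseSketch {

    // Parses an HTML string and returns the text of every <a> element
    // that carries an href attribute (hypothetical helper for this sketch).
    public static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("a[href]"); // CSS selector, jQuery style
        List<String> texts = new ArrayList<>();
        for (Element link : links) {
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample markup, invented for this example.
        String html = "<html><head><title>Sample</title></head>"
                + "<body><p>Hello <a href='https://example.com/'>example</a></p></body></html>";

        Document doc = Jsoup.parse(html);
        System.out.println("Title: " + doc.title());
        System.out.println("Links: " + linkTexts(html));
    }
}
```

The helper returns plain strings rather than jsoup types so that the calling code does not need to know about `Elements`. Whether that is the right boundary depends on your application.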

Requirements

  • Basic Java programming
  • A good grasp of OOP concepts is a plus

Difficulty

  • Intermediate

jsoup - Overview

jsoup implements the WHATWG HTML5 specification and parses HTML content to the same DOM as modern browsers do. The jsoup library provides the following functionality:

  • Multiple Read Support - It reads and parses HTML from a URL, file, or string.
  • CSS Selectors - It can find and extract data using DOM traversal or CSS selectors.
  • DOM Manipulation - It can manipulate HTML elements, attributes, and text.
  • Prevent XSS attacks - It can clean user-submitted content against a given safe white-list, to prevent XSS attacks.
  • Tidy - It outputs tidy HTML.
  • Handles invalid data - jsoup can handle unclosed tags and implicit tags, and can reliably create the document structure.
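
The features listed above can be sketched roughly as follows. The class `JsoupFeaturesSketch`, its method names, and all sample inputs are hypothetical and exist only for this illustration. It assumes a jsoup release that ships `org.jsoup.safety.Safelist` (older releases named this class `Whitelist`), and `fromUrl` needs network access, so it is defined but not called here.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Safelist;

import java.io.File;
import java.io.IOException;

public class JsoupFeaturesSketch {

    // Multiple read support: a string, a file, or a URL all yield a Document.
    public static Document fromString(String html) {
        return Jsoup.parse(html);
    }

    public static Document fromFile(File file) throws IOException {
        return Jsoup.parse(file, "UTF-8"); // second argument is the charset name
    }

    public static Document fromUrl(String url) throws IOException {
        return Jsoup.connect(url).get(); // performs a real HTTP GET
    }

    // DOM manipulation: add target="_blank" to every link and return the body HTML.
    public static String openLinksInNewTab(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("a[href]").attr("target", "_blank");
        return doc.body().html();
    }

    // XSS prevention: keep only the tags and attributes allowed by the basic safelist.
    public static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    // Invalid data: the parser closes the first <p> when it meets the second one.
    public static int countParagraphs(String html) {
        return Jsoup.parse(html).select("p").size();
    }

    public static void main(String[] args) {
        System.out.println(openLinksInNewTab("<a href='https://example.com/'>home</a>"));
        System.out.println(sanitize(
                "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>"));
        System.out.println(countParagraphs("<p>One<p>Two"));
        // Tidy output: the serialized body comes back as well-formed HTML.
        System.out.println(fromString("<p>One<p>Two").body().html());
    }
}
```

`Safelist.basic()` is only one of several presets. Which preset fits depends on how much markup you want to allow in user-submitted content.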

Local Environment Setup
jsoup is a library for Java, so the very first requirement is to have the JDK installed on your machine.

System Requirement

Step 1: Verify Java Installation on Your Machine

Step 2: Set JAVA Environment

Step 3: Download jsoup Archive

Step 4: Set jsoup Environment

Step 5: Set CLASSPATH Variable

Source: Tutorialspoint.com
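
Once the steps above are done, a small program can confirm that the setup works. Step 1 is usually checked by running `java -version` in a terminal. The sketch below does a similar check from inside Java and also reports whether jsoup is visible on the classpath. The class `JsoupSetupCheck` and its methods are hypothetical names for this illustration and are not part of jsoup. The only assumption about jsoup itself is that its entry class is `org.jsoup.Jsoup`.

```java
public class JsoupSetupCheck {

    // Returns true when the named class can be loaded from the current classpath.
    public static boolean classAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Convenience wrapper for the jsoup entry class.
    public static boolean jsoupOnClasspath() {
        return classAvailable("org.jsoup.Jsoup");
    }

    public static void main(String[] args) {
        // Steps 1 and 2: report which Java runtime is actually in use.
        System.out.println("Java version: " + System.getProperty("java.version"));

        // Step 5: show the effective classpath and whether jsoup is on it.
        System.out.println("Classpath: " + System.getProperty("java.class.path"));
        System.out.println(jsoupOnClasspath()
                ? "jsoup found on the classpath"
                : "jsoup NOT found; check the CLASSPATH variable or the -cp option");
    }
}
```

Because the check uses reflection, the program compiles even when jsoup is missing, which makes it usable as a first diagnostic before writing any real jsoup code.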
