How to parse XML with Jsoup

by vigneshchennai74 Updated: Jan 1, 2023

Java has advantages over many other languages and environments that make it suitable for almost any programming task, and it is easy to learn. Java also has many useful libraries, and jsoup is one of the best known. jsoup is a Java library for parsing HTML documents, and it suits both beginners and professionals, covering basic and advanced HTML parsing. It also offers an XML parser mode, which this solution uses to parse nested XML.
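Since jsoup is primarily an HTML parser, it helps to see why the solution below passes `Parser.xmlParser()` explicitly. The following is a minimal sketch, not part of the original kit: the class name `ParserModes` and its helper methods are hypothetical, and the input is a shortened version of the XML used later. It counts the `html`, `head`, and `body` wrapper elements that each parser mode produces.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Hypothetical helper class for illustration only.
public class ParserModes {

    // Parse with jsoup's default HTML parser.
    public static Document asHtml(String markup) {
        return Jsoup.parse(markup);
    }

    // Parse with jsoup's XML parser, as the solution in this article does.
    public static Document asXml(String markup) {
        return Jsoup.parse(markup, "", Parser.xmlParser());
    }

    // Count the html/head/body wrapper elements present in a parsed document.
    public static int wrapperCount(Document doc) {
        return doc.select("html, head, body").size();
    }

    public static void main(String[] args) {
        String markup = "<categories><category>abc</category></categories>";
        System.out.println("HTML parser wrappers: " + wrapperCount(asHtml(markup)));
        System.out.println("XML parser wrappers: " + wrapperCount(asXml(markup)));
    }
}
```

The HTML parser wraps the input in an `html`/`head`/`body` skeleton, while the XML parser keeps the tree as written, which is usually what you want when the input is XML rather than a web page.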


Preview of the output that you will get on running this code.

Code

In this solution we have used the jsoup library.

Java | Lines of Code: 81 | License: CC BY-SA 4.0

Dependent Libraries :
<!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class Example {


    public static void main(String[] args) {
        String xml = "<categories>\n"
                + "    <category>abc\n"
                + "        <category>cde\n"
                + "            <item>someid_1</item>\n"
                + "            <item>someid_2</item>\n"
                + "            <item>someid_3</item>\n"
                + "            <item>someid_4</item>\n"
                + "        </category>\n"
                + "    </category>\n"
                + "    <category>xyz\n"
                + "       <category>zwd\n"
                + "          <category>hgw\n"
                + "             <item>someid_5</item>\n"
                + "          </category>\n"
                + "       </category>\n"
                + "    </category>\n"
                + " </categories>";

        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());

        //if you are interested in Items only
        Elements items = doc.select("category > item");
        items.forEach(i -> {
            System.out.println("Parent text: " +i.parent().ownText());
            System.out.println("Item text: "+ i.text());
            System.out.println();
        });


        //if you are interested in categories having at least one direct item element
        Elements categories = doc.select("category:has(> item)");
        categories.forEach(c -> {
            System.out.println(c.ownText());
            Elements children = c.children();
            children.forEach(ch -> {
                System.out.println(ch.text());
            });
            System.out.println();
        });
    }
}

Parent text: cde
Item text: someid_1

Parent text: cde
Item text: someid_2

Parent text: cde
Item text: someid_3

Parent text: cde
Item text: someid_4

Parent text: hgw
Item text: someid_5

cde
someid_1
someid_2
someid_3
someid_4

hgw
someid_5
  1. Copy the code using the "Copy" button above and paste it into your Java IDE.
  2. Add the jsoup library to your project, for example with the Maven dependency listed above.
  3. Run the file to get the output.


I hope you found this useful. The following sections list the dependent library and its version information.


I found this code snippet by searching for "How to parse xml tag with same tag Name" in kandi. You can try any such use case!

Dependent Library

jsoup by jhy

Java | Stars: 9951 | Version: 1.15.1

License: Permissive (MIT)

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

If you do not have jsoup, which is required to run this code, you can add it to your project with the Maven dependency shown in the Code section, or follow the link above to the jsoup page in kandi. You can search for any dependent library on kandi in the same way.

Environment Test

I tested this solution with the following versions. Be mindful of changes when working with other versions.


  1. The solution was created and executed with Java version "1.8.0_251".
  2. The solution was tested with jsoup library version "1.13.1".


In this solution we parse nested XML whose elements share the same tag name, using jsoup in Java, in a few simple steps. This process also facilitates an easy-to-use, hassle-free way to create a hands-on working version of the code for parsing nested XML with repeated tag names using jsoup.

Support

  1. For any support on kandi solution kits, please use the chat
  2. For further learning resources, visit the Open Weaver Community learning page.
