How to parse Nested XML with same tag name using Jsoup

by vigneshchennai74 Updated: Feb 20, 2023

Solution Kit

This Java example uses the jsoup library to parse an XML hierarchy of categories and items in which nested elements share the same tag name. Instead of walking the entire document by hand, you use CSS selectors to extract only the category and item elements relevant to your task.

The code simplifies parsing and processing hierarchical data in XML documents. It demonstrates two techniques for selecting elements by their position in the document structure: the child combinator (category > item), which returns the item elements themselves, and the :has pseudo-selector (category:has(> item)), which returns the categories that directly contain at least one item. Both techniques apply to other XML documents, so the code is a useful starting point for working with XML data in Java.
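As a rough illustration of how these structure-based selectors differ, the sketch below counts the matches for three queries against the same sample hierarchy used in the kit. The class name SelectorComparison and the helper count are made up for this example; the third query, category:has(item), is not part of the kit and is included only to show what happens when the ">" is left out.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class SelectorComparison {

    // Same hierarchy as the kit's sample, written compactly.
    static final String XML =
            "<categories>"
            + "<category>abc<category>cde"
            + "<item>someid_1</item><item>someid_2</item>"
            + "<item>someid_3</item><item>someid_4</item>"
            + "</category></category>"
            + "<category>xyz<category>zwd<category>hgw"
            + "<item>someid_5</item>"
            + "</category></category></category>"
            + "</categories>";

    // Hypothetical helper: number of elements matched by a CSS query.
    static int count(String cssQuery) {
        Document doc = Jsoup.parse(XML, "", Parser.xmlParser());
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Items that are direct children of a category.
        System.out.println(count("category > item"));      // 5
        // Categories that have an item as a direct child (cde, hgw).
        System.out.println(count("category:has(> item)")); // 2
        // Categories with an item anywhere below them (all five).
        System.out.println(count("category:has(item)"));   // 5
    }
}
```

Without the leading ">", :has matches on any descendant, so every ancestor category is returned as well. That is usually not what you want when nested elements share a tag name.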

The classes org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.parser.Parser, and org.jsoup.select.Elements are part of jsoup, a Java library for working with HTML and XML documents.

org.jsoup.Jsoup - Provides static methods for parsing HTML and XML documents. It takes the document's source, such as a URL or a string, and returns an org.jsoup.nodes.Document object that represents the parsed document.
org.jsoup.nodes.Document - An in-memory representation of an HTML or XML document. It provides methods for querying and manipulating the document, such as selecting elements by tag name, attribute value, or CSS selector.
org.jsoup.parser.Parser - Determines which parsing rules are applied. It is a regular class rather than an enumeration: the static factory methods Parser.htmlParser() and Parser.xmlParser() return a parser of the corresponding kind. The HTML parser is the default, so XML input should be parsed by passing Parser.xmlParser() explicitly.
org.jsoup.select.Elements - A list of HTML or XML elements, typically the result of a CSS selector query. It provides methods for iterating over the selected elements and operating on them, such as getting their text content, attributes, or HTML representation.
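To make the role of the parser choice concrete, here is a minimal sketch that parses the same string with both parsers and reports the first top-level element each one produces. The class name ParserChoice, the helper topLevelTag, and the shortened XML string are assumptions made for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class ParserChoice {

    // Shortened sample input, used only for this illustration.
    static final String XML =
            "<categories><category>abc<item>someid_1</item></category></categories>";

    // Hypothetical helper: tag name of the first top-level element
    // in the document produced by the given parser.
    static String topLevelTag(Parser parser) {
        Document doc = Jsoup.parse(XML, "", parser);
        return doc.child(0).tagName();
    }

    public static void main(String[] args) {
        // The XML parser keeps the input's own root element.
        System.out.println(topLevelTag(Parser.xmlParser()));  // categories
        // The HTML parser wraps the content in an html/head/body skeleton.
        System.out.println(topLevelTag(Parser.htmlParser())); // html

        // Elements is a list, so it works with a for-each loop.
        Document doc = Jsoup.parse(XML, "", Parser.xmlParser());
        Elements items = doc.select("item");
        for (Element item : items) {
            System.out.println(item.text());                  // someid_1
        }
    }
}
```

Passing Parser.xmlParser() keeps the document tree identical in shape to the input, which makes structural selectors such as category > item easier to reason about.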

Using jsoup to parse and process XML hierarchies of categories and items is helpful in many applications that need to process XML data.


Code

This solution uses the jsoup library.

How to parse nested xml tags with the same tag name

Java | Lines of Code: 81 | License: Strong Copyleft (CC BY-SA 4.0)

Dependent Libraries :

<!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class Example {


    public static void main(String[] args) {
        String xml = "<categories>\n"
                + "    <category>abc\n"
                + "        <category>cde\n"
                + "            <item>someid_1</item>\n"
                + "            <item>someid_2</item>\n"
                + "            <item>someid_3</item>\n"
                + "            <item>someid_4</item>\n"
                + "        </category>\n"
                + "    </category>\n"
                + "    <category>xyz\n"
                + "       <category>zwd\n"
                + "          <category>hgw\n"
                + "             <item>someid_5</item>\n"
                + "          </category>\n"
                + "       </category>\n"
                + "    </category>\n"
                + " </categories>";

        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());

        //if you are interested in Items only
        Elements items = doc.select("category > item");
        items.forEach(i -> {
            System.out.println("Parent text: " +i.parent().ownText());
            System.out.println("Item text: "+ i.text());
            System.out.println();
        });


        //if you are interested in categories having at least one direct item element
        Elements categories = doc.select("category:has(> item)");
        categories.forEach(c -> {
            System.out.println(c.ownText());
            Elements children = c.children();
            children.forEach(ch -> {
                System.out.println(ch.text());
            });
            System.out.println();
        });
    }
}

Output:

Parent text: cde
Item text: someid_1

Parent text: cde
Item text: someid_2

Parent text: cde
Item text: someid_3

Parent text: cde
Item text: someid_4

Parent text: hgw
Item text: someid_5

cde
someid_1
someid_2
someid_3
someid_4

hgw
someid_5
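The output above shows only the immediate parent of each item. If the full chain of enclosing categories is needed, one possible approach is to walk each item's ancestors. This is a sketch built on the same sample data, not part of the original kit; the class name CategoryPath and the helpers pathOf and itemPaths are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class CategoryPath {

    // Same hierarchy as the kit's sample, written compactly.
    static final String SAMPLE =
            "<categories>"
            + "<category>abc<category>cde"
            + "<item>someid_1</item><item>someid_2</item>"
            + "<item>someid_3</item><item>someid_4</item>"
            + "</category></category>"
            + "<category>xyz<category>zwd<category>hgw"
            + "<item>someid_5</item>"
            + "</category></category></category>"
            + "</categories>";

    // Hypothetical helper: joins the own text of every enclosing
    // <category>, outermost first, e.g. "xyz/zwd/hgw".
    static String pathOf(Element item) {
        List<String> names = new ArrayList<>();
        // parents() lists ancestors from the closest to the furthest.
        for (Element ancestor : item.parents()) {
            if (ancestor.tagName().equals("category")) {
                names.add(ancestor.ownText());
            }
        }
        Collections.reverse(names);
        return String.join("/", names);
    }

    // Hypothetical helper: one "path: item" line per item element.
    static List<String> itemPaths(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        List<String> lines = new ArrayList<>();
        for (Element item : doc.select("category > item")) {
            lines.add(pathOf(item) + ": " + item.text());
        }
        return lines;
    }

    public static void main(String[] args) {
        // First line printed: abc/cde: someid_1
        // Last line printed:  xyz/zwd/hgw: someid_5
        itemPaths(SAMPLE).forEach(System.out::println);
    }
}
```

The tag-name check skips the outer categories element, so only the nested category names end up in the path.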

Copy the code above and paste it into your Java IDE.
Add the jsoup library to your project.
Run the file to get the output.

I hope you found this useful. The following sections list the dependent library and the versions used for testing.

I found this code snippet by searching for "How to parse xml tag with same tag Name" in kandi. You can try any such use case!

Environment Tested

I tested this solution with the following versions. Be mindful of changes when working with other versions.

The solution was created and executed with Java version "1.8.0_251".
The solution was tested with jsoup library version "1.13.1".

This solution parses nested XML with the same tag name using jsoup in Java in a few simple steps. It also gives you an easy-to-use, hassle-free way to build a hands-on working version of the code.

Dependent Library

jsoup by jhy

Java | 10188 | Version: jsoup-1.16.1 | License: Permissive (MIT)

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

If you do not have jsoup, which is required to run this code, add it to your project with the Maven dependency shown in the Code section, or follow the link above to the jsoup page in kandi. You can search kandi for any dependent library in the same way.

Support

For any support on kandi solution kits, please use the chat
For further learning resources, visit the Open Weaver Community learning page.
