XML Basics for the Java Programmer. Part 2 of 3

Introduction

Hello, dear readers. This is the second article in the series about XML, and it covers XML namespaces and XML Schema.

Until quite recently I knew nothing about these topics myself, but I have worked through a lot of material and will try to explain both in simple words. I should say right away that schemas are a very advanced mechanism for validating XML documents, much more capable than DTD, so this article will not cover them from start to finish. Let's get started :)

XML Namespace

XML Namespace is a technology whose main purpose is to keep element names unique within an XML file so that there is no confusion. And since these are Java courses: Java has the same idea in the form of packages. If we could put two classes with the same name side by side and use them, how would we determine which class we need? Packages solve this problem: we place the classes in different packages and then either import the one we need or refer to it by its fully qualified name.

Now, we can do this:

public class ExampleInvocation {
    public static void main(String[] args) {
        // Create an instance of the class from the first package.
        example_package_1.Example example1 = new example_package_1.Example();

        // Create an instance of the class from the second package.
        example_package_2.Example example2 = new example_package_2.Example();

        // Create an instance of the class from the third package.
        example_package_3.Example example3 = new example_package_3.Example();
    }
}

XML namespaces work in much the same way, with small differences. The essence is the same: if elements have the same name (like classes), we put them in different namespaces (like packages). Then, even if element names coincide, we always refer to a specific element from a specific namespace. For example, suppose one XML file contains two different kinds of oracle element: a prediction (an oracle) and an Oracle database.

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <oracle>
        <connection value="jdbc:oracle:thin:@10.220.140.48:1521:test1" />
        <user value="root" />
        <password value="111" />
    </oracle>

    <oracle>
        Today you will be busy all day.
    </oracle>
</root>

When we process this XML file, we will be seriously confused if we get a prediction where we expected a database, or vice versa. To resolve the name collision, we can give each element its own namespace. There is a special attribute for this: xmlns:prefix="unique value for the namespace". After declaring it, we put the prefix in front of an element name to show that the element belongs to that namespace (in Java terms: we first declare the package, then qualify each class with the package it comes from).

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <database:oracle xmlns:database="Unique ID #1">
        <connection value="jdbc:oracle:thin:@10.220.140.48:1521:test1" />
        <user value="root" />
        <password value="111" />
    </database:oracle>

    <oracle:oracle xmlns:oracle="Unique ID #2">
        Today you will be busy all day.
    </oracle:oracle>
</root>

In the XML above, we declared two namespaces with the prefixes database and oracle, and those prefixes can now be used in front of element names. Don't worry if something is still unclear; it is actually very simple. At first I wanted to get through this part quickly, but then I decided the topic deserves more attention, because it is easy to get confused or miss something. So from here on we will look closely at the xmlns attribute. Another example:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="https://www.standart-namespace.com/" xmlns:gun="https://www.gun-shop.com/" xmlns:fish="https://www.fish-shop.com/">
    <gun:shop>
        <gun:guns>
            <gun:gun name="Revolver" price="1250$" max_ammo="7" />
            <gun:gun name="M4A1" price="3250$" max_ammo="30" />
            <gun:gun name="9mm Pistol" price="450$" max_ammo="12" />
        </gun:guns>
    </gun:shop>

    <fish:shop>
        <fish:fishes>
            <fish:fish name="Shark" price="1000$" />
            <fish:fish name="Tuna" price="5$" />
            <fish:fish name="Capelin" price="1$" />
        </fish:fishes>
    </fish:shop>
</root>

This is ordinary XML that uses the gun namespace for elements that belong to the gun shop and the fish namespace for elements that belong to the fish shop. Notice that the same shop element is used for two different things, a gun shop and a fish shop, and we still know exactly which shop is which because each one sits in its own namespace. This gets most interesting with schemas, where the same trick lets us validate different structures that share element names. xmlns is the attribute that declares a namespace, and it can be placed on any element. An example of a namespace declaration:

xmlns:shop="https://barber-shop.com/"

The part after the colon is the prefix: a short alias for the namespace that can then be put in front of element names to show that they come from that namespace. The value of xmlns must be a UNIQUE STRING. This is extremely important to understand: namespaces are usually declared with website links, that is, URIs. That convention exists because a URI you control is unique, BUT this is exactly where people get confused, because nothing is downloaded from that address. Just remember: the parser accepts ANY string as the value, but to guarantee uniqueness and follow the standard you should use a URI. The oracle example shows that any string works:

xmlns:oracle="Unique ID #2"
xmlns:database="Unique ID #1"
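
To see from the Java side that the parser cares about the namespace value and not about the prefix, here is a minimal sketch using the DOM parser that ships with the JDK. The class and method names (OracleNamespaces, parse, count) are made up for this illustration, and the XML is a shortened version of the oracle example with the prediction text in English.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class OracleNamespaces {
    // Shortened copy of the article's oracle example (hypothetical test data).
    static final String XML =
        "<root>"
        + "<database:oracle xmlns:database=\"Unique ID #1\">"
        + "<user value=\"root\"/>"
        + "</database:oracle>"
        + "<oracle:oracle xmlns:oracle=\"Unique ID #2\">Today you will be busy all day.</oracle:oracle>"
        + "</root>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // namespace support is off by default
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Counts elements with the given local name in the given namespace ("*" matches any namespace).
    static int count(Document doc, String namespace, String localName) {
        return doc.getElementsByTagNameNS(namespace, localName).getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        NodeList all = doc.getElementsByTagNameNS("*", "oracle");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            // Same local name, different prefix and namespace value.
            System.out.println(e.getPrefix() + ":" + e.getLocalName() + " -> " + e.getNamespaceURI());
        }
    }
}
```

Looking elements up by namespace value rather than by prefix is the safer habit, because the author of an XML file is free to pick any prefix.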

A declared namespace can be used on the element that declares it and on every element inside it, so namespaces declared on the root element are available to all elements. You can see this in the previous example, and here is a more specific one:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <el1:element1 xmlns:el1="Element#1 Unique String">
        <el1:innerElement>

        </el1:innerElement>
    </el1:element1>


    <el2:element2 xmlns:el2="Element#2 Unique String">
        <el2:innerElement>

        </el2:innerElement>
    </el2:element2>


    <el3:element3 xmlns:el3="Element#3 Unique String">
        <el3:innerElement>
            <el1:innerInnerElement> <!-- Not allowed: the el1 prefix is declared only on the first element, so it can be used only on element1 and the elements inside it. -->

            </el1:innerInnerElement>
        </el3:innerElement>
    </el3:element3>
</root>
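
A namespace-aware parser enforces this scoping rule. The sketch below, with hypothetical names (PrefixScopeCheck, parses) and two shortened test documents, feeds the JDK's DOM parser one document where el1 is used inside its scope and one where it is used outside.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class PrefixScopeCheck {
    // el1 is used only inside the element that declares it.
    static final String IN_SCOPE =
        "<root><el1:a xmlns:el1=\"Element#1 Unique String\"><el1:b/></el1:a></root>";

    // el1 is used inside el3:c, where it was never declared.
    static final String OUT_OF_SCOPE =
        "<root><el1:a xmlns:el1=\"Element#1 Unique String\"/>"
        + "<el3:c xmlns:el3=\"Element#3 Unique String\"><el1:b/></el3:c></root>";

    // Returns true when a namespace-aware parser accepts the document.
    static boolean parses(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // The parser reports the unbound prefix as a fatal error.
            System.out.println("Rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("in scope: " + parses(IN_SCOPE));
        System.out.println("out of scope: " + parses(OUT_OF_SCOPE));
    }
}
```

Note that without setNamespaceAware(true) the parser treats el1:b as an ordinary name with a colon in it and does not perform this check.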

Here is an important detail: an element written without a prefix does not belong to any namespace unless a default namespace is in scope. Declaring prefixed namespaces does not change that, so an unprefixed root simply stays outside all of them. If you want unprefixed elements to belong to a namespace as well, declare a default namespace: write xmlns without a prefix and give it a value, and every element without a prefix inside that scope will belong to that namespace. The earlier gun and fish example did exactly this:

<root xmlns="https://www.standart-namespace.com/" xmlns:gun="https://www.gun-shop.com/" xmlns:fish="https://www.fish-shop.com/">

We declared the default namespace explicitly so that the root element needs neither the gun nor the fish prefix: the root is neither a fish shop nor a gun shop, so putting it in either namespace would be logically wrong. Next: if you declare xmlns:a and xmlns:b with the same value, they are the same namespace, and the two prefixes are just two names for it. So always use distinct values for distinct namespaces, because breaking this rule can cause a lot of errors. For example, if our namespaces were declared like this:

xmlns="https://www.standart-namespace.com/" xmlns:gun="https://www.gun-shop.com/" xmlns:fish="https://www.gun-shop.com/"

then our fish shop would become a gun shop, even though its prefix still says fish. Those are the key points about namespaces. I spent quite a lot of time collecting them, trimming them down, and then putting them clearly, because the material on namespaces on the Internet is vast and often mostly filler, so most of what is here I learned by trial and error. If you still have questions, try the materials linked at the end of the article.
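
The two rules about default namespaces and repeated values can be checked with a few lines of Java. This is only a sketch: the class and helper names (SameUriSameNamespace, shops, gunShops) are invented for the illustration, and the document is a stripped-down version of the gun and fish example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SameUriSameNamespace {
    static final String DEFAULT_NS = "https://www.standart-namespace.com/";
    static final String GUN = "https://www.gun-shop.com/";
    static final String FISH = "https://www.fish-shop.com/";

    // Builds a small document; fishUri is the value bound to the fish prefix.
    static String shops(String fishUri) {
        return "<root xmlns=\"" + DEFAULT_NS + "\""
            + " xmlns:gun=\"" + GUN + "\" xmlns:fish=\"" + fishUri + "\">"
            + "<gun:shop/><fish:shop/></root>";
    }

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // How many shop elements does the parser place in the gun namespace?
    static int gunShops(String xml) throws Exception {
        return parse(xml).getElementsByTagNameNS(GUN, "shop").getLength();
    }

    public static void main(String[] args) throws Exception {
        // The unprefixed root picks up the default namespace.
        System.out.println(parse(shops(FISH)).getDocumentElement().getNamespaceURI());
        // Distinct values: only gun:shop is a gun shop.
        System.out.println(gunShops(shops(FISH)));
        // Same value under both prefixes: fish:shop is now a gun shop too.
        System.out.println(gunShops(shops(GUN)));
    }
}
```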

XML Schema

I want to say right away that this article shows only the tip of the iceberg, since the topic is very extensive. If you want to study schemas in more detail and learn to write schemas of any complexity yourself, the links at the end of the article cover types, restrictions, extensions, and so on. Let's start with theory. Schemas are stored in .xsd (XML Schema Definition) files and are a more advanced and popular alternative to DTDs: they also declare elements, describe them, and so on. On top of that they add a lot of bonuses: type checking, namespace support, and more.

Remember that when we talked about DTD, one drawback was that it does not support namespaces? Now that we know what namespaces are, I can explain: if we could attach two or more DTDs that declare elements with the same names, we would get collisions and could not use them at all, because it would be unclear which element we mean. XSD solves this problem because each schema can be attached to one specific namespace. Essentially, each XSD schema has a target namespace, which says which namespace in the XML file the schema's elements belong to. So in the XML file itself we just declare the namespaces that the schemas target and assign prefixes to them, then attach the necessary schema to each one. After that we can safely use elements from a schema by putting the prefix of its namespace in front of them. So, here is an example:

<?xml version="1.0" encoding="UTF-8"?>
<house>
    <address>5 Yesenin Street</address>
    <owner name="Ivan">
        <telephone>+38-094-521-77-35</telephone>
    </owner>
</house>

We want to validate it with a schema. First, we need a schema:

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="https://www.nedvigimost.com/">
    <element name="house">
        <complexType>
            <sequence>
                <element name="address" type="string" maxOccurs="unbounded" minOccurs="0" />
                <element name="owner" maxOccurs="unbounded" minOccurs="0" >
                    <complexType>
                        <sequence>
                            <element name="telephone" type="string" />
                        </sequence>
                        <attribute name="name" type="string" use="required"/>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>

As you can see, schemas are XML files too: you write what you need directly in XML. This schema describes the XML file from the example above. For example, if an owner has no name, the schema will catch it. Also, because of the sequence element, the address must always come first, followed by the owner of the house. Elements are either simple or complex. Simple elements store only a value of some data type. Example:

<element name="telephone" type="string" />

This declares an element that stores a string. It cannot contain other elements. Complex elements, in contrast, can contain other elements and attributes. For them you do not need to specify the type attribute; it is enough to write a complexType inside the element.

<complexType>
    <sequence>
        <element name="address" type="string" maxOccurs="unbounded" minOccurs="0" />
        <element name="owner" maxOccurs="unbounded" minOccurs="0" >
            <complexType>
                <sequence>
                    <element name="telephone" type="string" />
                </sequence>
                <attribute name="name" type="string" use="required"/>
            </complexType>
        </element>
    </sequence>
</complexType>
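
To watch the schema catch the two mistakes mentioned above (an owner without a name, and an owner placed before the address), here is a minimal sketch that uses the javax.xml.validation API from the JDK. The names HouseValidation, house and isValid are made up for the illustration. One assumption to point out: because the schema declares a targetNamespace, the test documents put the house root element into that namespace with a prefix, while the nested elements stay unprefixed.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class HouseValidation {
    // The schema from the article, inlined as a string.
    static final String XSD =
        "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\" targetNamespace=\"https://www.nedvigimost.com/\">"
        + "<element name=\"house\"><complexType><sequence>"
        + "<element name=\"address\" type=\"string\" maxOccurs=\"unbounded\" minOccurs=\"0\"/>"
        + "<element name=\"owner\" maxOccurs=\"unbounded\" minOccurs=\"0\"><complexType>"
        + "<sequence><element name=\"telephone\" type=\"string\"/></sequence>"
        + "<attribute name=\"name\" type=\"string\" use=\"required\"/>"
        + "</complexType></element>"
        + "</sequence></complexType></element></schema>";

    static final String ADDRESS = "<address>5 Yesenin Street</address>";
    static final String OWNER =
        "<owner name=\"Ivan\"><telephone>+38-094-521-77-35</telephone></owner>";
    static final String NAMELESS_OWNER =
        "<owner><telephone>+38-094-521-77-35</telephone></owner>";

    // Wraps the given content in a house element that lives in the target namespace.
    static String house(String content) {
        return "<n:house xmlns:n=\"https://www.nedvigimost.com/\">" + content + "</n:house>";
    }

    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no error handler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            System.out.println("Invalid: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(house(ADDRESS + OWNER)));          // address, then owner
        System.out.println(isValid(house(ADDRESS + NAMELESS_OWNER))); // required attribute missing
        System.out.println(isValid(house(OWNER + ADDRESS)));          // wrong order for sequence
    }
}
```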

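You could also do it differently: declare the complex type separately, give it a name, and then refer to that name in the type attribute. There is one catch. A type name without a prefix is looked up in the default namespace, and in the schemas above the default namespace is the XML Schema namespace itself, where no type called content exists. So the XML Schema namespace is bound to the xs prefix instead, and the target namespace gets a prefix of its own (tns here), which we use to refer to our type. In general, it turned out like this:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tns="https://www.nedvigimost.com/" targetNamespace="https://www.nedvigimost.com/">
    <xs:element name="house" type="tns:content" />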

    <xs:complexType name="content">
        <xs:sequence>
            <xs:element name="address" type="xs:string" maxOccurs="unbounded" minOccurs="0" />
            <xs:element name="owner" maxOccurs="unbounded" minOccurs="0" >
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="telephone" type="xs:string" />
                    </xs:sequence>
                    <xs:attribute name="name" type="xs:string" use="required"/>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:schema>

So we can declare our own types separately and then refer to them in the type attribute. This is very convenient, because it lets you use one type in several places. Finally, I would like to talk about attaching schemas to XML files. There are two ways to attach a schema: to a specific namespace, or without one.

The first way to attach a schema

The first way assumes that the schema has a specific target namespace. It is set with the targetNamespace attribute on the schema element. Then it is enough to declare THE SAME namespace in the XML file and "load" the schema into it:

<?xml version="1.0" encoding="UTF-8"?>
<nedvig:house xmlns:nedvig="https://www.nedvigimost.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://www.nedvigimost.com/ example_schema1.xsd">
    <address>5 Yesenin Street</address>
    <owner name="Ivan">
        <telephone>+38-094-521-77-35</telephone>
    </owner>
</nedvig:house>

It is important to understand two lines:

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.nedvigimost.com/ example_schema1.xsd"

The first line you just have to remember. Think of it as the object that helps load schemas where they need to go. The second line does the actual loading. schemaLocation accepts a whitespace-separated list of namespace and location pairs. The first value in a pair is a namespace, which must match the target namespace of the schema (its targetNamespace value). The second value is a relative or absolute path to the schema. And since the value is a LIST, you can add a space after the schema path from the example and then write another target namespace and the path to another schema, and so on, as many as you like. Important: for the schema to actually validate anything, you need to declare this namespace and use it with a prefix. Look closely at the last example again:

<?xml version="1.0" encoding="UTF-8"?>
<nedvig:house xmlns:nedvig="https://www.nedvigimost.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://www.nedvigimost.com/ example_schema1.xsd">
    <address>5 Yesenin Street</address>
    <owner name="Ivan">
        <telephone>+38-094-521-77-35</telephone>
    </owner>
</nedvig:house>

We declared the target namespace with the nedvig prefix and then used that prefix. Our elements are validated because house now sits in the namespace that the schema targets.
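
The pair structure of schemaLocation is easy to see if you read the attribute from Java. The sketch below only parses the attribute and does not load or apply any schema. The class name SchemaLocationPairs and the second pair (https://www.example.org/other with other.xsd) are hypothetical, added to show how the list continues.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class SchemaLocationPairs {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // A root element with two namespace/location pairs; the second pair is made up.
    static final String XML =
        "<nedvig:house xmlns:nedvig=\"https://www.nedvigimost.com/\""
        + " xmlns:xsi=\"" + XSI + "\""
        + " xsi:schemaLocation=\"https://www.nedvigimost.com/ example_schema1.xsd"
        + " https://www.example.org/other other.xsd\"/>";

    // Splits the xsi:schemaLocation value of the root element into namespace -> location pairs.
    static Map<String, String> pairs(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element root = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();

        Map<String, String> result = new LinkedHashMap<>();
        String value = root.getAttributeNS(XSI, "schemaLocation").trim();
        if (value.isEmpty()) {
            return result; // attribute missing or empty
        }
        String[] tokens = value.split("\\s+");
        // Tokens alternate: namespace, location, namespace, location, ...
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            result.put(tokens[i], tokens[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        pairs(XML).forEach((namespace, location) ->
            System.out.println(namespace + " is described by " + location));
    }
}
```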

The second way to attach a schema

The second way assumes that the schema has no target namespace. Then you can simply attach it to an XML file, and it will validate the file. It is done in almost the same way, except that the XML file does not need to declare a namespace for the schema; you just point to the schema.

<?xml version="1.0" encoding="UTF-8"?>
<house xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="example_schema1.xsd">
    <address>5 Yesenin Street</address>
    <owner name="Ivan">
        <telephone>+38-094-521-77-35</telephone>
    </owner>
</house>

As you can see, this is done with noNamespaceSchemaLocation and the path to the schema (here we assume that example_schema1.xsd was written without a targetNamespace attribute). Even though the schema has no target namespace, the document will be validated. And the final touch: a schema can import other schemas and then use their types and elements. This lets us reuse in one schema what is already declared in another. Example:

The schema where the owner type is declared:

<?xml version="1.0" encoding="UTF-8" ?>
<schema targetNamespace="bonus" xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
        <complexType name="owner">
            <all>
                <element name="telephone" type="string" />
            </all>
            <attribute name="name" type="string" />
        </complexType>
</schema>

The second schema, where the owner type from the first schema is used:

<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="main" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:bonus="bonus" elementFormDefault="qualified">
    <import namespace="bonus" schemaLocation="xsd2.xsd" />
    <element name="house">
        <complexType>
            <all>
              <element name="address" type="string" />
                <element name="owner" type="bonus:owner" />
            </all>
        </complexType>
    </element>
</schema>

The second schema uses this construct:

<import namespace="bonus" schemaLocation="xsd2.xsd" />

With it, we imported the types and elements of the other schema, which live in the bonus namespace. That gives us access to the bonus:owner type, which we used on the next line:

<element name="owner" type="bonus:owner" />

Also pay attention to this attribute:

elementFormDefault="qualified"

This attribute is set on the schema element. With the value qualified, every element declared in the schema, including nested ones, must be namespace-qualified in XML files, which in our example means writing an explicit prefix in front of each element. Without it (the default is unqualified), only the top-level element needs the prefix, and the elements inside it are written without one. And here, in fact, is an example of an XML file validated by a schema that imports another schema:

<?xml version="1.0" encoding="UTF-8"?>
<nedvig:house xmlns:nedvig="main" xmlns:bonus="bonus" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="main xsd.xsd">
    <nedvig:address>5 Yesenin Street</nedvig:address>
    <nedvig:owner name="Ivan">
        <bonus:telephone>+38-094-521-77-35</bonus:telephone>
    </nedvig:owner>
</nedvig:house>

In the line:

<bonus:telephone>+38-094-521-77-35</bonus:telephone>

we have to use the bonus prefix, which is bound to the target namespace of the first schema: elementFormDefault is set to qualified there, so every element must explicitly indicate its namespace, and telephone is declared in the bonus schema.
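
The import and the effect of elementFormDefault="qualified" can be tried out from Java as well. The following sketch writes the two schemas to a temporary directory under the file names used in the article (xsd.xsd and xsd2.xsd), so that the relative schemaLocation in the import resolves, and then validates documents against the main schema. The class and method names are invented for the illustration, and the schema is handed to the validator explicitly instead of being picked up from xsi:schemaLocation.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class ImportedSchemaValidation {
    // The schema that declares the owner type (xsd2.xsd in the article).
    static final String BONUS_XSD =
        "<schema targetNamespace=\"bonus\" xmlns=\"http://www.w3.org/2001/XMLSchema\""
        + " elementFormDefault=\"qualified\">"
        + "<complexType name=\"owner\"><all><element name=\"telephone\" type=\"string\"/></all>"
        + "<attribute name=\"name\" type=\"string\"/></complexType></schema>";

    // The schema that imports it (xsd.xsd in the article).
    static final String MAIN_XSD =
        "<schema targetNamespace=\"main\" xmlns=\"http://www.w3.org/2001/XMLSchema\""
        + " xmlns:bonus=\"bonus\" elementFormDefault=\"qualified\">"
        + "<import namespace=\"bonus\" schemaLocation=\"xsd2.xsd\"/>"
        + "<element name=\"house\"><complexType><all>"
        + "<element name=\"address\" type=\"string\"/>"
        + "<element name=\"owner\" type=\"bonus:owner\"/>"
        + "</all></complexType></element></schema>";

    // Writes both schemas next to each other and loads the main one from its file.
    static Schema loadMainSchema() throws Exception {
        Path dir = Files.createTempDirectory("xsd-demo");
        Files.write(dir.resolve("xsd2.xsd"), BONUS_XSD.getBytes(StandardCharsets.UTF_8));
        Path main = dir.resolve("xsd.xsd");
        Files.write(main, MAIN_XSD.getBytes(StandardCharsets.UTF_8));
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(main.toFile());
    }

    // Builds the article's document; telephoneTag is the qualified name used for telephone.
    static String house(String telephoneTag) {
        return "<nedvig:house xmlns:nedvig=\"main\" xmlns:bonus=\"bonus\">"
            + "<nedvig:address>5 Yesenin Street</nedvig:address>"
            + "<nedvig:owner name=\"Ivan\">"
            + "<" + telephoneTag + ">+38-094-521-77-35</" + telephoneTag + ">"
            + "</nedvig:owner></nedvig:house>";
    }

    static boolean isValid(Schema schema, String xml) throws Exception {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            System.out.println("Invalid: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Schema schema = loadMainSchema();
        System.out.println(isValid(schema, house("bonus:telephone")));  // namespace of the bonus schema
        System.out.println(isValid(schema, house("nedvig:telephone"))); // wrong namespace
        System.out.println(isValid(schema, house("telephone")));        // no namespace at all
    }
}
```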

End of article

The next article will be the last in the series and will cover processing XML files with Java. We will learn how to get information out of them in different ways, and so on. I hope this article was useful and that, even if there are mistakes somewhere, it taught you something new, or at least helped you understand XML files better. For those who would like to explore the topic in more detail, I put together a small set of links:

XSD Simple Elements: start reading from this article and keep going; all the information about schemas is collected there and explained fairly clearly, though only in English. You can use a translator.
A video on namespaces: it is always good to hear a different point of view if the first one was not clear.
Namespace XML: a good example of using namespaces, with fairly complete information.
XML Basics - Namespaces: another short article on namespaces.
The Basics of Using XML Schema to Define Elements: also an extremely useful reference on schemas, but you need to read it slowly and carefully, digging into the material.

That's all for sure. I hope that if you want to learn any of this in more depth, the links will help. I went through all of these sources myself while studying the material, and they were the most useful of everything I looked at: each one either improved my understanding of what I had already read elsewhere or taught me something new, although a lot also came simply from practice. So, for those who really want to understand all this well, my advice is: learn namespaces first, then how to attach schemas to XML files, and then how to describe a document's structure in a schema. And most importantly, practice. Thank you all for your attention, and good luck in programming :)