TDE Document Fragment

Using the document fragment in your views

When writing TDE templates, consider carefully which data to add to a view. Every field added to a view is indexed, which increases the size of your indexes as well as the insertion time of documents. Information that is only needed at runtime can instead be gathered from the document fragment associated with every row in the view.

All code samples in this article are based on the llamaverse dataset. Click here for information on how to load and configure the llamaverse.

Fragment id column

Both data access functions, op:from-view and op:from-lexicons, have a systemColumns parameter. This parameter can be used to add a column to the view that returns the fragment id of each row. The fragment id can be used in joins. However, selecting the fragment id column on its own adds nothing useful to your results:

xquery version "1.0-ml";
import module namespace op = "http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

xdmp:invoke-function(
    function() {
        op:from-view("llamaverse", "llamas", (), op:fragment-id-col("docId"))
            => op:select(("docId"))
            => op:result()
    },
    <options xmlns="xdmp:eval">
        <user-id>{xdmp:user("mother")}</user-id>
    </options>
)

'use strict';
const op = require('/MarkLogic/optic');

xdmp.invokeFunction(
    () => {
        return op.fromView('llamaverse', 'llamas', '', op.fragmentIdCol('docId'))
            .select(['docId'])
            .result();
    },
    {
        "userId": xdmp.user("mother")
    }
);

Doing so results in each document's fragment id being returned:

http://marklogic.com/fragment/00000031D38CAF3F
http://marklogic.com/fragment/00000051D38CAF3F
http://marklogic.com/fragment/00000041D38CAF3F

Document Fragment

The fragment id column becomes useful when creating a document join, which joins the document fragment to the row. Selecting the underlying document can be quite useful in some scenarios, but depending on the size of the dataset and of the documents, it can also be quite expensive. The document fragment can be selected as follows:

xquery version "1.0-ml";
import module namespace op = "http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

xdmp:invoke-function(
    function() {
        op:from-view("llamaverse", "llamas", (), op:fragment-id-col("docId"))
            => op:join-doc("doc", op:fragment-id-col("docId"))
            => op:select(("doc"))
            => op:result()
    },
    <options xmlns="xdmp:eval">
        <user-id>{xdmp:user("mother")}</user-id>
    </options>
)

'use strict';
const op = require('/MarkLogic/optic');

xdmp.invokeFunction(
    () => {
        return op.fromView('llamaverse', 'llamas', '', op.fragmentIdCol('docId'))
            .joinDoc('doc', op.fragmentIdCol('docId'))
            .select(['doc'])
            .result();
    },
    {
        "userId": xdmp.user("mother")
    }
);

XPath

As mentioned in the opening paragraph, the document fragment can be used to gather information that is only required at runtime. Using the joined document, additional columns can be added to the results of the view with XPath. The following example adds the birthDate column to the results:

xquery version "1.0-ml";
import module namespace op = "http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

xdmp:invoke-function(
    function() {
        op:from-view("llamaverse", "llamas", (), op:fragment-id-col("docId"))
            => op:join-doc("doc", op:fragment-id-col("docId"))
            => op:select((
                "name",
                op:as("birthDate", op:xpath("doc", "/envelope/instance/birthDate"))
            ))
            => op:result()
    },
    <options xmlns="xdmp:eval">
        <user-id>{xdmp:user("mother")}</user-id>
    </options>
)

'use strict';
const op = require('/MarkLogic/optic');

xdmp.invokeFunction(
    () => {
        return op.fromView('llamaverse', 'llamas', '', op.fragmentIdCol('docId'))
            .joinDoc('doc', op.fragmentIdCol('docId'))
            .select([
                'name',
                op.as('birthDate', op.xpath('doc', '/envelope/instance/birthDate'))
            ])
            .result();
    },
    {
        "userId": xdmp.user("mother")
    }
);

Conclusion

Using the fragment id column can be quite useful in some scenarios. It can be used to join the document to the row and gather additional information that is not part of an existing view. However, use it with caution, as it can be quite expensive depending on the size of the dataset and of the documents. It is therefore recommended to join the document fragment only after where clauses and pagination have been applied to the results.
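
The recommendation to filter and paginate before joining the document could look roughly like the sketch below. It is an illustration rather than a tuned query: the function name llamaPageWithBirthDate, its parameters, and the choice to filter and order on the name column are assumptions made for this example. The Optic module is passed in as the op argument so that the plan-building logic is kept in one function.

```javascript
'use strict';

// Sketch: narrow the row set on indexed view columns first, then join the
// document fragment only for the rows that are left.
// Assumptions (not from the article): the function name, its parameters,
// and filtering/ordering on the "name" column are illustrative only.
function llamaPageWithBirthDate(op, fromName, offset, pageSize) {
    return op.fromView('llamaverse', 'llamas', '', op.fragmentIdCol('docId'))
        // where clause on an indexed column: no document access needed yet
        .where(op.ge(op.col('name'), fromName))
        // stable ordering so consecutive pages do not overlap
        .orderBy('name')
        // pagination, still working on index data only
        .offsetLimit(offset, pageSize)
        // only now join the documents, for at most pageSize rows
        .joinDoc('doc', op.fragmentIdCol('docId'))
        .select([
            'name',
            op.as('birthDate', op.xpath('doc', '/envelope/instance/birthDate'))
        ]);
}
```

Inside MarkLogic, the function would be called with the real module, for example llamaPageWithBirthDate(require('/MarkLogic/optic'), 'M', 0, 10).result(), wrapped in the same xdmp.invokeFunction call as the samples above. Because joinDoc comes after where and offsetLimit, documents should only be fetched for the rows on the requested page.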
