    Andy Cross
    Aug 17, 2017

    Managing Data Lakes

    For some time we've been building a system for higher-level management of storage assets in a cloud context. The Elastacloud Data Catalogue (EDC) manages the state, history and utilisation of Cloud Storage. Unlike other offerings, the EDC takes a data-oriented view of storage rather than a directory-oriented view. The fundamental difference is that we manage how data is used and the effects that use has on storage, rather than treating storage as a directory of unknown content. EDC deals with data content, origin, lineage, provenance, quality, existence, utilisation and metrics.


    Core to this is a graph database that stores the assets under management on the low-level storage medium as edges associated with vertices that relate them together. This allows traversal of the database: discovering related files, measuring distances between them, and establishing the provenance of data assets held in BI tooling, the natural visualisation facet of advanced analytics.
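
    To make the traversal idea concrete, here is a minimal in-memory sketch in Python. It is not the EDC implementation or schema: for simplicity it models assets as vertices and "derived from" relations as edges, and the class name `LineageGraph`, its methods, and the file paths are all hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage graph (an assumption for illustration, not the EDC schema)."""

    def __init__(self):
        self.parents = defaultdict(set)   # asset -> assets it was derived from
        self.children = defaultdict(set)  # asset -> assets derived from it

    def add_derivation(self, source, derived):
        """Record that `derived` was produced from `source`."""
        self.parents[derived].add(source)
        self.children[source].add(derived)

    def provenance(self, asset):
        """Return every upstream asset that `asset` depends on."""
        seen, queue = set(), deque([asset])
        while queue:
            for parent in self.parents[queue.popleft()]:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen

    def distance(self, a, b):
        """Fewest hops between two assets, ignoring edge direction.

        Returns None when the assets are not connected.
        """
        queue, seen = deque([(a, 0)]), {a}
        while queue:
            node, hops = queue.popleft()
            if node == b:
                return hops
            for nxt in self.parents[node] | self.children[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
        return None


# Hypothetical assets: two raw files feeding one BI report.
graph = LineageGraph()
graph.add_derivation("raw/sales.csv", "clean/sales.parquet")
graph.add_derivation("clean/sales.parquet", "reports/sales_dashboard.pbix")
graph.add_derivation("raw/customers.csv", "reports/sales_dashboard.pbix")

upstream = graph.provenance("reports/sales_dashboard.pbix")
hops = graph.distance("raw/sales.csv", "raw/customers.csv")
```

    Here `upstream` holds the three files the report depends on, and `hops` is the length of the shortest path between the two raw files through the report they both feed. A production graph database would persist these relations and expose a query language for the same traversals.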


    We achieve this with low-friction meta-management, as shown in this diagram:



