Data

Data Concepts

Data objects in Cavaliba are best understood with the following concepts in mind.

Schema : the top-level description of a kind of asset, such as Devices, Applications, Laptops, Projects, Customers, Books, Facilities, Cooking Recipes, Invoices, etc.

Fields : the attributes of each Schema. Fields and Schemas can be modified at any time during the life of the system: add or remove fields, or change constraints.

Field Types - fields can be of a basic type such as string, int, float, boolean, date, IP address, text, … They can also have a more complex structure: other objects, enumerate (static) lists, users and groups, external and computed fields, etc. Fields can be single or multi-valued, and mandatory or optional. Fields can enforce constraints (e.g. an integer field must be greater than 4, a text field must be valid JSON or HTML, etc.)
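As an illustration of how a typed field with constraints might be checked, here is a minimal Python sketch. The FieldDef class, the validate function and the field names are hypothetical and are not part of Cavaliba; they only model the ideas of basic type, mandatory/optional, single/multi-valued and constraint described above.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional


# Hypothetical model of a field definition; not Cavaliba's actual classes.
@dataclass
class FieldDef:
    name: str
    cast: Callable[[Any], Any]      # converts raw input to the field's basic type
    mandatory: bool = False
    multi: bool = False             # True if the field holds a list of values
    check: Optional[Callable[[Any], bool]] = None  # extra constraint per value


def is_valid_json(text: str) -> bool:
    """Constraint helper: True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except (TypeError, ValueError):
        return False


def validate(field: FieldDef, raw: Any) -> list:
    """Return the list of error messages for one field (empty when valid)."""
    if raw is None or raw == [] or raw == "":
        return [f"{field.name}: value required"] if field.mandatory else []
    values = raw if field.multi else [raw]
    errors = []
    for v in values:
        try:
            v = field.cast(v)
        except (TypeError, ValueError):
            errors.append(f"{field.name}: cannot convert {v!r}")
            continue
        if field.check is not None and not field.check(v):
            errors.append(f"{field.name}: constraint failed for {v!r}")
    return errors


# Example fields mirroring the constraints mentioned in the text.
cpu_count = FieldDef("cpu_count", int, mandatory=True, check=lambda n: n > 4)
settings = FieldDef("settings", str, check=is_valid_json)
tags = FieldDef("tags", str, multi=True)
```

With these definitions, validate(cpu_count, 8) returns an empty list, while validate(cpu_count, 3) reports a failed constraint and validate(cpu_count, None) reports a missing mandatory value.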

Instance / asset / object : these terms refer to individual items managed by the Cavaliba Data solution. Each Instance belongs to one Schema and has values for some or all of that Schema's fields.

Schemas (and their Instances) have relationships : a geographical site is located in a region or country, a virtual machine is hosted on a physical server, etc.
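One way to picture such relationships is as fields whose value points to another Instance. The sketch below is only a mental model in Python: the dictionary layout, the schema names and the follow function are assumptions made for illustration, not Cavaliba's storage format or API.

```python
# Hypothetical in-memory instances, keyed by (schema name, instance key).
# A reference field simply holds the key of another instance.
instances = {
    ("country", "fr"): {"displayname": "France"},
    ("site", "paris-dc1"): {"displayname": "Paris DC1", "country": ("country", "fr")},
    ("server", "srv01"): {"displayname": "srv01", "site": ("site", "paris-dc1")},
    ("vm", "vm-web-01"): {"displayname": "vm-web-01", "host": ("server", "srv01")},
}


def follow(start, *field_names):
    """Follow a chain of reference fields and return every instance key visited."""
    path = [start]
    current = start
    for name in field_names:
        current = instances[current][name]
        path.append(current)
    return path


# From a virtual machine to its physical server, its site, then its country.
chain = follow(("vm", "vm-web-01"), "host", "site", "country")
```

Here chain holds the four keys visited, starting at the virtual machine and ending at the country.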

YAML/JSON - Schemas, Fields and Instances can be provided as regular YAML (or JSON) files. They can be uploaded to Cavaliba from the UI, the API or the CLI command, with appropriate permissions. You can also import CSV files.

Enumerate is a special Field Type which provides a predefined list of values. Think of Enumerates as static lists. An Enumerate can provide more than a single value per choice (e.g. a country Enumerate could provide both the ISO code and a human-friendly name for the country). An Enumerate can be seen as a minimal, rarely changing Schema.
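The country example can be sketched in Python as a static mapping in which each choice carries several values. The COUNTRY dictionary and the two helper functions are hypothetical names used for illustration only.

```python
# Hypothetical Enumerate: each choice key (an ISO code) maps to several values.
COUNTRY = {
    "FR": {"name": "France"},
    "DE": {"name": "Germany"},
    "JP": {"name": "Japan"},
}


def choices(enum):
    """Return (key, label) pairs, e.g. to populate a drop-down list."""
    return [(key, entry["name"]) for key, entry in enum.items()]


def resolve(enum, key):
    """Return the values attached to a choice, rejecting keys outside the list."""
    if key not in enum:
        raise ValueError(f"{key!r} is not a valid choice")
    return enum[key]
```

A field of this type would store only the key (for instance "FR"), while the user interface could display the associated name.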

Pipelines are components used to transform data at (bulk) import or export time. When uploading a CSV, rather than asking the sender to adapt to your exact Schema, use a pipeline to define the mapping and transformations. A large set of operators is available. A pipeline is itself a regular Cavaliba object, with a Schema and specific attributes. You can manage pipelines with the Web UI, CLI, YAML import files, etc.
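To make the idea concrete, the following Python sketch applies an ordered list of steps to each row of a CSV file so that the sender's column names and formats are mapped onto the target fields. The operator names (rename, lower, to_int, set), the column names and the functions are invented for this example; they are not Cavaliba's actual operators or syntax.

```python
import csv
import io

# Hypothetical pipeline: an ordered list of (operator, arguments) steps.
PIPELINE = [
    ("rename", {"old": "Host Name", "new": "hostname"}),
    ("rename", {"old": "RAM (GB)", "new": "memory_gb"}),
    ("lower", {"field": "hostname"}),
    ("to_int", {"field": "memory_gb"}),
    ("set", {"field": "source", "value": "csv-import"}),
]


def apply_step(row, op, args):
    """Apply one operator to a row and return the transformed copy."""
    row = dict(row)
    if op == "rename":
        if args["old"] in row:
            row[args["new"]] = row.pop(args["old"])
    elif op == "lower":
        row[args["field"]] = row[args["field"]].strip().lower()
    elif op == "to_int":
        row[args["field"]] = int(row[args["field"]])
    elif op == "set":
        row[args["field"]] = args["value"]
    else:
        raise ValueError(f"unknown operator: {op}")
    return row


def run_pipeline(csv_text, pipeline):
    """Read CSV text and pass every row through all pipeline steps in order."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for op, args in pipeline:
            row = apply_step(row, op, args)
        rows.append(row)
    return rows


# The sender's file uses its own headers and inconsistent casing.
SAMPLE = "Host Name,RAM (GB)\nSRV01 ,64\nSrv02,128\n"
result = run_pipeline(SAMPLE, PIPELINE)
```

After the run, each row in result uses the target field names hostname, memory_gb and source, with the hostname normalized to lowercase and the memory size converted to an integer.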

Dataviews are objects used to select a subset of a Schema's fields to be presented to your users in the Web UI for easy access. For a complex Schema, it helps to provide several dataviews. Dataviews are also regular objects, with a Schema and specific attributes, and can be managed as such.
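Conceptually, a dataview behaves like a named projection over an Instance's fields. The short Python sketch below illustrates this; the DATAVIEWS dictionary, the view names and the project function are hypothetical and do not reflect how Cavaliba stores or renders dataviews.

```python
# Hypothetical dataviews: each is a named, ordered subset of a Schema's fields.
DATAVIEWS = {
    "server_summary": ["hostname", "site", "status"],
    "server_hardware": ["hostname", "cpu_count", "memory_gb"],
}


def project(instance, dataview_name):
    """Keep only the fields listed in the dataview, in the dataview's order."""
    columns = DATAVIEWS[dataview_name]
    return {name: instance.get(name) for name in columns}


server = {
    "hostname": "srv01",
    "site": "paris-dc1",
    "status": "production",
    "cpu_count": 16,
    "memory_gb": 64,
    "serial": "ABC123",
}

summary = project(server, "server_summary")
hardware = project(server, "server_hardware")
```

The same Instance can thus be shown through several views, each exposing only the fields relevant to a given audience.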

Once Schemas are created or loaded from description files, Cavaliba Data Management automatically provides :

  • a Responsive Web UI for humans : create, view, edit, import, export, delete, enable/disable assets.
  • a Role/Permission model to handle authorization at Schema/Instance level
  • a REST API for machines: same operations, single instance/field level, or massive bulk transfer
  • a CLI command to import/export in batch from/to external systems
  • a relationship framework, including inheritance / propagation of fields and objects between Instances
  • Data storage at scale on standard relational databases (PostgreSQL, MariaDB/MySQL, …) with a dynamic Entity-Attribute-Value (EAV) data model.
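The Entity-Attribute-Value model mentioned in the last item can be illustrated with a small, generic Python sketch. It uses an in-memory SQLite database for convenience; the table names, column names and helper functions are assumptions made for this example and are not Cavaliba's actual database layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Generic EAV layout: one row per instance, one row per (instance, field) value.
conn.executescript("""
CREATE TABLE instance (
    id          INTEGER PRIMARY KEY,
    schema_name TEXT NOT NULL,
    keyname     TEXT NOT NULL,
    UNIQUE (schema_name, keyname)
);
CREATE TABLE attribute_value (
    instance_id INTEGER NOT NULL REFERENCES instance(id),
    field_name  TEXT NOT NULL,
    value       TEXT
);
""")


def save(schema_name, keyname, fields):
    """Store one instance and its field values (values are kept as text here)."""
    cur = conn.execute(
        "INSERT INTO instance (schema_name, keyname) VALUES (?, ?)",
        (schema_name, keyname),
    )
    conn.executemany(
        "INSERT INTO attribute_value (instance_id, field_name, value) VALUES (?, ?, ?)",
        [(cur.lastrowid, name, str(value)) for name, value in fields.items()],
    )


def load(schema_name, keyname):
    """Rebuild the field dictionary of one instance from its attribute rows."""
    rows = conn.execute(
        "SELECT a.field_name, a.value "
        "FROM attribute_value a JOIN instance i ON i.id = a.instance_id "
        "WHERE i.schema_name = ? AND i.keyname = ?",
        (schema_name, keyname),
    ).fetchall()
    return dict(rows)


# Two unrelated Schemas share the same two tables.
save("server", "srv01", {"cpu_count": 16, "site": "paris-dc1"})
save("book", "moby-dick", {"author": "Herman Melville"})
```

Because every field value is a row rather than a column, adding or removing a field in such a layout does not require altering the database tables, which fits the ability to modify Schemas and Fields at any time.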