Our standard setup for clients is the Semantic Nexus, a design whose data architecture is based on our best practices and years of client feedback. Company leadership, finance teams, and team leads use the Internal Analytics environment to track performance. Finance and planning managers rely on the semantic model to run ad-hoc calculations, all based on a single source of truth. After a successful deployment of the Nexus, the monthly company KPIs can be shared directly with shareholders and stakeholders through the metrics exchange.
## Deliverables
### Internal Analytics
Each design has a central [[Internal Analytics]] environment, where all core elements of the data model are presented to the entire company. This environment is governed by our company to keep it clean and consistent. All other self-service reports and dashboards created by team members themselves are not part of this Internal Analytics setup, but live outside it in their own environment. This approach ensures that the core building blocks and operational entities are not obscured or polluted.
### Connected spreadsheets
The design enables the finance and planning teams to connect their [[Microsoft Excel]] and [[Google Sheets]] workbooks to the semantic model, so they work with the same data as the Internal Analytics setup. This means less manual copy-and-pasting of data from reports, fewer mistakes thanks to the automated data sync, and no outdated data in the calculations. It frees these teams to pursue more business opportunities and optimisation than before.
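Under the hood, a connected sheet issues the same kind of query against the semantic model that the Internal Analytics reports use. As a minimal sketch, the TypeScript snippet below queries the [[Cube]] REST API; the host, token and the `orders.total_revenue` measure are hypothetical placeholders, not a real client deployment.

```typescript
// Minimal sketch of querying the Cube REST API, the same semantic model a
// connected spreadsheet reads from. Host, token and member names are
// hypothetical placeholders for a real deployment.
const CUBE_HOST = "https://analytics.example-client.com"; // hypothetical
const CUBE_TOKEN = process.env.CUBE_TOKEN ?? ""; // JWT issued for the Cube API

async function monthlyRevenue(): Promise<void> {
  const query = {
    measures: ["orders.total_revenue"], // hypothetical measure name
    timeDimensions: [
      { dimension: "orders.created_at", granularity: "month" },
    ],
  };

  const res = await fetch(
    `${CUBE_HOST}/cubejs-api/v1/load?query=${encodeURIComponent(JSON.stringify(query))}`,
    { headers: { Authorization: CUBE_TOKEN } },
  );
  if (!res.ok) throw new Error(`Cube returned ${res.status}`);
  const body = await res.json();

  // Each row holds one month plus the aggregated measure: exactly the
  // numbers the Internal Analytics reports are built on.
  for (const row of body.data) {
    console.log(row["orders.created_at.month"], row["orders.total_revenue"]);
  }
}

monthlyRevenue().catch(console.error);
```

Because the aggregation is defined once in the semantic model, a sheet, a dashboard and a script like this all see identical numbers.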
### Metrics exchange
All systems we deploy can be connected to the [[Metricsrouter]], an open-core metrics exchange between legal entities. It allows shareholders and stakeholders to pull in the metrics of the companies they hold, perform their own analysis and compare them with the performance of other investments.
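What "pulling in" metrics could look like in practice is sketched below. This is purely illustrative: the endpoint, token and response shape are invented placeholders and do not document the actual [[Metricsrouter]] API.

```typescript
// Purely illustrative sketch of a stakeholder pulling company KPIs out of
// a metrics exchange. The endpoint, token and response shape are invented
// placeholders and do NOT document the actual Metricsrouter API.
interface MetricPoint {
  metric: string; // e.g. "monthly_recurring_revenue"
  period: string; // e.g. "2024-05"
  value: number;
}

async function pullCompanyMetrics(entityId: string): Promise<MetricPoint[]> {
  const res = await fetch(
    `https://exchange.example.com/v1/entities/${entityId}/metrics`, // hypothetical
    { headers: { Authorization: `Bearer ${process.env.EXCHANGE_TOKEN}` } },
  );
  if (!res.ok) throw new Error(`Exchange returned ${res.status}`);
  return res.json();
}

async function main(): Promise<void> {
  // Compare the same KPI across two portfolio companies (hypothetical ids).
  const [a, b] = await Promise.all([
    pullCompanyMetrics("company-a"),
    pullCompanyMetrics("company-b"),
  ]);
  const kpi = "monthly_recurring_revenue";
  console.log(a.filter((m) => m.metric === kpi));
  console.log(b.filter((m) => m.metric === kpi));
}

main().catch(console.error);
```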
## Frameworks
### Standardised Data Architecture
The SDA is one of three [[Frameworks]] that are put into place when designing a data architecture. It provides us with the best practices to build a data pipeline and data model that are robust, flexible and low-cost. Years of experience, and of mistakes, have gradually produced this list, which we believe creates a solid foundation.
### Integrated Performance Framework
All components of an effective performance-tracking design need a clear place and relationship to one another. The IPF gives us a systematised way of presenting the data model and ensures it strikes the right balance between simplicity and depth.
### Modular Security Protocol
Our design preference for client-side and open-core systems makes the security environment more complex than a closed-source monolithic architecture. The MSP is therefore a set of checks that verifies the security requirements are met.
## Communication
### Workload backlog
Communication about all outstanding tasks, their size and their priority is tracked in the task management tool [[Trello]], though it can also live in systems like [Monday](https://monday.com/) or [Jira](https://www.atlassian.com/software/jira). The main goal of the workload management is a continuous-flow process with a backlog that gets prioritised monthly. We structure the tasks through boards with distinct phases:
- Backlog
- Discussion (>4 hrs)
- Discussion (<4 hrs)
- Can be done
- Doing
- Review
- Done
In the two discussion phases we distinguish between major and minor tasks, separated by a four-hour threshold. All development tasks require a green light from the Data Owner on the client's side. Minor ad-hoc issue resolutions and minor maintenance tasks are executed without any formal decision-making.
To keep the process clean and the work in progress (WIP) at a manageable level, the number of open tasks should not grow too large: first complete what you have already started before picking up something new.
### Shared Chat Channel
For all clients we have a shared chat channel, usually [[Slack]], though Microsoft Teams works as well. All relevant people are present in this channel. Major updates, issues and bug fixes are shared here, and requests from individual people also go into this channel for everyone to see.
This centralised communication approach prevents confusion and major back-channel conversations. All data-related topics should be discussed there.
### Documentation
The documentation, or 'Docs', covers all key components of the Data & Analytics setup:
- Architecture
- Data pipeline (flowchart)
- Data pipeline assets (link to asset overview)
- Extraction
- Computation
- Transformation
- Semantics
- Analytics
- Calculations
- Data Model
- Roadmap
For each of the above components we write down the relevant systems, assumptions, logic, owners and so on. This gives us one central place that describes how the data architecture is organised.
### Code repositories
We have a strong preference for dedicated, version-controlled git repositories. Custom forks of the [[Airbyte - Exact Online]] instance, the transformation tool [[dbt]] and the semantic models in [[Cube]] are all stored in the client's own [[GitHub]] or GitLab account.
We prefer this approach over managed repositories at the vendor or repositories owned by our engineering company. We believe that each client should have full sovereignty over their architecture and should be able to switch vendors easily, as described in our [[Design Philosophy]].
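To make concrete what lives in such a repository, below is a minimal sketch of a [[Cube]] data model file of the kind we would version in the client's account. The table and member names are hypothetical placeholders, not a real client schema.

```typescript
// model/cubes/orders.js: minimal, hypothetical Cube data model of the
// kind versioned in the client's own repository. Table and member names
// are placeholders, not a real client schema.
// `cube` is a global provided by Cube's model runtime.
cube(`orders`, {
  // Source table, typically produced by the dbt transformation layer.
  sql_table: `analytics.orders`,

  measures: {
    count: { type: `count` },
    // Defined once here, then reused by every report and connected sheet.
    total_revenue: { sql: `amount`, type: `sum` },
  },

  dimensions: {
    status: { sql: `status`, type: `string` },
    created_at: { sql: `created_at`, type: `time` },
  },
});
```

Keeping files like this in the client's own repository is what makes the vendor switch described above practical: the semantic definitions travel with the client, not with us or the vendor.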