Onboarding Guide#
The following section gives an overview of onboarding-related information, including service capabilities and limitations.
Getting Started#
Initial Contact
We will meet with you and the Bosch project with which the data sharing is planned to talk through the onboarding details. The following steps give an overview of the information required to ensure a smooth and quick connection setup.
Defining Infrastructure
DIANA Data Integration provides a variety of connectors that enable us to connect to multiple cloud infrastructure components. Please consult the Connector Catalog to find a suitable solution.
Technical Setup
Depending on the desired setup, certain key properties are required for the onboarding. All source- and sink-specific information can be derived from the Connector Catalog and from the details below.
Test runs
We recommend first setting up an end-to-end data flow with some test data. This helps ensure that all components are connected properly and allows for data automation and quality assessment. Once all parties confirm, either a production environment is created or the provided setup is lifted to production.
Best Practices
Below you can find the Code of Conduct. It helps you understand the boundaries of the DIANA Data Integration service.
Access Right#
To enable connectivity with our IngestAPI and ingest topics, we as the infrastructure owner will provide the necessary credentials. We utilize encrypted files to ensure the secure and confidential transmission of these secrets.
For data retrieval from external infrastructure, we ask the infrastructure owner to provide the required credentials securely and confidentially.
Code of Conduct#
The Data Integration service allows data providers to push data via a REST API endpoint. Abuse of this endpoint (e.g., penetration testing) is strictly prohibited.
Load or performance tests that exceed regular system use must be discussed and planned in advance.
Capabilities#
The Data Integration Service supports data streaming use cases.
Default max message size: 15 MB
For larger messages, contact the product team.
Messages are processed sequentially and stored in a customer-specific topic/queue. If ingestion exceeds processing speed, messages accumulate and delivery may be delayed.
Reference throughput table:
| Message Size | Message Frequency |
|---|---|
| 15 MB | 2 msgs/s |
| 5 MB | 10 msgs/s |
| < 1 MB | 50 msgs/s |
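As a rough illustration of how a data provider might use these reference values, the following sketch maps a message size to the reference frequency from the table and estimates how a backlog grows when ingestion outpaces processing. The function names are hypothetical, and several points are assumptions not stated in this guide: 1 MB is taken as 1,000,000 bytes, sizes between two table rows use the rate of the next larger row, and the table values are treated as a pacing guide rather than a hard limit.

```python
MB = 1_000_000  # assumption: decimal megabytes
MAX_MESSAGE_BYTES = 15 * MB  # default max message size from this guide


def reference_rate(size_bytes: int) -> int:
    """Return the reference frequency (msgs/s) for a given message size."""
    if size_bytes < 0:
        raise ValueError("message size must not be negative")
    if size_bytes > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds the default 15 MB limit")
    if size_bytes < 1 * MB:
        return 50
    if size_bytes <= 5 * MB:
        # assumption: sizes from 1 MB up to 5 MB use the 5 MB row
        return 10
    # assumption: sizes above 5 MB up to 15 MB use the 15 MB row
    return 2


def backlog_after(ingest_rate: float, processing_rate: float, seconds: float) -> float:
    """Estimate queued messages after `seconds` of sustained load."""
    return max(0.0, (ingest_rate - processing_rate) * seconds)
```

For example, sending 12 msgs/s of 5 MB messages against a reference rate of 10 msgs/s would leave roughly 120 messages queued after one minute, which is the delayed-delivery situation described above.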
Retention Time and Retry Mechanism#
To prevent data loss in the overall setup, proper retention periods and retry mechanisms must be configured in the connected components.
Retention Time#
To prevent data loss, the recommended retention time on infrastructure from which the Data Integration Service pulls data is 7 days. This allows us to cope with unexpected downtimes of either system, and our retry mechanisms ensure that the data is pulled after the connection has been re-established.
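The effect of the retention window can be illustrated with a small check: a record is still retrievable only if the pull happens within the retention period after the record was written. This is a minimal sketch; the function name and timestamps are hypothetical, and the actual retention setting depends on the source infrastructure.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # recommended retention time from this guide


def still_retained(written_at: datetime, pull_at: datetime,
                   retention: timedelta = RETENTION) -> bool:
    """Return True if a record written at `written_at` can still be pulled at `pull_at`."""
    return pull_at - written_at <= retention


written = datetime(2024, 1, 1, tzinfo=timezone.utc)
# Connection restored after a 5-day outage: the record is still available.
recovered_in_time = still_retained(written, written + timedelta(days=5))
# Connection restored after 8 days: the record has already expired.
recovered_too_late = still_retained(written, written + timedelta(days=8))
```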
Retry Mechanism#
The Data Integration Service ensures 99.5% availability of the ingest components (API and importer topics). Accordingly, the endpoints should generally be available for ingestion. Nevertheless, downtimes can occur, or responsiveness can be impacted by load.
We recommend building a retry mechanism on the Data Provider side to cope with HTTP 500 errors. Retries should span at least 3 days so that no data is lost.
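One possible shape for such a retry mechanism is exponential backoff with a capped delay and an overall deadline of 3 days. The sketch below is an assumption-laden illustration, not the service's prescribed client: `send` is a hypothetical callable that performs one request and returns the HTTP status code, the delay values are arbitrary, and all 5xx responses are treated as retryable. Because an in-process loop does not survive a restart, a real data provider would likely also persist unsent messages.

```python
import time
from typing import Callable

RETRY_SPAN_SECONDS = 3 * 24 * 60 * 60  # minimum retry span recommended in this guide


def send_with_retry(
    send: Callable[[], int],           # hypothetical: performs one request, returns status code
    *,
    initial_delay: float = 1.0,        # assumption: first wait of 1 second
    max_delay: float = 3600.0,         # assumption: never wait longer than 1 hour
    span: float = RETRY_SPAN_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Retry on 5xx responses until success or until the retry span is used up."""
    deadline = clock() + span
    delay = initial_delay
    while True:
        status = send()
        if status < 500:
            # 2xx counts as delivered; other non-5xx codes are not retried here.
            return 200 <= status < 300
        if clock() + delay > deadline:
            return False  # retry span exhausted, caller must handle the message
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without waiting in real time.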
Project Obligations#
DIANA Data Integration acts as an enabling service between Bosch projects and the external Data Provider or Data Consumer. The data sharing agreements need to be negotiated between these parties before a transfer can be facilitated via DIANA. Moreover, a data assessment must be carried out to ensure compliance with regional regulations.