Building a Data Sharehouse – Agile Data Management and Industrial Data Space (IDS)


In the first two parts of this series we emphasized that manufacturing companies face enormous challenges in managing their data. It takes considerable skill to integrate the IT systems already in place: CAD, CAPP, MES, PPS and ERP, among many others. Technologies for IoT, cyber-physical systems, and Industry 4.0 make things even more complex. The amount of available sensor data is already huge and is growing day by day. Imagine: just one aircraft engine on a flight from Munich to Frankfurt/Main produces more than 1 TB of data. And if a standard vibration sensor generates approx. 1.3 TB of raw data per year, it is easy to see how hard it is to deal with the large and growing number of sensors on the shop floor. However, drowning in data is not an option. Two requirements are paramount:

  1. A clear specification of the data requirements with respect to timeliness, accuracy or relevance.
  2. Powerful tools to capture, store and prepare the data to enable value creation.

Digging Up Golden Data

In today’s world of digital transformation, data is an asset. It can be turned into information, and information can be turned into knowledge. Metaphorically speaking, data is like gold dust: process it, and it can become highly valuable gold jewelry. Widely recognized innovators like Google, Facebook and Amazon demonstrate the importance of data for disruptive success. The same holds for business-to-business (B2B) models. Companies striving for vertical and horizontal integration of their processes depend on seamless data exchange. This is why they have to address data management from a strategic perspective, that is, open the gates for exploratory analysis, self-learning processes and intelligent decision-making.

In Industry 4.0, this is where it gets complicated: it makes an enormous difference whether the sensor data from an elevator or a welding machine is owned by the buyer or the supplier of the machine. The value of the data for the supplier is obvious: they can compare the data with similar data from installations worldwide. This information can increase their competitiveness, help to reduce costs through predictive maintenance, or provide general information for further optimization of the equipment. Suppliers of mechanical parts gain a similar advantage. The machine owner, in turn, benefits from better service or, possibly, from the next generation of the machine. But in this scenario, it is the owner who has made the investment, who might upgrade the machine with sensors and who bears the burden of integrating sensor data into their MES, PPS, ERP or logistics systems.

This leads us to the basic questions: How can companies leverage the value of their data in a technically volatile and interconnected world? What is necessary to guarantee interoperability, to claim ownership, and to guarantee a certain level of data quality? How can we organize a collaborative vocabulary? How can we subscribe to a data source and report on the use of data?

Industrial Data Space Initiative (IDS)

These questions are being addressed by the Industrial Data Space initiative (IDS). This initiative was launched in Germany at the end of 2014 by representatives from business, politics, and research. The overall goal was to provide a reference architecture for the safe, secure and transparent exchange of data between the producers and (possible) consumers of industrial data. The research results are currently being transferred to an association of the same name, which now has more than 100 members, many of them from the top league of German industry.

The main focuses of IDS are:

  • Data sovereignty - the data owner must be able to specify the terms and conditions of use for their data
  • Easy linkage of data - a linked-data concept and common vocabularies will facilitate the integration of data between participants
  • Trust - all participants, data sources, and data services of the IDS will be certified according to defined rules
  • Secure data supply chain - data exchange will be secure across the entire data supply chain, i.e. from data creation and data capture to data usage
  • Data governance - participants will jointly decide on data management processes as well as on applicable rights and duties.
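The data sovereignty point can be made concrete with a small sketch. The following Python example is only an illustration, not part of the IDS reference architecture: the `UsagePolicy` class, its fields, and the consumer and purpose names are hypothetical. It shows how a data owner might attach terms of use to a data set and how a request could be checked against them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical terms of use that a data owner attaches to a data set."""
    allowed_consumers: frozenset
    allowed_purposes: frozenset
    valid_until: date

    def permits(self, consumer: str, purpose: str, on: date) -> bool:
        # A request is permitted only if all three conditions hold.
        return (
            consumer in self.allowed_consumers
            and purpose in self.allowed_purposes
            and on <= self.valid_until
        )


# Assumed example: the machine owner lets the supplier use sensor data
# for predictive maintenance only, until the end of 2018.
policy = UsagePolicy(
    allowed_consumers=frozenset({"machine-supplier"}),
    allowed_purposes=frozenset({"predictive-maintenance"}),
    valid_until=date(2018, 12, 31),
)

maintenance_ok = policy.permits("machine-supplier", "predictive-maintenance", date(2018, 6, 1))
marketing_ok = policy.permits("machine-supplier", "marketing", date(2018, 6, 1))
```

In this sketch the owner, not the consumer, defines the policy, which mirrors the idea that the terms and conditions travel with the data.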

This prominent approach requires very solid architectural foundations, which are based on four dimensions:

  • Business architecture addresses questions regarding the economic value of data, the quality of data, applicable rights and duties (data governance), and data management processes
  • Data and service architecture specifies - in an application- and technology-independent format - the functionality of the IDS, especially the functionality of the data services, on the basis of existing standards (vocabularies, semantic standards etc.)
  • Security architecture addresses questions concerning secure execution of application software, secure transfer of data, and prevention of data misuse
  • Software architecture specifies the software components required for pilot tests by IDS.

If we confine our view to the data and service architecture, the diagram gives a rough overview. The concept differentiates between three components: the connector for the exchange of data (request handling, data transformation, data preprocessing etc.), the broker (support and version control of the sources, search of sources, exchange agreements, monitoring etc.) and an app store (services for data transformation, quality support etc.). Through the app store, third parties can offer software code that can be injected into the connector to enrich the data with additional value (for instance from metadata or analytics).
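The interplay of the three components can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the IDS specification: the class names `Broker` and `Connector`, their methods, the source name, and the vibration threshold of 4.5 are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
App = Callable[[Record], Record]


class Connector:
    """Hypothetical connector: serves records and runs injected apps on them."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records
        self._apps: List[App] = []

    def install(self, app: App) -> None:
        # Code obtained from the app store is injected into the connector.
        self._apps.append(app)

    def handle_request(self) -> List[Record]:
        results = []
        for record in self._records:
            enriched = dict(record)  # leave the source data untouched
            for app in self._apps:
                enriched = app(enriched)
            results.append(enriched)
        return results


class Broker:
    """Hypothetical broker: registers data sources and lets consumers search them."""

    def __init__(self) -> None:
        self._sources: Dict[str, Connector] = {}

    def register(self, name: str, connector: Connector) -> None:
        self._sources[name] = connector

    def search(self, keyword: str) -> List[str]:
        return sorted(name for name in self._sources if keyword in name)

    def lookup(self, name: str) -> Connector:
        return self._sources[name]


def flag_high_vibration(record: Record) -> Record:
    """Hypothetical app-store app: enriches a record with an analytics flag."""
    return {**record, "high_vibration": record["vibration_mm_s"] > 4.5}


# Usage: the provider installs an app and registers the source with the broker;
# a consumer then finds the source via the broker and requests the data.
broker = Broker()
connector = Connector([{"vibration_mm_s": 2.0}, {"vibration_mm_s": 6.0}])
connector.install(flag_high_vibration)
broker.register("welding-machine-7/vibration", connector)

source_name = broker.search("vibration")[0]
result = broker.lookup(source_name).handle_request()
```

The design choice worth noting is that the broker only holds references to sources and never the data itself; the data stays with the connector, which fits the decentralized idea described above.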

The IDS is an ambitious and unique approach. Currently, proof-of-concept projects for “collaborative supply chain risk management”, “intelligent inventory information” and “dynamic time slot management and tracking in cross-enterprise supply chains”, among others, are being carried out to demonstrate the technical viability of IDS.

The Future of Innovation with Third Party Data

From the perspective of data management, IDS can help companies to open their Enterprise Data Warehouses, Data Lakes or Hadoop-based storage to third parties. Digitally transformed companies must be able to guarantee the quality of their data. No one would accept defective machines, so why should anyone accept invalid or erroneous data? The data provider (or their agent) has to prepare, merge and cleanse their data: fast, flexibly and without programming. And this is where Talend’s talent comes in. Talend and IDS complement each other perfectly. Talend’s Data Fabric can help the IDS stakeholders (Data Producer, Data Broker and Data User) to prepare, cleanse, transform, enrich, and integrate data assets. The Data Fabric’s ability to assure the quality of data and to create native code for integration into the Hadoop ecosystem is particularly valuable. While IDS is still in its infancy, Data Fabric is a sophisticated, market-leading product. It remains to be seen whether IDS will be able to gain traction in Germany and beyond and become a de facto standard for the exchange of data.

At QuinScape, an ambitious partner of Talend and a member of the IDS community, we are working on integrating Talend’s agile management capabilities with the opportunities of the IDS - for the benefit of our customers.


About the Author – Dr. Norbert Jesse

QuinScape GmbH

Jesse is co-founder and managing partner of QuinScape GmbH. QuinScape is a leading IT service provider for Talend, Jaspersoft/Spotfire, Kony and Intrexx. With 120 employees today, QuinScape is a partner of large corporations and internationally operating SMEs.

Jesse studied Social Sciences with an emphasis on economics and statistics at Ruhr-Universität Bochum. He received his Ph.D. for a thesis on analytics for multi-dimensional spatial data.

Jesse has organized or co-organized numerous international conferences (Fuzzy Days, FIRA World Congresses, CIRAS, Enterprise 2.0 etc.). He is a lecturer at TU Vienna and a Visiting Professor at the University of Business and Technology, Pristina. Furthermore, Jesse is the author or co-author of more than 55 conference papers and co-editor of 6 books.
