The FDZ Data Set – Introduction:
Start your journey into the world of ObjectAnalytics
A word of friendly “warning” before you start:
Xplain Data ObjectAnalytics® represents a completely new paradigm. It will take you some time to get used to working analytically with entire objects rather than with a flat table. Allow yourself time to adapt to this change and embrace the new universe of data science possibilities.
Better be prepared: You may find that, by the end of this journey, you never want to go back to working with flat tables – because the world is not flat.
Note: Access Free Online-Demos on the Demo Server
When you click one of the Data Model Links below (leading to the Demo Server with the Free Online-Demos) for the first time, you’ll be redirected to the registration page.
- If you’re not yet registered, you’ll need to sign up.
- If you already have an account, just log in. You’ll then see links to the Demo Server and can access the demos directly.
Subsequent clicks on a Data Model link will take you straight to the Demo Server.
Even with Paula & Robo’s explanations, it may still take some time to get used to thinking in terms of objects instead of flat tables.
To get comfortable, use the Xplain Data ObjectExplorer, a web-based interface that lets you explore an ObjectAnalytics model.
Three prepared ObjectAnalytics models are available on a server hosted by Xplain Data. You only need a browser (Google Chrome is recommended) to get started.
Example 1: FDZ Health Data (FDZ – Forschungsdatenzentrum Gesundheit/German Health Data Lab)
This dataset stems from a public use file published by the FDZ / Health Data Lab at the German Federal Institute for Drugs & Medical Devices (BfArM). Details (German only) can be found here or here. It contains data from over 700k patients, comprising millions of diagnoses and prescriptions.
For privacy reasons, the original dataset has been decorrelated. We then took this complex real-world dataset and simulated certain dependencies into it. The resulting semi-synthetic dataset can be used to test the strengths of ObjectAnalytics and our Causal Discovery algorithms in a realistic scenario.
Click to launch the FDZ Health Data Model on the Demo Server.
Before you begin experimenting, we recommend watching the following videos. They provide examples of how to use the Xplain Data ObjectExplorer (XOE):
In this video, you’ll learn about the FDZ dataset’s structure (root object and sub-objects) and discover how to analyze event sequences using the Relative Time Axis.
The FDZ Dataset – Alzheimer’s Disease Example:
In this Causal Discovery example, we took the decorrelated dataset and simulated specific relationships into it, including factors for Alzheimer’s disease. We then test whether our algorithms can successfully identify these causal relationships in complex data.
The FDZ Dataset – Create a Causal Discovery Model:
Find out how to build your own Causal Discovery model configuration for any desired target.
Example 2: Patient Data | Electronic Medical Records (EMR)
Real-world objects can be highly complex, particularly in healthcare. In this domain, the primary object is usually the “Patient”.
For example, an electronic medical record (EMR) is not a flat structure: its schema may contain over a hundred interconnected tables, including diagnoses, prescriptions, diagnostic procedures, surgeries, and lab results.
This illustrates that a patient is a complex entity with numerous sub-objects. In public health fund implementations, the patient serves as the root object, supported by approximately 50 related objects and recursive sub-objects. This structure enables a comprehensive 360-degree view of the patient.
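The difference between an object and a flat table can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the class names, fields, and sample values below are hypothetical and are not taken from any Xplain Data model or API. The sketch shows a patient as a root object with two kinds of sub-objects, and what happens when the same data is forced into one flat table by a join.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Diagnosis:          # hypothetical sub-object
    code: str
    day: date


@dataclass
class Prescription:       # hypothetical sub-object
    drug: str
    day: date


@dataclass
class Patient:            # hypothetical root object
    patient_id: int
    diagnoses: List[Diagnosis] = field(default_factory=list)
    prescriptions: List[Prescription] = field(default_factory=list)


def flatten(patient: Patient) -> List[Tuple[int, str, str]]:
    """Join both sub-object lists into flat rows, as a single table would."""
    return [
        (patient.patient_id, d.code, p.drug)
        for d in patient.diagnoses
        for p in patient.prescriptions
    ]


# Made-up sample data for one patient.
patient = Patient(
    patient_id=1,
    diagnoses=[
        Diagnosis("I10", date(2020, 1, 5)),
        Diagnosis("E11", date(2020, 3, 1)),
    ],
    prescriptions=[
        Prescription("ramipril", date(2020, 1, 6)),
        Prescription("metformin", date(2020, 3, 2)),
        Prescription("metformin", date(2020, 6, 2)),
    ],
)

flat_rows = flatten(patient)
print(len(flat_rows))  # 2 diagnoses x 3 prescriptions = 6 rows for one patient
```

In the object form, the patient appears once and counting prescriptions is simply `len(patient.prescriptions)`. In the flat form, the same patient is spread over six rows, and every count has to correct for the duplication introduced by the join.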
This data cannot be published, but here is a smaller object model and a video illustrating the ObjectAnalytics paradigm in health data.
The key aspect of ObjectAnalytics becomes evident when applying sophisticated algorithms, such as Causal Discovery, which use its object-oriented data structures to efficiently analyze cause-and-effect relationships.
Since you can only uncover what exists in the data, the comprehensive view provided by ObjectAnalytics is ideal for identifying causal factors & confounders.
Watch this video to learn about the importance of ObjectAnalytics for patient data and to see the Causal Discovery algorithms in action:
Data Discovery on Patient Data (Breast Cancer Example):
Example 3: Business Process Intelligence (BPI)
The data used in this example comes from the BPI Challenge 2019. This dataset represents a business process, specifically the sequence of events involved in processing a purchase order with multiple items & actions.
Click to launch the BPI ObjectAnalytics model on the Demo Server.
Before you begin experimenting, we recommend watching the following videos. They provide examples of how to use the Xplain Data ObjectExplorer (XOE):
BPI Analysis Example 1:
This video presents a simple introductory example of an object model within the ObjectAnalytics paradigm. It explains the root object, its sub-objects & the rationale behind their selection. Additionally, we will demonstrate some typical operations on an object tree, specifically aggregations along the tree’s edges.
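As a rough illustration of what aggregating along the edges of an object tree means, here is a minimal sketch in plain Python. It does not use the Xplain Data API; the tree layout (purchase order, then items, then events), the field names, and the sample values are assumptions made up for this example.

```python
# Hypothetical object tree: purchase order -> items -> events.
purchase_orders = [
    {
        "order_id": "PO-1",
        "items": [
            {"item_id": "PO-1/10", "net_value": 100.0,
             "events": ["create item", "goods receipt", "invoice receipt"]},
            {"item_id": "PO-1/20", "net_value": 50.0,
             "events": ["create item", "delete item"]},
        ],
    },
    {
        "order_id": "PO-2",
        "items": [
            {"item_id": "PO-2/10", "net_value": 20.0,
             "events": ["create item"]},
        ],
    },
]


def aggregate_order(order):
    """Roll values from the item and event level up to the purchase order."""
    items = order["items"]
    return {
        "order_id": order["order_id"],
        "item_count": len(items),                                    # one edge up
        "total_net_value": sum(i["net_value"] for i in items),       # one edge up
        "event_count": sum(len(i["events"]) for i in items),         # two edges up
    }


summary = [aggregate_order(order) for order in purchase_orders]
for row in summary:
    print(row)
```

Each result row describes exactly one root object, so the summary has one row per purchase order regardless of how many items or events hang below it.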
BPI Analysis Example 2:
In this example, we introduce the Relative Time Axis (RTA), a key tool in ObjectAnalytics for analyzing the relationships between different sub-objects.
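The idea behind a relative time axis can be sketched as follows, assuming it means expressing each event's time relative to a chosen reference event within the same root object instead of in calendar time. This is a plain-Python illustration with a hypothetical function name and made-up events, not the XOE implementation.

```python
from datetime import date


def relative_days(events, anchor_name):
    """Return (name, offset_in_days) pairs relative to the first anchor event.

    `events` is a list of (name, date) pairs belonging to one root object.
    Returns an empty list if the anchor event never occurs.
    """
    anchor_days = [day for name, day in events if name == anchor_name]
    if not anchor_days:
        return []
    anchor = min(anchor_days)
    return [(name, (day - anchor).days) for name, day in events]


# Made-up events for a single purchase order item.
events = [
    ("create item", date(2019, 1, 2)),
    ("goods receipt", date(2019, 1, 12)),
    ("invoice receipt", date(2019, 1, 15)),
]

offsets = relative_days(events, "goods receipt")
print(offsets)
# [('create item', -10), ('goods receipt', 0), ('invoice receipt', 3)]
```

Once every object is aligned on its own reference event, questions such as "what typically happens in the days before or after a goods receipt?" can be answered across all objects, even though the events happened on different calendar dates.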
BPI Analysis Example 3:
In this example, we’ll cover how to analyze event sequences.
After watching the videos, you’ll be ready to take your first steps independently, as the ObjectExplorer’s basic features are user-friendly. For advanced features, refer to the detailed documentation videos in our documentation area.
Example 4: National Basketball Association (NBA) Data
This example uses data from the NBA, covering teams, players & matches from the 1946 to 2023 seasons. The raw data is sourced from the R package nbastatR.
Click to launch the NBA ObjectAnalytics model on the Demo Server.
Some example videos:
The NBA Dataset – Analysis Example 1:
This is a short introduction to the NBA Dataset. What are the important objects and how are they organized in the object tree? How can we aggregate data with respect to different objects?
The NBA Dataset – Analysis Example 2:
A Causal Discovery example: what are the causal factors for players winning awards such as Most Valuable Player (MVP) or Rookie of the Year (ROY)?
Example 5: It’s all Code: Python & other Interfaces
You can code ObjectAnalytics with Python directly in the integrated JupyterLite in XOE.
Click to try it out with the BPI dataset on the Demo Server.
Make sure that the Developer mode is activated in XOE:
Start JupyterLite from XOE:
For technical details & API references for the Python services in ObjectAnalytics, see: https://docs.xplain-data.de/xplaindoc/interfaces/xplainpy.html