00:00: This video is a quick walkthrough of object analytics and causal discovery using
00:04: an example. It is intended as an introduction and familiarization
00:08: with the terms and concepts. It should also serve as motivation, as
00:12: you will see what unique things can be done.
00:16: You can't identify causal factors that aren't present in your data.
00:19: So comprehensive data is essential for causal discovery.
00:23: Take rich patient data as an example.
00:25: The root object is the patient.
00:27: In typical projects there are up to 50 related objects and recursive
00:32: sub-objects that provide a full 360-degree view of the patient.
00:37: In this demo, we will use a simplified model.
00:39: The root is the patient or person.
00:44: Let's look at the gender attribute. You'll see that there are about 70,000
00:48: male and female patients. The
00:53: age distribution shows that there are 1200 patients over 100 years
00:57: old. If you select these patients you see
01:01: that most of them are female.
01:10: But that's just a warm-up. Let's take a look at the sub-objects.
01:14: The direct sub-objects are master data, diagnoses, prescriptions,
01:19: and hospital cases. A hospital case is itself
01:22: a complex object with sub-objects such as procedures, charge
01:26: items, and hospital diagnoses.
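The object tree described here can be pictured as a nested data structure. The following is only a minimal sketch in plain Python, not the product's data model: all class and field names are assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class MasterData:
    gender: str
    birth_year: int


@dataclass
class Diagnosis:
    icd_code: str          # e.g. "C50" = malignant neoplasm of breast (ICD-10)
    diagnosed_on: date


@dataclass
class Prescription:
    atc_code: str          # e.g. "N02..." = analgesics (ATC)
    prescribed_on: date


@dataclass
class HospitalCase:
    # A hospital case is itself a complex object with its own sub-objects.
    admitted_on: date
    procedures: List[str] = field(default_factory=list)
    charge_items: List[str] = field(default_factory=list)
    hospital_diagnoses: List[Diagnosis] = field(default_factory=list)


@dataclass
class Patient:
    # Root object with its direct sub-objects.
    patient_id: int
    master_data: MasterData
    diagnoses: List[Diagnosis] = field(default_factory=list)
    prescriptions: List[Prescription] = field(default_factory=list)
    hospital_cases: List[HospitalCase] = field(default_factory=list)


# Invented example patient, for illustration only.
patient = Patient(
    patient_id=1,
    master_data=MasterData(gender="F", birth_year=1950),
    diagnoses=[Diagnosis("C50", date(2015, 3, 1))],
    prescriptions=[Prescription("N02BE01", date(2014, 1, 10))],
    hospital_cases=[HospitalCase(date(2015, 3, 5), procedures=["biopsy"])],
)
```

The point of the nesting is that every event stays attached to its patient, so later analyses can move up and down the hierarchy without joins.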
01:29: We want to understand what causes breast cancer.
01:31: So we dive into the diagnosis sub-object.
01:38: Breast cancer can be found in the ICD catalog in the neoplasm section.
01:50: We select those patients. The
02:06: flag at the root shows us how many patients we have with the target disease. In
02:13: total, there are 3300 patients with breast cancer.
02:18: Among women, the percentage is 4.5%.
02:31: Can painkillers cause breast cancer? Let's find out.
02:35: We open the prescriptions sub-object.
02:37: The painkillers can be found in the ATC catalog in the nervous system
02:41: section. When we select those, we see that among patients who
02:45: use painkillers, we find an increased percentage of 7.6%
02:50: who also had breast cancer. Let's get a little more detail
02:54: and count the number of painkillers for each patient.
02:57: We can initiate such an aggregation along the hierarchies of the object tree
03:01: by dragging the prescriptions up and dropping them on the patient object.
03:06: As a result, we get a new dimension that characterizes each patient:
03:10: the number of painkillers used.
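Dragging prescriptions up onto the patient amounts to rolling up child records into one value per parent. A rough sketch of that idea in plain Python follows. The function name, the toy data, and the choice of the ATC prefix `N02` (analgesics) as the definition of "painkiller" are assumptions for illustration; the tool's actual selection may differ.

```python
from collections import Counter

# Hypothetical toy data: one (patient_id, ATC code) pair per prescription.
prescriptions = [
    (1, "N02BE01"), (1, "N02BE01"), (1, "C10AA01"),
    (2, "N02AX02"),
    (3, "C10AA01"),
]
patient_ids = [1, 2, 3, 4]  # patient 4 has no prescriptions at all


def count_painkillers(prescriptions, patient_ids, prefix="N02"):
    """Roll prescriptions up to the patient: matching prescriptions per patient."""
    counts = Counter(pid for pid, atc in prescriptions if atc.startswith(prefix))
    # Patients without any matching prescription get an explicit 0.
    return {pid: counts.get(pid, 0) for pid in patient_ids}


painkiller_count = count_painkillers(prescriptions, patient_ids)
print(painkiller_count)  # {1: 2, 2: 1, 3: 0, 4: 0}
```

Keeping the zero counts matters, because patients who never used painkillers form the largest group in the demo.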
03:12: We open this on the canvas. 57,000 of the female
03:16: patients never used painkillers, but there are a few who had more than 100
03:21: painkiller prescriptions. To see how the prevalence of breast cancer changes
03:25: with the number of painkillers, we drag and drop the attribute into the window,
03:30: switch to graph view, and convert the data to percentages.
03:33: Now we can see that the more painkillers are used, the higher the proportion
03:37: of patients with breast cancer.
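Converting the chart to percentages means computing, for each painkiller count, the share of patients who have the target disease. A minimal sketch under invented numbers (the rows below are not the demo data, and the function name is hypothetical):

```python
from collections import defaultdict

# Hypothetical rows: (number of painkiller prescriptions, has breast cancer).
rows = [
    (0, False), (0, False), (0, False), (0, True),
    (1, False), (1, True),
    (5, True), (5, True), (5, True), (5, False),
]


def prevalence_by_group(rows):
    """Percentage of patients with the target disease in each group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, has_target in rows:
        totals[group] += 1
        hits[group] += int(has_target)
    return {group: 100.0 * hits[group] / totals[group] for group in sorted(totals)}


print(prevalence_by_group(rows))  # {0: 25.0, 1: 50.0, 5: 75.0}
```

A rising curve like this shows an association only; it says nothing yet about cause and effect.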
03:40: Does this mean that painkillers cause breast cancer? Absolutely not.
03:44: There are several reasons why we can't infer causation from this association.
03:49: First, we didn't control for time.
03:51: Painkillers may have been prescribed after the breast cancer diagnosis.
03:56: Second, we're comparing apples to oranges. Patients
03:59: with lots of painkillers, on the right side of the graph,
04:02: tend to be older, while those with few or none are younger, and breast cancer
04:06: also becomes more common with age. Age is an important confounding factor.
04:11: We can see this by selecting a specific age group. This
04:15: ensures that all patients in the chart are of a comparable age.
04:19: When we look at the graph, it becomes flat, showing that within a given
04:23: age group, there is no correlation between painkillers and breast cancer.
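Selecting one age group at a time is stratification. The sketch below uses invented counts, chosen so that painkiller users are mostly older, to show how a crude difference can vanish inside each stratum. The age bands, numbers, and helper names are assumptions for illustration only.

```python
def make_rows(age_group, uses_painkillers, n, n_cancer):
    """Build n hypothetical patients, the first n_cancer of them with the disease."""
    return [(age_group, uses_painkillers, i < n_cancer) for i in range(n)]


rows = (
    make_rows("under 60", False, 16, 2)
    + make_rows("under 60", True, 8, 1)
    + make_rows("60 and over", False, 4, 2)
    + make_rows("60 and over", True, 8, 4)
)


def prevalence(selection):
    """Percentage of selected patients who have the disease."""
    selection = list(selection)
    return 100.0 * sum(has_cancer for _, _, has_cancer in selection) / len(selection)


# Crude comparison: ignores age, so users look worse off.
crude = {
    user: prevalence(r for r in rows if r[1] == user)
    for user in (False, True)
}

# Stratified comparison: within each age group the difference disappears.
stratified = {
    (age, user): prevalence(r for r in rows if r[0] == age and r[1] == user)
    for age in ("under 60", "60 and over")
    for user in (False, True)
}

print(crude)       # {False: 20.0, True: 31.25}
print(stratified)  # 12.5 vs 12.5 under 60, 50.0 vs 50.0 at 60 and over
```

Stratifying by hand works for one suspected confounder such as age; it does not scale to many candidate confounders at once.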
04:31: So what causes breast cancer? There could be millions of potential
04:35: factors, corresponding to all the different disease codes, available drugs,
04:39: existing procedures, etc. All of them need to be evaluated,
04:43: also as potential confounding factors. Let's start the search.
04:47: We again select our target, breast cancer.
04:50: Now click on the light bulb icon to start the causal discovery algorithms.
05:01: During the process, you'll see that the algorithm is evaluating 2.3
05:06: million different factors and potential confounders simultaneously,
05:09: not just painkillers and age.
05:12: No algorithm can prove cause and effect from observational data alone,
05:16: but the deeper and broader the search, the more relevant the hypotheses
05:20: generated. That's where our object-analytics-based deep search
05:24: algorithms excel.
05:26: Here are the results. At the top is malignant neoplasms
05:30: of the ovary, which is plausible. Further factors
05:34: are benign neoplasms of the breast
05:37: and personal history of malignant neoplasms,
05:40: also expected. More surprising is the factor G03F,
05:44: progestogens and estrogens in combination, which relates to hormones
05:49: used for menopausal symptoms. These have recently been linked to carcinogenic
05:54: effects. The initial list contains only direct factors. We're
05:58: also interested in indirect factors, essentially the entire cause and
06:02: effect graph. This advanced search may take longer, depending on your configuration
06:07: and search steps.
06:09: Here are some sample results. The
06:14: red bubble on the right represents the target, breast cancer.
06:19: The time axis runs from left to right, with factors plotted according
06:23: to their duration of influence. Arrows indicate cause and
06:27: effect, moving from left to right. For example, hypercholesterolemia
06:32: is on the far left, indicating that it contributes indirectly to breast cancer
06:36: over a long period of time through other factors.
06:41: This is only a demonstration on a small data set.
06:44: These results need to be validated on larger data sets.
06:47: Ideally, we would like to apply these algorithms to a set of 10
06:51: million patients, covering all diagnoses, prescriptions, and
06:55: procedures over 10 years for a total of about 10 billion events.
07:01: By now, it should be clear that causal discovery requires rich information that
07:05: cannot be represented in a flat table.
07:08: We use holistic objects, and our algorithms x-ray these objects
07:12: from any perspective to find likely causal factors for a target.
07:16: Causal discovery is the supreme discipline, but much simpler things are also
07:20: becoming feasible based on object analytics that are hard to achieve with
07:24: other technologies. In healthcare,
07:26: the term patient journey describes the need to understand the flow of
07:30: events across the diverse event streams associated with a patient.
07:35: For example, we want to find out whether and which blood thinners are used after
07:39: the initial diagnosis of atrial fibrillation, a common type
07:43: of heart disease, and how patients switch between different products as the disease
07:47: progresses. With just a few clicks, you can build this analysis.
07:52: Usually therapy starts at the time of diagnosis,
07:55: most often with vitamin K antagonists, and there is rarely a
07:59: switch. In
08:04: some cases, it also starts with clopidogrel,
08:12: and from there, we often see a switch to another product.
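The patient journey analysis above boils down to ordering each patient's prescriptions relative to the first diagnosis, then counting first therapies and switches. The following is a minimal sketch in plain Python, not the Object Explorer's implementation: the function name, data layout, and the three toy patients are assumptions for illustration.

```python
from collections import Counter
from datetime import date


def therapy_sequence(diagnosed_on, prescriptions):
    """Products prescribed on or after diagnosis, in time order,
    with consecutive repeats of the same product collapsed."""
    sequence = []
    for prescribed_on, product in sorted(prescriptions):
        if prescribed_on < diagnosed_on:
            continue  # ignore anything prescribed before the diagnosis
        if not sequence or sequence[-1] != product:
            sequence.append(product)
    return sequence


# Hypothetical patients: (date of first atrial fibrillation diagnosis, prescriptions).
patients = {
    1: (date(2015, 1, 10), [
        (date(2014, 6, 1), "clopidogrel"),
        (date(2015, 1, 10), "vitamin K antagonist"),
        (date(2015, 4, 10), "vitamin K antagonist"),
    ]),
    2: (date(2016, 3, 1), [
        (date(2016, 9, 1), "vitamin K antagonist"),
        (date(2016, 3, 2), "clopidogrel"),
    ]),
    3: (date(2017, 5, 20), [
        (date(2017, 5, 20), "vitamin K antagonist"),
    ]),
}

first_therapy = Counter()
switches = Counter()
for diagnosed_on, prescriptions in patients.values():
    sequence = therapy_sequence(diagnosed_on, prescriptions)
    if sequence:
        first_therapy[sequence[0]] += 1
    # Each adjacent pair in the sequence is one switch between products.
    switches.update(zip(sequence, sequence[1:]))

print(first_therapy)
print(switches)
```

Filtering on the diagnosis date is also the kind of time control that was missing from the painkiller comparison earlier in the video.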
08:20: We don't want to go into medical details, but give you an idea of what is possible
08:24: based on object analytics: analyzing a complex object
08:28: across sub-objects, in the above case therapies relative to diagnosis
08:33: and previous therapies. Try to do this with a relational database
08:37: and SQL. If you get this done at all, it will likely be slow.
08:42: The Object Explorer implements a usability concept to do that kind
08:46: of analysis in an interactive way, and the object analytics database
08:50: provides the speed for that sort of analysis.
08:53: And there is a Python interface where you can do all of this on a programmatic level.
08:58: You will learn all of this in the next steps.