This guide demonstrates how to leverage SQL on FHIR to extract events data into a patient timeline from bulk FHIR (Fast Healthcare Interoperability Resources) exports. By combining the power of SQL with FHIR’s standardized healthcare data, you can efficiently analyze and derive insights to support clinical and operational decision-making.
## Example: Using Anonymized Data
This section provides an example of how to work with anonymized data to generate value points.
Step 1: Creating View Definitions for FHIR Resources
The first step involves creating view definitions for various FHIR resources. These view definitions are essential for transforming the hierarchical structure of FHIR resources into flattened relational tables, enabling easier querying and analysis using SQL. Below is an example of a view definition for the Patient resource:
In this step, we use Pathling, an open-source solution built on Apache Spark, to extract tabular results from bulk FHIR exports using the view definitions created in Step 1.
The output of Pathling is a Spark DataFrame, which can be easily loaded into a database for further analysis. For this demonstration, we will use DuckDB to store and query the extracted data.
in this example we will be assuming the bulk exports are in ndjson format and we will be using the patient resource type.
Code
import builtinsfhir_data = pathling_context.read.ndjson('bulk_fhir')# use the view definition to flatten the datawith builtins.open('views/Patient.ViewDefinition.json') as f: patient_view = fhir_data.view(resource='Patient', json=f.read()) patient_view.show(5)
+----------+--------------------+---------------+--------------------+
|birth_date| patient_id|identifier_code|identifier_system_id|
+----------+--------------------+---------------+--------------------+
|1989-09-30|4F12AC21954DF9A3B...| ANON|4F12AC21954DF9A3B...|
|1989-09-21|4F12AC21954DF9A3B...| ANON|4F12AC21954DF9A3B...|
|1989-09-20|4F12AC21954DF9A3B...| ANON|4F12AC21954DF9A3B...|
|1991-12-31|2276EE71DE1500D8F...| ANON|2276EE71DE1500D8F...|
|1995-12-02|07BF8F9537F9D0EC8...| ANON|07BF8F9537F9D0EC8...|
+----------+--------------------+---------------+--------------------+
only showing top 5 rows
Step 3: Defining Resources and Creating Patient Timeline
In this step, we define all the required FHIR resources, flatten them using the view definitions, and transform the data into a patient timeline. This timeline provides a comprehensive view of patient events, enabling detailed analysis and insights.
Code
from helper_functions import helpersfrom pyspark.sql import functions as Ffrom pyspark.sql.types import DateType, LongType, StringType# define required resourcesrequired_resources = ["Condition","Encounter","MedicationDispense","Observation","Patient","Procedure", ]# read bulk datafhir_data = pathling_context.read.ndjson('bulk_fhir')reference_columns = ["visit_provider_id","patient_id","encounter_id","account_id", ]views = {}for resource in required_resources:with builtins.open(f'views/{resource}.ViewDefinition.json') as f: resource_view = fhir_data.view(resource=resource, json=f.read())# this removes backward references i.e paitent/1 -> 1# this can also be avoided by using correct view definition cleaned_view = helpers.clean_resource_references( resource_view, reference_columns ) views[resource] = cleaned_view# create a base view to join other tablesencounters = views["Encounter"]encounters = encounters.withColumn("visit_type_code", encounters.visit_type_code.cast(LongType()))encounters = encounters.withColumn("visit_start_date", encounters.visit_start_date.cast(DateType()))encounters = encounters.withColumn("visit_end_date", encounters.visit_end_date.cast(DateType()))base = encounters.where(F.col("visit_start_date") >"1900-01-01")base.show(5)
Step 4: Extracting Visit Information for Value Points Calculation
Once the patient timeline is generated, the next step involves extracting detailed visit information. By aggregating and transforming this data, one can calculate value points, which serve as key indicators for clinical and operational decision-making.