TORCH Integration
TORCH is a FHIR server for clinical research data. Aether can directly extract data from TORCH using CRTDL (Clinical Resource Transfer Definition Language) query files.
Overview
What is TORCH?
TORCH is a FHIR-based data extraction service that allows researchers to define patient cohorts using CRTDL query files and retrieve matching FHIR data automatically.
TORCH Integration allows you to:
- Query cohorts from TORCH FHIR servers using standardized CRTDL queries
- Automatically extract patient data based on clinical criteria with data minimization
- Import extracted FHIR resources into Aether for processing
- Combine with DIMP for pseudonymization of sensitive data (optional)
- Resume and reprocess previously extracted data
Prerequisites
- Access to a TORCH server with valid credentials
- A CRTDL query file defining your cohort criteria
- TORCH server URL, username, and password
Configuration
1. Add TORCH Credentials
Add your TORCH credentials to aether.yaml:
services:
torch:
base_url: "https://torch.hospital.org"
username: "researcher-username"
password: "secure-password"Security Note: Consider using environment variables for sensitive credentials:
export TORCH_USERNAME="researcher"
export TORCH_PASSWORD="secure-password"Then reference in aether.yaml (if supported):
services:
torch:
base_url: "https://torch.hospital.org"
username: "${TORCH_USERNAME}"
password: "${TORCH_PASSWORD}"2. Enable TORCH in Pipeline
pipeline:
enabled_steps:
- torch # Extract from TORCH
- import # Import extracted data
- dimp # Optional: pseudonymizeCreating CRTDL Query Files
A CRTDL query file defines the cohort selection criteria. Example:
{
"description": "Patients with diabetes diagnosis",
"criteria": {
"resourceType": "Condition",
"code": {
"coding": [{
"system": "http://snomed.info/sct",
"code": "73211009"
}]
}
}
}Input Methods
Aether supports multiple ways to work with TORCH data:
1. CRTDL File Extraction (recommended for new queries)
Submit a CRTDL file to TORCH and automatically download results:
aether pipeline start cohort-query.crtdlAether will:
- Connect to the configured TORCH server
- Submit the CRTDL query
- Poll extraction status until complete
- Download resulting FHIR NDJSON files
- Continue with pipeline processing
2. TORCH Result URL (for reusing existing extractions)
Skip extraction and download files directly from a TORCH result URL:
aether pipeline start http://localhost:8080/fhir/result/abc123This is useful for:
- Resuming or reprocessing previously extracted data
- Sharing extraction results with other researchers
- Testing pipeline changes on existing data
3. Backward Compatibility
Existing workflows using local directories or HTTP URLs continue to work:
# Still supported
aether pipeline start ./test-data/
aether pipeline start https://example.com/fhir/exportUsing TORCH Integration
Basic TORCH Query
Run a simple query to extract data from TORCH:
aether pipeline start my_cohort.crtdlWith Pseudonymization
Extract data and pseudonymize it in one command:
# aether.yaml
services:
torch:
base_url: "https://torch.hospital.org"
username: "researcher"
password: "secret"
dimp_url: "http://localhost:8083/fhir"
pipeline:
enabled_steps:
- torch
- import
- dimp # Apply pseudonymization
jobs:
jobs_dir: "./jobs"Run:
aether pipeline start my_cohort.crtdlMonitoring Progress
Monitor your TORCH extraction:
# Check job status
aether job list
# Get details for specific job
aether pipeline status <job-id>
# View logs
aether job logs <job-id>Common CRTDL Patterns
Patient Demographics
Extract specific patient age range:
{
"description": "Patients aged 18-65",
"criteria": {
"resourceType": "Patient",
"birthDate": {
"$gte": "1960-01-01",
"$lte": "2006-01-01"
}
}
}Diagnosis-Based Cohorts
Extract patients with specific diagnoses:
{
"description": "Hypertension patients",
"criteria": {
"resourceType": "Condition",
"code": {
"coding": [{
"system": "http://snomed.info/sct",
"code": "59621000"
}]
},
"clinicalStatus": "active"
}
}Medication-Based Cohorts
Extract patients on specific medications:
{
"description": "Patients on Metformin",
"criteria": {
"resourceType": "MedicationStatement",
"medicationReference": {
"code": {
"coding": [{
"system": "http://www.nlm.nih.gov/research/umls/rxnorm",
"code": "6809"
}]
}
}
}
}Advanced Configuration
Extraction Timeout
Configure how long Aether waits for TORCH extractions to complete:
services:
torch:
base_url: "https://torch.hospital.org"
username: "researcher"
password: "secret"
extraction_timeout_minutes: 30 # Default: 30
polling_interval_seconds: 5 # Default: 5
max_polling_interval_seconds: 30 # Default: 30extraction_timeout_minutes: Maximum wait time for extraction (adjust for large cohorts)polling_interval_seconds: Initial poll interval (increases exponentially up to max)max_polling_interval_seconds: Maximum poll interval between checks
Error Handling
Aether implements robust error handling for TORCH operations:
- Server unreachable: Clear error within 5 seconds
- Authentication failure: Fails early with credential error
- Extraction timeout: Configurable timeout (default 30 minutes)
- Empty results: Gracefully handles zero-patient cohorts
- Malformed CRTDL: Validates syntax before submission
- Connection lost during download: Automatic retry with exponential backoff
Troubleshooting
"TORCH server unreachable"
- Verify
services.torch.base_urlis correct and accessible - Check network connectivity from your machine to TORCH server
- Test with curl:
curl -u username:password https://torch.hospital.org/fhir/
"Authentication failed"
- Verify username and password in configuration
- Ensure credentials have appropriate permissions on TORCH server
- Check for password expiration or account lockout
- Verify you have access to the TORCH system
"CRTDL query syntax error"
- Validate CRTDL query file syntax (should be valid JSON)
- Check field names and data types match TORCH schema
- Review TORCH documentation for query syntax
- Test query manually on TORCH interface
"No patients matched"
- Review cohort inclusion/exclusion criteria
- Verify TORCH server contains matching data
- Test query manually on TORCH interface
- Ensure query criteria are not too restrictive
"Extraction takes longer than expected"
- Large cohorts may need higher
extraction_timeout_minutes - Check TORCH server logs for processing status
- Verify network connectivity remains stable
- Consider splitting into smaller cohorts
"How long do TORCH extractions take?"
- Depends on cohort size and server load
- Aether polls every 5 seconds (configurable) with default 30-minute timeout
- Large cohorts may need a higher timeout in configuration
- Monitor progress with
aether job listandaether pipeline status <job-id>
Next Steps
- Configuration Guide - Set up Aether fully
- DIMP Pseudonymization - Protect sensitive data
- Pipeline Steps - Understand the full pipeline
- CLI Commands Reference - All available commands