my_functions.py — Centralised Python Library
BCHPR · underpins 23+ projects · 2023 – present
The 21,086-line shared Python library that every BCHPR data project depends on — APIManager, PathsManager, REDCap wrappers, study-ID generation, SharePoint I/O, and dozens of cross-project utilities.
Highlights
- 50+ API key groups managed through a single APIManager (GHIT, Wave11, S4A, RapidTB, Viral Load, Manager.io, M365 Graph).
- Cross-platform PathsManager auto-detecting Windows / WSL / Linux with SharePoint-sync awareness.
- ThreadPoolExecutor-based parallel REDCap exports / imports with exponential-backoff retry logic.
- Chunked record imports (default 2,000 per call) and rate-limited API request queuing.
- Cut new-project setup time from days to hours by standardising patterns across all 23+ projects.
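The chunked imports and retried parallel calls listed above could look roughly like the sketch below. It is not the library's actual code: `chunked`, `with_backoff`, `import_in_parallel` and the `send_chunk` callback are hypothetical names, and `send_chunk` stands in for whatever function really posts one batch to REDCap and returns the number of records accepted. Only the 2,000-record default and the use of ThreadPoolExecutor with exponential backoff come from the description.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import Callable, Iterator, Sequence


def chunked(records: Sequence[dict], size: int = 2000) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def with_backoff(func: Callable[[], int], retries: int = 4, base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Call func; after each failure wait base_delay * 2**attempt, then retry."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def import_in_parallel(records: Sequence[dict],
                       send_chunk: Callable[[Sequence[dict]], int],
                       chunk_size: int = 2000, max_workers: int = 4,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Send every chunk on a worker thread and return the total count reported."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(with_backoff, partial(send_chunk, chunk), sleep=sleep)
            for chunk in chunked(records, chunk_size)
        ]
        # result() re-raises any error that survived all retries.
        return sum(future.result() for future in futures)
```

Passing `sleep` as a parameter is a choice made for this sketch so the retry schedule can be checked without real waiting. A production version would likely also cap the delay and retry only on transient errors.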
Related projects
Architect
data_quality_manager.py — Enterprise DQA Framework
11,007-line data quality platform with a fluent QueryBuilder, persistent query lifecycle tracking, duplicate analysis, and double-data-entry verification across 28+ instruments, backed by SQLite persistence and Polars acceleration.
Engineer
study_id_patterns.py — Study-ID Regex Registry
2,611-line centralised registry of 8 study-ID patterns and 14 site-code patterns across Cameroon, Nigeria, and Vietnam projects — with vectorised extraction, validation, classification, and cleaning.
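As an illustration of the registry idea, here is a minimal plain-Python sketch of cleaning, classifying, and extracting study IDs against named patterns. The two patterns and every name in it (`STUDY_ID_PATTERNS`, `clean`, `classify`, `extract`) are invented for the example and are not the project's real ID formats or API. A vectorised version would presumably apply the same compiled patterns column-wise, for instance through pandas `Series.str.extract`.

```python
import re
from typing import Dict, Optional

# Hypothetical registry: the pattern names and ID formats are made up for illustration.
STUDY_ID_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "site_serial": re.compile(r"(?P<site>[A-Z]{3})-(?P<serial>\d{4})"),
    "year_serial": re.compile(r"(?P<year>\d{4})/(?P<serial>\d{5})"),
}


def clean(raw: str) -> str:
    """Normalise an ID by trimming whitespace and upper-casing it."""
    return raw.strip().upper()


def classify(raw: str) -> Optional[str]:
    """Return the name of the first pattern that matches the whole cleaned ID."""
    value = clean(raw)
    for name, pattern in STUDY_ID_PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None


def extract(raw: str) -> Optional[Dict[str, str]]:
    """Return the named parts of a valid ID, or None if no pattern matches."""
    value = clean(raw)
    for pattern in STUDY_ID_PATTERNS.values():
        match = pattern.fullmatch(value)
        if match:
            return match.groupdict()
    return None
```

Using `fullmatch` rather than `search` means an ID with trailing junk is rejected instead of being partially accepted, which is usually the safer default for validation.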
Engineer
date_utils.py — Date Parser & Power BI Calendar Generator
3,746-line date engine handling 60+ formats, Excel serials, timezone conversion, and Power BI dimension tables with 99+ attributes (fiscal periods, holidays, relative categories, sort orders).
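The Excel-serial handling mentioned above rests on a well-known conversion, sketched here with the standard library only. The function name `excel_serial_to_datetime` is hypothetical, and the sketch assumes Excel's default 1900 date system. Counting days from 1899-12-30 gives correct dates from serial 61 (1900-03-01) onward. Earlier serials are rejected here because Excel treats 1900 as a leap year, so serial 60 is a 29 February 1900 that never existed.

```python
from datetime import datetime, timedelta

# Day zero that makes serials from 61 onward line up with real calendar dates
# in Excel's 1900 date system.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel 1900-system serial (days, with fractional time) to a datetime."""
    if serial < 61:
        # Serials below 61 are affected by Excel's fictitious 1900-02-29.
        raise ValueError("serials below 61 are ambiguous in the 1900 date system")
    return EXCEL_EPOCH + timedelta(days=serial)
```

The fractional part of the serial carries the time of day, so 0.5 corresponds to noon.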