Research Software Engineer · AI/ML · Bioinformatics · Signal & Image Processing
I build reproducible software and data workflows for biomedical research, with a focus on AI model development, NMR metabolomics, sequencing data, biomedical signals and images, statistical modelling, and high-performance Python.
My work sits at the intersection of life science and software engineering. I turn research methods into tested, documented, and reusable tools that can move from exploratory analysis to larger cohort workflows.
- Scientific software: Python packages, command-line tools, interactive visualisation, and reproducible pipelines
- Biomedical data: NMR metabolomics, microbiome and sequencing data, medical imaging, and biomarker analysis
- AI and machine learning: model development, feature engineering, evaluation, explainability, and reproducible training workflows
- Deep learning: neural networks for classification, representation learning, signal analysis, and image-based modelling
- Local AI and LLMs: private model inference, prompt and model evaluation, local APIs, and hardware-aware deployment
- Signal and image processing: preprocessing, denoising, feature extraction, segmentation, and quantitative analysis
- Statistical learning: PCA, OPLS-DA, regression, resampling, validation, and interpretable analytics
- Performance engineering: parallel processing, native C/OpenMP extensions, GPU backends, and memory-bounded algorithms
| Project | What it does | Focus |
|---|---|---|
| metbit | End-to-end Python toolkit for reproducible 1H NMR metabolomics, from raw FID preprocessing to chemometrics and interactive visualisation | Python, NMR, PCA, OPLS-DA, STOCSY, C/OpenMP, GPU |
| FrameX | Arrow-backed dataframe and array library for parallel analytics on a single machine, with optional Ray, Dask, and accelerator backends | Python, Apache Arrow, parallel computing, HPC |
| lingress | NMR analysis pipeline for peak-wise OLS regression, multiple-testing correction, resampling, and visual interpretation | Python, regression, statistics, metabolomics |
| barcodactyl | Command-line package that separates Oxford Nanopore reads into per-barcode FASTQ, SAM, or BAM files | Python, ONT, FASTQ, SAM/BAM |
| seq-miner | Lightweight read extraction and filtering by ID, quality score, and length, with multithreaded FASTQ processing | Python, sequencing, CLI, data processing |
Explore all 53 public repositories, including biomedical analyses, teaching material, visualisation projects, and research prototypes.
I contribute computational analysis to biomedical studies spanning cancer, cardiovascular surgery, metabolomics, microbiome research, and experimental biology.
| Year | Publication | Journal |
|---|---|---|
| 2025 | Distinct Gut Microbiota Profiles Associated with Advanced Hepatocellular Carcinoma in a Thai Cohort | Cancers, 17(17), 2915 |
| 2025 | Oxford Nanopore Sequencing Unveils Structural Variations and Their Functional Impacts in Cholangiocarcinoma Cell Lines with Varying Degrees of Differentiation | Journal of Medical Bioscience, 7(Suppl 1), S1-S7 |
| 2025 | Impact of Non-Thermal Plasma Seed Priming and Early Development Stages of Two Local Thai Cruciferous Plants Mustard Green and Rat-Tailed Radish on Glucosinolates, Isothiocyanates, Minerals, Antioxidant and Anticancer Activities | Notulae Botanicae Horti Agrobotanici Cluj-Napoca, 53(1), 14149 |
| 2024 | Comparative Analysis of Metabolomic Responses in On-Pump and Off-Pump Coronary Artery Bypass Grafting | Annals of Thoracic and Cardiovascular Surgery, 30(1) |
| 2022 | Antioxidant Activity of Mustard Green and Thai Rat-tailed Radish Grown from Cold Plasma Treated Seeds | Notulae Botanicae Horti Agrobotanici Cluj-Napoca, 50(2) |
Full publication records: ResearchGate · ORCID
- Primary language: Python
- Additional languages: R, JavaScript, C, MATLAB, and Shell
- Web applications: Next.js, React, Dash, Plotly, HTML, and CSS
- Scientific computing: NumPy, pandas, SciPy, scikit-learn, statsmodels, and PyArrow
- AI and deep learning: PyTorch, TensorFlow, model evaluation, explainable AI, and experiment workflows
- Local AI and LLM tooling: Ollama, Hermes agents, Hugging Face, and Transformers
- Signal and image processing: scipy.signal, OpenCV, scikit-image, medical imaging, and spectral analysis
- Software development: Python libraries, command-line applications, setuptools, PyPI, and automated releases
- Quality and delivery: pytest, Hypothesis, GitHub Actions, benchmarking, and technical documentation
- Data and compute: Apache Arrow, Parquet, multiprocessing, OpenMP, CuPy, Dask, and Ray
- Scaling metabolomics workflows for larger cohorts
- Developing and evaluating machine learning and deep learning models for biomedical applications
- Building local AI workflows with Ollama, Hermes, Hugging Face, and Transformer Lab
- Running, comparing, and integrating LLMs through private local inference and API services
- Processing physiological signals, spectra, and medical images for quantitative analysis
- Building interpretable AI workflows for classification, prediction, segmentation, and biomarker discovery
- Reliable and interpretable statistical software for biomedical research
- Efficient dataframe, array, and scientific-computing systems
- Practical sequencing utilities and reproducible research infrastructure
I am open to research collaboration and scientific software projects involving bioinformatics, metabolomics, biomedical data, or performance-focused Python.
LinkedIn · GitHub · ORCID · ResearchGate



