This is a repository for my research, paper reading summaries/reviews, and relevant blog-like posts in markdown.
This post attempts to describe and explore a “declarative programming interface” for bioinformatics. The purpose of this interface is to determine which subset of relational algebra, relational calculus, or another query language a storage system would need to provide in order to support various bioinformatics use cases. I start with gene expression and single-cell RNA sequencing (scRNA-seq) use cases, and then explore how the interface generalizes to other bioinformatics use cases.
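To make "a subset of relational algebra" concrete, here is a minimal sketch of three classic operators (selection, projection, natural join) applied to a toy gene expression query. Everything in it is an assumption made for illustration: the relation names `cells` and `expression`, their attributes, the counts, and the helper functions are hypothetical and do not come from any real dataset or storage system.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]
Relation = List[Row]


def select(rel: Relation, pred: Callable[[Row], bool]) -> Relation:
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rel if pred(row)]


def project(rel: Relation, attrs: Sequence[str]) -> Relation:
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen = set()
    out: Relation = []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left: Relation, right: Relation) -> Relation:
    """Natural join: combine rows that agree on every shared attribute."""
    out: Relation = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[k] == r_row[k] for k in shared):
                out.append({**l_row, **r_row})
    return out


# Hypothetical toy relations (made-up values, for illustration only).
cells: Relation = [
    {"cell_id": "c1", "cell_type": "T cell"},
    {"cell_id": "c2", "cell_type": "B cell"},
]
expression: Relation = [
    {"cell_id": "c1", "gene": "CD3E", "count": 12},
    {"cell_id": "c1", "gene": "CD19", "count": 0},
    {"cell_id": "c2", "gene": "CD3E", "count": 0},
    {"cell_id": "c2", "gene": "CD19", "count": 7},
]

# Query: which genes have a nonzero count in T cells?
result = project(
    select(
        natural_join(expression, cells),
        lambda row: row["cell_type"] == "T cell" and row["count"] > 0,
    ),
    ["gene"],
)
print(result)
```

Even this small query needs a join, a selection, and a projection. That is the kind of requirement the interface is meant to surface for each use case, although a realistic scRNA-seq workload would likely also call for aggregation and for operations over sparse matrices.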
Bioinformatics is a broad field that tries to understand molecular and cellular biology through the analysis of biological data. Taking a top-down approach to defining a declarative programming interface, I start with some key biological entities and concepts from gene expression analysis and single-cell RNA sequencing.
Molecular and cellular biology terminology:
Sequencing terminology:
Analysis terminology:
Ontology terminology: