Scott White 1,2,3*, John Quinn 4, Jennifer Enzor 5,6, Janet Staats 5,6, Sarah M. Weinhold 1,5,6, Guido Ferrari 1,3,6 and Cliburn Chan 1,2,3

1 Duke Center for AIDS Research, Duke University, Durham, NC, United States
2 Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, United States
3 Center for Human Systems Immunology, Duke University Medical Center, Durham, NC, United States
4 BD Life Sciences - FlowJo, Ashland, OR, United States
5 Duke Immune Profiling Core, Duke University School of Medicine, Durham, NC, United States
6 Department of Surgery, Duke University Medical Center, Durham, NC, United States
7 Duke Human Vaccine Institute, Durham, NC, United States

An important challenge for primary or secondary analysis of cytometry data is how to facilitate productive collaboration between domain and quantitative experts. Domain experts in cytometry laboratories and core facilities increasingly recognize the need for automated workflows in the face of increasing data complexity, but by and large, still conduct all analysis using traditional applications, predominantly FlowJo. To a large extent, this cuts domain experts off from the rapidly growing library of Single Cell Data Science algorithms available, curtailing the potential contributions of these experts to the validation and interpretation of results. To address this challenge, we developed FlowKit, a Gating-ML 2.0-compliant Python package that can read and write FCS files and FlowJo workspaces. We present examples of the use of FlowKit for constructing reporting and analysis workflows, including round-tripping results to and from FlowJo for joint analysis by both domain and quantitative experts.

Despite the phenomenal advances in Single Cell Data Science (SCDS) methodology and an ever-growing collection of algorithms and open-source packages, it is an open secret that the day-to-day analysis of cytometric data in flow laboratories and core facilities is still predominantly performed using traditional software, especially FlowJo. There are good reasons for this: traditional software such as FlowJo excels at the visual manipulation and analysis of data, and human analysis is inherently more adaptable than any fully automated workflow. For example, domain experts are typically better at removing debris, dead cells, and cell aggregates by gating than automated approaches. However, there are also severe limitations to a purely manual workflow for data analysis, especially the poor scalability to high-volume workflows and limitations of visual discovery for high-dimensional data sets. We developed FlowKit to bridge the gap between manual and automated workflows. Specifically, we wanted to develop a robust basis for foundational cytometry operations, provide a straightforward interface to SCDS algorithms, and facilitate the integration of manual and automated analysis.

To ensure that foundational cytometry operations are supported, we checked for full compliance with Gating-ML 2.0 (1), thereby ensuring that compensation, transformation, and gating operations were all implemented correctly. To interface with data science, machine learning and computer vision algorithms, we developed FlowKit so that analytic results and event data could be exported as a generic pandas DataFrame, a standard unit for analysis in Python scientific workflows and interoperable with the R, Spark, and SQL frameworks. To allow integrative manual and automated analysis, we worked with FlowJo developers to ensure that FlowKit could read and write FCS and FlowJo workspace files, allowing the round-tripping of data and analytic results to and from FlowJo. We chose Python as the implementation language for pragmatic reasons: Python is the dominant language for data science, machine learning and computer vision; the two main deep learning frameworks, TensorFlow and Torch, use Python as their de facto language; Python is now often taught as a first programming language in many quantitative disciplines; and Python has a robust ecosystem for scalable workflows and system integration. While R has a rich set of libraries for cytometric data analysis, we believe that FlowKit provides a complementary alternative with access to state-of-the-art data science and machine learning frameworks for Python developers and will be welcomed by the cytometry bioinformatics community. We anticipate and hope that the Python bioinformatics community will build advanced tools for quality control (QC), analysis, visualization, and even graphical user interfaces (GUI) on top of the foundational features provided by FlowKit.

The data used in this manuscript were generated by the External Quality Assurance Program Oversight Laboratory (EQAPOL) Flow Cytometry Program.
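To make the three foundational operations named in the text (compensation, transformation, and gating) more concrete, the following is a minimal standard-library sketch applied to two fluorescence channels. It is not FlowKit's API and does not reproduce the exact Gating-ML 2.0 definitions: the function names, the synthetic events, the spillover values, the plain arcsinh with a cofactor of 150, and the half-open gate bounds are all assumptions chosen for illustration.

```python
import math

def compensate(events, spill):
    """Undo two-channel spillover, assuming observed = true @ spill
    (rows = fluorochromes, columns = detectors), so true = observed @ inv(spill)."""
    (a, b), (c, d) = spill
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is singular")
    inv = ((d / det, -b / det), (-c / det, a / det))
    return [
        (x * inv[0][0] + y * inv[1][0], x * inv[0][1] + y * inv[1][1])
        for x, y in events
    ]

def asinh_transform(events, cofactor=150.0):
    """Generic arcsinh scaling (illustrative; not the parameterized Gating-ML form)."""
    return [tuple(math.asinh(v / cofactor) for v in ev) for ev in events]

def rectangle_gate(events, x_range, y_range):
    """Return one boolean per event; bounds are treated as [min, max) by choice here."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    return [x_min <= x < x_max and y_min <= y < y_max for x, y in events]

# Synthetic observed intensities for three events (hypothetical values).
observed = [(1000.0, 100.0), (25.0, 500.0), (215.0, 320.0)]
# Hypothetical spillover: 10% of channel 1 leaks into 2, 5% of channel 2 into 1.
spill = ((1.0, 0.1), (0.05, 1.0))

comp = compensate(observed, spill)
xform = asinh_transform(comp)
in_gate = rectangle_gate(xform, x_range=(1.0, 5.0), y_range=(-1.0, 1.0))
```

A boolean membership list like `in_gate` is the kind of per-event result that fits naturally as a column alongside event data in a tabular structure such as a pandas DataFrame.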