Review of OPTIMAL: An OPTimized Imaging Mass cytometry AnaLysis framework for benchmarking segmentation and data exploration

Content of review 1, reviewed on May 03, 2023

This work provides a basic analysis pipeline for imaging mass cytometry (IMC) data in which the authors select an optimized method from pre-existing tools to analyze their dataset. They segment cells, transform the data, perform batch correction, and carry out a single-cell analysis in FCS Express. The data are then loaded into MATLAB, where an analysis script based on the histoCAT neighborhood analysis is run to identify interactions and avoidances. The authors state that this pipeline can be thought of as an improvement on the histoCAT pipeline because the user can edit settings for a more tailored analysis and because part of it runs in FCS Express, whose clickable interface makes it easier to perform. While the manuscript is well written and it is useful to demonstrate an analysis pipeline for IMC data, this pipeline is extremely complicated, with many steps and diverse programs involved (Ilastik, CellProfiler (CP), MATLAB, FCS Express, Fiji ImageJ; Figure 1), and thus requires a lot of background knowledge from the user. Simpler pipelines exist with supporting information and full single-cell and spatial analysis (https://www.biorxiv.org/content/10.1101/2021.11.12.468357v1). While the instructions are very thorough for the FCS Express portion, the CP and MATLAB portions have little information, and they will be difficult for users to reproduce without instructions, a CP template, and thorough chunk descriptions in the MATLAB script.
The cell segmentation section tests one method (pixel classification with Ilastik and CP) in four versions. It would have been interesting to see a comparison with an artificial intelligence (AI) cell segmentation method such as DeepCell with the Mesmer model or Cellpose with the cyto model, and even cell mask vs nuclear mask (this was done a bit in this work, but the nuclear mask was of such poor quality that it was not much of a comparison). The authors call the pixel classification method the current “gold standard,” but do not demonstrate this by comparing it to other methods. They showed that when the cells used for Ilastik training were not cells from the sample, the mask did not perform as well. They also showed that skipping Ilastik and making a nuclear mask in CP only performed terribly. It is not surprising that it helps to train in Ilastik first, but I am left wondering whether the nuclear mask could have been made better in CP. Without more experience with CP, I don’t know. The authors compare the best condition with the worst, and I have to say this comparison is not very interesting because the nuclear mask is so poor that it is obvious none of the other plots with data from that mask will look good. It would be more interesting to compare to the multi-signal mask data, which is much more similar, or to a good nuclear mask. I am also wondering why they chose to make a nuclear mask instead of a cell mask in Ilastik. What was the reasoning behind this approach, given that the standard published pipeline from the Bodenmiller lab is to make a cell mask using cell membrane targets in Ilastik and then go into CP just to extract single-cell data (https://bodenmillergroup.github.io/ImcSegmentationPipeline/ilastik.html)? The last version of the Ilastik/CP approach tested whether one membrane target was better than multiple targets put together in Fiji ImageJ, and they show that the masks are basically the same, so there is no difference. Is the number of cells extracted the same? In Figure 2B the cell number is above both the single and multi masks, so it looks like it must apply to both images, but in the mask image the multi-signal mask looks to have fewer cells. It would be nice to see a clear image with the EpCAM-only staining compared to the multi-target image so we can get an idea of whether the reason for this is that EpCAM nicely stains the membrane of all cells in this sample, so that there is no need for the multi-target mask.
The resolution of the tissue images in figure 2A is not good enough to clearly see the cells and targets.
Can the authors comment on why they used CD79a for B cells instead of a more common marker like CD19?
The mask quality is compared, and there is a large number of double-positive B and T cells for every single mask version tested; the OPTIMAL method has slightly more double-positive cells than the multi-signal method. How can any of these masks pass quality control with all these double-positive cells? It is known that there is some spill between neighboring cells, but 20% of all T and B cells being double positive seems like a lot. Why didn’t the authors try to improve on this by trying other methods? What about trying just a nuclear mask and no membrane markers to reduce the amount of spill between neighboring cells?
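To make the double-positive check concrete, a minimal sketch of how such a fraction could be computed per mask is given below. This is only an illustration under stated assumptions: the marker name CD3 for T cells, the positivity thresholds, and the toy intensities are all hypothetical and are not taken from the manuscript, and the authors' own gating in FCS Express may differ.

```python
def double_positive_fraction(cells, marker_a, marker_b, threshold_a, threshold_b):
    """Among cells positive for marker_a or marker_b, return the fraction positive for both."""
    either = 0
    both = 0
    for cell in cells:
        pos_a = cell[marker_a] > threshold_a
        pos_b = cell[marker_b] > threshold_b
        if pos_a or pos_b:
            either += 1
        if pos_a and pos_b:
            both += 1
    # Avoid division by zero when no cell is positive for either marker.
    return both / either if either else 0.0


# Toy per-cell mean intensities (made-up values, not data from the manuscript).
cells = [
    {"CD3": 5.0, "CD79a": 0.2},  # T cell
    {"CD3": 0.1, "CD79a": 4.0},  # B cell
    {"CD3": 3.0, "CD79a": 3.5},  # double positive (likely spill or a merged object)
    {"CD3": 0.1, "CD79a": 0.1},  # negative for both
    {"CD3": 4.2, "CD79a": 0.3},  # T cell
]

# Thresholds of 1.0 are arbitrary placeholders.
fraction = double_positive_fraction(cells, "CD3", "CD79a", 1.0, 1.0)
print(f"double-positive fraction: {fraction:.2f}")
```

Reporting this single number for each mask version, with the same thresholds, would let readers compare segmentation conditions directly.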
Different arcsinh cofactors were tested, and the authors nicely show that a cofactor of 1 displays the IMC data best. A cofactor of 1 is already the currently recommended value for this data (https://bodenmillergroup.github.io/IMCDataAnalysis/index.html), but it is nice to see it demonstrated why. Please make the graph axis labels legible in Fig. 3.
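For readers less familiar with the transformation, the role of the cofactor can be sketched as below, assuming the usual form arcsinh(x / cofactor). The input counts and the comparison cofactors of 5 and 150 are arbitrary illustrative choices, not values taken from the manuscript.

```python
import math


def arcsinh_transform(values, cofactor=1.0):
    """Apply arcsinh(x / cofactor) to each value; smaller cofactors stretch the low range."""
    return [math.asinh(v / cofactor) for v in values]


# Made-up ion counts spanning the low range typical of sparse signal.
counts = [0.0, 1.0, 10.0, 100.0]

for cofactor in (1.0, 5.0, 150.0):
    transformed = arcsinh_transform(counts, cofactor)
    print(cofactor, [round(t, 2) for t in transformed])
```

With a large cofactor, low counts are compressed towards zero, which is one way to see why a small cofactor may separate dim positive signal from background more clearly.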
The authors added a Z-score normalization to reduce batch effects after testing many other batch-correction methods and finding that this one performed best. There is no data showing how the performance of the other methods compares to the Z-score normalization. It would be nice to see this comparison and an explanation of why this is the best.
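As a point of reference for such a comparison, a per-batch Z-score can be sketched as follows. This assumes the normalization is applied to one marker at a time within each batch, which may not match the authors' exact procedure in FCS Express; the batch labels and values are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_batch(values, batches):
    """Z-score each value using the mean and population SD of its own batch."""
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    stats = {batch: (mean(vs), pstdev(vs)) for batch, vs in grouped.items()}

    normalized = []
    for value, batch in zip(values, batches):
        mu, sd = stats[batch]
        # A batch with zero spread carries no information; map it to 0.
        normalized.append((value - mu) / sd if sd > 0 else 0.0)
    return normalized


# One marker measured in two hypothetical batches; batch B has a constant offset.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batches = ["A", "A", "A", "B", "B", "B"]

normalized = zscore_by_batch(values, batches)
print([round(z, 3) for z in normalized])
```

After normalization the two batches overlap, but note that this also removes any genuine biological difference in mean or spread between batches, which is why a comparison against the other tested methods would be informative.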
Figure 4: please remember to put the figure labels on the figures themselves, for both print and online versions. Please write the conditions above the plots in Figure 4.
Figure 5A- please make legible axes labels.
Figure 5C- please use better resolution and better color contrasts so we can see the targets better in the tissue.
Data availability- Data is shared with this link:
https://www.ebi.ac.uk/biostudies/studies/S-BSST1047
But when I download all files as a zip and then extract it, I get a message that the folder is empty. Please add a data availability statement to the text that states exactly which scripts are shared and which data files are present. Was the CP pipeline shared? The probability maps? All MATLAB scripts, with an explanation of what each chunk is doing?

In the supplementary figure legends, Table S1 says 14 batches instead of 12.

Supp figure 1 – empty channels don’t need to be shown and the resolution is too poor to assess what worked and what didn’t.

Supp fig. S3 legend- please state what the colors are.

Supp fig 2 tissue image tiles need better resolution.

Can the authors say why 8 µm sections were cut for this? Most IMC papers use a tissue thickness of 3-3.5 µm; is there a reason this thickness was chosen?

A discussion of the limitations of this analysis pipeline would be helpful. Mentioning other cell segmentation methods and troubleshooting the cause of all the double-positive T and B cells, and how that could be improved upon, would be worthwhile. The spatial analysis is limited and there is a lot more one can do with spatial data, so some information about what other options are out there would be worthwhile. It would also help to say something about the kind of expertise one needs before attempting this: is the MATLAB script given with step-by-step instructions like the FCS Express section is? Some sections of this analysis are quite simple (Ilastik, FCS Express), but others are not simple at all (CP, MATLAB, Fiji ImageJ). It is quite cumbersome that the user must do part of the analysis with one file format, then save the data in another format and continue with a new analysis in FCS Express. The user also has to go in and out of MATLAB multiple times, and a lot of time is lost reformatting, saving, and reloading data over and over again. I can’t help but wonder why the whole pipeline is not in MATLAB, as you can perform the single-cell analysis there instead of going in and out of FCS Express and then back to MATLAB.

How scalable is this analysis to a large dataset of 150 ROI?

In the Discussion it is written “The analysis of IMC data has historically been challenging with limited attempts made to develop accurate, scalable, and accessible solutions. Moreover, approaches tend not to be very accessible and require expert knowledge of programming languages such as R, Python or MATLAB (12,41).” The authors show that they understand the issue with IMC analysis pipelines very well, but their pipeline has this same issue.

They say that their framework provides recommendations for optimizing any IMC analysis, such as using the actual cells from the data set when creating a cell mask and using Ilastik and CP for the mask creation (using a single target for cell boundaries in CP is panel specific). The conclusions from the optimizations and validations in this paper could provide some aid to users, but this pipeline seems overly complicated and a bit outdated compared to newer pipelines that use more machine learning solutions, such as steinbock, DeepCell, and imcRtools (https://doi.org/10.1093/bioadv/vbad046).

Content of review 2, reviewed on September 05, 2023

Review for “OPTIMAL: An OPTimised Imaging Mass cytometry AnaLysis framework for benchmarking segmentation and data exploration”
Revision 2
230905
This work can be thought of as a series of tools and ideas for quality controlling IMC cell segmentation and analysis, more than an actual pipeline for IMC data analysis (the title has now been changed to clarify this). As such, it can provide readers who are new to IMC with ways to check whether their segmentation is of good quality, to perform a batch effect correction (in FCS Express), and to identify the best method for dimensionality reduction (using FCS Express); the authors also show why an arcsinh cofactor of 1 is best for the transformation. Lastly, they show that the neighborhood analysis was improved by using a “disk” pixel expansion instead of a “bounding box” approach. These are all things that IMC users can check themselves on their own data, and the framework has been improved by the addition of a visual guide in the supplemental notes. The reviewer has the following minor changes to request:

• Please provide references for the statement in your abstract that it is often an issue to get a good cell segmentation. If this is not possible, it would be better to say that it can be difficult to assess whether a segmentation is of good enough quality.

• Figure S10A- why does the nuclear only II have no mask overlay? Why does the Tonsil CellPose image have strange horizontal stripes in it? We regularly use cellpose and do not get these stripes. Can you please make the mask overlay for Tonsil CellPose look like the other mask overlays (white lines around cells instead of varying shades of grey) for proper visual comparison?

• While there are fewer cells with Cellpose, the populations look better in the heatmap: there is no population that is double positive for T and B cell markers with Cellpose. This looks better to me, so how can you conclude there is no benefit? Please address this in the text, as readers might wonder why this is not considered an improvement. On the same note, it looks like neither the Tonsil-EPCAM PhenoGraph clusters nor the Cellpose clusters include a double-positive cluster, but there are no dot plots showing this for these conditions. Please comment on this or show these conditions so we can see whether they are improved in this regard.

• Please add your reasoning for using CD79 for B cells to the text, as this could be of high interest to users struggling with B cell discrimination in IMC.

• Please add figure titles on supplementary figs.

• It is not possible to read the targets on the Y axes of the heatmaps or the smaller cell subsets on the X axes of Figure 4 and Figure 5.

• Please explain to readers why you used this tissue thickness. The IMC projects we follow all use a maximum thickness of 4 µm; this comes from Bodenmiller publications and from our own experience that it works well and does not cause signal loss. Please provide proof that 8 µm gives better signal than 3.5 or 4 µm if you state that this is the reason.

• You state “The analysis of IMC data has historically been challenging with limited attempts made to develop accurate, scalable, and accessible solutions.” Please provide examples and proof if you make this kind of broad statement. Your tools are not more accessible than anything else that has been published, as far as I can tell. FCS Express even has to be purchased, whereas the other tools are at least free. Please explain what you mean here: your framework is more accessible to users compared to what? You also require knowledge of a programming language to do this.

• Regarding “Moreover, approaches tend not to be very accessible and require expert knowledge of programming languages…”: please add a comment to clarify for readers that your pipeline will not solve these challenges with IMC pipelines. You provide quality control methods to check the data in various ways to make sure one is properly representing the data, and you do it using many different programs plus a programming language.
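
On the neighborhood analysis mentioned in the summary of this revision, the geometric difference between a “disk” pixel expansion and a “bounding box” expansion can be sketched as below. This is a simplified illustration only, not the histoCAT code or the authors' MATLAB script: the cell shape, the expansion of 2 pixels, and both function names are hypothetical.

```python
def disk_expand(pixels, radius):
    """Dilate a set of (x, y) mask pixels with a disk of the given pixel radius."""
    offsets = [
        (dx, dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if dx * dx + dy * dy <= radius * radius
    ]
    return {(x + dx, y + dy) for (x, y) in pixels for (dx, dy) in offsets}


def bounding_box_expand(pixels, radius):
    """Grow the axis-aligned bounding box of the mask by the given pixel radius."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return {
        (x, y)
        for x in range(min(xs) - radius, max(xs) + radius + 1)
        for y in range(min(ys) - radius, max(ys) + radius + 1)
    }


# A hypothetical elongated cell lying along the diagonal.
cell = {(0, 0), (1, 1), (2, 2), (3, 3)}

disk = disk_expand(cell, 2)
box = bounding_box_expand(cell, 2)

# Pixels the box claims as "neighborhood" even though they are far from the cell.
extra = box - disk
print(len(disk), len(box), len(extra))
```

For a non-rectangular cell, the bounding box reaches into corners that are well beyond the chosen distance from any cell pixel, so it can register neighbors that the disk expansion would not.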

Pre-publication Review of

OPTIMAL: An OPTimized Imaging Mass cytometry AnaLysis framework for benchmarking segmentation and data exploration

Reviewed on May 03, 2023, and September 05, 2023
