Content of review 1, reviewed on October 04, 2025

This prospective, within-subject observational pilot study compared the accuracy of four CGM systems (Abbott Libre 3, Dexcom G7, Medtronic Simplera, and Sinocare iCan i3) during real-world, mid-haul commercial flights (Vienna↔Reykjavik) and matched ground periods in 20 adults with type 1 diabetes.

Sensors were inserted one day prior; capillary glucose (Contour Next) was measured every 30 minutes and matched to CGM values within ±2.5 minutes. Primary accuracy metrics included MARD, Clarke and Parkes (Consensus) error grids, and post protocol %10/10 and %20/20 agreement rates.
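For readers less familiar with these metrics, the sketch below shows how MARD and the %10/10 and %20/20 agreement rates are commonly computed from matched pairs. It assumes the widely used 100 mg/dL cutoff between the absolute (mg/dL) and relative (%) criteria; the manuscript may define the cutoff differently. The function names and the example pairs are hypothetical and are not taken from the study.

```python
from statistics import mean


def ard_percent(cgm, ref):
    """Absolute relative difference of one matched pair, as a percent of the reference."""
    return abs(cgm - ref) * 100.0 / ref


def mard(pairs):
    """Mean absolute relative difference (%) over (cgm, reference) pairs."""
    return mean(ard_percent(cgm, ref) for cgm, ref in pairs)


def agreement_rate(pairs, limit, cutoff=100.0):
    """Percent of pairs within +/-limit mg/dL when reference < cutoff,
    or within +/-limit % when reference >= cutoff.
    The 100 mg/dL cutoff is an assumption, not confirmed by the manuscript."""
    hits = 0
    for cgm, ref in pairs:
        if ref < cutoff:
            within = abs(cgm - ref) <= limit
        else:
            within = ard_percent(cgm, ref) <= limit
        hits += within
    return 100.0 * hits / len(pairs)


# Hypothetical (CGM, reference) pairs in mg/dL; not data from the manuscript.
pairs = [(110, 100), (60, 75), (190, 200), (130, 100)]
overall_mard = mard(pairs)
rate_10 = agreement_rate(pairs, 10)
rate_20 = agreement_rate(pairs, 20)
print(overall_mard, rate_10, rate_20)
```

By construction the %20/20 rate can never be lower than the %10/10 rate for the same pairs, which is a quick sanity check for reported values.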

Overall, G7 and Libre 3 showed lower MARD (≈9–10%) and a larger share of values in clinically acceptable error grid zones than Simplera and i3; this ranking was broadly consistent across flight phases and prandial states (Abstract; Tables 1–5; Figures 1–8).

Major comments

  1. Internal consistency and clarity of key numbers.
    o The Abstract reports “data from 297 sensors with 3,473 matched pairs,” which appears inconsistent with the study design (20 participants × 4 devices = 80 sensors at baseline, with reinsertion allowed).

Please clarify whether “297” reflects matched samples/records, sensor sessions, or a typographical error. Consider reconciling this with the device-level pair counts shown in Table 1 (e.g., overall N for G7 876; Libre 3 926; Simplera 744; i3 927; total 3,473). (Abstract, p 2–3; Table 1, p 12–13.)

    o The Abstract appears to invert 10/10 and 20/20 agreement rates for G7 and Libre 3. Table 5 (p 15) shows overall 10/10 = 70.2% (G7), 66.8% (Libre 3) and 20/20 = 90.6% (G7), 90.8% (Libre 3), whereas the Abstract states the reverse. Please correct. (Abstract; Table 5.)

  2. Statistical approach for within-subject comparisons. The analysis is primarily descriptive. Because each participant wore all four devices, the design enables paired, within-subject comparisons across devices and conditions.

Please add inferential analyses that account for repeated measures (e.g., linear mixed-effects models or GEE on absolute relative difference, with participant as a random effect; models stratified by flight vs ground and by phase).

This would quantify whether observed MARD differences between devices (and across phases) are statistically meaningful beyond chance. (Methods, p 4–6; Results, Tables 1–2.)
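As a minimal illustration of a paired, participant-level comparison, the sketch below computes each participant's MARD per device and bootstraps the paired difference by resampling participants. This is a simpler stand-in for the mixed-effects or GEE models suggested above, not an implementation of them. The record layout, the device labels "A" and "B", and all values are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def per_participant_mard(records, device):
    """Map participant id -> MARD (%) for one device.
    records: iterable of (participant, device, cgm, reference) tuples."""
    ards = defaultdict(list)
    for pid, dev, cgm, ref in records:
        if dev == device:
            ards[pid].append(abs(cgm - ref) * 100.0 / ref)
    return {pid: mean(values) for pid, values in ards.items()}


def paired_mard_difference(records, dev_a, dev_b, n_boot=2000, seed=0):
    """Mean within-participant MARD difference (dev_a minus dev_b) with a
    percentile bootstrap 95% interval that resamples participants, so
    repeated measures within a participant stay together."""
    a = per_participant_mard(records, dev_a)
    b = per_participant_mard(records, dev_b)
    shared = sorted(set(a) & set(b))  # participants with data on both devices
    diffs = [a[pid] - b[pid] for pid in shared]
    rng = random.Random(seed)
    boots = sorted(mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot))
    lower = boots[int(0.025 * n_boot)]
    upper = boots[int(0.975 * n_boot) - 1]
    return mean(diffs), (lower, upper)


# Hypothetical records; not data from the manuscript.
records = [
    ("P1", "A", 108, 100), ("P1", "B", 110, 100),
    ("P2", "A", 95, 100), ("P2", "B", 109, 100),
    ("P3", "A", 210, 200), ("P3", "B", 212, 200),
    ("P4", "A", 90, 100), ("P4", "B", 115, 100),
]
estimate, (ci_low, ci_high) = paired_mard_difference(records, "A", "B")
print(estimate, ci_low, ci_high)
```

A mixed model would additionally allow covariates such as flight phase and prandial state; the bootstrap above only addresses the pairing and the clustering by participant.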

  3. Handling of time-varying flight factors and pressure data. Cabin pressure was continuously recorded (Methods, p 5), but results do not relate accuracy to pressure/altitude profiles.

Please present accuracy vs cabin pressure (and rate of pressure change), ideally with plots and mixed-model terms for pressure, ascent/cruise/descent, and prandial state. This would strengthen mechanistic interpretation of the phase-specific patterns reported in Table 1 and Figures 1–8.

  4. Comparator measurements and physiologic lag. Capillary SMBG was obtained every 30 minutes and matched within ±2.5 minutes (Methods, p 5).

Because interstitial CGM lags blood glucose, particularly around meals and rapid excursions, please discuss the potential mismatch bias and consider sensitivity analyses using (a) a wider matching window with interpolation, and/or (b) rate-of-change adjustment to examine whether error increases are driven by physiologic lag rather than device inaccuracy (notably during ascent/descent and non-fasting periods; Tables 1–2).
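One way to implement option (a) is sketched below: the CGM trace is linearly interpolated to the SMBG timestamp, optionally shifted by an assumed lag. The function name, the 10-minute maximum gap, and the 3-minute lag are hypothetical choices for illustration only; the authors would need to justify their own values.

```python
from bisect import bisect_left


def interpolate_cgm(cgm_series, t_ref, max_gap_min=10.0):
    """Linearly interpolate a CGM trace at time t_ref (minutes).
    cgm_series: list of (time_min, glucose_mg_dl) sorted by time.
    Returns None when t_ref is outside the trace or the bracketing
    readings are more than max_gap_min apart (hypothetical threshold)."""
    times = [t for t, _ in cgm_series]
    i = bisect_left(times, t_ref)
    if i < len(times) and times[i] == t_ref:
        return cgm_series[i][1]  # exact timestamp match
    if i == 0 or i == len(times):
        return None  # no reading on one side of t_ref
    (t0, g0), (t1, g1) = cgm_series[i - 1], cgm_series[i]
    if t1 - t0 > max_gap_min:
        return None  # gap too wide to interpolate safely
    return g0 + (g1 - g0) * (t_ref - t0) / (t1 - t0)


# Hypothetical CGM trace (minute, mg/dL); not data from the manuscript.
trace = [(0, 100), (5, 110), (10, 120)]
matched = interpolate_cgm(trace, 7)            # CGM value at SMBG time
lag_shifted = interpolate_cgm(trace, 7 + 3)    # same SMBG, assumed 3-min lag
outside = interpolate_cgm(trace, 12)           # beyond the last reading
print(matched, lag_shifted, outside)
```

Repeating the accuracy analysis with several assumed lags would show whether the phase-specific error differences persist once lag is taken into account.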

  5. Sensor life cycle and reinsertion. Reinsertion was permitted after failure/detachment (Methods, p 5).

Because CGM accuracy varies early after insertion, please report (a) how many reinsertions occurred by device, (b) time since insertion for each observation included in analyses, and (c) sensitivity analyses excluding early post-insertion hours to ensure that group differences are not partly explained by sensor age.

  6. Hypoglycemia sample size and precision. Subgroup estimates for <70 mg/dL have very small n and N (e.g., in-flight <70 mg/dL, G7 n=9/N=16; Table 2, p 12–14), yielding wide CIs.

Please (a) mark these as exploratory, (b) provide exact binomial CIs for 10/10 and 20/20 in low-glucose strata (Table 5, p 15), and (c) avoid over-interpreting device ranking in hypoglycemia.
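To illustrate how wide such intervals become, the sketch below computes an exact (Clopper-Pearson) binomial interval by bisection on the binomial tail probabilities, using only the standard library. The denominator of 16 pairs follows the example above; the count of 12 pairs within agreement is hypothetical, as are the function names. The interval also treats pairs as independent, which repeated measures within participants would violate.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(func, target, iterations=100):
    """Find p in [0, 1] with func(p) == target for an increasing func, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes out of n trials."""
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= x | p) = alpha / 2
        lower = _solve_increasing(lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2)
    if x == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2
        upper = _solve_increasing(lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2)
    return lower, upper


# Hypothetical: 12 of 16 low-glucose pairs within the agreement criterion.
low, high = clopper_pearson(12, 16)
print(low, high)
```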

  7. Error grid presentation and clinical interpretation. Clarke and Parkes grids are presented for multiple conditions (Tables 3–4; Figures 1–8, p 16–24).

Consider adding Bland–Altman plots (bias and limits of agreement) overall and by phase, plus median absolute relative difference (a robust accuracy metric) to complement MARD and visualize systematic bias (over-/under-reading) across glycemic ranges.
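The quantities behind these suggestions are straightforward to compute; a minimal sketch follows. It uses the conventional bias ± 1.96 SD limits of agreement, which assume independent, roughly normal differences; with repeated measures per participant, a method that accounts for clustering would be preferable. Function names and example pairs are hypothetical.

```python
from statistics import mean, median, stdev


def bland_altman(pairs):
    """Bias (mean CGM minus reference, mg/dL) and 95% limits of agreement.
    Assumes independent differences; clustering by participant is ignored here."""
    diffs = [cgm - ref for cgm, ref in pairs]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


def median_ard(pairs):
    """Median absolute relative difference (%), less sensitive to outliers than MARD."""
    return median(abs(cgm - ref) * 100.0 / ref for cgm, ref in pairs)


# Hypothetical (CGM, reference) pairs in mg/dL; not data from the manuscript.
pairs = [(110, 100), (60, 75), (190, 200), (130, 100)]
bias, (loa_low, loa_high) = bland_altman(pairs)
med_ard = median_ard(pairs)
print(bias, loa_low, loa_high, med_ard)
```

A positive bias indicates that the CGM reads higher than the capillary reference on average; plotting the differences against the pair mean would show whether this varies across the glycemic range.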

  8. Sample size justification and multiplicity. The pilot nature is acknowledged and no formal power calculation was performed (Methods, p 5). Please add a brief a priori precision justification (e.g., half-width of CI for MARD) and clarify whether any multiplicity correction was applied (given numerous stratifications).

Minor comments and reporting suggestions

• Terminology consistency. Correct minor typographical issues: “Abott” → “Abbott” in figure captions (Figures 7–8, p 22–23); “Simpler a” → “Simplera” in Tables 4–5; ensure uniform device naming (“Sinocare iCan i3”).

• Abstract wording. Replace “post protocol” with “post hoc” in the agreement-rate analysis; correct the 10/10 vs 20/20 inversion noted above. (Abstract, p 2–3; Table 5, p 15.)

• Flight details. Provide actual cabin pressure/altitude ranges and timelines for ascent, cruise, and descent on both legs to contextualize the phase-specific data in Table 1. (Methods, p 5; Table 1.)

• Data completeness. Quantify missing data and sensor failures by device and condition, and describe how missingness was handled (e.g., listwise deletion vs imputation). (Results, p 9–10.)

• COI transparency. Several coauthors report relationships with manufacturers of evaluated devices; funding sources include EASA and OeDG, with Insulet travel support for Omnipod users. Please add a sentence clarifying that CGM manufacturers had no role in study design, data access, analysis, or manuscript review. (Funding/COI, p 10–11.)

• Standards crosswalk. Consider a short appendix mapping your metrics to consensus CGM performance standards (e.g., %20/20, %15/15 when applicable), to aid readers’ interpretation alongside error grids. (Tables 3–5.)

• Figure economy. Consolidate error grid figures (e.g., side-by-side panels) and ensure vector quality. (Figures 1–8, p 16–24.)

Source

    © 2025 the Reviewer.

Content of review 2, reviewed on December 08, 2025

The authors responded to all comments. I have no further comments.

Source

    © 2025 the Reviewer.

References

    Renald, M., Silvia, B., M., B. P., Monika, C., Omaima, E. H., A., H. D., Fariba, S., Siu, F. K., Dietrich, T., R., P. T., David, R., Chantal, M., Gerd, K., K., M. J. Performance of Continuous Glucose Monitoring Systems (CGMs) During Commercial Flights in Type 1 Diabetes: A Within-Subject Comparative Pilot Study. Diabetes, Obesity and Metabolism.