Content of review 1, reviewed on November 13, 2019

Overall statement

The paper proposes two architectures that handle both compression and semantic analysis of images, performing the semantic analysis at the encoder and decoder side respectively. This enables client-side applications to obtain the semantic information of an image while compressing it, without additional computation.

The paper makes an original contribution, since the concept of deep semantic image compression, which, as its name suggests, aims to incorporate semantics into image compression, has not yet been widely explored.

The paper is well-written apart from some grammatical errors and poorly formulated sentences (e.g. “The length of compressed code directly affect the size of compressed code and is far from limitless”).

In what follows, I present my comments on each part of the paper, namely abstract, title and references, introduction and background, methodology, data and results and finally discussion and conclusions.

Comments on abstract, title, references

  1. The abstract clearly raises the research question and introduces the paper context. It includes a brief and comprehensible summary of the method and key results.
  2. The title is relevant and conveys the main idea.
  3. All references are correctly cited, relevant and current. The main state-of-the-art methods are cited, e.g. Ballé et al. and Toderici et al.

Comments on introduction/background

  1. The context of the paper is well presented in the first paragraph of the introduction.
  2. A clear description of the paper's purpose is provided. Contributions are also distinctly enumerated and are consistent with the conclusions drawn.
  3. You reviewed work related to deep learning-based image compression. Why did you not also review the state of the art in semantic analysis, since the paper jointly addresses image compression and semantic analysis?

Comments on methodology

  1. The proposed method is clearly presented. I appreciate the fact that you first provide a brief overview of the steps of the proposed method and then detail each of them. However, not enough details are provided for the evaluation to be easily replicated. Here, some questions can be raised:
     • The choice of the distortion loss between the original and reconstructed images, as well as of the error rate of the semantic analysis, has to be justified. Are these the best metrics for deep learning-based compression and classification methods?
     • How did you choose λ1 and λ2 in equation (4) on page 5? Do you favor compression quality or classification accuracy?
     • How many epochs were used to train the CNN?
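To make the λ1/λ2 question concrete, here is a minimal sketch assuming, as is common in joint objectives of this kind, that equation (4) is a weighted sum of a distortion term and a semantic classification term; all function and parameter names here are hypothetical, not taken from the paper:

```python
import numpy as np

def mse_distortion(original, reconstructed):
    """Mean squared error between the original and reconstructed images."""
    return float(np.mean((original - reconstructed) ** 2))

def cross_entropy(probs, label):
    """Classification loss for the semantic-analysis branch."""
    return float(-np.log(probs[label] + 1e-12))

def joint_loss(original, reconstructed, probs, label, lambda1, lambda2):
    """Weighted sum trading off compression quality vs. classification accuracy.

    A larger lambda1 favors reconstruction quality; a larger lambda2 favors
    the accuracy of the semantic analysis.
    """
    return (lambda1 * mse_distortion(original, reconstructed)
            + lambda2 * cross_entropy(probs, label))

# Toy example: a 4x4 "image", its slightly degraded reconstruction,
# and a 3-class softmax output with the correct class at index 0.
x = np.ones((4, 4))
x_hat = 0.9 * x
probs = np.array([0.7, 0.2, 0.1])
print(joint_loss(x, x_hat, probs, 0, lambda1=1.0, lambda2=0.01))
```

The point of the sketch is that the ratio λ1/λ2, not either value alone, determines which objective dominates, which is why the paper should report how the two weights were chosen.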

Comments on data and results

  1. Tables and figures are clearly presented, mentioned in the text of the results section and referred to by their numbers.
  2. Results are obtained on 2 well-known datasets. This allows unbiased conclusions to be drawn that can be generalized. You should provide a reference for the ILSVRC 2012 dataset (as you have done for the Kodak PhotoCD one).
  3. Which of the proposed architectures (pre-semantic or post-semantic DeepSIC) performs best, and why?
  4. Although the method is executed on an NVIDIA GPU, you should evaluate the complexity and/or the execution time of the proposed method. This is important when working with training and neural networks.
  5. Compared to conventional codecs such as JPEG, JPEG 2000 and BPG, deep learning-based methods have demonstrated better rate-distortion performance. In this paper, incorporating semantics has significantly degraded compression performance, so the results obtained fall below those of JPEG 2000.

Comments on discussion and conclusions

  1. The results in some figures are not commented on or discussed in the text (e.g. Figures 5, 6 and 7).
  2. Clear and specific perspectives are suggested for future research. References are even provided.

Source

    © 2019 the Reviewer.

References

    Luo, S., Yang, Y., Yin, Y., Shen, C., Zhao, Y., Song, M. 2018. DeepSIC: Deep Semantic Image Compression. Lecture Notes in Computer Science.