Content of review 1, reviewed on March 05, 2020

Overall statement or summary of the article and its findings in your own words

The article, A Study of Deep Learning for Network Traffic Data Forecasting, aims to improve routing and load balancing by forecasting the bit rate with deep neural networks (DNNs) and t-SNE-based feature selection. The authors also attempt to address several big data analysis issues: data acquisition, the regression problem, class imbalance, concept drift and big data management. Previous studies showed that classical machine learning methods perform well in network forecasting, so the authors wanted to apply more advanced methods to the same problem.

The research has three main stages: data collection, data preparation and data processing. The data for processing combine two sources, flow entries and flow records. The evaluation measures are precision, recall (sensitivity), specificity and accuracy. The results are clear: t-SNE feature selection yielded 5 features for prediction, and the authors compared the results obtained with all features against those obtained with the 5 selected features. The combined t-SNE and DNN methodology performed better on some measures and for some bit-rate classes.
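For readers less familiar with how these four measures are obtained in a multi-class setting, the following is a minimal sketch of the usual one-vs-rest computation from a confusion matrix. It is not taken from the article: the function name `per_class_metrics` and the counts in `example_cm` are made up for illustration, and the authors may have computed their figures differently.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics for each class of a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                               # class k predicted as k
        fn = sum(cm[k]) - tp                        # class k predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp   # other predicted as k
        tn = total - tp - fn - fp                   # everything else
        results.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "accuracy": (tp + tn) / total if total else 0.0,
        })
    return results


# Made-up counts for three bit-rate classes (illustration only,
# not the article's data).
example_cm = [
    [50, 5, 5],
    [10, 20, 10],
    [5, 5, 90],
]
metrics = per_class_metrics(example_cm)
for k, m in enumerate(metrics):
    print(k, {name: round(value, 3) for name, value in m.items()})
```

With imbalanced classes, per-class accuracy and specificity can stay high while precision for a minority class is low, which is why the per-class breakdown matters here.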

Overall strengths of the article and what impact it might have in your field

Applying DNNs to network traffic forecasting, which involves a very large number of records, is feasible, and the forecast is more specific than in previous work, moving from 2 bit-rate classes to 3. Researchers in the field can continue in this direction to make the approach work in practice.

Specific comments on the weaknesses of the article and what could be done to improve it

Major points in the article which need clarification, refinement, reanalysis, rewrites and/or additional information, and suggestions for what could be done to improve the article.

  1. I suggest running further experiments with cross-validation to deal with class imbalance, since the precision of class 2 and particularly class 1 was low, at 64.5 and 38.8 respectively.

  2. I'm interested in the runtime of the analysis with all features against the 5-tuple analysis. I assume that fewer features reduce the running time and would like to see this reported in the article.
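To make the cross-validation suggestion in point 1 concrete, here is a minimal sketch of a stratified k-fold split, which keeps the class proportions roughly equal in every fold so that a minority class is represented in each one. The function name `stratified_folds`, the seed and the label counts are assumptions chosen for illustration; they do not reflect the article's dataset.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds with similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Made-up imbalanced labels: class 1 is the minority (illustration only).
labels = [0] * 60 + [1] * 10 + [2] * 30
folds = stratified_folds(labels, k=5)
for fold in folds:
    print(sorted(Counter(labels[i] for i in fold).items()))
```

Each fold would then serve once as the test set while the remaining folds are used for training, and the per-class precision would be reported across folds rather than from a single split.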

Minor points like figures/tables not being mentioned in the text, a missing reference, typos, and other inconsistencies.

  1. In my opinion, you could add the average experiment performance to Table 5, because the experiments use the same datasets and we usually report the average over all classes, e.g. F = all: Accuracy – 84.43, Precision – 68.1.
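As a small sketch of the averaging meant here, the snippet below computes an unweighted (macro) average over classes and, for comparison, an average weighted by the number of samples per class. The per-class values and class sizes are hypothetical placeholders, not the figures from Table 5.

```python
def macro_average(values):
    """Unweighted mean over classes: every class counts equally."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean over classes weighted by the number of samples in each class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# Hypothetical per-class precision (percent) and class sizes.
precision_by_class = [40.0, 65.0, 90.0]
samples_by_class = [10, 30, 60]

print(macro_average(precision_by_class))
print(weighted_average(precision_by_class, samples_by_class))
```

Because the classes are imbalanced, the two averages can differ noticeably, so the table should state which one is reported.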

Source

    © 2020 the Reviewer.