Zarei, Roozbeh and Monemi, Alireza and Marsono, Muhammad Nadzir (2015) Automated dataset generation for training peer-to-peer machine learning classifiers. Journal of Network and Systems Management, 23 (1). pp. 89-110. ISSN 1064-7570
Full text not available from this repository.
Official URL: http://dx.doi.org/10.1007/s10922-013-9279-z
Abstract
Peer-to-peer (P2P) traffic classification based on flow statistics has proven accurate in detecting P2P traffic. However, the performance of a machine learning classifier depends on the quality and recency of its training dataset, so classifying P2P traffic on-line requires removing these limitations. In this paper, automated training dataset generation for on-line P2P traffic classification is proposed to allow frequent classifier retraining. A two-stage training dataset generator (TSTDG) combines a 3-class heuristic classification and a 3-class statistical classification to generate a training dataset automatically. In the heuristic stage, traffic is classified as P2P, non-P2P, or unknown. In the statistical stage, a dual Decision Tree is built from the dataset generated in the heuristic stage to reduce the amount of traffic classified as unknown. The final training dataset consists of all flows classified in these two stages. The proposed system has been evaluated on traces captured from a campus network. The overall results show that the TSTDG can generate an accurate training dataset, classifying around 94% of total flows with high accuracy (98.59%) and a low false positive rate (1.27%).
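The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Flow` fields, the port-based rules in `heuristic_label`, and the midpoint threshold rules that stand in for the paper's dual Decision Tree are all assumptions made for illustration, as is the choice to keep a stage-2 label only when both rules agree.

```python
from dataclasses import dataclass

P2P, NON_P2P, UNKNOWN = "p2p", "non-p2p", "unknown"


@dataclass
class Flow:
    # Hypothetical flow features; the abstract does not list the real ones.
    dst_port: int
    mean_pkt_size: float  # bytes
    duration: float       # seconds


def heuristic_label(flow):
    """Stage 1 (assumed rules): label by port, otherwise leave as unknown."""
    if flow.dst_port in {6881, 6882}:   # ports commonly associated with BitTorrent
        return P2P
    if flow.dst_port in {53, 80, 443}:  # DNS, HTTP, HTTPS
        return NON_P2P
    return UNKNOWN


def fit_threshold_rule(labeled, feature):
    """Simplified stand-in for a decision tree: one threshold on one feature,
    placed midway between the two class means."""
    p2p = [getattr(f, feature) for f, y in labeled if y == P2P]
    non = [getattr(f, feature) for f, y in labeled if y == NON_P2P]
    mean_p2p, mean_non = sum(p2p) / len(p2p), sum(non) / len(non)
    threshold = (mean_p2p + mean_non) / 2
    p2p_is_high = mean_p2p > mean_non

    def predict(flow):
        is_high = getattr(flow, feature) > threshold
        return P2P if is_high == p2p_is_high else NON_P2P

    return predict


def generate_training_set(flows):
    """Run both stages and return the (flow, label) pairs that were classified."""
    stage1 = [(f, heuristic_label(f)) for f in flows]
    labeled = [(f, y) for f, y in stage1 if y != UNKNOWN]

    # Stage 2: train two rules on the stage-1 labels (assumed features).
    rule_a = fit_threshold_rule(labeled, "mean_pkt_size")
    rule_b = fit_threshold_rule(labeled, "duration")

    dataset = list(labeled)
    for flow, label in stage1:
        if label == UNKNOWN:
            a, b = rule_a(flow), rule_b(flow)
            if a == b:  # assumed policy: disagreement leaves the flow unknown
                dataset.append((flow, a))
    return dataset


flows = [
    Flow(6881, 1200.0, 300.0),
    Flow(6882, 1100.0, 250.0),
    Flow(80, 400.0, 5.0),
    Flow(443, 500.0, 10.0),
    Flow(50000, 1150.0, 280.0),  # unknown after stage 1
    Flow(50001, 450.0, 8.0),     # unknown after stage 1
    Flow(50002, 1150.0, 6.0),    # unknown after stage 1, rules disagree
]
training_set = generate_training_set(flows)
```

In this toy input, stage 1 labels four flows, stage 2 recovers two of the three unknown flows, and the remaining flow is left out of the training set because the two rules disagree.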
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | peer-to-peer traffic, traffic classification, training dataset, two-stage classifier |
| Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering |
| Divisions: | Electrical Engineering |
| ID Code: | 57928 |
| Deposited By: | Haliza Zainal |
| Deposited On: | 04 Dec 2016 04:08 |
| Last Modified: | 19 Dec 2021 06:19 |