223 Pages
English

Supervised machine learning assisted real-time flow classification system [Electronic resource] : a real-time approach to flow classification / Isara Anantavrasilp


Published 01 January 2010

INSTITUT FÜR INFORMATIK
DER TECHNISCHEN UNIVERSITÄT MÜNCHEN
Research and Teaching Unit I
Applied Software Engineering
Supervised Machine Learning Assisted
Real-Time Flow Classification System
A Real-Time Approach to Flow Classification
Isara Anantavrasilp
Complete copy of the dissertation approved by the Faculty of Informatics of the Technische Universität
München for the award of the academic degree of
Doktor der Naturwissenschaften (Dr. rer. nat.).
Chair: Univ.-Prof. Dr. Georg Carle
Examiners of the dissertation: 1. Univ.-Prof. Bernd Brügge, PhD.
2. Univ.-Prof. Dr. Dr. h. c. Alexander Schill
Technische Universität Dresden
The dissertation was submitted to the Technische Universität München on 17.05.2010
and accepted by the Faculty of Informatics on 24.09.2010.
To Mankind, Science ...and You.
Abstract
A Flow Classification System (FCS) is a process and mechanism that assigns a
class to a network connection (flow). In QoS-aware networks, QoS-aware applications
can identify and assign service classes to their flows. The flows are then treated
by the networks according to their classes. However, most of the existing network
applications are QoS-unaware applications, prompting a need for an enhanced FCS
that can automatically identify the service classes of the flows.
This dissertation describes a new FCS, called Supervised Machine learning Assisted
Real-Time (SMART) Flow Classification System, designed to classify QoS-unaware
flows in real-time. It uses a novel concept of flow prefix, which refers to a certain
number of flow packets. We empirically show that the characteristics of a flow can
be estimated by observing only up to a specific prefix. Evaluations on benchmark
datasets have shown that observing only 11 packets is sufficient to achieve more than
90% classification accuracy.
SMART uses a machine learning algorithm to automatically identify relationships
between the characteristics and the classes of the flows from QoS-aware applications.
The learned relationships are then used to identify the QoS-unaware flows. We have
evaluated our SMART FCS over a variety of real-world data, including flow samples
collected from individual users and a large dataset collected from an edge router of an
organizational network. The results show that our approach achieves average
correctness of 98.82% and 99.66% in individual-users and large-network benchmark datasets,
respectively.
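
The flow-prefix idea described above can be illustrated with a minimal sketch. It is not the SMART implementation: the two features (mean packet size and mean inter-arrival time), the nearest-centroid classifier, the class names and the toy flows are all assumptions made up for illustration, and only the prefix length of 11 packets is taken from the abstract. The sketch shows only that a flow can be labelled from its first k packets using relationships learned from already-labelled flows.

```python
from collections import defaultdict
from math import dist

def prefix_features(packet_sizes, inter_arrivals, k=11):
    """Summarise only the first k packets of a flow (its prefix).

    Hypothetical features: mean packet size and mean inter-arrival time.
    """
    sizes = packet_sizes[:k]
    gaps = inter_arrivals[:max(k - 1, 0)]  # k packets have k-1 gaps
    mean_size = sum(sizes) / len(sizes)
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return (mean_size, mean_gap)

def train_centroids(labelled_flows, k=11):
    """Learn one centroid per class from (sizes, gaps, label) triples."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for sizes, gaps, label in labelled_flows:
        mean_size, mean_gap = prefix_features(sizes, gaps, k)
        acc = sums[label]
        acc[0] += mean_size
        acc[1] += mean_gap
        acc[2] += 1
    return {label: (s / n, g / n) for label, (s, g, n) in sums.items()}

def classify(centroids, packet_sizes, inter_arrivals, k=11):
    """Assign the class whose centroid is closest to the prefix features.

    Features are left unscaled here, so packet size dominates the distance;
    a real system would normalise them.
    """
    f = prefix_features(packet_sizes, inter_arrivals, k)
    return min(centroids, key=lambda label: dist(f, centroids[label]))

# Made-up labelled flows standing in for flows from QoS-aware applications.
toy_training = [
    ([1200] * 20, [0.02] * 19, "streaming"),
    ([1300] * 20, [0.03] * 19, "streaming"),
    ([80] * 20, [0.5] * 19, "interactive"),
    ([100] * 20, [0.4] * 19, "interactive"),
]
centroids = train_centroids(toy_training)

# An unlabelled flow is classified from its prefix alone.
label = classify(centroids, [1250] * 15, [0.02] * 14)
```

Because `prefix_features` discards everything after the k-th packet, the decision can be made while the flow is still in progress, which is the property the real-time setting relies on.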
Acknowledgement
I owe my deepest gratitude to my Doktorvater, Prof. Bernd Brügge, professor for
Applied Software Engineering at Technische Universität München, who has supervised
and guided me during my research. Besides providing me with valuable scientific and
technical advice, he also motivated and encouraged me with perpetual energy and
enthusiasm. It is an honor for me to work with him.
I am heartily thankful to Prof. Alexander Schill, professor for Computer Networks
and Prof. Steffen Hölldobler, professor for Knowledge Representation and Reasoning at
Dresden University of Technology. This thesis would not have been possible without
their kind help, suggestions as well as detailed and constructive comments. I am
deeply grateful to my supervisor at BenQ Mobile GmbH & Co. OHG, Dr. Thorsten
Schöler, for his important support throughout this work. I also warmly thank
Dr. Kenjiro Cho, Deputy Research Director at Internet Initiative Japan, Inc., and the
WIDE Project, Japan, for their valuable data.
I am indebted to my friends and colleagues at TU-Dresden, TU-München and BenQ
Mobile for their scientific suggestions, including Ari Saptawijaya, Tobias Pietzsch,
Boontawee Suntisrivaraporn, Sebastian Bader, Bertram Fronhöfer, Surapa Thiemjarus,
Kiattisak Roonprasang, Tansir Ahmed, Petr Osipov, Arsalan Minhas, Dennis Pagano,
Damir Ismailović, Florian Schneider, Helmut Naughton, Nitesh Narayan, and Yang
Li. I am also pleased to thank Araya Raiwa, Surapa Thiemjarus, Teerapat
Anantavarasilpa, Sujitra Thongjab and Phee for their tireless efforts in data collection. My
special thanks go to Suvaporn Photjananuwat for the wonderful cover of this thesis.
My most sincere and warmest gratitude goes to my beloved families, Baumeister and
Anantavrasilp, especially Mama and Papa, mom and dad, and Yai. Without their
advice, support and encouragement, this thesis would not have been possible. Thank you
all for believing in me. In addition, I would also like to thank Bee, my brother, who
has proof-read every single word of this thesis.
I would like to show my gratitude to my friends in Thailand, Germany and the
United Kingdom for trusting and believing in me.
Overview
Chapter 1
Introduction to the research
Motivation and need of real-time adaptive flow classification
Problem description and challenges
Overview of Supervised Machine learning Assisted Real-Time (SMART) flow
classification system
Research contributions
Chapter 2
Overview of computer networks and quality-of-service (QoS) support
Discussion of flow classification system components
Survey and comparisons of previous systems
Chapter 3
Rigorous and unified mathematical framework for flow classification
System decomposition of SMART using Unified Modeling Language (UML)
Reviews of previous FCSs using the proposed framework
Chapter 4
In-depth review and analysis of current machine learning algorithms
Definitions of their performance measurements
Chapter 5
Description of the SMART and its components
Extensive evaluations of the system on individual-user and large-network datasets
Chapter 6
Exploring the possibilities of real-time flow classification
Extension of SMART to support real-time classification and empirical
evaluations of SMART
Chapter 7
Conclusion of research
Directions for future work
Appendix A
Signatures of the communication protocols used in experiments
Appendix B
Additional evaluation results