TY - GEN
T1 - Software monitoring with bounded overhead
AU - Callanan, Sean
AU - Dean, Daniel J.
AU - Gorbovitski, Michael
AU - Grosu, Radu
AU - Seyster, Justin
AU - Smolka, Scott A.
AU - Stoller, Scott D.
AU - Zadok, Erez
PY - 2008
Y1 - 2008
N2 - In this paper, we introduce the new technique of High-Confidence Software Monitoring (HCSM), which allows one to perform software monitoring with bounded overhead and concomitantly achieve high confidence in the observed error rates. HCSM is formally grounded in the theory of supervisory control of finite-state automata: overhead is controlled, while maximizing confidence, by disabling interrupts generated by the events being monitored - and hence avoiding the overhead associated with processing these interrupts - for as short a time as possible under the constraint of a user-supplied target overhead O_target. HCSM is a general technique for software monitoring in that HCSM-based instrumentation can be attached at any system interface or API. A generic controller implements the optimal control strategy described above. As a proof of concept, and as a practical framework for software monitoring, we have implemented HCSM-based monitoring for both bounds checking and memory leak detection. We have further conducted an extensive evaluation of HCSM's performance on several real-world applications, including the Lighttpd Web server, and a number of special-purpose micro-benchmarks. Our results demonstrate how confidence grows in a monotonically increasing fashion with the target overhead, and that tight confidence intervals can be obtained for each target-overhead level.
AB - In this paper, we introduce the new technique of High-Confidence Software Monitoring (HCSM), which allows one to perform software monitoring with bounded overhead and concomitantly achieve high confidence in the observed error rates. HCSM is formally grounded in the theory of supervisory control of finite-state automata: overhead is controlled, while maximizing confidence, by disabling interrupts generated by the events being monitored - and hence avoiding the overhead associated with processing these interrupts - for as short a time as possible under the constraint of a user-supplied target overhead O_target. HCSM is a general technique for software monitoring in that HCSM-based instrumentation can be attached at any system interface or API. A generic controller implements the optimal control strategy described above. As a proof of concept, and as a practical framework for software monitoring, we have implemented HCSM-based monitoring for both bounds checking and memory leak detection. We have further conducted an extensive evaluation of HCSM's performance on several real-world applications, including the Lighttpd Web server, and a number of special-purpose micro-benchmarks. Our results demonstrate how confidence grows in a monotonically increasing fashion with the target overhead, and that tight confidence intervals can be obtained for each target-overhead level.
UR - https://www.scopus.com/pages/publications/51049105062
U2 - 10.1109/IPDPS.2008.4536433
DO - 10.1109/IPDPS.2008.4536433
M3 - Conference contribution
AN - SCOPUS:51049105062
SN - 9781424416943
T3 - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
BT - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
T2 - IPDPS 2008 - 22nd IEEE International Parallel and Distributed Processing Symposium
Y2 - 14 April 2008 through 18 April 2008
ER -