
Category Archives: Retired Projects

Record/Replay Bug Reproduction for Java

Some bugs will inevitably escape every testing approach and eventually affect users, who then file bug reports. Reproducing field failures in the development environment can be difficult, however, especially for software that behaves non-deterministically, relies on remote resources, or has complex reproduction steps (users may not even know what led up to triggering the flaw, particularly when the software interacts with external devices, databases, etc. in addition to human users). A record/replay approach therefore captures the state of the system just before a bug is encountered, so that the steps leading up to this state can be replayed later in the lab. The naive approach of constant logging in anticipation of a defect tends to produce unacceptably high overheads (reaching 2,000+%) in the deployed application. Novel solutions that lower this overhead typically limit the depth of information recorded (e.g., to only a stack trace, rather than a complete state history) or the breadth of information recorded (e.g., to only log information during execution of a particular subsystem that a developer identifies as potentially buggy). But limiting the depth of information gathered may fail to reproduce an error if the defect does not present itself immediately, and limiting logging to a specific subcomponent of an application makes it possible to reproduce the bug only if it occurred within that subcomponent.

Our new technique, called “Chronicler”, instead captures program execution in a manner that allows for deterministic replay in the lab with very low overhead. The key insight is to log sources of non-determinism only at the library level – allowing for a lightweight recording process while still supporting a complete replay for debugging purposes (programs with no sources of non-determinism, e.g., no user interactions, are trivial to replay – just provide the same inputs). When a failure occurs, Chronicler automatically generates a test case that consists of the inputs (e.g., file or network I/O, user inputs, random numbers, etc.) that caused the system to fail. This general approach can be applied to any “managed” language that runs in a language virtual machine (for instance, JVM or Microsoft’s .NET CLR), requiring no modifications to the interpreter or environment, and thus addresses a different class of programs than related work for non-managed languages like C and C++.
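The idea of logging only the sources of non-determinism can be illustrated with a small sketch. This is not Chronicler's implementation (which instruments Java bytecode); it is a conceptual Python example in which the class `NondeterminismLog` and the function `application` are hypothetical names. Every non-deterministic call goes through one wrapper that records the returned value in the field and returns the logged value in the lab.

```python
import random
import time


class NondeterminismLog:
    """Records values from non-deterministic calls, or replays them in order."""

    def __init__(self, recorded=None):
        # With no prior log we are recording; with one we are replaying.
        self.replaying = recorded is not None
        self.entries = list(recorded) if recorded else []
        self._cursor = 0

    def call(self, name, source):
        """Return source() while recording, or the logged value while replaying."""
        if self.replaying:
            logged_name, value = self.entries[self._cursor]
            if logged_name != name:
                raise RuntimeError(
                    f"replay diverged: expected {logged_name}, got {name}"
                )
            self._cursor += 1
            return value
        value = source()
        self.entries.append((name, value))
        return value


def application(log):
    """Deterministic logic; every outside input is obtained through the log."""
    roll = log.call("random.randint", lambda: random.randint(1, 6))
    stamp = log.call("time.time", time.time)
    return f"rolled {roll} at {stamp}"


# "Field" run: record the non-deterministic inputs.
field_log = NondeterminismLog()
field_result = application(field_log)

# "Lab" run: replay the same inputs and reproduce the same execution.
lab_result = application(NondeterminismLog(field_log.entries))
```

Because the application logic itself is deterministic, only the two logged values are needed to reproduce the run; nothing inside `application` has to be recorded.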

We expect to extend and use this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)



Jonathan Bell, Nikhil Sarda and Gail Kaiser. Chronicler: Lightweight Recording to Reproduce Field Failures. 35th International Conference on Software Engineering, May 2013, pp. 362-371. See teaser video at https://www.youtube.com/watch?v=4IYGfdDnAJg.


Download ChroniclerJ.

Dynamic Information Flow Analysis

We are investigating an approach to runtime information flow analysis for managed languages
that tracks metadata about data values through the execution of a program. We first considered
metadata that propagates labels representing the originating source of each data value, e.g.,
sensitive data from the address book or GPS of a mobile device that should only be accessed on a
need-to-know basis, or potentially suspect data input by end-users or external systems that
should be sanitized before including in database queries, collectively termed “taint tracking”.
We developed and made available open-source the first general purpose implementation of taint
tracking that operates with minimal performance overhead on commodity Java Virtual Machine
implementations (e.g., from Oracle and OpenJDK), by storing the derived metadata “next to” the
corresponding data values in memory, achieved via bytecode rewriting that does not require
access to source code or any changes to the underlying platform. Previous approaches required
changes to the source code, the language interpreter, the language runtime, the operating system
and/or the hardware, or added unacceptable overhead by storing the metadata separately in a
hashmap. Our system has also been applied to Android, where it required changes to 13 lines of
code, in contrast to the state-of-the-art TaintDroid, which added 32,000 lines of code. We are
currently investigating tracking the path conditions constructed during dynamic symbolic
execution of programs, which record the constraints on data values that have reached a given
point in execution (e.g., taking the true or false branch of a series of conditionals). We plan to
use the more sophisticated but slower symbolic execution version as part of several prospective
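The label propagation described in this section can be sketched conceptually as follows. The sketch is an assumption-laden illustration, not Phosphor itself: Phosphor achieves this transparently through bytecode rewriting, whereas here an explicit wrapper class (the hypothetical `Tainted`) keeps the labels next to each value, and `run_query` is a made-up sink.

```python
class Tainted:
    """A value stored together with the set of source labels it derives from."""

    def __init__(self, value, labels=()):
        self.value = value
        self.labels = frozenset(labels)

    def _combine(self, other, op):
        # The result of an operation carries the union of both operands' labels.
        if isinstance(other, Tainted):
            return Tainted(op(self.value, other.value), self.labels | other.labels)
        return Tainted(op(self.value, other), self.labels)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __radd__(self, other):
        # Handles "untainted + tainted", e.g. a plain str on the left.
        return Tainted(other + self.value, self.labels)


def run_query(sql):
    """Hypothetical sink: refuse data that still carries the USER_INPUT label."""
    if isinstance(sql, Tainted) and "USER_INPUT" in sql.labels:
        raise ValueError("unsanitized user input reached a database query")
    return "ok"


name = Tainted("alice'; DROP TABLE users; --", {"USER_INPUT"})
query = "SELECT * FROM users WHERE name = '" + name + "'"
```

Here `query` inherits the `USER_INPUT` label from `name` through string concatenation, so passing it to `run_query` raises an error, while an untainted string is accepted.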

We expect to extend and use this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Jonathan Bell



Jonathan Bell and Gail Kaiser. Phosphor: Illuminating Dynamic Data Flow in the JVM. Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2014, pp. 83-101. Artifact accepted as meeting reviewer expectations.

Jonathan Bell and Gail Kaiser. Dynamic Taint Tracking for Java with Phosphor. International Symposium on Software Testing and Analysis (ISSTA), July 2015, pp. 409-413.


Download Phosphor.

Download Knarr.

Sound Build Acceleration

Our empirical studies found that the bulk of the clock time during the builds of the ~2000 largest and most popular Java open source software applications is spent running test cases, so we seek to speed up large builds by reducing testing time. This is an important problem because real-world industry builds often take many hours, so developers cannot be informed of any errors introduced by their changes while still in context, as needed for continuous integration (a best practice). The consequent lack of attention to failed tests is one of the major reasons that software is deployed with so many security vulnerabilities and other severe bugs. Prior work reduces testing time by running only subsets of the test suite, chosen using various selection criteria. But this model inherently reduces failure detection, and may be unsound because the remaining test cases may depend on removed test cases (causing false positives and false negatives). We instead substantially reduce measured testing time without removing any test cases at all, and thus without reducing failure detection. For example, we developed tools that use static and dynamic analyses to determine exactly which portion of the state written by previous test cases will be read by the next test case, and instrument the bytecode to just-in-time reinitialize only that dependent portion of the state, rather than restarting the JVM between separate test cases, a common industry practice. Some dependencies are unintentional, so our tools also inform developers so they can re-engineer the code to remove those dependencies. Other dependencies are necessary, because series of tests are needed to build up and check each step of complex usage scenarios; for these our tools bundle dependent test cases and distinguish independent sets of test cases to enable sound parallelization of large test suites.
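The bundling of dependent test cases can be sketched as a grouping problem. The sketch below is a simplified illustration, not the tools' actual algorithm: it assumes the read and write sets of each test are already known (in the real tools they come from static and dynamic analysis), and the test and field names are hypothetical. A test that reads a field written by an earlier test is placed in the same bundle as that test.

```python
from collections import defaultdict


def group_dependent_tests(accesses):
    """Group tests that share state, given {test: (reads, writes)} in suite order.

    Two tests land in the same group when one reads a field that an earlier
    test wrote. Groups are independent of each other and keep suite order.
    """
    parent = {test: test for test in accesses}

    def find(test):
        # Union-find lookup with path halving.
        while parent[test] != test:
            parent[test] = parent[parent[test]]
            test = parent[test]
        return test

    last_writer = {}
    for test, (reads, writes) in accesses.items():
        for field in reads:
            if field in last_writer:
                parent[find(test)] = find(last_writer[field])
        for field in writes:
            last_writer[field] = test

    groups = defaultdict(list)
    for test in accesses:
        groups[find(test)].append(test)
    return list(groups.values())


suite = {
    "test_create_account": (set(), {"Accounts.table"}),
    "test_deposit": ({"Accounts.table"}, {"Accounts.table"}),
    "test_parse_date": (set(), set()),
    "test_cache_hit": ({"Cache.entries"}, {"Cache.entries"}),
}
bundles = group_dependent_tests(suite)
```

In this example `test_deposit` is bundled with `test_create_account`, while the other two tests form their own bundles and could run in parallel without changing any outcome.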

We expect to use components of this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Jonathan Bell



Jonathan Bell, Gail Kaiser, Eric Melski and Mohan Dattatreya. Efficient Dependency Detection for Safe Java Test Acceleration. 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Aug-Sep 2015, pp. 770-781.

Jonathan Bell, Eric Melski, Gail Kaiser and Mohan Dattatreya. Accelerating Maven by Delaying Dependencies. 3rd International Workshop on Release Engineering (RelEng), May 2015, p. 28.

Jonathan Bell, Eric Melski, Mohan Dattatreya and Gail Kaiser. Vroom: Faster Build Processes for Java. IEEE Software, 32(2):97-104, Mar/Apr 2015.

Jonathan Bell and Gail Kaiser. Unit Test Virtualization with VMVM. 36th International Conference on Software Engineering (ICSE), June 2014, pp. 550-561. (ACM SIGSOFT Distinguished Paper Award)

Jonathan Bell and Gail Kaiser. Unit Test Virtualization: Optimizing Testing Time. 2nd International Workshop on Release Engineering (RelEng), April 2014.

Jonathan Bell and Gail Kaiser. VMVM: Unit Test Virtualization for Java. ICSE 2014 Formal Demonstrations Track, Companion Proceedings of 36th International Conference on Software Engineering (ICSE), June 2014, pp. 576-579. Video at https://www.youtube.com/watch?v=sRpqF3rJERI.


Download VmVm.

CS/SE Education

About CS/SE Education

We are exploring new techniques and approaches to improve the teaching of computer science and software engineering. Our recent projects and papers are listed below.

Contact: Swapneel Sheth (swapneel@cs.columbia.edu)

Team Members


Prof. Gail Kaiser, kaiser [at] cs.columbia.edu

PhD Students

Swapneel Sheth, swapneel [at] cs.columbia.edu

Former PhD students

Jonathan Bell, jbell [at] cs.columbia.edu
Chris Murphy, cmurphy [at] cs.columbia.edu

See the Software Project Management project listed on our project student advertisements page.



HALO (Highly Addictive, sociaLly Optimized) Software Engineering




Swapneel Sheth, Jonathan Bell, Gail Kaiser. A Competitive-Collaborative Approach for Introducing Software Engineering in a CS2 Class. 26th Conference on Software Engineering Education and Training (CSEE&T), San Francisco CA, pages 41-50, May 2013

Jonathan Bell, Swapneel Sheth, Gail Kaiser. Secret Ninja Testing with HALO Software Engineering. 4th International Workshop on Social Software Engineering (SSE), Szeged, Hungary, pages 43-47, September 2011

Christian Murphy, Gail Kaiser, Kristin Loveland, Sahar Hasan. Retina: Helping Students and Instructors Based on Observed Programming Activities. 40th ACM SIGCSE Technical Symposium on Computer Science Education, Chattanooga TN, pages 178-182, March 2009

Christian Murphy, Dan Phung, and Gail Kaiser. A Distance Learning Approach to Teaching eXtreme Programming. 13th Annual ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE), Madrid, Spain, pages 199-203, June 2008

C. Murphy, E. Kim, G. Kaiser, A. Cannon. Backstop: A Tool for Debugging Runtime Errors. 39th ACM SIGCSE Technical Symposium on Computer Science Education, Portland OR, pages 173-177, March 2008

Tech Reports

Kunal Swaroop Mishra, Gail Kaiser. Effectiveness of Teaching Metamorphic Testing. Technical Report CUCS-020-12, Dept. of Computer Science, Columbia University, November 2012

Fine-Grained Data Management Abstractions

We participated in developing novel technology that leverages the storage abstractions of modern operating systems (e.g., the relational databases and object-relational mappings of Android) to automatically detect fragments strewn across memory, files and databases that are part of the same logical application object, such as an email and its attachments, without requiring source code or any cooperation on the part of application developers. This substrate enabled the development of our prototype tools to check that application-level deletions actually delete all the data fragments related to, say, a document or a photo; to hide (and later unhide) sensitive data, e.g., to protect business data at international border crossings; and to detect when an application collects more data than required by its functionality. In our case study, our system worked correctly on 42 out of 50 real-world applications, and led to publication of “best practices” rules of thumb required for the approach to work on future applications: e.g., fully declare database schemas, use the database to index file storage, and use standard storage libraries. These are admittedly obvious to anyone with the software engineering training that some “app” developers sadly lack.

Contact Professor Roxana Geambasu (roxana@cs.columbia.edu) for further information.

Team Members

Roxana Geambasu
Gail Kaiser

Graduate Students
Riley Spahn

Former Graduate Students
Jonathan Bell



Riley Spahn,  Jonathan Bell, Michael Z. Lee, Sravan Bhamidipati, Roxana Geambasu and Gail Kaiser. Pebbles: Fine-Grained Data Management Abstractions for Modern Operating Systems. 11th USENIX Symposium on Operating Systems Design and Implementation, October 2014, pp. 113-129.

An Open Software Framework for the Emulation and Verification of Drosophila Brain Models on Multiple GPUs

We are working with Prof. Aurel Lazar’s Bionet Lab (http://www.bionet.ee.columbia.edu/) to design, implement and experimentally evaluate an open software framework called the Neurokernel that will enable the isolated and integrated emulation of fly brain model neural circuits and their connectivity patterns (e.g., sensory and locomotion systems) and other parts of the fly’s nervous system on clusters of GPUs, and support the in vivo functional identification of neural circuits. (Note this is NOT the same meaning of “in vivo” as PSL’s In Vivo Testing project.)

The Neurokernel will:

  1. Enable computational/systems neuroscientists to exploit new connectome data by directing emulation efforts at interoperable local processing units (LPUs), functional subdivisions of the brain that serve as its computational substrate;
  2. Capitalize on the representation of stimuli in the time domain to enable the development of novel asynchronous algorithms for processing spikes with neural circuits;
  3. Serve as an extended machine that will provide abstractions and interfaces for scalably leveraging a powerful commodity parallel computing hardware platform to study a tractable neural system;
  4. Serve as a resource allocator that will enable researchers to transparently take advantage of future improvements in this hardware platform;
  5. Enable testing of models, both by easing the detection and localization of programming errors and by operationally verifying the models’ designs against time-encoded signals to/from live fly brains in real-time;
  6. Accelerate the research community’s progress in developing new brain circuit models by facilitating the sharing and refinement of novel and/or improved models of LPUs and their constituent circuits by different groups.

To ease its use by the neuroscience community and enable synergy with existing computational tools and packages, we are developing our software framework in Python, a high-level language that has enjoyed great popularity amongst computational neuroscientists.

As we enhance Neurokernel to model new regions of the fly brain, there may be a negative effect on previous models for other regions. As the fly brain model(s) will be developed in iterative software development cycles, it will be imperative to ensure that each iteration re-verifies the platform and its individual LPUs against the actual fly brain neuropils. We would like these tests on the Python code to be conducted automatically, without requiring the use of our fly interface equipment, which is manually intensive to operate. We are constructing a tool to simulate the fly brain interface for software testing purposes that will capture the stimuli provided to the fly along with its responses. From these sets of inputs and outputs, the tool will automatically generate test cases that recreate the same experiment without the need for repeated interfacing with the fly. This tool will also be used to automatically generate regression tests for the Neurokernel software that depend on other external factors.
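Generating test cases from captured stimuli and responses might look roughly like the sketch below. It does not use the Neurokernel API; `RecordedExperiment`, `generate_regression_tests` and `toy_lpu_model` are hypothetical names, the numeric data are made up, and the comparison tolerance is an assumption.

```python
class RecordedExperiment:
    """Stimulus/response pairs captured once from the (hypothetical) fly interface."""

    def __init__(self):
        self.pairs = []

    def capture(self, stimulus, response):
        self.pairs.append((tuple(stimulus), tuple(response)))


def generate_regression_tests(experiment, tolerance=1e-6):
    """Turn each recorded pair into a check that runs without the equipment."""

    def make_test(stimulus, expected):
        def test(model):
            actual = model(stimulus)
            return len(actual) == len(expected) and all(
                abs(a - e) <= tolerance for a, e in zip(actual, expected)
            )

        return test

    return [make_test(s, r) for s, r in experiment.pairs]


def toy_lpu_model(stimulus):
    """Stand-in for a model under test: doubles each input sample."""
    return [2.0 * x for x in stimulus]


experiment = RecordedExperiment()
experiment.capture([0.1, 0.2], [0.2, 0.4])
experiment.capture([1.0], [2.0])

tests = generate_regression_tests(experiment)
results = [test(toy_lpu_model) for test in tests]
```

Each generated test replays one recorded stimulus against a model and compares the output with the recorded response, so later iterations of the model can be re-checked without repeating the experiment.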

Additional information is available on the Bionet website.

Contact Professor Aurel Lazar (aurel@ee.columbia.edu) for further information.

Team Members

Aurel Lazar
Gail Kaiser

Former PSL Graduate Students
Nikhil Sarda

Metamorphic Testing

Metamorphic testing was originally developed, by others, as an approach to deriving new test
cases from an existing test suite, seeking to find additional bugs not found by the original tests.
Given that a known execution function(input) produces output, the metamorphic properties of a
function (or of an entire application) enable automatic derivation of a new input’ from input such
that the expected output’ can be predicted from output. If the actual output’’ is different from
output’, then there is a flaw in the code or its documentation. We expanded metamorphic testing
in several ways, initially to apply to “non-testable programs”, where there is no test oracle; that
is, metamorphic testing can detect bugs even when we do not know whether output is correct for
input (so conventional testing techniques may not be useful). This problem arises for the
machine learning, data mining, search, simulation and optimization applications prevalent in “big
data” analysis. For example, if a machine learning program generates clusters from a set of
examples, one would expect it to produce the same clusters when the order of the input examples
is permuted; however, we have found anomalies in several widely used machine learning
libraries (e.g., Weka) where the result is different from expected when the set of input examples
is modified in some simple way. We are investigating how to extend the notion of metamorphic
properties to before and after state, beyond just input/output parameters, to find bugs that affect
the internal state but are not evident from input/output. Most recently we developed a tool for
automatically discovering candidate metamorphic properties from execution profiling that
performs better than student subjects; the state of the art is for a human domain expert to
manually define the properties, a tedious, error-prone and expensive process.
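The permutation property mentioned above can be checked without a test oracle, as in the following minimal sketch. The two clustering functions are toy examples written for this illustration (they are not taken from Weka or any other library), and the function names are hypothetical.

```python
import random


def violates_permutation_property(cluster, examples, trials=20, seed=0):
    """Return a permuted input whose clusters differ from the original's, or None.

    No oracle is needed: we never ask whether the clusters are correct,
    only whether they stay the same when the input order changes.
    """
    rng = random.Random(seed)
    baseline = normalize(cluster(examples))
    for _ in range(trials):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        if normalize(cluster(shuffled)) != baseline:
            return shuffled
    return None


def normalize(clusters):
    """Compare clusterings as sets of sets, ignoring order within and between."""
    return {frozenset(c) for c in clusters}


def greedy_cluster(points, radius=1.0):
    """Order-sensitive toy clustering: join the first cluster whose seed is near."""
    clusters = []
    for p in points:
        for c in clusters:
            if abs(p - c[0]) <= radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def sorted_cluster(points, radius=1.0):
    """The same algorithm made order-independent by sorting the input first."""
    return greedy_cluster(sorted(points), radius)


data = [0.0, 0.9, 1.8, 5.0]
counterexample = violates_permutation_property(greedy_cluster, data)
```

The greedy version depends on which point arrives first, so some permutation of `data` yields different clusters and is reported as a counterexample; the sorted version satisfies the property for every permutation.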

Team Members

Gail Kaiser

Graduate Students
Fang-Hsiang (“Mike”) Su

Former Graduate Students
Chris Murphy
Jonathan Bell



Fang-Hsiang Su, Jonathan Bell, Christian Murphy and Gail Kaiser. Dynamic Inference of Likely Metamorphic Properties to Support Differential Testing. 10th IEEE/ACM International Workshop on Automation of Software Test (AST), May 2015, pp. 55-59.

Jonathan Bell, Christian Murphy and Gail Kaiser. Metamorphic Runtime Checking of Applications Without Test Oracles. Crosstalk the Journal of Defense Software Engineering, 28(2):9-13, Mar/Apr 2015.

Christian Murphy, M. S. Raunak, Andrew King, Sanjian Chen, Christopher Imbriano, Gail Kaiser, Insup Lee, Oleg Sokolsky, Lori Clarke, Leon Osterweil. On Effective Testing of Health Care Simulation Software. 3rd International Workshop on Software Engineering in Health Care (SEHC), May 2011, pp. 40-47.

Xiaoyuan Xie, Joshua W. K. Ho, Christian Murphy, Gail Kaiser, Baowen Xu and Tsong Yueh Chen.  Testing and Validating Machine Learning Classifiers by Metamorphic Testing.  Journal of Systems and Software (JSS), Elsevier, 84(4):544-558, April 2011.

Christian Murphy, Kuang Shen and Gail Kaiser. Automatic System Testing of Programs without Test Oracles. International Symposium on Software Testing and Analysis (ISSTA), July 2009, pp. 189-200.

Christian Murphy, Kuang Shen and Gail Kaiser. Using JML Runtime Assertion Checking to Perform Metamorphic Testing in Applications without Test Oracles. 2nd IEEE International Conference on Software Testing, Verification and Validation (ICST), April 2009, pp. 436-445.

Christian Murphy, Gail Kaiser, Lifeng Hu and Leon Wu. Properties of Machine Learning Applications for Use in Metamorphic Testing. 20th International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2008, pp. 867-872.

Christian Murphy, Gail Kaiser and Marta Arias. Parameterizing Random Test Data According to Equivalence Classes. 2nd ACM International Workshop on Random Testing (RT), November 2007, pp.38-41.

Christian Murphy, Gail Kaiser and Marta Arias. An Approach to Software Testing of Machine Learning Applications. 19th International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2007, pp. 167-172.


Download Kabu.

In Vivo Testing

Software products released into the field typically contain residual defects that either were not detected or could not have been detected during pre-deployment testing. For many large, complex software systems, it is infeasible in terms of time and cost to reliably test all configuration options before release using unit test virtualization, test suite minimization, or any other known approach. For example, Microsoft Internet Explorer has over 19 trillion possible combinations of configuration settings. Even given infinite time and resources to test an application and all its configurations, other software on which a software product depends or with which it interacts (e.g., sensor networks, libraries, virtual machines, etc.) is often updated after the product’s release; it is impossible to test with these dependencies prior to the application’s release, because they did not exist yet. Further, as multi-processor and multi-core systems become more prevalent, multi-threaded applications that had only been tested on single- or dual-processor/core machines are more likely to reveal concurrency bugs.

We are investigating a testing methodology that we call “in vivo” testing, in which tests are continuously executed in the deployment environment. This requires a new type of test case, called in vivo tests, which are designed to run from within the executing application in the states achieved during normal end-user operation rather than in a re-initialized or artificial pre-test state. These tests focus on aspects of the program that should hold true regardless of what state the system is in, but differ from conventional assertion checking, since assertions are prohibited from introducing side-effects: in vivo tests may, and typically do, have side-effects on the application’s in-memory state, external files, I/O, etc., but these are all “hidden” from users by cloning the executing application to run the test cases in the same kind of sandbox often used to address security concerns. The in vivo approach can be used for detecting concurrency, security or robustness issues, as well as conventional flaws that may not have appeared in a testing lab (the “in vitro” environment). Our most recent research concerns how to reduce the overhead of such deployment-time testing as well as automatic generation of some of the in vivo test cases from traditional pre-existing unit tests.

In Fall 2007, we developed a prototype framework called Invite, which is described in our tech report and was presented as a poster at ISSTA 2008 (a variant of this paper was presented at ICST 2009, and is available here). This implementation uses an AspectJ component to instrument selected classes in a Java application, such that each method call in those classes has some chance (configurable on a per-method basis) of executing the method’s corresponding unit test. When a test is run, Invite forks off a new process in which to run the test, and the results are logged.
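The instrumentation idea can be sketched in a few lines. This is not Invite's code: Invite uses AspectJ on Java classes and forks a new process for each test, whereas this Python sketch uses a decorator and, as a loudly labeled stand-in for forking, runs the test on a deep copy of the object. The names `in_vivo`, `check_push_then_pop` and `Stack` are hypothetical.

```python
import copy
import functools
import random


def in_vivo(unit_test, probability, rng=random.random, log=None):
    """Wrap a method so each call may also run its unit test on a cloned object."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if rng() < probability:
                clone = copy.deepcopy(self)  # stand-in for forking a process
                try:
                    passed = bool(unit_test(clone))
                except Exception:
                    passed = False
                if log is not None:
                    log.append((method.__name__, passed))
            return method(self, *args, **kwargs)

        return wrapper

    return decorator


test_log = []


def check_push_then_pop(stack):
    """In vivo test: must hold in whatever state the stack is currently in."""
    before = list(stack.items)
    stack.items.append("probe")  # side effect, but only on the clone
    stack.items.pop()
    return stack.items == before


class Stack:
    def __init__(self):
        self.items = []

    @in_vivo(check_push_then_pop, probability=1.0, log=test_log)
    def push(self, item):
        self.items.append(item)


s = Stack()
s.push(1)
s.push(2)
```

The probability is set to 1.0 here only so that the test runs on every call; in deployment it would be a small per-method value so that the testing overhead stays low.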

We also developed a distributed version of Invite, which seeks to amortize the testing load across a community of applications; a paper was published in the student track of ICST 2008. However, this version currently uses only one global value for the probability of running a test, instead of one per method. That value is set by a central server, depending on the size of the “application community”.

In Spring 2008, we looked at various mechanisms for reducing the performance impact of Invite, e.g. by assigning tests to different cores/processors on multi-core/multi-processor machines, or by limiting the number of concurrent tests that may be run. We also looked at ways of balancing testing load across members of a community so that instances under light load pick up more of the testing. Lastly, we created a modified JDK that allows Invite to create copies of files so that in vivo tests do not alter the “real” file system.

In Fall 2008, we ported the Invite framework to C and evaluated more efficient mechanisms for injecting the instrumentation and executing the tests. We also investigated fault localization techniques, which collect data from failed program executions and attempt to discover what caused the failure.

Recently we have investigated ways to make the technique more efficient by only running tests in application states it hasn’t seen before. This cuts down on the number of redundant states that are tested, thus reducing the performance overhead. This work has potential application to domains like model checking and dynamic analysis and was presented in a workshop paper at AST 2010.
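One simple way to skip states that have already been tested is to fingerprint each state and remember the fingerprints, as in the sketch below. How the published technique actually represents and compares states is not described here, so the dictionary-based state, the `StateFilter` class and the variable names are assumptions made for illustration.

```python
import hashlib


def fingerprint(state):
    """Hash a {variable: value} snapshot independently of key order."""
    canonical = repr(sorted(state.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class StateFilter:
    """Remembers fingerprints of states already tested so repeats are skipped."""

    def __init__(self):
        self._seen = set()

    def is_new(self, state):
        """Return True the first time a given state is observed, else False."""
        digest = fingerprint(state)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


seen = StateFilter()
observed = [
    {"queue_len": 0, "mode": "idle"},
    {"queue_len": 3, "mode": "busy"},
    {"mode": "idle", "queue_len": 0},  # same state as the first
]
tests_run = [seen.is_new(state) for state in observed]
```

Only the first two observations would trigger an in vivo test; the third is recognized as a repeat of the first and skipped.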

Currently we are looking at ways to apply the In Vivo approach to the domain of security testing. Specifically, we devised an approach called Configuration Fuzzing in which the In Vivo tests make slight changes to the application configuration and then check “security invariants” to see if there are any vulnerabilities that are configuration-related. This work was presented at the 2010 Workshop on Secure Software Engineering.
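A minimal sketch of the idea follows, under stated assumptions: the configuration is a plain dictionary, each mutation changes a single setting on a copy of the running configuration, and the security invariant (`no_anonymous_admin`) and all setting names are hypothetical examples rather than ones from the published work.

```python
import copy
import random


def fuzz_configuration(config, options, invariant, rounds=200, seed=0):
    """Mutate one setting at a time on a copy and collect invariant violations."""
    rng = random.Random(seed)
    violations = []
    for _ in range(rounds):
        candidate = copy.deepcopy(config)  # never touch the live configuration
        key = rng.choice(sorted(options))
        candidate[key] = rng.choice(options[key])
        if not invariant(candidate) and candidate not in violations:
            violations.append(candidate)
    return violations


def no_anonymous_admin(config):
    """Hypothetical security invariant: admin access always requires a login."""
    return config["require_login"] or not config["admin_panel"]


running = {"require_login": True, "admin_panel": True, "log_level": "info"}
choices = {
    "require_login": [True, False],
    "admin_panel": [True, False],
    "log_level": ["debug", "info", "error"],
}
found = fuzz_configuration(running, choices, no_anonymous_admin)
```

The only single-setting change that breaks this invariant is turning off `require_login` while the admin panel stays enabled, so that is the configuration the fuzzer reports.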

In 2012-2013, we are investigating techniques to efficiently isolate the state of the tests, so as to avoid the effect of the tests on external systems.

Open research questions include:

  • Can the overhead be reduced by offloading test processes to other machines? This is especially important when the application is running on a single-core machine.
  • What sorts of defects are most likely to be detected with such an approach? How can we objectively measure the approach’s effectiveness at detecting defects?
  • How can the tests be “sandboxed” so that they do not affect external entities like databases? We currently assure that there are no changes to the in-process memory or to the file system, but what about external systems?

This is an older project, where we recently revived the main technique for our more recent work on dynamic code similarity.

Contact Mike Su (mikefhsu@su.columbia.edu) for further information about the recent effort.

Team Members

Gail Kaiser

Graduate Students
Fang-Hsiang (“Mike”) Su

Former Graduate Students
Chris Murphy
Jonathan Bell
Matt Chu
Waseem Ilahi
Moses Vaughan

Former Undergraduate Students
Ian Vo


Fang-Hsiang Su, Jonathan Bell, Gail Kaiser and Simha Sethumadhavan. Identifying Functionally Similar Code in Complex Codebases. 24th IEEE International Conference on Program Comprehension (ICPC), May 2016, pp. 1-10.

Christian Murphy, Moses Vaughan, Waseem Ilahi and Gail Kaiser. Automatic Detection of Previously-Unseen Application States for Deployment Environment Testing and Analysis. 5th International Workshop on the Automation of Software Test, May 2010, pp. 16-23.
Christian Murphy, Gail Kaiser, Ian Vo and Matt Chu. Quality Assurance of Software Applications Using the In Vivo Testing Approach. 2nd IEEE International Conference on Software Testing, Verification and Validation (ICST), April 2009, pp. 111-120.
Matt Chu, Christian Murphy and Gail Kaiser. Distributed In Vivo Testing of Software Applications. 1st IEEE International Conference on Software Testing, Verification, and Validation, April 2008, pp. 509-512.


Societal Computing

Societal Computing research is concerned with the impact of computational tradeoffs on societal issues and focuses on aspects of computer science that address significant issues and concerns facing the society as a whole such as Privacy, Climate Change, Green Computing, Sustainability, and Cultural Differences. In particular, Societal Computing research will focus on the research challenges that arise due to the tradeoffs among these areas.

As Social Computing has increasingly captivated the general public, it has become a popular research area for computer scientists. Social Computing research focuses on online social behavior and using artifacts derived from it for providing recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regard to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern.

Societal Computing focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. We feel that these topics, and Societal Computing in general, need to gain prominence as they will provide useful avenues of research leading to increasing benefits for society as a whole.

We studied how software developers vs. end-users perceive data privacy requirements (e.g.,
Facebook), and which concrete measures would mitigate privacy concerns. We conducted a
survey with closed and open questions and collected over 400 valid responses. We found that
end-users often imagine that imposing privacy laws and policies is sufficient, whereas
developers clearly prefer technical measures; it is not terribly surprising that developers familiar
with how software works do not think merely passing a law will be effective. We also found that
both users and developers from Europe and Asia/Pacific are much more concerned about the
possibility of privacy breaches than those from North America.

Team Members

Gail Kaiser

Former Graduate Students
Swapneel Sheth



Swapneel Sheth, Gail Kaiser and Walid Maalej. Us and Them — A Study of Privacy Requirements Across North America, Asia, and Europe. 36th International Conference on Software Engineering (ICSE), pp. 859-870, June 2014.

Swapneel Sheth and Gail Kaiser. The Tradeoffs of Societal Computing. Onward!: ACM Symposium on New Ideas in Programming and Reflections on Software, October 2011, pp. 149-156.



System reliability is a fundamental requirement of a Cyber-Physical System (CPS), i.e., a system featuring a tight combination of, and coordination between, the system’s computational and physical elements. Cyber-physical systems range from critical infrastructure such as the power grid and transportation systems to health and biomedical devices. An unreliable system often leads to disruption of service, financial cost and even loss of human life. In this work, we aim to improve system reliability for cyber-physical systems that meet the following criteria: processing large amounts of data; employing software as a system component; running online continuously; and having an operator-in-the-loop because of human judgment and accountability requirements for safety-critical systems. We limit the scope to this type of cyber-physical system because such systems are important and becoming more prevalent.

To improve system reliability for this type of cyber-physical system, we employ a novel system evaluation approach named automated online evaluation. It works in parallel with the cyber-physical system to continuously conduct automated evaluation at multiple stages along the system’s workflow and provide operator-in-the-loop feedback on reliability improvement. In this approach, data from the cyber-physical system is evaluated. For example, abnormal input and output data can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and higher system reliability. To implement the approach, we designed a system architecture named ARIS (Autonomic Reliability Improvement System).

One technique used by the approach is data quality analysis using computational intelligence, which evaluates data quality in an automated and efficient way to ensure that the running system performs reliably as expected. The computational intelligence is enabled by machine learning, data mining, statistical and probabilistic analysis, and other intelligent techniques. In a cyber-physical system, the data collected from the system, e.g., software bug reports, system status logs, and error reports, are stored in databases. In our approach, these data are analyzed via data mining and other intelligent techniques so that useful information on system reliability, including erroneous data and abnormal system states, can be derived. This reliability-related information is directed to operators so that proper actions can be taken, sometimes proactively based on predictive results, to ensure the proper and reliable execution of the system.

Another technique used by the approach is self-tuning, which automatically self-manages and self-configures the evaluation system so that it adapts to changes in the system and to feedback from the operator. Self-tuning keeps the evaluation system functioning properly, which leads to a more robust evaluation system and improved system reliability.
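
One simple way to picture self-tuning driven by operator feedback is a detection threshold that moves in response to how the operator labels past alerts. The sketch below is only an assumption about what such a loop could look like, not the mechanism used in ARIS; the class name `SelfTuningThreshold`, the feedback labels, and the step and bound values are all hypothetical.

```python
class SelfTuningThreshold:
    """Hypothetical alert threshold that adapts to operator feedback."""

    def __init__(self, initial=3.5, step=0.25, lower=1.0, upper=10.0):
        self.value = initial
        self.step = step
        self.lower = lower  # never become more sensitive than this
        self.upper = upper  # never become less sensitive than this

    def feedback(self, kind):
        """Adjust the threshold from one piece of operator feedback.

        kind is "false_alarm" (an alert the operator dismissed) or
        "missed" (a problem the operator found that raised no alert).
        """
        if kind == "false_alarm":
            # Too many alerts: raise the bar, bounded above.
            self.value = min(self.upper, self.value + self.step)
        elif kind == "missed":
            # A problem slipped through: lower the bar, bounded below.
            self.value = max(self.lower, self.value - self.step)
        else:
            raise ValueError(f"unknown feedback kind: {kind!r}")
        return self.value


if __name__ == "__main__":
    threshold = SelfTuningThreshold()
    for label in ["false_alarm", "false_alarm", "missed"]:
        print(label, "->", threshold.feedback(label))
```

The bounds keep a long run of one-sided feedback from driving the evaluator to either alert on everything or alert on nothing.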


Project Members

Faculty: Prof. Gail Kaiser

PhD Candidate: Leon Wu



Leon Wu and Gail Kaiser. FARE: A Framework for Benchmarking Reliability of Cyber-Physical Systems. In Proceedings of the 9th Annual IEEE Long Island Systems, Applications and Technology Conference (LISAT), May 2013.

Leon Wu and Gail Kaiser. An Autonomic Reliability Improvement System for Cyber-Physical Systems. In Proceedings of the IEEE 14th International Symposium on High-Assurance Systems Engineering (HASE), October 2012.

Leon Wu, Gail Kaiser, David Solomon, Rebecca Winter, Albert Boulanger, and Roger Anderson. Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation. In the 8th Annual IEEE Long Island Systems, Applications and Technology Conference (LISAT), May 2012.

Rebecca Winter, David Solomon, Albert Boulanger, Leon Wu, and Roger Anderson. Using Support Vector Machine to Forecast Energy Usage of a Manhattan Skyscraper. In New York Academy of Science Sixth Annual Machine Learning Symposium, New York, NY, USA, October 2011.

Leon Wu, Gail Kaiser, Cynthia Rudin, and Roger Anderson. Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid. In Proceedings of the ACM SIGKDD 2011 Workshop on Data Mining for Service and Maintenance, August 2011.

Leon Wu and Gail Kaiser. Constructing Subtle Concurrency Bugs Using Synchronization-Centric Second-Order Mutation Operators. In Proceedings of the 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2011.

Leon Wu, Boyi Xie, Gail Kaiser, and Rebecca Passonneau. BugMiner: Software Reliability Analysis Via Data Mining of Bug Reports. In Proceedings of the 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE), July 2011.

Leon Wu, Gail Kaiser, Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Haimonti Dutta, and Manoj Pooleery. Evaluating Machine Learning for Improving Power Grid Reliability. In ICML 2011 Workshop on Machine Learning for Global Challenges, July 2011.

Leon Wu, Timothy Teräväinen, Gail Kaiser, Roger Anderson, Albert Boulanger, and Cynthia Rudin. Estimation of System Reliability Using a Semiparametric Model. In Proceedings of the IEEE EnergyTech 2011 (EnergyTech), May 2011.

Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Phil Gross, Bert Huang, Steve Ierome, Delfina Isaac, Artie Kressner, Rebecca Passonneau, Axinia Radeva, and Leon Wu. Machine Learning for the New York City Power Grid. IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2011.