Home » Projects (Page 2)

Category Archives: Projects

Improving rr record/replay support for Java

rr (see http://rr-project.org/) is a widely-used open-source C/C++ debugging tool for Linux that enhances gdb with record/replay capabilities.  PSL is seeking several project students (to work in a team) for spring 2018 to adapt rr to record/replay Java applications on top of the JVM.  rr already works with Java/JVM recording/replaying all system calls.  The goal is to modify rr to record/replay only those system calls specific to the Java application recorded/replayed, not the system calls due to the JVM’s internal mechanisms, to improve performance of Java application recording and replaying. (Eventually we plan to modify rr further, to support mutable replay, but that will be a later semester.) Prospective project students should have strong Java/JVM, C/C++ and Linux skills, and preferably have completed 4115 and 4118.

Contact Prof. Kaiser at kaiser@cs.columbia.edu.

Record/Replay Bug Reproduction for Java

There will inevitably continue to be bugs that are not detected by any testing approach, but eventually impact users who then file bug reports. Reproducing field failures in the development environment can be difficult, however, especially in the case of software that behaves non-deterministically, relies on remote resources, or has complex reproduction steps (the users may not even know what led up to triggering the flaw, particularly in the case of software interacting with external devices, databases, etc. in addition to human users). So a record/replay approach is used to capture the state of the system just before a bug is encountered, so the steps leading up to this state can be replayed later in the lab. The naive approach of constant logging in anticipation of a defect tends to produce unacceptably high overheads (reaching 2,000+ %) in the deployed application. Novel solutions that lower this overhead typically limit the depth of information recorded (e.g., to use only a stack trace, rather than a complete state history) or the breadth of information recorded (e.g., to only log information during execution of a particular subsystem that a developer identifies as potentially buggy). But limiting the depth of information gathered may fail to reproduce an error if the defect does not present itself immediately and limiting logging to a specific subcomponent of an application makes it only possible to reproduce the bug if it occurred within that subcomponent.

Our new technique, called “Chronicler”, instead captures program execution in a manner that allows for deterministic replay in the lab with very low overhead. The key insight is to log sources of non-determinism only at the library level – allowing for a lightweight recording process while still supporting a complete replay for debugging purposes (programs with no sources of non-determinism, e.g., no user interactions, are trivial to replay – just provide the same inputs). When a failure occurs, Chronicler automatically generates a test case that consists of the inputs (e.g., file or network I/O, user inputs, random numbers, etc.) that caused the system to fail. This general approach can be applied to any “managed” language that runs in a language virtual machine (for instance, JVM or Microsoft’s .NET CLR), requiring no modifications to the interpreter or environment, and thus addresses a different class of programs than related work for non-managed languages like C and C++.

We expect to extend and use this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)



Jonathan Bell, Nikhil Sarda and Gail Kaiser. Chronicler: Lightweight Recording to Reproduce Field Failures. 35th International Conference on Software Engineering, May 2013, pp. 362-371. See teaser video at https://www.youtube.com/watch?v=4IYGfdDnAJg.


Download <a href=”http://ChroniclerJ.

Mutable Replay

Society is increasingly reliant on software, but deployed software contains security vulnerabilities and other bugs that can threaten privacy, property and even human lives. When a security vulnerability or critical error is discovered, a software patch is issued to attempt to fix the problem, but patches themselves can be incorrect, inadequate, and break necessarily functionality. This project investigates the full workflow for the developer to rapidly diagnose the root cause of the vulnerability or error, for the developer to test that a prospective patch indeed completely removes the defect, and for users to check the issued patch on their own configurations and workloads before adopting the patch.

This project explores the use of mutable replay to help reproduce, diagnose, and fix software bugs. A low-overhead recorder records the execution of software in case a failure or exploit occurs, allowing the developer to replay the recorded log to reproduce the problem. Mutable replay allows logs recorded with the buggy version to be replayed after the modest code changes typical of critical patches to show that patches work correctly to resolve detected problems. This project leverages semantic information readily available to the developer to conduct well-understood static and dynamic analyses to correctly transform the recorded log to enable mutable replay. The results of this research will benefit society and individuals by simplifying and hastening both generation and validation of patches, ultimately making software more reliable and secure.

Contact Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Graduate Students
Anthony Saeiva Narin

Former Graduate Students
Jonathan Bell
Kenny Harvey   


Dynamic Information Flow Analysis

We are investigating an approach to runtime information flow analysis for managed languages
that tracks metadata about data values through the execution of a program. We first considered
metadata that propagates labels representing the originating source of each data value, e.g.,
sensitive data from the address book or GPS of a mobile device that should only be accessed on a
need-to-know basis, or potentially suspect data input by end-users or external systems that
should be sanitized before including in database queries, collectively termed “taint tracking”.
We developed and made available open-source the first general purpose implementation of taint
tracking that operates with minimal performance overhead on commodity Java Virtual Machine
implementations (e.g., from Oracle and OpenJDK), by storing the derived metadata “next to” the
corresponding data values in memory, achieved via bytecode rewriting that does not require
access to source code or any changes to the underlying platform. Previous approaches required
changes to the source code, the language interpreter, the language runtime, the operating system
and/or the hardware, or added unacceptable overhead by storing the metadata separately in a
hashmap. Our system has also been applied to Android, where it required changes in 13 lines of
code, contrasted to the state of the art TaintDroid which added 32,000 lines of code. We are
currently investigating tracking the path conditions constructed during dynamic symbolic
execution of programs, which record the constraints on data values that have reached a given
point in execution (e.g., taking the true or false branch of a series of conditionals). We plan to
use the more sophisticated but slower symbolic execution version as part of several prospective

We expect to extend and use this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Jonathan Bell



Jonathan Bell and Gail Kaiser. Phosphor: Illuminating Dynamic Data Flow in the JVM. Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2014,pp. 83-101. Artifact accepted as meeting reviewer expectations.

Jonathan Bell and Gail Kaiser. Dynamic Taint Tracking for Java with PhosphorInternational Symposium on Software Testing and Analysis (ISSTA), July 2015, pp. 409-413.


Download Phosphor.

Download Knarr.


genSpace Web Services

The core functionalities of genSpace web services can be categorized into 7 facades in the following:

  1. UserFacade: Retrieve corresponding information about the current user in genSpace. The selected web methods are listed as following:
    • getProfile: Retrieve the profile of current user
    • updateUser: Update the information, such as password, for current user
    • getMyNetworks: Retrieve all networks that the current user is in
  2. UsageInformation: Retrieve information about tools and workflows of geWorkbench. The selected web methods are listed as following:
    • getTool: Retrieve the corresponding tool by too ID
    • getWorkflow: Retrieve the corresponding workflow by workflow ID
    • getMyNotes: Retrieve the note from user for a specific analysis event. An analysis event includes an invocation of geWorkbench tool
    • saveNote: Save the public note from user for a specific analysis event. This note can be seen by user and user’s friends
    • savePriv_Note: Save the private note from user for a specific analysis event. This note can be seen by user only
  3. ToolUsageInformation: Retrieve statistic data about tools and workflows of geWorkbench. The selected web methods are listed as following:
    • getToolsByPopularity: Retrieve the data of using times for each tool in geWorkbench
    • getWorkflowsByPopularity: Retrieve the data of using times for each work flow in geWorkbench
    • getToolSuggestion: Retrieve the suggestion for next tool based on current tool
    • sendUsageEvent: Store the logged analysis event from geWorkbench in genSpace
  4. FriendFacade: Retrieve information about friends of current user in genSpace. The selected web methods are listed as following:
    • getFriendRequest: Inform user that a new friend in genSpace who sends a friend-invitation
    • addFriend: Accept friend-invitation from a new friend
    • getFriends: Retrieve all users who are friends of user in genSpace
    • getMyFriendsEvents: Retrieve the analysis events from friends of user in genSpace
  5. PublicFacade: Responsible to register new users. The selected web methods are listed as following:
    • register: Register new users for genSpace. This method is now synchronized with the registration function in geWorkbench. When a new user registers in geWorkbench, she/he will also be registered in genSpace immediately
    • userExists: A guarding method to check if the registered username has been used by a former user
  6. NetworkFacade: Retrieve information about corresponding networks of user in genSpace. the selected web methods are listed as following:
    • createNetwork: Allow user to create a new network in genSpace
    • joinNetwork: Allow user to request joining an existing network in genSpace
    • leaveNetwork: Allow user to leave a network
  7. WorkflowRepository: A managing facade for users to store workflows created by them or their friends in geWorkbench
    • addWorkflow: Add a new workflow in the workflow repository
    • addComment: Add comment to a specific workflow in the workflow repository
    • sendWorkflow: Send an existing workflow in the workflow repository to friends

Sound Build Acceleration

Sound Build Acceleration: Our empirical studies found that the bulk of the clock time during the builds of the ~2000 largest and most popular Java open source software applications is spent running test cases, so we seek to speed up large builds by reducing testing time. This is an important problem because real-world industry builds often take many hours, so developers cannot be informed of any errors introduced by their changes while still in context – as needed for continuous integration (best practice). The consequent lack of attention to failed tests is one of the major reasons that software is deployed with so many security vulnerabilities and other severe bugs. Prior work reduces testing time by running only subsets of the test suite, chosen using various selection criteria. But this model inherently reduces failure detection, and may be unsound because remaining test cases may have dependencies on removed test cases (causing false positives and false negatives). We thought out of the box to substantially reduce measured testing time without removing any test cases at all, thus no reduction in failure detection. For example, we developed tools that use static and dynamic analyses to determine exactly which portion of the state written by previous test cases will be read by the next test case, and instrument the bytecode to just-in-time reinitialize only that dependent portion of the state, rather than restarting the JVM between separate test cases, a common industry practice. Some dependencies are unintentional, so our tools also inform developers so they can re-engineer the code to remove those dependencies. Other dependencies are necessary, because series of tests are needed to build up and check each step of complex usage scenarios; for these our tools bundle dependent test cases and distinguish independent sets of test cases to enable sound parallelization of large test suites.

We expect to use components of this tool as part of the Mutable Replay project, and are seeking new project students in tandem with that effort.

Contact Professor Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Jonathan Bell



Jonathan Bell, Gail Kaiser, Eric Melski and Mohan Dattatreya. Efficient Dependency Detection for Safe Java Test Acceleration. 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Aug-Sep 2015, pp. 770-781.

Jonathan Bell, Eric Melski, Gail Kaiser and Mohan Dattatreya. Accelerating Maven by Delaying Dependencies.3rd International Workshop on Release Engineering (RelEng), May 2015, p. 28.

Jonathan Bell, Eric Melski, Mohan Dattatreya and Gail Kaiser. Vroom: Faster Build Processes for Java.IEEE Software, 32(2):97-104, Mar/Apr 2015.

Jonathan Bell and Gail Kaiser. Unit Test Virtualization with VMVM. 36th International Conference on Software Engineering (ICSE), June 2014, pp. 550-561. (ACM SIGSOFT Distinguished Paper Award)

Jonathan Bell and Gail Kaiser. Unit Test Virtualization: Optimizing Testing Time. 2nd International Workshop on Release Engineering (RelEng), April 2014.

Jonathan Bell and Gail Kaiser. VMVM: Unit Test Virtualization for Java. ICSE 2014 Formal Demonstrations Track, Companion Proceedings of 36th International Conference on Software Engineering (ICSE), June 2014, pp. 576-579. Video at https://www.youtube.com/watch?v=sRpqF3rJERI.


Download VmVm.

CS/SE Education

About CS/SE Education

We are exploring new techniques and approaches to improve the teaching of computer science and software engineering. Our recent projects and papers are listed below.

Contact: Swapneel Sheth (swapneel@cs.columbia.edu)

Team Members


Prof. Gail Kaiser, kaiser [at] cs.columbia.edu

Phd Students

Swapneel Sheth, swapneel [at] cs.columbia.edu

Former PhD students

Jonathan Bell, jbell [at] cs.columbia.edu
Chris Murphy
, cmurphy [at] cs.columbia.edu

See the Software Project Management project listed on our project student advertisements page.



HALO (Highly Addictive, sociaLly Optimized) Software Engineering




Swapneel Sheth, Jonathan Bell, Gail Kaiser. A Competitive-Collaborative Approach for Introducing Software Engineering in a CS2 Class. 26th Conference on Software Engineering Education and Training (CSEE&T), San Francisco CA, pages 41-50, May 2013

Jonathan Bell, Swapneel Sheth, Gail Kaiser. Secret Ninja Testing with HALO Software Engineering. 4th International Workshop on Social Software Engineering Workshop (SSE), Szeged, Hungary, pages 43-47, September 2011

Christian Murphy, Gail Kaiser, Kristin Loveland, Sahar Hasan. Retina: Helping Students and Instructors Based on Observed Programming Activities. 40th ACM SIGCSE Technical Symposium on Computer Science Education, Chattanooga TN, pages 178-182, March 2009

Christian Murphy, Dan Phung, and Gail Kaiser. A Distance Learning Approach to Teaching eXtreme Programming. 13th Annual ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE), Madrid, Spain, pages 199-203, June 2008

C. Murphy, E. Kim, G. Kaiser, A. Cannon. Backstop: A Tool for Debugging Runtime Errors. 39th ACM SIGCSE Technical Symposium on Computer Science Education, Portland OR, pages 173-177, March 2008

Tech Reports

Kunal Swaroop Mishra, Gail Kaiser. Effectiveness of Teaching Metamorphic Testing.Technical Report CUCS-020-12, Dept. of Computer Science, Columbia University, November 2012

Fine-Grained Data Management Abstractions

We participated in developing novel technology that leverages the storage abstractions of
modern operating systems (e.g., the relational databases and object-relational mappings of
Android) to automatically detect fragments strewn across memory, files and databases that is part
of the same logical application object, such as an email and its attachments, without requiring
source code or any cooperation on the part of application developers. This substrate enabled the development of our prototype tools to check that application-level deletions in fact actually delete all the data fragments related to, say, a document or a photo; to hide (and later unhide) sensitive data, e.g., to protect business data at international border crossings; and to detect when an application collects more data than required by its functionality. In our case study, our system worked correctly on 42 out of 50 real-world applications, and lead to publication of “best practices” rules of thumb required for the approach to work on future applications — e.g., fully declare database schemas, use the database to index file storage, use standard storage libraries, which are admittedly obvious to anyone with the software engineering training that some “app” developers sadly lack.

Contact Professor Roxana Geambasu (roxana@cs.columbia.edu) for further information.

Team Members

Roxana Geambasu
Gail Kaiser

Graduate Students
Riley Spahn

Former Graduate Students
Jonathan Bell



Riley Spahn,  Jonathan Bell, Michael Z. Lee, Sravan Bhamidipati, Roxana Geambasu and Gail Kaiser. Pebbles: Fine-Grained Data Management Abstractions for Modern Operating Systems. 11th USENIX Symposium on Operating Systems Design and Implementation, October 2014, pp. 113-129.

Code Similarity

Dynamic Code Similarity: This is a multi-disciplinary project joint with Profs. Simha Sethumadhavan and Tony Jebara. “Code clones” are statically similar code fragments that usually arise via copy/paste or independently writing lookalike code; best practice removes clones (refactoring) or tracks them (e.g., to ensure bugs fixed in one clone are also fixed in others). This part of the project instead studies dynamically similar code for two different similarity models. One model is functional similarity, finding code fragments that exhibit similar input/output behavior during execution. Our other dynamic similarity model is the novel notion of behavioral similarity, which we call “code relatives”. Two or more code fragments are deemed code relatives if their executions are similar. We model this as finding similarities among the dynamic data dependency graphs representing instruction-level execution traces. We used machine learning techniques to devise a (relatively) fast inexact subgraph isomorphism algorithm to cluster these execution-level similarities. Our experiments show that both of our tools find most of the same “similar” code as the best static code clone detectors but also find many others they can’t, because the code looks very different even though functionally and/or behaviorally similar; however, dynamic detection will not necessarily find all static code clones because lookalike code involving polymorphism need not exhibit the same function/behavior. Our behavioral and functional similarity detectors do not always find the same similarities, because two or more code fragments may compute the same function using very different algorithms. Thus these kinds of techniques complement each other. Beyond the conventional applications of static code clone detection, dynamic similarity detection also addresses malware detection, program understanding, re-engineering legacy software to use modern APIs, and informing design of hardware accelerators and compiler optimizations.

Static Code Similarity: We also investigate of static similarity detection to augment our similarity detection toolkit. This work is joint with Prof. Baishakhi Ray of the University of Virginia and Prof. Jonathan Bell of George Mason University. Unlike most other static code clone research, we look for similarities at the instruction level rather than in the source code, so our techniques can work even on obfuscated executables where no source code is available and thus conventional static detectors cannot be applied. This situation arises for both malware and misappropriated intellectual property. We exploit the increasingly popular notion of “big code”, i.e., training from open-source repositories, using features that combine instruction-level call graph analysis and topic modeling (an NLP-based machine learning technique). We believe we can effectively deobfuscate most suspect code by finding similarities within a corpus consisting of known code and its obfuscated counterparts. Our approach handles control flow transformations and introduction of extraneous methods, not just method names.

Contact Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Fang-Hsiang (“Mike”) Su
Jonathan Bell
Kenny Harvey   
Apoorv Patwardhan



Fang-Hsiang Su, Jonathan Bell, Kenneth Harvey, Simha Sethumadhavan, Gail Kaiser and Tony Jebara. Code Relatives: Detecting Similarly Behaving Software. 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), November 2016. Artifact accepted as platinum.

Fang-Hsiang Su, Jonathan Bell, Gail Kaiser and Simha Sethumadhavan. Identifying Functionally Similar Code in Complex Codebases. 24th IEEE International Conference on Program Comprehension (ICPC), May 2016, pp. 1-10. (ACM SIGSOFT Distinguished Paper Award)

Fang-Hsiang Su, Jonathan Bell, and Gail Kaiser. Challenges in Behavioral Code Clone Detection (Position Paper). 10th International Workshop on Software Clones (IWSC), affiliated with IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), March 2016, volume 3, pp. 21-22. (People’s Choice Award for Best Position Paper)


Download DyCLink from github.

Download HitoshiIO from github.

Download Code Similarity Experiments toolkit from github.

An Open Software Framework for the Emulation and Verification of Drosophila Brain Models on Multiple GPUs

We are working with Prof. Aurel Lazar’s Bionet Lab (http://www.bionet.ee.columbia.edu/) to design, implement and experimentally evaluate an open software framework called the Neurokernel that will enable the isolated and integrated emulation of fly brain model neural ciits and their connectivity patterns (e.g., sensory and locomotion systems)  and other parts of the fly’s nervous system on clusters of GPUs, and support the in vivo functional identification of neural circuits.  (Note this is NOT the same meaning of “in vivo” as PSL’s In Vivo Testing project.)

The Neurokernel will:

  1. Enable computational/systems neuroscientists to exploit new connectome data by directing emulation efforts at interoperable local processing units (LPUs), functional subdivisions of the brain that serve as its computational substrate;
  2. Capitalize on the representation of stimuli in the time domain to enable the development of novel asynchronous algorithms for processing spikes with neural circuits;
  3. Serve as an extended machine that will provide abstractions and interfaces for scalably leveraging a powerful commodity parallel computing hardware platform to study a tractable neural system;
  4. Serve as a resource allocator that will enable researchers to transparently take advantage of future improvements in this hardware platform;
  5. Enable testing of models, both by easing the detection and localization of programming errors and by operationally verifying the models’ designs against time-encoded signals to/from live fly brains in real-time;
  6. Accelerate the research community’s progress in developing new brain circuit model by facilitating the sharing and refinement of novel and/or improved models of LPUs and their constituent circuits by different  groups.

To ease its use by the neuroscience community and enable synergy with existing computational tools and packages, we are developing our software framework in Python, a high-level language that has enjoyed great popularity amongst computational neuroscientists.

As we enhance Neurokernel to model new regions of the fly brain, there may be a negative effect on previous models for other regions.  As the fly brain model(s) will be developed in iterative software development cycles, it will be imperative to ensure that each iteration re-verifies the platform and its individual LPUs against the actual fly brain neuropils.  We would like these tests on the Python code to be conducted automatically, without requiring the use of our fly interface equipment — which is manually intensive to operate.  We are constructing a tool to simulate the fly brain interface for software testing purposes that will capture the stimuli provided to the fly along with its responses.  From these sets of inputs and outputs, the tool will automatically generate test cases that recreate the same experiment without the need for repeated interfacing with the fly. This tool will also be used to automatically generate regression tests for the Neurokernel software that depend on other external factors.

Additional information is available on the Bionet website.

Contact Professor Aurel Lazar (aurel@ee.columbia.edu) for further information.

Team Members

Aurel Lazar
Gail Kaiser

Former PSL Graduate Students
Nikhil Sarda