
Category Archives: Active Projects

SESPO (SEmantic Security bug detection with Pseudo-Oracles): targeted subproject #1

Semantic security bugs resulting from logic flaws severely undermine the security of critical software systems. For example, attackers exploited a recent semantic security bug (CVE-2017-5638) in Apache Struts to steal the sensitive personal data of 143 million Equifax customers. Despite the severity of such bugs, there is no generic method for detecting them. The key challenge is that semantic bugs, unlike memory corruption bugs, do not have program-independent side effects such as crashing. Therefore, semantic security bugs are hard to detect without formal specifications of the program under test. Unfortunately, full formal specifications (test oracles) are usually not available even for security-critical software, as these specifications are difficult to create and to keep in sync with functionality changes.

Our longer-term goal is to develop SESPO, a generic framework for automatically detecting semantic security bugs using pseudo-oracle testing—a way to create test oracles by comparing behaviors of related executions even when complete specifications are not available. SESPO will use two main classes of pseudo-oracles: (i) Differential Testing to compare outputs of different independent programs with similar functionality and (ii) Metamorphic Testing to check the relationship between outputs from multiple executions of the same program with different inputs.

We are currently looking for one or more undergraduate/MS project students to perform a relatively short-term project for academic credit: devising metamorphic relations corresponding to security properties of Java and/or C/C++ software.  A “metamorphic relation” specifies a relationship between input/output pairs.  Given an input i and its already-known output o (known because you ran the software on i), the relation defines three parts: (1) a constructor, which crafts a new input i’ from i; (2) a predictor, which determines from i, o, and i’ what the new output o’ ought to be; and (3) a checker, which compares the actual output o’’ (obtained by running the software on i’) to the predicted output o’.  If o’’ is inconsistent with o’, we know there is a bug in the software (the checker is often, but not necessarily, equality).  For example, if a sort program produces sorted output from an input list, then permuting the input list should yield identical output, except possibly for “ties”.
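The constructor/predictor/checker structure can be sketched in a few lines of Python, using the sort example above with a fixed permutation (reversal) as the constructor. This is an illustrative sketch, not SESPO code; the function names are made up for this example.

```python
def metamorphic_check_sort(sort_fn, i):
    """Check the permutation metamorphic relation for a sort routine:
    permuting the input must not change the sorted output."""
    o = sort_fn(list(i))            # source test: run once to get the known output o
    i_prime = list(reversed(i))     # 1. constructor: i' is a permutation of i
    o_prime = o                     # 2. predictor: the output ought to be unchanged
    o_dprime = sort_fn(i_prime)     # run the software on i' to get the actual output o''
    return o_dprime == o_prime      # 3. checker: here, plain equality

# A correct sort satisfies the relation...
assert metamorphic_check_sort(sorted, [3, 1, 2])
# ...while a buggy "sort" that leaves the first element in place is caught,
# with no formal specification of sorting needed.
assert not metamorphic_check_sort(lambda xs: xs[:1] + sorted(xs[1:]), [3, 1, 2])
```

Note that no test oracle for "is this list sorted?" appears anywhere: the relation between the two executions substitutes for the missing specification.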

We will train you in the concepts of metamorphic testing.  Your job will be to find a reasonable number of different metamorphic relations relevant to security properties of real-world software, where inputs come from untrusted users.

To learn more, please contact Prof. Kaiser, kaiser@cs.columbia.edu.

MuPlay (MUtable rePLAY) for patching security vulnerabilities

Seeking new PhD student as well as MS thesis and graduate/undergraduate project students.

The project investigates the full patching workflow for both Java and C/C++: from the moment a security vulnerability is detected and a reproducible exploit is demonstrated, through testing of candidate patches, to the point where users have validated and applied the released patch in production.  Developer productivity will improve because constructing, executing, and checking exploit test cases will be automated; the productivity of application administrators will improve because they will spend less time painstakingly validating patches by hand and more time directly benefiting their business.  The application’s users will be protected more rapidly from a vulnerable application threatening their security and privacy.

We build on record/replay technology, which already supports deterministic bug reproduction, as well as replay of the originally recorded log after very restricted changes that do not affect application state, such as inserting print statements, recompiling with instrumentation, or reconfiguring to increase verbosity.  However, the unrestricted changes constituting a patch – which typically do affect application state – are often modest enough that the modified application can still be replayed with small mutations to the original record/replay log.  In particular, static and dynamic analyses, such as slicing, data dependency graphs, and taint tracking, can rule out potential log mutations that would be unsound and suggest other prospective log mutations consistent with the code changes.  Then the replayer can skip over no-longer-relevant log entries and “go live” for new or replaced library and system calls, while reusing unaffected data from the recorded log.
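As a toy illustration of the replay side of this idea (not rr, and far simpler than real mutable replay), the sketch below replays a recorded log of nondeterministic call results against the call sequence a patched program now makes: entries the patch made irrelevant are skipped, matching entries are reused, and calls the recording never saw "go live". All names here are hypothetical.

```python
def replay(calls, log, live_handler):
    """calls: sequence of (name, args) the patched program now makes.
    log: recorded [(name, args, result), ...] from the original run.
    live_handler: executes a call for real when no log entry matches."""
    results, cursor = [], 0
    for name, args in calls:
        # Scan forward for a log entry matching this call, skipping over
        # entries the patched code no longer performs.
        match = next((j for j in range(cursor, len(log))
                      if log[j][0] == name and log[j][1] == args), None)
        if match is not None:
            results.append(log[match][2])             # reuse the recorded result
            cursor = match + 1
        else:
            results.append(live_handler(name, args))  # "go live" for a new call
    return results

log = [("read", ("a.txt",), "hello"), ("time", (), 100)]

# The patch removed the read but added a getpid: the read entry is skipped,
# the time result is reused, and getpid goes live.
assert replay([("time", ()), ("getpid", ())], log,
              lambda name, args: "live") == [100, "live"]
```

A real system must also decide *soundly* which log entries may be skipped or reused — the role played by the slicing, dependency, and taint analyses described above.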

In addition to longer-term participants, we are currently looking for one or more undergraduate/MS project students to perform a relatively short-term project for academic credit: converting the open-source rr record/replay system (https://rr-project.org/) to directly support Java programs.  rr already supports Java applications running on top of the JVM, but it records and replays all the activities of the underlying JVM rather than just the Java application; this is inefficient and makes it difficult to reason about the Java application’s state, which is needed to enable mutable replay.  We will train you in the concepts of record/replay.  Your job will be to study how rr works internally, evaluate the feasibility of such a conversion, and determine what, specifically, would need to change.

Other one-semester projects can be defined for interested project students.

To learn more, please contact Prof. Kaiser, kaiser@cs.columbia.edu.

Leveraging SABER (SimilAr BEhavioR) to aid program understanding

Seeking new PhD student as well as MS thesis and graduate/undergraduate project students.

Most software is developed by teams of engineers, not a single individual working alone. Further, a software product is rarely developed once and for all, with no bug fixes or features added after original deployment. Original team members leave the company or the project and new members join. Thus most engineers need to understand code originally written by other people; indeed, studies have found that engineers spend more of their time trying to understand unfamiliar code than creating or modifying code. The engineer might not know what a piece of code is supposed to do, how it works, how it is supposed to work, or how to fix it to make it work; documentation might be poor, outdated, or non-existent, and knowledgeable co-workers are not always available.

This project investigates automated tools and techniques to help engineers more quickly and more completely understand the code they are tasked to work with, to improve their productivity and the quality of their work. With less time spent understanding existing code, engineers have more time to spend on modifying that code, fixing bugs and adding features desired by their users, and creating new software benefiting the public.

More specifically, this project investigates dynamic analysis approaches to identifying behavioral similarities among code elements in the same or different programs, particularly for code that behaves similarly during execution but does not look similar, and so would be difficult or impossible to detect using static analysis (code clone detection). While code clone technology is fairly mature, tools for detecting behavioral similarities are relatively primitive. The primary objective is to improve and shape behavioral similarity analysis for practical use cases, concentrating on finding similar code in the same or other codebases that might help developers understand, debug, and add features to unfamiliar code they are tasked to work with.

The project seeks to advance knowledge about what it means for code to be behaviorally similar, how dynamic analyses can identify behavioral code similarities, how to drive the executions necessary for these analyses, and how to leverage code whose behavior is reported as highly similar to the code at hand to achieve common software engineering tasks that may be ill-suited to representational code similarities (code clones). The research investigates the utility and scalability of dynamic analyses seeking behavioral similarities in corresponding representations of code executions; guiding input case generation techniques to produce test executions useful for comparing/contrasting code behaviors for particular use cases; and filtering and weighting schemes for adapting the preponderance of the evidence metaphor to choosing the most convincing similarities for the software engineering task.
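The core idea — code that looks nothing alike can still behave the same — can be sketched in the simplest dynamic form: run two fragments on a shared battery of inputs and compare their input/output behavior. This is only an illustrative sketch under that simple model; the project's actual analyses operate on execution traces of real programs, and these two functions are invented for the example.

```python
def gauss_sum(n):
    """Closed-form sum 0 + 1 + ... + n."""
    return n * (n + 1) // 2

def loop_sum(n):
    """Iterative sum: statically dissimilar to gauss_sum, behaviorally the same."""
    total = 0
    for k in range(n + 1):
        total += k
    return total

def functionally_similar(f, g, inputs):
    """Crude functional-similarity oracle: identical outputs on every test input."""
    return all(f(x) == g(x) for x in inputs)

# A static clone detector sees no textual resemblance here, yet the two
# fragments are functionally similar on the whole test battery.
assert functionally_similar(gauss_sum, loop_sum, range(50))
```

Driving the executions with good inputs — the "input case generation" question above — is exactly what determines whether such a comparison is convincing or coincidental.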

To learn more, please contact Prof. Kaiser, kaiser@cs.columbia.edu.

Gameful Computational Thinking

Seeking undergraduate and graduate students in Computer Science and related fields. Also see separate project description for students with formal background in education.

Inspired by CS for All?  The Programming Systems Lab, led by Professor Gail Kaiser, is building a collaborative game-based learning and assessment system intended to help teachers infuse computational thinking in grade 6-8 curricula.  We’re recruiting students interested in collaborating as a Scrum team to continue development of several features:

  • A gameful direct instruction system that embeds Parsons Programming Puzzles in Scratch
  • A gameful constructionism mode that integrates scoring systems for play and automated formative assessment through Blockly
  • A gameful intelligent tutor that provides hints on-demand and just-in-time via clustering, sequential pattern mining, and behavior detection algorithms
  • A gameful affinity space that enables curricula assimilation with quest management and learning metrics using Node.js and Angular

Participants should register for 3 credits of COMS E6901 (graduate students) or COMS W3998 (undergraduates).  CVN students are welcome.

To learn more, please contact Jeff Bender, jeffrey.bender@columbia.edu.

SAGE | Social Addictive Gameful Engineering


Gameful Computational Thinking: seeking students from Teachers College

Inspired by CS for All?  The Programming Systems Lab, led by Professor Gail Kaiser, is building a collaborative game-based learning and assessment system intended to help teachers infuse computational thinking in grade 6-8 curricula.  We are seeking students with backgrounds in teaching (any subject) to middle-school-age children to help us transition to field studies with our partner school and possibly other educational programs.  Software development skills are not needed for these positions, but an interest in applying theories from the learning sciences within educational technology is.  Appropriate academic credit can be arranged on a case-by-case basis.

To learn more, please contact Jeff Bender, jeffrey.bender@columbia.edu.

SAGE | Social Addictive Gameful Engineering



Code Similarity

Dynamic Code Similarity: This is a multi-disciplinary project joint with Profs. Simha Sethumadhavan and Tony Jebara. “Code clones” are statically similar code fragments that usually arise via copy/paste or independently writing lookalike code; best practice removes clones (refactoring) or tracks them (e.g., to ensure bugs fixed in one clone are also fixed in others). This part of the project instead studies dynamically similar code, using two different similarity models.

One model is functional similarity: finding code fragments that exhibit similar input/output behavior during execution. The other is the novel notion of behavioral similarity, which we call “code relatives”. Two or more code fragments are deemed code relatives if their executions are similar. We model this as finding similarities among the dynamic data dependency graphs representing instruction-level execution traces, and we used machine learning techniques to devise a (relatively) fast inexact subgraph isomorphism algorithm to cluster these execution-level similarities.

Our experiments show that both of our tools find most of the same “similar” code as the best static code clone detectors, but they also find many others those detectors cannot, because the code looks very different even though it is functionally and/or behaviorally similar. However, dynamic detection will not necessarily find all static code clones, because lookalike code involving polymorphism need not exhibit the same function/behavior. Our behavioral and functional similarity detectors do not always find the same similarities, because two or more code fragments may compute the same function using very different algorithms. Thus these kinds of techniques complement each other.
Beyond the conventional applications of static code clone detection, dynamic similarity detection also addresses malware detection, program understanding, re-engineering legacy software to use modern APIs, and informing design of hardware accelerators and compiler optimizations.

Static Code Similarity: We also investigate static similarity detection to augment our similarity detection toolkit. This work is joint with Prof. Baishakhi Ray of the University of Virginia and Prof. Jonathan Bell of George Mason University. Unlike most other static code clone research, we look for similarities at the instruction level rather than in the source code, so our techniques can work even on obfuscated executables where no source code is available and thus conventional static detectors cannot be applied; this situation arises for both malware and misappropriated intellectual property. We exploit the increasingly popular notion of “big code”, i.e., training from open-source repositories, using features that combine instruction-level call graph analysis and topic modeling (an NLP-based machine learning technique). We believe we can effectively deobfuscate most suspect code by finding similarities within a corpus consisting of known code and its obfuscated counterparts. Our approach handles control flow transformations and the introduction of extraneous methods, not just renamed methods.
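One simple intuition behind instruction-level similarity is that features computed over instructions, rather than source text, survive renaming: a method whose identifiers have all been changed still emits the same opcodes. The sketch below compares opcode histograms with cosine similarity. It is a deliberately minimal stand-in — the opcode lists are invented, and the project's actual features (call graph analysis plus topic modeling) are far richer than a single histogram.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm

# Hypothetical opcode sequences of three method bodies.
original  = ["load", "load", "add", "store", "cmp", "jmp"]
renamed   = ["load", "load", "add", "store", "cmp", "jmp"]  # identifiers renamed,
                                                            # opcodes unchanged
unrelated = ["call", "ret", "push", "pop", "call", "ret"]

# Renaming-only obfuscation leaves the instruction histogram intact...
assert abs(cosine(Counter(original), Counter(renamed)) - 1.0) < 1e-9
# ...while genuinely different code scores near zero.
assert cosine(Counter(original), Counter(unrelated)) == 0.0
```

Stronger obfuscations (control flow transformations, extraneous methods) defeat a bare histogram, which is why richer structural and statistical features are needed in practice.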

Contact Gail Kaiser (kaiser@cs.columbia.edu)

Team Members

Gail Kaiser

Former Graduate Students
Fang-Hsiang (“Mike”) Su
Jonathan Bell
Kenny Harvey
Apoorv Patwardhan



Fang-Hsiang Su, Jonathan Bell, Kenneth Harvey, Simha Sethumadhavan, Gail Kaiser and Tony Jebara. Code Relatives: Detecting Similarly Behaving Software. 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), November 2016. Artifact accepted as platinum.

Fang-Hsiang Su, Jonathan Bell, Gail Kaiser and Simha Sethumadhavan. Identifying Functionally Similar Code in Complex Codebases. 24th IEEE International Conference on Program Comprehension (ICPC), May 2016, pp. 1-10. (ACM SIGSOFT Distinguished Paper Award)

Fang-Hsiang Su, Jonathan Bell, and Gail Kaiser. Challenges in Behavioral Code Clone Detection (Position Paper). 10th International Workshop on Software Clones (IWSC), affiliated with IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), March 2016, volume 3, pp. 21-22. (People’s Choice Award for Best Position Paper)


Download DyCLink from github.

Download HitoshiIO from github.

Download Code Similarity Experiments toolkit from github.