Semantic security bugs cause serious vulnerabilities across a wide range of software, allowing attackers to, e.g., bypass security checks, gain access to sensitive information, and escalate privileges. Automatically detecting these bugs is hard because, unlike crashing bugs, they may not show any obvious side effects. The safety and security specifications violated by semantic security bugs are rarely written in a form that is easy to express as test oracles, which define precisely what the behavior should be for every possible input sequence.
Pseudo-oracle testing is one popular technique to identify non-crashing buggy behaviors when true test oracles are unavailable: two or more executions are compared, serving as oracles for each other, to find discrepancies. The most common pseudo-oracle approach used in security research is differential testing, where multiple independent implementations of the same functionality are compared to see if they agree on, e.g., whether a given untrusted input is valid. However, this only works when multiple independent implementations are available to the developers.
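As a rough illustration of differential testing, the sketch below compares two independent validity checks on the same untrusted inputs and reports the inputs on which they disagree. The two port-number validators and the helper differential_test are hypothetical names invented for this example; they are not taken from any real project.

```python
def validator_a(s):
    """Hypothetical implementation A: accepts only plain ASCII digit strings."""
    return s.isascii() and s.isdigit() and 1 <= int(s) <= 65535


def validator_b(s):
    """Hypothetical implementation B: relies on int(), which also accepts
    surrounding whitespace and a leading sign."""
    try:
        return 1 <= int(s) <= 65535
    except ValueError:
        return False


def differential_test(implementations, inputs):
    """Return the inputs on which the implementations do not all agree.

    No implementation is trusted as ground truth; each one acts as a
    pseudo-oracle for the others.
    """
    discrepancies = []
    for value in inputs:
        verdicts = {impl(value) for impl in implementations}
        if len(verdicts) > 1:
            discrepancies.append(value)
    return discrepancies


if __name__ == "__main__":
    candidates = ["80", "0", "65536", "+80", " 443 ", "abc", ""]
    print(differential_test([validator_a, validator_b], candidates))
```

A reported discrepancy only says that the implementations disagree; deciding which of them (if any) is wrong still requires human judgment or a specification.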
An alternative approach that requires only the one implementation at hand is metamorphic testing, which checks domain-specific metamorphic relations defined across sets of input/output pairs from multiple executions of the same test program. For example, a metamorphic property for a program p that adds two inputs a and b is p(a, b) = p(a, 0) + p(b, 0). For a program r that sorts an array of inputs, r( [original array] ) should produce the same sorted result as r( [permutation of original array] ). Similarly, a machine learning classifier should produce the same results when the order of the training set is shuffled.
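The addition and sorting relations above can be sketched as executable checks. This is a minimal illustration only: the checker functions and the deliberately buggy programs saturating_add and lazy_sort are hypothetical and exist just to show a relation being violated.

```python
from itertools import permutations


def addition_relation_holds(p, a, b):
    """Check p(a, b) == p(a, 0) + p(b, 0) without knowing the expected sum."""
    return p(a, b) == p(a, 0) + p(b, 0)


def permutation_relation_holds(r, xs):
    """Check that r returns the same result for every permutation of xs."""
    baseline = r(list(xs))
    return all(r(list(perm)) == baseline for perm in permutations(xs))


# Hypothetical programs under test.
def add(a, b):
    return a + b


def saturating_add(a, b):
    # Deliberately buggy: silently caps the result at 100.
    return min(a + b, 100)


def lazy_sort(xs):
    # Deliberately buggy: sorts everything except the last element.
    return sorted(xs[:-1]) + xs[-1:]


if __name__ == "__main__":
    print(addition_relation_holds(add, 60, 60))
    print(addition_relation_holds(saturating_add, 60, 60))
    print(permutation_relation_holds(sorted, [3, 1, 2]))
    print(permutation_relation_holds(lazy_sort, [3, 1, 2]))
```

Note that a relation only exposes a bug on inputs that exercise it: saturating_add still satisfies the addition relation for small values such as (1, 2), so the choice of test inputs matters as much as the relation itself.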
However, it is challenging to apply metamorphic relations to detecting security vulnerabilities, as security properties require richer semantics and often cannot be expressed as simple relationships among input-output pairs. This project investigates whether there is a comprehensive range of semantically expressive metamorphic relations that can successfully detect semantic security vulnerabilities.
We are seeking undergraduate and MS project students to participate in a pilot study:
1. Create a vulnerability database of known semantic security bugs, their corresponding patches, and the test cases needed to reproduce those bugs.
2. Manually classify the bugs into different sub-categories, e.g., incorrect error handling, missing input validation, API abuse, etc.
3. Determine which properties change across the buggy and fixed versions that can be expressed as violations of metamorphic relations.
4. Inspect other execution paths to check whether these relations are satisfied during benign executions.
5. Find out whether these metamorphic relations are also applicable to similar bugs in other programs.
Students should be interested in security, software testing and program analysis.
Contact: Professor Gail Kaiser, kaiser@cs.columbia.edu. Please put “sespo” in the subject of your email.