Secret Ninja Testing with HALO Software Engineering
To appear in the Proceedings of the 4th. International Workshop on Social Software Engineering (SSE), Szeged, Hungary, September 2011
HALO (Highly Addictive, sociaLly Optimized) Software Engineering
Presented by Swapneel Sheth and Jonathan Bell at the 1st Games and Software Engineering Workshop (GAS 2011) on May 22, 2011
HALO (Highly Addictive, sociaLly Optimized) Software Engineering
@inproceedings{Sheth:2011:HSE:1984674.1984685,
author = {Sheth, Swapneel and Bell, Jonathan and Kaiser, Gail},
title = {HALO (highly addictive, socially optimized) software engineering},
booktitle = {Proceeding of the 1st international workshop on Games and software engineering},
series = {GAS ’11},
year = {2011},
isbn = {978-1-4503-0578-5},
location = {Waikiki, Honolulu, HI, USA},
pages = {29–32},
numpages = {4},
url = {http://doi.acm.org/10.1145/1984674.1984685},
doi = {http://doi.acm.org/10.1145/1984674.1984685},
acmid = {1984685},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {flow, games, mmorpg, operant conditioning, quests, social rewards, web 2.0},
}
Testing and Validating Machine Learning Classifiers by Metamorphic Testing
@article{Xie2011544,
title = “Testing and validating machine learning classifiers by metamorphic testing “,
journal = “Journal of Systems and Software “,
volume = “84”,
number = “4”,
pages = “544 – 558”,
year = “2011”,
note = “The Ninth International Conference on Quality Software “,
issn = “0164-1212”,
doi = “http://dx.doi.org/10.1016/j.jss.2010.11.920”,
url = “http://www.sciencedirect.com/science/article/pii/S0164121210003213”,
author = “Xiaoyuan Xie and Joshua W.K. Ho and Christian Murphy and Gail Kaiser and Baowen Xu and Tsong Yueh Chen”,
}
genSpace
About genSpace
geWorkbench (genomics Workbench) is a Java-based open-source platform for integrated genomics. Using a component architecture it allows individually developed plug-ins to be configured into complex bioinformatic applications. At present there are more than 70 available plug-ins supporting the visualization and analysis of gene expression and sequence data. Example use cases include:
- loading data from local or remote data sources.
- visualizing gene expression, molecular interaction networks, protein sequence and protein structure data in a variety of ways.
- providing access to client- and server-side computational analysis tools such as t-test analysis, hierarchical clustering, self organizing maps, regulatory neworks reconstruction, BLAST searches, pattern/motif discovery, etc.
- validating computational hypothesis through the integration of gene and pathway annotation information from curated sources as well as through Gene Ontology enrichment analysis.
genSpace is a suite of collaboration plugins to geWorkbench aimed to support knowledge sharing among computational biologists based on popular social networking motifs. genSpace logs all user activities to a backend server, and data mines this information to recommends tools and workflows (sequences of analysis and visualization tools) in “people like you” style. It also supports Facebook-like friends (direct collaborators) and networks (colleagues from same lab, institution or community), presence facilities including available/away/offline and live activity feed, and a shared research notebook that documents the details of all analyses. The introduction of genSpace web services can be found here.
This research is in collaboration with the Center for the Multiscale Analysis of Genomic and Cellular Networks (MAGNet) on the Columbia University Health Sciences campus, which is funded by NIH and NCI.
Team Members
Faculty
Prof. Gail Kaiser, kaiser [at] cs.columbia.edu
PhD Students
Fang-Hsiang (Mike) Su, mikefhsu [at] cs.columbia.edu
Former PhD Students and MS GRAs
Jon Bell, jbell [at] cs.columbia.edu
Swapneel Sheth, swapneel [at] cs.columbia.edu
Chris Murphy, cmurphy [at] cs.columbia.edu
Nikhil Sarda, ns2847 [at] columbia.edu
Project Students
John Murphy, jvm2108@columbia.edu
Abhaar Gupta, ag3468@columbia.edu
Former project students
Yu Wang
Ami Kumar
Huimin Sun
Diana Chang
Anureet Dhillon
Gowri Kanugovi
Mayur Lodha
Koichiro Matsunaga
Lakshmi Nadig
Joshua Nankin
Cheng Niu
Gaurav Pandey
Hyuksoo Seo
Yuan Wang
Eric Schmidt
Nan Luo
Danielle Cauthen
Flavio Antonelli
Ning Yu
Jason Halpern
Evgeny Fedetov
Aditya Bir
Alison Yang
Links
Papers, Presentations, etc.
C2B2 retreat poster and slides, May 2013
C2B2 retreat poster and slides, May 2012
DEIT 2011 paper and slides – “Towards using Cached Data Mining for Large Scale Recommender Systems”
RSSE 2010 paper and poster – “The weHelp Reference Architecture for Community-Driven Recommender Systems”
C2B2 retreat posters (1 and 2), April 2010
SSE 2010 paper and workshop presentation – “weHelp: A Reference Architecture for Social Recommender Systems”
C2B2 retreat presentation and poster, March 2009
SoSEA 2008 paper and workshop presentation – “genSpace: Exploring Social Networking Metaphors for Knowledge Sharing and Scientific Collaborative Work”
C2B2 retreat presentation and poster, April 2008
Documentation
genSpace wiki
geWorkbench wiki
C2B2 project management wiki
Source Code
geWorkbench repository (login required)
Contact: Fang-hsiang (Mike) Su
Towards using Cached Data Mining for Large Scale Recommender Systems
Presented by Swapneel Sheth at the International Conference on Data Engineering and Internet Technology (DEIT 2011) on March 15, 2011
Towards using Cached Data Mining for Large Scale Recommender Systems
To appear in the Proceedings of the 2011 International Conference on Data Engineering and Internet Technology (DEIT 2011), Bali, Indonesia, March 2011
VULCANA
About VULCANA
As the Internet has grown in popularity, security vulnerability detecting and testing are undoubtedly becoming crucial parts for commercial software, especially for web service applications. Vulnerability scanners, both commercial and open-source (i.e., SAINT, eEye, Nessus, etc.), were developed to achieve this goal. However, the absence of a well defined assessment benchmark makes the efficient evaluation of these scanners nearly impossible. With ongoing researches on new vulnerability scanners, the demand for such an assessment benchmark is urgent. We are working on developing VULCANA, a set of open-source web service applications with systematically injected vulnerabilities. The idea is that different vulnerability scanners can be used to scan the benchmark, and the percentage of detected vulnerabilities together with the resource consumption are used to provide reasonable evaluation.
In Spring 2009, we developed a prototype framework called Baseline, which is described in our tech report. The idea of BaseLine is that we tried to coach the users to pick the right Web Vulnerability Scanner by letting them set up a baseline for potential qualified scanners. We can then test the scanner with the baseline, revealing its effectiveness and efficiency in detecting the user’s most “care-about” vulnerabilities.
Brief Introduction of Baseline
Most of existing benchmarks use the scanners to scan a manually crafted website with a number of known vulnerabilities, and rate the scanners based on the percentage of successful detection. These benchmarks are only capable of judging which scanner is better in the matter of how well the scanners can detect the fixed set of vulnerabilities the benchmarks picked with static selection criteria. They suffer from drawbacks by neglecting the critical questions: Does the benchmark properly reflect the user’s security requirements; does it reflect the user’s actual deployment environment? In helping the users choose the right scanners, answering these questions is as crucial as evaluating the effectiveness and efficiency of the scanners. In this paper, we propose an approach called Baseline that addresses all of these problems: We implement a ranking system for dynamically generating the most suitable selection of weaknesses based on the user’s needs, which serves as the baseline that a qualified scanner should reach/detect. Then we pair the ranking system with a testing framework for generating test suites according to the selection of weaknesses. This framework maps a weakness into an FSM (Finite State Machine) with multiple end states that represent different types/mutations of exploitations of the weakness and each transition from state to state determined by scanner behavior, the framework then combines the FSMs of the selected weaknesses into a mimicked vulnerable website. When a scanner scans the “vulnerable” website, the transitions between the states are recorded and thus we are able to evaluate the scanner by looking at which end states were visited (effectiveness), in how much time, and over how many transitions(efficiency).
Currently we are looking at methods of measuring assorted aspects of the web vulnerability scanners. Specifically, the ability of bypassing client-side validation, the crawling coverage and the capability of scanning auto-generated pages.
Open research questions include:
- Currently, Baseline framework uses Regular expression to determine the transition between two states. Can we extend Baseline with more sophisticated validation methods?
- Client-side validation seems to be neglected by most (if not all) existing scanners. Are there any drawbacks for the scanners to omit them?
- There are no existing web vulnerabilty repository, can we create one?
Team Members
Faculty
Prof. Gail Kaiser, kaiser [at] cs.columbia.edu
Graduate Students
Huning Dai, hdd2210 [at] columbia.edu
Shreemanth Hosahalli, sh2959 [at] columbia.edu
Former Members
Michael Glass
Anshul Mittal
The weHelp Reference Architecture for Community-Driven Recommender Systems — Short Position Paper
Presented by Swapneel Sheth at the 2nd International Workshop on Recommendation Systems for Software Engineering (RSSE) on May 4, 2010
The weHelp Reference Architecture for Community-Driven Recommender Systems — Short Position Paper
@inproceedings{wehelp,
author = {Sheth, Swapneel and Arora, Nipun and Murphy, Christian and Kaiser, Gail},
title = {The {weHelp} reference architecture for community-driven recommender systems},
booktitle = {RSSE ’10: Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering},
year = {2010},
isbn = {978-1-60558-974-9},
pages = {46–47},
location = {Cape Town, South Africa},
doi = {http://doi.acm.org/10.1145/1808920.1808930},
publisher = {ACM},
address = {New York, NY, USA},
}