Three pioneering Stevens faculty members, Samantha Kleinberg, Antonio Nicolosi and Wendy Wang, all from the Department of Computer Science, are recent recipients of the National Science Foundation’s (NSF) highly esteemed CAREER awards.
As one of the most competitive programs the NSF conducts, the CAREER Award supports early career development of faculty in the sciences who are most likely to become leading researchers and educators. Such activities should build a firm foundation for a lifetime of integrated contributions to research and education.
“These awards are recognition by the National Science Foundation of both the past accomplishments and the promise of much more success in the years ahead for these three outstanding young faculty members,” says Michael Bruno, Feiler Chair Professor and Dean, School of Engineering and Science. “We are very proud of Samantha, Antonio and Wendy. We remain committed to their continued professional growth and look forward to their significant contributions to the Department of Computer Science, to the Institute, and to the nation.”
“The Computer Science department is improving rapidly on its way to being a department of national distinction thanks to the talents and efforts of these and other young faculty,” says Dan Duchamp, Director of the Department of Computer Science. “The groundbreaking research being done by Samantha, Antonio and Wendy highlights our strength in cybersecurity and in the data management and artificial intelligence technologies that underlie ‘big data’ research.”
In the first of the grant awards, researcher Samantha Kleinberg’s proposal, “Learning from Observational Data with Knowledge,” involves integrating prior knowledge with large-scale data to enable efficient inferences.
With the proliferation of new methods for gathering long-term population data (from electronic medical records) and real-world health data (from body-worn sensors), researchers have access to highly detailed time series data. Thus, instead of relying on controlled experiments (which can be expensive and time consuming) to test a small set of highly targeted hypotheses, it is increasingly possible to use the data itself to construct hypotheses. Kleinberg's work in particular aims to understand the processes underlying recovery in stroke patients from ICU data, and the factors affecting blood glucose, using real-world sensor data from people with diabetes.
According to Kleinberg, one of the challenges of large observational datasets from domains such as biomedical informatics and social networks is that these can enable researchers to test complex hypotheses, but this leads to an enormous search space and burdens researchers with figuring out what hypotheses may be interesting. Instead, her approach uses prior knowledge to help generate hypotheses from the data and to determine which are novel and merit further validation.
Kleinberg's work focuses specifically on inferring causal relationships, as these give insight into not just what is happening, but why it's happening and could ultimately lead to targets for future interventions. This work will lead to more robust and efficient inference from large-scale datasets, through a feedback loop between experiments and prior knowledge. The approach developed here will improve computational efficiency, while also enabling discovery of novel relationships.
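The role of prior knowledge in taming the hypothesis search space can be sketched in a few lines. This is an illustrative toy, not Kleinberg's actual system: the variable names and the sets of known and forbidden relationships are invented for the example.

```python
import itertools

# Toy illustration of pruning a causal hypothesis space with prior knowledge.
# All names here (variables, known, forbidden) are hypothetical examples.
variables = ["insulin", "carbs", "exercise", "glucose"]

# Prior knowledge: relationships already established (no need to re-test)
# and relationships known to be impossible (pruned outright).
known = {("insulin", "glucose")}
forbidden = {("glucose", "carbs")}  # e.g., glucose level cannot cause carb intake

def candidate_hypotheses(variables, known, forbidden):
    """Enumerate candidate cause -> effect pairs, dropping those ruled
    in or out by prior knowledge."""
    for cause, effect in itertools.permutations(variables, 2):
        if (cause, effect) in known or (cause, effect) in forbidden:
            continue
        yield (cause, effect)

hypotheses = list(candidate_hypotheses(variables, known, forbidden))
# 4 * 3 = 12 ordered pairs, minus 1 known and 1 forbidden -> 10 candidates
print(len(hypotheses))  # -> 10
```

Even in this tiny example the pruning removes a sixth of the search space; with thousands of variables and time-lagged relationships, encoding what is already known is what keeps the search tractable.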
In terms of education, Kleinberg is developing a causal inference module for the computer science summer intensive program for high school students and involving graduate students for hands-on project work. Students will work on problems at the cutting edge of computer science that also have applications to significant problems in health.
“It’s great to have five years of stable funding,” says Kleinberg. “It has allowed me to recruit a new PhD student that I wouldn’t otherwise be able to support.”
In the second NSF CAREER grant award, Antonio Nicolosi’s project “Non-Commutative Cryptography from Hard Learning Problems: Theory and Practice” proposes the development of the theory and practice of a novel approach to cryptography based on a class of learning problems over non-commutative groups, known collectively as Learning Homomorphisms with Noise (LHN).
The project explores four main threads: 1) designing efficient cryptographic constructions based on the hardness of the LHN, 2) establishing evidence of the intractability of the underlying learning problems, especially against quantum computing, 3) building a software library to manipulate instances of these learning problems efficiently and evaluating the performance of learning-based non-commutative cryptography, and 4) exploring additional LHN variants to overcome any limitation encountered in execution of the other threads.
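The non-commutative LHN problem itself is specialized, but its general flavor can be conveyed by a simpler, well-known commutative relative: Learning Parity with Noise (LPN), where an adversary sees random vectors together with noisy inner products against a secret and must recover that secret. The sketch below generates toy-sized LPN samples; it is an analogy for exposition only, not Nicolosi's construction.

```python
import random

def lpn_sample(secret, noise_rate, rng):
    """Return one LPN sample (a, b) where b = <a, secret> + e (mod 2),
    with the error bit e flipped with probability noise_rate."""
    a = [rng.randrange(2) for _ in range(len(secret))]
    inner = sum(ai * si for ai, si in zip(a, secret)) % 2
    e = 1 if rng.random() < noise_rate else 0
    return a, (inner + e) % 2

rng = random.Random(0)
secret = [1, 0, 1, 1]  # toy-sized; real parameters are far larger
samples = [lpn_sample(secret, noise_rate=0.1, rng=rng) for _ in range(8)]

# With noise_rate = 0 every sample satisfies b == <a, secret> mod 2 exactly,
# and the secret falls to simple linear algebra; the small random noise is
# precisely what is conjectured to make recovery hard at scale.
```

Learning problems of this kind are attractive for cryptography because, unlike factoring-based schemes, no efficient quantum algorithm for them is known, which is why thread 2 of the project emphasizes intractability against quantum computing.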
By diversifying the premises on which to base cryptography and creating training opportunities in information security for tomorrow’s workforce, this project will strengthen a critical part of the modern information technology infrastructure.
"This project focuses on a little explored area of cryptography. Making headway will take some time," says Nicolosi, "but this funding allows for medium term effort."
Nicolosi explains that doctoral students will assist him in developing the mathematical theory of the LHN problem, and also in building a cryptographic library to operate on certain non-commutative groups efficiently.
"Masters students," says Nicolosi, "will experiment with the new cryptographic library by programming security applications on top of it. They will also evaluate the performance of alternate designs and will help to make the research more diverse."
Also receiving an NSF CAREER award for her proposal titled “Verifiable Outsourcing of Data Mining Computations,” Wendy Wang’s research relates to the data-mining-as-a-service paradigm, one of the fastest-growing research areas in computer science.
With the rapid increase in large amounts of data, organizations are realizing the need to derive insights from the data they amass. Grocery stores, for example, need to design promotions based on the data they collect about consumer purchases. With datasets quickly exceeding the petabyte boundary, the processing power needed for big data exceeds the capacity of the average business computer. In those situations, data is outsourced to a powerful service provider, which returns data mining results to the client.
In this paradigm, the client loses control over its data: it pays the service provider on a per-use basis, with charges proportional to the services provided, and the provider seldom shares the details of the computation. To increase revenue, a service provider may therefore be tempted to perform fewer computations while still charging a premium price. This paradigm raises security issues relating to result integrity.
Mining results may also be compromised by an inside or outside attacker, causing incorrect results to be returned. Since data mining is mission-critical in many business settings, it is essential to provide practical and efficient result integrity verification techniques for outsourced data mining. According to Wang, most existing work focuses only on general-purpose computations, and applying those techniques to specific data mining problems would be impractical and expensive.
The goal of Wang’s project is to design result integrity verification techniques for outsourced data mining computations that enable the client to verify, with little effort, whether the service provider executed the computations faithfully and returned correct results. In terms of education, Wang plans to work with high school students as summer interns and to offer independent study to graduate students working on their theses.
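One classic way to make the verification idea concrete (an illustrative technique, not necessarily Wang's own method) is for the client to plant synthetic "sentinel" records whose mining outcome it can predict in advance, then check that the provider's answer contains them. The toy frequent-itemset miner and the sentinel name below are invented for this sketch.

```python
def frequent_items(transactions, min_support):
    """Toy miner: return the items appearing in at least
    min_support transactions."""
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    return {item for item, c in counts.items() if c >= min_support}

real = [["milk", "bread"], ["milk", "eggs"], ["bread", "eggs"]]
# Sentinel transactions planted often enough that the sentinel item
# MUST appear among the frequent items if the mining was done faithfully.
sentinel = [["__sentinel__"]] * 3
outsourced = real + sentinel

provider_answer = frequent_items(outsourced, min_support=2)  # honest provider

# Client-side check: a missing sentinel exposes a lazy or cheating provider.
assert "__sentinel__" in provider_answer
print(sorted(provider_answer))  # -> ['__sentinel__', 'bread', 'eggs', 'milk']
```

The client's check costs almost nothing, which matches the project's stated requirement that verification demand little effort from the client; the research challenge is achieving comparable guarantees for richer mining tasks and stronger adversaries.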
“Their involvement gives me more insight into the research,” says Wang. “The students benefit from working on the project, and so does the university.”
“I think the next step in my research program, my long-term research goal, is to be involved with building up cloud-based data management systems,” says Wang. “I want to contribute to and be part of the future picture of the Internet.”