Stevens ECE
Seminar


Streaming Techniques for Statistical Modeling

May 13, 2009

Speaker: Dr. Yihua Wu, Google, Inc.
Time: Wednesday, 05/13/2009, 3-4 PM
Location: Babbio 110

Biography:

Dr. Yihua Wu received her PhD in Computer Science from Rutgers, the State University of New Jersey, in 2007 and has been working at Google Inc. in New York since then. Her research interests are streaming techniques for statistical modeling of massive data, with applications to databases and networking. During her PhD, she extensively studied i) parametric modeling of skewed data sets; ii) graph modeling of individuals' communication patterns; iii) sequential change detection on data streams. Dr. Wu spent years of her PhD collaborating with researchers from AT&T Shannon Labs, Telcordia Applied Research, and Narus Inc. to develop space- and time-efficient streaming algorithms on real-world data sets, and she holds two patents on this work. At Google, she designs and develops features and models to improve search quality.

Abstract:

Streaming is an important paradigm for handling high-speed data sets that are too large to fit in main memory. Prior work in data streams has shown how to estimate simple statistical parameters, such as histograms, heavy hitters, and frequency moments, on data streams. This talk focuses on a number of more sophisticated statistical analyses that are performed in near real-time, using limited resources.

I will first present how to model stream data parametrically; in particular, we fit hierarchical (binomial multifractal) and non-hierarchical (Pareto) power-law models on a data stream. This yields algorithms that are fast, space-efficient, and provide accuracy guarantees. I also design fast methods to perform online model validation at streaming speeds. I then study the detection of changes in models on data with unknown distributions. I adapt the sound statistical method of the sequential probability ratio test to the online streaming case, without an independence assumption. The resulting algorithm works seamlessly without the window limitations inherent in prior work, and is highly effective at detecting changes quickly. Furthermore, I formulate the local change detection problem, which has not been addressed earlier, and extend our streaming solution to it.

As concrete applications of our techniques, we complement our analytic and algorithmic results with experiments on network traffic data to demonstrate the practicality of our methods at line speeds, and the potential power of streaming techniques for statistical modeling in data mining.
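
As a rough illustration of what fitting a Pareto model on a stream can look like, the sketch below keeps a running maximum-likelihood estimate of the Pareto shape parameter in constant memory. It is not the speaker's algorithm and carries none of the accuracy guarantees mentioned in the abstract. The class name `StreamingParetoFit`, the assumption that the scale `x_min` is known in advance, and the synthetic data in `demo` are all choices made for this example.

```python
import math
import random


class StreamingParetoFit:
    """Running MLE of a Pareto shape parameter (hypothetical helper).

    Assumption of this sketch: the scale x_min is known beforehand,
    so only two numbers (a count and a sum of logs) are stored.
    """

    def __init__(self, x_min: float) -> None:
        if x_min <= 0:
            raise ValueError("x_min must be positive")
        self.x_min = x_min
        self.count = 0
        self.log_sum = 0.0

    def update(self, x: float) -> None:
        """Consume one stream item; values below x_min are ignored."""
        if x < self.x_min:
            return
        self.count += 1
        self.log_sum += math.log(x / self.x_min)

    def shape(self) -> float:
        """Return the current estimate: n / sum(ln(x_i / x_min))."""
        if self.count == 0 or self.log_sum == 0.0:
            raise ValueError("not enough data to estimate the shape")
        return self.count / self.log_sum


def demo(true_shape: float = 2.0, n: int = 200_000, seed: int = 0) -> float:
    """Feed synthetic Pareto samples (scale 1) through the estimator."""
    rng = random.Random(seed)
    fit = StreamingParetoFit(x_min=1.0)
    for _ in range(n):
        fit.update(rng.paretovariate(true_shape))
    return fit.shape()


if __name__ == "__main__":
    print(f"estimated shape: {demo():.3f}")
```

Each item is processed once and then discarded, which is the property that makes the estimator usable when the data does not fit in main memory.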

For more information please contact:

Yingying Chen
Assistant Professor & NIS Graduate Program Director
Burchard
Room 210
Phone: 201.216.8066
Fax: 201.216.8246
yingying.chen@stevens.edu


Contacts  

Dr. Yu-Dong Yao
Professor & Department Director
Burchard Building
Room B-212
Phone: 201.216.5264
Fax: 201.216.8246
yyao@stevens.edu
