DMSIG – On Diversity, Complexity, and Regularization in Ensemble Models
LinkedIn, 2025 Stierlin Ct, Mountain View, CA 94043
Monday, January 24, 2011, 6:30-9:30pm
Abstract:
The discovery of ensemble methods is one of the most influential developments in Data Mining and Machine Learning in the past decade. These methods combine multiple models into a single predictive system that is more accurate than even the best of its components. Ensemble methods can provide a critical boost to existing systems that address the hardest industrial challenges – from investment timing to drug discovery, from fraud detection to recommendation systems – where predictive accuracy is vital. This talk, based on a recently published book by the speaker, offers a concise introduction to this breakthrough topic. After a sketch of the major concerns in predictive learning, the talk will give an overview of regularization, a key concept driving the superior performance of modern ensemble algorithms. It then takes a shortcut into the heart of the popular tree-based ensemble creation strategies, using recent developments from the frontiers of statistics, where research efforts are now focused on explaining and harnessing the mysteries of ensembles.
Biography:
Giovanni Seni is a Senior Scientist with Elder Research, Inc. (ERI) and directs ERI’s Western office. As an active data mining practitioner in Silicon Valley, he has over 15 years of R&D experience in statistical pattern recognition, data mining, and human-computer interaction applications. He has been a member of the technical staff at large technology companies and a contributor at smaller organizations. He holds five US patents and has published over twenty conference and journal articles. His book with John Elder, “Ensemble Methods in Data Mining – Improving accuracy through combining predictions”, was published in February 2010 by Morgan & Claypool. Giovanni is also an adjunct faculty member in the Computer Engineering Department at Santa Clara University, where he teaches an Introduction to Pattern Recognition and Data Mining class.