Join @TwitterEng's data science and machine learning-focused engineering teams at our new HQ in San Francisco on August 22nd for an evening of tech talks, free drinks (beer and wine, as well as many non-alcoholic options) and food! To kick the evening off, Anand Rajaraman (@anand_raj), a co-founder of Kosmix, will deliver a keynote presentation, and we'll follow with lightning talks from Twitter engineers Kurt Smith (@kurtosis0), Alek Kolcz (@zorbageek) and Kumar Chellapilla (@kumarc1).

Talks

Oil or Oxygen? The Transformative Power of Big Data, presented by Anand Rajaraman

Marc Andreessen has argued persuasively that software is eating the world. Most if not all industries, as well as the sciences and the humanities, are being transformed by software. It is becoming increasingly apparent that data is the fuel powering software's conquests. A popular analogy calls data "the new oil." While data shares some features with oil (it is valuable and needs to be refined to extract its true value), a better analogy might be to compare data with oxygen (vital and abundant). Software and data are synergistic. Data is created whenever humans and software interact, or when software interacts with other software. And one of the key reasons behind the success of software is the ability to analyze large amounts of data. This virtuous loop, in which the success of software begets more data, which in turn makes the software smarter, is a key dynamic transforming the world we live in. We trace the evolution of data-driven applications, showing that they have evolved through three generations and that we are now in the midst of a fourth. We illustrate this transformative power using examples drawn from several fields. In particular, we take a close look at retail commerce, which is undergoing a sea change driven by social media and smartphone adoption. We describe the integral role Big Data plays in this transformation.

Analytics at Twitter: From the Micro to the Macro, presented by Kurt Smith

Economics has long made a distinction between the micro (individual behavior) and the macro (dynamics of national and international systems). A similar distinction arises in the key data science questions at Twitter. Micro-level topics include predictive modeling of user behavior, social ties, and trending topics. At the macro level, questions involve understanding similarities and differences between the user bases in different countries and their evolution over time. One of the promises of the big data era is the ability to build up from micro-level data to answer macro-level questions. I will show how sophisticated models of user interests give interesting insight into what sets one country apart from another on Twitter.

Large-Scale Machine Learning to Tackle User Modeling at Twitter, presented by Alek Kolcz

At its heart, Twitter connects users with other users, and with interesting content. Doing this well requires having a good notion of what users might or might not like. This is a perfect opportunity for machine learning techniques, but applying them in a scalable and user-friendly way is not necessarily easy. We will discuss our experiences with the pigML framework, which integrates machine learning directly into Pig and makes its application to big data problems relatively straightforward.

Twitter Ads: Interesting Problems and Open Questions, presented by Kumar Chellapilla

Here, we'll present a quick overview of Twitter's ad products (Promoted Tweets, Promoted Accounts and Promoted Trends) and the underlying challenges in designing algorithms and systems for ad targeting, engagement prediction, ranking, allocation, and pricing of ads. We'll also discuss how large-scale data mining and machine learning power such algorithms to ensure delivery of high-quality ads in a healthy marketplace.

Help us spread the word by Tweeting about the event with the hashtag #twitterdsml, and e-mail openhouse@twitter.com with any questions. See you on the 22nd!

Note: This event is not open to press.