Noah Smith

(Carnegie Mellon University)

"Machine Learning about People from their Language"

(Talk given as part of the "MPI Distinguished Lecture Series" in cooperation with the Department of Computer Science)

This talk describes new analysis algorithms for text data aimed at understanding the social world from which the data emerged. The political world offers some excellent questions to explore: Do US presidential candidates "move to the political center" after winning a primary election? Are Supreme Court justices swayed by amicus curiae briefs, documents crafted at great expense? I'll show how our computational models capture theoretical commitments and uncertainty, offering new tools for exploring these kinds of questions and more. Time permitting, we'll close with an analysis of a quarter million biographies, discussing what can be discovered about human lives as well as those who write about them.

The primary collaborators on this research are my Ph.D. students David Bamman and Yanchuan Sim; Brice Acree and Justin Gross from the Political Science Department at UNC Chapel Hill; and Bryan Routledge from the Tepper School of Business at CMU.

Bio: Noah Smith designs algorithms for automated analysis of human language. He often exploits the web to this end, including mining the web for translations (Resnik and Smith, 2003), measuring public opinion from social messages (O'Connor et al., 2010), and inferring geographic linguistic variation (Eisenstein et al., 2010).

Smith has also contributed algorithms tackling the core problems of natural language processing: parsing sentences into syntactic representations (Eisner et al., 2005; Martins et al., 2009) and semantic representations (Das et al., 2010; Flanigan et al., 2014), as well as cross-cutting techniques for unsupervised language learning (Smith and Eisner, 2005; Cohen and Smith, 2009). His 2011 book, Linguistic Structure Prediction, synthesizes many statistical modeling techniques for language.

Such methods advance applications for automatic translation (Al-Onaizan et al., 1999; Gimpel and Smith, 2011), empirical work in the social sciences (Kogan et al., 2009; Yano et al., 2009; Sim et al., 2013) and humanities (Bamman et al., 2014), education (Heilman and Smith, 2010), and other next-generation language technologies.

Smith is Associate Professor of Language Technologies and Machine Learning in the School of Computer Science at Carnegie Mellon University. In fall 2015, he will join the University of Washington as Associate Professor of Computer Science & Engineering. Prior to coming to CMU, he was a Hertz Foundation Fellow at Johns Hopkins University, where he completed his Ph.D. in 2006. He is a clarinetist, tanguero, and swimmer.

Time: Tuesday, 04.11.2014, 3:30 pm
Place: MPI-SWS Saarbrücken, Campus E1 5, room 002
Video: Simultaneous videocast to MPI-SWS Kaiserslautern, Paul Ehrlich Str. 26, room 113