Real-world big data is largely unstructured, interconnected, and dynamic, and much of it is in the form of natural language text. It is highly desirable to transform such massive unstructured data into structured knowledge. Many researchers rely on labor-intensive labeling and curation to extract knowledge from such data. However, such approaches may not scale, especially since many text corpora are highly dynamic and domain-specific. The speaker argues that massive text data itself may disclose a large body of hidden patterns, structures, and knowledge. Equipped with domain-independent and domain-dependent knowledge bases, he explores the power of massive data itself for turning unstructured data into structured knowledge.
In this lecture, the speaker will introduce a set of methods recently developed in his group for exploring the power of big text data, including mining quality phrases, recognizing and typing entities and relations with weak supervision, pattern-based entity-attribute-value extraction, multi-faceted taxonomy discovery, and construction of multi-dimensional text cubes. He will show that massive text data itself can be powerful at disclosing patterns and structures, and that exploiting this power is a promising way to turn such data into structured knowledge.
About the speaker
Prof Jiawei Han received his PhD in Computer Sciences at the University of Wisconsin-Madison in 1985. He worked as an Assistant Professor at Northwestern University before moving to Simon Fraser University in 1987. In 2001, he joined the University of Illinois at Urbana-Champaign (UIUC) and is currently the Abel Bliss Professor of Engineering.
Prof Han’s research interests include data mining, data warehousing, stream data mining, spatiotemporal and multimedia data mining, biological data mining, social network analysis, text and Web mining, and software bug mining. He has over 300 conference and journal publications.
Prof Han has received numerous awards, including the Excellence in Graduate and Professional Teaching Award (2012) and the Tau Beta Pi Daniel C Drucker Eminent Faculty Award (2011) from UIUC, the Institute of Electrical and Electronics Engineers (IEEE) Computer Society W Wallace McDowell Award (2009), the IEEE Computer Society Technical Achievement Award (2005), and the Association for Computing Machinery (ACM) SIGKDD Innovation Award (2004). He was also elected a Fellow of IEEE (2009) and of ACM (2003).
For attendees’ attention
The lecture is free and open to all. Seating is on a first-come, first-served basis.