| |
|
|
| 09.00 - 11.05 |
Session I
|
|
| |
|
|
| 09.00 - 09.05 |
Introduction [ppt]
Craig Knoblock
|
|
| |
|
|
| 09.05 - 09.55 |
Keynote address by Gerald DeJong
University of Illinois at Urbana-Champaign
Robustness through Prior Knowledge: Using Explanation-Based Learning to Distinguish Handwritten Chinese Characters [ppt]
|
|
| |
|
|
| 09.55 - 11.05 |
Sub-Session: Classification of Noisy Text (Session Chair: Craig Knoblock)
|
|
| |
|
|
| 09.55 - 10.15 |
Paper 1 Finding Structure in Noisy Text: Topic Classification and Unsupervised Clustering [ppt]
Rohit Prasad, Prem Natarajan, Krishna Subramanian, Shirin Saleem and Rich Schwartz
BBN Technologies, Cambridge, MA, USA. |
|
| |
|
|
| 10.15 - 10.35 |
Paper 2 Genre as Noise - Noise in Genre [ppt]
Andrea Stubbe, Christoph Ringlstetter* and Klaus U. Schulz
University of Munich, Germany.
*University of Alberta, Canada. |
|
| |
|
|
| 10.35 - 10.55 |
Paper 3 A Practical Implementation of Automatic Text Categorisation and Correction of Noisy OCR Documents into Braille and Large Print
Ryan Brooks, William John Teahan and David Hunnisett*
University of Wales, Bangor, UK.
*ETL Solutions Ltd., UK. |
|
| |
|
|
| 10.55 - 11.05 |
Boasters for Posters 1, 2, 3 and 4
|
|
| |
|
|
| 11.05 - 11.30 |
Tea/Coffee Break |
|
| |
|
|
| 11.30 - 13.00 |
Session II: Detecting and Correcting Noisy Text (Session Chair: Venu Govindaraju)
|
|
| |
|
|
| 11.30 - 11.50 |
Paper 1 Text Correction Using Domain Dependent Bigram Models from Web Crawls [ppt]
Christoph Ringlstetter, Max Hadersbeck*, Klaus U. Schulz* and Stoyan Mihov**
University of Alberta, Canada.
*University of Munich, Germany.
**Bulgarian Academy of Sciences, Sofia, Bulgaria. |
|
| |
|
|
| 11.50 - 12.10 |
Paper 2 Enhanced Integrated Scoring for Text Preprocessing in Ontology Engineering from Dirty Text [ppt]
Wilson Wong, Wei Liu and Mohammed Bennamoun
University of Western Australia, Australia. |
|
| |
|
|
| 12.10 - 12.30 |
Paper 3 Investigation and Modeling of the Structure of Texting Language [ppt]
Monojit Choudhury, Rahul Saraf*, Vijit Jain, Sudeshna Sarkar and Anupam Basu
Indian Institute of Technology, Kharagpur, India.
*National Institute of Technology, Jaipur, India. |
|
| |
|
|
| 12.30 - 12.50 |
Paper 4 Adding Sentence Boundaries to Conversational Speech Transcriptions using Noisily Labelled Examples [ppt]
Tetsuya Nasukawa, Diwakar Punjani*, Shourya Roy*, L Venkata Subramaniam* and Hironori Takeuchi
IBM Research, Tokyo, Japan.
*IBM Research, New Delhi, India. |
|
| |
|
|
| 12.50 - 13.00 |
Boasters for Posters 5, 6, 7 and 8
|
|
| |
|
|
| 13.00 - 14.00 |
Lunch |
|
| |
|
|
| 14.00 - 15.30 |
Session III: Information Extraction from Noisy Text (Session Chair: Klaus Schulz)
|
|
| |
|
|
| 14.00 - 14.20 |
Paper 1 A Supervised Machine Learning Approach to Conjunction Disambiguation in Named Entities [ppt]
Pawel Mazur and Robert Dale
Macquarie University, Australia. |
|
|
|
|
| 14.20 - 14.40 |
Paper 2 BlogVox: Separating Blog Wheat from Blog Chaff [ppt]
Akshay Java, Pranam Kolari, Tim Finin, Justin Martineau, Anupam Joshi and James Mayfield*
University of Maryland, Baltimore County, MD, USA.
*Johns Hopkins University, USA. |
|
| |
|
|
| 14.40 - 15.00 |
Paper 3 An Automatic Approach to Semantic Annotation of Unstructured, Ungrammatical Sources: A First Look [ppt]
Matthew Michelson and Craig Knoblock
Information Sciences Institute, University of Southern California, USA. |
|
| |
|
|
| 15.00 - 15.20 |
Paper 4 Information Extraction for Multi-Participant, Task-Oriented, Synchronous, Computer-Mediated Communication: A Corpus Study of Chat Data [ppt]
Cassandre Creswell, Nicholas Schwartzmyer and Rohini Srihari
Janya Inc., USA. |
|
| |
|
|
| 15.20 - 15.30 |
Boasters for Posters 9, 10, 11 and 12
|
|
| |
| |
|
|
| 15.30 - 16.00 |
Tea/Coffee Break (along with Posters) |
|
| |
|
|
| 16.00 - 18.30 |
Session IV
| |
| |
|
|
| 16.00 - 17.00 |
Poster Presentations
|
|
|
|
|
|
Paper 1 Discovering Identities in Web Contexts Using Unsupervised Clustering [ppt]
Ted Pedersen and Anagha Kulkarni
University of Minnesota, Duluth, MN, USA. |
|
|
|
|
|
Paper 2 Ontology Based Algorithms for Indexing and Search of Semantically Close Natural Language Phrases [ppt]
Srikanth Kamath
National Institute of Technology Karnataka, India. |
|
|
|
|
|
Paper 3 A Treebank Conversion Algorithm for Non-Configurational Languages
Ahmad Pouramini and Naser Mozayani
Iran University of Science and Technology, Iran. |
|
|
|
|
|
Paper 4 Generating a Treebank of Ungrammatical English [ppt]
Jennifer Foster
Dublin City University, Dublin, Ireland. |
|
|
|
|
|
Paper 5 Multi-Level Feature Extraction for Spelling Correction [pdf]
Johannes Schaback and Fang Li*
Technische Universitaet, Berlin, Germany.
*Shanghai Jiao Tong University, China. |
|
|
|
|
|
Paper 6 Hidden Markov Model Based Identification of Transliterated Regional Language Words in Text Documents [ppt]
Achuth Sankar S. Nair, Vrinda V. Nair* and Vinod Chandra S. S.**
University of Kerala, Thiruvananthapuram, India.
*College of Engineering, Trissur, India.
**College of Engineering, Thiruvananthapuram, India. |
|
|
|
|
|
Paper 7 Multiclass Hierarchical SVM for Recognition of Printed Tamil Characters [ppt]
Shivsubramani K, Loganathan Ramasamy, Srinivasan CJ, Ajay V and Soman KP
Amrita Vishwa Vidyapeetham, India. |
|
|
|
|
|
Paper 8 A Causal Characterisation of Orthography Errors in Web Texts [ppt]
Mirko Tavosanis
University of Pisa, Italy. |
|
|
|
|
|
Paper 9 Alignment of Noisy Unstructured Text Data [ppt]
Julien Bourdaillet and Jean-Gabriel Ganascia
Universite Pierre et Marie Curie, France. |
|
|
|
|
|
Paper 10 Information Access to Historical Documents from the Early New High German Period [ppt]
Andreas Hauser, Markus Heller, Elisabeth Leiss, Klaus U. Schulz and Christiane Wanzeck
University of Munich, Germany. |
|
|
|
|
|
Paper 11 On Extracting Structured Knowledge from Unstructured Business Documents [ppt]
Gaurav Pandey and Rakshit Daga*
University of Minnesota, Twin Cities, USA.
*SAP Labs, USA. |
|
|
|
|
|
Paper 12 Mining Conversational Text for Procedures [ppt]
Deepak S. Padmanabhan and Krishna Kummamuru
IBM Research, Bangalore, India. |
|
|
|
|
| 17.00 - 18.00 |
Panel Discussion: Noisy Text Analytics: An Exercise in Futility?
Daniel Lopresti (moderator), Lehigh University, Bethlehem, PA, USA. [ppt]
Sreeram Balakrishnan, IBM Research, New Delhi, India. [ppt]
Hwee Tou Ng, National University of Singapore, Singapore. [ppt]
Rohini Srihari, Janya Inc., USA. [ppt] |
|
| |
|
|
| 18.00 - 18.03 |
IAPR Best Student Paper Award Announcement [ppt]
Raghuram Krishnapuram
IBM Research, New Delhi, India. |
|
|
|
|
| 18.03 - 18.10 |
Closing [ppt]
Craig Knoblock, Daniel Lopresti, Shourya Roy, L. Venkata Subramaniam
| |
|
|
|
| 18.00 - 22.00 |
IJCAI Inauguration and Welcome Dinner |
|