IJCAI-07 Workshop on Analytics for Noisy Unstructured Text Data

Hyderabad, India - January 8, 2007


Supported by IBM Research
Endorsed by the International Association for Pattern Recognition (IAPR)

Noisy unstructured text data is found in informal settings such as online chat, SMS, emails, message boards, newsgroups, blogs, wikis and web pages. Text produced by processing spontaneous speech, printed text and handwritten text also contains processing noise. Text produced under such circumstances is typically highly noisy, containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuation, missing case information, and pause-filling words such as “um” and “uh.” Such text can be seen in large amounts in contact centers, online chat rooms, OCRed text documents, SMS corpora, etc. The theme of the IJCAI 2007 Conference is "AI and its benefits to society." In keeping with this theme, this workshop proposes to look at text analytics of highly noisy text that is produced in such everyday applications in society.

The goal of the workshop is to focus on the problems encountered in analyzing such noisy documents coming from various sources. The nature of the text warrants moving beyond traditional text analytics techniques. We hope that the workshop will allow researchers to present current research and development in addressing this challenge. We also believe that the workshop will lead to the sharing of real-life noisy data sets, making them available to a wider research community.



Please send your feedback, comments and ideas on future workshops to the AND 07 contacts.

Thank you for your participation in the workshop. The workshop went very well with all your help and support. A post-workshop summary is available.



