The IJCAI 2009 Workshop on

Information Integration on the Web

At the Pasadena Conference Center in Pasadena, California, July 11-17, 2009



Focus: Data Issues in Services and Service Integration




Workshop Schedule


Important Dates

  • March 31, 2009: Paper Submission (new deadline)
  • April 17, 2009: Notification of Decision
  • May 8, 2009: Camera Ready Due
  • Workshop date: July 11, 2009


The explosive growth of the World Wide Web has amassed a huge number of information sources on the Internet with unprecedented potential for information access. Information originating from different sources needs to be integrated in order to gain full advantage of having that information available.  Information integration techniques enable the interaction between users and data sources through a centralized access point and uniform query interfaces that give users the ability to query a single integrated system.

Along with the growth of the World Wide Web, there has been phenomenal growth of the Internet as a medium for delivering services. Services in this context span a wide range: from outsourced business processes of organizations (e.g., call centers), to applications made available under a Service Oriented Architecture that can be readily composed by a third party, to infrastructure services where IT resources are made available on demand (e.g., cloud computing). Services are driven by information and are increasingly distributed. The data challenges a service provider faces stem from the fact that although the service and information needs of different customer engagements are broadly similar, they are presently recorded and managed differently. Economies of scale and reuse will come from organizing information in some uniform way, yet customer sensitivities about data (e.g., privacy and regulation) must be addressed. The magnitude of data involved could also be enormous. For customers, the data challenges stem from the need to bring efficiency to operations while retaining control over what differentiates them from their competition and meeting all regulatory requirements. These challenges connect directly to problems of interest to the IIWeb community.

This workshop, the seventh in the IIWeb series, focuses on making research in information integration on the web more relevant to the challenges in services. The workshop will continue its traditional themes of interest, namely integration architectures, information extraction, web object extraction, record linkage, named entity extraction, source meta-data learning, and query execution and optimization. However, we will give special emphasis to how these can be applied to services and service integration. After information is extracted and integrated, how should it be organized? How can we build or exploit models, taxonomies or ontologies? Further topics of interest include trust models for information on the web, service specification in multi-model environments, searching for services, the impact of data changes on composition and runtime adaptation, and the scale-up of methods to web levels.

The workshop will provide a platform to discuss research directions and share experience and insights from both academia and industry. The anticipated outcome of the workshop is to assess the state of the art in the area, as well as to identify critical next steps to pursue in this topic. Because the study of information integration and services is interdisciplinary in nature, its researchers span the related areas of data mining, machine learning, databases, information retrieval, the Semantic Web, Web services, and others.

Workshop Notes




Invited Talk: “Matching and clustering product descriptions using learned similarity metrics”,
                          by Prof. William Cohen, CMU.


Full Papers:


  1. Discovering and Learning Semantic Models of Online Sources for Information Integration (slides);

José Luis Ambite, Bora Gazen, Craig A. Knoblock, Kristina Lerman, Thomas Russ;

Contact: knoblock at

  2. Towards Scalable Information Integration with Instance Coreferences (slides);

Abir Qasem, Dimitre A. Dimitrov, Jeff Heflin;

Contact: abir.qasem at

  3. Unstructured information integration through data-driven similarity discovery (slides);

Rema Ananthanarayanan, Sreeram Balakrishnan, Berthold Reinwald, Yuen Yee;

Contact: arema at

  4. Combining Multiple Query Interface Matchers Using Dempster-Shafer Theory of Evidence (slides);

Jun Hong, Zhongtian He and David A. Bell;

Contact: j.hong at

  5. Cross document person name disambiguation using entity profiles (slides);

Harish Srinivasan, John Chen, Rohini Srihari;

Contact: hsrinivasan at

  6. Challenges in Moving from Documents to Information Web for Services (slides);

Rakesh Mohan, Biplav Srivastava, Pietro Mazzoleni, Richard Goodwin;

Contact: sbiplav at



Short Papers:


  1. A Flexible Machine Learning Framework with User Interaction for Ontology Matching (slides);

Hoai-Viet To, Ryutaro Ichise, and Hoai-Bac Le;

Contact: thviet at

  2. Flexible query formulation for federated search (slides);

Matthew Michelson, Sofus A. Macskassy, and Steven N. Minton;

Contact: mmichelson at

  3. On Developing Service-Oriented Web Applications (slides);

Sabah S. Al-Fedaghi;

Contact: sabah at

  4. Composition of Services that Share an Infinite-State Blackboard (slides);

Fabio Patrizi and Giuseppe De Giacomo;

Contact: fabio.patrizi at


The Organizers


Biplav Srivastava

IBM T.J. Watson Research Center, Hawthorne, USA

Email: sbiplav  at


Ullas Nambiar

IBM India Research Lab, New Delhi, India

Email: ubnambiar  at


Craig Knoblock

University of Southern California, USA

Email: knoblock  at


Steering Committee:

·         Subbarao Kambhampati, Arizona State University, USA

·         Rakesh Mohan, IBM Research, USA


Program Committee:


·         Avigdor Gal, Technion - Israel Institute of Technology

·         Biplav Srivastava,  IBM Watson Research Lab

·         Chen Li, University of California, Irvine

·         Craig Knoblock,  Univ of Southern California

·         Eran Toch,  Carnegie Mellon University

·         Felix Naumann, Hasso Plattner Institut, Potsdam

·         Gautam Das, University of Texas, Arlington

·         Kevin Chang, University of Illinois at Urbana-Champaign

·         L Venkat Subramaniam, IBM India Research Lab

·         Lise Getoor,  University of Maryland, College Park

·         Louiqa Raschid, University of Maryland, College Park

·         Michael Sheng, Univ of Adelaide, Australia

·         Mong Li Lee, National University of Singapore

·         Prasan Roy, Aster Data Systems

·         Prashant Doshi, University of Georgia

·         Rakesh Mohan,  IBM Watson Research Lab

·         Richard Goodwin,  IBM Watson Research Lab

·         Subbarao Kambhampati, Arizona State University

·         Thomas Lee, The Wharton School, Univ of Pennsylvania

·         Ullas Nambiar,  IBM India Research Lab

·         Vikram Pudi, IIIT Hyderabad

·         Weiyi Meng, Binghamton University, NY

·         Yuan Chi Chang, IBM Watson Research Lab

·         Zaiqing Nie, Microsoft Research Asia, China


History of the Workshop


IIWeb-09 will follow the tradition established by the following six workshops, each with a different focus theme and each held at a premier venue across related communities.


·         AI and Information Integration, held at AAAI 1998 Co-chairs: Craig Knoblock and Alon Levy  


·         Intelligent Information Integration (III 99), held at IJCAI 1999  Co-chairs: D. Fensel, C.A. Knoblock, N. Kushmerick and M.C. Rousset


·         IIWeb 2003, held at IJCAI 2003, received 40 submissions (33 accepted), attracted 45 attendees.  Co-chairs: Craig Knoblock and Subbarao Kambhampati



·         IIWeb 2004, held at VLDB 2004, received 45 submissions (25 accepted), attracted 45 attendees.   Co-chairs: Hasan Davulcu and Nick Kushmerick.  



·         IIWeb 2006, held at WWW 2006. Co-chairs: Kevin Chang and Avigdor Gal



·         IIWeb 2007, held at AAAI 2007, received 24 submissions (12 accepted), attracted 30 attendees. Co-chairs: Ullas Nambiar and Zaiqing Nie