JOURNAL OF DIGITAL INFORMATION MANAGEMENT

(ISSN 0972-7272) The peer-reviewed journal

Home | Aim&Scope | Editorial Board | Author Guidelines | Publisher | Subscription | Current Issue | Contact Us |

Volume 4 Issue 3 September 2006

Abstracts

On the Use of Ontologies for an Optimal Representation and Exploration of the Web


Nicolas Guelfi, Cédric Pruski

Laboratory for Advanced Software Systems
Faculty of Sciences, Technology and Communication
University of Luxembourg
6, rue Coudenhove-Kalergi
Luxembourg-Kirchberg, Luxembourg
Email: {nicolas.guelfi, cedric.pruski}@uni.lu

Abstract

The use and definition of ontology for the representation and the exploration of knowledge are critical issues for approaches dealing with information retrieval. In this paper, we propose a new ontology-based approach for improving the quality, in terms of relevance, of the results obtained when searching documents on the Internet. This is done by a coherent integration of ontologies, Web data and query languages. We propose new data structures built upon ontologies: the WPGraph and the W3Graph which allow Web data to be modelled. We also discuss the use of ontologies for an efficient exploration of the knowledge contained in our conceptual structures using ASK, a specific query language introduced in this paper. An experimental validation of our approach is proposed through a prototype supporting our innovative framework.

Categories and Subject Descriptors
D.3.3 [Language Constructs and Features]; H.2.3 [Data Description Languages]
General Terms
Ontology, Web graph, Web data
Keywords
Web search, Query Language, Ontology, Conceptual Structures, Graphs
 


Study of Effectiveness of Implicit Indicators and Their Optimal Combination for Accurate Inference of Users Interests

Bracha Shapira*
Department of Information Systems Engineering
Ben-Gurion University, Beer-Sheva, P.O.B. 653, Israel
Email: bshapira@bgu.ac.il

Meirav Taieb-Maimon
Department of Information Systems Engineering
Ben-Gurion University, Beer-Sheva, P.O.B. 653, Israel
Email: Meiravta@bgu.ac.il

Anny Moskowitz
Department of Information Systems Engineering
Ben-Gurion University, Beer-Sheva, P.O.B. 653, Israel

Abstract

Retrieval and filtering systems may apply relevance feedback to gain information on users’ needs in order to improve their ad-hoc queries or long-term profiling. Explicit relevance feedback involves explicit ratings of documents or terms by the users and disrupts their normal patterns of browsing and searching. The non-disruptive alternative is implicit feedback, which infers users’ needs and interests by monitoring their regular interaction with the system. Some implicit indicators of interest, such as reading time, have been investigated in previous studies and were found to be indicative of the relevance of documents but not sufficiently accurate to substitute for explicit ratings. In this paper we present several new relative implicit feedback indicators and examine their effectiveness as well as the effect of combining several implicit indicators. The paper describes a large-scale user study in which users’ searches were observed by a specially developed browser that recorded their behavior (implicit indicators) as well as their explicit ratings. The relationship between implicit indicators and explicit ratings was analyzed, and a certain combination of implicit indicators was found to achieve a higher correlation with the explicit ratings than any of the individual indicators. We have also found that the newly suggested relative indicators are more indicative of the user's level of interest in an information item than the non-relative indicators.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Relevance feedback; H.3.4 [Systems and Software]: User profile and alert services
General Terms
User profiling, Query formulation
Keywords: Relevance feedback, Implicit indicators, User studies, User Modeling


Peer-to-Peer content search supported by a distributed index in a publication/search model

Tolosa, Gabriel H., Peri, Jorge A. and Bordignon, Fernando R.A.
Universidad Nacional de Luján
Departamento de Ciencias Básicas
Laboratorio de Redes
Cruce rutas 5 y 7 – Luján – Bs. As.
Argentina.
Email: {tolosoft, jperi, bordi}@unlu.edu.ar

Abstract

Peer-to-peer (P2P) networks are considered a valid approach for the construction of distributed systems. Several research projects in the last few years have focused on using this kind of network as an alternative for solving problems that have traditionally required centralized servers, such as search engines. This paper deals with the problem of content search in highly distributed and dynamic environments. We propose and evaluate a distributed index model built upon a peer-to-peer network which supports complete indexing of text documents and allows searching by content. A distinctive feature of this proposal is that it requires no specific network topology or hierarchy. Evaluations with different settings were performed by simulating a 10,000-node network, where each node had the capability to share documents. With regard to the traffic generated, experiments show an improvement in efficiency of between 84% and 93% over similar systems like Gnutella. The evaluation of retrieval performance using a test collection showed that the P2P system was able to achieve the same level of performance as the centralized system. It was also found that the amount of traffic generated by this model varies between 80 and 225 Kb per set of query and answers.

Categories and Subject Descriptors
C.2.1 [Network architecture and design]: Distributed networks; H.3.4 [Systems and Software]: Information networks
General Terms
Peer to peer networks, Content search
Keywords: P2P Networks, Nodes, Distributed index model, P2P network evaluation


An Intelligent Paradigm for Multi-Objects Tracking in Crowded Environment


Ayoub K. Al-Hamadi, Bernd Michaelis
Institute for Electronics, Signal Processing and Communications (IESK)
Otto-von-Guericke-University Magdeburg
D-39016 Magdeburg, P.O. Box 4120, Germany
Email: Ayoub.Al-Hamadi@et.uni-magdeburg.de


Abstract

In this paper, we describe a novel object tracking technique for color video sequences, with application to multi-object tracking in crowded scenes. The proposed paradigm integrates object detection into the object tracking process and provides a robust tracking framework under ambiguous conditions. In order to reduce the computational complexity and to increase the robustness, we use a tri-sectional structure: first, it distinguishes between real-world objects; second, it extracts image features such as motion blobs and color patches; and third, it abstracts objects such as meta-objects that denote real-world objects. Through such a tight integration of the motion blobs and color patches, as well as the global optimization of object trajectories, we have accomplished not only robust and efficient multi-object tracking, but also the ability to deal with merging and splitting of objects, irregular object motions, changing appearances, etc., which are challenging problems for most traditional tracking methods. The efficiency of the suggested technique for multi-object detection and tracking is demonstrated in this paper through the analysis of strongly disturbed real image sequences.

Categories and Subject Descriptors
H.5.1 [Multimedia Information Systems]; E.2 [Data Storage Representations]: Object representations; I.4.10 [Image representation]
General Terms
Object processing, Video sequencing
Keywords: Object motions, Colour video processing, Moving objects, MDI approach


Parallel Algorithms for the Generalized Same Generation Query in Deductive Databases


Nabil Arman
Faculty of Computer Science
Palestine Polytechnic University
Hebron, Palestine
Email: narman@ppu.edu

Abstract

The intelligence of traditional database systems can be improved by recursion. Using recursion, relational database systems are extended into knowledge-base systems (deductive database systems). Linear recursion is the most frequently found type of recursion in deductive databases. Deductive database queries are computationally intensive and lend themselves naturally to parallelization to speed up their solution. In this paper, parallel algorithms to solve the generalized fully and partially instantiated forms of the same generation query in deductive databases are presented. The algorithms use special data structures, namely, a special matrix that stores paths from source nodes of the graph representing a two-attribute normalized database relation to all nodes reachable from these source nodes, and a reverse matrix that stores paths from any node to all source nodes related to that node.
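
Illustrative note (not taken from the paper): for readers unfamiliar with the same generation query, the sketch below evaluates its textbook linear-recursive form, sg(x, y) :- flat(x, y) and sg(x, y) :- up(x, z), sg(z, w), down(w, y), with a plain sequential fixpoint iteration. It does not reproduce the matrix-based parallel algorithms described in the abstract; the relation names `up`, `flat`, `down` and the sample family data are assumptions chosen for the example.

```python
def same_generation(up, flat, down):
    """Sequential fixpoint evaluation of the same generation query.

    up, flat, down are sets of (x, y) pairs (hypothetical relation names).
    Returns every pair (x, y) for which sg(x, y) holds.
    """
    sg = set(flat)  # base rule: sg(x, y) :- flat(x, y)
    while True:
        # recursive rule: sg(x, y) :- up(x, z), sg(z, w), down(w, y)
        derived = {
            (x, y)
            for (x, z) in up
            for (z2, w) in sg if z2 == z
            for (w2, y) in down if w2 == w
        }
        if derived <= sg:  # nothing new was derived: fixpoint reached
            return sg
        sg |= derived


# Hypothetical data: up(child, parent); down is the inverse of up.
up = {("ann", "root"), ("bob", "root"), ("cat", "ann"), ("dan", "bob")}
down = {(parent, child) for (child, parent) in up}
flat = {("root", "root")}

sg = same_generation(up, flat, down)

# Partially instantiated query sg("cat", Y): everyone in cat's generation.
cat_generation = {y for (x, y) in sg if x == "cat"}
```

A fully instantiated query such as sg("cat", "dan") corresponds to a membership test on the returned set.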


Keywords: Deductive Databases, Linear Recursive Rules, Same Generation Query, Parallel Databases


Pruning Techniques in Associative Classification: Survey and Comparison


Fadi Thabtah
Management Information Systems Department
Philadelphia University,
Amman, Jordan
Email: ffayez@philadelphia.edu.jo

Abstract

Association rule discovery and classification are common data mining tasks. Integrating association rule discovery and classification, also known as associative classification, is a promising approach that derives classifiers whose accuracy is highly competitive with that of traditional classification approaches such as rule induction and decision trees. However, the size of the classifiers generated by associative classification is often large, and therefore pruning becomes an essential task. In this paper, we survey different rule pruning methods used by current associative classification techniques. Further, we compare the effect of three pruning methods (database coverage, pessimistic error estimation, lazy pruning) on the accuracy rate and the number of rules derived from different classification data sets. Results obtained from experiments on different data sets from the UCI data collection indicate that lazy pruning algorithms may produce classifiers with slightly higher predictive accuracy than those which utilise database coverage and pessimistic error pruning methods. However, the potential use of such classifiers is limited because they are difficult for the end-user to understand and maintain.
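
Illustrative note (not taken from the paper): the sketch below shows one common formulation of database coverage pruning, assuming the usual heuristic in which rules are ranked by confidence and then support, a rule is kept only if it correctly classifies at least one remaining training case, and every case it covers is then removed. The rule and data layouts and all names are hypothetical; the surveyed algorithms may differ in details such as tie-breaking and default-class handling.

```python
def database_coverage(rules, data):
    """Prune rules with a simple database coverage heuristic.

    rules: list of (antecedent, label, confidence, support) tuples,
           where antecedent is a frozenset of items (assumed layout).
    data:  list of (items, label) training cases, items a frozenset.
    """
    # Rank rules: higher confidence first, then higher support.
    ranked = sorted(rules, key=lambda r: (-r[2], -r[3]))
    remaining = list(data)
    kept = []
    for antecedent, label, confidence, support in ranked:
        # Cases whose items contain the whole rule antecedent.
        covered = [case for case in remaining if antecedent <= case[0]]
        # Keep the rule only if it classifies some covered case correctly.
        if any(case[1] == label for case in covered):
            kept.append((antecedent, label, confidence, support))
            remaining = [c for c in remaining if not antecedent <= c[0]]
        if not remaining:  # every training case is covered: stop early
            break
    return kept


# Hypothetical training cases and candidate rules.
data = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"c"}), "no"),
]
rules = [
    (frozenset({"a"}), "yes", 1.0, 0.67),
    (frozenset({"a", "b"}), "yes", 1.0, 0.33),
    (frozenset({"c"}), "no", 1.0, 0.33),
    (frozenset({"c"}), "yes", 0.0, 0.0),
]

kept = database_coverage(rules, data)
kept_antecedents = [rule[0] for rule in kept]
```

In this toy run the rule on {"a", "b"} is discarded because the more general rule on {"a"} has already covered its cases, which is how the heuristic shrinks the classifier.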

Categories and Subject Descriptors
H.2 [Database Management]; H.2.8 [Database Applications]: Data mining
General Terms
Data mining, Pruning methods
Keywords: Associative Classification, Association Rule, Classification, Data Mining, Rule Pruning

