A Cross-language Information Retrieval System Based On Linguistic And Statistical Approaches

Nasreddine Semmar; Faiza Elkateb-Gara

doi:10.61850/allj.v19i2.479

Published: Dec 28, 2013

Keywords:

language information retrieval, linguistic analysis, statistical model, bilingual dictionaries

Nasreddine Semmar

Multilingual Multimedia Knowledge Engineering Laboratory

Faiza Elkateb-Gara

Multilingual Multimedia Knowledge Engineering Laboratory

Abstract

As the number of non-English documents on the World Wide Web and in corporate repositories grows, the ability to search and view documents quickly and effectively across language boundaries becomes increasingly important. Cross-language information retrieval techniques give searchers access to a wider range of material without requiring specialized knowledge of the content or of the languages in the database. This paper presents a cross-language information retrieval system based on a deep linguistic analysis of documents and queries and on a statistical model that assigns each word in the database a weight according to its discriminating power. A comparison tool evaluates all possible intersections between queries and documents and orders the documents by relevance.
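The abstract does not give the weighting formula or the internals of the comparison tool, so the following is only a minimal sketch of the general idea under stated assumptions. It uses inverse document frequency as a stand-in for "discriminating power", a toy bilingual dictionary for query translation, and a sum of weights over shared terms as the relevance score. The function names, the sample documents, and the dictionary entries are all hypothetical, and the deep linguistic analysis step is replaced here by pre-tokenized word lists.

```python
import math
from collections import defaultdict


def idf_weights(docs):
    """Assumed weighting: log(N / df), so rarer terms get larger weights."""
    n_docs = len(docs)
    doc_freq = defaultdict(int)
    for terms in docs.values():
        for term in set(terms):
            doc_freq[term] += 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def translate_query(query_terms, bilingual_dict):
    """Replace each source-language term by all of its listed translations."""
    translated = set()
    for term in query_terms:
        translated.update(bilingual_dict.get(term, []))
    return translated


def rank(query_terms, docs, weights):
    """Score each document by the summed weights of the terms it shares
    with the query; documents sharing no term are left out."""
    query = set(query_terms)
    scores = {}
    for doc_id, terms in docs.items():
        shared = set(terms) & query
        if shared:
            scores[doc_id] = sum(weights.get(t, 0.0) for t in shared)
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: English documents, French query.
docs = {
    "d1": ["retrieval", "system", "arabic"],
    "d2": ["retrieval", "system"],
    "d3": ["system", "dictionary"],
}
bilingual_dict = {"recherche": ["retrieval", "search"], "arabe": ["arabic"]}

weights = idf_weights(docs)
query = translate_query(["recherche", "arabe"], bilingual_dict)
ranking = rank(query, docs, weights)
print([doc_id for doc_id, _ in ranking])
```

In this toy run, d1 ranks above d2 because it also contains the rarer term "arabic", and d3 is omitted because it shares no term with the translated query. The term "system" occurs in every document and therefore receives a weight of zero, which illustrates why a weight tied to discriminating power is useful for ordering results.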

How to Cite

Semmar, N., & Elkateb-Gara, F. (2013). A Cross-language Information Retrieval System Based On Linguistic And Statistical Approaches. AL-Lisaniyyat, 19(2), 1-10. https://doi.org/10.61850/allj.v19i2.479

Issue

Vol. 19 No. 2 (2013): v19i22013

Section

Articles

In accordance with its open access publishing policy, AL-Lisaniyyat acknowledges and guarantees authors the full and exclusive ownership of copyright and intellectual property rights related to their scholarly contributions.

The publication of an article in the journal does not result in any transfer, assignment, or limitation of these rights. Authors retain full rights over their works, without the requirement to obtain prior written authorization from the journal.
