Academic Publications

@article{ayter2015statistical,

title = {Statistical analysis to establish the importance of information retrieval parameters},

author = {Julie Ayter and Adrian Chifu and Sébastien Déjean and Cecile Desclaux and Josiane Mothe},

url = {https://hal.archives-ouvertes.fr/hal-01592043/document},

year  = {2015},

date = {2015-12-01},

urldate = {2015-12-01},

journal = {Journal of Universal Computer Science},

volume = {21},

number = {13},

pages = {pp--1767},

abstract = {Search engines are based on models to index documents, match queries and documents and rank documents. Research in Information Retrieval (IR) aims at defining these models and their parameters in order to optimize the results. Using benchmark collections, it has been shown that there is not a best system configura- tion that works for any query, but rather that performance varies from one query to another. It would be interesting if a meta-system could decide which system config- uration should process a new query by learning from the context of previousqueries. This paper reports a deep analysis considering more than 80,000 search engine config- urations applied to 100 queries and the corresponding performance. The goal of the analysis is to identify which configuration responds best to a certain type of query. We considered two approaches to define query types: one is post-evaluation, based on query clustering according to the performance measured with Average Precision, while the second approach is pre-evaluation, using query features (including query difficulty predictors) to cluster queries. Globally, we identified two parameters that should be optimized: retrieving model and TrecQueryTags process. One could ex- pect such results as these two parameters are major components of IR process. However our work results in two main conclusions: 1/ based on post-evaluation approach, we found that retrieving model is the most influential parameter for easy queries while TrecQueryTags process is for hard queries; 2/ for pre-evaluation, current query fea- tures do not allow to cluster queries to identify differences in the influential parameters.},

key = {Information Retrieval, query difficulty, query clustering, IR system pa- rameters, Random Forest},

keywords = {Information Retrieval, IR System Parameter, Query Clustering, Query Difficulty, Random Forest},

pubstate = {published},

tppubtype = {article}

}

Close

Adrian-Gabriel Chifu; Florentina Hristea; Josiane Mothe; Marius Popescu

Word sense discrimination in information retrieval: A spectral clustering-based approach Journal Article

In: Information Processing & Management, vol. 51, no. 2, pp. 16–31, 2015.

Abstract | Links | BibTeX | Tags: High Precision, Information Retrieval, Spectral Clustering, Word Sense Disambiguation, Word Sense Discrimination

Radu Tudor Ionescu; Adrian-Gabriel Chifu; Josiane Mothe

DeShaTo: Describing the Shape of Cumulative Topic Distributions to Rank Retrieval Systems without Relevance Judgments Conference

International Symposium on String Processing and Information Retrieval, SPIRE2015 Springer 2015.

Abstract | Links | BibTeX | Tags: Document Topic Distribution, Information Retrieval, Kurtosis, LDA, Ranking Retrieval Systems, Skewness, Topic Modeling

Adrian Chifu; Léa Laporte; Josiane Mothe

La prédiction efficace de la difficulté des requêtes : une tâche impossible ? Conference

Conférence francophone en Recherche d'Information et Applications (CORIA 2015), Paris, 2015.

Abstract | Links | BibTeX | Tags: Data Mining, Evaluation, Information Retrieval, Query Difficulty Prediction

@conference{ChifuCORIA2015,

title = {La prédiction efficace de la difficulté des requêtes : une tâche impossible ?},

author = {Adrian Chifu and Léa Laporte and Josiane Mothe},

url = {https://oatao.univ-toulouse.fr/15263/1/chifu_15263.pdf},

year  = {2015},

date = {2015-03-18},

booktitle = {Conférence francophone en Recherche d'Information et Applications (CORIA 2015), Paris},

abstract = {Résumé :



Les moteurs de recherche d’information (RI) retrouvent des réponses quelle que soit la requête, mais certaines requêtes sont difficiles (le système n’obtient pas de bonne performance en termes de mesure de RI). Pour les requêtes difficiles, des traitements ad-hoc doivent être appliqués. Prédire qu’une requête est difficile est donc crucial et différents prédicteurs ont été proposés. Dans cet articlenous étudions la variété de l’information captée par les prédicteurs existants et donc leur non redondance. Par ailleurs, nous montrons que les corrélations entre les prédicteurs et les performance des systèmes donnent peu d’espoir sur la capacité de ces prédicteurs à être réellement efficaces. Enfin, nous étudions la capacité des prédicteurs à prédire les classes de difficulté des requêtes en nous appuyant sur une variété de méthodes exploratoires et d’apprentissage. Nous montrons que malgré les (faibles) corrélations observées avec les mesures de performance, les prédicteurs actuels conduisent à des performances de prédiction variables et sont donc difficilement utilisables dans une application concrète de RI.



Abstract:



Search engines found answers whatever the user query is, but some queries are more difficult than others for the system. For difficult queries, adhoc treatments must be applied. Predicting query difficulty is crucial and different predictors have been proposed. In this paper, we revisit these predictors. First we check the non statistical redundancy of predictors. Then, we show that the correlation between the values of predictors and system performance gives little hope on the ability of these predictors to be effective. Finally, we study the ability of predictors to predict the classes of difficulty by relying on a variety of exploratory and learning methods. We show that despite the (low) correlation with performance measures, current predictors are not robust enough to be used in practical IR applications.},

keywords = {Data Mining, Evaluation, Information Retrieval, Query Difficulty Prediction},

pubstate = {published},

tppubtype = {conference}

}

Close

Résumé :

Les moteurs de recherche d’information (RI) retrouvent des réponses quelle que soit la requête, mais certaines requêtes sont difficiles (le système n’obtient pas de bonne performance en termes de mesure de RI). Pour les requêtes difficiles, des traitements ad-hoc doivent être appliqués. Prédire qu’une requête est difficile est donc crucial et différents prédicteurs ont été proposés. Dans cet articlenous étudions la variété de l’information captée par les prédicteurs existants et donc leur non redondance. Par ailleurs, nous montrons que les corrélations entre les prédicteurs et les performance des systèmes donnent peu d’espoir sur la capacité de ces prédicteurs à être réellement efficaces. Enfin, nous étudions la capacité des prédicteurs à prédire les classes de difficulté des requêtes en nous appuyant sur une variété de méthodes exploratoires et d’apprentissage. Nous montrons que malgré les (faibles) corrélations observées avec les mesures de performance, les prédicteurs actuels conduisent à des performances de prédiction variables et sont donc difficilement utilisables dans une application concrète de RI.

Abstract:

Search engines found answers whatever the user query is, but some queries are more difficult than others for the system. For difficult queries, adhoc treatments must be applied. Predicting query difficulty is crucial and different predictors have been proposed. In this paper, we revisit these predictors. First we check the non statistical redundancy of predictors. Then, we show that the correlation between the values of predictors and system performance gives little hope on the ability of these predictors to be effective. Finally, we study the ability of predictors to predict the classes of difficulty by relying on a variety of exploratory and learning methods. We show that despite the (low) correlation with performance measures, current predictors are not robust enough to be used in practical IR applications.

Close

Julie Ayter; Cecile Desclaux; Adrian Chifu; Josiane Mothe; Sébastien Déjean

Performance Analysis of Information Retrieval Systems Conference

Spanish Conference on Information Retrieval (CERI2014), Coruna, 2014, 2014.

Abstract | Links | BibTeX | Tags: Adaptive Information Retrieval, Classification, Information Retrieval, Optimization, Query Difficulty, Random Forest

Adrian-Gabriel Chifu

Prédire la Difficulté des Requêtes: la Combinaison de Mesures Statistiques et Sémantiques Conference

COnférence francophone en Recherche d'Information et Applications, CORIA2013 2013.

Abstract | Links | BibTeX | Tags: Combined Predictors, Information Retrieval, Measure Correlation, Query Ambiguity, Query Difficulty, Query Performance Prediction

Arabic	Hebrew	Polish
Bulgarian	Hindi	Portuguese
Catalan	Hmong Daw	Romanian
Chinese Simplified	Hungarian	Russian
Chinese Traditional	Indonesian	Slovak
Czech	Italian	Slovenian
Danish	Japanese	Spanish
Dutch	Klingon	Swedish
English	Korean	Thai
Estonian	Latvian	Turkish
Finnish	Lithuanian	Ukrainian
French	Malay	Urdu
German	Maltese	Vietnamese
Greek	Norwegian	Welsh
Haitian Creole	Persian

2020

Conferences

2019

Conferences

2018

Journal Articles

2016

Conferences

2015

Journal Articles

Conferences

2014

Conferences

2013

Conferences