Academic Publications
Feature Selection for Spectral Clustering: to Help or Not to Help Spectral Clustering when Performing Sense Discrimination for IR?
Open Computer Science, Volume 8, Issue 1, p. 218–227.
Keywords: word sense discrimination; information retrieval; query disambiguation; spectral clustering
About The Publication
Whether or not word sense disambiguation (WSD) can improve information retrieval (IR) results represents a topic that has been intensely debated over the years, with many inconclusive or contradictory findings. The most rarely used type of WSD for this task is the unsupervised one, although it has been proven to be beneficial at a large scale. Our study builds on existing research and tries to improve the most recent unsupervised method, which is based on spectral clustering. It investigates the possible benefits of “helping” spectral clustering through feature selection when it performs sense discrimination for IR. Results obtained so far, involving large data collections, encourage us to point out the importance of feature selection even in the case of this advanced, state-of-the-art clustering technique that is known for performing its own feature weighting. By suggesting an improvement of what we consider the most promising approach to the usage of WSD in IR, and by commenting on its possible extensions, we argue that WSD still holds promise for IR and hope to stimulate the continuation of this line of research, perhaps at an even more successful level.
Statistical Analysis to Establish the Importance of Information Retrieval Parameters
Journal of Universal Computer Science, Consortium J.UCS, Special Issue Information Retrieval and Recommendation, Vol. 21 N. 13 (2015), p. 1767-1789.
Keywords: Information Retrieval, query difficulty, query clustering, IR system parameters, Random Forest
About The Publication
Search engines are based on models to index documents, match queries and documents, and rank documents. Research in Information Retrieval (IR) aims at defining these models and their parameters in order to optimize the results. Using benchmark collections, it has been shown that there is no single best system configuration that works for every query, but rather that performance varies from one query to another. It would be interesting if a meta-system could decide which system configuration should process a new query by learning from the context of previous queries. This paper reports a deep analysis considering more than 80,000 search engine configurations applied to 100 queries and the corresponding performance. The goal of the analysis is to identify which configuration responds best to a certain type of query. We considered two approaches to define query types: one is post-evaluation, based on query clustering according to the performance measured with Average Precision, while the second approach is pre-evaluation, using query features (including query difficulty predictors) to cluster queries. Globally, we identified two parameters that should be optimized: the retrieving model and the TrecQueryTags process. One could expect such results, as these two parameters are major components of the IR process. However, our work results in two main conclusions: (1) based on the post-evaluation approach, we found that the retrieving model is the most influential parameter for easy queries, while the TrecQueryTags process is for hard queries; (2) for pre-evaluation, current query features do not allow clustering queries in a way that identifies differences in the influential parameters.
Word Sense Discrimination in Information Retrieval: A Spectral Clustering-based Approach
Information Processing & Management, Elsevier, Vol. 51, p. 16-31
Keywords: Information retrieval, Word sense disambiguation, Word sense discrimination, Spectral clustering, High precision
About The Publication
Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been defined to help systems choose which documents should be retrieved in relation to an ambiguous query. However, the only approaches that show a genuine benefit for word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method we develop is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. For several TREC ad hoc collections we show that our method is useful in the case of queries which contain ambiguous terms. We are interested in improving the level of precision after 5, 10 and 30 retrieved documents (P@5, P@10, P@30) respectively. We show that precision can be improved by 8% above current state-of-the-art baselines. We also focus on poorly performing queries.
Word Sense Disambiguation to Improve Precision for Ambiguous Queries
Central European Journal of Computer Science, Versita, co-published with Springer Verlag, London, UK, Vol. 2 N. 4, p. 398-411
Keywords: Information retrieval, Word sense disambiguation, Naïve bayes classification, Difficult queries, Ambiguous queries, Document clustering, Fusion functions
About The Publication
Success in Information Retrieval (IR) depends on many variables. Several interdisciplinary approaches try to improve the quality of the results obtained by an IR system. In this paper we propose a new way of using word sense disambiguation (WSD) in IR. The method we develop is based on Naïve Bayes classification and can be used both as a filtering and as a re-ranking technique. We show on the TREC ad-hoc collection that WSD is useful in the case of queries which are difficult due to sense ambiguity. We focus on improving precision after 5, 10 and 30 retrieved documents (P@5, P@10, P@30), respectively, for such lowest-precision queries.
Prédire l’intensité de contradiction dans les commentaires : faible, forte ou très forte ? (to appear)
Revue d'Intelligence Artificielle (RIA 2020)
Prédire l’intensité de contradiction dans les commentaires : faible, forte ou très forte ?
Le Bulletin de l'Association Française pour l'Intelligence Artificielle (AFIA 2019)
Keywords: Sentiment analysis, Aspects detection, Criteria evaluation, Contradiction intensity
About The Publication
Reviews on web resources (e.g. courses, movies) are increasingly exploited in text analysis tasks (e.g. opinion detection, controversy detection). This paper investigates contradiction intensity in reviews, exploiting different features such as variation of ratings and variation of polarities around specific entities (e.g. aspects, topics). Firstly, aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns in the reviews collection. Secondly, the polarity of each review segment containing an aspect is estimated. Only resources containing these aspects with opposite polarities are considered. Finally, several features are evaluated, using feature selection algorithms, to determine their impact on the effectiveness of contradiction intensity detection. The selected features are used to train several state-of-the-art learning approaches. The experiments are conducted on the Massive Open Online Courses data set containing 2,244 courses and their 73,873 reviews, collected from coursera.org. Results showed that variation of ratings, variation of polarities, and reviews quantity are the best predictors of contradiction intensity. Also, J48 was the most effective learning approach for this type of classification.
Fair Exposure of Documents in Information Retrieval: a Community Detection Approach
CIRCLE2020
Keywords: Information systems, Information retrieval, Fair document exposure, Document network, Document communities, Document re-ranking
About The Publication
While (mainly) designed to answer users’ needs, search engines and recommendation systems do not necessarily guarantee the exposure of the data they store and index, even though such exposure can be essential for information providers. A recent research direction, the so-called “fair” exposure of documents, tackles this problem in information retrieval. It has mainly been cast as a re-ranking problem with constraints and optimization functions. This paper presents the first steps toward a new framework for fair document exposure. This framework is based on document linking and document community detection; communities are used to rank the documents to be retrieved according to an information need. In addition to the first step of this new framework, we present its potential through both a toy example and a few illustrative examples from the 2019 TREC Fair Ranking Track data set.
DeepNLPF: A Framework for Integrating Third Party NLP Tools
LREC2020
Keywords: Natural Language Processing, NLP tools integration, Framework
About The Publication
Natural Language Processing (NLP) of textual data is usually broken down into a sequence of several subtasks, where the output of one of the subtasks becomes the input to the following one, which constitutes an NLP pipeline. Many third-party NLP tools are currently available, each performing distinct NLP subtasks. However, it is difficult to integrate several NLP toolkits into a pipeline due to many problems, including different input/output representations or formats, distinct programming languages, and tokenization issues. This paper presents DeepNLPF, a framework that enables easy integration of third-party NLP tools, allowing the user to preprocess natural language texts at the lexical, syntactic, and semantic levels. The proposed framework also provides an API for complete pipeline customization, including the definition of input/output formats, integration plugin management, transparent multiprocessing execution strategies, corpus-level statistics, and database persistence. Furthermore, the DeepNLPF user-friendly GUI allows its use even by non-expert NLP users. We conducted a runtime performance analysis showing that DeepNLPF not only easily integrates existing NLP toolkits but also significantly reduces runtime compared to executing the same NLP pipeline sequentially.
The R2I_LIS Team Proposes Majority Vote for VarDial’s MRC Task
Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, co-located with NAACL 2019 (VarDial2019 @ NAACL2019)
Keywords: Dialect classification, Feature engineering, Majority vote, Competition
About The Publication
This article presents the model that generated the runs submitted by the R2I_LIS team to the VarDial2019 evaluation campaign, more particularly, to the binary classification by dialect sub-task of the Moldavian vs. Romanian Cross-dialect Topic identification (MRC) task. The team proposed a majority-vote-based model, combining five supervised machine learning models trained on forty manually crafted features. One of the three submitted runs was ranked second at the binary classification sub-task, with a performance of 0.7963 in terms of macro-F1 measure. The other two runs were ranked third and fourth, respectively.
On the Use of Dependencies in Relation Classification of Text with Deep Learning
20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing2019)
Keywords: Dependencies, Relation Classification, Deep Learning, Word Embedding, Compositional Word Embedding
About The Publication
Deep Learning is increasingly used in NLP tasks, such as relation classification of texts. This paper assesses the impact of syntactic dependencies in this task at two levels. The first level concerns the generic Word Embedding (WE) used as input to the classification model; the second level concerns the corpus whose relations have to be classified. Two classification models are studied: the first one is based on a CNN using a generic WE and does not take into account the dependencies of the corpus to be processed, while the second one is based on a compositional WE combining a generic WE with syntactic annotations of the corpus to classify. The impact of dependencies in relation classification is estimated using two different WEs. The first one is essentially lexical and trained on the English Wikipedia corpus, while the second one is also syntactic, trained on the same corpus previously annotated with syntactic dependencies. The two classification models are evaluated on the SemEval 2010 reference corpus using these two generic WEs. The experiments show the importance of taking dependencies into account at different levels in relation classification.
Query Performance Prediction Focused on Summarized Letor Features
41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR2018
Keywords: Query performance prediction, Query difficulty prediction, Query features, Post retrieval features, Letor features
About The Publication
Query performance prediction (QPP) aims at automatically estimating information retrieval system effectiveness for any user’s query. Previous work has investigated several types of pre- and post-retrieval query performance predictors; the latter have been shown to be more effective. In this paper we investigate the use of features that were initially defined for learning to rank in the task of QPP. While these features have been shown to be useful for learning to rank documents, they have never been studied as query performance predictors. We developed more than 350 variants of them based on summary functions. Conducting experiments on four TREC standard collections, we found that Letor-based features appear to be better query performance predictors than those from the literature. Moreover, we show that combining the best Letor features outperforms the state-of-the-art query performance predictors. This is the first study that considers such an amount and variety of Letor features for QPP and demonstrates that they are appropriate for this task.
Predicting Contradiction Intensity: Low, Strong or Very Strong?
41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR2018
Keywords: Sentiment, Aspect, Feature evaluation, Contradiction intensity
About The Publication
Reviews on web resources (e.g. courses, movies) are increasingly exploited in text analysis tasks (e.g. opinion detection, controversy detection). This paper investigates contradiction intensity in reviews, exploiting different features such as variation of ratings and variation of polarities around specific entities (e.g. aspects, topics). Firstly, aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns in the reviews collection. Secondly, the polarity of each review segment containing an aspect is estimated. Only resources containing these aspects with opposite polarities are considered. Finally, several features are evaluated, using feature selection algorithms, to determine their impact on the effectiveness of contradiction intensity detection. The selected features are used to train several state-of-the-art learning approaches. The experiments are conducted on the Massive Open Online Courses data set containing 2,244 courses and their 73,873 reviews, collected from coursera.org. Results showed that variation of ratings, variation of polarities, and reviews quantity are the best predictors of contradiction intensity. Also, J48 was the most effective learning approach for this type of classification.
Challenges to knowledge organization in the era of social media. The case of social controversies
15th International ISKO Conference, ISKO2018
Keywords: controversy mediation, social media, Twitter, post-truth, social capital, societal challenges to knowledge organization
About The Publication
In this paper, we look at how social media, in particular Twitter, are used to trigger, propagate and regulate opinions and social controversies. Social media platforms are displacing the mainstream media and traditional sources of knowledge by facilitating the propagation of ideologies and causes championed by different groups of people. This results in pressure being brought to bear on institutions in the real world, which are forced to make hasty decisions based on social media campaigns. The new forms of activism and the public arena enabled by social media platforms have also facilitated the propagation of so-called “post-truth” and “alternative facts” that obfuscate the traditional processes of knowledge elaboration, which took decades to develop. This poses serious challenges for Knowledge Organization systems (KOS) that the KO community needs to find ways to address.
Contradiction in Reviews: is it Strong or Low?
40th European Conference on Information Retrieval (ECIR 2018), BroDyn workshop
Keywords: sentiment analysis, aspect detection, contradiction intensity
About The Publication
Analysis of opinions (reviews) generated by users is increasingly exploited by a variety of applications. It makes it possible to follow the evolution of opinions or to carry out investigations on web resources (e.g. courses, movies, products). The detection of contradictory opinions is an important task when evaluating such resources. This paper focuses on the problem of detecting and estimating contradiction intensity based on sentiment analysis around specific aspects of a resource. Firstly, certain aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns across the reviews. Secondly, the polarity of each review segment containing an aspect is estimated using the state-of-the-art approach SentiNeuron. Then, only the resources containing these aspects with opposite polarities (positive, negative) are considered. Thirdly, a measure of contradiction intensity is introduced, based on the joint dispersion of the polarity and the rating of the reviews containing the aspects within each resource. The evaluation of the proposed approach is conducted on the Massive Open Online Courses collection containing 2,244 courses and their 73,873 reviews, collected from Coursera. The results revealed the effectiveness of the proposed approach to detect and quantify contradictions.
Les réactions des stakeholders aux allégations d’irresponsabilité organisationnelle : le cas du scandale Volkswagen
12ème congrès du RIODD
Finding and Quantifying Temporal-Aware Contradiction in Reviews
The 13th Asia Information Retrieval Societies Conference, AIRS2017
Keywords: Sentiment analysis, Aspect detection, Contradiction intensity
About The Publication
Opinions (reviews) on web resources (e.g. courses, movies), generated by users, are increasingly exploited in text analysis tasks, the detection of contradictory opinions being one of them. This paper focuses on the quantification of sentiment-based contradictions around specific aspects in reviews. It is necessary, however, to study contradictions with respect to the temporal dimension of reviews (their sessions). In general, for web resources such as online courses (e.g. Coursera or edX), reviews are often generated during course sessions. Between sessions, users stop reviewing courses, and courses may be updated. So, in order to avoid confusing contradictory reviews coming from two or more different sessions, the reviews related to a given resource should first be grouped according to their corresponding session. Secondly, aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns in the reviews collection. Thirdly, the polarity of each review segment containing an aspect is estimated. Then, only resources containing these aspects with opposite polarities are considered. Finally, contradiction intensity is estimated based on the joint dispersion of polarities and ratings of the reviews containing aspects. The experiments are conducted on the Massive Open Online Courses data set containing 2,244 courses and their 73,873 reviews, collected from coursera.org. The results confirm the effectiveness of our approach to find and quantify contradiction intensity.
Role of social media in propagating controversies: the case of cultural microblog feeds
The 8th Conference and Labs of the Evaluation Forum, Microblog Cultural Contextualization lab, CLEF2017
Keywords: Focus IR, opinion mining, information visualization
About The Publication
The aim of this research is to investigate how social media mediate social controversies in the public arena. For that, we will use the CLEF MC2 corpus of microblogs that captured long term political and cultural controversies in order to follow the birth and development of controversies across time and pinpoint the increasing role that social media play in their propagation, regulation and resolution.
Harnessing Ratings and Aspect-Sentiment to Estimate Contradiction Intensity in Temporal-Related Reviews
21st International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES2017
Keywords: Sentiment Analysis, Aspect Extraction, Rating, Review, Time, Contradiction Intensity
About The Publication
Analysis of opinions (reviews) generated by users is increasingly exploited by a variety of applications. It makes it possible to follow the evolution of opinions or to carry out investigations on products. The detection of contradictory opinions about a web resource (e.g. courses, movies, products) is an important task when evaluating that resource. This paper focuses on the problem of detecting contradictions in reviews based on sentiment analysis around specific aspects of a resource (document). In general, for web resources such as online courses (e.g. on Coursera or edX), reviews are often generated during course sessions. Between sessions, users stop reviewing the course, and the course may be updated. So, in order to avoid confusing contradictory reviews coming from two or more different sessions, the reviews related to a given resource should first be grouped according to their session. Secondly, certain aspects are extracted according to the distributions of emotional terms in the vicinity of the most frequent nouns in the reviews collection. Thirdly, the polarity of each review segment containing an aspect is identified. Then, only the resources containing these aspects with opposite polarities (positive, negative) are retained. Finally, we propose a measure of contradiction intensity based on the joint dispersion of the polarity and the rating of the reviews containing the aspects within each resource. The evaluation of our approach is conducted on the Massive Open Online Courses (MOOC) collection containing 2,244 courses and their 73,873 reviews, collected from Coursera. The results of the experiments revealed the effectiveness of the proposed approach to capture and quantify contradiction intensity.
Human-Based Query Difficulty Prediction
39th European Conference on Information Retrieval, ECIR 2017
Keywords: Free Text, Query Term, Free Text Comment, Human Annotator, Query Suggestion
About The Publication
The purpose of an automatic query difficulty predictor is to decide whether an information retrieval system is able to provide the most appropriate answer for a given query. Researchers have investigated many types of automatic query difficulty predictors. These are mostly related to how search engines process queries and documents: they are based on the inner workings of searching/ranking system functions, and therefore they do not provide any really insightful explanation as to the reasons for the difficulty, and they neglect user-oriented aspects. In this paper we study whether humans can provide useful explanations, or reasons, of why they think a query will be easy or difficult for a search engine. We run two experiments with variations in the TREC reference collection, the amount of information available about the query, and the method of annotation generation. We examine the correlation between the human prediction, the reasons they provide, the automatic prediction, and the actual system effectiveness. The main findings of this study are twofold. First, we confirm the result of previous studies stating that human predictions correlate only weakly with system effectiveness. Second, and probably more important, after analyzing the reasons given by the annotators we find that: (i) overall, the reasons seem coherent, sensible, and informative; (ii) humans have an accurate picture of some query or term characteristics; and (iii) yet, they cannot reliably predict system/query difficulty.
SegChainW2V: Towards a generic automatic video segmentation framework, based on lexical chains of audio transcriptions and word embeddings
20th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES2016
Keywords: Video retrieval, Story segmentation, Lexical chains, Word embeddings, Transcriptions
About The Publication
With the advances in multimedia broadcasting through a rich variety of channels and with the popularization of video production, it becomes essential to be able to provide reliable means of retrieving information within videos, not only the videos themselves. Research in this area has largely focused on the context of TV news broadcasts, for which the structure itself provides clues for story segmentation. The systematic employment of these clues would lead to thematically driven systems that would not be easily adaptable to videos of other types. Such systems are therefore dependent on the type of videos for which they have been designed. In this paper we introduce SegChainW2V, a generic unsupervised framework for story segmentation, based on lexical chains from transcriptions and their vectorization. SegChainW2V takes into account topic changes by perceiving the fluctuations of the most frequent terms throughout the video, as well as their semantics through word embedding vectorization.
SegChain: Towards a generic automatic video segmentation framework, based on lexical chains of audio transcriptions
6th International Conference on Web Intelligence, Mining and Semantics (WIMS'2016)
Keywords: Video retrieval, Story segmentation, Lexical chains, Transcriptions
About The Publication
With the advances in multimedia broadcasting through a rich variety of channels and with the popularization of video production, it becomes essential to be able to provide reliable means of retrieving information within videos, not only the videos themselves. Research in this area has largely focused on the context of TV news broadcasts, for which the structure itself provides clues for story segmentation. The systematic employment of these clues would lead to thematically driven systems that would not be easily adaptable to videos of other types. Such systems are therefore dependent on the type of videos for which they have been designed. In this paper we introduce SegChain, a generic unsupervised framework for story segmentation, based on lexical chains from transcriptions. SegChain takes into account topic changes by perceiving the fluctuations of the most frequent terms throughout the video.
DeShaTo: Describing the Shape of Cumulative Topic Distributions to Rank Retrieval Systems without Relevance Judgments
Symposium on String Processing and Information Retrieval (SPIRE 2015)
Keywords: information retrieval, topic modeling, LDA, document topic distribution, skewness, kurtosis, ranking retrieval systems
About The Publication
This paper investigates an approach for estimating the effectiveness of any IR system. The approach is based on the idea that a set of documents retrieved for a specific query is highly relevant if there are only a small number of predominant topics in the retrieved documents. The proposed approach is to determine the topic probability distribution of each document offline, using Latent Dirichlet Allocation. Then, for a retrieved set of documents, a set of probability distribution shape descriptors, namely the skewness and the kurtosis, are used to compute a score based on the shape of the cumulative topic distribution of the respective set of documents. The proposed model is termed DeShaTo, which is short for Describing the Shape of cumulative Topic distributions. In this work, DeShaTo is used to rank retrieval systems without relevance judgments. In most cases, the empirical results are better than the state-of-the-art approach. Compared to other approaches, DeShaTo works independently for each system. Therefore, it remains reliable even when there are fewer systems to be ranked by relevance.
Prédire l’intensité de contradiction dans les commentaires : faible, forte ou très forte ?
29es journées francophones d'Ingénierie des Connaissances (IC2018)
Keywords: Sentiment analysis, Aspect detection, Criteria evaluation, Contradiction intensity. (2nd best paper)
National Conference Papers
Prédire l’intensité de contradiction dans les commentaires : faible, forte ou très forte ?
About The Publication
Reviews of Web resources (e.g. courses, films) are increasingly exploited in text-analysis tasks (e.g. opinion detection, controversy detection). This article studies contradiction intensity in reviews by exploiting various criteria, such as the variation of ratings and the variation of polarities around specific entities (e.g. aspects, topics). First, aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns in the review collection. Second, the polarity of each review segment containing an aspect is estimated. Only resources whose reviews contain aspects with opposite polarities are considered. Finally, the criteria are evaluated, using attribute-selection algorithms, to determine their impact on the effectiveness of contradiction-intensity detection. The selected criteria are then fed into learning models to predict contradiction intensity. The experimental evaluation is carried out on a collection of 2244 courses and their 73873 reviews, collected from coursera.org. The results show that the variation of ratings, the variation of polarities, and the number of reviews are the best predictors of contradiction intensity. Moreover, J48 is the most effective learning approach for this task.
Détection de contradiction dans les commentaires
COnférence en Recherche d'Information et Applications (CORIA2017)
Keywords: Sentiment analysis, User generated content, Contradiction
National Conference Papers
Détection de contradiction dans les commentaires
About The Publication
Abstract:
The analysis of opinions (reviews) generated by users is increasingly exploited by a variety of applications. It makes it possible to follow the evolution of opinions or to carry out product surveys. Detecting contradictory opinions about a Web resource (e.g., courses, movies, products) is an important task for evaluating that resource. In this paper, we focus on the problem of detecting contradictions, and measuring their intensity, based on sentiment analysis around aspects specific to a resource (document). First, we identify aspects according to the distributions of emotional terms in the vicinity of the most frequent nouns across all reviews. Second, we estimate the polarity of each review segment containing an aspect. We then keep only the resources containing these aspects with opposite polarities (positive, negative). Third, we introduce a measure of contradiction intensity based on the joint dispersion of the polarity and the rating of the reviews containing the aspects within each resource. We evaluate the effectiveness of our approach on a Massive Open Online Courses (MOOC) collection containing 2244 courses and their 73873 reviews, collected from Coursera. Our results show that the proposed approach captures contradictions effectively.
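The intensity measure can be illustrated by a minimal sketch, assuming each review segment mentioning an aspect has already been assigned a polarity and carries the review's rating. The `contradiction_intensity` function and the product of variances are hypothetical stand-ins for the paper's joint-dispersion measure.

```python
from statistics import pvariance

def contradiction_intensity(reviews):
    """`reviews`: list of (polarity, rating) pairs for one aspect of
    one resource; polarity in [-1, 1], rating e.g. on a 1..5 scale.
    Intensity grows with the joint dispersion of both signals:
    opposite polarities AND divergent ratings => strong contradiction."""
    polarities = [p for p, _ in reviews]
    ratings = [r for _, r in reviews]
    return pvariance(polarities) * pvariance(ratings)
```

Reviews that agree (similar polarities, similar ratings) score near zero, while a resource splitting reviewers into opposed camps scores high.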
MyBestQuery: A serious game to collect manual query reformulation
Colloque Veille Stratégique Scientifique et Technologique (VSST 2016), Rabat (Morocco)
Keywords: Information retrieval, Query reformulation, Serious game, Human annotation
National Conference Papers
MyBestQuery: A serious game to collect manual query reformulation
About The Publication
This paper presents MyBestQuery, a serious game designed to collect query reformulations from players. Query reformulation is a hot topic in information retrieval and covers many aspects. One of them is query reformulation analysis based on users’ sessions: it can be used to understand users’ intent or to measure their satisfaction with the results obtained when querying the search engine. Automatic query reformulation is another aspect: it automatically expands the initial user query in order to improve the quality of the retrieved document set. This mechanism relies on document analysis but could also benefit from the analysis of manually reformulated queries. Web search engines collect millions of search sessions and possible query reformulations; as academics, we can hardly access this information. MyBestQuery is designed as a serious game in order to collect the various reformulations users suggest. The longer-term objective of this work is to analyse manually produced query reformulations and compare them with automatically produced ones. Preliminary results are reported in this paper.
MyBestQuery : un jeu sérieux pour apprendre des utilisateurs
Conférence francophone en Recherche d'Information et Applications (CORIA 2016), Toulouse
Keywords: Serious game, Crowdsourcing, User study, Search engine, Query annotation, User assistance
National Conference Papers
MyBestQuery : un jeu sérieux pour apprendre des utilisateurs
About The Publication
Abstract:
MyBestQuery is a serious game designed to collect information about queries submitted to a search engine: (i) the player's prediction of the query difficulty; (ii) possible reasons for this difficulty; (iii) suggested reformulations.
La prédiction efficace de la difficulté des requêtes : une tâche impossible ?
Conférence francophone en Recherche d'Information et Applications (CORIA 2015), Paris
Keywords: Information retrieval, query difficulty predictor, data mining, evaluation
National Conference Papers
La prédiction efficace de la difficulté des requêtes : une tâche impossible ?
About The Publication
Abstract:
Search engines return answers whatever the user query is, but some queries are more difficult than others for the system. For difficult queries, ad hoc treatments must be applied. Predicting query difficulty is therefore crucial, and various predictors have been proposed. In this paper, we revisit these predictors. First, we check that the predictors are not statistically redundant. Then, we show that the correlation between predictor values and system performance gives little hope of these predictors being truly effective. Finally, we study the ability of predictors to predict query difficulty classes, relying on a variety of exploratory and learning methods. We show that despite the (low) correlations observed with performance measures, current predictors are not robust enough to be used in practical IR applications.
Performance Analysis of Information Retrieval Systems
Spanish Conference on Information Retrieval, Coruna
Keywords: Information Retrieval, Classification, Query difficulty, Optimization, Random Forest, Adaptive Information Retrieval
National Conference Papers
Performance Analysis of Information Retrieval Systems
About The Publication
It has been shown that there is no single best information retrieval system configuration that works for every query; rather, performance can vary from one query to another. It would be interesting if a meta-system could decide which system should process a new query by learning from previously submitted queries. This paper reports a deep analysis of more than 80,000 search engine configurations applied to 100 queries and the corresponding performance. The goal of the analysis is to identify which search engine configuration responds best to a certain type of query. We considered two approaches to defining query types: one clusters queries according to their performance (their difficulty), while the other clusters queries using various query features (including query difficulty predictors). We identified two parameters that should be optimized first. An important outcome is that we could not obtain strongly conclusive results; considering the large number of systems and methods we used, this suggests that current query features do not fit the optimization problem.
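The meta-system idea, learning from past queries which configuration to apply, can be sketched as a nearest-neighbour router. The `route` function, the feature vectors, and the configuration names below are illustrative assumptions, not the paper's actual Random Forest analysis.

```python
def route(query_vec, past):
    """Pick a configuration for a new query: find the most similar
    past query (squared Euclidean distance over its feature vector,
    e.g. difficulty predictors) and reuse the configuration that
    performed best on it.
    `past`: list of (feature_vector, {config_name: performance})."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    feats, perfs = min(past, key=lambda q: dist(q[0], query_vec))
    return max(perfs, key=perfs.get)
```

With two past queries where different (hypothetical) configurations won, a new query is routed to whichever configuration dominated its neighbourhood.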
Expansion sélective de requêtes par apprentissage
Conférence francophone en Recherche d'Information et Applications (CORIA 2014), Nancy, France
Keywords: Selective information retrieval, Difficulty predictors, Query difficulty, Query expansion, Machine learning
National Conference Papers
Expansion sélective de requêtes par apprentissage
About The Publication
Abstract:
Query expansion (QE) improves retrieval quality on average, even though it can dramatically decrease performance for certain queries. This observation drives the trend to propose selective approaches that choose the best function to apply for each query. Most selective approaches use a learning process on past query features and results. This paper presents a new selective QE method that relies on query difficulty predictors, combining statistically and linguistically based predictors. The decision model is learned by an SVM. We demonstrate the efficiency of the proposed method on standard TREC benchmarks. The supervised learning models classified the test queries with more than 90% accuracy. Our approach improves MAP by more than 11%, compared to non-selective methods.
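The decision step, expand or not based on difficulty predictors, can be sketched with a tiny linear classifier. The paper uses an SVM over statistical and linguistic predictors; here a simple perceptron stands in for it, and the feature names and labels are hypothetical.

```python
def train_perceptron(X, y, epochs=50, lr=0.1):
    """Tiny linear classifier (perceptron) standing in for the SVM:
    decide, from difficulty-predictor features, whether query
    expansion should be applied (label 1) or not (label 0)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred                      # 0 when correct
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err
    return lambda x: 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Hypothetical features per query: (mean IDF, ambiguity score);
# label 1 means expansion helped on that past query.
```

Once trained on past queries, the returned function acts as the selective switch in front of the expansion module.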
Prédire la difficulté des requêtes : la combinaison de mesures statistiques et sémantiques
Conférence francophone en Recherche d'Information et Applications (CORIA 2013), Neuchatel, Suisse
Keywords: Information Retrieval, performance prediction, query difficulty, query ambiguity, combined predictors, measure correlation
National Conference Papers
Prédire la difficulté des requêtes : la combinaison de mesures statistiques et sémantiques
About The Publication
Abstract:
The performance of an Information Retrieval System (IRS) is closely related to the query. Queries that lead to retrieval failure are referred to in the literature as “difficult queries”. This study aims at analysing, adapting and combining several difficulty predictors. The evaluation of the prediction is based on the correlation between the predicted difficulty and the actual IRS performance. As predictors, we considered an ambiguity predictor, the IDF measure and a score distribution measure. We show that combining the proposed predictors produces good results. The evaluation framework consists of the TREC7 and TREC8 ad hoc collections.
Vers une personnalisation des environnements d’apprentissages à l’expérience émotionnelle de l’apprenant
ORPHEE RDV 2017, Font Romeu (France)
Workshop: Réalités mixtes, virtuelles et augmentées pour l'apprentissage : perspectives et challenges pour la conception, l'évaluation et le suivi
Position Papers
Vers une personnalisation des environnements d’apprentissages à l’expérience émotionnelle de l’apprenant
About The Publication
A learner's emotions play a decisive role in learning, strongly influencing their cognitive abilities (Lafortune et al., 2004; Cuisinier and Pons, 2011). Today, one of the major challenges for learning environments is to integrate a form of emotional intelligence (Mayer et al., 2001) that can automatically adapt learning to the learner's emotions (Harley et al., 2015; Ochs and Frasson, 2004). The issues underlying the creation of an “emotionally intelligent” learning environment are shared with Affective Computing (Picard, 2003):
- automatic emotion recognition;
- managing the user's emotions;
- expressing emotions through interactive systems (e.g. via the verbal and non-verbal behaviour of virtual characters or humanoid robots).
In this position paper, we focus on the first two points: recognising and managing the user's emotions. The objective is to model the learner's emotional experience (understanding the causes and effects of their emotions during the learning process) in order to adapt learning to the learner's automatically detected emotions and optimise knowledge acquisition. The underlying research questions and directions are described in the following section.
Presentation, Thesis research & SegChainW2V: Towards a Generic Automatic Video Segmentation Framework, based on Lexical Chains of Audio Transcriptions and Word Embeddings
Seminar (Séminaire d'accueil des enseignants-chercheurs de la FEG)
Other Publications