2020
Conferences
Francisco Rodrigues; Rinaldo Lima; William Domingues; Robson Fidalgo; Adrian Chifu; Bernard Espinasse; Sébastien Fournier
DeepNLPF: A Framework for Integrating Third Party NLP Tools
Proceedings of the 12th Language Resources and Evaluation Conference, LREC2020 2020.
Tags: Framework, Natural Language Processing, NLP Tools Integration
@conference{rodrigues2020deepnlpf,
title = {DeepNLPF: A Framework for Integrating Third Party NLP Tools},
author = {Francisco Rodrigues and Rinaldo Lima and William Domingues and Robson Fidalgo and Adrian Chifu and Bernard Espinasse and Sébastien Fournier},
url = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.895.pdf},
year = {2020},
date = {2020-05-11},
urldate = {2020-01-01},
booktitle = {Proceedings of the 12th Language Resources and Evaluation Conference},
pages = {7244--7251},
series = {LREC2020},
abstract = {Natural Language Processing (NLP) of textual data is usually broken down into a sequence of several subtasks, where the output of one of the subtasks becomes the input to the following one, which constitutes an NLP pipeline. Many third-party NLP tools are currently available, each performing distinct NLP subtasks. However, it is difficult to integrate several NLP toolkits into a pipeline due to many problems, including different input/output representations or formats, distinct programming languages, and tokenization issues. This paper presents DeepNLPF, a framework that enables easy integration of third-party NLP tools, allowing the user to preprocess natural language texts at lexical, syntactic, and semantic levels. The proposed framework also provides an API for complete pipeline customization, including the definition of input/output formats, integration plugin management, transparent multiprocessing execution strategies, corpus-level statistics, and database persistence. Furthermore, the user-friendly DeepNLPF GUI allows its use even by a non-expert NLP user. We conducted a runtime performance analysis showing that DeepNLPF not only easily integrates existing NLP toolkits but also significantly reduces processing time compared to executing the same NLP pipeline sequentially.},
keywords = {Framework, Natural Language Processing, NLP Tools Integration},
pubstate = {published},
tppubtype = {conference}
}