Funding Program: Own funding through CONSERT lab resources
Topic: Development and training of Natural Language Processing Agents through the use of Deep Reinforcement Learning
Period: 01/02/2021 – 31/01/2022
Total Cost: 6.555 €
Role in the Project: Internal CoNSeRT project
Description: The objective of this project is to train a Natural Language Processing (NLP) algorithm to generate text based on the sparse rewards produced by a Deep Reinforcement Learning (DRL) model. In particular, a Transformer-based Natural Language Generation (NLG) model (e.g., GPT-2) will be used to create text. At the end of each sentence, another Transformer-based model fine-tuned on a specific task (e.g., RoBERTa on Sentiment Analysis) will evaluate whether the goal has been accomplished (e.g., whether the NLG model has produced a positive comment). Using this pipeline, the reward or penalty assigned by the evaluator will be backpropagated to the weights of the NLG model through a DRL algorithm such as Proximal Policy Optimization (PPO).
This approach will be particularly useful for augmenting textual data in tasks with few annotated examples, or for goal-oriented chatbots that must accomplish an objective, such as booking a restaurant.
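The sketch below illustrates the intended generate-evaluate-update loop, assuming the Hugging Face transformers library; the "gpt2" checkpoint, the default sentiment-analysis pipeline (a DistilBERT model standing in for the RoBERTa evaluator), and the single REINFORCE-style update shown in place of a full PPO step are illustrative assumptions, not the project's final implementation.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, pipeline

generator_tok = GPT2Tokenizer.from_pretrained("gpt2")
generator = GPT2LMHeadModel.from_pretrained("gpt2")
evaluator = pipeline("sentiment-analysis")  # task-specific classifier acting as the reward model

optimizer = torch.optim.Adam(generator.parameters(), lr=1e-5)

prompt = "The restaurant was"
inputs = generator_tok(prompt, return_tensors="pt")

# 1. Generate a sentence with the NLG model (sampling, hence sparse, noisy rewards).
sample = generator.generate(**inputs, do_sample=True, max_new_tokens=20,
                            pad_token_id=generator_tok.eos_token_id)
text = generator_tok.decode(sample[0], skip_special_tokens=True)

# 2. Evaluate the finished sentence with the fine-tuned classifier.
result = evaluator(text)[0]
reward = result["score"] if result["label"] == "POSITIVE" else -result["score"]

# 3. Backpropagate the reward into the generator's weights.
#    Scaling the language-modelling loss by the scalar reward raises the likelihood
#    of positively rewarded text and lowers it otherwise; PPO would replace this step.
outputs = generator(sample, labels=sample)
loss = reward * outputs.loss
optimizer.zero_grad()
loss.backward()
optimizer.step()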
Objectives:
- Goal-based Natural Language Generation
- Textual data augmentation