Week #4: We've Got Models To Train

Rui Oliveira
Mar 28, 2022
2 min read

Updated: Apr 25, 2022

1. Slot tagging and response generation

The team decided to start with the GPT-3 models, which handle the slot tagging, intent classification and response generation tasks. Emotion analysis was left out of this group: the teachers saw it as a good opportunity to implement a deep learning model from scratch, using LSTM/GRU architectures, and to apply the techniques studied in the theoretical classes.

The lack of data for these specific tasks also made GPT-3 an obvious choice: because the network is so large and was pretrained on such broad data, it is known to perform well even when provided with only a few examples.

By building a small dataset of a few dozen interactions between a client and a sales assistant, the team was able to train models that perform these tasks and generalize to scenarios outside the dataset with remarkable ease and effectiveness.
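
The post does not show what the dataset looked like or how the examples were passed to GPT-3, so the following is only a minimal sketch of one possible setup: a handful of annotated interactions assembled into a few-shot prompt for intent classification and slot tagging. The `Example` class, the intent and slot names, and the prompt layout are all hypothetical, and no API call is made here.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """One hypothetical client utterance with its annotated intent and slots."""
    utterance: str
    intent: str
    slots: dict[str, str] = field(default_factory=dict)


def format_example(example: Example) -> str:
    """Render one annotated example as a block of the prompt."""
    slots = ", ".join(f"{k}={v}" for k, v in example.slots.items()) or "none"
    return (
        f"Client: {example.utterance}\n"
        f"Intent: {example.intent}\n"
        f"Slots: {slots}"
    )


def build_prompt(examples: list[Example], new_utterance: str) -> str:
    """Join the annotated examples and leave the last intent open for the model."""
    shots = "\n\n".join(format_example(e) for e in examples)
    return f"{shots}\n\nClient: {new_utterance}\nIntent:"


# Illustrative examples only; these are not taken from the team's dataset.
EXAMPLES = [
    Example(
        "I'm looking for a red jacket under 80 euros",
        "search_product",
        {"product": "jacket", "color": "red", "max_price": "80"},
    ),
    Example("Thanks, that's all for today", "end_conversation"),
]

prompt = build_prompt(EXAMPLES, "Do you have running shoes in size 42?")
print(prompt)
```

The resulting string would be sent to the language model as the prompt, and the completion would be parsed back into an intent and a set of slots. The same annotated examples could instead be used for fine-tuning if more data became available.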

2. Emotion analysis

As stated previously, the emotion analysis task was seen as a good opportunity to apply the techniques taught in the theoretical classes.

Despite this, the team had trouble finding a dataset suitable for training a neural network that met its expectations: the candidates were too small, had imbalanced classes, or fell short for other reasons.

The purpose of the emotion analysis is to adapt the chat bot to the client based on the feedback received during the conversation. The team therefore decided to simplify the problem and train a sentiment analysis model instead (i.e. one that classifies feedback as positive or negative), a task to be tackled next week.
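
One way to picture this simplification is to collapse the emotion labels of an existing dataset into two sentiment classes. The sketch below assumes a hypothetical label set and mapping, since the post does not say which datasets or emotions were considered. It relabels each sample, drops emotions with no clear polarity, and counts the resulting classes so that any remaining imbalance is visible.

```python
from collections import Counter

# Hypothetical mapping from emotion labels to sentiment polarity.
EMOTION_TO_SENTIMENT = {
    "joy": "positive",
    "love": "positive",
    "anger": "negative",
    "sadness": "negative",
    "fear": "negative",
}


def to_sentiment(samples):
    """Relabel (text, emotion) pairs as (text, sentiment) pairs.

    Samples whose emotion has no entry in the mapping (for example
    "surprise") are dropped rather than forced into a class.
    """
    return [
        (text, EMOTION_TO_SENTIMENT[emotion])
        for text, emotion in samples
        if emotion in EMOTION_TO_SENTIMENT
    ]


# Illustrative samples only; these are not from a real dataset.
samples = [
    ("I love this phone", "joy"),
    ("This is not what I asked for", "anger"),
    ("Oh, I did not expect that", "surprise"),
    ("That price is disappointing", "sadness"),
]

binary = to_sentiment(samples)
counts = Counter(label for _, label in binary)
print(counts)
```

Merging several emotions into each class also means more examples per class, which may ease the dataset size problem described above.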

3. Scientific article

This week, the team was asked to submit an intermediate version of the scientific article, containing the state of the art and a brief overview of the solution. Having written the introduction last week, the team focused on these two chapters this week.

4. Week retrospective

By the end of the week, the results were not entirely as expected, since the emotion analysis model did not work as well as anticipated. Even so, the team's overall assessment is positive: the GPT-3 models worked better than expected and progress was made on the scientific article.

Next week, the team expects better results from the sentiment analysis model. It will also start working on the recommendation engine, a key component of the chat bot: it is responsible for recommending products to the client, which is the main goal of the digital sales assistant.
