1. GPT-3 model
As stated in last week's post, the team decided to study the use of the GPT-3 neural network to implement the digital sales assistant, helping with tasks such as slot tagging in a user's utterance and generating the chatbot's responses to the user.
During this week, the team studied the OpenAI API for the GPT-3 model, learning how to leverage its features and make the most of this state-of-the-art model in order to implement as natural a chatbot as possible.
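As an illustration, a slot-tagging request could be assembled as a few-shot prompt for the GPT-3 completions endpoint. The sketch below is only an assumption of how such a request might be built; the example utterances, slot names, and helper function are hypothetical, not the team's actual implementation.

```python
# Sketch: build a few-shot slot-tagging prompt for a GPT-3 completion call.
# The example utterances, slot labels, and parameter values are
# illustrative assumptions, not the team's actual prompts.

def build_slot_tagging_request(utterance, examples, engine="davinci"):
    """Assemble the parameters for a GPT-3 completion call.

    `examples` is a list of (utterance, tagged_slots) pairs used as
    few-shot demonstrations, e.g. ("I want a red shirt",
    "colour=red; item=shirt").
    """
    shots = "\n".join(f"Utterance: {u}\nSlots: {s}" for u, s in examples)
    prompt = f"{shots}\nUtterance: {utterance}\nSlots:"
    return {
        "engine": engine,
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.0,  # deterministic output suits tagging
        "stop": ["\n"],      # stop after the slot line
    }

# The resulting dict would be passed to the OpenAI completions API.
request = build_slot_tagging_request(
    "show me blue jeans",
    [("I want a red shirt", "colour=red; item=shirt")],
)
```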
2. Datasets
Regarding datasets, the team acknowledged that this challenge will require more than one dataset: some datasets will be needed to train the neural networks themselves, while another, containing a store catalogue of products, will be needed to demonstrate the solution.
For the former, the team understood that GPT-3, being a very general and high-performing model even in zero- and one-shot settings, will not need a large dataset; a few dozen or a few hundred examples should suffice to make the model perform as intended.
In case the team identifies the need to train another neural network other than GPT-3, the need for a proper large dataset will have to be reconsidered.
Regarding the dataset for demonstration purposes, the team settled on a clothes store as the best fit for this challenge, since such data is relatively easy to obtain and clothes have varied properties that are interesting to explore in the demonstration.
That said, the team found that while many clothes datasets exist, they are usually specific to a single category of clothing (e.g. datasets containing only shoes, accessories, or jewellery). As such, the team had to collect several datasets and implement an ETL (Extract, Transform and Load) pipeline to merge them into a single one.
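The "Transform" step of such a pipeline could look like the sketch below, which normalises rows from category-specific sources into one unified catalogue schema. The column names and target schema are assumptions for illustration, not the team's actual datasets.

```python
# Sketch of the "Transform" step of an ETL pipeline: map rows from
# heterogeneous clothes datasets onto one unified catalogue schema.
# Column names and the target schema are illustrative assumptions.

UNIFIED_FIELDS = ("name", "category", "colour", "price")

def transform(rows, column_map, category):
    """Map each source row onto the unified schema.

    `column_map` maps unified field names to the source dataset's
    column names; fields absent from the source default to None.
    """
    unified = []
    for row in rows:
        item = {f: row.get(column_map.get(f, "")) for f in UNIFIED_FIELDS}
        item["category"] = category  # each source covers one category
        unified.append(item)
    return unified

# Two category-specific sources with differing column names:
shoes = [{"title": "Runner X", "color": "white", "cost": "59.90"}]
jewellery = [{"product": "Silver Ring", "colour": "silver", "price": "19.99"}]

catalogue = (
    transform(shoes, {"name": "title", "colour": "color", "price": "cost"}, "shoes")
    + transform(jewellery, {"name": "product", "colour": "colour", "price": "price"}, "jewellery")
)
```

The per-source column maps keep the merge logic in one place, so adding another category dataset only requires a new mapping rather than new code.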
3. Web application
The neural networks alone cannot fully solve the problem at hand, so it is necessary to implement a back-end service that integrates the several modules composing the solution and exposes an API for the front-end, the latter being implemented for demonstration purposes.
As it stands, the team has designed the following back-end service and respective modules, described in the diagram presented below.
In the above diagram, it is possible to visualise the integration between the previously discussed neural networks and the Dialogue Manager and Recommendation Engine. The former is the module that stores and manages what the user is looking for, their emotion, and their intent, among other attributes that may help the chatbot interact with the user naturally and intelligently; the latter is the module responsible for producing the list of products the user may be interested in.
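The division of responsibilities between these two modules can be sketched as below. The state attributes and filtering logic are hypothetical, meant only to illustrate how the Dialogue Manager's state could drive the Recommendation Engine.

```python
# Sketch of the Dialogue Manager / Recommendation Engine split.
# State attributes and the filtering logic are illustrative assumptions.

class DialogueManager:
    """Tracks what the user is looking for across the conversation."""

    def __init__(self):
        self.state = {"intent": None, "emotion": None, "slots": {}}

    def update(self, intent=None, emotion=None, **slots):
        if intent:
            self.state["intent"] = intent
        if emotion:
            self.state["emotion"] = emotion
        self.state["slots"].update(slots)


class RecommendationEngine:
    """Produces the products matching the current dialogue state."""

    def __init__(self, catalogue):
        self.catalogue = catalogue

    def recommend(self, state):
        slots = state["slots"]
        return [
            product for product in self.catalogue
            if all(product.get(k) == v for k, v in slots.items())
        ]


catalogue = [
    {"name": "Runner X", "category": "shoes", "colour": "white"},
    {"name": "Blue Jeans", "category": "trousers", "colour": "blue"},
]
dm = DialogueManager()
dm.update(intent="search", emotion="neutral", colour="blue")
results = RecommendationEngine(catalogue).recommend(dm.state)
```

Keeping all conversation state in the Dialogue Manager means the Recommendation Engine stays stateless and can be queried at any turn with the latest slots.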
4. Scientific article
Having settled on the challenge's subject and started its implementation, the team decided it was time to also start working on the scientific article, beginning with its introduction and continuing next week with the state of the art, which was already briefly presented in last week's post.
5. Week retrospective
In retrospect, this week was rather productive, with progress made on several fronts, from the GPT-3 model investigation and dataset gathering to the back-end implementation and the writing of the scientific article.
Next week, the team expects to continue this positive performance and make further advancements in the implementation of the digital sales assistant, as well as the scientific article that accompanies it.