Bot with Big Personality

I’m working on a bot that has to keep a consistent personality while talking about anything. Right now it takes three inputs: a description of an environment, a description of the personality it should adopt, and the sentence a human says to it. It can already give varied replies. Its language is not yet as sharp as it could be, but I think it’s fun to interact with.
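To make the three inputs concrete, here is a minimal sketch of how they could be packed into a single text sequence before tokenization. This is only an illustration: the `BotInput` class, the `build_model_input` function, and the `<setting>`, `<persona>`, and `<human>` tags are hypothetical names made up for this example, and the actual model may well encode each input separately instead.

```python
from dataclasses import dataclass


@dataclass
class BotInput:
    """Hypothetical container for the three inputs described above."""
    setting: str    # description of the environment
    persona: str    # description of the personality to adopt
    utterance: str  # what the human says to the bot


def build_model_input(example: BotInput) -> str:
    """Join the three fields into one string, each marked with a made-up tag."""
    parts = [
        "<setting> " + example.setting.strip(),
        "<persona> " + example.persona.strip(),
        "<human> " + example.utterance.strip(),
    ]
    return " ".join(parts)


example = BotInput(
    setting="A small tavern at the edge of the forest.",
    persona="I am a cheerful innkeeper who loves guests.",
    utterance="Do you have a room for the night?",
)
model_input = build_model_input(example)
```

Keeping the persona and setting in every input is one simple way to give the model a chance to stay in character on each turn.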

In short, I trained a variant of the Transformer, loosely inspired by the Wizard of Wikipedia architecture, on the LIGHT dataset, since it was the closest match to the needs of the project. To sample from the probability distribution the model generates at inference time, I use top-p sampling, since according to several metrics it gives the most human-like responses, which was also my impression when I compared it with other options. All of this is done basically in TensorFlow. I wanted to share the result online quickly, so I used Anvil for that.
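For readers who have not seen top-p (nucleus) sampling before, here is a minimal sketch of the idea in plain Python rather than TensorFlow. It is not the code the bot runs: the function names and the default `p=0.9` are assumptions chosen for illustration. The idea is to keep the smallest set of most likely tokens whose cumulative probability reaches `p`, renormalize, and sample only from that set.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p, and renormalize their probabilities."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for token_id, prob in ranked:
        kept.append((token_id, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token_id: prob / total for token_id, prob in kept}


def sample_top_p(logits, p=0.9, rng=random):
    """Sample one token id from the nucleus of the distribution."""
    nucleus = top_p_filter(softmax(logits), p)
    token_ids = list(nucleus)
    weights = [nucleus[t] for t in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]


# Toy vocabulary of four tokens; the last two are very unlikely.
next_token = sample_top_p([2.0, 1.0, -5.0, -5.0], p=0.9)
```

Because the unlikely tail is cut off before sampling, the output stays varied without drifting into very improbable tokens, which fits the "human-like" impression mentioned above.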

I also made it produce an action to take and an emotion to show. Interestingly, when the persona description is friendly, it tries to hug everything, while if the description is not that friendly, it tries to hit everything! Better versions will come!

Things I will eventually do: use a larger model, train on more data, distill larger models pretrained on gigantic datasets, put it in the front end, make inference faster, and create a task that encourages the model to retrieve information from its persona, the conversation history, and the description of the environment.

Written on September 2, 2022