Reddit Dialogue Feedback Dataset

A dataset to learn which dialogue response gets better human feedback.

View project on GitHub

Existing data-driven dialogue generation methods are mostly aimed to minimic the human response distribution. However, even human responses have different quality: some of them are popular, getting lots of upvoted, and some of them are less interesting or even offensive. So, we probably should optimize or predict human feedback, instead of just perplexity.

Which dialogue response gets better human feedback?

We leverage millions of Reddit human feedback data (number of upvotes👍 or replies💬) to formulate the following dialogue ranking tasks, to help building more engaging conversational AI.

See our EMNLP paper or view our project on Github for more details.

Task Description Training size
updown which comment gets more upvotes👍? 40.7 M
width which comment gets more direct replies💬? 22.3 M
depth which comment gets longer follow-up thread💬? 25.1 M

Download the Dataset and checkout the Leaderboard