Go back Home | View on Github | EMNLP Paper |
Leaderboard
Baselines
We evaluate pairwise accuracy on each task (a random guess is expected to score 0.5).
Method | updown | depth | width |
---|---|---|---|
Length baseline | 0.531 | 0.543 | 0.591 |
Bag-of-words baseline | 0.571 | 0.583 | 0.596 |
Dialog ppl. | 0.488 | 0.508 | 0.513 |
Reverse dialog ppl. | 0.560 | 0.557 | 0.571 |
DialogRPT | 0.683 | 0.695 | 0.752 |
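
For readers preparing a submission, here is a minimal sketch of how a pairwise accuracy like the one in the table could be computed. It assumes each test example is a pair of model scores, with the score of the reply expected to rank higher listed first. The function name `pairwise_accuracy`, the input layout, and the choice to count ties as half correct are assumptions made for illustration, not the official evaluation script.

```python
from typing import Iterable, Tuple


def pairwise_accuracy(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of pairs in which the preferred reply gets the higher score.

    Each item is (score_preferred, score_other). Ties count as half
    correct; that tie rule is an assumption of this sketch.
    """
    correct = 0.0
    total = 0
    for preferred, other in score_pairs:
        total += 1
        if preferred > other:
            correct += 1.0
        elif preferred == other:
            correct += 0.5
    if total == 0:
        raise ValueError("need at least one pair")
    return correct / total


# Hypothetical scores: two correct rankings, one wrong, one tie.
pairs = [(0.9, 0.2), (0.4, 0.7), (0.6, 0.1), (0.5, 0.5)]
print(pairwise_accuracy(pairs))  # 0.625
```

Under this tie rule, a model that assigns the same score to every reply gets exactly 0.5, which matches the random-guess reference point.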
Submit new results!
Want to submit new results? Please create an issue!