arxiv:1904.06472

A Repository of Conversational Datasets

Published on Apr 13, 2019

Abstract

Progress in machine learning is often driven by the availability of large datasets and of consistent evaluation metrics for comparing modeling approaches. To this end, we present a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The repository contains scripts that allow researchers to reproduce the standard datasets, or to adapt the pre-processing and data filtering steps to their needs. We introduce and evaluate several competitive baselines for conversational response selection, whose implementations are shared in the repository, as well as a neural encoder model that is trained on the entire training set.
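The abstract names '1-of-100 accuracy' without spelling out the computation. The sketch below is one plausible reading, not the repository's implementation. It assumes examples are grouped into random batches of 100 (context, response) pairs, that each context is scored against all 100 responses in its batch, and that a prediction counts as correct when the context's own response scores highest. The names `one_of_100_accuracy` and `score_fn` are hypothetical and introduced only for illustration.

```python
import random


def one_of_100_accuracy(score_fn, contexts, responses, batch_size=100, seed=0):
    """Estimate 1-of-N response selection accuracy (N = batch_size).

    Assumption: score_fn(context, response) returns a number, higher meaning
    a better match. contexts[i] and responses[i] form a true pair.
    """
    if len(contexts) != len(responses):
        raise ValueError("contexts and responses must have the same length")

    # Shuffle example indices so that each batch is a random sample.
    rng = random.Random(seed)
    indices = list(range(len(contexts)))
    rng.shuffle(indices)

    correct = 0
    total = 0
    # Only full batches are used; leftover examples are dropped.
    for start in range(0, len(indices) - batch_size + 1, batch_size):
        batch = indices[start:start + batch_size]
        for i in batch:
            # Score this context against every response in the batch.
            # The other batch_size - 1 responses act as negatives.
            scores = [score_fn(contexts[i], responses[j]) for j in batch]
            # On ties, max() keeps the earliest candidate in the batch.
            best = max(range(len(batch)), key=scores.__getitem__)
            if batch[best] == i:
                correct += 1
            total += 1

    return correct / total if total else 0.0
```

Under this reading, a scorer that cannot tell responses apart lands near 1%, so a reported figure is best compared against that chance level.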

