Microsoft Research Question-Answering Corpus

This download consists of data only: a text file containing 1.4K questions aimed at the text of Encarta 98, the full text of Encarta 98, and a set of human annotations identifying pieces of text in Encarta that fully or partially answer each question. The annotations also record the precise nature of each match, such as whether the linguistic forms of the question and the answer are similar. The annotation data has been split in two different ways to support different algorithm-training methodologies: 1) 10 files, each containing 10 percent of the original 1.4K questions along with the full set of answers for each of those questions, and 2) 10 files, each containing 10 percent of the full, pooled set of 10K+ question/answer pairs.
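The difference between the two splits can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the corpus: the `(question_id, answer_id)` tuple layout, the function names, the toy data, and the use of a seeded random shuffle are all assumptions, and the actual file format of the annotation files is not described here.

```python
import random
from collections import defaultdict


def split_by_question(pairs, n_folds=10, seed=0):
    """Scheme 1: partition the questions; every answer for a question
    lands in the same fold as that question."""
    by_question = defaultdict(list)
    for question_id, answer_id in pairs:
        by_question[question_id].append((question_id, answer_id))
    questions = sorted(by_question)
    random.Random(seed).shuffle(questions)  # assumed: random assignment
    folds = [[] for _ in range(n_folds)]
    for i, question_id in enumerate(questions):
        folds[i % n_folds].extend(by_question[question_id])
    return folds


def split_by_pair(pairs, n_folds=10, seed=0):
    """Scheme 2: pool all question/answer pairs and partition the pool,
    so answers to one question may be spread across several folds."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment
    return [shuffled[i::n_folds] for i in range(n_folds)]


# Hypothetical toy data: 20 questions with 3 annotated answers each.
toy_pairs = [(f"q{q}", f"a{q}_{k}") for q in range(20) for k in range(3)]
question_folds = split_by_question(toy_pairs)
pair_folds = split_by_pair(toy_pairs)
```

Under the first scheme a held-out fold contains only questions never seen in the other folds, which suits evaluating generalization to new questions. Under the second scheme the folds are balanced by pair count, but the same question can appear in both training and held-out folds.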

Download Details

File Name: MSR Encarta QA Corpus.msi
Version: 1.0.0
Date Published: 13 November 2008
Download Size: 36.76 MB
Note: By installing, copying, or otherwise using this software, you agree to be bound by the terms of its license.

