Hi,
thanks for providing the dataset as a download. I downloaded the dataset from the location mentioned in #12 (comment).
However, the format of the files in that dataset appears to differ from the files you get if you download the data yourself.
See this gist: the first file, 12092740.data, I downloaded myself from archive.org, while the second file was part of the downloaded dataset.
As you can see, the file I downloaded myself contains the attributes [XSUM]URL[XSUM], [XSUM]INTRODUCTION[XSUM] and [XSUM]RESTBODY[XSUM], but the file from the dataset has [SN]URL[SN], [SN]TITLE[SN], [SN]FIRST-SENTENCE[SN] and [SN]RESTBODY[SN].
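To make the difference concrete, here is a rough Python sketch that splits a file into its sections for either marker style. It is my own illustration, not code from the repository: `split_sections` is a hypothetical helper, and it assumes each marker appears as a literal `[PREFIX]NAME[PREFIX]` token with the section text following it until the next marker. The sample string is made up to mimic the layout.

```python
import re

# Matches markers such as [XSUM]URL[XSUM] or [SN]FIRST-SENTENCE[SN].
# Assumption: section names consist of uppercase letters and hyphens only.
MARKER = re.compile(r"\[(XSUM|SN)\]([A-Z-]+)\[\1\]")


def split_sections(text):
    """Return (prefix, {section name: section text}) for one .data file.

    prefix is "XSUM" or "SN", or None if the file has no markers
    or mixes both styles.
    """
    sections = {}
    prefixes = set()
    matches = list(MARKER.finditer(text))
    for i, m in enumerate(matches):
        # A section runs from the end of its marker to the next marker.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        prefixes.add(m.group(1))
        sections[m.group(2)] = text[m.end():end].strip()
    prefix = prefixes.pop() if len(prefixes) == 1 else None
    return prefix, sections


# Made-up sample in the [SN] style, for illustration only.
sample = (
    "[SN]URL[SN]\nhttp://example.com/a\n\n"
    "[SN]TITLE[SN]\nA title\n\n"
    "[SN]FIRST-SENTENCE[SN]\nFirst.\n\n"
    "[SN]RESTBODY[SN]\nRest."
)
prefix, sections = split_sections(sample)
print(prefix, sorted(sections))
```

Running this on both files from the gist should show which marker prefix and which section names each one uses.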
My problem is that when I follow the tutorial at https://github.com/EdinburghNLP/XSum/tree/master/XSum-Dataset, the scripts don't work with the unmodified files from the downloaded dataset.
Which changes do I need to make to the scripts?
Best,
Pyfisch