Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

RLSNLP/Sum4Simp

Repository files navigation

Sum4Simp

Codes and the dataset for the paper "Exploiting Summarization Data to Help Text Simplification"(https://aclanthology.org/2023.eacl-main.3.pdf)

The S4S dataset is a stardard sentence simplification dataset mentioned in the paper. You could also mix them all for data augmentation.

If you want to obtain the aligned sentence pairs yourself, you should download the CNN and DM datasets at first. Then, you need to run 'python align.py'.

If you want to filter the suitable sentence pairs from the aligned pairs, you should calculate the attribute values at first. We have upload some example files (for WikiLarge) and you could run 'python filter.py' and check out the total scores. You could set a threshold to filter the pairs you need.

If you have any questions, please contact us: sunrenliangpku@gmail.com

About

Codes and the dataset for the paper "Exploiting Summarization Data to Help Text Simplification"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

Morty Proxy This is a proxified and sanitized view of the page, visit original site.