[ACL 2023] A Text Editing Repository for reproduction and innovation.


VisualJoyce/TERepo

TERepo


A Text Editing Repository for reproduction and innovation.

GEC

This repo contains the code for the following paper, accepted to Findings of ACL 2023:

@inproceedings{tan-etal-2023-focal,
    title = "Focal Training and Tagger Decouple for Grammatical Error Correction",
    author = "Tan, Minghuan  and
      Yang, Min  and
      Xu, Ruifeng",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    abstract = "In this paper, we investigate how to improve tagging-based Grammatical Error Correction models. We address two issues of current tagging-based approaches, label imbalance issue, and tagging entanglement issue. Then we propose to down-weight the loss of correctly classified labels using Focal Loss and decouple the error detection layer from the label tagging layer through an extra self-attention-based matching module. Experiments on three recent Chinese Grammatical Error Correction datasets show that our proposed methods are effective. We further analyze choices of hyper-parameters for Focal Loss and inference tweaking.",
}

Preprocessing

The datasets used are publicly available online.

  • MuCGEC
  • FCGEC
  • MCSCSet

For example, to preprocess MuCGEC:

PYTHONPATH=src python3.9 preprocess/text_editing/mucgec_to_wds.py \
--input_dir data/datasets/MuCGEC --output_dir data/annotations/text_editing/zh/mucgec

Training

bash examples/gector/train.sh gector_focal aihijo/gec-zh-gector-bert-large 0 mucgec "--focal_gamma=2"
bash examples/gector/train.sh gector_focal aihijo/gec-zh-gector-bert-large 0 fcgec "--focal_gamma=2"
bash examples/gector/train.sh gector_focal aihijo/gec-zh-gector-bert-large 0 mcscset "--focal_gamma=2"
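
The `--focal_gamma=2` argument above presumably sets the focusing parameter γ of Focal Loss, which the paper uses to down-weight the loss of correctly classified labels. As a rough illustration of that idea, here is a minimal pure-Python sketch of the standard formulation FL(p_t) = -(1 - p_t)^γ · log(p_t). It is not the repository's implementation; the function names `softmax`, `focal_loss`, and `cross_entropy` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal Loss for one token: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the gold label `target`.
    gamma = 0 reduces this to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as Focal Loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


if __name__ == "__main__":
    # Hypothetical 3-label example: label 0 is the gold tag in both cases.
    easy = [5.0, 0.0, 0.0]  # confidently correct prediction
    hard = [0.0, 0.0, 0.0]  # uncertain prediction

    for name, logits in (("easy", easy), ("hard", hard)):
        ce = cross_entropy(logits, 0)
        fl = focal_loss(logits, 0, gamma=2.0)
        print(f"{name}: CE={ce:.6f} FL={fl:.6f} FL/CE={fl / ce:.6f}")
```

With γ = 2, the confidently correct token keeps only a tiny fraction of its cross-entropy loss, while the uncertain token keeps a much larger share. This matters in tagging-based GEC because most tokens carry the majority "keep" label and would otherwise dominate the training signal.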

Predicting

Submission
