Vol. 19 No. 1 (2023)

Published: June 30, 2023

Pages: 87-92

Original Article

Identifying Discourse Elements in Writing by Longformer for NER Token Classification

Abstract

Current automated writing feedback systems cannot distinguish between the different discourse elements in students' writing. This is a problem because, without this ability, the guidance these systems provide is too generic to address what students are actually trying to achieve. This is cause for concern because automated writing feedback systems are a valuable tool for combating the decline in student writing: according to the National Assessment of Educational Progress, fewer than 30 percent of high school graduates are proficient writers. If we can improve automated writing feedback systems, we can improve the quality of student writing and stem the decline of skilled writers among students. Solutions to this problem have been proposed, the most popular being the fine-tuning of Bidirectional Encoder Representations from Transformers (BERT) models to recognize the various discourse elements in students' written assignments. However, these methods have their drawbacks. For example, they do not compare the strengths and weaknesses of different models, and they train models on individual sequences (sentences) rather than on entire essays. In this article, I restructure the Persuasive Essays for Rating, Selecting, and Understanding Argumentative and Discourse Elements (PERSUADE) corpus so that models can be trained on entire essays, and I fine-tune BERT, the Long-Document Transformer (Longformer), and the Generative Pre-trained Transformer 2 (GPT-2) models for discourse classification framed as a named entity recognition (NER) token classification problem. Overall, the BERT model trained with my sequence-merging preprocessing method outperforms the standard model by 17% and 41% in overall accuracy. I also found that the Longformer model performed best at discourse classification, with an overall F1 score of 54%. However, an increase in validation loss from 0.54 to 0.79 indicates that the model is overfitting. Because of this overfitting, further improvements can still be made, such as implementing early stopping and providing additional training examples of rare discourse elements.
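The abstract frames discourse-element identification as NER-style token classification over whole essays, with early stopping suggested as a remedy for overfitting. The sketch below is a minimal illustration of that setup using the Hugging Face Transformers library: a Longformer fine-tuned for token classification on BIO-tagged essay words, with an early-stopping callback. The label inventory, toy essay, checkpoint name (allenai/longformer-base-4096), and hyperparameters are illustrative assumptions, not the paper's actual data pipeline or configuration.

```python
# Minimal sketch: Longformer token classification for discourse elements,
# trained on whole essays with early stopping. Labels and data are toy assumptions.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification, TrainingArguments,
                          Trainer, EarlyStoppingCallback)

# Assumed BIO tag set over seven PERSUADE-style discourse elements.
elements = ["Lead", "Position", "Claim", "Counterclaim",
            "Rebuttal", "Evidence", "Concluding Statement"]
labels = ["O"] + [f"{p}-{e}" for e in elements for p in ("B", "I")]
label2id = {l: i for i, l in enumerate(labels)}
id2label = {i: l for l, i in label2id.items()}

model_name = "allenai/longformer-base-4096"  # long context lets whole essays fit
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels), id2label=id2label, label2id=label2id)

# Toy "merged" essay: word-level BIO tags over a whole document rather than
# per-sentence sequences (stand-in for the sequence-merging preprocessing).
words = ["School", "uniforms", "should", "be", "required", "because",
         "they", "reduce", "distractions", "."]
word_tags = ["B-Position", "I-Position", "I-Position", "I-Position", "I-Position",
             "B-Claim", "I-Claim", "I-Claim", "I-Claim", "O"]

def encode(example):
    # Tokenize pre-split words and align word-level tags to subword tokens.
    enc = tokenizer(example["words"], is_split_into_words=True,
                    truncation=True, max_length=4096)
    tags = example["tags"]
    # Special tokens get -100 so the loss ignores them.
    enc["labels"] = [-100 if wid is None else label2id[tags[wid]]
                     for wid in enc.word_ids()]
    return enc

data = Dataset.from_dict({"words": [words], "tags": [word_tags]}).map(
    encode, remove_columns=["words", "tags"])

args = TrainingArguments(
    output_dir="longformer-discourse",
    evaluation_strategy="epoch",       # named eval_strategy in newer releases
    save_strategy="epoch",
    load_best_model_at_end=True,       # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=5,
    per_device_train_batch_size=1,
    learning_rate=2e-5,
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    eval_dataset=data,                 # toy reuse; a real held-out split is assumed
    data_collator=DataCollatorForTokenClassification(tokenizer),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```

In a real run, the training and evaluation sets would come from the restructured corpus of whole essays, and training would stop once validation loss stops improving, addressing the overfitting signal (validation loss rising from 0.54 to 0.79) described above.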
