
Received: 22 August 2022                Revised: 07 November 2022  Accepted: 12 November 2022
DOI: 10.37917/ijeee.19.1.11
                                                                                               Vol. 19| Issue 1| June 2023

                                                                                               Open Access

Iraqi Journal for Electrical and Electronic Engineering

Original Article

Identifying Discourse Elements in Writing by
 Longformer for NER Token Classification

Alia Salih Alkabool 1, Sukaina Abdul Hussain Abdullah2, Sadiq Mahdi Zadeh2, Hani Mahfooz2
                                       1 University of Basrah, Basrah, Iraq

                                    2 Islamic Azad University, Isfahan, Iran

Correspondence
*Alia Salih Alkabool
University of Basrah, Basrah, Iraq

Email: aliasalihjali@gmail.com

Abstract
Current automatic writing feedback systems cannot distinguish between different discourse elements in students' writing. This
is a problem because, without this ability, the guidance these systems provide is too general for what students want to
achieve. This is cause for concern because automated writing feedback systems are a valuable tool for combating the decline in
student writing. According to the National Assessment of Educational Progress, less than 30 percent of high school graduates
are proficient writers. If we can improve automatic writing feedback systems, we can improve the quality of student writing and
stop the decline of skilled writers among students. Solutions to this problem have been proposed, the most popular being the
fine-tuning of Bidirectional Encoder Representations from Transformers (BERT) models to recognize various discourse elements
in students' written assignments. However, these methods have their drawbacks. For example, they do not compare the
strengths and weaknesses of different models, and they encourage training models over sequences (sentences) rather
than entire essays. In this article, I redesign the Persuasive Essays for Rating, Selecting, and Understanding
Argumentative and Discourse Elements (PERSUADE) corpus so that models can be trained on entire essays, and I apply the
Bidirectional Encoder Representations from Transformers (BERT), Long Document Transformer (Longformer), and Generative
Pre-trained Transformer 2 (GPT2) models to discourse classification, framed as a named entity recognition (NER) token
classification problem. Overall, the BERT model trained using my sequence-merging
preprocessing method outperforms the standard model by 17% and 41% in overall accuracy. I also found that the
Longformer model performed best in discourse classification, with an overall F1 score of 54%. However, the
increase in validation loss from 0.54 to 0.79 indicates that the model is overfitting. Because of this overfitting, some
improvements can still be made, such as implementing early stopping and including more examples of rare discourse elements
during training.
KEYWORDS: BERT - Bidirectional Encoder Representations from Transformers, NER - Named Entity Recognition,
Longformer – Long Document Transformer, GPT2 - Generative Pre-Trained Transformer 2, NLP - Natural Language
Processing, GSU - Georgia State University
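
To make the token-classification framing in the abstract concrete, the following is a minimal sketch of how span-level discourse annotations for a whole essay might be turned into one label per word. It is an illustration under stated assumptions, not the authors' preprocessing: the BIO tagging scheme, the function name spans_to_bio, the (start, end, label) span format, and the example labels "Position" and "Evidence" are all hypothetical choices made here.

```python
from typing import List, Tuple

# Hypothetical annotation format: (start_word, end_word_exclusive, label).
Span = Tuple[int, int, str]


def spans_to_bio(num_words: int, spans: List[Span]) -> List[str]:
    """Convert word-level span annotations into one BIO tag per word.

    Words outside every span get "O"; the first word of a span gets
    "B-<label>" and the remaining words of that span get "I-<label>".
    """
    tags = ["O"] * num_words
    for start, end, label in spans:
        if not (0 <= start < end <= num_words):
            raise ValueError(f"span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Illustrative two-sentence "essay" labeled as a whole, not per sentence.
essay = "Schools should start later . Teens need more sleep .".split()
spans = [(0, 5, "Position"), (5, 10, "Evidence")]
tags = spans_to_bio(len(essay), spans)

for word, tag in zip(essay, tags):
    print(f"{word}\t{tag}")
```

Labeling the full essay as a single sequence, rather than each sentence separately, is what allows a long-context model to see neighboring discourse elements when classifying each token.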

                        I. INTRODUCTION

    1) The importance of writing
   Having the ability to write clearly and concisely is a key skill for all careers. Individuals who are able to express their
thoughts and ideas have an advantage when writing business emails and proposals, or when opposing or supporting new policies.
The Source Expert website notes in their article 43 Why Writing Matters to Students: "There are a variety of ways to
communicate with others, but writing will always be part of your daily life" [1]. Although writing is an important human
skill, many students lack writing skills. The National Assessment of Educational Progress found that less than 30% of high
school graduates are proficient writers. They also showed that this problem is more acute in low-income communities, where
proficient writing rates are less than 15% [2]. As researchers at Georgia State University have pointed out, this problem is
primarily due to many schools, especially those in low-income communities, not having the resources to provide personalized
feedback on students' writing [3]. Fortunately, this problem can be addressed with automatic writing feedback. Automatic
writing feedback systems are programs that can analyze and critique students' writing while the teacher is away. These
programs are already popular in many applications, such as Microsoft Outlook's Autosuggest and Grammarly. In fact, Trey from
the website "apoven" describes how a writing feedback system like Grammarly can be used to expand one's vocabulary and
provide instant mini grammar lessons [4]. In

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and
reproduction in any medium, provided the original work is properly cited.

© 2023 The Authors. Published by Iraqi Journal for Electrical and Electronic Engineering by College of Engineering, University of Basrah.

https://doi.org/10.37917/ijeee.19.1.11                                                         https://www.ijeee.edu.iq 87