Page 95 - IJEEE-2023-Vol19-ISSUE-1
Alkabool, Abdullah, Zadeh, & Mahfooz | 91
that giving the model the full paper allows it to perform
well on uncommon categories. However, some categories,
such as rebuttals and counterclaims, may require additional
training examples.
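Since rebuttal and counterclaim examples are scarce, one low-effort mitigation (not used in the paper, offered here purely as an illustration) is to oversample the rare classes before fine-tuning so that every discourse element is seen equally often. A minimal sketch, assuming the data were available as simple (text, label) pairs:

```python
import random
from collections import Counter

def oversample_rare_classes(examples, seed=0):
    """Balance a labelled dataset by duplicating examples of
    under-represented classes (e.g. Rebuttal, Counterclaim) until
    every class matches the size of the largest one.

    `examples` is a list of (text, label) pairs -- a hypothetical
    input format, not the paper's actual data pipeline."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Re-draw existing examples at random to fill the gap.
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced
```

Duplicating examples is the crudest form of augmentation; paraphrasing or back-translation would add more variety, but even naive oversampling keeps rare classes from being drowned out during fine-tuning.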
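The next subsection reports that validation loss begins to rise after epoch 3 and recommends early stopping. As a rough illustration of that recipe, a generic early-stopping loop might look as follows; `train_epoch` and `evaluate` are hypothetical callables standing in for the real training and validation steps, not the authors' code:

```python
def train_with_early_stopping(train_epoch, evaluate, max_epochs=7, patience=1):
    """Stop training once validation loss has failed to improve for
    more than `patience` consecutive epochs.

    `train_epoch()` runs one pass over the training data and
    `evaluate()` returns the current validation loss; both are
    hypothetical stand-ins for a real training harness."""
    best_loss = float("inf")
    bad_epochs = 0
    stopped_at = 0
    for epoch in range(1, max_epochs + 1):
        train_epoch()
        val_loss = evaluate()
        stopped_at = epoch
        if val_loss < best_loss:
            best_loss, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs > patience:
                break  # validation loss keeps rising: likely overfitting
    return best_loss, stopped_at
```

With a loss curve like the one reported below (a minimum of 0.54 at epoch 3, then climbing), this loop would halt a few epochs past the minimum and keep the best checkpointed loss, rather than training all the way to epoch 7.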
Fig. 4: Average F-1 scores across all models (except Baseline) for each discourse element.

4) Increasing validation loss

All models began to show an increase in validation loss after epoch 3; for example, the top-performing Longformer model's validation loss rose from 0.54 to 0.79 over epochs 3 to 7 (Fig. 5). According to the Javatpoint article "Overfitting in Machine Learning" [22], a telltale sign of an overfit model is increasing validation error during training, and one way to prevent it is to stop training early. Figure 6 shows the F-1 scores during model training. As defined by the Elite Data Science website, early stopping is the process of "...stopping the training process before the learner passes that point [the point where variance starts to increase]..." [23]. I believe I should implement early stopping for my model around the 2nd or 3rd epoch, because that is when the variance starts to increase. Another approach I could try is to augment the examples during the training phase. According to Xiaoshuang Shi in his article "The Problem of Overfitting and How to Resolve It" [24], providing more training examples is a good way to address overfitting. In particular, I should provide articles with many examples of counterclaims and rebuttals, because that is where my model's performance is weakest. My work here shows that when fine-tuning a model to classify discourse elements, more emphasis should be placed on obtaining more examples rather than on training the model for a large number of epochs.

Fig. 5: Validation loss during model training.

Fig. 6: F-1 scores during model training.

IV. CONCLUSION

In conclusion, writing is an important skill, and it is vital for young people to develop their writing abilities. By using an automated writing feedback system, we can help students develop their writing by providing a detailed analysis of it. One way to improve current automated writing feedback systems is to combine them with machine learning models that differentiate between the writing elements in student essays. In this experiment, I show that the Longformer model outperforms the BERT and GPT-2 models in discourse classification. I also show how the entire article guides the model during fine-tuning to learn positional relationships between utterance elements, especially for the Lead and Final Statement classes. However, positional encoding alone does not solve the discourse classification problem, and more attention needs to be paid to acquiring more categories of data, such as rebuttals or counterclaims, to improve the overall results.

CONFLICT OF INTEREST

The authors have no conflict of interest relevant to this article.

REFERENCES

[1] I. Yulianawati, “Self-Efficacy and Writing : A Case
Study at A Senior High School in Indonesian EFL Setting,”
Vis. J. Lang. Foreign Lang. Learn., vol. 8, no. 1, pp. 79–
101, 2019.
[2] L. Darling-Hammond, “Teacher quality and student
achievement: A review of state policy evidence,” Educ.
Policy Anal. Arch., vol. 8, no. November 1999, 2000.
[3] Trey, “5 reasons to use grammarly,” Oct 2019. [Online].
Available: https://www.apoven.com/grammarly-benefits/
[4] T. N. Fitria, “Grammarly as AI-powered English Writing
Assistant: Students’ Alternative for Writing English,”
Metathesis J. English Lang. Lit. Teach., vol. 5, no. 1, p. 65,
2021.
[5] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova,
“BERT: Pre-training of deep bidirectional transformers for
language understanding,” NAACL HLT 2019 - 2019 Conf.
North Am. Chapter Assoc. Comput. Linguist. Hum. Lang.
Technol. - Proc. Conf., vol. 1, pp. 4171–4186,
2019.
[6] I. Beltagy, M. E. Peters, and A. Cohan, “Longformer: The