Exploring the Possibility of Using ChatGPT in Writing Competency Assessment: Focusing on Essay-Type Writing Assessment

박소영; 홍유정; 이병윤

Korean Abstract

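This study explored the possibility of using ChatGPT in writing assessment. To this end, 47 essays written by university students on a single topic were assessed by humans and by ChatGPT, and the results were compared and analyzed. Agreement between the two evaluators varied by assessment domain and by individual assessment item, and was low overall. Of the 13 assessment items, five were significant: three in content, one in organization, and one in expression. Agreement between humans and ChatGPT tended to be significant when an item assessed the relevance between topic and content, or between paragraphs, rather than the qualitative aspects of content. An examination of the reasons ChatGPT gave for its scores showed that, when two criteria had to be considered at once, it applied different criteria from one assessment to the next, which lowered the consistency of its assessments; when scores were assigned on the basis of a count, it checked the text in considerably finer detail than humans did. In addition, when assessing the validity of content, it tended to expect relatively specific evidence to be presented. This study is distinctive in that it took as its object of analysis essay-type writing, in which writers logically express their own ideas, rather than writing that has a correct answer, and explored the areas that ChatGPT is able to assess for automated scoring. It is also significant in that it examined how ChatGPT interprets and applies writing assessment criteria.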

English Abstract

This study explored the possibility of using ChatGPT for writing skills assessment. Forty-seven essays written by undergraduate students on a given topic were assessed by both humans and ChatGPT, and the assessment results were compared and analyzed. The analysis showed varying degrees of agreement between the two evaluators depending on the assessment domain and the specific assessment item, with an overall low level of agreement. Among the 13 assessment items, three related to content, one to organization, and one to expression were significant. Agreement between human and ChatGPT assessments tended to be significant when assessing the relevance between topic and content or between paragraphs, rather than the qualitative aspects of content. Additionally, an examination of the reasons ChatGPT gave for its scores found that the consistency of its assessments decreased when two criteria needed to be considered simultaneously, because it applied different criteria from one assessment to the next. When scoring was based on a numerical count, ChatGPT scrutinized the text more meticulously than humans did. Moreover, when assessing the validity of content, it tended to expect relatively specific evidence to be presented. This study is distinctive in that it explores the areas ChatGPT can assess in discursive essays, which require the logical expression of one's own thoughts, rather than essays with definite answers. Furthermore, it is significant in examining how ChatGPT interprets and applies writing skills assessment criteria.

Keywords

Writing competency; essay-type assessment; ChatGPT; automated scoring; inter-rater reliability
