
Korea University Department of Public Administration


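Probabilistic Context-Free Grammar Rules Based on the Sejong Korean Treebank
Title	세종 구문분석 말뭉치를 기반으로 한 확률 문맥자유문법 규칙 (Probabilistic Context-Free Grammar Rules based on Sejong Korean Treebank)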
Content	최재웅, 송상헌, 전지은 (Korea University). 2008. 세종 구문분석 말뭉치를 기반으로 한 확률 문맥자유문법 규칙 (Probabilistic Context-Free Grammar Rules based on Sejong Korean Treebank). Language Information. Volume 9. 87-139.

The Sejong Korean Treebank (SKT) was built as part of the ten-year, government-sponsored Sejong project, and a parsed Korean corpus of more than 80 million graphic words was released to the public at the end of 2007. The purpose of this paper is to extract Context-Free Grammar (CFG) rules from SKT and to draw linguistic generalizations from those rules. We introduce the extraction algorithm used in this study and show that it meets the minimal requirements of an objective extraction method in terms of its precision and recall rates. Our discussion of the extracted CFG rules then proceeds in terms of the minimal tree structure, which contains a mother node (MN) and its two daughter nodes (left daughter node, LDN; right daughter node, RDN). We arrive at various linguistic or stochastic generalizations restricting the distribution of categories in the minimal tree structure for Korean, for example: 'In more than 95% of the cases that involve S, VP, NP, VNP, and AP, MN and RDN share the same category.' We provide most of the detailed statistical information regarding the basic properties of SKT and the CFG rules derived from it.

Keywords: Sejong Korean Treebank, probabilistic context-free grammar rules, corpus, frequency, parsed nodes, trees
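The abstract does not spell out the extraction algorithm, so the following is only a minimal sketch of the general idea: read bracketed trees, count each minimal tree structure as a triple of mother node (MN), left daughter node (LDN), and right daughter node (RDN), turn the counts into relative-frequency rule probabilities, and measure how often MN and RDN share a category. The bracket format, the three toy trees, and all function and variable names are assumptions made for illustration. They are not taken from SKT or from the paper, and the resulting numbers say nothing about the real corpus.

```python
from collections import Counter


def parse(text):
    """Parse a bracketed tree such as "(S (NP a) (VP b))" into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()


def binary_rules(tree):
    """Yield (MN, LDN, RDN) for every node with exactly two phrasal daughters."""
    label, children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if len(kids) == 2:
        yield (label, kids[0][0], kids[1][0])
    for kid in kids:
        yield from binary_rules(kid)


# Hypothetical toy trees with simplified labels (not SKT data).
TOY_TREES = [
    "(S (NP 나는) (VP (NP 밥을) (VP 먹었다)))",
    "(S (NP 그가) (VP (AP 빨리) (VP 달렸다)))",
    "(NP (VP 예쁜) (NP 꽃))",
]

# Rule frequencies over the whole toy corpus.
counts = Counter(r for t in TOY_TREES for r in binary_rules(parse(t)))

# Relative-frequency estimate: P(MN -> LDN RDN) = count(rule) / count(MN).
mother_totals = Counter()
for (mn, _, _), n in counts.items():
    mother_totals[mn] += n
pcfg = {rule: n / mother_totals[rule[0]] for rule, n in counts.items()}

# Share of minimal trees in which MN and RDN carry the same category.
total = sum(counts.values())
shared = sum(n for (mn, _, rdn), n in counts.items() if mn == rdn)
share_mn_rdn = shared / total

for (mn, ldn, rdn), p in sorted(pcfg.items()):
    print(f"{mn} -> {ldn} {rdn}\t{p:.2f}")
print(f"MN == RDN in {share_mn_rdn:.0%} of minimal trees")
```

Relative-frequency estimation is the simplest way to attach probabilities to extracted rules; a study on the real treebank would also need to handle function tags, unary nodes, and annotation errors, which this sketch ignores.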
Attachment	LI_vol9_5.pdf
