http://dla.library.upenn.edu/dla/olac/record.html?id=www_ldc_upenn_edu_LDC2016T13
Chinese Treebank 9.0 consists of approximately two million words of annotated and parsed text from Chinese newswire, government documents, magazine articles, various broadcast news and broadcast conversation programs, web newsgroups, weblogs, discussion forums, chat messages, and transcribed conversational telephone speech. ...
GitHub - baidu/DDParser: Baidu's open-source dependency parsing system
... English treebank (ECTB). Both treebanks are segmented, POS-tagged, and syntactically annotated. A particular feature of the CTB data is that, before treebanking, the source Chinese text is segmented into leaf tokens according to the word segmentation scheme proposed by the Penn Chinese Treebank team (Xue et al., 2005).
This document describes the segmentation guidelines for the Penn Chinese Treebank Project. The goal of the project is the creation of a 100,000-word corpus of Mandarin Chinese text with syntactic bracketing. The Chinese Treebank has been released via the Linguistic Data Consortium (LDC) and is available to the public.
University of Pennsylvania ScholarlyCommons
Chinese PropBank has had three releases so far. It adds predicate-argument relations to the syntactic tree structures of the Chinese TreeBank corpus; the correspondence between versions is shown in the figure below. All CPB releases are distributed through the LDC: CPB1.0 requires payment, while CPB2.0 and CPB3.0 can be downloaded for free at the links below. Posted 2024-05-29 02:57.
Jun 20, 2007 · Chinese Treebank 5.0. Chinese Treebank 5.0 was produced by the Linguistic Data Consortium (LDC), catalog number LDC2005T01, ISBN 1-58563-323-2. The Penn Chinese Treebank is an ongoing project that started in the summer of 1998. The goal of the project is to create a 500,000-word corpus of Chinese text with syntactic bracketing.