Stanford CoreNLP: The 23 Phrase Categories of the Penn Chinese Treebank

17 phrase tags

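Tag	English name	Description
ADJP	Adjective phrase	Adjective phrase, projected from JJ
ADVP	Adverbial phrase headed by AD	Adverbial phrase headed by an adverb (AD); functions as an adverbial
CLP	Classifier phrase	Classifier (measure word) phrase
CP	Clause headed by C (complementizer)	Clause introduced by a complementizer, such as a complement clause or a relative clause
DNP	Phrase formed by "XP+DEG"	Phrase built from the XP+DEG structure
DP	Determiner phrase	Determiner phrase
DVP	Phrase formed by "XP+DEV"	Phrase built from the XP+DEV structure
FRAG	Fragment	Fragment
IP	Inflectional phrase	Simple clause headed by I (INFL or another inflectional element)
LCP	Phrase formed by "XP+LC"	Phrase headed by a localizer (LC)
LST	List marker	List-marker phrase used in explanatory enumerations
NP	Noun phrase	Noun phrase
PP	Preposition phrase	Prepositional phrase
PRN	Parenthetical	Parenthetical
QP	Quantifier phrase	Numeral phrase; a phrase built from numerals and measure words
UCP	Unidentical coordination phrase	Coordination of unlike categories
VP	Verb phrase	Verb phrase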

6 verb compound tags

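VCD Coordinated verb compound, e.g. (VCD (VV 投資) (VV 辦廠)) "invest and set up factories"
VCP VV+VC: a verb followed by the copula 是
VNV A-not-A (A不A) and A-one-A (A一A) forms, e.g. (VNV (VV 能) (AD 不) (VV 能)) "can or cannot"
VPT Potential forms V-得-R or V-不-R, e.g. (VPT (VV 得) (AD 不) (VV 到)) "cannot obtain"
VRD Verb-resultative compound, in which the second element is the result of the first, e.g. (VRD (VV 呈現) (VV 出)); (VP (VRD (VV 聯合) (VV 起來)))
VSB Modifier + head compound: the first element is an intransitive verb, and no adjunct or aspect marker occurs between the two elements, e.g. (VSB (VV 加速) (VV 建設)); (VP (VSB (VV 仰頭) (VV 望去)))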
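
The bracketed examples above follow the usual treebank notation, in which each `(LABEL ...)` group is one node. As a rough illustration only (this is not CoreNLP's own tree reader), the sketch below parses such a string and lists the labels that sit above the part-of-speech level. The class `BracketTreeSketch` and its methods are hypothetical names, and the sample words are romanized so that the code stays ASCII-only.

```java
import java.util.ArrayList;
import java.util.List;

class BracketTreeSketch {
    /** A tree node: a label plus children; a word (leaf) has no children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        boolean isLeaf() { return children.isEmpty(); }
    }

    private final String text;
    private int pos = 0;

    private BracketTreeSketch(String text) { this.text = text; }

    /** Parses one bracketed tree; rejects unbalanced or trailing input. */
    static Node parse(String text) {
        BracketTreeSketch p = new BracketTreeSketch(text);
        Node root = p.readNode();
        p.skipSpaces();
        if (p.pos != text.length()) {
            throw new IllegalArgumentException("trailing input at " + p.pos);
        }
        return root;
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    /** Reads a label or a word: everything up to a bracket or whitespace. */
    private String readToken() {
        int start = pos;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            if (c == '(' || c == ')' || Character.isWhitespace(c)) break;
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("expected a token at " + pos);
        return text.substring(start, pos);
    }

    private Node readNode() {
        skipSpaces();
        if (pos < text.length() && text.charAt(pos) == '(') {
            pos++; // consume '('
            skipSpaces();
            Node node = new Node(readToken());
            skipSpaces();
            while (pos < text.length() && text.charAt(pos) != ')') {
                node.children.add(readNode());
                skipSpaces();
            }
            if (pos >= text.length()) throw new IllegalArgumentException("missing ')'");
            pos++; // consume ')'
            return node;
        }
        return new Node(readToken()); // a bare word
    }

    /** Labels, in pre-order, of nodes that have at least one non-word child. */
    static List<String> phrasalLabels(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<String> out) {
        boolean hasNonLeafChild = false;
        for (Node c : n.children) {
            if (!c.isLeaf()) hasNonLeafChild = true;
        }
        if (hasNonLeafChild) out.add(n.label);
        for (Node c : n.children) collect(c, out);
    }

    public static void main(String[] args) {
        Node tree = parse("(VP(VRD(VV lianhe) (VV qilai)))");
        System.out.println(phrasalLabels(tree)); // prints [VP, VRD]
    }
}
```

A string whose brackets do not balance, for example one that lacks its opening parenthesis, makes `parse` throw an `IllegalArgumentException`, which is a cheap way to check hand-typed examples.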

NP

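A phrase whose head is a noun. Grammatically it has two senses: (1) a phrase defined by its syntactic role, such as a chunk serving as the subject or object of a sentence, which can carry a function tag such as NP-SBJ or NP-OBJ; (2) an entity or attribute in a knowledge base; chunks of this kind are called baseNP.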
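
Function tags of this kind are attached to the category with a hyphen. A minimal sketch, assuming labels shaped like `NP-SBJ` or `NP-OBJ`, of splitting a label into its base category and its function tag; the class and method names are hypothetical and are not part of CoreNLP.

```java
class FunctionTagSketch {
    /** Returns the part before the first hyphen, e.g. "NP" for "NP-SBJ". */
    static String baseCategory(String label) {
        int dash = label.indexOf('-');
        // dash > 0 keeps labels that start with a hyphen (e.g. "-NONE-") intact
        return dash > 0 ? label.substring(0, dash) : label;
    }

    /** Returns the part after the first hyphen, or "" if there is none. */
    static String functionTag(String label) {
        int dash = label.indexOf('-');
        return dash > 0 ? label.substring(dash + 1) : "";
    }

    public static void main(String[] args) {
        System.out.println(baseCategory("NP-SBJ")); // prints NP
        System.out.println(functionTag("NP-SBJ"));  // prints SBJ
        System.out.println(functionTag("NP"));      // prints an empty line
    }
}
```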

VP

A semantic chunk centered on a verb, together with the elements that modify, restrict, or are coordinated with it.

Source code in CoreNLP

nonTerminalInfo.put("ROOT", new String[][]{{left, "IP"}});
nonTerminalInfo.put("PAIR", new String[][]{{left, "IP"}});

// Major syntactic categories
nonTerminalInfo.put("ADJP", new String[][]{{left, "JJ", "ADJP"}}); // there is one ADJP unary rewrite to AD but otherwise all have JJ or ADJP
nonTerminalInfo.put("ADVP", new String[][]{{left, "AD", "CS", "ADVP", "JJ"}}); // CS is a subordinating conjunctor, and there are a couple of ADVP->JJ unary rewrites
nonTerminalInfo.put("CLP", new String[][]{{right, "M", "CLP"}});
// nonTerminalInfo.put("CP", new String[][] {{left, "WHNP", "IP", "CP", "VP"}}); // this is complicated; see bracketing guide p. 34. Actually, all WHNP are empty. IP/CP seems to be the best semantic head; syntax would dictate DEC/ADVP. Using IP/CP/VP/M is INCREDIBLY bad for Dep parser - lose 3% absolute.
nonTerminalInfo.put("CP", new String[][]{{right, "DEC", "WHNP", "WHPP"}, rightExceptPunct}); // the (syntax-oriented) right-first head rule
// nonTerminalInfo.put("CP", new String[][]{{right, "DEC", "ADVP", "CP", "IP", "VP", "M"}}); // the (syntax-oriented) right-first head rule
nonTerminalInfo.put("DNP", new String[][]{{right, "DEG", "DEC"}, rightExceptPunct}); // according to tgrep2, first preparation, all DNPs have a DEG daughter
nonTerminalInfo.put("DP", new String[][]{{left, "DT", "DP"}}); // there's one instance of DP adjunction
nonTerminalInfo.put("DVP", new String[][]{{right, "DEV", "DEC"}}); // DVP always has DEV under it
nonTerminalInfo.put("FRAG", new String[][]{{right, "VV", "NN"}, rightExceptPunct}); // FRAG seems only to be used for bits at the beginnings of articles: "Xinwenshe<DATE>" and "(wan)"
nonTerminalInfo.put("INTJ", new String[][]{{right, "INTJ", "IJ", "SP"}});
nonTerminalInfo.put("IP", new String[][]{{left, "VP", "IP"}, rightExceptPunct}); // CDM July 2010 following email from Pi-Chuan changed preference to VP over IP: IP can be -SBJ, -OBJ, or -ADV, and shouldn't be head
nonTerminalInfo.put("LCP", new String[][]{{right, "LC", "LCP"}}); // there's a bit of LCP adjunction
nonTerminalInfo.put("LST", new String[][]{{right, "CD", "PU"}}); // covers all examples
nonTerminalInfo.put("NP", new String[][]{{right, "NN", "NR", "NT", "NP", "PN", "CP"}}); // Basic heads are NN/NR/NT/NP; PN is pronoun.  Some NPs are nominalized relative clauses without overt nominal material; these are NP->CP unary rewrites.  Finally, note that this doesn't give any special treatment of coordination.
nonTerminalInfo.put("PP", new String[][]{{left, "P", "PP"}}); // in the manual there's an example of VV heading PP but I couldn't find such an example with tgrep2
// cdm 2006: PRN changed to not choose punctuation.  Helped parsing (if not significantly)
// nonTerminalInfo.put("PRN", new String[][]{{left, "PU"}}); // presumably left/right doesn't matter
nonTerminalInfo.put("PRN", new String[][]{{left, "NP", "VP", "IP", "QP", "PP", "ADJP", "CLP", "LCP"}, {rightdis, "NN", "NR", "NT", "FW"}});
// cdm 2006: QP: add OD -- occurs some; occasionally NP, NT, M; parsing performance no-op
nonTerminalInfo.put("QP", new String[][]{{right, "QP", "CLP", "CD", "OD", "NP", "NT", "M"}}); // there's some QP adjunction
// add OD?
nonTerminalInfo.put("UCP", new String[][]{{left, }}); // an alternative would be "PU","CC"
nonTerminalInfo.put("VP", new String[][]{{left, "VP", "VCD", "VPT", "VV", "VCP", "VA", "VC", "VE", "IP", "VSB", "VCP", "VRD", "VNV"}, leftExceptPunct}); // note that ba and long bei introduce IP-OBJ small clauses; short bei introduces VP
// add BA, LB, as needed

// verb compounds
nonTerminalInfo.put("VCD", new String[][]{{left, "VCD", "VV", "VA", "VC", "VE"}}); // could easily be right instead
nonTerminalInfo.put("VCP", new String[][]{{left, "VCD", "VV", "VA", "VC", "VE"}}); // not much info from documentation
nonTerminalInfo.put("VRD", new String[][]{{left, "VCD", "VRD", "VV", "VA", "VC", "VE"}}); // definitely left
nonTerminalInfo.put("VSB", new String[][]{{right, "VCD", "VSB", "VV", "VA", "VC", "VE"}}); // definitely right, though some examples look questionably classified (na2lai2 zhi1fu4)
nonTerminalInfo.put("VNV", new String[][]{{left, "VV", "VA", "VC", "VE"}}); // left/right doesn't matter
nonTerminalInfo.put("VPT", new String[][]{{left, "VV", "VA", "VC", "VE"}}); // activity verb is to the left

// some POS tags apparently sit where phrases are supposed to be
nonTerminalInfo.put("CD", new String[][]{{right, "CD"}});
nonTerminalInfo.put("NN", new String[][]{{right, "NN"}});
nonTerminalInfo.put("NR", new String[][]{{right, "NR"}});

// I'm adding these POS tags to do primitive morphology for character-level
// parsing.  It shouldn't affect anything else because heads of preterminals are not
// generally queried - GMA
nonTerminalInfo.put("VV", new String[][]{{left}});
nonTerminalInfo.put("VA", new String[][]{{left}});
nonTerminalInfo.put("VC", new String[][]{{left}});
nonTerminalInfo.put("VE", new String[][]{{left}});

// new for ctb6.
nonTerminalInfo.put("FLR", new String[][]{rightExceptPunct});

// new for CTB9
nonTerminalInfo.put("DFL", new String[][]{rightExceptPunct});
nonTerminalInfo.put("EMO", new String[][]{leftExceptPunct}); // left/right doesn't matter
nonTerminalInfo.put("INC", new String[][]{leftExceptPunct});
nonTerminalInfo.put("INTJ", new String[][]{leftExceptPunct});
nonTerminalInfo.put("OTH", new String[][]{leftExceptPunct});
nonTerminalInfo.put("SKIP", new String[][]{leftExceptPunct});
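
Each entry above pairs a parent category with one or more rules: a search direction followed by child categories in priority order. The sketch below is a simplified reading of how such a rule could pick the head child; it is not CoreNLP's implementation. It assumes only the plain `left` and `right` directions, does not model `rightdis` or the `leftExceptPunct`/`rightExceptPunct` rules, and falls back to the outermost child in the last rule's direction when nothing matches. `HeadRuleSketch`, `applyRule`, and `findHead` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class HeadRuleSketch {
    /**
     * Applies one rule. rule[0] is the direction ("left" or "right");
     * the remaining entries are child categories in priority order.
     * Returns the index of the head child, or -1 if no category matches.
     */
    static int applyRule(String[] rule, List<String> children) {
        boolean fromLeft = rule[0].equals("left");
        for (int c = 1; c < rule.length; c++) {          // categories by priority
            for (int k = 0; k < children.size(); k++) {  // scan in the rule's direction
                int i = fromLeft ? k : children.size() - 1 - k;
                if (children.get(i).equals(rule[c])) return i;
            }
        }
        return -1;
    }

    /** Tries each rule in turn and returns the index of the chosen head child. */
    static int findHead(String[][] rules, List<String> children) {
        for (String[] rule : rules) {
            int i = applyRule(rule, children);
            if (i >= 0) return i;
        }
        // Fallback (assumption): outermost child in the last rule's direction.
        String lastDirection = rules[rules.length - 1][0];
        return lastDirection.equals("left") ? 0 : children.size() - 1;
    }

    public static void main(String[] args) {
        String[][] np = {{"right", "NN", "NR", "NT", "NP", "PN", "CP"}};
        String[][] vp = {{"left", "VP", "VCD", "VPT", "VV", "VCP", "VA", "VC", "VE"}};
        String[][] ucp = {{"left"}};

        System.out.println(findHead(np, Arrays.asList("NR", "NN", "NN")));   // prints 2
        System.out.println(findHead(vp, Arrays.asList("ADVP", "VV", "NP"))); // prints 1
        System.out.println(findHead(ucp, Arrays.asList("NP", "CC", "VP")));  // prints 0
    }
}
```

Under these assumptions the NP rule prefers the rightmost NN over any NR, because category priority is checked before position, while the empty UCP rule always falls through to the leftmost child.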
