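A classmate recently asked me how to parse an XML document that arrives as a gz-compressed file, so I looked into Dom4j, which is often cited as one of the better-performing Java XML libraries. These are my notes from testing it.
Sample XML document: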
<?xml version="1.0" encoding="utf-8" ?>
<order>
  <orderitems>
    <orderitem>
      <ID>aaa0784004100611500511</ID>
      <YPBH>0041006</YPBH>
      <YP>aaa07840041006</YP>
      <YPMC>複方氨酚葡鋅片(康必得)</YPMC>
      <YPDM>FFAFPXPKBD</YPDM>
      <GG>12s*600</GG>
      <CDMC>河北恆利集團製藥股份有限公司</CDMC>
      <CDDM>HBHLJTZYGFYXGS</CDDM>
      <PH>1150051</PH>
      <PCH>1</PCH>
      <YXQ>2016-12</YXQ>
      <SL>30</SL>
      <BZ>600</BZ>
      <ZBZ>30</ZBZ>
      <DW>盒</DW>
      <DJ>4.5000</DJ>
      <LSJ>6.7000</LSJ>
      <PZWH>H20003806</PZWH>
      <JX>片劑</JX>
      <ISRETAIL>1</ISRETAIL>
    </orderitem>
    <orderitem>
      <ID>aaa002201560151409061</ID>
      <YPBH>0156015</YPBH>
      <YP>aaa00220156015</YP>
      <YPMC>法莫替丁片</YPMC>
      <YPDM>FMTDP</YPDM>
      <GG>20mg*24s</GG>
      <CDMC>湖南迪諾製藥有限公司</CDMC>
      <CDDM>HNDNZYYXGS</CDDM>
      <PH>140906</PH>
      <PCH>1</PCH>
      <YXQ>2016-08</YXQ>
      <SL>10</SL>
      <BZ>600</BZ>
      <ZBZ>10</ZBZ>
      <DW>盒</DW>
      <DJ>2.9800</DJ>
      <LSJ>6.9000</LSJ>
      <PZWH>H43020667</PZWH>
      <JX>片劑</JX>
      <ISRETAIL>1</ISRETAIL>
    </orderitem>
  </orderitems>
</order>
Below is the Dom4j code that parses the XML file. It is quite simple:
package com.util.execute;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

import java.io.File;
import java.util.Iterator;
import java.util.List;

/**
 * @author zhaoj
 * @version OpenApiTest.java, v 0.1 2019-04-15 19:26
 */
public class OpenApiTest {
    public static void main(String[] args) {
        try {
            File inputFile = new File("E:\\myjava\\test.xml");
            SAXReader reader = new SAXReader();
            Document document = reader.read(inputFile);

            // Method 1: start at the root element and walk down level by level
            Element order = document.getRootElement();
            for (Iterator<?> i = order.elementIterator(); i.hasNext(); ) {
                Element orderItems = (Element) i.next();
                for (Iterator<?> j = orderItems.elementIterator(); j.hasNext(); ) {
                    Element orderItem = (Element) j.next();
                    // Print every child element of this orderitem as name:text
                    for (Iterator<?> k = orderItem.elementIterator(); k.hasNext(); ) {
                        Element item = (Element) k.next();
                        System.out.println(item.getName() + ":" + item.getText());
                    }
                    System.out.println("----------------------------------------------");
                }
            }

            // Method 2: select the orderitem nodes directly with XPath (needs jaxen on the classpath).
            // dom4j 1.6.1 returns a raw List here, so this line compiles with an unchecked warning.
            List<Node> nodes = document.selectNodes("/order/orderitems/orderitem");
            for (Node node : nodes) {
                // The label "標籤名" means "tag name"
                System.out.println("標籤名=:" + node.getName());
                System.out.println("ID:" + node.selectSingleNode("ID").getText());
                System.out.println("YPBH:" + node.selectSingleNode("YPBH").getText());
                System.out.println("YP:" + node.selectSingleNode("YP").getText());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Test results:
ID:aaa0784004100611500511
YPBH:0041006
YP:aaa07840041006
YPMC:複方氨酚葡鋅片(康必得)
YPDM:FFAFPXPKBD
GG:12s*600
CDMC:河北恆利集團製藥股份有限公司
CDDM:HBHLJTZYGFYXGS
PH:1150051
PCH:1
YXQ:2016-12
SL:30
BZ:600
ZBZ:30
DW:盒
DJ:4.5000
LSJ:6.7000
PZWH:H20003806
JX:片劑
ISRETAIL:1
----------------------------------------------
ID:aaa002201560151409061
YPBH:0156015
YP:aaa00220156015
YPMC:法莫替丁片
YPDM:FMTDP
GG:20mg*24s
CDMC:湖南迪諾製藥有限公司
CDDM:HNDNZYYXGS
PH:140906
PCH:1
YXQ:2016-08
SL:10
BZ:600
ZBZ:10
DW:盒
DJ:2.9800
LSJ:6.9000
PZWH:H43020667
JX:片劑
ISRETAIL:1
----------------------------------------------
標籤名=:orderitem
ID:aaa0784004100611500511
YPBH:0041006
YP:aaa07840041006
標籤名=:orderitem
ID:aaa002201560151409061
YPBH:0156015
YP:aaa00220156015
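The first approach starts at the root element, walks down to the nodes you need, and then iterates over their children to read the values. The second approach jumps straight to the orderitem nodes with an XPath expression and reads individual fields by name.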
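To make the "find the node, then iterate and read values" idea concrete, here is a minimal sketch that collects each orderitem into a map instead of printing it. Note the swap: it uses the JDK's built-in DOM and XPath APIs rather than Dom4j, so that it runs without any extra jars. The class name OrderItemSketch and the method readItems are made up for illustration and are not part of the original test.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; JDK DOM/XPath stand in for Dom4j here.
class OrderItemSketch {

    /** Selects every orderitem with XPath and copies its child elements into a map. */
    static List<Map<String, String>> readItems(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList items = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/order/orderitems/orderitem", doc, XPathConstants.NODESET);
            List<Map<String, String>> result = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                // LinkedHashMap keeps the fields in document order.
                Map<String, String> fields = new LinkedHashMap<>();
                NodeList children = items.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes; keep only element children.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(child.getNodeName(), child.getTextContent());
                    }
                }
                result.add(fields);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse order XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the sample document, two fields per item.
        String xml = "<order><orderitems>"
                + "<orderitem><ID>aaa0784004100611500511</ID><YPBH>0041006</YPBH></orderitem>"
                + "<orderitem><ID>aaa002201560151409061</ID><YPBH>0156015</YPBH></orderitem>"
                + "</orderitems></order>";
        for (Map<String, String> item : readItems(xml)) {
            System.out.println(item);
        }
    }
}
```

Returning a map per item keeps the parsing separate from whatever you do with the values afterwards, which is usually easier to test than printing inside the loop.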
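Dependencies to add to the pom file. jaxen backs the XPath calls (selectNodes and selectSingleNode) used in the second approach; it is declared without a version here, presumably because Spring's dependency management supplies one. The test scope on dom4j only works if the class lives under src/test, so drop it if you use Dom4j in main code: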
<!-- Spring also ships with this -->
<dependency>
  <groupId>dom4j</groupId>
  <artifactId>dom4j</artifactId>
  <version>1.6.1</version>
  <scope>test</scope>
</dependency>
<!-- Provided by Spring -->
<dependency>
  <groupId>jaxen</groupId>
  <artifactId>jaxen</artifactId>
</dependency>
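The question that started this post was about XML inside a gz-compressed file, which the test above does not cover. A minimal sketch of that step, under some assumptions: the data is gzip-compressed XML, and the class GzipXmlSketch with its methods gzip and countOrderItems is hypothetical. To stay runnable without extra jars, it parses with the JDK's built-in DOM parser. With Dom4j, the decompressed stream could instead be passed to SAXReader.read(InputStream).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; not part of the original test.
class GzipXmlSketch {

    /** Compresses a string to gzip bytes; only used to build sample input in memory. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip trailer is written when the stream closes, so read the buffer afterwards.
        return buffer.toByteArray();
    }

    /** Decompresses gzip bytes on the fly and counts the orderitem elements. */
    static int countOrderItems(byte[] gzBytes) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gzBytes))) {
            // With Dom4j, this same stream could be handed to new SAXReader().read(in).
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return doc.getElementsByTagName("orderitem").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read gzip-compressed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>"
                + "<order><orderitems><orderitem><ID>aaa0784004100611500511</ID></orderitem>"
                + "<orderitem><ID>aaa002201560151409061</ID></orderitem></orderitems></order>";
        System.out.println("orderitem count: " + countOrderItems(gzip(xml)));
    }
}
```

For a file on disk, the ByteArrayInputStream would be replaced by a FileInputStream on the .gz file. The parser reads from the decompressing stream directly, so the XML never has to be unpacked to a temporary file first.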