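Problem description
While using an internal RPC framework recently, I found that when a field is declared as Object but actually holds a BigDecimal, the value loses precision in transit. For example, an amount of 1.00 before serialization comes back as 1.0 after deserialization. The numeric value itself is unaffected, but code that depends strictly on how the amount is represented can break.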
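The difference between 1.00 and 1.0 is real for BigDecimal, which stores a scale alongside the numeric value. The following is a minimal sketch using only the JDK; the class name ScaleDemo and the helper roundTripThroughDouble are made up for illustration and are not part of the RPC framework. It shows that the two values compare as numerically equal but are not equals(), and that a round trip through double drops the trailing zero:

```java
import java.math.BigDecimal;

class ScaleDemo {
    // Hypothetical helper: simulates a transport that keeps only a double.
    static BigDecimal roundTripThroughDouble(BigDecimal value) {
        return BigDecimal.valueOf(value.doubleValue());
    }

    public static void main(String[] args) {
        BigDecimal sent = new BigDecimal("1.00");
        BigDecimal received = roundTripThroughDouble(sent);

        System.out.println(sent + " scale=" + sent.scale());         // 1.00 scale=2
        System.out.println(received + " scale=" + received.scale()); // 1.0 scale=1
        System.out.println(sent.compareTo(received) == 0); // true: same numeric value
        System.out.println(sent.equals(received));         // false: equals() also compares scale
    }
}
```

Code that formats the amount, uses it as a map key, or compares it with equals() will therefore see a different value after the round trip.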
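Problem analysis
The source code shows that the RPC framework uses Jackson as its default serialization framework, so the first step is simply to check whether the problem can be reproduced locally.
1. Prepare the data transfer beans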
public class Bean1 {
    private String p1;
    private BigDecimal p2;
    // ... constructors, getters/setters, and toString() omitted ...
}

public class Bean2 {
    private String p1;
    private Object p2;
    // ... constructors, getters/setters, and toString() omitted ...
}
Two beans are used so that the difference is easy to see.
2. Prepare the test class
import java.io.IOException;
import java.math.BigDecimal;

import com.fasterxml.jackson.databind.ObjectMapper;

public class JKTest {
    public static void main(String[] args) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        Bean1 bean1 = new Bean1("haha1", new BigDecimal("1.00"));
        Bean2 bean2 = new Bean2("haha2", new BigDecimal("2.00"));

        // Serialize both beans to JSON
        String bs1 = mapper.writeValueAsString(bean1);
        String bs2 = mapper.writeValueAsString(bean2);
        System.out.println(bs1);
        System.out.println(bs2);

        // Deserialize them again and print the results
        Bean1 b1 = mapper.readValue(bs1, Bean1.class);
        System.out.println(b1.toString());
        Bean2 b22 = mapper.readValue(bs2, Bean2.class);
        System.out.println(b22.toString());
    }
}
Bean1 and Bean2 are each serialized and then deserialized, and the results are printed.
3. Results
{"p1":"haha1","p2":1.00}
{"p1":"haha2","p2":2.00}
Bean1 [p1=haha1, p2=1.00]
Bean2 [p1=haha2, p2=2.0]
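4. Analysis of the results
The output shows two things:
1. Serialization is correct for both beans;
2. The problem is reproduced: when Bean2 is deserialized, p2 loses precision (2.00 becomes 2.0).
5. Source code analysis
Stepping through the Jackson source eventually leads to the Vanilla inner class of UntypedObjectDeserializer, whose deserialize method looks like this: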
public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException
{
    switch (p.getCurrentTokenId()) {
    case JsonTokenId.ID_START_OBJECT:
        {
            JsonToken t = p.nextToken();
            if (t == JsonToken.END_OBJECT) {
                return new LinkedHashMap<String,Object>(2);
            }
        }
    case JsonTokenId.ID_FIELD_NAME:
        return mapObject(p, ctxt);
    case JsonTokenId.ID_START_ARRAY:
        {
            JsonToken t = p.nextToken();
            if (t == JsonToken.END_ARRAY) { // and empty one too
                if (ctxt.isEnabled(DeserializationFeature.USE_JAVA_ARRAY_FOR_JSON_ARRAY)) {
                    return NO_OBJECTS;
                }
                return new ArrayList<Object>(2);
            }
        }
        if (ctxt.isEnabled(DeserializationFeature.USE_JAVA_ARRAY_FOR_JSON_ARRAY)) {
            return mapArrayToArray(p, ctxt);
        }
        return mapArray(p, ctxt);
    case JsonTokenId.ID_EMBEDDED_OBJECT:
        return p.getEmbeddedObject();
    case JsonTokenId.ID_STRING:
        return p.getText();
    case JsonTokenId.ID_NUMBER_INT:
        if (ctxt.hasSomeOfFeatures(F_MASK_INT_COERCIONS)) {
            return _coerceIntegral(p, ctxt);
        }
        return p.getNumberValue(); // should be optimal, whatever it is
    case JsonTokenId.ID_NUMBER_FLOAT:
        if (ctxt.isEnabled(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS)) {
            return p.getDecimalValue();
        }
        return p.getNumberValue();
    case JsonTokenId.ID_TRUE:
        return Boolean.TRUE;
    case JsonTokenId.ID_FALSE:
        return Boolean.FALSE;
    case JsonTokenId.ID_END_OBJECT:
        // 28-Oct-2015, tatu: [databind#989] We may also be given END_OBJECT (similar to FIELD_NAME),
        //    if caller has advanced to the first token of Object, but for empty Object
        return new LinkedHashMap<String,Object>(2);
    case JsonTokenId.ID_NULL: // 08-Nov-2016, tatu: yes, occurs
        return null;
    //case JsonTokenId.ID_END_ARRAY: // invalid
    default:
    }
    return ctxt.handleUnexpectedToken(Object.class, p);
}
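p2 in Bean2 is declared as Object, so Jackson selects UntypedObjectDeserializer for it, which is easy to understand. The deserializer then calls a different read method depending on the token type. Because JSON carries only the data and no type information, Jackson treats 2.00 as an ID_NUMBER_FLOAT token. That case has two branches. By default it calls getNumberValue(), which drops the trailing zero and returns 2.0. With the USE_BIG_DECIMAL_FOR_FLOATS feature enabled it calls getDecimalValue() instead. The fix is therefore simple: enable this feature.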
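The choice made in the ID_NUMBER_FLOAT branch can be imitated with the JDK alone. The sketch below is not Jackson code: the class UntypedNumberSketch, the method readFloatToken, and its boolean flag are hypothetical stand-ins for the branch and for the feature check, and the assumption is that the default path ends up with a double-based value as observed in the test output.

```java
import java.math.BigDecimal;

class UntypedNumberSketch {
    // Hypothetical stand-in for the ID_NUMBER_FLOAT branch: the JSON text
    // carries no type, so the reader has to pick one for the caller.
    static Object readFloatToken(String token, boolean useBigDecimalForFloats) {
        if (useBigDecimalForFloats) {
            return new BigDecimal(token); // keeps the digits exactly as written
        }
        return Double.valueOf(token);     // binary double: "2.00" becomes 2.0
    }

    public static void main(String[] args) {
        System.out.println(readFloatToken("2.00", false)); // 2.0
        System.out.println(readFloatToken("2.00", true));  // 2.00
    }
}
```

The declared type of Bean1.p2 tells the deserializer which number type to build, which is why Bean1 is unaffected; with Object there is nothing to go on except the flag.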
6. Enable the USE_BIG_DECIMAL_FOR_FLOATS feature
ObjectMapper mapper = new ObjectMapper();
mapper.enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS);
Running the test again gives the following output:
{"p1":"haha1","p2":1.00}
{"p1":"haha2","p2":2.00}
Bean1 [p1=haha1, p2=1.00]
Bean2 [p1=haha2, p2=2.00]
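7. Custom deserializers
Jackson also lets you extend serialization and deserialization. For a particular bean you can define your own deserializer; for Bean2, for example, you can implement a Bean2Deserializer and register it with the ObjectMapper: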
ObjectMapper mapper = new ObjectMapper();
SimpleModule desModule = new SimpleModule("testModule");
desModule.addDeserializer(Bean2.class, new Bean2Deserializer(Bean2.class));
mapper.registerModule(desModule);
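Going further
JSON itself stores only the data, not its type, so other JSON-style serialization libraries could be expected to have the same problem.
1. FastJson
The test code is as follows: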
import java.math.BigDecimal;

import com.alibaba.fastjson.JSON;

public class FJTest {
    public static void main(String[] args) {
        Bean1 bean1 = new Bean1("haha1", new BigDecimal("1.00"));
        Bean2 bean2 = new Bean2("haha2", new BigDecimal("2.00"));

        // Serialize both beans to JSON
        String jsonString1 = JSON.toJSONString(bean1);
        String jsonString2 = JSON.toJSONString(bean2);
        System.out.println(jsonString1);
        System.out.println(jsonString2);

        // Deserialize them again and print the results
        Bean1 bean11 = JSON.parseObject(jsonString1, Bean1.class);
        Bean2 bean22 = JSON.parseObject(jsonString2, Bean2.class);
        System.out.println(bean11.toString());
        System.out.println(bean22.toString());
    }
}
The output is:
{"p1":"haha1","p2":1.00}
{"p1":"haha2","p2":2.00}
Bean1 [p1=haha1, p2=1.00]
Bean2 [p1=haha2, p2=2.00]
FastJson does not have this problem. Looking at the source, the relevant code is in the parse method of DefaultJSONParser; part of it is shown below:
public Object parse(Object fieldName) {
    final JSONLexer lexer = this.lexer;
    switch (lexer.token()) {
        case SET:
            lexer.nextToken();
            HashSet<Object> set = new HashSet<Object>();
            parseArray(set, fieldName);
            return set;
        case TREE_SET:
            lexer.nextToken();
            TreeSet<Object> treeSet = new TreeSet<Object>();
            parseArray(treeSet, fieldName);
            return treeSet;
        case LBRACKET:
            JSONArray array = new JSONArray();
            parseArray(array, fieldName);
            if (lexer.isEnabled(Feature.UseObjectArray)) {
                return array.toArray();
            }
            return array;
        case LBRACE:
            JSONObject object = new JSONObject(lexer.isEnabled(Feature.OrderedField));
            return parseObject(object, fieldName);
        case LITERAL_INT:
            Number intValue = lexer.integerValue();
            lexer.nextToken();
            return intValue;
        case LITERAL_FLOAT:
            Object value = lexer.decimalValue(lexer.isEnabled(Feature.UseBigDecimal));
            lexer.nextToken();
            return value;
        case LITERAL_STRING:
            String stringLiteral = lexer.stringVal();
            lexer.nextToken(JSONToken.COMMA);
            if (lexer.isEnabled(Feature.AllowISO8601DateFormat)) {
                JSONScanner iso8601Lexer = new JSONScanner(stringLiteral);
                try {
                    if (iso8601Lexer.scanISO8601DateIfMatch()) {
                        return iso8601Lexer.getCalendar().getTime();
                    }
                } finally {
                    iso8601Lexer.close();
                }
            }
            return stringLiteral;
        case NULL:
            lexer.nextToken();
            return null;
        case UNDEFINED:
            lexer.nextToken();
            return null;
        case TRUE:
            lexer.nextToken();
            return Boolean.TRUE;
        case FALSE:
            lexer.nextToken();
            return Boolean.FALSE;
        // ... omitted ...
}
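As in Jackson, each token type is handled differently, and 2.00 is again treated as a float. The parser likewise checks whether Feature.UseBigDecimal is enabled; the difference is that FastJson enables it by default.
2. Protostuff
Next, let's look at a non-JSON serialization library and see how protostuff handles this case.
The test code is as follows: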
@SuppressWarnings("unchecked")
public class PBTest {
    public static void main(String[] args) {
        Bean1 bean1 = new Bean1("haha1", new BigDecimal("1.00"));
        Bean2 bean2 = new Bean2("haha2", new BigDecimal("2.00"));

        LinkedBuffer buffer1 = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        Schema schema1 = RuntimeSchema.createFrom(bean1.getClass());
        byte[] bytes1 = ProtostuffIOUtil.toByteArray(bean1, schema1, buffer1);
        Bean1 bean11 = new Bean1();
        ProtostuffIOUtil.mergeFrom(bytes1, bean11, schema1);
        System.out.println(bean11.toString());

        LinkedBuffer buffer2 = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        Schema schema2 = RuntimeSchema.createFrom(bean2.getClass());
        byte[] bytes2 = ProtostuffIOUtil.toByteArray(bean2, schema2, buffer2);
        Bean2 bean22 = new Bean2();
        ProtostuffIOUtil.mergeFrom(bytes2, bean22, schema2);
        System.out.println(bean22.toString());
    }
}
The output is:
Bean1 [p1=haha1, p2=1.00]
Bean2 [p1=haha2, p2=2.00]
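Protostuff does not have this problem either. The reason is that protostuff writes type information into the binary output during serialization, with a different identifier for each type. RuntimeFieldFactory lists all of the identifiers: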
public abstract class RuntimeFieldFactory<V> implements Delegate<V>
{
    static final int ID_BOOL = 1, ID_BYTE = 2, ID_CHAR = 3, ID_SHORT = 4,
            ID_INT32 = 5, ID_INT64 = 6, ID_FLOAT = 7,
            ID_DOUBLE = 8,
            ID_STRING = 9,
            ID_BYTES = 10,
            ID_BYTE_ARRAY = 11,
            ID_BIGDECIMAL = 12,
            ID_BIGINTEGER = 13,
            ID_DATE = 14,
            ID_ARRAY = 15, // 1-15 is encoded as 1 byte on protobuf and
                           // protostuff format
            ID_OBJECT = 16, ID_ARRAY_MAPPED = 17, ID_CLASS = 18,
            ID_CLASS_MAPPED = 19, ID_CLASS_ARRAY = 20,
            ID_CLASS_ARRAY_MAPPED = 21,
            ID_ENUM_SET = 22, ID_ENUM_MAP = 23, ID_ENUM = 24,
            ID_COLLECTION = 25, ID_MAP = 26,
            ID_POLYMORPHIC_COLLECTION = 28, ID_POLYMORPHIC_MAP = 29,
            ID_DELEGATE = 30,
            ID_ARRAY_DELEGATE = 32, ID_ARRAY_SCALAR = 33, ID_ARRAY_ENUM = 34,
            ID_ARRAY_POJO = 35,
            ID_THROWABLE = 52,
            // pojo fields limited to 126 if not explicitly using @Tag
            // annotations
            ID_POJO = 127;
    ......
}
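When serializing, the data is stored as a sequence of entries, each starting with a tag (the original post illustrates the layout with a figure).
The tag encodes the field's position (first field, second field, and so on) together with type information. Here are the bytes produced for the two beans: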
10 5 104 97 104 97 49 18 4 49 46 48 48
10 5 104 97 104 97 50 19 98 4 50 46 48 48 20
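104 97 104 97 49 and 104 97 104 97 50 are haha1 and haha2 respectively; 49 46 48 48 and 50 46 48 48 are 1.00 and 2.00.
Bean2 clearly stores more data than Bean1, because p2 is stored as an Object: the output needs a start marker and an end marker for the Object, plus the concrete type information.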
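The two byte dumps can be read by hand. The sketch below assumes the protobuf-style tag layout, tag = (fieldNumber << 3) | wireType, where wire type 2 means length-delimited and 3 and 4 mark the start and end of a group; the class TagDecodeSketch and its helpers are hypothetical and written only for this walkthrough.

```java
import java.nio.charset.StandardCharsets;

class TagDecodeSketch {
    // Splits a single-byte tag into its field number and wire type,
    // assuming tag = (fieldNumber << 3) | wireType.
    static String describeTag(int tag) {
        return "field=" + (tag >>> 3) + " wireType=" + (tag & 7);
    }

    // Turns a run of byte values from the dump into ASCII text.
    static String ascii(int... bytes) {
        byte[] raw = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            raw[i] = (byte) bytes[i];
        }
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Bean1: 10 5 "haha1" 18 4 "1.00"
        System.out.println(describeTag(10)); // field=1 wireType=2
        System.out.println(describeTag(18)); // field=2 wireType=2
        // Bean2: 10 5 "haha2" 19 98 4 "2.00" 20
        System.out.println(describeTag(19)); // field=2 wireType=3 (start group)
        System.out.println(describeTag(98)); // field=12 wireType=2
        System.out.println(describeTag(20)); // field=2 wireType=4 (end group)
        System.out.println(ascii(104, 97, 104, 97, 50)); // haha2
        System.out.println(ascii(50, 46, 48, 48));       // 2.00
    }
}
```

Under that reading, the three extra bytes in the Bean2 dump are the start marker 19, the end marker 20, and the inner tag 98, whose field number 12 matches ID_BIGDECIMAL in the list above.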
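For more details, see: https://my.oschina.net/OutOfM...
Summary
JSON-style serialization does not record data types, so some types cannot be distinguished on deserialization, and the only remedy is to configure a feature; on the other hand, JSON is more readable. Serializing directly to binary is less readable, but it can carry much more information and is therefore more complete.
Sample code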
https://github.com/ksfzhaohui...
https://gitee.com/OutOfMemory...