如何解析json
json本質上講是一個結構型的文本。可以對其也解析的複雜程度遠不如編譯器。編譯器的一個基礎功能便是詞法分析。而Gson的核心功能之一,也是詞法分析。
詞法分析的任務是要解析出Token流,Token是一個二元組。類似於這樣,Token<符號,種別碼>。
Gson中與token有關的結構有兩個,分別是JsonToken和JsonScope:
public enum JsonToken {
BEGIN_ARRAY, //數組開始
END_ARRAY,//數組結束
BEGIN_OBJECT,//對象開始
END_OBJECT,//對象結束
NAME,//鍵和值
STRING,//字符串
NUMBER,//數字
BOOLEAN,//布爾型
NULL,//空
END_DOCUMENT//文本結束
}
JsonToken中包含son字符串中的所有情況。JsonScope定義了token的種別碼。
final class JsonScope {
static final int EMPTY_ARRAY = 1;
static final int NONEMPTY_ARRAY = 2;
static final int EMPTY_OBJECT = 3;
static final int DANGLING_NAME = 4;
static final int NONEMPTY_OBJECT = 5;
static final int EMPTY_DOCUMENT = 6;
static final int NONEMPTY_DOCUMENT = 7;
static final int CLOSED = 8;
}
其中DANGLING_NAME通俗點講就是key,當出現這種類型的字串,接下來解析出來的便是value。
解析的核心部分放在JsonReader中:
public JsonToken peek() throws IOException {
int p = peeked;
if (p == PEEKED_NONE) {
p = doPeek();
}
switch (p) {
case PEEKED_BEGIN_OBJECT:
return JsonToken.BEGIN_OBJECT;
case PEEKED_END_OBJECT:
return JsonToken.END_OBJECT;
case PEEKED_BEGIN_ARRAY:
return JsonToken.BEGIN_ARRAY;
case PEEKED_END_ARRAY:
return JsonToken.END_ARRAY;
case PEEKED_SINGLE_QUOTED_NAME:
case PEEKED_DOUBLE_QUOTED_NAME:
case PEEKED_UNQUOTED_NAME:
return JsonToken.NAME;
case PEEKED_TRUE:
case PEEKED_FALSE:
return JsonToken.BOOLEAN;
case PEEKED_NULL:
return JsonToken.NULL;
case PEEKED_SINGLE_QUOTED:
case PEEKED_DOUBLE_QUOTED:
case PEEKED_UNQUOTED:
case PEEKED_BUFFERED:
return JsonToken.STRING;
case PEEKED_LONG:
case PEEKED_NUMBER:
return JsonToken.NUMBER;
case PEEKED_EOF:
return JsonToken.END_DOCUMENT;
default:
throw new AssertionError();
}
}
變量p便是token的種別碼。peek函數返回的是一個JsonToken,調用一次,返回一個,並非一次性全部解析完。PEEKED_NONE表示一次讀取完畢,每次調用完,p都會被賦值爲PEEKED_NONE,所以會調用doPeek():
int doPeek() throws IOException {
int peekStack = stack[stackSize - 1];
if (peekStack == JsonScope.EMPTY_ARRAY) {
stack[stackSize - 1] = JsonScope.NONEMPTY_ARRAY;
} else if (peekStack == JsonScope.NONEMPTY_ARRAY) {
// Look for a comma before the next element.
int c = nextNonWhitespace(true);
switch (c) {
case ']':
return peeked = PEEKED_END_ARRAY;
case ';':
checkLenient(); // fall-through
case ',':
break;
default:
throw syntaxError("Unterminated array");
}
} else if (peekStack == JsonScope.EMPTY_OBJECT || peekStack == JsonScope.NONEMPTY_OBJECT) {
stack[stackSize - 1] = JsonScope.DANGLING_NAME;
// Look for a comma before the next element.
if (peekStack == JsonScope.NONEMPTY_OBJECT) {
int c = nextNonWhitespace(true);
switch (c) {
case '}':
return peeked = PEEKED_END_OBJECT;
case ';':
checkLenient(); // fall-through
case ',':
break;
default:
throw syntaxError("Unterminated object");
}
}
int c = nextNonWhitespace(true);
switch (c) {
case '"':
return peeked = PEEKED_DOUBLE_QUOTED_NAME;
case '\'':
checkLenient();
return peeked = PEEKED_SINGLE_QUOTED_NAME;
case '}':
if (peekStack != JsonScope.NONEMPTY_OBJECT) {
return peeked = PEEKED_END_OBJECT;
} else {
throw syntaxError("Expected name");
}
default:
checkLenient();
pos--; // Don't consume the first character in an unquoted string.
if (isLiteral((char) c)) {
return peeked = PEEKED_UNQUOTED_NAME;
} else {
throw syntaxError("Expected name");
}
}
} else if (peekStack == JsonScope.DANGLING_NAME) {
stack[stackSize - 1] = JsonScope.NONEMPTY_OBJECT;
// Look for a colon before the value.
int c = nextNonWhitespace(true);
switch (c) {
case ':':
break;
case '=':
checkLenient();
if ((pos < limit || fillBuffer(1)) && buffer[pos] == '>') {
pos++;
}
break;
default:
throw syntaxError("Expected ':'");
}
} else if (peekStack == JsonScope.EMPTY_DOCUMENT) {
if (lenient) {
consumeNonExecutePrefix();
}
stack[stackSize - 1] = JsonScope.NONEMPTY_DOCUMENT;
} else if (peekStack == JsonScope.NONEMPTY_DOCUMENT) {
int c = nextNonWhitespace(false);
if (c == -1) {
return peeked = PEEKED_EOF;
} else {
checkLenient();
pos--;
}
} else if (peekStack == JsonScope.CLOSED) {
throw new IllegalStateException("JsonReader is closed");
}
int c = nextNonWhitespace(true);
switch (c) {
case ']':
if (peekStack == JsonScope.EMPTY_ARRAY) {
return peeked = PEEKED_END_ARRAY;
}
// fall-through to handle ",]"
case ';':
case ',':
// In lenient mode, a 0-length literal in an array means 'null'.
if (peekStack == JsonScope.EMPTY_ARRAY || peekStack == JsonScope.NONEMPTY_ARRAY) {
checkLenient();
pos--;
return peeked = PEEKED_NULL;
} else {
throw syntaxError("Unexpected value");
}
case '\'':
checkLenient();
return peeked = PEEKED_SINGLE_QUOTED;
case '"':
return peeked = PEEKED_DOUBLE_QUOTED;
case '[':
return peeked = PEEKED_BEGIN_ARRAY;
case '{':
return peeked = PEEKED_BEGIN_OBJECT;
default:
pos--; // Don't consume the first character in a literal value.
}
int result = peekKeyword();
if (result != PEEKED_NONE) {
return result;
}
result = peekNumber();
if (result != PEEKED_NONE) {
return result;
}
if (!isLiteral(buffer[pos])) {
throw syntaxError("Expected value");
}
checkLenient();
return peeked = PEEKED_UNQUOTED;
}
調用nextNonWhitespace返回一個字符,然後根據這個字符,返回不同的類型。JsonReader中定義如下:
private static final int PEEKED_NONE = 0;
private static final int PEEKED_BEGIN_OBJECT = 1;
private static final int PEEKED_END_OBJECT = 2;
private static final int PEEKED_BEGIN_ARRAY = 3;
private static final int PEEKED_END_ARRAY = 4;
private static final int PEEKED_TRUE = 5;
private static final int PEEKED_FALSE = 6;
private static final int PEEKED_NULL = 7;
private static final int PEEKED_SINGLE_QUOTED = 8;
private static final int PEEKED_DOUBLE_QUOTED = 9;
private static final int PEEKED_UNQUOTED = 10;
/** When this is returned, the string value is stored in peekedString. */
private static final int PEEKED_BUFFERED = 11;
private static final int PEEKED_SINGLE_QUOTED_NAME = 12;
private static final int PEEKED_DOUBLE_QUOTED_NAME = 13;
private static final int PEEKED_UNQUOTED_NAME = 14;
/** When this is returned, the integer value is stored in peekedLong. */
private static final int PEEKED_LONG = 15;
private static final int PEEKED_NUMBER = 16;
private static final int PEEKED_EOF = 17;
JsonReader中,讀取字符流,然後根據json方法,生成Token。