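Large Data Types
Overview
Large Data Types (LDTs) are complex objects that reside on the Aerospike server and are maintained by the application through UDFs. The data associated with an LDT is not sent to the client in its entirety unless the client specifically requests it. In normal use, the client operates on part of the data, a single object or a group of objects, through the published API.
See the LDT Feature Guide for general background on LDTs.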
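Operation List
Large Stack Operations
aerospike_lstack_push() - Push a new object onto the stack.
aerospike_lstack_pushall() - Push a list of objects onto the stack.
aerospike_lstack_peek() - Fetch the top N elements of the stack.
aerospike_lstack_filter() - Scan the entire stack and apply a predicate filter.
aerospike_lstack_destroy() - Delete the entire stack (LDT Remove).
aerospike_lstack_get_capacity() - Get the stack's current capacity limit.
aerospike_lstack_set_capacity() - Set the stack's maximum capacity.
aerospike_lstack_size() - Get the current number of entries in the stack.
aerospike_lstack_config() - Get the stack's configuration parameters.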
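Large Set Operations
aerospike_lset_add() - Add an object to the set.
aerospike_lset_addall() - Add a list of objects to the set.
aerospike_lset_remove() - Remove an object from the set.
aerospike_lset_exists() - Test whether an object exists in the set.
aerospike_lset_get() - Get an object from the set.
aerospike_lset_filter() - Scan the entire set and apply a predicate filter.
aerospike_lset_destroy() - Delete the entire set (LDT Remove).
aerospike_lset_size() - Get the current number of entries in the set.
aerospike_lset_config() - Get the set's configuration parameters.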
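Large Map Operations
aerospike_lmap_add() - Add an object to the map.
aerospike_lmap_addall() - Add a list of objects to the map.
aerospike_lmap_remove() - Remove an object from the map.
aerospike_lmap_get() - Get an object from the map.
aerospike_lmap_filter() - Scan the entire map and apply a predicate filter.
aerospike_lmap_destroy() - Delete the entire map (LDT Remove).
aerospike_lmap_size() - Get the current number of entries in the map.
aerospike_lmap_config() - Get the map's configuration parameters.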
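Large List Operations
aerospike_llist_add() - Add an object to the list.
aerospike_llist_addall() - Add a list of objects to the list.
aerospike_llist_remove() - Remove an object from the list.
aerospike_llist_get() - Get an object from the list.
aerospike_llist_filter() - Scan the entire list and apply a predicate filter.
aerospike_llist_destroy() - Delete the entire list (LDT Remove).
aerospike_llist_size() - Get the current number of entries in the list.
aerospike_llist_config() - Get the list's configuration parameters.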
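Each collection above offers a filter call that scans every element and applies a predicate. The sketch below is a rough, purely local model of that idea in plain C; it is not the Aerospike API, and in the real system the predicate is supplied to the LDT rather than run over a client-side array. The names scan_with_filter, predicate_fn, and greater_than are hypothetical and exist only for illustration. A NULL predicate keeps every element, which mirrors how the example program in this document passes NULL for the filter to get all values back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical predicate type: returns true to keep an element.
typedef bool (*predicate_fn)(int64_t value, const void* args);

// Scan every element of `items`, copy those accepted by `pred` into `out`,
// and return how many were kept. A NULL predicate keeps everything.
// `out` must have room for at least `n_items` elements.
size_t scan_with_filter(const int64_t* items, size_t n_items,
                        predicate_fn pred, const void* args, int64_t* out)
{
    assert(items != NULL && out != NULL);

    size_t n_kept = 0;

    for (size_t i = 0; i < n_items; i++) {
        if (pred == NULL || pred(items[i], args)) {
            out[n_kept++] = items[i];
        }
    }

    return n_kept;
}

// Example predicate: keep values strictly greater than the threshold
// that `args` points to.
bool greater_than(int64_t value, const void* args)
{
    const int64_t* p_threshold = (const int64_t*)args;

    return value > *p_threshold;
}
```

Calling scan_with_filter(items, n, greater_than, &threshold, out) keeps only the elements above the threshold, in their original order, while passing NULL as the predicate copies everything.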
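Example
The following basic example program demonstrates the initial creation of a large data object and a few basic operations on it: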
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <aerospike/aerospike.h>
#include <aerospike/aerospike_key.h>
#include <aerospike/aerospike_lset.h>
#include <aerospike/as_arraylist.h>
#include <aerospike/as_arraylist_iterator.h>
#include <aerospike/as_boolean.h>
#include <aerospike/as_error.h>
#include <aerospike/as_integer.h>
#include <aerospike/as_ldt.h>
#include <aerospike/as_list.h>
#include <aerospike/as_record.h>
#include <aerospike/as_status.h>
#include <aerospike/as_string.h>
#include <aerospike/as_val.h>
#include "example_utils.h"
int main(int argc, char* argv[])
{
    // Parse command line arguments.
    if (! example_get_opts(argc, argv, EXAMPLE_BASIC_OPTS)) {
        exit(-1);
    }

    // Connect to the aerospike database cluster.
    aerospike as;
    example_connect_to_aerospike(&as);

    // Start clean.
    example_remove_test_record(&as);

    as_ldt lset;

    // Create a lset bin to use. No need to destroy as_ldt if using
    // as_ldt_init() on stack object.
    if (! as_ldt_init(&lset, "mylset", AS_LDT_LSET, NULL)) {
        LOG("unable to initialize ldt");
        exit(-1);
    }
    as_error err;

    // No need to destroy as_integer if using as_integer_init() on stack object.
    as_integer ival;
    as_integer_init(&ival, 12345);

    // Add an integer value to the set.
    if (aerospike_lset_add(&as, &err, NULL, &g_key, &lset,
            (const as_val*)&ival) != AEROSPIKE_OK) {
        LOG("first aerospike_lset_add() returned %d - %s", err.code,
                err.message);
        exit(-1);
    }

    // No need to destroy as_string if using as_string_init() on stack object.
    as_string sval;
    as_string_init(&sval, "lset value", false);

    // Add a string value to the set.
    if (aerospike_lset_add(&as, &err, NULL, &g_key, &lset,
            (const as_val*)&sval) != AEROSPIKE_OK) {
        LOG("second aerospike_lset_add() returned %d - %s", err.code,
                err.message);
        exit(-1);
    }

    LOG("2 values added to set");
    uint32_t n_elements = 0;

    // See how many elements we have in the set now.
    if (aerospike_lset_size(&as, &err, NULL, &g_key, &lset, &n_elements)
            != AEROSPIKE_OK) {
        LOG("aerospike_lset_size() returned %d - %s", err.code, err.message);
        exit(-1);
    }

    if (n_elements != 2) {
        LOG("unexpected lset size %u", n_elements);
        exit(-1);
    }

    LOG("lset size confirmed to be %u", n_elements);
    // Second descriptor for the same bin; used by the existence checks below.
    as_ldt lset2;
    as_ldt_init(&lset2, "mylset", AS_LDT_LSET, NULL);

    as_list* p_list = NULL;

    // Get all the values back.
    if (aerospike_lset_filter(&as, &err, NULL, &g_key, &lset, NULL, NULL,
            &p_list) != AEROSPIKE_OK) {
        LOG("aerospike_lset_filter() returned %d - %s", err.code, err.message);
        as_list_destroy(p_list);
        exit(-1);
    }

    // See if the elements match what we expect.
    as_arraylist_iterator it;
    as_arraylist_iterator_init(&it, (const as_arraylist*)p_list);

    while (as_arraylist_iterator_has_next(&it)) {
        const as_val* p_val = as_arraylist_iterator_next(&it);
        // as_val_tostring() allocates the string; free it after logging.
        char* p_str = as_val_tostring(p_val);
        LOG("   element - type = %d, value = %s ", as_val_type(p_val), p_str);
        free(p_str);
    }

    as_list_destroy(p_list);
    p_list = NULL;
    // Add 3 more items into the set. By using as_arraylist_inita(), we won't
    // need to destroy the as_arraylist if we only use
    // as_arraylist_append_int64().
    as_arraylist vals;
    as_arraylist_inita(&vals, 3);
    as_arraylist_append_int64(&vals, 1001);
    as_arraylist_append_int64(&vals, 2002);
    as_arraylist_append_int64(&vals, 3003);

    if (aerospike_lset_addall(&as, &err, NULL, &g_key, &lset,
            (const as_list*)&vals) != AEROSPIKE_OK) {
        LOG("aerospike_lset_addall() returned %d - %s", err.code, err.message);
        exit(-1);
    }

    LOG("3 more values added");
    // Get and print all the values back again.
    if (aerospike_lset_filter(&as, &err, NULL, &g_key, &lset, NULL, NULL,
            &p_list) != AEROSPIKE_OK) {
        LOG("second aerospike_lset_filter() returned %d - %s", err.code,
                err.message);
        as_list_destroy(p_list);
        exit(-1);
    }

    as_arraylist_iterator_init(&it, (const as_arraylist*)p_list);

    while (as_arraylist_iterator_has_next(&it)) {
        const as_val* p_val = as_arraylist_iterator_next(&it);
        // as_val_tostring() allocates the string; free it after logging.
        char* p_str = as_val_tostring(p_val);
        LOG("   element - type = %d, value = %s ", as_val_type(p_val), p_str);
        free(p_str);
    }

    as_list_destroy(p_list);
    p_list = NULL;
    // No need to destroy as_boolean if using as_boolean_init() on stack object.
    as_boolean exists;
    as_boolean_init(&exists, false);

    // Check if a specific value exists.
    if (aerospike_lset_exists(&as, &err, NULL, &g_key, &lset2,
            (const as_val*)&ival, &exists) != AEROSPIKE_OK) {
        LOG("aerospike_lset_exists() returned %d - %s", err.code, err.message);
        exit(-1);
    }

    // 12345 was added earlier, so it must be reported as present.
    if (! as_boolean_get(&exists)) {
        LOG("not able to find a value which should be in the set");
        exit(-1);
    }

    as_boolean_init(&exists, false);
    as_integer_init(&ival, 33333);

    // Check that a value which should not be in the set, really isn't.
    if (aerospike_lset_exists(&as, &err, NULL, &g_key, &lset2,
            (const as_val*)&ival, &exists) != AEROSPIKE_OK) {
        LOG("second aerospike_lset_exists() returned %d - %s", err.code,
                err.message);
        exit(-1);
    }

    if (as_boolean_get(&exists)) {
        LOG("found a value which should not be in the set");
        exit(-1);
    }

    LOG("existence functionality checked");
    // Destroy the lset.
    if (aerospike_lset_destroy(&as, &err, NULL, &g_key, &lset) !=
            AEROSPIKE_OK) {
        LOG("aerospike_lset_destroy() returned %d - %s", err.code, err.message);
        exit(-1);
    }

    n_elements = 0;

    // See if we can still do any lset operations.
    if (aerospike_lset_size(&as, &err, NULL, &g_key, &lset, &n_elements) ==
            AEROSPIKE_OK) {
        LOG("aerospike_lset_size() did not return error");
        exit(-1);
    }

    // Cleanup and disconnect from the database cluster.
    example_cleanup(&as);

    LOG("lset example successfully completed");

    return 0;
}
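Summary
By default, an LDT accepts and stores objects of any type, and the objects may differ in type and size. The one additional requirement applies to objects of a complex type: the Large List mechanism needs either a map that carries a "key" field or a user-supplied function that computes or identifies an atomic value, which is then used to order the objects.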
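To make the ordering requirement concrete, here is a minimal local sketch in plain C of how a user-supplied function can reduce a complex object to one atomic value that drives the ordering. It is only an illustration of the idea, not the Large List implementation or the Aerospike API. The type keyed_object and the functions extract_key, compare_by_key, and sort_by_key are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical complex object: `key` is the atomic value that orders it.
typedef struct {
    int64_t key;
    const char* payload;
} keyed_object;

// Hypothetical user-supplied function that identifies the ordering value.
int64_t extract_key(const keyed_object* obj)
{
    assert(obj != NULL);

    return obj->key;
}

// qsort() comparator that orders objects by their extracted key.
int compare_by_key(const void* a, const void* b)
{
    int64_t ka = extract_key((const keyed_object*)a);
    int64_t kb = extract_key((const keyed_object*)b);

    // Avoids the overflow that a plain subtraction could cause.
    return (ka > kb) - (ka < kb);
}

// Sort an array of complex objects into ascending key order.
void sort_by_key(keyed_object* objs, size_t n_objs)
{
    assert(objs != NULL);

    qsort(objs, n_objs, sizeof(keyed_object), compare_by_key);
}
```

Without extract_key() there would be no well-defined way to compare two keyed_object values, which is the same reason an ordered collection needs a key or a key function for complex objects.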
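Special Behavior: Modules, Transformations, Storage Setting Changes, and Predicate Filters
LDTs are meant to work "out of the box": the default configuration and behavior work for most use cases and data types. By default, an LDT stores objects in a list and serializes them with the standard serialization mechanism (MessagePack). In some scenarios, however, a user can save a large amount of space by transforming objects directly into binary form, or knows exactly how large the stored objects will be and wants precise control over the sub-record storage size, or wants to change internal storage sizes and limits to tune LDT performance. For these cases, Aerospike lets users change the LDT's system configuration settings and specify transform and untransform functions, a unique-identifier function, and predicate filter functions that can be applied to complex objects.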
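As an illustration of the transform and untransform idea, the sketch below packs a fixed-shape object into an exact number of bytes and restores it again. It is written in plain C as a local model only; it does not show how such functions are registered with an LDT, and it makes no claim about the sizes the default serialization produces. The type sample_object and the functions transform and untransform are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical fixed-shape object that the application stores.
typedef struct {
    uint32_t user_id;
    uint16_t score;
} sample_object;

// 4 bytes of user_id followed by 2 bytes of score.
enum { PACKED_SIZE = 6 };

// "Transform": write the object as exactly PACKED_SIZE big-endian bytes.
void transform(const sample_object* obj, uint8_t out[PACKED_SIZE])
{
    assert(obj != NULL && out != NULL);

    out[0] = (uint8_t)(obj->user_id >> 24);
    out[1] = (uint8_t)(obj->user_id >> 16);
    out[2] = (uint8_t)(obj->user_id >> 8);
    out[3] = (uint8_t)(obj->user_id);
    out[4] = (uint8_t)(obj->score >> 8);
    out[5] = (uint8_t)(obj->score);
}

// "Untransform": rebuild the object from its packed bytes.
sample_object untransform(const uint8_t in[PACKED_SIZE])
{
    assert(in != NULL);

    sample_object obj;

    obj.user_id = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                  ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    obj.score = (uint16_t)(((uint16_t)in[4] << 8) | in[5]);

    return obj;
}
```

Because every packed object occupies exactly PACKED_SIZE bytes, the storage needed for a given number of objects is known in advance, which is the kind of control over storage size that the paragraph above describes.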