Bigtable: A Distributed Storage System for Structured Data, Part 2: Data Model

2 Data Model
A Bigtable is a sparse, distributed, persistent multidimensional sorted map. 
The map is indexed by a row key, column key, and a timestamp; 
each value in the map is an uninterpreted array of bytes.
(row:string, column:string, time:int64) → string

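2 Data Model (notes)
A Bigtable is a sparse, distributed, persistent, multidimensional sorted map.
Each value is an uninterpreted byte array, located by three index components:
[row: string]
[column: string]
[time: int64]

→ string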
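
To make the signature concrete, here is a minimal in-memory sketch of such a map in Python. It only illustrates the indexing scheme; it is not how Bigtable stores data. The names `table`, `put`, and `scan`, the row key, and the sample values are all made up for this example.

```python
from typing import Dict, Iterator, Tuple

# (row key, column key, timestamp) -> uninterpreted bytes
CellKey = Tuple[str, str, int]

# Sparse: only cells that were actually written take up an entry.
table: Dict[CellKey, bytes] = {}


def put(row: str, column: str, ts: int, value: bytes) -> None:
    """Store one value under the three-part index."""
    table[(row, column, ts)] = value


def scan() -> Iterator[Tuple[CellKey, bytes]]:
    """Yield cells sorted by row, then column, then newest timestamp first."""
    for key in sorted(table, key=lambda k: (k[0], k[1], -k[2])):
        yield key, table[key]


# Illustrative data only (assumed row key and placeholder values).
put("com.cnn.www", "contents:", 5, b"<html>v5")
put("com.cnn.www", "contents:", 6, b"<html>v6")
put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")

ordered_keys = [key for key, _ in scan()]
```

The sort key in `scan` reflects the "sorted map" part of the definition: rows in lexicographic order, and within a cell the newest version first.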


Figure 1: A slice of an example table that stores Web pages. 
The row name is a reversed URL. 
The contents column family contains the page contents, 
and the anchor column family contains the text of any anchors that reference the page. 
CNN’s home page is referenced by both the Sports Illustrated and the MY-look home pages, so the row contains columns named anchor:cnnsi.com and anchor:my.look.ca. 
Each anchor cell has one version; 
the contents column has three versions, at timestamps t3, t5, and t6.

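Figure 1 (notes): a slice of an example table that stores web pages.
The row name is a reversed URL.
(PS: there are two kinds of columns here: one for the page contents and one for anchors.)
The contents column family holds the page contents; the anchor column family holds the text of every anchor that references the page.
CNN's home page is referenced by the Sports Illustrated and MY-look home pages, so the row contains the columns anchor:cnnsi.com and anchor:my.look.ca.
Each anchor cell has one version (1 -> 1).
The contents column has three versions, at timestamps t3, t5, and t6 (1 -> 3): three timestamps means three versions.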
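
The row described in the caption can be written down as a nested dictionary, column key first and timestamp second. This is only a sketch: the integers 3, 5, 6 stand in for t3, t5, t6, while the row key, the anchor timestamps, and all cell values are placeholders assumed for the example.

```python
# One row of the example table: column key -> {timestamp -> value}.
row_key = "com.cnn.www"  # assumed reversed URL for CNN's home page
row = {
    "contents:": {3: b"<html>...", 5: b"<html>...", 6: b"<html>..."},
    "anchor:cnnsi.com": {9: b"CNN"},       # placeholder timestamp and text
    "anchor:my.look.ca": {8: b"CNN.com"},  # placeholder timestamp and text
}


def versions(column: str) -> list:
    """Return the timestamps of a cell, newest first."""
    return sorted(row[column], reverse=True)


def latest(column: str) -> tuple:
    """Return (timestamp, value) of the most recent version of a cell."""
    ts = max(row[column])
    return ts, row[column][ts]


# Number of versions per column: 3 for contents:, 1 for each anchor.
version_counts = {column: len(cells) for column, cells in row.items()}
```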


We settled on this data model after examining a variety of potential uses of a Bigtable-like system. 
As one concrete example that drove some of our design decisions, suppose we want to keep a copy of a large collection of web pages and related information that could be used by many different projects; let us call this particular table the Webtable. 
In Webtable, we would use URLs as row keys, various aspects of web pages as column names, and store the contents of the web pages in the contents: column under the timestamps when they were fetched, as illustrated in Figure 1.

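The data model was chosen after studying the potential uses of a Bigtable-like system.
The concrete example behind several design decisions is a table holding a copy of a large collection of web pages and related information, shared by many projects.
That table is called the Webtable.
In the Webtable, URLs are the row keys, aspects of a web page are the column names, and the page contents are stored in the contents: column under the timestamp at which the page was fetched, as in Figure 1.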


Rows
The row keys in a table are arbitrary strings (currently up to 64KB in size, although 10-100 bytes is a typical size for most of our users). Every read or write of data under a single row key is atomic (regardless of the number of different columns being read or written in the row), a design decision that makes it easier for clients to reason about the system’s behavior in the presence of concurrent updates to the same row.


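Row keys are arbitrary strings, currently up to 64KB (10-100 bytes is typical for most users).
Every read or write under a single row key is atomic, no matter how many columns of the row it touches, so clients can reason about what happens when the same row is updated concurrently.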
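
What per-row atomicity buys the client can be shown with a small in-process sketch. This section does not describe how Bigtable implements it; the `RowStore` class below and its per-row lock are assumptions made purely for illustration.

```python
import threading
from collections import defaultdict


class RowStore:
    """Toy store where every read or write of one row is a single atomic step."""

    def __init__(self):
        self._rows = defaultdict(dict)
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, row):
        with self._guard:
            return self._locks[row]

    def mutate_row(self, row, updates):
        """Apply all column updates to one row together, or wait for the lock."""
        with self._lock_for(row):
            self._rows[row].update(updates)

    def read_row(self, row):
        """Return a consistent copy of every column in the row."""
        with self._lock_for(row):
            return dict(self._rows[row])


store = RowStore()


def writer(n):
    value = str(n).encode()
    # Two columns are written in one atomic row mutation.
    store.mutate_row("com.cnn.www", {"anchor:cnnsi.com": value,
                                     "anchor:my.look.ca": value})


threads = [threading.Thread(target=writer, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

snapshot = store.read_row("com.cnn.www")
```

Because each mutation covers both columns at once, a reader never sees one column from one writer and the other column from a different writer.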


Bigtable maintains data in lexicographic order by row key. 
The row range for a table is dynamically partitioned.
Each row range is called a tablet, which is the unit of distribution and load balancing. 
As a result, reads of short row ranges are efficient and typically require communication with only a small number of machines. 
Clients can exploit this property by selecting their row keys so that they get good locality for their data accesses. 
For example, in Webtable, pages in the same domain are grouped together into contiguous rows by reversing the hostname components of the URLs. 
For example, we store data for maps.google.com/index.html under the key com.google.maps/index.html. 
Storing pages from the same domain near each other makes some host and domain analyses more efficient.

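Bigtable keeps data in lexicographic order by row key.
The row range of a table is dynamically partitioned.
Each row range is called a tablet, which is the unit of distribution and load balancing.
Reads of short row ranges are therefore efficient and usually involve only a few machines.
Clients can take advantage of this by choosing row keys that give their data accesses good locality.
In the Webtable, reversing the hostname components of the URL groups pages of the same domain into contiguous rows.
For example, the data (the value) for maps.google.com/index.html is stored under the key com.google.maps/index.html.
(PS: the value is stored under the string that serves as its key.)
Storing pages of the same domain next to each other makes some host and domain analyses more efficient.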
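
The key reversal and its effect on ordering are easy to try out. The sketch below assumes URLs are given without a scheme, as in the example above; `row_key`, `tablet_for`, the sample URLs other than maps.google.com/index.html, and the split points are hypothetical and only illustrate how sorted keys map onto row ranges.

```python
import bisect


def row_key(url: str) -> str:
    """Reverse the hostname components: maps.google.com/x -> com.google.maps/x."""
    host, slash, path = url.partition("/")
    return ".".join(reversed(host.split("."))) + slash + path


urls = [
    "maps.google.com/index.html",
    "www.cnn.com/index.html",
    "mail.google.com/inbox",
    "www.google.com/",
]

# Lexicographic order puts all google.com hosts next to each other.
keys = sorted(row_key(u) for u in urls)

# Hypothetical tablet boundaries: tablet i holds keys below split_points[i],
# and the last tablet holds everything at or above the final split point.
split_points = ["com.g", "com.h"]


def tablet_for(key: str) -> int:
    """Find the index of the row range (tablet) that contains the key."""
    return bisect.bisect_right(split_points, key)


tablets = {key: tablet_for(key) for key in keys}
```

With these made-up split points, every google.com page lands in the same tablet, so a scan over that domain touches a single row range.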

Column Families
Column keys are grouped into sets called column families, which form the basic unit of access control. 
All data stored in a column family is usually of the same type (we compress data in the same column family together). 
A column family must be created before data can be stored under any column key in that family; 
after a family has been created, any column key within the family can be used. 
It is our intent that the number of distinct column families in a table be small (in the hundreds at most), and that families rarely change during operation. 
In contrast, a table may have an unbounded number of columns.

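Column Families (notes)
Column keys are grouped into sets called column families, which are the basic unit of access control.
Data stored in one column family is usually of the same type, and data in the same family is compressed together.
A column family has to be created before data can be stored under any column key in it;
once the family exists, any column key inside it can be used.
The intent is that a table has few distinct column families (hundreds at most) and that families rarely change during operation.
A table may, however, have an unbounded number of columns.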
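
The rule "create the family first, then use any column in it" can be sketched as follows. The `Table` class and its method names are invented for this example and are not a Bigtable API.

```python
class Table:
    """Toy table that rejects writes to column families that were never created."""

    def __init__(self):
        self.families = set()
        self.cells = {}

    def create_family(self, family: str) -> None:
        self.families.add(family)

    def put(self, row: str, family: str, qualifier: str, ts: int, value: bytes) -> None:
        if family not in self.families:
            raise KeyError(f"column family {family!r} has not been created")
        # Any qualifier is accepted once the family exists.
        self.cells[(row, family, qualifier, ts)] = value


webtable = Table()
webtable.create_family("anchor")

# New columns in an existing family need no further setup.
webtable.put("com.cnn.www", "anchor", "cnnsi.com", 9, b"CNN")
webtable.put("com.cnn.www", "anchor", "my.look.ca", 8, b"CNN.com")
```

The set of families stays small and fixed, while the set of (family, qualifier) columns grows freely with the data.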

A column key is named using the following syntax:
family:qualifier. 
Column family names must be printable, but qualifiers may be arbitrary strings. 
An example column family for the Webtable is language, which stores the language in which a web page was written. 
We use only one column key in the language family, and it stores each web page’s language ID. 
Another useful column family for this table is anchor; each column key in this family represents a single anchor, as shown in Figure 1. 
The qualifier is the name of the referring site; the cell contents is the link text.
Access control and both disk and memory accounting are performed at the column-family level. 
In our Webtable example, these controls allow us to manage several different types of applications: 
some that add new base data, 
some that read the base data and create derived column families, 
and some that are only allowed to view existing data (and possibly not even to view all of the existing families for privacy reasons).

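A column key is named with the following syntax:
family:qualifier.
The column family name must be printable, but the qualifier can be any string.
One example family in the Webtable is language, which stores the language a web page was written in.
The language family uses a single column key, which stores each page's language ID.
Another useful family is anchor; every column key in it represents a single anchor, as in Figure 1.
The qualifier is the name of the referring site; the cell content is the link text.
Access control and disk and memory accounting are both performed at the column-family level.
In the Webtable example, these controls make it possible to manage several kinds of applications:
some add new base data,
some read the base data and create derived column families,
and some may only view existing data (and, for privacy reasons, possibly not all of the existing families).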
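
A small sketch of the naming rule and of family-level access control follows. Python's `str.isprintable` is used as a stand-in for "printable", which is an assumption, and the application names and the permission table are hypothetical examples of the three kinds of applications listed above.

```python
def parse_column_key(key: str) -> tuple:
    """Split 'family:qualifier' at the first colon and validate the family name."""
    family, sep, qualifier = key.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in column key {key!r}")
    if not family or not family.isprintable():
        raise ValueError(f"family name must be non-empty and printable: {family!r}")
    return family, qualifier  # the qualifier may be any string, even empty


# Hypothetical permissions, granted per column family rather than per column.
acl = {
    "contents": {"crawler": {"read", "write"}, "analyzer": {"read"}, "viewer": {"read"}},
    "anchor": {"crawler": {"read", "write"}, "analyzer": {"read"}},
    "language": {"analyzer": {"read", "write"}, "viewer": {"read"}},
}


def allowed(app: str, column_key: str, mode: str) -> bool:
    """Check a request against the permissions of the column's family."""
    family, _ = parse_column_key(column_key)
    return mode in acl.get(family, {}).get(app, set())
```

Here the crawler adds base data, the analyzer reads it and writes a derived family, and the viewer can read some families but not anchor.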

Timestamps
Each cell in a Bigtable can contain multiple versions of the same data; these versions are indexed by timestamp. 
Bigtable timestamps are 64-bit integers. 
They can be assigned by Bigtable, in which case they represent “real time” in microseconds, or be explicitly assigned by client applications. 
Applications that need to avoid collisions must generate unique timestamps themselves. 
Different versions of a cell are stored in decreasing timestamp order, so that the most recent versions can be read first.
To make the management of versioned data less onerous, we support two per-column-family settings that tell Bigtable to garbage-collect cell versions automatically.
The client can specify either that only the last n versions of a cell be kept, or that only new-enough versions be kept 
(e.g., only keep values that were written in the last seven days).
In our Webtable example, we set the timestamps of the crawled pages stored in the contents: column to the times at which these page versions were actually crawled. 
The garbage-collection mechanism described above lets us keep only the most recent three versions of every page.

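Timestamps (notes)
Each cell in a Bigtable can hold multiple versions of the same data; the versions are indexed by timestamp.
Bigtable timestamps are 64-bit integers.
They are either assigned by Bigtable, in which case they represent "real time" in microseconds, or assigned explicitly by the client application.
Applications that need to avoid collisions must generate unique timestamps themselves.
The versions of a cell are stored in decreasing timestamp order, so the most recent version is read first (for the contents: column in Figure 1: t6 -> t5 -> t3).
To make versioned data less onerous to manage, two per-column-family settings let Bigtable garbage-collect cell versions automatically.
The client can keep only the last n versions of a cell, or only versions that are new enough (for example, only values written in the last seven days).
In the Webtable example, the timestamps of the crawled pages in the contents: column are set to the times at which those page versions were actually crawled.
The garbage-collection mechanism above then keeps only the three most recent versions of every page.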
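
The two garbage-collection settings can be sketched as plain functions over one cell's versions, stored as a {timestamp: value} dictionary with timestamps in microseconds. The function names and the sample data are assumptions for illustration, not Bigtable settings.

```python
DAY_US = 24 * 60 * 60 * 1_000_000  # one day in microseconds


def keep_last_n(versions: dict, n: int) -> dict:
    """Keep only the n most recent versions, ordered newest first."""
    newest_first = sorted(versions.items(), reverse=True)
    return dict(newest_first[:n])


def keep_newer_than(versions: dict, now_us: int, max_age_us: int) -> dict:
    """Keep only versions written within max_age_us of now_us."""
    return {ts: v for ts, v in versions.items() if now_us - ts <= max_age_us}


# Placeholder cell: integer timestamps standing in for crawl times.
contents = {1: b"v1", 3: b"v3", 5: b"v5", 6: b"v6"}
three_latest = keep_last_n(contents, 3)

# Versions written 9, 6 and 1 days before "now" (made-up times).
now = 10 * DAY_US
crawled = {1 * DAY_US: b"old", 4 * DAY_US: b"recent", 9 * DAY_US: b"newest"}
last_week = keep_newer_than(crawled, now, 7 * DAY_US)
```

`keep_last_n(contents, 3)` corresponds to the Webtable policy of keeping the three most recent versions of every page; `keep_newer_than` corresponds to the seven-day example.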
