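1 HBase Delete Operations
Deleting a ColumnFamily
Delete delete = new Delete(rowKey);
// Pass the timestamp to addFamily itself: setTimestamp called afterwards does not change a marker that was already added.
delete.addFamily(columnFamily, tm);
Deletes all values under the column family whose timestamp is less than or equal to the given timestamp. If no timestamp is given, the marker is written with a timestamp of 'now' (the server's System.currentTimeMillis()), which covers every version written up to that moment.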
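Deleting a Column
Delete delete = new Delete(rowKey);
// Pass the timestamp to addColumns itself: setTimestamp called afterwards does not change a marker that was already added.
delete.addColumns(columnFamily, column, tm);
Deletes all values in the specified column whose timestamp is less than or equal to the given timestamp. If no timestamp is given, the marker is written with a timestamp of 'now' (the server's System.currentTimeMillis()), which covers every version written up to that moment.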
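The Delete source quoted at the end of this post shows that addFamily(family) and addColumns(family, qualifier) read the Delete's current ts field at the moment they are called, and that setTimestamp only assigns that field. Below is a minimal plain-Java sketch of that ordering behavior. MiniDelete and its members are hypothetical stand-ins, not HBase classes; LATEST models HConstants.LATEST_TIMESTAMP as Long.MAX_VALUE, and the real addFamily also clears earlier markers for the family, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the builder part of HBase's Delete; not the real class. */
class MiniDelete {
    /** Models HConstants.LATEST_TIMESTAMP (assumed here to be Long.MAX_VALUE). */
    static final long LATEST = Long.MAX_VALUE;

    private long ts = LATEST;
    private final List<Long> markerTimestamps = new ArrayList<>();

    /** Mirrors setTimestamp: only stores the value for later calls. */
    void setTimestamp(long timestamp) {
        this.ts = timestamp;
    }

    /** Mirrors addFamily(family): captures the current ts at call time. */
    MiniDelete addFamily() {
        return addFamily(this.ts);
    }

    /** Mirrors addFamily(family, timestamp): uses the explicit timestamp. */
    MiniDelete addFamily(long timestamp) {
        markerTimestamps.add(timestamp);
        return this;
    }

    List<Long> markerTimestamps() {
        return markerTimestamps;
    }

    public static void main(String[] args) {
        // setTimestamp after addFamily: the marker keeps the old timestamp.
        MiniDelete late = new MiniDelete();
        late.addFamily();
        late.setTimestamp(5L);
        System.out.println(late.markerTimestamps().get(0) == LATEST); // true

        // Explicit overload: the marker carries the intended timestamp.
        MiniDelete explicit = new MiniDelete();
        explicit.addFamily(5L);
        System.out.println(explicit.markerTimestamps()); // [5]
    }
}
```

Under this model, the safe habit is to use the overloads that take a timestamp whenever a family or column delete should be bounded by one.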
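Deleting a specific version of a value
Delete delete = new Delete(rowKey);
delete.addColumn(columnFamily, column, timestamp);
Deletes only the value in the specified column whose timestamp equals the given timestamp. If no timestamp is passed to addColumn, the most recent cell's timestamp is used; this is comparatively expensive, because the server first has to do a get to find the timestamp of the latest existing version.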
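2 Reads and Writes After a Delete
Deleting a ColumnFamily
delete.addFamily(columnFamily, ts);
After a column family is deleted, data with a timestamp <= ts can no longer be read, and data written afterwards with a timestamp <= ts does not become visible either (the put is accepted but stays hidden behind the delete marker).
Deleting a Column
delete.addColumns(columnFamily, column, ts);
After a column is deleted, data with a timestamp <= ts can no longer be read, and data written afterwards with a timestamp <= ts does not become visible either.
Deleting a specific version
delete.addColumn(columnFamily, column, ts);
After a specific version is deleted, only data with a timestamp exactly equal to ts can no longer be read, and only data written afterwards with a timestamp equal to ts stays hidden.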
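The three rules above differ only in how the marker's timestamp is compared with a cell's timestamp. A minimal plain-Java sketch of that comparison follows. DeleteMarkerRules is a hypothetical name; the Kind values are named after the KeyValue.Type constants that appear in the Delete source quoted at the end of this post. The sketch only models the timestamp comparison and ignores scope (which family or column the marker applies to).

```java
/** Plain-Java sketch of which cell timestamps each delete marker kind hides. */
class DeleteMarkerRules {
    /** Marker kinds, named after KeyValue.Type.DeleteFamily, DeleteColumn and Delete. */
    enum Kind { DELETE_FAMILY, DELETE_COLUMN, DELETE }

    /** Returns true if a cell at cellTs is hidden by a marker of the given kind at markerTs. */
    static boolean masks(Kind kind, long markerTs, long cellTs) {
        switch (kind) {
            case DELETE_FAMILY:
            case DELETE_COLUMN:
                // addFamily / addColumns: everything at or before the marker is hidden.
                return cellTs <= markerTs;
            case DELETE:
                // addColumn: only the exact version is hidden.
                return cellTs == markerTs;
            default:
                throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        long markerTs = 3L;
        for (Kind kind : Kind.values()) {
            for (long cellTs = 2L; cellTs <= 4L; cellTs++) {
                System.out.println(kind + "@" + markerTs + " hides cell@" + cellTs
                        + ": " + masks(kind, markerTs, cellTs));
            }
        }
    }
}
```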
After a specific version of a value is deleted, can the values before that version still be read?
/**
 * uuid-1, 1L;
 * uuid-2, 2L;
 *
 * cache --> uuid-2, 3L;
 *
 * del uuid-2, 2L;
 * put uuid-2, 3L;
 *
 * get all:
 * uuid-1 1L
 * uuid-2 3L
 */
@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-4";
    String uuid1 = "uuid-1";
    String uuid2 = "uuid-2";
    Long uuid2Time = Long.valueOf(2L);
    Map<String, Long> map = new HashMap<>();
    map.put(uuid1, Long.valueOf(1L));
    map.put(uuid2, uuid2Time);
    for (Map.Entry<String, Long> entry : map.entrySet()) {
        putUuid(member, entry.getKey(), entry.getValue());
    }
    System.out.println("---> after put ");
    getLimit(member, HColumn.UUID.name());

    // delete uuid2 2L
    delLimit(member, uuid2Time);
    System.out.println("---> after del ");
    getLimit(member, HColumn.UUID.name());

    // update uuid2, timestamp = 3L
    putUuid(member, uuid2, Long.valueOf(3L));
    System.out.println("---> after update uuid2");
    getLimit(member, HColumn.UUID.name());
}
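Summary: after a specific version of a value is deleted, the values before that version can still be read; only the value at that version is gone.
Writing after deleting a specific version
After a specific version is deleted, can a value be written at that same version or at an earlier one?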
@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-11";
    String uuid1 = "uuid-1";
    String uuid3 = "uuid-3";
    putUuid(member, uuid1, Long.valueOf(1L));
    putUuid(member, uuid3, Long.valueOf(3L));
    System.out.println("---> after put u1, u3");
    getLimit(member, HColumn.UUID.name());

    // delete 3L
    delLimit(member, Long.valueOf(3L));
    System.out.println("---> after del 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid2, timestamp = 3L, can't write
    putUuid(member, "uuid-2", Long.valueOf(3L));
    System.out.println("---> after put u2, 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid3, timestamp = 2L, write OK
    putUuid(member, "uuid-3", Long.valueOf(2L));
    System.out.println("---> after put u3, 2L");
    getLimit(member, HColumn.UUID.name());
}
---> after put u1, u3
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 3
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after del 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u2, 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u3, 2L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 2
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
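After a specific version is deleted, a value written with that same timestamp does not show up; a value written with a different timestamp is written and read back normally.
For the underlying reason, see: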
https://hbase.apache.org/book.html#_delete
https://hbase.apache.org/book.html#version.delete
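The sequence in the test above can be replayed against a small in-memory model, which may make the result easier to follow. This is only a sketch of the observed behavior for a single column, not HBase code: VersionDeleteModel and its methods are hypothetical names, and the model assumes that a version delete marker keeps hiding its timestamp for as long as the marker exists.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical single-column model: stored versions plus version delete markers. */
class VersionDeleteModel {
    // Stored cells, newest timestamp first: timestamp -> value.
    private final TreeMap<Long, String> cells = new TreeMap<>(Comparator.<Long>reverseOrder());
    // Timestamps hidden by a version delete marker (what addColumn(cf, col, ts) adds).
    private final Set<Long> versionMarkers = new HashSet<>();

    /** A put is always stored, even if a marker currently hides its timestamp. */
    void put(String value, long ts) {
        cells.put(ts, value);
    }

    /** A delete only adds a marker; the stored cell is left in place. */
    void deleteVersion(long ts) {
        versionMarkers.add(ts);
    }

    /** Returns the visible versions, newest first. */
    Map<Long, String> get() {
        Map<Long, String> visible = new TreeMap<>(Comparator.<Long>reverseOrder());
        for (Map.Entry<Long, String> e : cells.entrySet()) {
            if (!versionMarkers.contains(e.getKey())) {
                visible.put(e.getKey(), e.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        VersionDeleteModel m = new VersionDeleteModel();
        m.put("uuid-1", 1L);
        m.put("uuid-3", 3L);
        System.out.println(m.get()); // {3=uuid-3, 1=uuid-1}

        m.deleteVersion(3L);
        System.out.println(m.get()); // {1=uuid-1}

        m.put("uuid-2", 3L);         // same timestamp as the marker: stays hidden
        System.out.println(m.get()); // {1=uuid-1}

        m.put("uuid-3", 2L);         // different timestamp: visible
        System.out.println(m.get()); // {2=uuid-3, 1=uuid-1}
    }
}
```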
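A puzzling behavior I ran into earlier:
I deleted a column and then wrote the same data to that column again, and no matter what I tried, the write would not show up. For a while I suspected that our wrapper code was broken.
Later, after the table happened to be dropped, the write worked again.
Root cause analysis:
An HBase Delete does not directly remove the data at its location in the underlying files; it is a mark-for-deletion operation.
That is, a delete adds a record of the form <delete, cell, timestamp>. Until the next major compaction, both this delete record and the actual data records, for example <cell, timestamp1>, remain in HBase's storage.
On read, both <delete, cell, timestamp> and <cell, timestamp1> are read, and the timestamps decide what is finally returned to the user: for a column or family delete, every cell whose timestamp is less than or equal to the marker's is treated as deleted; for a single-version delete, only the cell with the same timestamp is.
I had simply forgotten that HBase deletes are tombstone-style deletes.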
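The tombstone explanation can be illustrated with a toy store in which deletes only add markers and a separate compaction step physically removes them. This is a minimal sketch under stated assumptions, not HBase's implementation: TombstoneStore, readLatest and majorCompact are hypothetical names, only single-version markers are modelled, and the markers are assumed to be dropped at major compaction as described above.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical toy store for one column: cells plus version tombstones. */
class TombstoneStore {
    private final TreeMap<Long, String> cells = new TreeMap<>(); // timestamp -> value
    private final Set<Long> tombstones = new HashSet<>();        // hidden timestamps

    void put(long ts, String value) {
        cells.put(ts, value);
    }

    /** Deleting only records a marker; nothing is removed from storage yet. */
    void deleteVersion(long ts) {
        tombstones.add(ts);
    }

    /** Returns the newest value not hidden by a tombstone, or null if there is none. */
    String readLatest() {
        for (Map.Entry<Long, String> e : cells.descendingMap().entrySet()) {
            if (!tombstones.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    /** Models a major compaction: drops the hidden cells and then the markers themselves. */
    void majorCompact() {
        cells.keySet().removeAll(tombstones);
        tombstones.clear();
    }

    public static void main(String[] args) {
        TombstoneStore store = new TombstoneStore();
        store.put(3L, "uuid-3");
        store.deleteVersion(3L);
        store.put(3L, "uuid-3");                // rewritten at the same timestamp
        System.out.println(store.readLatest()); // null: still hidden by the marker

        store.majorCompact();                   // marker and hidden cell are removed
        store.put(3L, "uuid-3");
        System.out.println(store.readLatest()); // uuid-3
    }
}
```

In this model the rewrite only becomes readable once the marker is gone, which matches the experience that writes worked again after the table's data had been discarded.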
Appendix: shared helper methods used by the tests:
public static void putUuid(String memberSrl, String uuid, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);
    Put put = new Put(Bytes.toBytes(rowKey));
    byte[] value = Bytes.toBytes(uuid);
    put.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp, value);
    HBaseUtil.put(LIMIT_CONTROL, Arrays.asList(put));
}

public static void delLimit(String memberSrl, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);
    Delete del = new Delete(Bytes.toBytes(rowKey));
    del.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp);
    List<Delete> deletes = new ArrayList<>();
    deletes.add(del);
    HBaseUtil.del(LIMIT_CONTROL, deletes);
}

public static void getLimit(String memberSrl, String trustType) throws IOException {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);
    Get get = new Get(Bytes.toBytes(rowKey));
    get.addColumn(LIMIT_CONTROL.getCF(), HColumn.valueOf(trustType).getCol());
    int limit = DurationLimit.getLimit(HColumn.UUID.name());
    System.out.println("get limit version: " + limit);
    //!!! set version is important for limit control
    get.setMaxVersions(limit);
    Result[] results = HBaseUtil.get(LIMIT_CONTROL, get);
    for (Result result : results) {
        CellUtil.displayResult(result, String.class);
    }
}
Appendix: HBase client 1.2 Delete source code:
/**
 * Used to perform Delete operations on a single row.
 * <p>
 * To delete an entire row, instantiate a Delete object with the row
 * to delete. To further define the scope of what to delete, perform
 * additional methods as outlined below.
 * <p>
 * To delete specific families, execute {@link #addFamily(byte[]) deleteFamily}
 * for each family to delete.
 * <p>
 * To delete multiple versions of specific columns, execute
 * {@link #addColumns(byte[], byte[]) deleteColumns}
 * for each column to delete.
 * <p>
 * To delete specific versions of specific columns, execute
 * {@link #addColumn(byte[], byte[], long) deleteColumn}
 * for each column version to delete.
 * <p>
 * Specifying timestamps, deleteFamily and deleteColumns will delete all
 * versions with a timestamp less than or equal to that passed. If no
 * timestamp is specified, an entry is added with a timestamp of 'now'
 * where 'now' is the servers's System.currentTimeMillis().
 * Specifying a timestamp to the deleteColumn method will
 * delete versions only with a timestamp equal to that specified.
 * If no timestamp is passed to deleteColumn, internally, it figures the
 * most recent cell's timestamp and adds a delete at that timestamp; i.e.
 * it deletes the most recently added cell.
 * <p>The timestamp passed to the constructor is used ONLY for delete of
 * rows. For anything less -- a deleteColumn, deleteColumns or
 * deleteFamily -- then you need to use the method overrides that take a
 * timestamp. The constructor timestamp is not referenced.
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Delete extends Mutation implements Comparable<Row> {
  /**
   * Create a Delete operation for the specified row.
   * <p>
   * If no further operations are done, this will delete everything
   * associated with the specified row (all versions of all columns in all
   * families).
   * @param row row key
   */
  public Delete(byte [] row) {
    this(row, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation. If specifying
   * families or columns, you must specify each timestamp individually.
   * @param row row key
   * @param timestamp maximum version timestamp (only for delete row)
   */
  public Delete(byte [] row, long timestamp) {
    this(row, 0, row.length, timestamp);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation. If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength) {
    this(rowArray, rowOffset, rowLength, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation. If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   * @param ts maximum version timestamp (only for delete row)
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength, long ts) {
    checkRow(rowArray, rowOffset, rowLength);
    this.row = Bytes.copy(rowArray, rowOffset, rowLength);
    setTimestamp(ts);
  }

  /**
   * @param d Delete to clone.
   */
  public Delete(final Delete d) {
    this.row = d.getRow();
    this.ts = d.getTimeStamp();
    this.familyMap.putAll(d.getFamilyCellMap());
    this.durability = d.durability;
    for (Map.Entry<String, byte[]> entry : d.getAttributesMap().entrySet()) {
      this.setAttribute(entry.getKey(), entry.getValue());
    }
  }

  /**
   * Advanced use only.
   * Add an existing delete marker to this Delete object.
   * @param kv An existing KeyValue of type "delete".
   * @return this for invocation chaining
   * @throws IOException
   */
  @SuppressWarnings("unchecked")
  public Delete addDeleteMarker(Cell kv) throws IOException {
    // TODO: Deprecate and rename 'add' so it matches how we add KVs to Puts.
    if (!CellUtil.isDelete(kv)) {
      throw new IOException("The recently added KeyValue is not of type "
          + "delete. Rowkey: " + Bytes.toStringBinary(this.row));
    }
    if (Bytes.compareTo(this.row, 0, row.length, kv.getRowArray(),
        kv.getRowOffset(), kv.getRowLength()) != 0) {
      throw new WrongRowIOException("The row in " + kv.toString()
          + " doesn't match the original one " + Bytes.toStringBinary(this.row));
    }
    byte [] family = CellUtil.cloneFamily(kv);
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[])}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family) {
    return addFamily(family);
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family) {
    this.deleteFamily(family, this.ts);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family, long timestamp) {
    return addFamily(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    } else if(!list.isEmpty()) {
      list.clear();
    }
    KeyValue kv = new KeyValue(row, family, null, timestamp, KeyValue.Type.DeleteFamily);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamilyVersion(byte [] family, long timestamp) {
    return addFamilyVersion(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamilyVersion(final byte [] family, final long timestamp) {
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(row, family, null, timestamp,
        KeyValue.Type.DeleteFamilyVersion));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier) {
    return addColumns(family, qualifier);
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier) {
    addColumns(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier, long timestamp) {
    return addColumns(family, qualifier, timestamp);
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(this.row, family, qualifier, timestamp,
        KeyValue.Type.DeleteColumn));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp. Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier) {
    return addColumn(family, qualifier);
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp. Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumn(final byte [] family, final byte [] qualifier) {
    this.deleteColumn(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier, long timestamp) {
    return addColumn(family, qualifier, timestamp);
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumn(byte [] family, byte [] qualifier, long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    KeyValue kv = new KeyValue(this.row, family, qualifier, timestamp, KeyValue.Type.Delete);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Set the timestamp of the delete.
   *
   * @param timestamp
   */
  public void setTimestamp(long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    this.ts = timestamp;
  }

  @Override
  public Map<String, Object> toMap(int maxCols) {
    // we start with the fingerprint map and build on top of it.
    Map<String, Object> map = super.toMap(maxCols);
    // why is put not doing this?
    map.put("ts", this.ts);
    return map;
  }
}