Reads and Writes After an HBase Delete

1 HBase delete operations

 

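Deleting a ColumnFamily

Delete delete = new Delete(rowKey);

// Pass the timestamp to addFamily itself: calling setTimestamp() after addFamily() does not change a marker that was already added.
delete.addFamily(columnFamily, tm);

Deletes every value in the column family whose timestamp is less than or equal to the given timestamp. If no timestamp is given, the marker is written with a timestamp of 'now' (the server's System.currentTimeMillis()), so every existing version is deleted. No prior read is needed in this case.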
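One detail worth checking against the Delete source quoted at the end of this post: addFamily(family) reads the object's ts field at the moment it is called, so a setTimestamp() issued afterwards cannot change a marker that already exists. The sketch below is a toy stand-in, not the HBase client: ToyDelete and its String family argument are made up for illustration, and setTimestamp returns this only to allow chaining. Per the quoted source, addColumns(family, qualifier) reads this.ts the same way.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for the client-side Delete class. This is NOT the HBase API. */
final class ToyDelete {
    /** Same value as HConstants.LATEST_TIMESTAMP in HBase. */
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    private long ts = LATEST_TIMESTAMP;
    private final List<Long> markerTimestamps = new ArrayList<>();

    ToyDelete setTimestamp(long timestamp) {
        if (timestamp < 0) {
            throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
        }
        this.ts = timestamp;
        return this;
    }

    /** Like addFamily(family): the marker copies this.ts at call time. */
    ToyDelete addFamily(String family) {
        return addFamily(family, this.ts);
    }

    /** Like addFamily(family, timestamp): the marker uses the explicit timestamp. */
    ToyDelete addFamily(String family, long timestamp) {
        markerTimestamps.add(timestamp);
        return this;
    }

    /** Timestamps of the markers added so far, in insertion order. */
    List<Long> markerTimestamps() {
        return markerTimestamps;
    }

    public static void main(String[] args) {
        long tm = 5L;
        // setTimestamp after addFamily: the marker keeps LATEST_TIMESTAMP.
        System.out.println(new ToyDelete().addFamily("cf").setTimestamp(tm).markerTimestamps());
        // setTimestamp before addFamily: the marker carries tm.
        System.out.println(new ToyDelete().setTimestamp(tm).addFamily("cf").markerTimestamps());
        // Explicit timestamp: independent of call order.
        System.out.println(new ToyDelete().addFamily("cf", tm).markerTimestamps());
    }
}
```

Passing the timestamp directly to addFamily or addColumns avoids depending on call order at all.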

 

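Deleting a Column

Delete delete = new Delete(rowKey);

// Pass the timestamp to addColumns itself: calling setTimestamp() after addColumns() does not change a marker that was already added.
delete.addColumns(columnFamily, column, tm);

Deletes every value in the specified column whose timestamp is less than or equal to the given timestamp. If no timestamp is given, the marker is written with a timestamp of 'now' (the server's System.currentTimeMillis()), so every existing version of the column is deleted. No prior read is needed in this case.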

 

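Deleting a single version of a value

Delete delete = new Delete(rowKey);

delete.addColumn(columnFamily, column, timestamp);

Deletes only the value in the specified column whose timestamp equals the given timestamp. If no timestamp is given (addColumn(columnFamily, column)), the most recent cell's timestamp is used. [This is expensive: the server first has to do a get to find the timestamp of the latest existing version.]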

 

2 Reads and writes after a delete

 

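Deleting a ColumnFamily

delete.addFamily(columnFamily, ts)

After the column family is deleted, data with a timestamp <= ts can no longer be read. Data written afterwards with a timestamp <= ts is also invisible: the put itself is accepted, but the delete marker masks it on read.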

 

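Deleting a Column

delete.addColumns(columnFamily, column, ts)

After the column is deleted, data with a timestamp <= ts can no longer be read. Data written afterwards with a timestamp <= ts is also invisible: the put itself is accepted, but the delete marker masks it on read.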

 

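Deleting a single version

delete.addColumn(columnFamily, column, ts);

After a single version is deleted, only data with a timestamp exactly equal to ts can no longer be read, and only a later put with a timestamp exactly equal to ts is masked.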
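The three rules above can be condensed into one predicate. This is a minimal sketch with hypothetical names (MarkerRules, Type, masks), not HBase code. It only models the timestamp comparison and ignores scope, that is, it assumes the marker and the cell already refer to the same row and the same family or column.

```java
/** Toy model of which cell timestamps a delete marker hides. NOT HBase code. */
final class MarkerRules {
    /** Marker kinds produced by addFamily, addColumns and addColumn respectively. */
    enum Type { DELETE_FAMILY, DELETE_COLUMN, DELETE_VERSION }

    /**
     * Returns true if a marker of the given type and timestamp hides a cell
     * with timestamp cellTs. Scope (row, family, qualifier) is assumed to match.
     */
    static boolean masks(Type type, long markerTs, long cellTs) {
        switch (type) {
            case DELETE_FAMILY:
            case DELETE_COLUMN:
                // Range delete: everything at or below the marker is hidden.
                return cellTs <= markerTs;
            case DELETE_VERSION:
                // Point delete: only the exact timestamp is hidden.
                return cellTs == markerTs;
            default:
                throw new AssertionError("unknown marker type: " + type);
        }
    }

    public static void main(String[] args) {
        long markerTs = 3L;
        for (Type type : Type.values()) {
            for (long cellTs = 2L; cellTs <= 4L; cellTs++) {
                System.out.println(type + " @" + markerTs + " masks cell @" + cellTs
                        + ": " + masks(type, markerTs, cellTs));
            }
        }
    }
}
```

The same predicate applies whether the cell was written before or after the marker, which is why a later put can appear to be silently dropped.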

 

After a single version of a value is deleted, can the versions before it still be read?

/**
 * uuid-1, 1L;
 * uuid-2, 2L;
 *
 * cache --> uuid-2, 3L;
 *
 * del uuid-2, 2L;
 * put uuid-2, 3L;
 *
 * get all:
 * uuid-1 1L
 * uuid-2 3L
 *
 */
@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-4";
    String uuid1 = "uuid-1";
    String uuid2 = "uuid-2";
    Long uuid2Time = Long.valueOf(2L);

    Map<String, Long> map = new HashMap<>();
    map.put(uuid1, Long.valueOf(1L));
    map.put(uuid2, uuid2Time);

    for (Map.Entry<String, Long> entry: map.entrySet()) {
        putUuid(member, entry.getKey(), entry.getValue());
    }
    System.out.println("---> after put ");
    getLimit(member, HColumn.UUID.name());

    // delete uuid2 2L
    delLimit(member, uuid2Time);
    System.out.println("---> after del ");
    getLimit(member, HColumn.UUID.name());

    // update uuid2, timestamp = 3L
    putUuid(member, uuid2, Long.valueOf(3L));
    System.out.println("---> after update uuid2");
    getLimit(member, HColumn.UUID.name());

}

Summary: after a single version of a value is deleted, the versions before it can still be read; only that one version is gone.

 

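Writing after a single version is deleted

After a single version is deleted, can a value be written with a timestamp equal to or earlier than that version?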

@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-11";
    String uuid1 = "uuid-1";
    String uuid3 = "uuid-3";

    putUuid(member, uuid1, Long.valueOf(1L));
    putUuid(member, uuid3, Long.valueOf(3L));
    System.out.println("---> after put u1, u3");
    getLimit(member, HColumn.UUID.name());

    // delete 3L
    delLimit(member, Long.valueOf(3L));
    System.out.println("---> after del 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid2, timestamp = 3L: masked by the delete marker at 3L, so it cannot be read back
    putUuid(member, "uuid-2", Long.valueOf(3L));
    System.out.println("---> after put u2, 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid3, timestamp = 2L, write OK
    putUuid(member, "uuid-3", Long.valueOf(2L));
    System.out.println("---> after put u3, 2L");
    getLimit(member, HColumn.UUID.name());
}

 

---> after put u1, u3
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 3
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after del 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u2, 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u3, 2L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 2
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1

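After a single version is deleted, a value written with the same timestamp does not show up on read (the delete marker masks it); a value written with a different timestamp is written and read normally.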

 

For the underlying reasons, see:

https://hbase.apache.org/book.html#_delete

https://hbase.apache.org/book.html#version.delete

 

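A strange behaviour I ran into earlier:

After deleting a column and then writing the same data back to that column, the data simply would not show up, no matter what. For a while I suspected a bug in our wrapper code.

Later, for unrelated reasons, the table was dropped, and after that the write worked again.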

 

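Root cause:

An HBase Delete does not remove the data from its location in the underlying files. It only records a delete marker.
That is, a delete adds a record like <delete, cell, timestamp>. Until the next major compaction, this delete record and the real data records, for example <cell, timestamp1>, both exist in HBase storage.

At read time, <delete, cell, timestamp> and <cell, timestamp1> are both read, and the delete marker decides what is finally returned to the user. For a column or column family delete, every cell whose timestamp is less than or equal to the marker's timestamp is treated as deleted; for a single-version delete, only the cell with exactly that timestamp is.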
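To make the mechanism concrete, here is a toy in-memory model of a single column that replays the test sequence from this post. Everything in it (ToyColumnStore and its methods) is hypothetical and greatly simplified: it keeps one value per timestamp, supports only single-version markers, and assumes a major compaction drops both the markers and the cells they mask. Real HBase keeps cells across memstore and HFiles, and table settings can change when markers are purged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of one column with tombstone deletes. NOT HBase code. */
final class ToyColumnStore {
    /** Stored cells: timestamp -> value (one value per timestamp in this toy). */
    private final TreeMap<Long, String> cells = new TreeMap<>();
    /** Timestamps of single-version delete markers (tombstones). */
    private final TreeSet<Long> tombstones = new TreeSet<>();

    /** A put is always stored, even if a tombstone already covers its timestamp. */
    void put(long ts, String value) {
        cells.put(ts, value);
    }

    /** A delete only records a marker; the data stays where it is. */
    void deleteVersion(long ts) {
        tombstones.add(ts);
    }

    /** Read path: return visible cells, newest first, skipping masked ones. */
    List<String> read() {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<Long, String> e : cells.descendingMap().entrySet()) {
            if (!tombstones.contains(e.getKey())) {
                visible.add(e.getValue() + "@" + e.getKey());
            }
        }
        return visible;
    }

    /** Simplified major compaction: drop masked cells, then drop the markers. */
    void majorCompact() {
        cells.keySet().removeIf(tombstones::contains);
        tombstones.clear();
    }

    public static void main(String[] args) {
        ToyColumnStore store = new ToyColumnStore();
        store.put(1L, "uuid-1");
        store.put(3L, "uuid-3");
        store.deleteVersion(3L);
        store.put(3L, "uuid-2");   // stored, but masked by the marker at 3
        store.put(2L, "uuid-3");   // different timestamp, visible
        System.out.println("before compaction: " + store.read());
        store.majorCompact();
        store.put(3L, "uuid-2");   // the marker is gone, so this is visible
        System.out.println("after compaction:  " + store.read());
    }
}
```

In this model the put at timestamp 3 only becomes readable once the marker has been removed, which matches the observation that the write started working again after the table was dropped.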

 

It turns out that HBase deletes are tombstone-based, which I had forgotten.


 

 

Appendix: shared helper methods used by the tests:

public static void putUuid(String memberSrl, String uuid, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);

    Put put = new Put(Bytes.toBytes(rowKey));
    byte[] value = Bytes.toBytes(uuid);
    put.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp, value);

    HBaseUtil.put(LIMIT_CONTROL, Arrays.asList(put));
}

public static void delLimit(String memberSrl, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);

    Delete del = new Delete(Bytes.toBytes(rowKey));
    del.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp);

    List<Delete> deletes = new ArrayList<>();
    deletes.add(del);

    HBaseUtil.del(LIMIT_CONTROL, deletes);
}

public static void getLimit(String memberSrl, String trustType) throws IOException {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);
    Get get = new Get(Bytes.toBytes(rowKey));
    get.addColumn(LIMIT_CONTROL.getCF(), HColumn.valueOf(trustType).getCol());

    int limit = DurationLimit.getLimit(HColumn.UUID.name());
    System.out.println("get limit version: " + limit);
    //!!! set version is important for limit control
    get.setMaxVersions(limit);

    Result[] results = HBaseUtil.get(LIMIT_CONTROL, get);
    for (Result result: results) {
        CellUtil.displayResult(result, String.class);
    }
}

 

Appendix: the Delete source code from HBase client 1.2:

/**
 * Used to perform Delete operations on a single row.
 * <p>
 * To delete an entire row, instantiate a Delete object with the row
 * to delete.  To further define the scope of what to delete, perform
 * additional methods as outlined below.
 * <p>
 * To delete specific families, execute {@link #addFamily(byte[]) deleteFamily}
 * for each family to delete.
 * <p>
 * To delete multiple versions of specific columns, execute
 * {@link #addColumns(byte[], byte[]) deleteColumns}
 * for each column to delete.
 * <p>
 * To delete specific versions of specific columns, execute
 * {@link #addColumn(byte[], byte[], long) deleteColumn}
 * for each column version to delete.
 * <p>
 * Specifying timestamps, deleteFamily and deleteColumns will delete all
 * versions with a timestamp less than or equal to that passed.  If no
 * timestamp is specified, an entry is added with a timestamp of 'now'
 * where 'now' is the servers's System.currentTimeMillis().
 * Specifying a timestamp to the deleteColumn method will
 * delete versions only with a timestamp equal to that specified.
 * If no timestamp is passed to deleteColumn, internally, it figures the
 * most recent cell's timestamp and adds a delete at that timestamp; i.e.
 * it deletes the most recently added cell.
 * <p>The timestamp passed to the constructor is used ONLY for delete of
 * rows.  For anything less -- a deleteColumn, deleteColumns or
 * deleteFamily -- then you need to use the method overrides that take a
 * timestamp.  The constructor timestamp is not referenced.
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Delete extends Mutation implements Comparable<Row> {
  /**
   * Create a Delete operation for the specified row.
   * <p>
   * If no further operations are done, this will delete everything
   * associated with the specified row (all versions of all columns in all
   * families).
   * @param row row key
   */
  public Delete(byte [] row) {
    this(row, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param row row key
   * @param timestamp maximum version timestamp (only for delete row)
   */
  public Delete(byte [] row, long timestamp) {
    this(row, 0, row.length, timestamp);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength) {
    this(rowArray, rowOffset, rowLength, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   * @param ts maximum version timestamp (only for delete row)
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength, long ts) {
    checkRow(rowArray, rowOffset, rowLength);
    this.row = Bytes.copy(rowArray, rowOffset, rowLength);
    setTimestamp(ts);
  }

  /**
   * @param d Delete to clone.
   */
  public Delete(final Delete d) {
    this.row = d.getRow();
    this.ts = d.getTimeStamp();
    this.familyMap.putAll(d.getFamilyCellMap());
    this.durability = d.durability;
    for (Map.Entry<String, byte[]> entry : d.getAttributesMap().entrySet()) {
      this.setAttribute(entry.getKey(), entry.getValue());
    }
  }

  /**
   * Advanced use only.
   * Add an existing delete marker to this Delete object.
   * @param kv An existing KeyValue of type "delete".
   * @return this for invocation chaining
   * @throws IOException
   */
  @SuppressWarnings("unchecked")
  public Delete addDeleteMarker(Cell kv) throws IOException {
    // TODO: Deprecate and rename 'add' so it matches how we add KVs to Puts.
    if (!CellUtil.isDelete(kv)) {
      throw new IOException("The recently added KeyValue is not of type "
          + "delete. Rowkey: " + Bytes.toStringBinary(this.row));
    }
    if (Bytes.compareTo(this.row, 0, row.length, kv.getRowArray(),
        kv.getRowOffset(), kv.getRowLength()) != 0) {
      throw new WrongRowIOException("The row in " + kv.toString() +
        " doesn't match the original one " +  Bytes.toStringBinary(this.row));
    }
    byte [] family = CellUtil.cloneFamily(kv);
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[])}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family) {
    return addFamily(family);
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family) {
    this.deleteFamily(family, this.ts);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family, long timestamp) {
    return addFamily(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    } else if(!list.isEmpty()) {
      list.clear();
    }
    KeyValue kv = new KeyValue(row, family, null, timestamp, KeyValue.Type.DeleteFamily);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamilyVersion(byte [] family, long timestamp) {
    return addFamilyVersion(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamilyVersion(final byte [] family, final long timestamp) {
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(row, family, null, timestamp,
          KeyValue.Type.DeleteFamilyVersion));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier) {
    return addColumns(family, qualifier);
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier) {
    addColumns(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier, long timestamp) {
    return addColumns(family, qualifier, timestamp);
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(this.row, family, qualifier, timestamp,
        KeyValue.Type.DeleteColumn));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp.  Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier) {
    return addColumn(family, qualifier);
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp.  Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumn(final byte [] family, final byte [] qualifier) {
    this.deleteColumn(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier, long timestamp) {
    return addColumn(family, qualifier, timestamp);
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumn(byte [] family, byte [] qualifier, long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    KeyValue kv = new KeyValue(this.row, family, qualifier, timestamp, KeyValue.Type.Delete);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Set the timestamp of the delete.
   *
   * @param timestamp
   */
  public void setTimestamp(long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    this.ts = timestamp;
  }

  @Override
  public Map<String, Object> toMap(int maxCols) {
    // we start with the fingerprint map and build on top of it.
    Map<String, Object> map = super.toMap(maxCols);
    // why is put not doing this?
    map.put("ts", this.ts);
    return map;
  }
}