Using HBase for Near-Real-Time Responses over Hundreds of Millions of Records



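The project's app behavior logs, together with the address books, call records, SMS messages, and contact information collected with user authorization, have grown over time to hundreds of millions of rows. At tens of millions of rows, creating indexes was enough to speed up queries, but at this scale that optimization may no longer help. To serve real-time query requests during the credit review stage, we introduced HBase to fix the slow responses.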


When Should I Use HBase?
HBase isn’t suitable for every problem.


First, make sure you have enough data. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS might be a better choice due to the fact that all of your data might wind up on a single node (or two) and the rest of the cluster may be sitting idle.


Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns, secondary indexes, transactions, advanced query languages, etc.) An application built against an RDBMS cannot be "ported" to HBase by simply changing a JDBC driver, for example. Consider moving from an RDBMS to HBase as a complete redesign as opposed to a port.


Third, make sure you have enough hardware. Even HDFS doesn’t do well with anything less than 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.


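In short, HBase is not suited to every problem. First, you need enough data. Second, the business must be able to run without relational database features (typed columns, secondary indexes, transactions, a powerful query language, and so on). Third, make sure you have enough hardware: HDFS in particular does not work well with fewer than 5 DataNodes plus a NameNode.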
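Because HBase has no secondary indexes, lookups that an RDBMS would serve through an index usually have to be designed into the row key. The following is only a minimal sketch of one common approach, not the key layout this project actually uses: the field order, the "|" separator, the two-character salt, and the names salt, scan_prefix, and row_key are all assumptions made for illustration. It also assumes user ids and record types never contain "|".

```python
import hashlib
import struct

# Largest signed 64-bit value; subtracting a timestamp from it reverses the sort order.
MAX_LONG = 2**63 - 1


def salt(user_id: str) -> bytes:
    """Two hex characters derived from the user id (hypothetical scheme).

    The goal is to spread different users across regions instead of
    clustering sequential ids on one region server.
    """
    return hashlib.md5(user_id.encode("utf-8")).hexdigest()[:2].encode("ascii")


def scan_prefix(user_id: str, record_type: str) -> bytes:
    """Prefix shared by all rows of one record type for one user."""
    return b"|".join([salt(user_id),
                      user_id.encode("utf-8"),
                      record_type.encode("utf-8"),
                      b""])


def row_key(user_id: str, record_type: str, ts_ms: int) -> bytes:
    """Full row key: prefix plus a big-endian reversed timestamp.

    HBase sorts rows by the raw bytes of the key, so the reversed
    timestamp makes the newest record for a user sort first.
    """
    return scan_prefix(user_id, record_type) + struct.pack(">q", MAX_LONG - ts_ms)


if __name__ == "__main__":
    older = row_key("u1001", "sms", 1_600_000_000_000)
    newer = row_key("u1001", "sms", 1_700_000_000_000)
    # Sorting by bytes puts the newer record ahead of the older one.
    print(sorted([older, newer])[0] == newer)
```

With a key shaped like this, "all SMS records for one user, newest first" becomes a single prefix scan rather than an indexed query, which is the kind of redesign the excerpt above refers to when it says a move to HBase is not a simple port.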


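The project adds a big data platform to handle high-traffic, high-concurrency, low-latency requests. On one side the data is read from and written to HBase; on the other it flows into the Kafka data bus, which connects the data stream to the data center.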


 

深圳逆時針


 
