An Elasticsearch Utility Class with Tree-Structure Support
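- 1. Preface
- 2. Goals
- 3. Problems and Solutions
- 4. Code Structure Design
- 4.1 Interface for flat structures: EsRepository
- 4.2 Interface for tree structures: EsTreeRepository
- 4.3 Abstract class for flat structures: AbstractEsRepository
- 4.4 Abstract class for tree structures: AbstractEsTreeRepository
- 4.5 Base class for tree structures: TreeNode
- 4.6 Test entity class: TestTreeEntity
- 5. Implemented Methods
- 5.1 saveAll
- 5.2 deleteById
- 5.3 deleteByQuery
- 5.4 updateAllById
- 5.5 existsById
- 5.6 Fetching a List of data by a collection of ids
- 5.7 countGroupBy
- 5.8 countByParentId for tree structures
- Summary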
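1. Preface
Several of my recent projects use ES as the database. One project uses the open-source Jest as its ES client. It works reasonably well, but it has not been updated for a long time. Another project used a utility class we wrote ourselves, which was quite rough, and my lead wanted it to support both ES 5.6 and ES 6.8. So I implemented the common CRUD methods with the ES 5.6 Low Level Java API. Later I tried Spring Data Elasticsearch and found it a pleasure to use: the API is very rich, as you would expect from Spring. And since I use Spring Data JPA regularly, it felt familiar right away.
My skills are fairly modest. In several years of Java development I have mostly done CRUD, so reading the Spring Data ES source is hard going and I mostly cannot follow it. So I decided to write a simple ES utility class of my own, purely to get familiar with the ES Java API. Compared with the elegant Spring Data ES, it is a world apart.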
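2. Goals
As it happens, the first Java feature I ever wrote was an ORM utility that generated CRUD methods from entity classes, so I still remember a little about Java generics and reflection, and writing this ES ORM tool brought back that familiar feeling. Here are the goals for this round:
2.1 Goal: entity-based CRUD
I have read plenty of articles online arguing whether JPA or MyBatis is better, and I find the argument pointless: whatever suits you is the right choice. Personally I like JPA's object-oriented flavor, and it also lets you write SQL queries by hand. So the interface I want to implement looks roughly like this: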
T save(T t);
T findById(String id);
long count(QueryBuilder queryBuilder);
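2.2 Goal: support querying tree-structured data in ES
The tree structure I mean here is not ES's built-in parent-child documents. I find the ES parent syntax hard to use; maybe I am just not skilled enough with parent-child documents.
For the tree structure I mean, see my previous blog post, "Storing Tree Structures in ES with Spring Data Elasticsearch".
As an aside, in a tree structure it is best for each node to have exactly one parent. Nodes with multiple parents do come up at work, but that approach has many pitfalls and is painful to maintain. So personally I refuse to build trees with multiple parents.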
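3. Problems and Solutions
3.1 Problem: obtaining the class of the generic type T without passing the class explicitly
The generic type T has already been supplied, so deriving T's class from it makes the code more elegant.
Otherwise you end up like other people's code online, which has to pass an extra clazz argument to every method. Isn't that annoying?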
T getById(M id, Class<T> clazz)
boolean exists(M id, Class<T> clazz)
long count(QueryBuilder queryBuilder, Class<T> clazz);
List<T> searchMore(QueryBuilder queryBuilder,int limitSize, Class<T> clazz);
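3.1 Solution: copy Spring Data's homework
Spring Data ES has a piece of code that looks impressive even though I do not fully understand it. My rough understanding is that the subclass (AbstractElasticsearchRepository<T, ID>) implements the interface (ElasticsearchRepository<T, ID>), and the type of T is read from the parent's generic declaration inside the subclass. I only copied about half of this homework.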
private ParameterizedType resolveReturnedClassFromGenericType(Class<?> clazz) {
Object genericSuperclass = clazz.getGenericSuperclass();
if (genericSuperclass instanceof ParameterizedType) {
ParameterizedType parameterizedType = (ParameterizedType) genericSuperclass;
Type rawtype = parameterizedType.getRawType();
if (SimpleElasticsearchRepository.class.equals(rawtype)) {
return parameterizedType;
}
}
return resolveReturnedClassFromGenericType(clazz.getSuperclass());
}
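To make the idea concrete without depending on Spring, here is a minimal sketch using only the JDK. The names BaseRepository, User and UserRepository are hypothetical stand-ins, not classes from the utility described in this post. It assumes the concrete repository binds T to a plain class somewhere in its superclass chain, and it stops at Object instead of recursing forever when no match is found.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class GenericTypeDemo {

    /** Hypothetical stand-in for a generic base repository. */
    static abstract class BaseRepository<T> {
        private final Class<T> entityClass;

        @SuppressWarnings("unchecked")
        protected BaseRepository() {
            // getClass() is the concrete subclass, e.g. UserRepository.
            this.entityClass = (Class<T>) resolveEntityClass(getClass());
        }

        Class<T> getEntityClass() {
            return entityClass;
        }

        /** Walks up the hierarchy until a class that directly extends BaseRepository<X> is found. */
        private static Class<?> resolveEntityClass(Class<?> clazz) {
            for (Class<?> c = clazz; c != null && c != Object.class; c = c.getSuperclass()) {
                Type superType = c.getGenericSuperclass();
                if (superType instanceof ParameterizedType) {
                    ParameterizedType pt = (ParameterizedType) superType;
                    if (BaseRepository.class.equals(pt.getRawType())) {
                        Type arg = pt.getActualTypeArguments()[0];
                        if (arg instanceof Class) {
                            return (Class<?>) arg;
                        }
                    }
                }
            }
            throw new IllegalStateException("Cannot resolve entity class for " + clazz);
        }
    }

    static class User { }

    /** The type argument User is recorded in the class file, so it survives erasure. */
    static class UserRepository extends BaseRepository<User> { }

    public static void main(String[] args) {
        System.out.println(new UserRepository().getEntityClass().getSimpleName()); // prints User
    }
}
```

With this in place, methods such as findById(String id) no longer need a Class<T> parameter, because the repository can call getEntityClass() itself.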
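3.2 Problem: how to design the tree structure
For tree-structured data, the scenarios we commonly need are:
- Given an id, get its direct child nodes
- Given an id, get all of its descendant nodes, for example the total number of descendants
- Given an id, get all of its ancestor nodes
- When a node changes parent, update the path information of all of that node's descendants
- When deleting a node, check whether it has descendants; if it does, do not allow the deletion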
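Of these scenarios, changing a node's parent is the least obvious, so here is a small in-memory sketch of the path rewrite. It is only an illustration under assumptions: the Node class and the reparent method are hypothetical, the path is simplified to a list of ancestor ids ordered from the root down to the parent, and nothing here talks to ES.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReparentDemo {

    /** Minimal in-memory node; path holds ancestor ids from the root down to the parent. */
    static class Node {
        final String id;
        String parentId;
        List<String> path;

        Node(String id, String parentId, List<String> path) {
            this.id = id;
            this.parentId = parentId;
            this.path = new ArrayList<>(path);
        }

        int level() {
            return path.size() + 1;
        }
    }

    /** Moves `moved` under `newParent` and rewrites the path of every descendant. */
    static void reparent(Node moved, Node newParent, List<Node> all) {
        if (newParent == moved || newParent.path.contains(moved.id)) {
            throw new IllegalArgumentException("A node cannot be moved under itself or its own descendant");
        }
        int oldPrefixLength = moved.path.size();
        List<String> newPrefix = new ArrayList<>(newParent.path);
        newPrefix.add(newParent.id);

        for (Node n : all) {
            // A descendant is any node whose path contains the moved node's id.
            if (n != moved && n.path.contains(moved.id)) {
                List<String> rewritten = new ArrayList<>(newPrefix);
                // Keep the tail of the old path, starting at the moved node itself.
                rewritten.addAll(n.path.subList(oldPrefixLength, n.path.size()));
                n.path = rewritten;
            }
        }
        moved.parentId = newParent.id;
        moved.path = newPrefix;
    }

    public static void main(String[] args) {
        Node r = new Node("r", null, Collections.<String>emptyList());
        Node a = new Node("a", "r", Arrays.asList("r"));
        Node b = new Node("b", "r", Arrays.asList("r"));
        Node c = new Node("c", "a", Arrays.asList("r", "a"));
        Node d = new Node("d", "c", Arrays.asList("r", "a", "c"));

        reparent(a, b, Arrays.asList(r, a, b, c, d));
        System.out.println(d.path); // prints [r, b, a, c]
    }
}
```

The same check that finds descendants (does the path contain this id?) also covers the delete rule: a node may be deleted only if no other node's path contains its id.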
3.2 Solution: use the ES nested type to record ancestor node IDs
Following "Storing Tree Structures in ES with Spring Data Elasticsearch", here are the ES mapping and a sample document:
PUT /pigg_tree/_mapping/_doc
{
"properties":{
"id":{
"type":"keyword"
},
"level":{
"type":"keyword"
},
"name":{
"type":"keyword"
},
"parentId":{
"type":"keyword"
},
"path":{
"type":"nested",
"properties":{
"id":{
"type":"keyword"
},
"level":{
"type":"keyword"
}
}
}
}
}
{
"_index" : "pigg_tree",
"_type" : "_doc",
"_id" : "5ebdf2a8551fa08956079179",
"_score" : null,
"_source" : {
"parentId" : "5ebdf263551fd81d52158964",
"level" : 3,
"path" : [
{
"level" : 1,
"id" : "5ebdf241551f9ae2328fa452"
},
{
"level" : 2,
"id" : "5ebdf263551fd81d52158964"
}
],
"id" : "5ebdf2a8551fa08956079179",
"name" : "夏夏夏"
},
"sort" : [
"夏夏夏",
"3"
]
}
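The sample document above shows the rule for filling in a new node: its path is the parent's path plus the parent itself, and its level is the parent's level plus one. The sketch below reproduces that document in memory. The classes PathEntry and Node and the helpers root and childOf are hypothetical illustrations, not the actual saveNode implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class PathBuilderDemo {

    /** One entry of the nested "path" field: an ancestor id and that ancestor's level. */
    static class PathEntry {
        final String id;
        final int level;

        PathEntry(String id, int level) {
            this.id = id;
            this.level = level;
        }
    }

    static class Node {
        String id;
        String parentId;
        int level;
        List<PathEntry> path = new ArrayList<>();
    }

    /** Creates a root node: no parent, level 1, empty path. */
    static Node root(String id) {
        Node n = new Node();
        n.id = id;
        n.level = 1;
        return n;
    }

    /** Creates a child whose path is the parent's path followed by the parent itself. */
    static Node childOf(Node parent, String id) {
        Node n = new Node();
        n.id = id;
        n.parentId = parent.id;
        n.level = parent.level + 1;
        n.path = new ArrayList<>(parent.path);
        n.path.add(new PathEntry(parent.id, parent.level));
        return n;
    }

    public static void main(String[] args) {
        // Ids taken from the sample document above.
        Node level1 = root("5ebdf241551f9ae2328fa452");
        Node level2 = childOf(level1, "5ebdf263551fd81d52158964");
        Node level3 = childOf(level2, "5ebdf2a8551fa08956079179");

        System.out.println("level=" + level3.level + ", parentId=" + level3.parentId);
        for (PathEntry entry : level3.path) {
            System.out.println("  ancestor " + entry.id + " at level " + entry.level);
        }
    }
}
```

Because every node carries its full list of ancestors, finding all descendants of a node becomes a single nested query on path.id, with no recursive lookups.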
4. Code Structure Design
4.1 Interface for flat structures: EsRepository
@NoRepositoryBean
public interface EsRepository<T> {
T save(T t);
T saveWithoutRefresh(T t);
Iterable<T> saveAll(Iterable<T> entities);
boolean deleteById(String id);
void deleteByQuery(QueryBuilder query);
boolean updateById(String id, Map<String, Object> doc);
void updateAllById(Iterable<String> ids, Map<String, Object> doc);
void updateByQuery(QueryBuilder query, Script script);
boolean existsById(String id);
Optional<T> findById(String id);
Optional<T> findById(String id, SourceFilter sourceFilter);
List<T> findAllById(Iterable<String> ids);
List<T> findAllById(Iterable<String> ids, SourceFilter sourceFilter);
List<T> findByQuery(QueryBuilder query);
List<T> findByQuery(QueryBuilder query, SourceFilter sourceFilter);
List<T> findByQuery(SearchQuery searchQuery);
PageInfo<T> pageQuery(SearchQuery searchQuery);
Long count(QueryBuilder query);
Map<String, Long> countGroupBy(String field, QueryBuilder query, Integer resultSize);
Class<T> getEntityClass();
}
4.2 Interface for tree structures: EsTreeRepository
@NoRepositoryBean
public interface EsTreeRepository<T extends TreeNode> extends EsRepository<T> {
T saveNode(T t);
Iterable<T> saveAllNodeOfParent(String parentId, Iterable<T> entities);
boolean deleteNodeById(String id);
List<T> findChildrenByParentId(String parentId, boolean onlyNextLevel, SearchQuery searchQuery);
List<T> findForefathersById(String id, SourceFilter sourceFilter);
Long countByParentId(String parentId, boolean onlyNextLevel, QueryBuilder query);
Map<String, Long> countByParentId(List<String> parentIds, boolean onlyNextLevel, QueryBuilder query);
}
4.3 Abstract class for flat structures: AbstractEsRepository
public abstract class AbstractEsRepository<T> implements EsRepository<T> {
....
// implementation methods omitted
....
}
4.4 Abstract class for tree structures: AbstractEsTreeRepository
@Component
public class AbstractEsTreeRepository<T extends TreeNode> extends AbstractEsRepository<T> implements EsTreeRepository<T> {
....
// implementation methods omitted
....
}
4.5 Base class for tree structures: TreeNode
@Data
public class TreeNode {
@EsNodeParentId
private String parentId;
@EsNodeLevel
private int level;
@EsPath
private List<ParentNode> path;
}
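4.6 Test entity class: TestTreeEntity
Note the @ToString(callSuper=true) below. Because I used the @Data annotation, the objects I got back after deserialization appeared to be missing the properties of the parent class TreeNode. After some digging, the cause turned out to be that the toString() generated by Lombok does not include superclass fields by default, so you need to add @ToString(callSuper=true), or simply not use Lombok.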
@Data
@NoArgsConstructor
@AllArgsConstructor
@ToString(callSuper=true)
@EsDocument(indexName = "pigg_tree", type = "_doc")
public class TestTreeEntity extends TreeNode {
@EsId
private String id;
private String name;
}
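5. Implemented Methods
There is far too much code to paste all of it into a blog post, so here are a few of the implementations I consider most important.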
5.1 saveAll
public Iterable<T> saveAll(Iterable<T> entities) {
    Assert.notNull(entities, "entities can't be null.");
    Iterator<T> iterator = entities.iterator();
    if (!iterator.hasNext()) {
        // Nothing to index, so return early instead of failing on iterator.next().
        return entities;
    }
    // The metadata is resolved once, from the class of the first entity.
    Metadata metadata = MetadataUtils.getMetadata(iterator.next().getClass());
    BulkRequest bulkRequest = new BulkRequest();
    entities.forEach(t -> {
        IndexRequest indexRequest = prepareIndex(t, metadata);
        if (indexRequest != null) {
            bulkRequest.add(indexRequest);
        }
    });
    try {
        checkForBulkUpdateFailure(client.bulk(bulkRequest, RequestOptions.DEFAULT));
    } catch (IOException e) {
        throw new ElasticsearchException("Error while bulk for request: " + bulkRequest.toString(), e);
    }
    return entities;
}
5.2 deleteById
public boolean deleteById(String id) {
if (StringUtils.isEmpty(id)) {
throw new ElasticsearchException("ID cannot be empty");
}
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
DeleteRequest request = new DeleteRequest(
metadata.getIndexName(),
metadata.getTypeName(),
id);
request.setRefreshPolicy(WriteRequest.RefreshPolicy.NONE);
try {
DeleteResponse deleteResponse = client.delete(request, RequestOptions.DEFAULT);
if (deleteResponse.getResult() == DocWriteResponse.Result.DELETED) {
return true;
}
} catch (IOException e) {
throw new ElasticsearchException("Error while deleting item request: " + request.toString(), e);
}
return false;
}
5.3 deleteByQuery
public void deleteByQuery(QueryBuilder query) {
if (query == null) {
throw new ElasticsearchException("query cannot be empty");
}
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
DeleteByQueryRequest deleteByQueryRequest = new DeleteByQueryRequest(metadata.getIndexName())
.setDocTypes(metadata.getTypeName())
.setQuery(query)
.setAbortOnVersionConflict(false)
.setRefresh(true);
deleteByQueryRequest.setConflicts("proceed");
try {
client.deleteByQuery(deleteByQueryRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
throw new ElasticsearchException("Error for delete request: " + deleteByQueryRequest.toString(), e);
}
}
5.4 updateAllById
public void updateAllById(Iterable<String> ids, Map<String, Object> doc){
Assert.notNull(ids, "ids can't be null.");
List<String> idList = stringIdsRepresentation(ids);
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
BulkRequest bulkRequest = new BulkRequest();
idList.forEach(id -> {
UpdateRequest request = new UpdateRequest(metadata.getIndexName(), metadata.getTypeName(), id);
request.doc(doc);
bulkRequest.add(request);
});
try {
checkForBulkUpdateFailure(client.bulk(bulkRequest, RequestOptions.DEFAULT));
} catch (IOException e) {
throw new ElasticsearchException("Error while bulk for request: " + bulkRequest.toString(), e);
}
}
5.5 existsById
public boolean existsById(String id) {
String thisId = stringIdRepresentation(id);
if (StringUtils.isEmpty(thisId)) {
throw new ElasticsearchException("ID cannot be empty");
}
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
GetRequest getRequest = new GetRequest(
metadata.getIndexName(),
metadata.getTypeName(),
thisId);
getRequest.fetchSourceContext(new FetchSourceContext(false));
getRequest.storedFields("_none_");
try {
return client.exists(getRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
throw new ElasticsearchException("Error for existsById request: " + getRequest.toString(), e);
}
}
5.6 Fetching a List of data by a collection of ids
public List<T> findAllById(Iterable<String> ids, SourceFilter sourceFilter) {
Assert.notNull(ids, "ids can't be null.");
List<String> idList = stringIdsRepresentation(ids);
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
if (metadata != null) {
MultiGetRequest request = new MultiGetRequest();
for (String id : idList) {
MultiGetRequest.Item item = new MultiGetRequest.Item(metadata.getIndexName(), metadata.getTypeName(), id);
if (sourceFilter != null && !(sourceFilter.getIncludes() == null && sourceFilter.getExcludes() == null)) {
item.fetchSourceContext(new FetchSourceContext(true, sourceFilter.getIncludes(), sourceFilter.getExcludes()));
}
request.add(item);
}
try {
MultiGetResponse response = client.mget(request, RequestOptions.DEFAULT);
return EsResponseUtils.multiGetResponse2Obj(response, this.entityClass);
} catch (IOException e) {
throw new ElasticsearchException("Error for findAllById request: " + request.toString(), e);
}
}
return null;
}
5.7 countGroupBy
public Map<String, Long> countGroupBy(String field, QueryBuilder query, Integer resultSize){
if (StringUtils.isEmpty(field)) {
throw new ElasticsearchException("field cannot be empty");
}
if (resultSize == null || resultSize <= 0){
resultSize = 1000;
}
Map<String, Long> groupMap = new LinkedHashMap<>();
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
AggregationBuilder agg = AggregationBuilders.terms("agg")
.field(field)
.size(resultSize)
// A second order(...) call replaces the first, so only the count ordering is kept here.
.order(BucketOrder.count(false));
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
if (null != query) {
boolQueryBuilder.filter(query);
}
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(boolQueryBuilder);
searchSourceBuilder.size(0);
searchSourceBuilder.aggregation(agg);
SearchRequest request = new SearchRequest(metadata.getIndexName());
request.types(metadata.getTypeName());
request.source(searchSourceBuilder);
try {
SearchResponse searchResponse = client.search(request, RequestOptions.DEFAULT);
Terms groups = searchResponse.getAggregations().get("agg");
for (Terms.Bucket entry : groups.getBuckets()) {
groupMap.put(entry.getKey().toString(), entry.getDocCount());
}
} catch (IOException e) {
throw new ElasticsearchException("Error for countGroupBy request: " + request.toString(), e);
}
return groupMap;
}
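5.8 countByParentId for tree structures
This method counts, for each node in a group, the total number of its direct children or of all its descendants (selected by onlyNextLevel).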
public Map<String, Long> countByParentId(List<String> parentIds, boolean onlyNextLevel, QueryBuilder query) {
if (CollectionUtils.isEmpty(parentIds)) {
throw new ElasticsearchException("parentIds cannot be empty");
}
Map<String, Long> result = new HashMap<>();
Metadata metadata = MetadataUtils.getMetadata(getEntityClass());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// Apply the caller's extra filter in both branches, not only when counting all descendants.
if (query != null){
boolQueryBuilder.filter(query);
}
if (onlyNextLevel){
boolQueryBuilder.filter(QueryBuilders.termsQuery("parentId", parentIds));
return countGroupBy("parentId", boolQueryBuilder, parentIds.size());
}else {
BoolQueryBuilder boolQueryBuilderForNested = QueryBuilders.boolQuery();
boolQueryBuilderForNested.filter(QueryBuilders.termsQuery("path.id", parentIds));
boolQueryBuilder.filter(QueryBuilders.nestedQuery("path", boolQueryBuilderForNested, ScoreMode.None));
NestedAggregationBuilder nestedAggregationBuilder = AggregationBuilders.nested("group_by_path", "path");
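// Restrict the buckets to the requested ids; otherwise other ancestors found in "path" could take the top slots.
nestedAggregationBuilder.subAggregation(AggregationBuilders.terms("terms_by_path").field("path.id").includeExclude(new IncludeExclude(parentIds.toArray(new String[0]), null)).size(parentIds.size()));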
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(boolQueryBuilder);
searchSourceBuilder.size(0);
searchSourceBuilder.aggregation(nestedAggregationBuilder);
System.out.println(boolQueryBuilder.toString());
System.out.println(nestedAggregationBuilder.toString());
SearchRequest request = new SearchRequest(metadata.getIndexName());
request.types(metadata.getTypeName());
request.source(searchSourceBuilder);
try {
SearchResponse searchResponse = client.search(request, RequestOptions.DEFAULT);
Aggregations aggregations = searchResponse.getAggregations();
if (aggregations != null) {
Map<String, Aggregation> aggregationMap = aggregations.asMap();
if (aggregationMap != null && !aggregationMap.isEmpty()) {
Aggregation groupByAncestorId = aggregationMap.get("group_by_path");
if (groupByAncestorId != null) {
ParsedNested parsedNested = (ParsedNested) groupByAncestorId;
// Unwrap the nested aggregation to reach its sub-aggregations
Aggregations subAggregations = parsedNested.getAggregations();
Map<String, Aggregation> subAggregationsMap = subAggregations.getAsMap();
Aggregation termsByAncestorId = subAggregationsMap.get("terms_by_path");
ParsedStringTerms parsedStringTerms = (ParsedStringTerms) termsByAncestorId;
// Get all the buckets
List<? extends Terms.Bucket> buckets = parsedStringTerms.getBuckets();
if (!CollectionUtils.isEmpty(buckets)) {
buckets.stream().forEach(bucket ->
{
result.put(bucket.getKeyAsString(), bucket.getDocCount());
});
}
}
}
}
} catch (IOException e) {
throw new ElasticsearchException("Error for countByParentId request: " + request.toString(), e);
}
}
return result;
}
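To check what the two branches are meant to return, it can help to compute the same counts in memory. The sketch below is an assumption-laden model, not the ES implementation: Node is a hypothetical class whose path is simplified to a list of ancestor ids, and the loop plays the role of the terms aggregation on parentId (direct children) or of the nested terms aggregation on path.id (all descendants).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CountByParentDemo {

    /** Hypothetical in-memory node; path lists all ancestor ids. */
    static class Node {
        final String id;
        final String parentId;
        final List<String> path;

        Node(String id, String parentId, List<String> path) {
            this.id = id;
            this.parentId = parentId;
            this.path = path;
        }
    }

    /** Counts direct children (onlyNextLevel = true) or all descendants for each requested id. */
    static Map<String, Long> countByParentId(List<Node> nodes, List<String> parentIds, boolean onlyNextLevel) {
        Set<String> wanted = new HashSet<>(parentIds);
        Map<String, Long> result = new HashMap<>();
        for (Node n : nodes) {
            if (onlyNextLevel) {
                // Equivalent of grouping by the parentId field.
                if (n.parentId != null && wanted.contains(n.parentId)) {
                    result.merge(n.parentId, 1L, Long::sum);
                }
            } else {
                // Equivalent of grouping by path.id: a node counts once for each requested ancestor.
                for (String ancestorId : n.path) {
                    if (wanted.contains(ancestorId)) {
                        result.merge(ancestorId, 1L, Long::sum);
                    }
                }
            }
        }
        return result;
    }

    static List<Node> sampleTree() {
        return Arrays.asList(
                new Node("r", null, Collections.<String>emptyList()),
                new Node("a", "r", Arrays.asList("r")),
                new Node("b", "r", Arrays.asList("r")),
                new Node("c", "a", Arrays.asList("r", "a")),
                new Node("d", "c", Arrays.asList("r", "a", "c")));
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("r", "a");
        System.out.println(countByParentId(sampleTree(), ids, true));
        System.out.println(countByParentId(sampleTree(), ids, false));
    }
}
```

As with aggregation buckets, an id that has no children or descendants simply does not appear in the result map, so callers should treat a missing key as zero.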
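Summary
- When learning a technology, go for breadth first and depth second. Do not expect to reach some high level all at once; finish the simple things first.
- For example, this ORM does not yet handle ES index and mapping settings, the version field, multi-index operations, and so on. These can be filled in gradually later.
- Be careful to distinguish getDeclaredFields() from getFields() when using reflection: getDeclaredFields() returns every field declared by the class itself but none that are inherited, while getFields() returns only public fields, including inherited ones. To get the fields of parent classes as well, you can use Hutool's ReflectUtil.getFieldsDirectly(clazz, true).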
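For readers who do not use Hutool, the difference between the two reflection methods can be seen with the JDK alone. This is a small sketch under assumptions: TreeNodeLike and Entity are hypothetical classes that only mimic the shape of TreeNode and TestTreeEntity, and allFields is an illustrative helper, not Hutool's implementation.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class FieldLookupDemo {

    /** Hypothetical parent class with private fields, like TreeNode. */
    static class TreeNodeLike {
        private String parentId;
        private int level;
    }

    /** Hypothetical entity with one private and one public field. */
    static class Entity extends TreeNodeLike {
        private String id;
        public String name;
    }

    /** Collects the declared fields of the class and of every superclass below Object. */
    static List<Field> allFields(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                // Skip compiler- or tool-generated fields.
                if (!f.isSynthetic()) {
                    fields.add(f);
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Only public fields, inherited ones included: name
        System.out.println("getFields: " + Entity.class.getFields().length);
        // Fields declared by Entity itself, any visibility: id, name
        System.out.println("getDeclaredFields: " + Entity.class.getDeclaredFields().length);
        // Declared fields of Entity plus those of TreeNodeLike
        System.out.println("allFields: " + allFields(Entity.class).size());
    }
}
```

An ORM that maps annotated fields such as parentId, level and path needs the third variant; otherwise the fields inherited from the tree base class are never seen.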