Getting the row with the maximum value in each group without RANK and PARTITION BY

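I often need to fetch the row that holds the largest (or smallest) value within each group of data. I used to do this with RANK and PARTITION BY, but it occurred to me that another approach works as well. I had not yet measured how it performs.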

 

create table test
(
col1 number,
col2 varchar2(20),
col3 number
);


insert into test
select 1,'content',2 from dual;
insert into test
select 1,'content2',3 from dual;
insert into test
select 1,'content2',4 from dual;

insert into test
select 2,'content',1 from dual;
insert into test
select 2,'content2',30 from dual;
insert into test
select 2,'content2',4 from dual;

insert into test
select 3,'content',10 from dual;
insert into test
select 3,'content2',3 from dual;
insert into test
select 3,'content2',4 from dual;

commit ;

 

The RANK ... PARTITION BY version

 

select * from
(
select rank() over (partition by col1 order by col3 desc) rn,
       a.*
from test  a
)X
where rn=1;
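
Two details of this query are worth keeping in mind. RANK gives every tied row the same rank, so a group whose top col3 value appears twice returns two rows. Oracle also sorts NULLs first in a descending sort by default, so a row with a NULL col3 would take rank 1. As a minimal sketch, assuming exactly one row per col1 is wanted and that col2 is an acceptable tie-breaker (both are assumptions, not something the query above states), ROW_NUMBER with NULLS LAST covers both cases:

```sql
-- Sketch: exactly one row per col1, even when the top col3 value is tied.
-- col2 as the secondary sort key is an arbitrary tie-breaker chosen for illustration.
-- NULLS LAST keeps rows with a NULL col3 from being ranked first.
select col1, col2, col3
from
(
select row_number() over (partition by col1 order by col3 desc nulls last, col2) rn,
       a.*
from test a
) x
where rn = 1;
```

The sample data above has no ties and no NULLs, so this should return the same three rows as the RANK query.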

 

Using the MAX function works too

select * from test a
where col3 in
(
 select max(col3) from test b
 where a.col1=b.col1
);
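
The same idea can also be written as a join against a grouped subquery, which states each group's maximum once instead of in a correlated subquery. This is only a sketch against the test table above. The alias names m and max_col3 are made up for illustration, and its performance has not been measured here.

```sql
-- Sketch: join each row to its group's maximum col3.
-- m and max_col3 are illustrative names, not part of the original queries.
select a.*
from test a
join
(
 select col1, max(col3) as max_col3
 from test
 group by col1
) m
  on a.col1 = m.col1
 and a.col3 = m.max_col3;
```

Like the IN version, this returns every row that ties for the maximum within its group.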

I was not sure how this behaves on a large data set, so I tested it against one of the tables in our database.

select count(*) from form_action_log

 

COUNT(*) 9903874

 

Using RANK

select count(*)  from
(
select rank() over (partition by a.form_id order by a.action_time desc) rn,
       a.*
from form_action_log  a
)X
where rn=1;
------70.125s   result COUNT(*) 4248095

 

Using MAX, first version (MAX1)

select count(*) from form_action_log a
where a.action_time in
(
 select max(b.action_time) from form_action_log b
 where a.form_id=b.form_id
);

---326.000s   COUNT(*) 4248095

 

Using MAX, second version (MAX2)

select count(*) from form_action_log a
where a.action_time>=
(
 select max(b.action_time) from form_action_log b
 where a.form_id=b.form_id
);

<60s  COUNT(*) 4248095

 

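Oddly, though, the execution plan shows a higher Cost / IO Cost and a longer estimated time for the RANK approach than for MAX1, yet the measured run times tell a different story. (These results were measured in PL/SQL Developer 7.0.)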
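
The Cost figures in a plan are optimizer estimates, so they need not line up with measured run time. One way to put estimates and actual figures side by side is to run the statement with the gather_plan_statistics hint and then display the row-source statistics of the last cursor. This is a sketch, assuming Oracle 10g or later and a session that is allowed to query the V$ views that DBMS_XPLAN.DISPLAY_CURSOR reads:

```sql
-- Sketch: collect actual row counts and timings for MAX1, then show them
-- next to the optimizer's estimates.
select /*+ gather_plan_statistics */ count(*) from form_action_log a
where a.action_time in
(
 select max(b.action_time) from form_action_log b
 where a.form_id=b.form_id
);

-- With a NULL sql_id this reports the last statement executed in the session.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

GUI tools may run statements of their own between the two queries, in which case the second query reports the wrong cursor. Running both in SQL*Plus, or passing the sql_id explicitly, is likely to be more reliable.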

 

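After rewriting the SQL as MAX2, it ran somewhat faster. Since no action_time can exceed the maximum of its own group, the >= comparison selects the same rows as an equality test.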

 

I will think about other ways to rewrite it later.

 

That's all for now.

 
