2016-07-06 3 views

Hive window functions over ranges. This is what I am trying to do: I have a table that looks like the following:

TagName | DateTime   | Value 

TagName1|2016-07-06 09:49:34|14 
TagName1|2016-07-06 09:50:34|15 
TagName1|2016-07-06 09:51:34|18 
TagName2|2016-07-03 02:13:34|421 
TagName2|2016-07-03 03:13:34|422 
TagName3|2016-07-01 03:13:34|14 

I want to compute multiple aggregates over this table for each TagName (e.g. sum, weighted average, latest value, count, and so on).

This is what I have so far. In this case I am only looking at the last 5 days, filtered with:

WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')

The full query:

SELECT * 
FROM 
(
SELECT 
t1.TagName, 
reflect("java.util.UUID", "randomUUID") as rv_id, 
t2.item_id as rs_id, 
from_unixtime(unix_timestamp()) as tstamp, 
t1.datetime as last_date, 
t1.value as last_value, 
t1.minimum as minimum, 
t1.maximum as maximum, 
t1.count as count, 
t1.total as total, 
t1.average as average, 
SUM(t1.weight_value) OVER (PARTITION BY TagName) as weighted_average, 
t1.Rank as Rank 
FROM 
(SELECT 
TagName, 
value, 
datetime, 
MIN(value) OVER (PARTITION BY TagName) as minimum, 
MAX(value) OVER (PARTITION BY TagName) as maximum, 
ROW_NUMBER() OVER (PARTITION BY TagName ORDER BY datetime DESC) as Rank, 
SUM(value) OVER (PARTITION BY TagName) as total, 
COUNT(value) OVER (PARTITION BY TagName) as count, 
AVG(value) OVER (PARTITION BY TagName) as average, 
(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime))/ 
(SUM(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime)) OVER (PARTITION BY TagName)) * 
(LAG(value,1) OVER (PARTITION BY TagName ORDER BY datetime)) as weight_value 
FROM raw.analog_history_dynamic 
WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')) t1 
LEFT JOIN meta.item_meta t2 
ON t1.TagName = t2.name) t3 
WHERE t3.Rank =1; 

In addition to the 5-day range, I have 10 other ranges, such as:

-- 1min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 60000; 

-- 5Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 300000; 

-- 10 Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 600000; 

-- 30 Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 1800000; 

-- 1 Month 
WHERE par_date > date_format(date_sub(to_date(current_date), 30),'yyyyMMdd'); 

-- 2 Month 
WHERE par_date > date_format(date_sub(to_date(current_date), 60),'yyyyMMdd'); 

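At the very least, I would like to combine the aggregations for ranges shorter than 1 day, since they all fall under the same partition (the table is partitioned by date).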

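Any ideas or suggestions on combining all of these calculations in a single query, rather than running each one separately with a different condition, would be appreciated.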

Thanks

Answers

You could move the conditions you currently have in the WHERE clauses into a CASE WHEN expression in the SELECT list, for example:

SELECT * 
FROM 
(
SELECT 
t1.TagName, 
reflect("java.util.UUID", "randomUUID") as rv_id, 
t2.item_id as rs_id, 
from_unixtime(unix_timestamp()) as tstamp, 
t1.datetime as last_date, 
t1.value as last_value, 
t1.flag, 
t1.minimum as minimum, 
t1.maximum as maximum, 
t1.count as count, 
t1.total as total, 
t1.average as average, 
SUM(t1.weight_value) OVER (PARTITION BY TagName) as weighted_average, 
t1.Rank as Rank 
FROM 
(SELECT 
TagName, 
value, 
datetime, 
case 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 60000 
then 'flag_1min' 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 300000 
then 'flag_5min' 
when .......and so on 
end as flag, 
MIN(value) OVER (PARTITION BY TagName) as minimum, 
MAX(value) OVER (PARTITION BY TagName) as maximum, 
ROW_NUMBER() OVER (PARTITION BY TagName ORDER BY datetime DESC) as Rank, 
SUM(value) OVER (PARTITION BY TagName) as total, 
COUNT(value) OVER (PARTITION BY TagName) as count, 
AVG(value) OVER (PARTITION BY TagName) as average, 
(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime))/ 
(SUM(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime)) OVER (PARTITION BY TagName)) * 
(LAG(value,1) OVER (PARTITION BY TagName ORDER BY datetime)) as weight_value 
FROM raw.analog_history_dynamic 
WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')) t1 
LEFT JOIN meta.item_meta t2 
ON t1.TagName = t2.name 
group by TagName, 
value, 
datetime, 
case 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 60000 
then 'flag_1min' 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 300000 
then 'flag_5min' 
when .......and so on 
end) t3 
WHERE t3.Rank =1; 

NOTE: in your code above, you forgot to add a GROUP BY clause even though you are using aggregate functions.

I don't think a GROUP BY is needed, because every aggregate is computed OVER a PARTITION, which does its own grouping. Trying to add a GROUP BY to the original query raises an error. – scrayon
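
One possible way to fold several ranges into a single pass is conditional aggregation inside the window functions. A single CASE flag puts each row in only one bucket, but the ranges are nested (a row from the last minute also belongs to the 5-minute range), so the sketch below gives each range its own CASE instead. It is only a sketch under assumptions: `age_sec`, `rn` and the `*_1min`, `*_5min`, `*_5day` aliases are made-up names, only two of the short ranges are shown, and the weighted average is left out. It also assumes `unix_timestamp` returns seconds, so one minute is 60 rather than the 60000 used in the filters above.

```sql
-- Sketch only, not run against the real tables.
-- Assumptions: unix_timestamp() yields seconds (1 minute = 60);
-- age_sec, rn and the *_1min / *_5min / *_5day aliases are hypothetical.
SELECT *
FROM (
  SELECT
    TagName,
    datetime AS last_date,
    value    AS last_value,
    ROW_NUMBER() OVER (PARTITION BY TagName ORDER BY datetime DESC) AS rn,
    -- Last 1 minute: rows outside the range become NULL and are ignored.
    COUNT(CASE WHEN age_sec < 60 THEN value END) OVER (PARTITION BY TagName) AS count_1min,
    SUM(CASE WHEN age_sec < 60 THEN value END)   OVER (PARTITION BY TagName) AS total_1min,
    AVG(CASE WHEN age_sec < 60 THEN value END)   OVER (PARTITION BY TagName) AS average_1min,
    -- Last 5 minutes.
    COUNT(CASE WHEN age_sec < 300 THEN value END) OVER (PARTITION BY TagName) AS count_5min,
    SUM(CASE WHEN age_sec < 300 THEN value END)   OVER (PARTITION BY TagName) AS total_5min,
    AVG(CASE WHEN age_sec < 300 THEN value END)   OVER (PARTITION BY TagName) AS average_5min,
    -- Everything the 5-day partition filter lets through: no CASE needed.
    COUNT(value) OVER (PARTITION BY TagName) AS count_5day,
    SUM(value)   OVER (PARTITION BY TagName) AS total_5day,
    AVG(value)   OVER (PARTITION BY TagName) AS average_5day
  FROM (
    SELECT
      TagName,
      value,
      datetime,
      -- Age of each reading in seconds, computed once and reused above.
      unix_timestamp(current_timestamp) - unix_timestamp(datetime) AS age_sec
    FROM raw.analog_history_dynamic
    WHERE par_date > date_format(date_sub(to_date(current_date), 5), 'yyyyMMdd')
  ) src
) t
WHERE t.rn = 1;
```

The partition filter stays at the widest range being combined, so the table is scanned once. MIN and MAX can be added per range in the same way. No GROUP BY is involved, since each aggregate is still a window function over the TagName partition.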
