Aggregating with a separate condition for each column

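I'm trying to combine three queries into one, so that the same results come back as a single table. ColumnA and ColumnB are both actually dates in 'yyyy-mm-dd' format. Ideally, the final result would have a single date column, followed by a separate count column from each query.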

select columnA, count(*) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
group by columnA 

select columnB, count(*) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
group by columnB 

select columnB, count(distinct columnC) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
and columnX in ('itemA','ItemB') 
group by columnB

2017-04-20 Dick McManus

This looks like a textbook use case for 'UNION ALL'.

I think Gordon has understood the question well, but ...

Here you go:

select columnA, count(*) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
group by columnA 
UNION ALL 
select columnB, count(*) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
group by columnB 
UNION ALL 
select columnB, count(distinct columnC) 
from data.table 
where timestamp between '2017-01-01' and '2017-01-07' 
and columnX in ('itemA','ItemB') 
group by columnB
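
One thing to watch with a plain UNION ALL: the three result sets are stacked into the same two columns, so nothing in the output says which query a row came from. Below is a minimal sketch of one way around that, adding a literal label column to each branch. It is only an illustration: it runs on SQLite through Python's sqlite3 module rather than on Hive, and the table name `events`, the `metric` labels, and the sample rows are all made up here, not taken from the question.

```python
import sqlite3

# Hypothetical stand-in for data.table, with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (timestamp TEXT, columnA TEXT, columnB TEXT, "
    "columnC TEXT, columnX TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("2017-01-02", "2017-01-01", "2017-01-02", "c1", "itemA"),
        ("2017-01-03", "2017-01-01", "2017-01-02", "c1", "ItemB"),
        ("2017-01-04", "2017-01-02", "2017-01-02", "c2", "other"),
        ("2017-01-05", "2017-01-03", "2017-01-04", "c3", "itemA"),
        ("2017-01-06", "2017-01-03", "2017-01-05", "c4", "other"),
        ("2017-01-09", "2017-01-01", "2017-01-02", "c9", "itemA"),  # outside the window
    ],
)

# Same three queries as above, each tagged with a literal label so the
# stacked rows can still be told apart.
stacked = conn.execute(
    """
    SELECT 'a_count' AS metric, columnA AS dte, COUNT(*) AS cnt
    FROM events
    WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
    GROUP BY columnA
    UNION ALL
    SELECT 'b_count', columnB, COUNT(*)
    FROM events
    WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
    GROUP BY columnB
    UNION ALL
    SELECT 'c_distinct', columnB, COUNT(DISTINCT columnC)
    FROM events
    WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
      AND columnX IN ('itemA', 'ItemB')
    GROUP BY columnB
    ORDER BY 1, 2
    """
).fetchall()

for row in stacked:
    print(row)
```

The result is still long-format (one row per metric and date). It does not by itself give one column per count, which is what the question asks for.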

2017-04-20 22:23:34 zipa

The following query expresses what you want to do:

select d.dte, coalesce(a.cnt, 0) as acnt, coalesce(b.cnt, 0) as bcnt, 
     b.c_cnt 
from (select columnA as dte from data.table where timestamp between '2017-01-01' and '2017-01-07' 
     union 
     select columnB from data.table where timestamp between '2017-01-01' and '2017-01-07' 
    ) d left join 
    (select columnA, count(*) as cnt 
     from data.table 
     where timestamp between '2017-01-01' and '2017-01-07' 
     group by columnA 
    ) a 
    on d.dte = a.columnA left join 
    (select columnB, count(*) as cnt, 
      count(distinct case when columnX in ('itemA','ItemB') then columnC end) as c_cnt 
     from data.table 
     where timestamp between '2017-01-01' and '2017-01-07' 
     group by columnB 
    ) b 
    on d.dte = b.columnB;

I think this is Hive-compatible, but Hive occasionally deviates in surprising ways from other SQL dialects.
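
The key move in the query above is putting the columnX filter inside the aggregate, as count(distinct case when ... end), instead of in the where clause. That lets the plain count and the filtered distinct count share one group by. Here is a small sketch that checks this on made-up data. It runs on SQLite through Python's sqlite3 module rather than on Hive, and the table name `events` and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical stand-in for data.table, with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (timestamp TEXT, columnA TEXT, columnB TEXT, "
    "columnC TEXT, columnX TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("2017-01-02", "2017-01-01", "2017-01-02", "c1", "itemA"),
        ("2017-01-03", "2017-01-01", "2017-01-02", "c1", "ItemB"),
        ("2017-01-04", "2017-01-02", "2017-01-02", "c2", "other"),
        ("2017-01-05", "2017-01-03", "2017-01-04", "c3", "itemA"),
        ("2017-01-06", "2017-01-03", "2017-01-05", "c4", "other"),
        ("2017-01-09", "2017-01-01", "2017-01-02", "c9", "itemA"),  # outside the window
    ],
)

# One GROUP BY: total rows per columnB, plus a distinct count of columnC
# restricted by CASE. Rows failing the CASE yield NULL, which COUNT ignores.
combined = conn.execute(
    """
    SELECT columnB,
           COUNT(*) AS cnt,
           COUNT(DISTINCT CASE WHEN columnX IN ('itemA', 'ItemB')
                               THEN columnC END) AS c_cnt
    FROM events
    WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
    GROUP BY columnB
    ORDER BY columnB
    """
).fetchall()

# The question's third query, with the filter in WHERE, for comparison.
filtered = conn.execute(
    """
    SELECT columnB, COUNT(DISTINCT columnC)
    FROM events
    WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
      AND columnX IN ('itemA', 'ItemB')
    GROUP BY columnB
    ORDER BY columnB
    """
).fetchall()

print(combined)
print(filtered)
```

On this sample data the two agree wherever the filtered query returns a row. The only difference is that the CASE version reports 0 for a columnB value with no matching columnX rows, where the WHERE version omits that value entirely.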

2017-04-20 22:24:05

The following looks like what you want:

select columnA, count(*) as cnt from data.table where timestamp between '2017-01-01' and '2017-01-07' group by columnA 
Union All 
select columnB, count(*) as cnt from data.table where timestamp between '2017-01-01' and '2017-01-07' group by columnB 
Union All 
select columnB, count(distinct columnC) as cnt from data.table where timestamp between '2017-01-01' and '2017-01-07' and columnX in ('itemA','ItemB') group by columnB

2017-04-20 22:26:49

I was able to get it working using the following approach:

With pullA as 
(
    select columnA, count(*) as A_count 
    from data.table 
    group by columnA 
), 
pullB as 
(
    select columnB, count(*) as B_count 
    from data.table 
    group by columnB 
), 

pullC as 
(
    select columnB , count(*) as C_count 
    from data.table 
    where columnX in ('itemA', 'itemB') 
    group by columnB 
) 

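-- pullB and pullC both expose columnB, so qualify every reference;
-- pullC has no columnC column, so it is joined on columnB.
select pullB.columnB, A_count, B_count, C_count 
from pullB 
left join pullA 
on pullB.columnB = pullA.columnA 
left join pullC 
on pullB.columnB = pullC.columnB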

Is this approach any more or less efficient than the union or subquery approaches?
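
Efficiency depends on the engine and the data and is not measured here. There is, however, a difference in results worth checking first: a join chain that starts from the columnB counts and left-joins the others keeps only dates that occur in columnB, so dates that occur only in columnA drop out. The sketch below shows this on made-up data. It assumes SQLite through Python's sqlite3 module rather than Hive, a hypothetical table name `events`, invented sample rows, and the question's date filter added back into each CTE.

```python
import sqlite3

# Hypothetical stand-in for data.table, with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (timestamp TEXT, columnA TEXT, columnB TEXT, "
    "columnC TEXT, columnX TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("2017-01-02", "2017-01-01", "2017-01-02", "c1", "itemA"),
        ("2017-01-03", "2017-01-01", "2017-01-02", "c1", "ItemB"),
        ("2017-01-04", "2017-01-02", "2017-01-02", "c2", "other"),
        ("2017-01-05", "2017-01-03", "2017-01-04", "c3", "itemA"),
        ("2017-01-06", "2017-01-03", "2017-01-05", "c4", "other"),
        ("2017-01-09", "2017-01-01", "2017-01-02", "c9", "itemA"),  # outside the window
    ],
)

# CTE version driven by pullB: only dates present in columnB survive.
wide = conn.execute(
    """
    WITH pullA AS (
        SELECT columnA, COUNT(*) AS A_count
        FROM events
        WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
        GROUP BY columnA
    ),
    pullB AS (
        SELECT columnB, COUNT(*) AS B_count
        FROM events
        WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'
        GROUP BY columnB
    )
    SELECT pullB.columnB, A_count, B_count
    FROM pullB
    LEFT JOIN pullA ON pullB.columnB = pullA.columnA
    ORDER BY pullB.columnB
    """
).fetchall()

# Dates that have a columnA count but never appear in the joined result.
a_dates = {
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT columnA FROM events "
        "WHERE timestamp BETWEEN '2017-01-01' AND '2017-01-07'"
    )
}
missing = sorted(a_dates - {r[0] for r in wide})

print(wide)
print("columnA dates missing from the result:", missing)
```

If every date of interest is guaranteed to appear in columnB, this does not matter. Otherwise, building the date list from a union of columnA and columnB, as in the subquery answer, avoids the gap.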

2017-04-21 00:04:34
