is a bit more challenging. In practice, though, the main difference is that one more UNNEST operation is required. Other than that, the logic is the same.
#standardSQL
WITH data AS(
SELECT '1' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '2' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'value1' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '3' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '3' AS fullvisitorid, 2 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits UNION ALL
SELECT '4' AS fullvisitorid, 1 AS visitid, ARRAY<STRUCT< hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING> > >> [STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
STRUCT(3 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)] AS hits
)
Each user (fullvisitorid) and each session (visitid) has its own hits ARRAY. The simulated data above illustrates this. Note that I have separated each hit by its hitNumber, which makes the data easier to follow.
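To make the nesting visible, it can help to flatten the data into one row per custom dimension. The query below is only a sketch: sample_data is a hypothetical one-session table with the same shape as the simulated data, and the aliases h and cd are introduced here for readability.

```sql
#standardSQL
-- Hypothetical one-session sample with the same shape as the simulated data.
WITH sample_data AS (
  SELECT '3' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING>>>>[
      STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value), STRUCT(2 AS index, 'value2' AS value)] AS customDimension)
    ] AS hits
)
SELECT
  fullvisitorid,
  visitid,
  h.hitNumber,
  cd.index,
  cd.value
FROM
  sample_data,
  UNNEST(hits) AS h,               -- first UNNEST: one row per hit
  UNNEST(h.customDimension) AS cd  -- second UNNEST: one row per custom dimension of that hit
ORDER BY
  fullvisitorid, visitid, h.hitNumber, cd.index
```

Each output row pairs a session with one hit and one of that hit's custom dimensions. This is the row set that the WHERE filters on index and value operate on.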
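The following query computes the total number of sessions in which a customDimension with index 1 and value 'landing_page' occurred, and likewise the sessions in which index was 3 and value was 'model_selection_page':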
#standardSQL
SELECT
SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection
FROM
data
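You can play with the simulated data to better understand what is happening here. In a nutshell, two UNNEST operations are taking place: the first retrieves the values of hits, and the second retrieves the values of customDimension. The field Model_Selection is a bit more complex, because it first has to evaluate whether the 'landing_page' dimension was fired, as you can see in this expression: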
EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page')
This expression returns True to the WHERE clause if hits contained 'landing_page' (at index 1) anywhere in the session. You can also produce the results at the user level, like so:
#standardSQL
SELECT
COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page' LIMIT 1)) Landing_Page,
COUNT(DISTINCT (SELECT fullvisitorid FROM UNNEST(hits), UNNEST(customDimension) WHERE EXISTS(SELECT 1 FROM UNNEST(hits), UNNEST(customDimension) WHERE index = 1 AND value = 'landing_page') AND index = 3 AND value = 'model_selection_page' LIMIT 1)) Model_Selection
FROM
data
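To check which sessions contribute to each count, the same EXISTS expression can be selected as a per-session flag. This is a minimal sketch under assumptions: sample_data is a hypothetical two-session table shaped like the simulated data, and the column names has_landing_page and has_model_selection are made up for illustration.

```sql
#standardSQL
-- Hypothetical two-session sample with the same shape as the simulated data.
WITH sample_data AS (
  SELECT '2' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING>>>>[
      STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
  UNION ALL
  SELECT '3' AS fullvisitorid, 1 AS visitid,
    ARRAY<STRUCT<hitNumber INT64, customDimension ARRAY<STRUCT<index INT64, value STRING>>>>[
      STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'landing_page' AS value)] AS customDimension),
      STRUCT(2 AS hitNumber, [STRUCT(3 AS index, 'model_selection_page' AS value)] AS customDimension)
    ] AS hits
)
SELECT
  fullvisitorid,
  visitid,
  -- TRUE when any hit in the session carries index 1 = 'landing_page'
  EXISTS(
    SELECT 1 FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
    WHERE cd.index = 1 AND cd.value = 'landing_page'
  ) AS has_landing_page,
  -- TRUE when any hit in the session carries index 3 = 'model_selection_page'
  EXISTS(
    SELECT 1 FROM UNNEST(hits) AS h, UNNEST(h.customDimension) AS cd
    WHERE cd.index = 3 AND cd.value = 'model_selection_page'
  ) AS has_model_selection
FROM
  sample_data
ORDER BY
  fullvisitorid, visitid
```

A session counts toward Model_Selection in the queries above only when both conditions hold, so comparing the two flags row by row is one way to see why a given session is or is not included.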
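Since you are learning BigQuery, I recommend playing with the simulated data, observing the output, and testing each step. Experimenting with UNNEST and running a few queries to inspect its output will give you a deeper understanding of how to use these techniques.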
Thanks @willianfuks for the thorough answer! This will definitely help me understand it a bit better. I will also follow your advice and play with the simulated data. – Jesper
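Just got back from a (long) vacation. I have started working with the queries and the simulated data. On validating the numbers, I found a slight difference in session counts, about 4% compared with the GA UI data. For users (fullvisitorid), the data matches exactly. At first I assumed the difference was caused by the missing hits.type, but I cannot access this field on the Array field. What could be causing this difference in sessions? – Jesper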