2017-01-03 6 views
1

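BigQuery join and UDF

How do I join two tables in a SELECT statement that also uses a UDF? I stored the SQL query and the UDF function in two files that I call from the bq command line. However, when I run it, I get the following error:

BigQuery error in query operation: Error processing job '[projectID]:bqjob_[error_number]': Table name cannot be resolved: dataset name is missing.

Note: I am logged in to the correct project via the gcloud auth method. My SQL statement (in the real query, datasetname is of course replaced with the correct dataset name):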

SELECT 
    substr(date,1,6) as date, 
    device, 
    channelGroup, 
    COUNT(DISTINCT CONCAT(fullVisitorId,cast(visitId as string))) AS sessions, 
    COUNT(DISTINCT fullVisitorId) AS users, 
FROM 
    defaultChannelGroup(
    SELECT 
     a.date, 
     a.device.deviceCategory AS device, 
     b.hits.page.pagePath AS page, 
     a.fullVisitorId, 
     a.visitId, 
     a.trafficSource.source AS trafficSourceSource, 
     a.trafficSource.medium AS trafficSourceMedium, 
     a.trafficSource.campaign AS trafficSourceCampaign 
    FROM FLATTEN(
     SELECT date,device.deviceCategory,trafficSource.source,trafficSource.medium,trafficSource.campaign,fullVisitorId,visitID 
     FROM 
     TABLE_DATE_RANGE([datasetname.ga_sessions_],TIMESTAMP('2016-10-01'),TIMESTAMP('2016-10-31')) 
    ,hits) as a 
    LEFT JOIN FLATTEN(
     SELECT hits.page.pagePath,hits.time,visitID,fullVisitorId 
     FROM 
     TABLE_DATE_RANGE([datasetname.ga_sessions_],TIMESTAMP('2016-10-01'),TIMESTAMP('2016-10-31')) 
     WHERE 
     hits.time = 0 
     and trafficSource.medium = 'organic' 
    ,hits) as b 
    ON a.fullVisitorId = b.fullVisitorId AND a.visitID = b.visitID 
) 
GROUP BY 
    date, 
    device, 
    channelGroup 
ORDER BY sessions DESC 

And part of the UDF (which works in another query):

function defaultChannelGroup(row, emit) 
{ 
    function output(channelGroup) { 
    emit({channelGroup:channelGroup, 
     fullVisitorId: row.fullVisitorId, 
     visitId: row.visitId, 
     device: row.device, 
     date: row.date 
     }); 
    } 
    computeDefaultChannelGroup(row, output); 
} 

bigquery.defineFunction(
    'defaultChannelGroup', 
    ['date', 'device', 'page', 'trafficSourceMedium', 'trafficSourceSource', 'trafficSourceCampaign', 'fullVisitorId', 'visitId'], 
    //['device', 'page', 'trafficSourceMedium', 'trafficSourceSource', 'trafficSourceCampaign', 'fullVisitorId', 'visitId'], 
    [{'name': 'channelGroup', 'type': 'string'}, 
    {'name': 'fullVisitorId', 'type': 'string'}, 
    {'name': 'visitId', 'type': 'integer'}, 
    {'name': 'device', 'type': 'string'}, 
    {'name': 'date', 'type': 'string'} 
], 
    defaultChannelGroup 
); 

I couldn't reproduce this. (If you leave a job ID, a member of the BigQuery team can also look at the logs.)

Thanks @FelipeHoffa. I ran it again this morning with the following command: 'bq query --udf_resource=Desktop/bq.js "$(cat Desktop/bq-sd-mkt-channels.sql)"' and got the same error message: 'bqjob_r324a276c6f5130bc_000001596949a59f_1': Table name cannot be resolved: dataset name is missing. – kekchoze

Answers

1

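The SELECT statements inside the FLATTEN function need to be wrapped in parentheses, i.e. FLATTEN((SELECT ...), hits) rather than FLATTEN(SELECT ..., hits).

I ran the bq command in the shell: bq query --udf_resource=udf.js "$(cat query.sql)"

query.sql contains the following script: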

SELECT 
    substr(date,1,6) as date, 
    device, 
    channelGroup, 
    COUNT(DISTINCT CONCAT(fullVisitorId,cast(visitId as string))) AS sessions, 
    COUNT(DISTINCT fullVisitorId) AS users, 
    COUNT(DISTINCT transactionId) as orders, 
    CAST(SUM(transactionRevenue)/1000000 AS INTEGER) as sales 
FROM 
    defaultChannelGroup(
    SELECT 
     a.date as date, 
     a.device.deviceCategory AS device, 
     b.hits.page.pagePath AS page, 
     a.fullVisitorId as fullVisitorId, 
     a.visitId as visitId, 
     a.trafficSource.source AS trafficSourceSource, 
     a.trafficSource.medium AS trafficSourceMedium, 
     a.trafficSource.campaign AS trafficSourceCampaign, 
     a.hits.transaction.transactionRevenue as transactionRevenue, 
     a.hits.transaction.transactionID as transactionId 
    FROM FLATTEN((
     SELECT date,device.deviceCategory,trafficSource.source,trafficSource.medium,trafficSource.campaign,fullVisitorId,visitID, 
       hits.transaction.transactionID, hits.transaction.transactionRevenue 
     FROM 
     TABLE_DATE_RANGE([datasetname.ga_sessions_],TIMESTAMP('2016-10-01'),TIMESTAMP('2016-10-31')) 
    ),hits) as a 
    LEFT JOIN FLATTEN((
     SELECT hits.page.pagePath,hits.time,trafficSource.medium,visitID,fullVisitorId 
     FROM 
     TABLE_DATE_RANGE([datasetname.ga_sessions_],TIMESTAMP('2016-10-01'),TIMESTAMP('2016-10-31')) 
     WHERE 
     hits.time = 0 
     and trafficSource.medium = 'organic' 
    ),hits) as b 
    ON a.fullVisitorId = b.fullVisitorId AND a.visitID = b.visitID 
) 
GROUP BY 
    date, 
    device, 
    channelGroup 
ORDER BY sessions DESC 

and udf.js contains the following functions (the 'computeDefaultChannelGroup' function is not included here):

function defaultChannelGroup(row, emit) 
{ 
    function output(channelGroup) { 
    emit({channelGroup:channelGroup, 
     date: row.date, 
     fullVisitorId: row.fullVisitorId, 
     visitId: row.visitId, 
     device: row.device, 
     transactionId: row.transactionId, 
     transactionRevenue: row.transactionRevenue, 
     }); 
    } 
    computeDefaultChannelGroup(row, output); 
} 

bigquery.defineFunction(
    'defaultChannelGroup', 
    ['date', 'device', 'page', 'trafficSourceMedium', 'trafficSourceSource', 'trafficSourceCampaign', 'fullVisitorId', 'visitId', 'transactionId', 'transactionRevenue'], 
    [{'name': 'channelGroup', 'type': 'string'}, 
    {'name': 'date', 'type': 'string'}, 
    {'name': 'fullVisitorId', 'type': 'string'}, 
    {'name': 'visitId', 'type': 'integer'}, 
    {'name': 'device', 'type': 'string'}, 
    {'name': 'transactionId', 'type': 'string'}, 
    {'name': 'transactionRevenue', 'type': 'integer'} 
], 
    defaultChannelGroup 
); 
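
Because computeDefaultChannelGroup is not shown, the wrapper above cannot be run as posted. The sketch below is one way to exercise it locally in Node.js before uploading udf.js. It assumes a stand-in computeDefaultChannelGroup whose channel rules are invented purely for illustration (they are not the Google Analytics default channel definitions), and a hypothetical runUdf helper that plays the role of BigQuery's emit callback. The bigquery.defineFunction registration is left out, since that object only exists inside BigQuery.

```javascript
// Local harness (Node.js) for exercising the UDF wrapper outside BigQuery.
// Everything marked HYPOTHETICAL is an assumption made for this sketch.

// HYPOTHETICAL stand-in for the omitted computeDefaultChannelGroup.
// The rules are illustrative only, not Google Analytics' real channel logic.
function computeDefaultChannelGroup(row, output) {
  if (row.trafficSourceMedium === 'organic') {
    output('Organic Search');
  } else if (row.trafficSourceSource === '(direct)') {
    output('Direct');
  } else {
    output('(Other)');
  }
}

// Wrapper with the same shape as the one in udf.js.
function defaultChannelGroup(row, emit) {
  function output(channelGroup) {
    emit({
      channelGroup: channelGroup,
      date: row.date,
      fullVisitorId: row.fullVisitorId,
      visitId: row.visitId,
      device: row.device,
      transactionId: row.transactionId,
      transactionRevenue: row.transactionRevenue
    });
  }
  computeDefaultChannelGroup(row, output);
}

// HYPOTHETICAL helper: call a UDF on one input row and collect what it emits.
function runUdf(udf, row) {
  const emitted = [];
  udf(row, function (record) {
    emitted.push(record);
  });
  return emitted;
}

// Sample input row using the column aliases produced by the inner SELECT.
const sampleRow = {
  date: '20161001',
  device: 'mobile',
  page: '/home',
  trafficSourceMedium: 'organic',
  trafficSourceSource: 'google',
  trafficSourceCampaign: '(not set)',
  fullVisitorId: '123',
  visitId: 456,
  transactionId: null,
  transactionRevenue: null
};

console.log(JSON.stringify(runUdf(defaultChannelGroup, sampleRow), null, 2));
```

The keys of each emitted record should line up with the output schema passed to bigquery.defineFunction, and the input row keys with the aliases in the inner SELECT, so a harness like this is mainly useful for catching naming mismatches early.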

This ran without errors, and the results matched the data in Google Analytics.
