2017-05-23 6 views

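EF6 aggregation

I have two tables, Events and Octaves, holding about 400k events and 4 million octaves. I want to filter the events to a specific time range, aggregate them by hour, and return the average value of the octaves that share the same frequency for each hour.

The EF6 LINQ code I am using is below. It works, but only when the time span is very short (a few hours):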
_context.Events 
     .Where(x => x.Time >= afterDate) 
     .Where(x => x.Time <= beforeDate) 
     .Select(x => new { year = x.Time.Year, month = x.Time.Month, day = x.Time.Day, hour = x.Time.Hour, data = x.Data }) 
     .GroupBy(x => new { year = x.year, month = x.month, day = x.day, hour = x.hour }) 
     .Where(x => x.Any()) 
     .Select(x => new 
     { 
         Time = DbFunctions.CreateDateTime(x.Key.year, x.Key.month, x.Key.day, x.Key.hour, 0, 0), 
         Data = x.SelectMany(y => y.data) 
                 .GroupBy(y => new { frequency = y.Frequency }) 
                 .Select(y => new 
                 { 
                     frequency = y.Key.frequency, 
                     value = Math.Round(y.Average(z => z.Value), 1) 
                 }) 
     }) 
     .OrderByDescending(m => m.Time) 
     .Take(limit); 

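If the range grows to a few days, the query seems to run forever. Am I asking too much of SQL Server? Or is there a better way to run this query and structure the data? If I remove the SelectMany(...).GroupBy(...) part, the query no longer blows up.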

The generated SQL query:

SELECT 
    [Project5].[C1] AS [C1], 
    [Project5].[C2] AS [C2], 
    [Project5].[C3] AS [C3], 
    [Project5].[C4] AS [C4], 
    [Project5].[C5] AS [C5], 
    [Project5].[C6] AS [C6], 
    [Project5].[C8] AS [C7], 
    [Project5].[Frequency] AS [Frequency], 
    [Project5].[C7] AS [C8] 
    FROM (SELECT 
     [Limit1].[C1] AS [C1], 
     [Limit1].[C2] AS [C2], 
     [Limit1].[C3] AS [C3], 
     [Limit1].[C4] AS [C4], 
     [Limit1].[C5] AS [C5], 
     [Limit1].[C6] AS [C6], 
     CASE WHEN ([GroupBy1].[K1] IS NULL) THEN CAST(NULL AS float) ELSE ROUND([GroupBy1].[A1], 1) END AS [C7], 
     [GroupBy1].[K1] AS [Frequency], 
     CASE WHEN ([GroupBy1].[K1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C8] 
     FROM (SELECT TOP (10000) [Project4].[C1] AS [C1], [Project4].[C2] AS [C2], [Project4].[C3] AS [C3], [Project4].[C4] AS [C4], [Project4].[C5] AS [C5], [Project4].[C6] AS [C6] 
      FROM (SELECT 
       [Project2].[C1] AS [C1], 
       [Project2].[C2] AS [C2], 
       [Project2].[C3] AS [C3], 
       [Project2].[C4] AS [C4], 
       1 AS [C5], 
       convert (datetime2,right('000' + convert(varchar(255), [Project2].[C1]), 4) + '-' + convert(varchar(255), [Project2].[C2]) + '-' + convert(varchar(255), [Project2].[C3]) + ' ' + convert(varchar(255), [Project2].[C4]) + ':' + convert(varchar(255), 0) + ':' + str(cast(0 as float(53)), 10, 7), 121) AS [C6] 
       FROM (SELECT 
        [Distinct1].[C1] AS [C1], 
        [Distinct1].[C2] AS [C2], 
        [Distinct1].[C3] AS [C3], 
        [Distinct1].[C4] AS [C4] 
        FROM (SELECT DISTINCT 
         DATEPART (year, [Extent1].[TimeEnd]) AS [C1], 
         DATEPART (month, [Extent1].[TimeEnd]) AS [C2], 
         DATEPART (day, [Extent1].[TimeEnd]) AS [C3], 
         DATEPART (hour, [Extent1].[TimeEnd]) AS [C4] 
         FROM [dbo].[Events] AS [Extent1] 
         WHERE ([Extent1].[TimeEnd] >= @p__linq__1) AND ([Extent1].[TimeEnd] <= @p__linq__2) 
        ) AS [Distinct1] 
       ) AS [Project2] 
       WHERE EXISTS (SELECT 
        1 AS [C1] 
        FROM [dbo].[Events] AS [Extent2] 
        WHERE ([Extent2].[TimeEnd] >= @p__linq__1) AND ([Extent2].[TimeEnd] <= @p__linq__2) AND (([Project2].[C1] = (DATEPART (year, [Extent2].[TimeEnd]))) OR (([Project2].[C1] IS NULL) AND (DATEPART (year, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C2] = (DATEPART (month, [Extent2].[TimeEnd]))) OR (([Project2].[C2] IS NULL) AND (DATEPART (month, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C3] = (DATEPART (day, [Extent2].[TimeEnd]))) OR (([Project2].[C3] IS NULL) AND (DATEPART (day, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C4] = (DATEPART (hour, [Extent2].[TimeEnd]))) OR (([Project2].[C4] IS NULL) AND (DATEPART (hour, [Extent2].[TimeEnd]) IS NULL))) 
       ) 
      ) AS [Project4] 
      ORDER BY [Project4].[C6] DESC) AS [Limit1] 
     OUTER APPLY (SELECT 
      [Extent4].[Frequency] AS [K1], 
      AVG([Extent4].[Value]) AS [A1] 
      FROM [dbo].[Events] AS [Extent3] 
      INNER JOIN [dbo].[Octaves] AS [Extent4] ON [Extent3].[EventId] = [Extent4].[EventId] 
      WHERE ([Extent3].[TimeEnd] >= @p__linq__1) AND ([Extent3].[TimeEnd] <= @p__linq__2) AND (([Limit1].[C1] = (DATEPART (year, [Extent3].[TimeEnd]))) OR (([Limit1].[C1] IS NULL) AND (DATEPART (year, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C2] = (DATEPART (month, [Extent3].[TimeEnd]))) OR (([Limit1].[C2] IS NULL) AND (DATEPART (month, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C3] = (DATEPART (day, [Extent3].[TimeEnd]))) OR (([Limit1].[C3] IS NULL) AND (DATEPART (day, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C4] = (DATEPART (hour, [Extent3].[TimeEnd]))) OR (([Limit1].[C4] IS NULL) AND (DATEPART (hour, [Extent3].[TimeEnd]) IS NULL))) 
      GROUP BY [Extent4].[Frequency]) AS [GroupBy1] 
    ) AS [Project5] 
    ORDER BY [Project5].[C6] DESC, [Project5].[C1] ASC, [Project5].[C2] ASC, [Project5].[C3] ASC, [Project5].[C4] ASC, [Project5].[C8] ASC 

UPDATE 1

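I tried to "flip" the query by querying Octaves directly, and I am getting better results. I first group the octaves by date, hour, and frequency and compute the average, then group those rows by hour. It is not elegant at all, but it is the first solution that actually works. If the grouping is done in a different order (for example, first by hour, then by frequency with the average), it still does not work.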

_context.Octaves 
.Where(x => x.Event.Time >= afterDate) 
.Where(x => x.Event.Time <= beforeDate) 
.GroupBy(x => new { year = x.Event.Time.Year, month = x.Event.Time.Month, day = x.Event.Time.Day, hour = x.Event.Time.Hour, freq = x.Frequency }) 
.Select(x => new 
{ 
    year = x.Key.year, 
    month = x.Key.month, 
    day = x.Key.day, 
    hour = x.Key.hour, 
    freq = x.Key.freq, 
    value = Math.Round(x.Average(y => y.Value), 1) 

}) 
.GroupBy(x => new { year = x.year, month = x.month, day = x.day, hour = x.hour }) 
.Select(x => new 
{ 
    timeEnd = DbFunctions.CreateDateTime(x.Key.year, x.Key.month, x.Key.day, x.Key.hour, 0, 0), 
    data = x.Select(y=> new {freq = y.freq, value = y.value }) 

}) 
.OrderByDescending(m => m.timeEnd) 
.Take(limit)

Do you have appropriate indexes on the tables? Have you thought about storing the hourly aggregated data in a separate table? Would that be an option?

There are nonclustered indexes on Events.EventId, Octaves.EventId, Octaves.OctaveId, and Octaves.Frequency. I thought about storing the aggregated data in a separate table, but decided it was not necessary. Thanks – teocomi

Try adding a computed column to the table that represents the date plus the hour, then create an index on that column. If you group by that column in your EF query, it should be faster.
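To make the "date plus hour" idea in the last comment concrete, here is a minimal in-memory sketch using LINQ to Objects rather than EF6. The `Reading` class, the `HourBucket` helper, the sample data, and the `int` type for the frequency are all assumptions made for illustration; none of them come from the question's schema. It only shows what a single hour-bucket grouping key looks like and the shape of the result. In the database, the bucket would be the indexed computed column the comment proposes, and its definition depends on the actual column types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one Octaves row joined to its Event (not the real entity).
public sealed class Reading
{
    public DateTime Time { get; set; }
    public int Frequency { get; set; }   // assumed type
    public double Value { get; set; }
}

public static class HourlyAverages
{
    // Truncates a timestamp to the start of its hour: one "date + hour" key
    // instead of four separate year/month/day/hour parts.
    public static DateTime HourBucket(DateTime t)
    {
        return new DateTime(t.Year, t.Month, t.Day, t.Hour, 0, 0, t.Kind);
    }

    // Groups by hour bucket (newest first), then averages Value per frequency.
    public static List<KeyValuePair<DateTime, Dictionary<int, double>>> Compute(
        IEnumerable<Reading> readings)
    {
        return readings
            .GroupBy(r => HourBucket(r.Time))
            .OrderByDescending(g => g.Key)
            .Select(g => new KeyValuePair<DateTime, Dictionary<int, double>>(
                g.Key,
                g.GroupBy(r => r.Frequency)
                 .ToDictionary(f => f.Key,
                               f => Math.Round(f.Average(r => r.Value), 1))))
            .ToList();
    }

    // Made-up sample data covering two different hours.
    public static List<Reading> Sample()
    {
        return new List<Reading>
        {
            new Reading { Time = new DateTime(2017, 5, 23, 10, 5, 0),  Frequency = 100, Value = 1.0 },
            new Reading { Time = new DateTime(2017, 5, 23, 10, 40, 0), Frequency = 100, Value = 2.0 },
            new Reading { Time = new DateTime(2017, 5, 23, 10, 40, 0), Frequency = 200, Value = 4.0 },
            new Reading { Time = new DateTime(2017, 5, 23, 11, 15, 0), Frequency = 100, Value = 3.0 },
        };
    }

    public static void Main()
    {
        foreach (var hour in Compute(Sample()))
            foreach (var f in hour.Value)
                Console.WriteLine("{0:yyyy-MM-dd HH}:00 freq={1} avg={2}",
                                  hour.Key, f.Key, f.Value);
    }
}
```

With a single bucket value as the key, the grouping no longer needs the four-way year/month/day/hour comparison that appears in the generated SQL above. Whether that actually speeds up the EF6 query would have to be measured against the real tables.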

Answers

I am not sure this will help, and it may even make things worse, but it is worth a try.

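// Same shape as the query in the question, with AsNoTracking() and a single Where.
_context.Events.AsNoTracking() 
    .Where(x => x.Time >= afterDate && x.Time <= beforeDate) 
    // Group directly on the date parts of Time (there is no x.year without a prior Select).
    .GroupBy(x => new { year = x.Time.Year, month = x.Time.Month, day = x.Time.Day, hour = x.Time.Hour }) 
    .Select(x => new 
    { 
        Time = DbFunctions.CreateDateTime(x.Key.year, x.Key.month, x.Key.day, x.Key.hour, 0, 0), 
        // Flatten the octaves (Data) of every event in the hour, then average per frequency.
        Data = x.SelectMany(e => e.Data) 
                .GroupBy(o => o.Frequency) 
                .Select(g => new 
                { 
                    frequency = g.Key, 
                    value = Math.Round(g.Average(o => o.Value), 1) 
                }) 
    }) 
    .OrderByDescending(m => m.Time) 
    .Take(limit); 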