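EF6 aggregation

I have two tables, Events and Octaves, with 400k events and 4 million octaves. I want to filter the events in a given time range, aggregate them per hour, and for each hour return the average value of the octaves that share the same frequency. This is the EF6 LINQ code I am using. It works, but only when the time span is very small (a few hours):

_context.Events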
    .Where(x => x.Time >= afterDate)
    .Where(x => x.Time <= beforeDate)
    .Select(x => new { year = x.Time.Year, month = x.Time.Month, day = x.Time.Day, hour = x.Time.Hour, data = x.Data })
    .GroupBy(x => new { year = x.year, month = x.month, day = x.day, hour = x.hour })
    .Where(x => x.Any())
    .Select(x => new
    {
        Time = DbFunctions.CreateDateTime(x.Key.year, x.Key.month, x.Key.day, x.Key.hour, 0, 0),
        Data = x.SelectMany(y => y.data).GroupBy(y => new { frequency = y.Frequency }).Select(y => new
        {
            frequency = y.Key.frequency,
            value = Math.Round(y.Average(z => z.Value), 1),
        })
    })
    .OrderByDescending(m => m.Time)
    .Take(limit);
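When the time span grows to a few days, the query seems to run forever. Am I asking too much of SQL Server, or is there a better way to run this query and structure the data? If I remove the SelectMany(...).GroupBy(...), it no longer goes crazy.
The generated SQL query: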
SELECT
[Project5].[C1] AS [C1],
[Project5].[C2] AS [C2],
[Project5].[C3] AS [C3],
[Project5].[C4] AS [C4],
[Project5].[C5] AS [C5],
[Project5].[C6] AS [C6],
[Project5].[C8] AS [C7],
[Project5].[Frequency] AS [Frequency],
[Project5].[C7] AS [C8]
FROM (SELECT
[Limit1].[C1] AS [C1],
[Limit1].[C2] AS [C2],
[Limit1].[C3] AS [C3],
[Limit1].[C4] AS [C4],
[Limit1].[C5] AS [C5],
[Limit1].[C6] AS [C6],
CASE WHEN ([GroupBy1].[K1] IS NULL) THEN CAST(NULL AS float) ELSE ROUND([GroupBy1].[A1], 1) END AS [C7],
[GroupBy1].[K1] AS [Frequency],
CASE WHEN ([GroupBy1].[K1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C8]
FROM (SELECT TOP (10000) [Project4].[C1] AS [C1], [Project4].[C2] AS [C2], [Project4].[C3] AS [C3], [Project4].[C4] AS [C4], [Project4].[C5] AS [C5], [Project4].[C6] AS [C6]
FROM (SELECT
[Project2].[C1] AS [C1],
[Project2].[C2] AS [C2],
[Project2].[C3] AS [C3],
[Project2].[C4] AS [C4],
1 AS [C5],
convert (datetime2,right('000' + convert(varchar(255), [Project2].[C1]), 4) + '-' + convert(varchar(255), [Project2].[C2]) + '-' + convert(varchar(255), [Project2].[C3]) + ' ' + convert(varchar(255), [Project2].[C4]) + ':' + convert(varchar(255), 0) + ':' + str(cast(0 as float(53)), 10, 7), 121) AS [C6]
FROM (SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[C2] AS [C2],
[Distinct1].[C3] AS [C3],
[Distinct1].[C4] AS [C4]
FROM (SELECT DISTINCT
DATEPART (year, [Extent1].[TimeEnd]) AS [C1],
DATEPART (month, [Extent1].[TimeEnd]) AS [C2],
DATEPART (day, [Extent1].[TimeEnd]) AS [C3],
DATEPART (hour, [Extent1].[TimeEnd]) AS [C4]
FROM [dbo].[Events] AS [Extent1]
WHERE ([Extent1].[TimeEnd] >= @p__linq__1) AND ([Extent1].[TimeEnd] <= @p__linq__2)
) AS [Distinct1]
) AS [Project2]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[Events] AS [Extent2]
WHERE ([Extent2].[TimeEnd] >= @p__linq__1) AND ([Extent2].[TimeEnd] <= @p__linq__2) AND (([Project2].[C1] = (DATEPART (year, [Extent2].[TimeEnd]))) OR (([Project2].[C1] IS NULL) AND (DATEPART (year, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C2] = (DATEPART (month, [Extent2].[TimeEnd]))) OR (([Project2].[C2] IS NULL) AND (DATEPART (month, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C3] = (DATEPART (day, [Extent2].[TimeEnd]))) OR (([Project2].[C3] IS NULL) AND (DATEPART (day, [Extent2].[TimeEnd]) IS NULL))) AND (([Project2].[C4] = (DATEPART (hour, [Extent2].[TimeEnd]))) OR (([Project2].[C4] IS NULL) AND (DATEPART (hour, [Extent2].[TimeEnd]) IS NULL)))
)
) AS [Project4]
ORDER BY [Project4].[C6] DESC) AS [Limit1]
OUTER APPLY (SELECT
[Extent4].[Frequency] AS [K1],
AVG([Extent4].[Value]) AS [A1]
FROM [dbo].[Events] AS [Extent3]
INNER JOIN [dbo].[Octaves] AS [Extent4] ON [Extent3].[EventId] = [Extent4].[EventId]
WHERE ([Extent3].[TimeEnd] >= @p__linq__1) AND ([Extent3].[TimeEnd] <= @p__linq__2) AND (([Limit1].[C1] = (DATEPART (year, [Extent3].[TimeEnd]))) OR (([Limit1].[C1] IS NULL) AND (DATEPART (year, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C2] = (DATEPART (month, [Extent3].[TimeEnd]))) OR (([Limit1].[C2] IS NULL) AND (DATEPART (month, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C3] = (DATEPART (day, [Extent3].[TimeEnd]))) OR (([Limit1].[C3] IS NULL) AND (DATEPART (day, [Extent3].[TimeEnd]) IS NULL))) AND (([Limit1].[C4] = (DATEPART (hour, [Extent3].[TimeEnd]))) OR (([Limit1].[C4] IS NULL) AND (DATEPART (hour, [Extent3].[TimeEnd]) IS NULL)))
GROUP BY [Extent4].[Frequency]) AS [GroupBy1]
) AS [Project5]
ORDER BY [Project5].[C6] DESC, [Project5].[C1] ASC, [Project5].[C2] ASC, [Project5].[C3] ASC, [Project5].[C4] ASC, [Project5].[C8] ASC
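UPDATE 1
I tried "flipping" the query by querying the Octaves directly, and I am getting better results. I first group them by date and frequency and calculate the average, then group by hour. It is not elegant at all, but it is the first solution that actually works. If the grouping is done in a different order (e.g. first by hour, then by frequency, then averaged), it still does not work.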
_context.Octaves
    .Where(x => x.Event.Time >= afterDate)
    .Where(x => x.Event.Time <= beforeDate)
    .GroupBy(x => new { year = x.Event.Time.Year, month = x.Event.Time.Month, day = x.Event.Time.Day, hour = x.Event.Time.Hour, freq = x.Frequency })
    .Select(x => new
    {
        year = x.Key.year,
        month = x.Key.month,
        day = x.Key.day,
        hour = x.Key.hour,
        freq = x.Key.freq,
        value = Math.Round(x.Average(y => y.Value), 1)
    })
    .GroupBy(x => new { year = x.year, month = x.month, day = x.day, hour = x.hour })
    .Select(x => new
    {
        timeEnd = DbFunctions.CreateDateTime(x.Key.year, x.Key.month, x.Key.day, x.Key.hour, 0, 0),
        data = x.Select(y => new { freq = y.freq, value = y.value })
    })
    .OrderByDescending(m => m.timeEnd)
    .Take(limit);
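Do you have proper indexes on the tables? Have you thought about storing the hourly aggregated data in a separate table? Would that be an option?
There are non-clustered indexes on Events.EventId, Octaves.EventId, Octaves.OctaveId and Octaves.Frequency. I did think about storing the aggregated data in a separate table, but I thought it would not be necessary. Thanks – teocomi
Try creating a computed column on the table that represents the date + hour, and then create an index on that column. If you group by that column in your EF query, it should be faster.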
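For illustration, the computed-column suggestion might look roughly like the T-SQL below. This is only a sketch under assumptions: the column name TimeEndHour and the index name IX_Events_TimeEndHour are made up, dbo.Events.TimeEnd is assumed to be a datetime or datetime2 column, and the date values are placeholders. Whether SQL Server accepts the expression as PERSISTED depends on it being treated as deterministic for your column type, and none of this has been run against the tables in the question.

```sql
-- Hypothetical names: TimeEndHour, IX_Events_TimeEndHour.
-- Assumes dbo.Events.TimeEnd is a datetime/datetime2 column.

-- Computed column that truncates TimeEnd to the start of its hour.
ALTER TABLE dbo.Events
    ADD TimeEndHour AS DATEADD(hour, DATEDIFF(hour, 0, TimeEnd), 0) PERSISTED;
GO

-- Index the computed column so grouping on it does not have to
-- evaluate four DATEPART expressions per row.
CREATE NONCLUSTERED INDEX IX_Events_TimeEndHour
    ON dbo.Events (TimeEndHour);
GO

-- The aggregation from the question, grouped on the single hour column.
-- The date values are placeholders.
DECLARE @afterDate  datetime2 = '2017-01-01',
        @beforeDate datetime2 = '2017-01-08';

SELECT e.TimeEndHour,
       o.Frequency,
       ROUND(AVG(o.Value), 1) AS Value
FROM dbo.Events AS e
INNER JOIN dbo.Octaves AS o ON o.EventId = e.EventId
WHERE e.TimeEnd >= @afterDate AND e.TimeEnd <= @beforeDate
GROUP BY e.TimeEndHour, o.Frequency
ORDER BY e.TimeEndHour DESC, o.Frequency;
```

On the EF side the column would presumably be mapped as a database-computed property, so the LINQ query could group on that one property instead of on year, month, day and hour separately.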