I have the following psql table; in reality it has roughly 2 billion rows. The more joins I use, the fewer results my PSQL query returns.
id word lemma pos textid country_genre
1 Stuffing stuff vvg 190568 AN
2 her her appge 190568 AN
3 key key nn1 190568 AN
4 into into ii 190568 AN
5 the the at 190568 AN
6 lock lock nn1 190568 AN
7 she she appge 190568 AN
8 pushed push vvd 190568 AN
9 her her appge 190568 AN
10 way way nn1 190568 AN
11 into into ii 190568 AN
12 the the appge 190568 AN
13 house house nn1 190568 AN
14 . . 190568 AN
15 She she appge 190568 AN
16 had have vhd 190568 AN
17 also also rr 190568 AN
18 cajoled cajole vvd 190568 AN
19 her her appge 190568 AN
20 way way nn1 190568 AN
21 into into ii 190568 AN
22 the the at 190568 AN
23 home home nn1 190568 AN
24 . . 190568 AN
.. ... ... .. ... ..
I would like to create the following table, which shows all "way"-idioms with their words side by side, together with data from the columns "country_genre", "lemma" and "pos".
country_genre word word word lemma pos word word word word word lemma pos word word
AN lock she pushed push vvd her way into the house house nn1 . she
AN had also cajoled cajole vvd her way into the home home nn1 . A
AN tried to force force vvi her way into the palace palace nn1 , officials
I use the following code (thanks to Bohemian: https://stackoverflow.com/a/47496945/3957383!):
copy(
SELECT
c1.id, c1.country_genre, c1.textid, c1.wordid, c1.word, c2.word, c3.word, c4.word, c4.lemma, c4.pos, c5.word, c6.word, c7.word, c8.word, c9.word, c9.lemma, c9.pos, c10.word, c11.word
FROM
orderedflatcorpus AS c1
JOIN orderedflatcorpus AS c2 ON c1.id + 1 = c2.id
JOIN orderedflatcorpus AS c3 ON c1.id + 2 = c3.id
JOIN orderedflatcorpus AS c4 ON c1.id + 3 = c4.id
JOIN orderedflatcorpus AS c5 ON c1.id + 4 = c5.id
JOIN orderedflatcorpus AS c6 ON c1.id + 5 = c6.id
JOIN orderedflatcorpus AS c7 ON c1.id + 6 = c7.id
JOIN orderedflatcorpus AS c8 ON c1.id + 7 = c8.id
JOIN orderedflatcorpus AS c9 ON c1.id + 8 = c9.id
JOIN orderedflatcorpus AS c10 ON c1.id + 9 = c10.id
JOIN orderedflatcorpus AS c11 ON c1.id + 10 = c11.id
WHERE
c4.pos LIKE 'vv%'
AND c5.pos = 'appge'
AND c6.word = 'way'
AND c7.pos LIKE 'i%'
AND c8.word = 'the'
AND c9.pos LIKE 'n%'
)
TO
'/home/postgres/Results/OUTPUT.csv'
DELIMITER E'\t'
csv header;
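This query returns 18,706 relevant structures.
When I extract more context (21 words instead of 11) with the following code, which is otherwise equivalent to the previous query, something worrying happens: I get only 18,555 relevant structures.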
copy(
SELECT c1.id, c1.country_genre, c1.textid, c1.wordid, c1.word, c1.pos, c2.word, c2.pos, c3.word, c3.pos, c4.word, c4.pos, c5.word, c5.pos, c6.word, c6.pos,
c7.word, c7.pos, c8.word, c8.pos, c8.lemma, c9.word, c9.pos, c10.word, c10.pos, c11.word, c11.pos, c12.word, c12.pos, c13.word, c13.pos, c13.lemma, c14.word,
c14.pos, c15.word, c15.pos, c16.word, c16.pos, c17.word, c17.pos, c18.word, c18.pos, c19.word, c19.pos, c20.word, c20.pos, c21.word, c21.pos
FROM
orderedflatcorpus AS c1
JOIN orderedflatcorpus AS c2 ON c1.id + 1 = c2.id
JOIN orderedflatcorpus AS c3 ON c1.id + 2 = c3.id
JOIN orderedflatcorpus AS c4 ON c1.id + 3 = c4.id
JOIN orderedflatcorpus AS c5 ON c1.id + 4 = c5.id
JOIN orderedflatcorpus AS c6 ON c1.id + 5 = c6.id
JOIN orderedflatcorpus AS c7 ON c1.id + 6 = c7.id
JOIN orderedflatcorpus AS c8 ON c1.id + 7 = c8.id
JOIN orderedflatcorpus AS c9 ON c1.id + 8 = c9.id
JOIN orderedflatcorpus AS c10 ON c1.id + 9 = c10.id
JOIN orderedflatcorpus AS c11 ON c1.id + 10 = c11.id
JOIN orderedflatcorpus AS c12 ON c1.id + 11 = c12.id
JOIN orderedflatcorpus AS c13 ON c1.id + 12 = c13.id
JOIN orderedflatcorpus AS c14 ON c1.id + 13 = c14.id
JOIN orderedflatcorpus AS c15 ON c1.id + 14 = c15.id
JOIN orderedflatcorpus AS c16 ON c1.id + 15 = c16.id
JOIN orderedflatcorpus AS c17 ON c1.id + 16 = c17.id
JOIN orderedflatcorpus AS c18 ON c1.id + 17 = c18.id
JOIN orderedflatcorpus AS c19 ON c1.id + 18 = c19.id
JOIN orderedflatcorpus AS c20 ON c1.id + 19 = c20.id
JOIN orderedflatcorpus AS c21 ON c1.id + 20 = c21.id
WHERE
c8.pos LIKE 'vv%'
AND c9.pos = 'appge'
AND c10.word = 'way'
AND c11.pos LIKE 'i%'
AND c12.word = 'the'
AND c13.pos LIKE 'n%'
)
TO '/home/postgres/Results/OUTPUT.csv' DELIMITER E'\t' csv header;
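Every JOIN in both queries is an inner join, so a window is only returned when all of its ids exist. As a sketch of what I suspect might be going on (an assumption, not something I have verified on the real table), the toy example below uses a hypothetical temporary table `toycorpus` with one id deliberately missing, and shows how a wider window can lose a match that a narrower window keeps.

```sql
-- Hypothetical toy data: ids 1..40 with id 23 missing,
-- and the word 'way' at ids 10, 20 and 30.
CREATE TEMP TABLE toycorpus AS
SELECT g AS id,
       CASE WHEN g IN (10, 20, 30) THEN 'way' ELSE 'w' || g::text END AS word
FROM generate_series(1, 40) AS g
WHERE g <> 23;

-- Narrow window: one row of context on each side of 'way'.
SELECT count(*) AS narrow_matches   -- 3 with this toy data
FROM toycorpus AS c1
JOIN toycorpus AS c2 ON c1.id + 1 = c2.id
JOIN toycorpus AS c3 ON c1.id + 2 = c3.id
WHERE c2.word = 'way';

-- Wide window: three rows of context on each side of 'way'.
-- The match at id 20 needs ids 17..23; id 23 does not exist,
-- so the inner join silently drops that match.
SELECT count(*) AS wide_matches     -- 2 with this toy data
FROM toycorpus AS c1
JOIN toycorpus AS c2 ON c1.id + 1 = c2.id
JOIN toycorpus AS c3 ON c1.id + 2 = c3.id
JOIN toycorpus AS c4 ON c1.id + 3 = c4.id
JOIN toycorpus AS c5 ON c1.id + 4 = c5.id
JOIN toycorpus AS c6 ON c1.id + 5 = c6.id
JOIN toycorpus AS c7 ON c1.id + 6 = c7.id
WHERE c4.word = 'way';
```

If something similar applies to `orderedflatcorpus`, the matches lost by the 21-word query would be those with a missing id somewhere in the extra context, or those too close to the first or last row of the table.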
I have looked at the rows that are missing from the second query, but I cannot detect any pattern in what gets excluded.
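To narrow down where the missing rows go, one check I could run anchors on the word 'way' and counts how many of the 21 expected ids exist around each match. This is only a sketch: it assumes that `id` is unique and that the lost matches are caused by ids missing from the wider window, neither of which I have confirmed. The aliases `w`, `v`, `p`, `i`, `t`, `n` and `ctx` are names chosen purely for this illustration.

```sql
-- In the 21-word query 'way' is c10, so c1 = way - 9 and c21 = way + 11.
-- The pattern itself (verb, appge, way, preposition, the, noun) is matched
-- with inner joins exactly as before; only the surrounding context is
-- attached with a LEFT JOIN so that incomplete windows are not dropped.
SELECT w.id                AS way_id,
       count(ctx.id)       AS rows_found,    -- 21 when the window is complete
       21 - count(ctx.id)  AS rows_missing
FROM orderedflatcorpus AS w
JOIN orderedflatcorpus AS v ON v.id = w.id - 2 AND v.pos LIKE 'vv%'
JOIN orderedflatcorpus AS p ON p.id = w.id - 1 AND p.pos = 'appge'
JOIN orderedflatcorpus AS i ON i.id = w.id + 1 AND i.pos LIKE 'i%'
JOIN orderedflatcorpus AS t ON t.id = w.id + 2 AND t.word = 'the'
JOIN orderedflatcorpus AS n ON n.id = w.id + 3 AND n.pos LIKE 'n%'
LEFT JOIN orderedflatcorpus AS ctx
       ON ctx.id BETWEEN w.id - 9 AND w.id + 11
WHERE w.word = 'way'
GROUP BY w.id
HAVING count(ctx.id) < 21   -- keep only matches with an incomplete window
ORDER BY w.id;
```

If the assumption holds, the matches listed here should correspond to the structures that the 11-word query returns and the 21-word query does not.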
Does anyone have an idea of what might be happening here? Thanks!
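If this is a continuation of a previous question, you should at the very least provide a link to it. Make this question stand on its own two feet by actually telling us what you are doing here. –
Thanks for the hint, I just added the link! – Znusgy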