2016-09-20 11 views
0

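Merge query performance - unique constraint or index?

I am pushing 2k+ nodes and 8k+ edges to the graph, which takes about 7000 ms. Going forward, I will be working with 100k+ nodes and relationships. My query uses MERGE operations like this: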

MERGE (a:User {user:'username'}) 
MERGE (b:Hobby {hobby:'hobby'}) 
MERGE (a)-[r:Hobby]->(b) 

Note: username and hobby are strings in the query

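Now I am trying to improve the performance of this query. After some googling, I found two ways to do that:

  1. Indexing the node properties user and hobby, so that the MERGE operations get faster.
  2. Putting a constraint on the node properties user and hobby. Many people suggest this method.

My questions are:

  1. What is the difference between indexing a property and creating a constraint on it? How does the graph handle each of these operations internally?
  2. Which one is the right way to improve performance?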

Edit:

My code:

session = driver.session() 
session.run('CREATE CONSTRAINT ON (u:user) ASSERT u.user IS UNIQUE') 
session.run('CREATE CONSTRAINT ON (h:hobby) ASSERT h.hobby IS UNIQUE') 

session.close() 

def writeBatch(b): 
    print("writing batch of " + str(len(b))) 
    session = driver.session() 
    session.run('UNWIND {batch} AS elt '+ 
       'MERGE (u:user{user: elt.user})'+ 
       'MERGE (h:hobby{hobby:elt.hobby})'+ 
       'MERGE (u)-[r:hobby]->(h)' 
       +'', {'batch': b}) 
    session.close() 

Error:

Traceback (most recent call last): 
    File "/Users/adaggula/Documents/workspace2/Facebook/FbNeo.py", line 145, in <module> 
    userhobby.foreach(write2neo) 
    File "/usr/local/spark/python/pyspark/rdd.py", line 747, in foreach 
    self.mapPartitions(processPartition).count() # Force evaluation 
    File "/usr/local/spark/python/pyspark/rdd.py", line 1004, in count 
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum() 
    File "/usr/local/spark/python/pyspark/rdd.py", line 995, in sum 
    return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add) 
    File "/usr/local/spark/python/pyspark/rdd.py", line 869, in fold 
    vals = self.mapPartitions(func).collect() 
    File "/usr/local/spark/python/pyspark/rdd.py", line 771, in collect 
    port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd()) 
    File "/usr/local/spark/python/pyspark/rdd.py", line 2379, in _jrdd 
    pickled_cmd, bvars, env, includes = _prepare_for_python_RDD(self.ctx, command, self) 
    File "/usr/local/spark/python/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD 
    pickled_command = ser.dumps(command) 
    File "/usr/local/spark/python/pyspark/serializers.py", line 428, in dumps 
    return cloudpickle.dumps(obj, 2) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 646, in dumps 
    cp.dump(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 107, in dump 
    return Pickler.dump(self, obj) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump 
    self.save(obj) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 562, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 633, in _batch_appends 
    save(x) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 633, in _batch_appends 
    save(x) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 633, in _batch_appends 
    save(x) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 633, in _batch_appends 
    save(x) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 636, in _batch_appends 
    save(tmp[0]) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 199, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 236, in save_function_tuple 
    save((code, closure, base_globals)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 636, in _batch_appends 
    save(tmp[0]) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 193, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 241, in save_function_tuple 
    save(f_globals) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 193, in save_function 
    self.save_function_tuple(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 241, in save_function_tuple 
    save(f_globals) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 686, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 524, in save_reduce 
    save(args) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 636, in _batch_appends 
    save(tmp[0]) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 686, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save 
    self.save_reduce(obj=obj, *rv) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 542, in save_reduce 
    save(state) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 548, in save_tuple 
    save(element) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems 
    save(v) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 315, in save_builtin_function 
    return self.save_function(obj) 
    File "/usr/local/spark/python/pyspark/cloudpickle.py", line 191, in save_function 
    if islambda(obj) or obj.__code__.co_filename == '<stdin>' or themodule is None: 
AttributeError: 'builtin_function_or_method' object has no attribute '__code__' 
16/09/20 16:35:22 INFO SparkContext: Invoking stop() from shutdown hook 

Answers

2

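Indexes

An index is a fast way to find the nodes whose indexed property has a given value. It replaces a sequential scan of all the nodes (an O(n) algorithm) with a lookup that is usually O(log(n)). Many nodes can have a property with the same value.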
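To make the difference concrete, here is a toy model in Python. It is only an illustration of scan versus index lookup, not how Neo4j stores its indexes, and the function names and sample data are made up for the example.

```python
import bisect

# Toy data: (property value, node id) pairs. Two nodes share the value "bob",
# which a plain index allows.
nodes = [("alice", 0), ("bob", 1), ("carol", 2), ("bob", 3)]


def scan_lookup(nodes, value):
    """Sequential scan: looks at every node, so O(n)."""
    return [node_id for name, node_id in nodes if name == value]


def build_index(nodes):
    """A sorted list of (value, node id) pairs standing in for an index."""
    return sorted(nodes)


def index_lookup(index, value):
    """Binary search for the first entry with this value (O(log n)),
    then walk forward over the entries that share it."""
    pos = bisect.bisect_left(index, (value,))
    matches = []
    while pos < len(index) and index[pos][0] == value:
        matches.append(index[pos][1])
        pos += 1
    return matches


index = build_index(nodes)
print(scan_lookup(nodes, "bob"))   # same result as the indexed lookup
print(index_lookup(index, "bob"))
```

Both lookups return the same node ids; the indexed one just avoids touching every node.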

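A constraint is a way to enforce a schema on your data. There are two kinds of constraints on nodes in Neo4j:

  1. Property uniqueness:

    CREATE CONSTRAINT ON (n:Node) ASSERT n.uuid IS UNIQUE; 
    
  2. Property existence:

    CREATE CONSTRAINT ON (n:Node) ASSERT exists(n.name); 
    

It so happens that a uniqueness constraint uses an index to quickly find out whether another node already uses the same value.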

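So a label with a uniqueness constraint on a property also has an index on that property, whereas a label with an index on a property does not require the values to be unique.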
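A rough way to picture that relationship, again as a toy Python model rather than Neo4j's actual implementation (the class and method names are hypothetical): the uniqueness check is just an index lookup done before the insert, and a MERGE-style "find or create" reuses the same lookup.

```python
class UniquePropertyIndex:
    """Toy index from property value to node id that refuses duplicates."""

    def __init__(self):
        self._node_by_value = {}

    def find(self, value):
        """Return the node id holding this value, or None."""
        return self._node_by_value.get(value)

    def create(self, value):
        """Create a node; the index lookup is what enforces uniqueness."""
        if value in self._node_by_value:
            raise ValueError("value already used: %r" % (value,))
        node_id = len(self._node_by_value)
        self._node_by_value[value] = node_id
        return node_id

    def merge(self, value):
        """Find-or-create, in the spirit of MERGE on a unique property."""
        existing = self.find(value)
        return existing if existing is not None else self.create(value)


users = UniquePropertyIndex()
first = users.merge("alice")    # creates the node
second = users.merge("alice")   # finds the same node again
print(first == second)
```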

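Which one to use?

Since you are using MERGE to find or create the User and Hobby nodes, these properties are obviously meant to be unique. You should definitely use a uniqueness constraint to enforce the schema, rather than simply having an index.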

CREATE CONSTRAINT ON (n:User) ASSERT n.user IS UNIQUE; 
CREATE CONSTRAINT ON (n:Hobby) ASSERT n.hobby IS UNIQUE; 
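With the constraints in place, the batched write from the question could look roughly like the sketch below. The helper names chunked and write_in_batches and the batch size are hypothetical, and session is assumed to be any object exposing run(query, parameters) the way the driver session in the question does. The query keeps the {batch} parameter syntax used in the question. Labels are case-sensitive, so the labels in the MERGE clauses have to match the ones in the constraints (User and Hobby here) for the index to be used.

```python
# Same shape as the query in the question, with spaces between the clauses
# and labels matching the constraints above.
MERGE_QUERY = (
    "UNWIND {batch} AS elt "
    "MERGE (u:User {user: elt.user}) "
    "MERGE (h:Hobby {hobby: elt.hobby}) "
    "MERGE (u)-[:Hobby]->(h)"
)


def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_in_batches(session, rows, batch_size=1000):
    """Send the rows to the database one batch per query.

    `session` is assumed to expose run(query, parameters); `rows` is a list
    of dicts with "user" and "hobby" keys. Returns the number of rows sent.
    """
    written = 0
    for batch in chunked(rows, batch_size):
        session.run(MERGE_QUERY, {"batch": batch})
        written += len(batch)
    return written
```

Sending one UNWIND query per batch keeps the number of round trips low, while the uniqueness constraints let each MERGE find existing nodes through the index instead of scanning the label.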
+0

It throws this error: if islambda(obj) or obj.__code__.co_filename == '<stdin>' or themodule is None: AttributeError: 'builtin_function_or_method' object has no attribute '__code__' –

+0

I am using sessions with the neo4j.v1 driver. I created the constraints in one session before pushing the data, and then I push the data in batches. –

+0

@Frank Pavageau I have edited my question with the code I used and the error I ran into. –
