oranie's blog

Formerly: change the i to a g and... what was it again...

Cassandra: a summary of read and write consistency levels (Consistency Level)

I checked this against the 1.1.5 source. It is explained clearly in a comment in ConsistencyLevel.java.
The relevant lines are 56-75.

 * Write consistency levels make the following guarantees before reporting success to the client:
 *   ANY          Ensure that the write has been written once somewhere, including possibly being hinted in a non-target node.
 *   ONE          Ensure that the write has been written to at least 1 node's commit log and memory table
 *   TWO          Ensure that the write has been written to at least 2 node's commit log and memory table
 *   THREE        Ensure that the write has been written to at least 3 node's commit log and memory table
 *   QUORUM       Ensure that the write has been written to <ReplicationFactor> / 2 + 1 nodes
 *   LOCAL_QUORUM Ensure that the write has been written to <ReplicationFactor> / 2 + 1 nodes, within the local datacenter (requires NetworkTopologyStrategy)
 *   EACH_QUORUM  Ensure that the write has been written to <ReplicationFactor> / 2 + 1 nodes in each datacenter (requires NetworkTopologyStrategy)
 *   ALL          Ensure that the write is written to <code>&lt;ReplicationFactor&gt;</code> nodes before responding to the client.
 * 
 * Read consistency levels make the following guarantees before returning successful results to the client:
 *   ANY          Not supported. You probably want ONE instead.
 *   ONE          Returns the record obtained from a single replica.
 *   TWO          Returns the record with the most recent timestamp once two replicas have replied.
 *   THREE        Returns the record with the most recent timestamp once three replicas have replied.
 *   QUORUM       Returns the record with the most recent timestamp once a majority of replicas have replied.
 *   LOCAL_QUORUM Returns the record with the most recent timestamp once a majority of replicas within the local datacenter have replied.
 *   EACH_QUORUM  Returns the record with the most recent timestamp once a majority of replicas within each datacenter have replied.
 *   ALL          Returns the record with the most recent timestamp once all replicas have replied (implies no replica may be down)..
 */

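The levels used most often are ONE and QUORUM, so those are the only ones I'll go over here.
For writes, ONE succeeds as soon as the write lands on any one node in the replica set (it counts as done once it has been written to the commit log and has reached the memtable in memory). QUORUM computes (number of replicas) / 2 + 1 and succeeds once that many replicas have written successfully.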
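To make the QUORUM arithmetic concrete, here is a minimal sketch. The class and method names (QuorumSketch, quorum, writeSucceeded) are made up for illustration and are not Cassandra's own code; the only thing taken from the source comment is the formula ReplicationFactor / 2 + 1, which I am assuming uses integer division.

```java
// Hypothetical helper for illustration only; not taken from the Cassandra source.
class QuorumSketch {

    // Number of replicas that must acknowledge at QUORUM.
    // Integer division truncates, so RF=3 gives 2 and RF=4 gives 3.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    // True when enough replicas have acknowledged a QUORUM write.
    static boolean writeSucceeded(int acks, int replicationFactor) {
        return acks >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf + " -> QUORUM needs " + quorum(rf) + " replicas");
        }
    }
}
```

So with a replication factor of 3, a QUORUM write can still succeed with one replica down, while a replication factor of 2 needs both replicas.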
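Reads are a little more involved. With ONE, the read succeeds as soon as any one replica answers, and that answer is returned to the client. Does that mean that when every replica is healthy, the client simply gets whichever response arrives fastest? I need to dig further into the source to confirm this. With QUORUM, the coordinator first waits for (number of replicas) / 2 + 1 nodes to reply, then compares timestamps and returns only the most recent record.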
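The read side can be sketched in the same spirit. This is only my reading of the comment, not the real read path: the Reply type and the resolve method are hypothetical, and I am assuming that replies are handled in arrival order and that "most recent" means the largest timestamp among the replies that were waited for.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of read resolution; not the actual Cassandra implementation.
class ReadResolveSketch {

    // One replica's answer: the stored value and its write timestamp.
    static final class Reply {
        final String value;
        final long timestamp;

        Reply(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Takes the first `required` replies (assumed arrival order) and
    // returns the one with the largest timestamp.
    static Reply resolve(List<Reply> repliesInArrivalOrder, int required) {
        if (required < 1 || repliesInArrivalOrder.size() < required) {
            throw new IllegalStateException("not enough replies for this consistency level");
        }
        return repliesInArrivalOrder.subList(0, required).stream()
                .max(Comparator.comparingLong((Reply r) -> r.timestamp))
                .get();
    }

    public static void main(String[] args) {
        List<Reply> replies = Arrays.asList(
                new Reply("old", 100L),   // fastest replica, but stale
                new Reply("new", 200L),
                new Reply("newest", 300L));

        System.out.println("ONE    -> " + resolve(replies, 1).value);
        System.out.println("QUORUM -> " + resolve(replies, 3 / 2 + 1).value);
    }
}
```

Under these assumptions, ONE can hand back a stale value if the fastest replica is behind, whereas QUORUM with three replicas compares two replies and picks the newer of the two.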

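There are also LOCAL_QUORUM and EACH_QUORUM. These seem to control consistency in combination with the network topology settings, for example when a cluster spans multiple data centers (iDCs).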
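As a rough sketch of how the two differ, going only by the source comment: LOCAL_QUORUM needs a quorum in the local datacenter, while EACH_QUORUM needs a quorum in every datacenter. The class name, the method names, and the datacenter names dc1 and dc2 are all hypothetical, and I am assuming that the replication factor in the formula is the per-datacenter one configured through NetworkTopologyStrategy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of per-datacenter quorums; not Cassandra's own code.
class DatacenterQuorumSketch {

    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // LOCAL_QUORUM: only the local datacenter's replicas are counted.
    static int localQuorum(Map<String, Integer> rfByDatacenter, String localDc) {
        Integer rf = rfByDatacenter.get(localDc);
        if (rf == null) {
            throw new IllegalArgumentException("unknown datacenter: " + localDc);
        }
        return quorum(rf);
    }

    // EACH_QUORUM: a quorum is required in every datacenter separately.
    static Map<String, Integer> eachQuorum(Map<String, Integer> rfByDatacenter) {
        Map<String, Integer> required = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : rfByDatacenter.entrySet()) {
            required.put(e.getKey(), quorum(e.getValue()));
        }
        return required;
    }

    public static void main(String[] args) {
        Map<String, Integer> rf = new LinkedHashMap<>();
        rf.put("dc1", 3); // assumed per-datacenter replication factors
        rf.put("dc2", 2);

        System.out.println("LOCAL_QUORUM in dc1 needs " + localQuorum(rf, "dc1"));
        System.out.println("EACH_QUORUM needs " + eachQuorum(rf));
    }
}
```

The practical difference is that LOCAL_QUORUM never has to wait on a remote datacenter, while EACH_QUORUM does.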

I haven't verified the finer details yet, so I'll update this post once I've traced through all of it.