In a MongoDB sharded cluster,
a collection is partitioned into multiple pieces that are distributed across the shard servers; each of these pieces is called a chunk.
Chunks need to be spread evenly across the shards to get good performance. To keep them even,
MongoDB splits large chunks into smaller ones (chunk split) and migrates chunks from shards that hold many chunks to shards that hold fewer (chunk migration).
In this post we will look at how this balancing work proceeds through chunk split and chunk migration.
 

-1. Setting the chunk size

mongos> use config
switched to db config
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 64 }

mongos> db.settings.save ( {_id:"chunksize", value:1})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 1 }
 

-2. Chunk distribution test while loading data

1) When the shard key has high cardinality
 
mongos> sh.enableSharding('test')

mongos> sh.shardCollection('test.chunk_test',{t:1})

for (var i = 1; i <= 100000; i++) {
   db.chunk_test.insert( { t : i } )
}
=> Load data into the chunk_test collection, whose shard key is t
 
mongos> sh.status()
--- Sharding Status ---
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
=> Autosplit and the balancer are enabled by default
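
If you want to control these while testing, the mongo shell ships helpers for both; a quick sketch (the autosplit helpers exist from 3.4 on):

mongos> sh.getBalancerState()     // is the balancer enabled?
mongos> sh.stopBalancer()         // disable the balancer
mongos> sh.startBalancer()        // enable it again
mongos> sh.disableAutoSplit()     // turn autosplit off (3.4+)
mongos> sh.enableAutoSplit()      // turn autosplit back on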
 
 
mongos> db.chunk_test.getShardDistribution()

Shard repl_shard2 at repl_shard2/mongo_shard4:27018,mongo_shard5:27018,mongo_shard6:27018
 data : 1.14MiB docs : 36421 chunks : 4
 estimated data per chunk : 293KiB
 estimated docs per chunk : 9105

Shard repl_shard1 at repl_shard1/mongo_shard1:27018,mongo_shard2:27018,mongo_shard3:27018
 data : 2MiB docs : 63703 chunks : 3
 estimated data per chunk : 684KiB
 estimated docs per chunk : 21234

Totals
 data : 3.15MiB docs : 100124 chunks : 7
 Shard repl_shard2 contains 36.37% data, 36.37% docs in cluster, avg obj size on shard : 33B
 Shard repl_shard1 contains 63.62% data, 63.62% docs in cluster, avg obj size on shard : 33B
 
=> The chunk_test collection is sharded across repl_shard1 / repl_shard2 with 3 and 4 chunks respectively.
Because of the nature of the loaded data the ratio is skewed toward repl_shard1 rather than 5:5, but as more data accumulates toward the maxKey side it approaches 5:5.
A chunk is split based on the configured chunksize and the maximum number of documents (the RDBMS equivalent of records) a chunk of that size can hold.
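
Splits can also be forced by hand when needed; a small sketch with the shell helpers (t:50000 is just an arbitrary value inside the loaded range):

mongos> sh.splitAt("test.chunk_test", { t: 50000 })     // make t:50000 an explicit chunk boundary
mongos> sh.splitFind("test.chunk_test", { t: 50000 })   // split the chunk containing t:50000 at its median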
 
balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours:
            2 : Success
            1 : Failed with error 'aborted', from repl_shard2 to repl_shard1
chunks:
    repl_shard1 3
    repl_shard2 4
{ "t" : { "$minKey" : 1 } } -->> { "t" : 2 } on : repl_shard2 Timestamp(2, 0)
{ "t" : 2 } -->> { "t" : 31776 } on : repl_shard1 Timestamp(3, 1)
{ "t" : 31776 } -->> { "t" : 47663 } on : repl_shard1 Timestamp(2, 2)
{ "t" : 47663 } -->> { "t" : 63581 } on : repl_shard1 Timestamp(2, 3)
{ "t" : 63581 } -->> { "t" : 79468 } on : repl_shard2 Timestamp(3, 2)
{ "t" : 79468 } -->> { "t" : 95390 } on : repl_shard2 Timestamp(3, 3)
{ "t" : 95390 } -->> { "t" : { "$maxKey" : 1 } } on : repl_shard2 Timestamp(3, 4)
 
=> As chunk splits progressed, a chunk imbalance built up between the shards, and the balancer attempted chunk migration three times.
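
A chunk can also be moved manually with sh.moveChunk; for example, a sketch that would push the { t: MinKey } chunk shown above back to repl_shard1 (the balancer may simply move it again on its next round):

mongos> sh.moveChunk("test.chunk_test", { t: 1 }, "repl_shard1")
// second argument: a query matching a document inside the chunk to move
// third argument: the destination shard name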
 
mongos> db.changelog.find({ $and:[ {ns:"test.chunk_test"} , {what:/split/}]} ).pretty()

{
"_id" : "5608defd0b97:27019-2019-10-01T05:52:56.241+0000-5d92e9b83fa963a1872b1e05",
"what" : "multi-split",
"ns" : "test.chunk_test",
"details" : {
"before" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : { "$maxKey" : 1 }
},
"lastmod" : Timestamp(1, 0),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")
},
"number" : 1,
"of" : 3,
"chunk" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : 2
},
"lastmod" : Timestamp(1, 1),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")}}}

{
"_id" : "5608defd0b97:27019-2019-10-01T05:52:56.241+0000-5d92e9b83fa963a1872b1e07",
"what" : "multi-split",
"ns" : "test.chunk_test",
"details" : {
"before" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : { "$maxKey" : 1 }
},
"lastmod" : Timestamp(1, 0),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")
},
"number" : 2,
"of" : 3,
"chunk" : {
"min" : {
"t" : 2
},
"max" : {
"t" : 31776
},
"lastmod" : Timestamp(1, 2),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")}}}

{
"_id" : "5608defd0b97:27019-2019-10-01T05:52:56.242+0000-5d92e9b83fa963a1872b1e09",
"what" : "multi-split",
"ns" : "test.chunk_test",
"details" : {
"before" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : { "$maxKey" : 1 }
},
"lastmod" : Timestamp(1, 0),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")
},
"number" : 3,
"of" : 3,
"chunk" : {
"min" : {
"t" : 31776
},
"max" : {
"t" : { "$maxKey" : 1 }
},
"lastmod" : Timestamp(1, 3),
"lastmodEpoch" : ObjectId("5d92e79f2ce99004d7291d2c")}}}
=> chunk split log
A chunk split only changes metadata, so it is carried out on the config server.
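
The metadata itself lives in the config database and can be inspected directly; a sketch (on 4.4+ config.chunks is keyed by the collection uuid instead of ns, so this query assumes an older version like the one used here):

mongos> use config
mongos> db.chunks.find( { ns: "test.chunk_test" }, { min: 1, max: 1, shard: 1 } ).sort( { min: 1 } )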
 
 
mongos> db.changelog.find({ $and:[ {ns:"test.chunk_test"} , {what:/^moveChunk/}]} ).pretty()

{
"_id" : "f45339b29103:27018-2019-10-01T05:52:56.451+0000-5d92e9b82ce99004d72b7f07",
"shard" : "repl_shard1",
"what" : "moveChunk.start",
"ns" : "test.chunk_test",
"details" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : 2
},
"from" : "repl_shard1",
"to" : "repl_shard2"}}

{
"_id" : "f7541b718362:27018-2019-10-01T05:52:56.849+0000-5d92e9b801038c8eb11809f2",
"server" : "f7541b718362:27018",
"shard" : "repl_shard2",
"what" : "moveChunk.to",
"ns" : "test.chunk_test",
"details" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : 2
},
"step 1 of 6" : 139,
"step 2 of 6" : 97,
"step 3 of 6" : 33,
"step 4 of 6" : 2,
"step 5 of 6" : 35,
"step 6 of 6" : 1,
"note" : "success"}}

{
"_id" : "f45339b29103:27018-2019-10-01T05:52:56.882+0000-5d92e9b82ce99004d72b7fe8",
"server" : "f45339b29103:27018",
"shard" : "repl_shard1",
"what" : "moveChunk.commit",
"ns" : "test.chunk_test",
"details" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : 2
},
"from" : "repl_shard1",
"to" : "repl_shard2",
"counts" : {
"cloned" : NumberLong(1),
"clonedBytes" : NumberLong(33),
"catchup" : NumberLong(0),
"steady" : NumberLong(0)}}}

{
"_id" : "f45339b29103:27018-2019-10-01T05:52:56.897+0000-5d92e9b82ce99004d72b7fef",
"server" : "f45339b29103:27018",
"shard" : "repl_shard1",
"what" : "moveChunk.from",
"ns" : "test.chunk_test",
"details" : {
"min" : {
"t" : { "$minKey" : 1 }
},
"max" : {
"t" : 2
},
"step 1 of 6" : 0,   => From-shard 로 move chunk 명령 실행
"step 2 of 6" : 14,  => 이동 대상 chunk의 상태와 moveChunk 파라미터 오류 체크
"step 3 of 6" : 98,  => To-shard 에게 From-shard로부터 chunk 를 복사하도록 지시함
"step 4 of 6" : 263, => From-shard 엔 있고 To-shard 엔 없는 index 확인 후 index 생성 , From-shard로 부터 data 가져옴
"step 5 of 6" : 45,  => config-server 의 chunk meta data 변경
"step 6 of 6" : 37,  => To-shard로 이동된 chunk data 삭제
"to" : "repl_shard2",
"from" : "repl_shard1",
"note" : "success"}}
=> moveChunk log
=> As splits accumulate, chunks can pile up unevenly and a chunk-count imbalance can develop between shards.
Unlike a chunk split, which only changes metadata, moveChunk actually copies the chunk data between shard nodes.
The balancer starts a balancing round once the chunk-count difference between shards exceeds the thresholds below (see the sketch after the table for checking these counts yourself).
 
Number of Chunks | Migration Threshold
Fewer than 20    | 2
20-79            | 4
80 and greater   | 8
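
To see the per-shard chunk counts the balancer compares against this threshold, the config metadata can be aggregated directly; a sketch (same ns caveat as above):

mongos> use config
mongos> db.chunks.aggregate([ { $match: { ns: "test.chunk_test" } }, { $group: { _id: "$shard", nChunks: { $sum: 1 } } } ])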

-. Migration process

A chunk migration from the source (from) shard to the destination (to) shard proceeds as follows:
 
-1. The balancer process sends the moveChunk command to the source shard.

-2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.

-3. The destination shard builds any indexes required by the source that do not exist on the destination.

-4. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.

-5. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.

-6. When fully synchronized, the source shard connects to the config database and updates the cluster metadata with the new location for the chunk.

-7. After the source shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.
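
The delete in step 7 is asynchronous by default; when issuing a migration yourself you can make it block until the source copy is removed with the _waitForDelete option. A sketch using the underlying admin command (the sh.moveChunk helper does not expose this option):

mongos> db.adminCommand( { moveChunk: "test.chunk_test", find: { t: 1 }, to: "repl_shard2", _waitForDelete: true } )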
* Starting with version 3.4 the balancer runs only on the primary of the config server replica set, not on mongos.
When it ran on mongos there was heavy lock contention between mongos instances competing for the balancer lock; since it moved to the config server that lock issue has disappeared.
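
If migration traffic during peak hours is a concern, the balancer can also be limited to a time window through the same config.settings collection; a sketch straight against the config database (start/stop are HH:MM strings):

mongos> use config
mongos> db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow: { start: "23:00", stop: "06:00" } } },
   { upsert: true }
)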
 
 

 

2) When the shard key has low cardinality

 
for (var i = 1; i <= 100000; i++) {
   db.chunk_test2.insert( { name : "kimdubi" } )
}
=> For chunk_test2, 100,000 documents are inserted with only name:"kimdubi", i.e. a shard key with a cardinality of 1.
 
mongos> db.chunk_test2.getShardDistribution()

Shard repl_shard1 at repl_shard1/mongo_shard1:27018,mongo_shard2:27018,mongo_shard3:27018
 data : 3.81MiB docs : 100000 chunks : 1
 estimated data per chunk : 3.81MiB
 estimated docs per chunk : 100000

Totals
 data : 3.81MiB docs : 100000 chunks : 1
 Shard repl_shard1 contains 100% data, 100% docs in cluster, avg obj size on shard : 40B
 
=> The chunk has grown past 1MB, but since the shard key has only the single value "kimdubi" it cannot be split.
A chunk that has grown larger than the configured chunksize is called a jumbo chunk. The more jumbo chunks there are, the more load can concentrate on specific chunks, so the shard key must be designed to prevent them (see the sketch below).
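
Existing jumbo chunks can be spotted in the config metadata, and the usual prevention is a shard key with enough cardinality to keep split points available, for example a compound key; a sketch (chunk_test3 is a hypothetical collection name):

mongos> use config
mongos> db.chunks.find( { jumbo: true } )   // chunks the balancer has flagged after failing to split or move them

mongos> use test
mongos> sh.shardCollection("test.chunk_test3", { name: 1, _id: 1 })   // compound key: split points remain available even when name repeats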