hbase中缓存的优先级

今天同事问到hbase中in-memory属性的作用,以前没有注意过,今天仔细看了下代码:

// Instantiate priority buckets
      BlockBucket bucketSingle = new BlockBucket(bytesToFree, blockSize,
          singleSize());
      BlockBucket bucketMulti = new BlockBucket(bytesToFree, blockSize,
          multiSize());
      BlockBucket bucketMemory = new BlockBucket(bytesToFree, blockSize,
          memorySize());

      // Scan entire map putting into appropriate buckets
      for(CachedBlock cachedBlock : map.values()) {
        switch(cachedBlock.getPriority()) {
          case SINGLE: {
            bucketSingle.add(cachedBlock);
            break;
          }
          case MULTI: {
            bucketMulti.add(cachedBlock);
            break;
          }
          case MEMORY: {
            bucketMemory.add(cachedBlock);
            break;
          }
        }
      }

      PriorityQueue<BlockBucket> bucketQueue =
        new PriorityQueue<BlockBucket>(3);

      bucketQueue.add(bucketSingle);
      bucketQueue.add(bucketMulti);
      bucketQueue.add(bucketMemory);

      int remainingBuckets = 3;
      long bytesFreed = 0;

      BlockBucket bucket;
      while((bucket = bucketQueue.poll()) != null) {
        long overflow = bucket.overflow();
        if(overflow > 0) {
          long bucketBytesToFree = Math.min(overflow,
            (bytesToFree - bytesFreed) / remainingBuckets);
          bytesFreed += bucket.free(bucketBytesToFree);
        }
        remainingBuckets--;
      }

hbase内部的blockcache分三个队列:single、multi以及memory,分别占用25%,50%,25%的大小。这涉及到family属性中的in-memory选项,默认是false。

设为false的话,第一次访问到该数据时,会将它写入single队列,否则写入memory队列。当再次访问该数据并且在single中读到了该数据时,single会升级为multi

这三个队列其实是在共用blockcache的资源,区别是在LRU淘汰数据时,single会优先淘汰,其次为multi,最后为memory。

所以结论有两点:

1同一个family不会占用全部的blockcache资源

2当某些family特别重要时,可以将它的in-memory设为true,单独使用一个缓存队列,保证cache的优先使用

相关推荐