[MongoDB] Views && Indexes

萌亖

2020-04-15

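Preparation

Prepare data for two collections; both the view and the index sections below use them
One is an order collection, the other a shipping information collection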

var orders = [];
var shipping = [];
var addresses = ["广西省玉林市", "湖南省岳阳市", "湖北省荆州市", "甘肃省兰州市", "吉林省松原市", "江西省景德镇", "辽宁省沈阳市", "福建省厦门市", "广东省广州市", "北京市朝阳区"];

// Generate 10,000 orders, each with one matching shipping record
for (var i = 10000; i < 20000; i++) {
    // Order number: the counter followed by up to 5 random digits
    var orderNo = i + Math.random().toString().substr(2, 5);
    // push() keeps the array dense; assigning to orders[i] would leave 10,000 empty leading slots
    orders.push({ orderNo: orderNo, userId: i, price: Math.round(Math.random() * 10000) / 100, qty: Math.floor(Math.random() * 10) + 1, orderTime: new Date(new Date().setSeconds(Math.floor(Math.random() * 10000))) });

    var address = addresses[Math.floor(Math.random() * addresses.length)];
    // The first 3 characters of the address are the province, the next 3 the city
    shipping.push({ orderNo: orderNo, address: address, recipienter: "Wilson", province: address.substr(0, 3), city: address.substr(3, 3) });
}
db.order.insert(orders);
db.shipping.insert(shipping);

Views

Overview

A MongoDB view is a queryable object whose contents are defined by an aggregation pipeline on other collections or views. MongoDB does not persist the view contents to disk. A view’s content is computed on-demand when a client queries the view. MongoDB can require clients to have permission to query the view. MongoDB does not support write operations against views.

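A MongoDB view is essentially the same as a SQL view

It has a data source (a collection or another view)
It can be queried
It is not actually stored on disk
Its contents are computed when a client queries it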

1. Creating a view

There are two ways to create a view

db.createCollection(
  "<viewName>",
  {
    "viewOn" : "<source>",
    "pipeline" : [<pipeline>],
    "collation" : { <collation> }
  }
)

db.createView(
  "<viewName>",
  "<source>",
  [<pipeline>],
  {
    "collation" : { <collation> }
  }
)

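db.createView is the one generally used

viewName : required, the name of the view

source : required, the data source, a collection or another view

[<pipeline>] : optional, an array of aggregation pipeline stages; this shows how central the pipeline is to MongoDB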

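1.1 Creating a view on a single collection

Suppose we want a view of today's 10 highest-value orders, for example because some page in the admin backend needs to show the highest-amount orders in real time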

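db.createView(
    "orderInfo",         //view name
    "order",             //data source
    [
        //keep only orders placed today or later; mind the time zone here
        { $match: { "orderTime": { $gte: ISODate("2020-04-13T16:00:00.000Z") } } },
        //sort by amount, descending
        { $sort: { "price": -1 } },
        //limit to 10 documents
        { $limit: 10 },
        //choose the fields to show
        //0: exclude the field; once a field other than _id is excluded, no fields may be included
        //1: include the field
        { $project: { _id: 0, orderNo: 1, price: 1, orderTime: 1 } }
    ]
)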

The orderInfo view can then be queried directly

db.orderInfo.find({})

Result

{ "orderNo" : "1755149436", "price" : 100, "orderTime" : ISODate("2020-04-14T13:49:42.220Z") }
{ "orderNo" : "1951423853", "price" : 99.99, "orderTime" : ISODate("2020-04-14T15:08:07.240Z") }
{ "orderNo" : "1196303215", "price" : 99.99, "orderTime" : ISODate("2020-04-14T15:15:41.158Z") }
{ "orderNo" : "1580069456", "price" : 99.98, "orderTime" : ISODate("2020-04-14T13:41:07.199Z") }
{ "orderNo" : "1114480559", "price" : 99.98, "orderTime" : ISODate("2020-04-14T13:31:58.150Z") }
{ "orderNo" : "1229542817", "price" : 99.98, "orderTime" : ISODate("2020-04-14T15:15:35.162Z") }
{ "orderNo" : "1208031402", "price" : 99.94, "orderTime" : ISODate("2020-04-14T14:13:02.160Z") }
{ "orderNo" : "1680622670", "price" : 99.93, "orderTime" : ISODate("2020-04-14T15:17:25.210Z") }
{ "orderNo" : "1549824953", "price" : 99.92, "orderTime" : ISODate("2020-04-14T13:09:41.196Z") }
{ "orderNo" : "1449930147", "price" : 99.92, "orderTime" : ISODate("2020-04-14T15:16:15.187Z") }
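
The lower bound used in the $match stage, ISODate("2020-04-13T16:00:00.000Z"), is midnight of 2020-04-14 in a UTC+8 time zone. A minimal sketch of how such a boundary could be computed in plain JavaScript follows. The helper name startOfLocalDay and the fixed offset of 8 hours are assumptions made for illustration; they are not part of the original setup. Note that a view stores its pipeline as written, so a date literal placed in the pipeline does not move forward by itself from one day to the next.

```javascript
// Hypothetical helper: returns the UTC instant of local midnight
// for a time zone with a fixed offset from UTC (no daylight saving handling).
function startOfLocalDay(now, offsetHours) {
    const offsetMs = offsetHours * 60 * 60 * 1000;
    // Shift the instant so that its UTC fields read as local wall-clock time.
    const shifted = new Date(now.getTime() + offsetMs);
    // Truncate to the start of that local day.
    const localMidnight = Date.UTC(
        shifted.getUTCFullYear(),
        shifted.getUTCMonth(),
        shifted.getUTCDate()
    );
    // Shift back to obtain the real UTC instant.
    return new Date(localMidnight - offsetMs);
}

// One of the order times from the result above, evaluated for UTC+8.
const boundary = startOfLocalDay(new Date("2020-04-14T13:49:42.220Z"), 8);
console.log(boundary.toISOString()); // 2020-04-13T16:00:00.000Z
```

The resulting Date could be passed as the $gte value when the view is created or modified.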

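1.2 Creating a view on multiple collections

This is essentially the same as with a single collection, just with the $lookup join operator added. A view shows whatever the pipeline finally produces, so it can join several collections (if you find yourself doing this, consider whether the collection design is sound; MongoDB is a document database after all)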

db.orderDetail.drop()
db.createView(
    "orderDetail",
    "order",
    [
        { $lookup: { from: "shipping", localField: "orderNo", foreignField: "orderNo", as: "shipping" } },
        { $project: { "orderNo": 1, "price": 1, "shipping.address": 1 } }
    ]
)

Querying the view gives the following result

{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c3"), "orderNo" : "1000039782", "price" : 85.94, "shipping" : [ { "address" : "北京市朝阳区" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c4"), "orderNo" : "1000102128", "price" : 29.04, "shipping" : [ { "address" : "吉林省松原市" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c5"), "orderNo" : "1000214514", "price" : 90.69, "shipping" : [ { "address" : "湖南省岳阳市" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c6"), "orderNo" : "1000337987", "price" : 75.05, "shipping" : [ { "address" : "辽宁省沈阳市" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c7"), "orderNo" : "1000468969", "price" : 76.84, "shipping" : [ { "address" : "江西省景德镇" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c8"), "orderNo" : "1000572219", "price" : 60.25, "shipping" : [ { "address" : "江西省景德镇" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6c9"), "orderNo" : "1000611743", "price" : 19.14, "shipping" : [ { "address" : "广东省广州市" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6ca"), "orderNo" : "1000773917", "price" : 31.5, "shipping" : [ { "address" : "北京市朝阳区" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6cb"), "orderNo" : "1000879146", "price" : 76.16, "shipping" : [ { "address" : "吉林省松原市" } ] }
{ "_id" : ObjectId("5e95af8c4ef6faf974b4a6cc"), "orderNo" : "1000945977", "price" : 93.98, "shipping" : [ { "address" : "辽宁省沈阳市" } ] }

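As you can see, MongoDB does not list the joined collection as extra columns the way SQL does; it puts the join results into an array, which fits MongoDB's document-oriented structure well.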
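
To make the shape of that result concrete, here is a small in-memory imitation of an equality join in plain JavaScript. It is only a sketch of the output shape, not how MongoDB executes $lookup; the function name lookup and the second order (which has no shipping record) are made up for illustration.

```javascript
// Illustrative imitation of a $lookup equality match on in-memory arrays.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
    return localDocs.map(function (doc) {
        // Collect every foreign document whose key equals this document's key.
        const matches = foreignDocs.filter(function (f) {
            return f[foreignField] === doc[localField];
        });
        // The matches are attached as an array, even when there are none.
        return Object.assign({}, doc, { [as]: matches });
    });
}

const orders = [
    { orderNo: "1000039782", price: 85.94 },
    { orderNo: "9999999999", price: 1.5 } // hypothetical order without a shipping record
];
const shipping = [
    { orderNo: "1000039782", address: "北京市朝阳区" }
];

const joined = lookup(orders, shipping, "orderNo", "orderNo", "shipping");
console.log(JSON.stringify(joined, null, 2));
```

An order with no match still appears in the output, with an empty shipping array, which mirrors the left outer join behaviour of $lookup.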

2. Modifying a view

Suppose we now need to add a quantity field

db.runCommand({
    collMod: "orderInfo",
    viewOn: "order",
    pipeline: [
        { $match: { "orderTime": { $gte: ISODate("2020-04-13T16:00:00.000Z") } } },
        { $sort: { "price": -1 } },
        { $limit: 10 },
        //add qty
        { $project: { _id: 0, orderNo: 1, price: 1, qty: 1, orderTime: 1 } }
    ]
})

Of course, you can also drop the view and create it again with db.createView()

3. Dropping a view

db.orderInfo.drop();

Indexes

Overview

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.

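Indexes make queries efficient. Without an index, MongoDB performs a collection scan, the equivalent of a full table scan in SQL Server: it examines every document.

Data is kept on storage mostly so that it can be queried, and query speed directly affects the user experience. MongoDB indexes trade space for time: once an index exists, every create, update, and delete operation has to maintain the index as well, which slows writes down.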
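
The difference between scanning everything and using an ordered structure can be illustrated with plain JavaScript. This is only an analogy: a sorted array with binary search stands in for the index, whereas MongoDB's real indexes are B-tree structures. The 2,000,000 sequential keys and both function names are assumptions chosen for the illustration.

```javascript
// Sorted keys stand in for an index (analogy only).
const keys = [];
for (let k = 1000000; k < 3000000; k++) {
    keys.push(k);
}

// Collection scan analogue: every element is examined.
function scanCount(values, target) {
    let examined = 0;
    let found = 0;
    for (const v of values) {
        examined++;
        if (v === target) found++;
    }
    return { examined: examined, found: found };
}

// Index analogue: binary search halves the candidate range on each step.
function binarySearchCount(sorted, target) {
    let lo = 0;
    let hi = sorted.length - 1;
    let examined = 0;
    while (lo <= hi) {
        const mid = Math.floor((lo + hi) / 2);
        examined++;
        if (sorted[mid] === target) return { examined: examined, found: 1 };
        if (sorted[mid] < target) lo = mid + 1;
        else hi = mid - 1;
    }
    return { examined: examined, found: 0 };
}

console.log(scanCount(keys, 1000000));         // examines all 2,000,000 elements
console.log(binarySearchCount(keys, 1000000)); // examines at most 21 elements
```

The price of the fast lookup is that the ordered structure must be kept up to date on every write, which is the space-for-time trade described above.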

1. Preparation

1.1 Prepare 2 million documents

var orderNo = 100 * 10000;
for (var i = 0; i < 100; i++) {
    // Insert in batches of 20,000 documents
    var orders = [];
    for (var j = 0; j < 20000; j++) {
        // Post-increment so that every document gets a unique, sequential orderNo starting at 1000000
        orders[j] = { orderNo: orderNo++, userId: i + j, price: Math.round(Math.random() * 10000) / 100, qty: Math.floor(Math.random() * 10) + 1, orderTime: new Date(new Date().setSeconds(Math.floor(Math.random() * 10000))) };
    }
    // No write acknowledgement required
    db.order.insert(orders, { writeConcern: { w: 0 } });
}

1.2 MongoDB query plans

db.collection.explain().<method(...)>

The execution statistics mode is the one generally used, for example

db.order.explain("executionStats").find({orderNo:1000000})

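Selected fields of the returned executionStats object

Field	Description
executionSuccess	Whether execution succeeded
nReturned	Number of matching documents returned
executionTimeMillis	Execution time, in milliseconds
totalKeysExamined	Number of index keys examined
totalDocsExamined	Number of documents examined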

Look at the query plan before any index is added

db.order.explain("executionStats").find({orderNo:1000000})

Part of the result is shown below. It shows that

executionTimeMillis : the query took 1437 ms
totalDocsExamined : 2 million documents were examined
executionStages.stage : collection scan

"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 1437,
    "totalKeysExamined" : 0,
    "totalDocsExamined" : 2000000,
    "executionStages" : {
            "stage" : "COLLSCAN",

1.3 View the current collection statistics
db.order.stats()

Part of the output is shown below; the storage size is currently about 71 MiB (74473472 bytes)
{
        "ns" : "mongo.order",
        "size" : 204000000,
        "count" : 2000000,
        "avgObjSize" : 102,
        "storageSize" : 74473472,
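
The figures in this output can be cross-checked with a little arithmetic. The sketch below copies the four values shown above into a plain object; the helper toMiB is a hypothetical name. That storageSize is smaller than size is presumably due to the storage engine compressing data on disk, which is an assumption about this particular setup.

```javascript
// Values copied from the db.order.stats() output above.
const stats = { size: 204000000, count: 2000000, avgObjSize: 102, storageSize: 74473472 };

// size should equal the document count times the average document size.
const expectedSize = stats.count * stats.avgObjSize;

// Convert bytes to MiB (1 MiB = 1024 * 1024 bytes).
function toMiB(bytes) {
    return bytes / (1024 * 1024);
}

// How many times larger the uncompressed data is than the space allocated on disk.
const ratio = stats.size / stats.storageSize;

console.log(expectedSize === stats.size);         // true
console.log(toMiB(stats.storageSize).toFixed(2)); // 71.02
console.log(ratio.toFixed(2));                    // 2.74
```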

2. Creating an index

db.order.createIndex({ orderNo: 1 }, { name: "ix_orderNo" })

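The index name is optional. If it is not specified, one is generated automatically by joining each field name and its sort direction with underscores (orderNo_1 in this case). An index name cannot be changed once the index exists; to rename it you have to drop the index and create it again, so it is best to choose the name when creating the index.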
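
The default naming rule can be imitated in a few lines of plain JavaScript. The function defaultIndexName is a hypothetical helper written for illustration, not a MongoDB API, and the second key specification is a made-up compound example.

```javascript
// Hypothetical helper imitating the default naming rule:
// each field name and its direction, joined by underscores.
function defaultIndexName(keys) {
    return Object.keys(keys)
        .map(function (field) {
            return field + "_" + keys[field];
        })
        .join("_");
}

console.log(defaultIndexName({ orderNo: 1 }));           // orderNo_1
console.log(defaultIndexName({ userId: 1, price: -1 })); // userId_1_price_-1
```

Generated names grow quickly for compound indexes, which is one more reason to pick an explicit name such as ix_orderNo.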

2.1 Run the query plan again

db.order.explain("executionStats").find({orderNo:1000000})

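Part of the result is shown below. The query already feels dramatically faster, and the query plan is even more striking

nReturned : 1 document matched
executionTimeMillis : 0 (!)
totalKeysExamined : 1 index key examined in total
totalDocsExamined : 1 document examined in total
executionStages.stage : FETCH, which retrieves the specific documents located through the index, similar to an Index Seek in SQL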

"executionStats" : {
                "executionSuccess" : true,
                "nReturned" : 1,
                "executionTimeMillis" : 0,
                "totalKeysExamined" : 1,
                "totalDocsExamined" : 1,
                "executionStages" : {
                        "stage" : "FETCH"
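
The two plans can be compared numerically. The sketch below copies the executionStats values from the two outputs in this section into plain objects and computes how many documents were examined per document returned; the function name docsExaminedPerReturned is hypothetical, and treating a ratio close to 1 as a sign of a well-matched index is a common rule of thumb rather than a guarantee.

```javascript
// Values copied from the two explain("executionStats") outputs in this section.
const collScan = { nReturned: 1, executionTimeMillis: 1437, totalKeysExamined: 0, totalDocsExamined: 2000000 };
const withIndex = { nReturned: 1, executionTimeMillis: 0, totalKeysExamined: 1, totalDocsExamined: 1 };

// Documents examined for each document returned; null when nothing was returned.
function docsExaminedPerReturned(stats) {
    if (stats.nReturned === 0) return null;
    return stats.totalDocsExamined / stats.nReturned;
}

console.log(docsExaminedPerReturned(collScan));  // 2000000
console.log(docsExaminedPerReturned(withIndex)); // 1
```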

Only the simplest kind, the single-field index, is covered here; MongoDB has many other index types