`

mongodb官方文档摘要

阅读更多
By default, mongo looks for a database server listening on port 27017 on the localhost interface. To connect to a server on a different port or interface, use the --port and --host options.

After starting the mongo shell your session will use the test database by default. At any time, issue the following operation at the mongo to report the name of the current database: db

At this point, if you issue the show dbs operation again, it will not include the mydb database. MongoDB will not permanently create a database until you insert data into that database.

用mongo连接上mongod实例时,默认使用test db,需要用use [db_name]命令,切换到目标数据库。除非确实插入文档,否则不会真正创建数据库,用show dbs命令暂时不会看见

When you access documents in a cursor using the array index notation (like c[6]), mongo first calls the cursor.toArray() method and loads into RAM all documents returned by the cursor. The index is then applied to the resulting array. This operation iterates the cursor completely and exhausts the cursor. For very large result sets, mongo may run out of available memory.

find()方法返回一个cursor,如果调用cursor[1],则会隐式调用cursor.toArray()函数,如果结果集很大,可能会OOM

With the findOne() method you can return a single document from a MongoDB collection. The findOne() method takes the same parameters as find(), but returns a document rather than a cursor.

findOne()方法确实地返回一个文档,而不是一个cursor

Collections are analogous to a table in relational databases.

芒果的collection是松散组织(no schema),允许插入完全没有关系的多条数据,唯一的关联是共享_id索引。但是这是不好的做法,还是推荐将有组织的文档插入同一个集合

Read operations, or queries, retrieve data stored in the database. In MongoDB, queries select documents from a single collection.

Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from the matching documents to return. The projection limits the amount of data that MongoDB returns to the client over the network.

芒果的查询语句,包括criteria、projection、cursor modifier

MongoDB queries exhibit the following behavior:
•All queries in MongoDB address a single collection.
•You can modify the query to impose limits, skips, and sort orders.
•The order of documents returned by a query is not defined and is not necessarily consistent unless you specify a sort().
•Operations that modify existing documents (i.e. updates) use the same query syntax as queries to select documents to update.
•In aggregation pipeline, the $match pipeline stage provides access to MongoDB queries.

MongoDB projections have the following properties:
•In MongoDB, the _id field is always included in results unless explicitly excluded.
•For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, $.
•For related projection functionality in the aggregation framework pipeline, use the $project pipeline stage.

To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to the first 20 documents in the results.

在mongo shell中,如果find()方法没有赋值给一个var变量,则会默认取出前20条记录(调用20次cursor.next)

Because the cursor is not isolated during its lifetime, intervening write operations on a document may result in a cursor that returns a
document more than once if that document has changed.

cursor并不是isolated的,并不是DB某个时刻的快照

Some query operations are not selective. These operations cannot use indexes effectively or cannot use indexes at all.

The inequality operators $nin and $ne are not very selective, as they often match a large portion of the index. As a result, in most cases, a $nin or $ne query with an index may perform no better than a $nin or $ne query that must scan all documents in a collection.

Queries that specify regular expressions, with inline JavaScript regular expressions or $regex operator expressions, cannot use an index with one exception. Queries that specify regular expression with anchors at the beginning of a string can use an index.

在合适的情况下,建立索引可以有效地提升查询效率;但是索引对写效率有影响,所以:
1、找出最频繁的,成为瓶颈的查询语句,在列上建立索引
2、避免创建毫无意义的索引
3、在业务负载低时进行索引创建的操作

Example:

Given a collection inventory with the following index on the type and item fields:
{ type: 1, item: 1 }

This index will cover the following query on the type and item fields, which returns only the item field:
db.inventory.find( { type: "food", item:/^c/ },
                   { item: 1, _id: 0 } )

However, the index will not cover the following query, which returns the item field and the _id field:
db.inventory.find( { type: "food", item:/^c/ },
                   { item: 1 } )

The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the available indexes. The query system then uses this query plan each time the query runs. The query optimizer occasionally reevaluates query plans as the content of the collection changes to ensure optimal query plans.

每个mongod是一个实例,mongod只是启动的入口,可以只有一个。但是指定了dbpath之后,会生成mongod.lock文件,阻止其他实例访问。同一个目录下的mongod.exe(.sh)可以启动多个实例,只要指定了不同的dbpath

Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos can target the query to the appropriate shard or shards.

If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter gather queries can be (very) inefficient. On larger clusters, scatter gather queries are unfeasible for routine operations.

虽然mongodb做sharding很容易,但是分片仍然需要提前规划好。比如有3个sharding mongod,共享一个shard key。那么当查询criteria包含这个key时, mongos立刻可以知道应该到哪台mongod上查询文档,查询的效率很高;反之如果查询criteria不包含shard key,mongos就需要轮询所有的mongod,这称为scatter gather,查询效率非常低,应该尽量避免

Replica sets use read preferences to determine where and how to route read operations to members of the replica set. By default, MongoDB always reads data from a replica set’s primary. You can modify that behavior by changing the read preference mode.

A write operation is any operation that creates or modifies data in the MongoDB instance. In MongoDB, write operations target a single collection. All write operations in MongoDB are atomic on the level of a single document.

No insert, update, or remove can affect more than one document atomically.

有且只有针对单个文档的写操作是atom的

If you add a new document without the _id field, the client library or the mongod instance adds an _id field and populates the field with a unique ObjectId.

If you specify the _id field, the value must be unique within the collection. For operations with write concern, if you try to create a document with a duplicate _id value, mongod returns a duplicate key exception.

如果手工生成_id,必须保证是唯一的

MongoDB provides different levels of write concern to better address the specific needs of applications. Clients may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather than ensure persistence to the entire deployment.

Although the errors ignored write concern provides fast performance, this performance gain comes at the cost of significant risks for data persistence and durability.

In-place-updates are significantly more efficient than updates that cause document growth. When possible, use data models that minimize the need for document growth.

in-place更新效率比较高,如果改变了文档大小,mongo可能需要先分配磁盘空间,从而降低更新效率

Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads.

The insert() method, when passed an array of documents, will perform a bulk insert, and inserts each document atomically. Drivers provide their own interface for this kind of operation.

如果传递给insert()方法一个[]参数,则会执行batch insert,效率比较高

Update operations can increase the size of the document. If a document outgrows its current allocated record space, MongoDB must allocate a new space and move the document to this new location.

To reduce the number of moves, MongoDB includes a small amount of extra space, or padding, when allocating the record space. This padding reduces the likelihood that a slight increase in document size will cause the document to exceed its allocated record size.

db.testData.find({$or:[{name:"zhengshengdong",age:29},{age:22}]})
db.testData.find({age:29,$or:[{name:"zhengshengdong"},{name:"chenruizhong"}]})

If you do not specify a query, remove() removes all documents from a collection, but does not remove the indexes. To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire collection, including the indexes, and then recreate the collection and rebuild the indexes.

如果系统已经超过了其负载,那么移动数据会变得很难,不要在系统性能不行的时候才去加shard。为了分片,mongo需要将待移动数据放到RAM中,如果RAM不够,则业务查询所需的数据会被移出RAM,性能会越来越差
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics