`
hideto
  • 浏览: 2651771 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

HPM Note2, MySQL Architecture

阅读更多
MySQL's Logical Architecture
Three layer:
1, Connection/thread handling
2, Query parsing, analysis, optimization, caching, build-in functions
3, Storage engine

MySQL server caches threads, so they don't need to be created and destroyed for each new conneciton
Client authentication is based on username, originating host, and password

MySQL parses queries to create an internal structure(parse tree), and then applies a variety of optimizations
These may include rewriting the query, determineing the order in which it will read tables, choosing which indexes to use, and so on
You can pass hints to optimizer through special keywords in the query, affecting its decision-making process

The optimizer does not really care what storage engine a particular table uses, but the storage engine does affect how the server optimizes query
Before even parsing the query, the sever consults the query cache, which can store only SELECT statements, along with their result sets

Concurrency Control
Systems that deal with concurrent read/write access typically implement a locking system that consists of two lock types
These locks are usually known as shared locks and exclusive locks, or read locks and write locks

Read locks on a resources are shared, or mutually nonblocking: many clients may read from a resource at the same time and not interfere with each other
Write locks, on the other hand, are exclusive--i.e., they block both read locks and other write locks--because the only safe policy is to have a single client writing to the resource at given time and to prevent all reads when a client is writing

Minimizing the amount of data that you lock at any one time lets changes to a given resource occur simultaneously, as long as they don't conflict with each other
A locking strategy is a compromise between lock overhead and data safety, and that compromise affects performance

MySQL's storage engines can implement their own locking policies and lock granularities

Table locks is the most basic locking strategy available in MySQL with the lowest overhead
Write locks have a higher priority than read locks, so a request for a write lock will advance to the front of the lock queue even if readers are already in the queue
Write locks can advance past read locks in the queue, but read locks cannot advance past write locks

Although storage engines can manage their own locks, MySQL itself also uses a variety of locks that are effectively table-level for various purposes
For instance, the server uses a table-level lock for statements such as ALTER TABLE, regardless of the storage engine

The locking style that offers the greatest concurrency and carries the greatest overhead is the use of row locks
Row-level locking is available in the InnoDB and Falcon storage engines
Row locks are implemented in the storage engine, not the server
The server is completely unaware of locks implemented in the storage engines, and the storage engines all implement locking in their own ways

Transactions
A transaction is a group of SQL queries that are treated atomically, as a single unit of work

Atomicity: A transaction must function as a single indivisible unit of work so that the entire transaction is either applied or rolled back
Consistency: The database should always move from one consistent state to the next
Isolation: The results of a transaction are usually invisible to other transactions until the transaction is complete
Durability: Once committed, a transaction's change are permanent

The SQL standard defines four isolation levels:
READ UNCOMMITTED: transactions can view the results of uncommitted transactions
READ COMMITTED: The default isolation level for most database system(but not MySQL), transactiosn will see only those changes made by transactions that were already committed, and its changes won't be visible to others until it has committed
REPEATABLE READ: MySQL's default transaction isolation level, it guarantees that any rows a transaction reads will "look the same" in subsequent reads within the same transaction
SERIALIZABLE: The highest level of isolation, solves the phantom read problem by forcing transactions to be ordered so that they can't possibly conflict

ANSI SQL isolation levels:
Isolation level    Dirty reads possible    Nonrepeatable reads possible    Phantom reads possible    Locking reads
READ UNCOMMITTED           Y                         Y                             Y                     N
READ COMMITTED             N                         Y                             Y                     N
REPEATABLE READ            N                         N                             Y                     N
SERIALIZABLE               N                         N                             N                     Y

A deaklock is when two or more transactions are mutually holding and requesting locks on the same resources, creating a cycle of dependencies
The more sophisticated systems, such as the InnoDB storage engine, will notice circular dependencies and return an error instantly
Deadlocks cannot be broken without rolling back one of the transactions, either partially or wholly
They are a fact of life in transactional systems, and your applications should be designed to handle them
Many applications can simply retry their transactions from the beginning

Transaction logging helps make transactions more efficient
Instead of updating the tables on disk each time a change occurs, the storage engine can change its in-memory copy of the data
This is very fast. The storage engine can then write a record of the change to the transaction log, which is on disk and therefore durable
This is also a relatively fast operation, because appending log events involves sequential I/O in one small area of the disk instead of random I/O in many places
Then, at some later time, a process can update the table on disk
Thus, most storage engines that use this technique(known as write-ahead logging) end up writing the changes to disk twice
(The PBXT storage engine cleverly avoids some write-ahead logging)
PBXT white paper

MySQL AB provides three transactional storage engines: InnoDB, NDB Cluster, and Falcon
Several thrid-party engines are also available; the best-know engines right now are solidDB and PBXT

MySQL operates in AUTOCOMMIT mode by default
This means that unless you've explicitly begun a transaction, it automatically executes each query in a separate transaction
Changing the value of AUTOCOMMIT has no effect on nontransactional tables, such as MyISAM or Memory tables, which essentially always operate in AUTOCOMMIT mode
Certain commands cause MySQL to commit the transaction before they execute
These are typically Data Definition Language commands that make significant changes, such as ALTER TABLE, LOCK TABLES
MySQL lets you set the isolation level using the SET TRANSACTION ISOLAITON LEVEL command, which takes effect when the next transaction starts

MySQL doesn't manage transactions at the server level, instead the underlying storage engines implement transactions themselves
This means you can't reliably mix different engines in a single transaction
MySQL also supports the LOCK TABLES and UNLOCK TABLES commands, which are implemented in the server, not in the storage engines
The interaction between LOCK TABLES and transactions is complex, and there are unexpected behaviors in some server versions
Therefore, we recommend that you never use LOCK TABLES unless you are in a transaction and AUTOCOMMIT is disables, no matter what storage engine you are using

Multiversion Concurrency Control
Most of MySQL's transactional storage engines, such as InnoDB, Falcon, and PBXT, don't use a simple row-locking mechanism
Instead, they use row-level locking in conjunction with a technique for increasing concurrency known as multiversion concurrency control(MVCC)
MVCC is not unique to MySQL: Oracle, PostgreSQL, and some other database systems use it too

InnoDB implements MVCC by storing with each row two additional, hidden values that record when the row was created and when it was expired(or deleted)
Rather than storing the actual times at which these events occurred, the row stores the system version number at the time each event occurred
This is a number that increments each time a transaction begins
Each transaction keeps its own record of the current system version, as of the time it began

MVCC works only with the READ COMMITTED and REPEATABLE READ isolation levels
READ UNCOMMITTED isn't MVCC-compatible because queries don't read the row version that's appropriate for their transaction version, they read the newest version, no matter what
SERIALIZABLE isn't MVCC-compatible because reads lock every row they return

Locking models and concurrency in MySQL using the defualt isolation level
Locking strategy        Concurrency        Overhead        Engines
Table level                Lowest           Lowest        MyISAM, Merge, Memory
Row level                  High             High          NDB Cluster
Row level with MVCC        Highest          Highest       InnoDB, Falcon, PBXT, solidDB

MySQL's Storage Engines
MySQL stores each database(also called a schema) as a subdirectory of its data directory in the underlying filesystem
When you create a table, MySQL stores the table definition in a .frm file with the same name as the table
Each storage engine stores the table's data and indexes differently, but the server itself handles the table definition

MyISAM
MySQL's default storage engine, provides good compromise between performance and useful features, such as full-text indexing, compression, and spatial(GIS) functions
MyISAM doesn't support transactions or row-level locks
MyISAM typically stores each table in two files: a data file and an index file
The two files bear .MYD and .MYI extensions respectively

MyISAM features
MyISAM locks entire tables, not rows
MySQL supports automatic checking and repairing of MyISAM tables
You can use the CHECK TABLE mytable and REPAIR TABLE mytable commands to check a table for errors and repair them
You can also use the myisamchk command-line tool to check and repair tables when the server is offline
You can create indexes on the first 500 character of BLOB and TEXT columns in MyISAM tables
MyISAM supports full-text indexes, which index individual words for complex search operations
MyISAM tables marked with the DELAY_KEY_WRITE create option don't write changed index data to disk at the end of a query
Instead MyISAM buffers the changes in the in-memory key buffer, it flushes index blocks to disk when it prunes the buffer or closes the table

InnoDB
InnoDB was designed for transaction processing-specifically, processing of many short-lived transactions that usually complete rather than being rolled back
InnoDB stores its data in a series of one or more data files that are collectively known as a tablespace
A tablespace is essentially a black box that InnoDB manages all by itself
In MySQL 4.1 and newer versions InnoDB can store each table's data and indexes in separate files
InnoDB uses MVCC to achieve high concurrency, and it implements all four SQL standard isolation levels
It defaults to the REPEATABLE READ isolation level, and it has a next-key locking strategy that prevents phantom reads in this isolation level
InnoDB tables are built on a clustered index, it provides very fast primary key lookups
Sencondary indexes  contain the primary key columns, you should strive for a small primary key if you have many indexes on a table
InnoDB doesn't compress its indexes
InnoDB loads data and creates indexes more slowly than MyISAM, any operation that changes an InnoDB table's structure will rebuild the entire table, including all the indexes
Besides its high-concurrency capabilities, InnoDB's next most popular feature is foreign key constraints
InnoDB also provides extremely fast lookups for queries that use a primary key
InnoDB has a variety of internal optimizations, include predictive read-ahead for prefetching data from disk, an adaptive hash index that automatically builds hash indexes in memory for very fast lookups, and an insert buffer to speed inserts

Main elements you should take into account to choose storage engines:
Transactions, Concurrency, Backups, Crash recovery, Special features
分享到:
评论
1 楼 hideto 2009-06-30  
看来对实时性要求比较高的应用都是采用logging这种方案啊
FW的AdServer是logging,之前师兄设计的一个网页游戏存储方案也是logging
因为logging是sequential的,没有random I/O的overhead,所以很快
结合memory cache就可以达到很好的实时性效果

可以参考下 protocolbuffers

相关推荐

Global site tag (gtag.js) - Google Analytics