`
foreversunyao
  • 浏览: 204010 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

Paxos algorithm

阅读更多

Paxos is a family of protocols for solving consensus in a network of unreliable processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult when the participants or their communication medium may experience failures.[ 1]

Consensus protocols are the basis for the state machine approach to distributed computing, as suggested by Leslie Lamport [ 2] and surveyed by Fred Schneider.[ 1]

The state machine approach is a technique for converting an algorithm into a fault-tolerant, distributed implementation. Ad-hoc techniques may leave important cases of failures unresolved. The principled approach proposed by Lamport et al. ensures all cases are handled safely.

 

Roles

Paxos describes the actions of the processes by their roles in the protocol: client, acceptor, proposer, learner, and leader. In typical implementations, a single processor may play one or more roles at the same time. This does not affect the correctness of the protocol—it is usual to coalesce roles to improve the latency and/or number of messages in the protocol.

Client
The Client issues a request to the distributed system, and waits for a response . For instance, a write request on a file in a distributed file server.
Acceptor
The Acceptors act as the fault-tolerant "memory" of the protocol. Acceptors are collected into groups called Quorums. Any message sent to an Acceptor must be sent to a Quorum of Acceptors, any message received from an Acceptor is ignored unless a copy is received from each Acceptor in a Quorum.
Proposer
A Proposer advocates a client request, attempting to convince the Acceptors to agree on it, and acting as a coordinator to move the protocol forward when conflicts occur.
Learner
Learners act as the replication factor for the protocol. Once a Client request has been agreed on by the Acceptors, the Learner may take action (ie: execute the request and send a response to the client). To improve availability of processing, additional Learners can be added.
Leader
Paxos requires a distinguished Proposer (called the leader) to make progress. Many processes may believe they are leaders, but the protocol only guarantees progress if one of them is eventually chosen. If two processes believe they are leaders, it is possible to stall the protocol by continuously proposing conflicting updates. The safety properties are preserved regardless.



Message flow: Basic Paxos

(one instance, one successful round)

 Client   Proposer      Acceptor     Learner
   |         |          |  |  |       |  |
   X-------->|          |  |  |       |  |  Request
   |         X--------->|->|->|       |  |  Prepare(N)
   |         |<---------X--X--X       |  |  Promise(N,{Va,Vb,Vc})
   |         X--------->|->|->|       |  |  Accept!(N,Vn)
   |         |<---------X--X--X------>|->|  Accepted(N,Vn)
   |<---------------------------------X--X  Response
   |         |          |  |  |       |  | 


实际上对于一般的开发人员,我们并不需要了解Paxos所有细节及如何实现,只需要知道Paxos是一个分布式选举算法就够了。本文主要介绍一下 Paxos常用的应用场合,或许有一天当你的系统增大到一定规模,你知道有这样一个技术,可以帮助你正确及优雅的解决技术架构上一些难题。

1. database replication, log replication等, 如bdb的数据复制就是使用paxos兼容的算法。Paxos最大的用途就是保持多个节点数据的一致性。

2. naming service, 如大型系统内部通常存在多个接口服务相互调用。
1) 通常的实现是将服务的ip/hostname写死在配置中,当service发生故障时候,通过手工更改配置文件或者修改DNS指向的方法来解决。缺点是可维护性差,内部的单元越多,故障率越大。
2) LVS双机冗余的方式,缺点是所有单元需要双倍的资源投入。
通过Paxos算法来管理所有的naming服务,则可保证high available分配可用的service给client。象ZooKeeper还提供watch功能,即watch的对象发生了改变会自动发 notification, 这样所有的client就可以使用一致的,高可用的接口。

3.config配置管理
1) 通常手工修改配置文件的方法,这样容易出错,也需要人工干预才能生效,所以节点的状态无法同时达到一致。
2) 大规模的应用都会实现自己的配置服务,比如用http web服务来实现配置中心化。它的缺点是更新后所有client无法立即得知,各节点加载的顺序无法保证,造成系统中的配置不是同一状态。

4.membership用户角色/access control list, 比如在权限设置中,用户一旦设置某项权限比如由管理员变成普通身份,这时应在所有的服务器上所有远程CDN立即生效,否则就会导致不能接受的后果。

5. 号码分配。通常简单的解决方法是用数据库自增ID, 这导致数据库切分困难,或程序生成GUID, 这通常导致ID过长。更优雅的做法是利用paxos算法在多台replicas之间选择一个作为master, 通过master来分配号码。当master发生故障时,再用paxos选择另外一个master。

这里列举了一些常见的Paxos应用场合,对于类似上述的场合,如果用其它解决方案,一方面不能提供自动的高可用性方案,同时也远远没有Paxos实现简单及优雅。

Yahoo!开源的ZooKeeper [5]是一个开源的类Paxos实现。它的编程接口看起来很像一个可提供强一致性保证的分布式小文件系统。对上面所有的场合都可以适用。但可惜的 是,ZooKeeper并不是遵循Paxos协议,而是基于自身设计并优化的一个2 phase commit的协议,因此它的理论[6]并未经过完全证明。但由于ZooKeeper在Yahoo!内部已经成功应用在HBase, Yahoo! Message Broker, Fetch Service of Yahoo! crawler等系统上,因此完全可以放心采用。

分享到:
评论

相关推荐

    Revisiting the Paxos algorithm

    Revisiting the Paxos algorithm

    Paxos implementation

    raft simplify Paxos algorithm you deserved it since it so powful

    Fast Paxos(pdf)

    Fast Paxos is an extension of the classic Paxos algorithm that allows the value to be learned in two message delays. How and why the algo-rithm works are explained informally, and a TLA+ speci¯...

    分布式一致性一般解决方法.pdf

    The Paxos algorithm is synonymous with distributed consensus, yet it performs poorly in practice and is famously difficult to understand. In this paper, we re-examine the foundations of distributed ...

    开源项目-ailidani-paxi.zip

    开源项目-ailidani-paxi.zip,WPaxos: multiple leaders paxos algorithm with deal with WAN latency barriers written in Go

    Paxos-Algorithm-Game:基本Paxos算法的应用

    Paxos算法游戏 该项目的目的是使用Baisc Paxos Algoritm设计一个简单的分布式系统。 这里的项目是一个猜数字游戏,可以让三个用户一起玩。 文件夹Dueling_Paxos中的代码显示了基本paxos中的决斗问题。 先决条件 Java...

    Consensus on Transaction Commit.pdf

    The Paxos Commit algorithm runs a Paxos consensus algorithm on the commit/abort decision of each participant to obtain a transaction commit protocol that uses 2F + 1 coordinators and makes progress ...

    DB - Unbounded Pipelining in Dynamically Paxos Clusters.pdf

    When equipped with a consensus algorithm a distributed system can act as a replicated state machine (RSM), duplicating its state across a cluster of redundant components to avoid the failure of any ...

    awesome-consensus:Paxos和朋友的很棒列表

    awesome-consensus:Paxos和朋友的很棒列表

    TRON.NETWORK 白皮书

    (2) Consensus layer: a Fast Paxos-based PoS consensus algorithm. 3. P2P-based distributed storage system: a support located at the bottom: (1) Network layer: customized content-addressable P2P ...

    JGroups的Raft实现jgroups-raft.zip

    Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent...

    raft_raft_AllTheDifference_源码

    Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent...

    一文搞懂Raft算法

    raft是工程上使用较为广泛的强一致...raft是一个共识算法(consensusalgorithm),所谓共识,就是多个节点对某个事情达成一致的看法,即使是在部分节点故障、网络延时、网络分割的情况下。这些年最为火热的加密货币(比

    Weave:基于Java的可靠且容错的基于RAFT的分布式密钥库

    编织 ...Diego Ongaro和John Osterhout在论文“寻找可理解的共识算法”中引入了RAFT,它是1990年代Leslie Lamport提出的Paxos共识算法的替代方案。 众所周知,原始Paxos论文难以实施,导致各种行业

    braft:基于brpc的RAFT共识算法的工业级C ++实现,在百度内部广泛用于构建高可用性的分布式系统

    和基于的工业级C ++实现。 braft是为要求高工作量和低延迟开销的方案设计和实现的,并考虑了易于理解的概念,以便百度内部的工程师可以正确,正确地构建自己的分布式系统。 它在百度内部广泛用于...Paxos 扎比 QJM

    elle:基于Elle协程的异步C ++开发框架

    Elle,基于协程的异步C ++开发框架 Elle是用现代C ++(C ++ 14)编写的库的集合。 它包含一组高度可重用的概念,算法,API包装器,... Elle分为不同的专业子库,以提供优雅的方式来处理异步(使用),联网,格式化...

    distributed-consensus-reading-list:关于分布式共识的一长串学术论文

    分布式共识阅读列表自1980年代成立以来,分布式共识和原子广播,状态机复制和拜占庭容错的相关领域一直是广泛的学术研究主题。 该文件包含与分布式共识相关的论文(和其他著作)列表。 下面列出的许多论文都包含多个...

    龙舟:Go中功能齐全的高性能多组筏库

    龙舟-Go中的多组筏库/新闻2021-01-20 Dragonboat v3.3已发布,请检查进行所有更改。 2020-03-05 Dragonboat v3.2已发布,有关详细信息,请检查 。关于Dragonboat是纯高性能多组库。 只要大多数成员服务器可用,诸如...

Global site tag (gtag.js) - Google Analytics