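Giraph Source Code Analysis (2): Starting the Master/Worker Services
Author | 白松
Note: This is an original article. Please contact 数澜 before quoting or reposting it.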
1. The org.apache.giraph.bsp.CentralizedService interface
Purpose: Basic service interface shared by both CentralizedServiceMaster and CentralizedServiceWorker.
2. The org.apache.giraph.bsp.CentralizedServiceMaster interface
Purpose: At most, there will be one active master at a time, but many threads can be trying to be the active master.
3. The org.apache.giraph.bsp.CentralizedServiceWorker interface
Purpose: All workers should have access to this centralized service to execute the following methods.
4. The org.apache.giraph.bsp.BspService abstract class
Purpose: ZooKeeper-based implementation of CentralizedService.
5. The org.apache.giraph.master.BspServiceMaster class
Purpose: ZooKeeper-based implementation of CentralizedServiceMaster.
6. The org.apache.giraph.worker.BspServiceWorker class
Purpose: ZooKeeper-based implementation of CentralizedServiceWorker.
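The relationships among the six types listed so far can be sketched roughly as follows. This is a minimal sketch with simplified stand-ins, not the Giraph source: generics and all real methods are omitted, and the `describeRole()` method, the `basePath` field, and the `HierarchySketch` class are hypothetical names added only so that the example compiles and runs.

```java
// Simplified stand-ins that mirror only the type relationships, not Giraph's real code.
interface CentralizedService {
    // Hypothetical method, added so the sketch has something to call.
    String describeRole();
}

interface CentralizedServiceMaster extends CentralizedService { }

interface CentralizedServiceWorker extends CentralizedService { }

// Shared base class; in this sketch it only remembers a base path.
abstract class BspService implements CentralizedService {
    protected final String basePath; // hypothetical field

    BspService(String basePath) {
        this.basePath = basePath;
    }
}

class BspServiceMaster extends BspService implements CentralizedServiceMaster {
    BspServiceMaster(String basePath) {
        super(basePath);
    }

    @Override
    public String describeRole() {
        return "master@" + basePath;
    }
}

class BspServiceWorker extends BspService implements CentralizedServiceWorker {
    BspServiceWorker(String basePath) {
        super(basePath);
    }

    @Override
    public String describeRole() {
        return "worker@" + basePath;
    }
}

class HierarchySketch {
    // Code written against the shared interface works for both roles.
    static String roleOf(CentralizedService service) {
        return service.describeRole();
    }

    public static void main(String[] args) {
        System.out.println(roleOf(new BspServiceMaster("/demo")));
        System.out.println(roleOf(new BspServiceWorker("/demo")));
    }
}
```

The point of the layering is that master and worker share one base class for the common coordination logic, while each role exposes its own interface to the rest of the framework.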
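BspServiceWorker holds a WorkerClient instance and a WorkerServer instance, which act as the client and server ends of IPC communication and send data over Netty. The WorkerClient instance is actually a NettyWorkerClient object, and the WorkerServer instance is actually a NettyWorkerServer object.
NettyWorkerClient implements the WorkerClient interface, and NettyWorkerServer implements the WorkerServer interface.
The NettyWorkerServer constructor creates a NettyServer object, which handles the low-level IPC communication, and a ServerData object, which holds the data. ServerData contains the worker's partitionStore, edgeStore, incomingMessageStore, currentMessageStore, aggregator values, and so on.
The NettyWorkerClient constructor creates a NettyClient object, which handles the low-level IPC communication on the client side.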
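The composition described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the classes below are simplified stand-ins that reuse the Giraph names for readability, the "transport" is a direct method call rather than a Netty channel, and `deliver()`, `send()`, `sendRequest()`, `receive()`, `connectTo()`, and `WorkerServiceSketch` are hypothetical names that do not come from the Giraph API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the data owned by a worker's server side (only two stores shown).
class ServerData {
    final Map<Integer, List<String>> partitionStore = new HashMap<>();
    final Map<Integer, List<String>> incomingMessageStore = new HashMap<>();
}

interface WorkerServer {
    ServerData getServerData();

    // Hypothetical entry point used by this sketch instead of a network socket.
    void deliver(int partitionId, String message);
}

interface WorkerClient {
    // Hypothetical send method for this sketch.
    void send(int partitionId, String message);
}

// Stand-in transport: stores what it receives in the worker's ServerData.
class NettyServer {
    private final ServerData data;

    NettyServer(ServerData data) {
        this.data = data;
    }

    void receive(int partitionId, String message) {
        data.incomingMessageStore
            .computeIfAbsent(partitionId, k -> new ArrayList<>())
            .add(message);
    }
}

// Stand-in transport: forwards requests straight to the remote server.
class NettyClient {
    private final WorkerServer remote;

    NettyClient(WorkerServer remote) {
        this.remote = remote;
    }

    void sendRequest(int partitionId, String message) {
        remote.deliver(partitionId, message);
    }
}

class NettyWorkerServer implements WorkerServer {
    private final ServerData serverData;
    private final NettyServer nettyServer;

    // As in the text: the constructor builds the data holder and the transport.
    NettyWorkerServer() {
        this.serverData = new ServerData();
        this.nettyServer = new NettyServer(serverData);
    }

    @Override
    public ServerData getServerData() {
        return serverData;
    }

    @Override
    public void deliver(int partitionId, String message) {
        nettyServer.receive(partitionId, message);
    }
}

class NettyWorkerClient implements WorkerClient {
    private final NettyClient nettyClient;

    // As in the text: the constructor builds the client-side transport.
    NettyWorkerClient(WorkerServer remote) {
        this.nettyClient = new NettyClient(remote);
    }

    @Override
    public void send(int partitionId, String message) {
        nettyClient.sendRequest(partitionId, message);
    }
}

// Plays the role of BspServiceWorker: one server side, one client side.
class WorkerServiceSketch {
    final WorkerServer workerServer = new NettyWorkerServer();
    WorkerClient workerClient; // created once a peer is known

    void connectTo(WorkerServiceSketch peer) {
        this.workerClient = new NettyWorkerClient(peer.workerServer);
    }

    public static void main(String[] args) {
        WorkerServiceSketch a = new WorkerServiceSketch();
        WorkerServiceSketch b = new WorkerServiceSketch();
        a.connectTo(b);
        a.workerClient.send(3, "hello");
        System.out.println(b.workerServer.getServerData().incomingMessageStore);
    }
}
```

Declaring the fields by their interface types (WorkerClient, WorkerServer) while instantiating the Netty-backed classes keeps the worker service independent of the transport that actually moves the bytes.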
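7. The org.apache.giraph.worker.InputSplitsCallable abstract class, which implements the Callable interface
Purpose: Loads vertex or edge input splits. Each thread has its own WorkerClientRequestProcessor instance (actually a NettyWorkerClientRequestProcessor object), which is responsible for sending data to remote workers.
The WorkerClient that a NettyWorkerClientRequestProcessor sends through is the same WorkerClient object held by BspServiceWorker.
The readInputSplit() method of VertexInputSplitsCallable reads vertex information from a split, then calls sendVertexRequest() on the NettyWorkerClientRequestProcessor object to send each vertex to the partition it belongs to.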
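The loading pattern described above (one Callable per split, each vertex routed to its owning partition) can be sketched with plain `java.util.concurrent` classes. This is an illustrative model, not Giraph code: vertices are reduced to `long` ids, the partitions live in one shared in-memory map instead of on remote workers, and the hash-modulo rule in `partitionOf()` is an assumption made for this example. `InputSplitSketch`, `SplitLoader`, and `loadAll()` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InputSplitSketch {
    // Assumed partitioning rule for this sketch: hash of the id modulo partition count.
    static int partitionOf(long vertexId, int numPartitions) {
        return Math.floorMod(Long.hashCode(vertexId), numPartitions);
    }

    // Plays the role of VertexInputSplitsCallable: reads one split, routes each vertex.
    static class SplitLoader implements Callable<Integer> {
        private final List<Long> split;
        private final int numPartitions;
        private final Map<Integer, List<Long>> partitions;

        SplitLoader(List<Long> split, int numPartitions, Map<Integer, List<Long>> partitions) {
            this.split = split;
            this.numPartitions = numPartitions;
            this.partitions = partitions;
        }

        @Override
        public Integer call() {
            int sent = 0;
            for (long vertexId : split) {
                int partition = partitionOf(vertexId, numPartitions);
                // Stand-in for sendVertexRequest(): hand the vertex to its owning partition.
                partitions.computeIfAbsent(partition, k -> new CopyOnWriteArrayList<>())
                          .add(vertexId);
                sent++;
            }
            return sent;
        }
    }

    // Runs one loader per split on a thread pool and waits for all of them to finish.
    static Map<Integer, List<Long>> loadAll(List<List<Long>> splits, int numPartitions, int threads) {
        Map<Integer, List<Long>> partitions = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<Long> split : splits) {
                results.add(pool.submit(new SplitLoader(split, numPartitions, partitions)));
            }
            for (Future<Integer> result : results) {
                result.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("split loading failed", e);
        } finally {
            pool.shutdown();
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<List<Long>> splits = Arrays.asList(
            Arrays.asList(1L, 2L, 3L),
            Arrays.asList(4L, 5L, 6L));
        System.out.println(loadAll(splits, 3, 2));
    }
}
```

Because every loader writes into the same shared structure, the sketch uses concurrent collections; in the text's design the analogous shared object is the single WorkerClient that all per-thread request processors send through.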
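8. The org.apache.giraph.graph.ComputeCallable class, which implements the Callable interface
This object carries out the "compute, communicate, synchronize" cycle. Each thread has its own WorkerClientRequestProcessor instance (actually a NettyWorkerClientRequestProcessor object), which is responsible for sending data to remote workers.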
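To make the "compute, communicate, synchronize" cycle concrete, here is a minimal sketch of a superstep loop built on a thread pool. It is not the ComputeCallable implementation: the graph is assumed to be a ring in which vertex i sends to vertex (i + 1) mod n, the computation is a simple maximum-value propagation, and `SuperstepSketch` and `propagateMax()` are hypothetical names. The `inbox` and `outbox` maps loosely play the roles of the current and incoming message stores mentioned earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SuperstepSketch {
    // Propagates the maximum value around a ring; vertex v belongs to partition v % numPartitions.
    static long[] propagateMax(long[] initial, int numPartitions) {
        final int n = initial.length;
        final long[] values = initial.clone();
        Map<Integer, Queue<Long>> current = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(numPartitions);
        try {
            for (int superstep = 0; ; superstep++) {
                final int step = superstep;
                final Map<Integer, Queue<Long>> inbox = current;
                final Map<Integer, Queue<Long>> outbox = new ConcurrentHashMap<>();
                List<Callable<Void>> tasks = new ArrayList<>();
                for (int p = 0; p < numPartitions; p++) {
                    final int partition = p;
                    tasks.add(() -> {
                        for (int v = partition; v < n; v += numPartitions) {
                            // Compute: fold this superstep's messages into the vertex value.
                            boolean changed = (step == 0);
                            Queue<Long> messages = inbox.get(v);
                            if (messages != null) {
                                for (long m : messages) {
                                    if (m > values[v]) {
                                        values[v] = m;
                                        changed = true;
                                    }
                                }
                            }
                            // Communicate: only vertices whose value changed send a message.
                            if (changed) {
                                outbox.computeIfAbsent((v + 1) % n, k -> new ConcurrentLinkedQueue<>())
                                      .add(values[v]);
                            }
                        }
                        return null;
                    });
                }
                // Synchronize: wait for every partition before starting the next superstep.
                for (Future<Void> done : pool.invokeAll(tasks)) {
                    done.get();
                }
                if (outbox.isEmpty()) {
                    return values; // no messages in flight, so the computation has converged
                }
                current = outbox;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("superstep failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] result = propagateMax(new long[] {3, 7, 1, 5}, 2);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

Messages written during one superstep become visible to vertices only in the next one, which is why the sketch swaps `outbox` into `current` after the barrier rather than letting tasks read what their peers are still writing.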