Glines.org
Net::Cluster design

Net::Cluster is very much a work in progress. Here are the current design notes (more or less a text version of what's on my whiteboard).

Design goals

  • Application layer mesh networking library.
  • Support unicast, multicast, broadcast targets.
  • Support SOCK_STREAM, SOCK_DGRAM, SOCK_RDM and SOCK_SEQPACKET socket semantics.
  • Reliable delivery of all frames, short of total network failure, retrying as necessary. All messages use source routing, sequence numbers and acknowledgements.
  • Opaque to underlying network topology
  • Easy to use
  • Easy to query attributes of the network, other peers in existence, multicast groups, etc.
  • Each application has its own network mesh. Nodes may belong to more than one application network. Applications are referred to by name (ASCII string).
  • Multicast group subscription. Multicast groups are referred to by name (ASCII string).
  • Broadcasts are just multicasts with an empty multicast-group string. Everyone is a member of the “” multicast group.

Overview

The Net::Cluster implementation will consist of three top-level components. First, an application library. Second, a daemon process. Third, a discovery server (think “BitTorrent tracker”), to allow nodes to find each other.

The library will communicate with the daemon via UNIX domain sockets, spawning the daemon if necessary. It will allow the application to create sockets and read/write/select on them normally. Each of these is really a UNIX domain socket connected to the daemon; the library sends some extra header information to the daemon before returning the socket to the application.
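
To make the handshake concrete, here is a minimal sketch in Python (used only for illustration; it is not the planned library API). The header layout (one byte for the socket semantics, a two-byte name length, then the application name) and the names send_header and read_header are assumptions invented for this example. A socketpair stands in for the connection between the library and the daemon.

```python
import socket
import struct

# Assumed numeric codes for the four supported socket semantics.
SEMANTICS = {"SOCK_STREAM": 1, "SOCK_DGRAM": 2, "SOCK_RDM": 3, "SOCK_SEQPACKET": 4}


def send_header(sock, app_name, semantics):
    """Library side: announce the application name and semantics to the daemon."""
    name = app_name.encode("ascii")
    sock.sendall(struct.pack("!BH", SEMANTICS[semantics], len(name)) + name)


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the header was complete")
        buf += chunk
    return buf


def read_header(sock):
    """Daemon side: parse the header written by send_header."""
    code, length = struct.unpack("!BH", recv_exact(sock, 3))
    name = recv_exact(sock, length).decode("ascii")
    by_code = {v: k for k, v in SEMANTICS.items()}
    return name, by_code[code]


def demo():
    # socketpair() stands in for the library connecting to the daemon's socket.
    app_side, daemon_side = socket.socketpair()
    try:
        send_header(app_side, "test-cluster", "SOCK_RDM")
        header = read_header(daemon_side)
        # After the header, the application uses the socket as usual.
        app_side.sendall(b"hello")
        payload = recv_exact(daemon_side, 5)
        return header, payload
    finally:
        app_side.close()
        daemon_side.close()
```

Once the header has been consumed, the application side is an ordinary socket, which is what lets the application read, write and select on it as usual.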

The daemon will connect to the discovery server to find other peers, and will also broadcast on the local Ethernet (if possible). It will maintain a list of TCP links to other peers, and calculate multicast distribution trees for locally originating broadcast/multicast traffic.

User library class hierarchy

Net::Cluster

->new returns Net::Cluster object

  • cluster name (required)
  • broadcast flag (boolean, defaults to 1)
  • central server (string, defaults to squawk.glines.org)

->list_peers returns array of Net::Cluster::Node objects

  • multicast group (string, optional)

->socket returns Net::Cluster::Socket object

  • delivery options (who to connect to or what to listen for)
  • semantics (SOCK_STREAM, SOCK_RDM, etc)
  • override default timeout/retry options if needed

Net::Cluster::Node returned by list_peers

->get_uuid returns Data::UUID

->get_groups returns a list of multicast group names (strings)

->get_attributes (proposed) returns some extra info about the node

->get_bandwidth returns a pair of integers, indicating the rx and tx speeds of this peer.

Net::Cluster::Socket isa IO::Socket

->new

->send

->recv

etc, etc.

Daemon class hierarchy

Net::Cluster::Daemon

Select tosser: the main select() loop, which hands ready sockets off to the modules below.

Net::Cluster::UNIX

Adapter to communicate with applications.

Net::Cluster::TargetResolution

For new requests, figures out the target(s).
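
As an illustration of target resolution, the sketch below (Python, for illustration only) maps a multicast group name to its members and treats the empty group name as "everyone", following the broadcast rule in the design goals. The function name resolve_targets, the shape of the membership table, and the choice to exclude the sender are all assumptions made for this example.

```python
def resolve_targets(memberships, group, sender=None):
    """Return the sorted node IDs a message to `group` should reach.

    memberships: dict mapping node ID -> set of multicast group names
                 (hypothetical shape of the discovery data).
    group:       multicast group name; "" means broadcast.
    sender:      node ID to leave out of the result (assumed behaviour).
    """
    if group == "":
        # Everyone is a member of the "" group, so broadcast hits all nodes.
        targets = set(memberships)
    else:
        targets = {node for node, groups in memberships.items() if group in groups}
    targets.discard(sender)
    return sorted(targets)
```

For example, with nodes a, b and c where only a and b subscribe to "test", resolving "test" from a yields just b, while resolving "" from a yields b and c.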

Net::Cluster::RouteStack

Implements source-routing, to get a packet to the specified target.

Net::Cluster::Messaging

Core message-passing module.

Net::Cluster::Messaging::Broadcast

Communicates via Ethernet packets with any other local hosts. Subclass of IO::Socket.

Net::Cluster::Messaging::Packet

Handle the packet format. Prints and parses packet structures, allows getting and setting of fields, etc.
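
The packet format is not specified in these notes, so the following is only a sketch of what "prints and parses packet structures" could look like, written in Python for illustration. Every field here (a kind byte, a sequence number, the remaining source-route chain, and a payload) and the binary layout are assumptions chosen to match the design goals, not the actual wire format.

```python
import struct
from dataclasses import dataclass, field

HEADER = "!BIB"  # assumed layout: kind (1 byte), seq (4 bytes), hop count (1 byte)


@dataclass
class Packet:
    kind: int                                   # assumed: 0 = data, 1 = ACK
    seq: int                                    # sequence number
    route: list = field(default_factory=list)   # remaining peer addresses
    payload: bytes = b""

    def pack(self):
        """Serialise the packet to bytes."""
        out = struct.pack(HEADER, self.kind, self.seq, len(self.route))
        for hop in self.route:
            raw = hop.encode("ascii")
            out += struct.pack("!B", len(raw)) + raw
        return out + struct.pack("!H", len(self.payload)) + self.payload

    @classmethod
    def unpack(cls, data):
        """Parse bytes produced by pack() back into a Packet."""
        kind, seq, hops = struct.unpack_from(HEADER, data, 0)
        offset = struct.calcsize(HEADER)
        route = []
        for _ in range(hops):
            (n,) = struct.unpack_from("!B", data, offset)
            offset += 1
            route.append(data[offset:offset + n].decode("ascii"))
            offset += n
        (plen,) = struct.unpack_from("!H", data, offset)
        offset += 2
        return cls(kind, seq, route, data[offset:offset + plen])
```

Field access then falls out of the dataclass: a forwarding node could parse a packet, drop the first entry of route, and pack it again.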

Net::Cluster::Messaging::Peer

Communicates via TCP with another host. Subclass of IO::Socket::INET.

Net::Cluster::Discovery

Maintains a list of online nodes. Finds new nodes, expires dead ones. Sends and receives link-state updates to peers.
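
A minimal sketch of the "finds new nodes, expires dead ones" bookkeeping, in Python for illustration. The class name PeerTable, the 30-second time-to-live, and the idea of passing the clock in explicitly are assumptions for this example; the notes do not say how liveness is decided.

```python
class PeerTable:
    """Tracks when each node was last heard from (hypothetical helper)."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl          # assumed: seconds of silence before expiry
        self.last_seen = {}     # node ID -> timestamp of last contact

    def heard_from(self, node_id, now):
        """Record a discovery reply or link-state update; True if the node is new."""
        is_new = node_id not in self.last_seen
        self.last_seen[node_id] = now
        return is_new

    def expire(self, now):
        """Drop nodes that have been silent longer than ttl; return their IDs."""
        dead = sorted(n for n, t in self.last_seen.items() if now - t > self.ttl)
        for node_id in dead:
            del self.last_seen[node_id]
        return dead

    def online(self):
        return sorted(self.last_seen)
```

Passing now as an argument keeps the table easy to test; a daemon would call expire from its select loop with the current time.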

Net::Cluster::SPF

Populates the route-stack table, based on the discovery database. This process runs asynchronously, a couple of seconds after a link-state update is received.
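
As a sketch of how the route table could be derived, the Python below runs a shortest-path-first (Dijkstra) pass over a link-state map and records, for each destination, the chain of peer addresses to traverse. The input shape (node -> neighbour -> cost), the use of numeric link costs, and the function name build_route_table are assumptions for this example.

```python
import heapq


def build_route_table(links, source):
    """Return {destination: [hop, ..., destination]} for every reachable node.

    links:  dict node -> dict neighbour -> link cost (assumed link-state form).
    source: the local node; it is not included in the chains.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nbr, cost in links.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))

    table = {}
    for dest in dist:
        if dest == source:
            continue
        chain = []
        node = dest
        while node != source:   # walk predecessors back to the source
            chain.append(node)
            node = prev[node]
        table[dest] = chain[::-1]
    return table
```

Each chain is already in the form the messaging layer needs: the first entry is the peer to send to, and the last entry is the final destination.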

Example packet flow

  • Multicast packet received from application, destined for multicast group “test”.
  • Target resolution returns 3 destinations.
  • Look up source-routing chains for each in route table.
  • Pass each message down to messaging layer.
  • Message layer takes first peer-address from each packet's source-routing chain, and sends the message to that peer (minus the first peer-address).
  • The message is received by the next peer, which again takes the first peer-address from the list, and sends the message on to that peer (minus the address of that peer).
  • The message keeps traversing the network in this fashion, until finally the peer-address list is empty. This indicates the message has reached its destination. The node which receives this packet sends an ACK back to the originating node, and passes the message up to the UNIX adapter. The UNIX adapter sends the received payload data to the application socket.
  • The original node receives the ACK packets and removes the messages from its queue.
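
The flow above can be sketched as a small in-memory simulation (Python, for illustration only). It models only the forwarding rule and the origin's pending queue: the ACK's own return path, retries and real sockets are left out, and the names simulate, wire and pending are assumptions for this example.

```python
from collections import deque


def simulate(routes, origin="origin"):
    """Walk each message along its source-routing chain.

    routes: dict destination -> non-empty list of peer addresses ending at
            the destination (as produced by a route-table lookup).
    Returns (trace, pending): trace maps sequence number -> nodes visited,
    pending maps sequence number -> destination still awaiting an ACK.
    """
    pending = {}
    trace = {}
    wire = deque()  # entries: (receiving node, remaining chain, seq)

    for seq, dest in enumerate(sorted(routes)):
        chain = list(routes[dest])
        pending[seq] = dest              # queued until the ACK comes back
        trace[seq] = [origin]
        # Send to the first peer-address, minus that address.
        wire.append((chain[0], chain[1:], seq))

    while wire:
        node, chain, seq = wire.popleft()
        trace[seq].append(node)
        if chain:
            # Not there yet: forward to the next peer, minus its address.
            wire.append((chain[0], chain[1:], seq))
        else:
            # Empty list: this node is the destination. It ACKs the origin,
            # which removes the message from its queue.
            del pending[seq]

    return trace, pending
```

With three destinations B, C and D reached via B, then B and C, then B, C and D, every trace ends at its destination and the pending queue ends up empty.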

Open design issues

  • Resource limits (bandwidth, peers, buffer memory, threads)
  • config file(s), formats and locations
  • Congestion notification, QoS