Monday, June 25, 2012

Towards a Next Generation Data Center Architecture: Scalability and Commoditization

This paper is the precursor to the VL2 paper. Lots of interesting ideas. My random thoughts:

Preliminaries:
Redundancy models
N+1
N+1 means we have one backup device per group of N, so we can tolerate one failure. The backup can replace any of the N components, but only one at a time. The minimum number of components needed for the system to function is N.

1+1
1+1 means that every component has one dedicated backup, and each component can be replaced only by that one backup device. The minimum number of components needed for the system to function is one.
In a 1+1 system the standby can also maintain state: in a phone network, for example, the standby keeps ongoing calls alive because it constantly mirrors the activity of the active server. In an N+1 system there is only one standby for N active servers, so call state is lost on a switchover.
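
To make the contrast concrete, here is a minimal Python sketch of the two models (the class names and the call-state framing are mine, not the paper's); the point is that a dedicated 1+1 standby can mirror state continuously, while the shared N+1 spare cannot:

```python
# Illustrative sketch of 1+1 vs. N+1 redundancy; names are hypothetical.

class Server:
    def __init__(self, name):
        self.name = name
        self.state = {}  # e.g. ongoing calls

class OnePlusOne:
    """1+1: a dedicated standby continuously mirrors the active server."""
    def __init__(self):
        self.active, self.standby = Server("active"), Server("standby")

    def replicate(self):
        # Runs constantly, so the standby always has a current copy.
        self.standby.state = dict(self.active.state)

    def failover(self):
        self.active = self.standby  # state (ongoing calls) survives

class NPlusOne:
    """N+1: one shared spare for N active servers; it holds no state."""
    def __init__(self, n):
        self.active = [Server(f"active-{i}") for i in range(n)]
        self.spare = Server("spare")

    def failover(self, i):
        # The spare takes over, but the failed server's state is lost,
        # and the group can tolerate no further failures.
        self.active[i], self.spare = self.spare, None
```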

Destination NAT (Half-NAT), Direct Server Return (DSR), and Source NAT (Full-NAT)
Read http://lbwiki.com/index.php/DSR to understand DSR. DSR requires the load balancer and the server to be in the same Layer 2 domain. Source and Destination NAT require all reply packets to pass back through the load balancer, so the servers must use the load balancer as their default gateway. This severely limits the number of servers a single load balancer can support.
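
As a rough illustration (the field names, addresses, and helper functions below are made up for the sketch, not taken from the wiki or the book), here is what each scheme rewrites on a client-to-VIP packet, and why the return-path constraints differ:

```python
# Hypothetical sketch of Half-NAT vs. Full-NAT vs. DSR packet rewriting.
from dataclasses import dataclass

@dataclass
class Packet:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str

LB_IP, LB_MAC = "10.0.0.1", "lb:mac"        # load balancer (owns the VIP)
REAL_IP, REAL_MAC = "10.0.0.42", "srv:mac"  # chosen real server

def half_nat(p: Packet) -> Packet:
    # Destination NAT: rewrite only the destination IP. The reply must
    # pass back through the LB so it can undo the rewrite, which is why
    # the servers point their default gateway at the LB.
    return Packet(p.src_mac, REAL_MAC, p.src_ip, REAL_IP)

def full_nat(p: Packet) -> Packet:
    # Source NAT: rewrite both IPs. Replies naturally return to the LB,
    # but the server no longer sees the real client address.
    return Packet(p.src_mac, REAL_MAC, LB_IP, REAL_IP)

def dsr(p: Packet) -> Packet:
    # DSR: rewrite only the destination MAC; the VIP is configured on the
    # server's loopback, so it replies to the client directly. A MAC-only
    # rewrite is why the LB and server must share a Layer 2 domain.
    return Packet(LB_MAC, REAL_MAC, p.src_ip, p.dst_ip)
```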


[Figure omitted: it was taken from the excellent book Load Balancing Servers, Firewalls, and Caches]


- Workload inside a data center changes frequently, which leads to sporadic congestion. Valiant Load Balancing (VLB) helps accommodate arbitrary traffic patterns (see the first sketch after this list).
- No link oversubscription: any server can communicate with any other server at full NIC speed (the minimum of the sender's and receiver's NIC speeds, usually 1 Gbps).
- This is probably the earliest paper proposing software load balancers on commodity servers in data centers. Advantages: scale-out, flexible load-balancing algorithms, cheap. Disadvantage: can't forward at line rate.
- Packets coming from external sources are sent to an Ingress Server, which performs packet encapsulation after contacting the directory service. This is a hidden hardware cost we must incur. Interestingly, VL2 does not mention anything about load balancers or communication with external hosts.
- Monsoon uses MAC-in-MAC tunneling instead of the IP-in-IP tunneling used in VL2 (see the second sketch after this list). The switches know only about other switches and their directly connected servers, not about remote hosts.
- The OS is changed but the applications are not: the network stack performs encapsulation after examining each packet, which would be slower than normal forwarding.
- Hose traffic model: no server is asked to send or receive more than its network interface allows.
- Monsoon proposes running IS-IS with the controller so that the controller knows where every host is located and can run the directory service.
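
A minimal sketch of the VLB idea from the first bullet above, assuming a set of intermediate switches (the names and the hashing scheme are illustrative, not Monsoon's actual mechanism): each flow is bounced off a pseudo-randomly chosen intermediate and only then forwarded to its real destination.

```python
# Illustrative two-stage Valiant Load Balancing path selection.
import hashlib

INTERMEDIATE_SWITCHES = ["int-1", "int-2", "int-3", "int-4"]

def vlb_path(flow_id: str, dst_tor: str) -> list:
    # Stage 1: hash the flow to an intermediate switch. Hashing (rather
    # than per-packet randomness) keeps a flow on one path and avoids
    # packet reordering.
    digest = int(hashlib.md5(flow_id.encode()).hexdigest(), 16)
    intermediate = INTERMEDIATE_SWITCHES[digest % len(INTERMEDIATE_SWITCHES)]
    # Stage 2: forward from the intermediate to the destination ToR.
    return [intermediate, dst_tor]
```

Spreading flows uniformly over the intermediates is what lets VLB carry any traffic matrix that respects the hose constraints without creating hotspots.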
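
And a sketch of the ingress-side directory lookup plus MAC-in-MAC encapsulation from the bullets above (the directory schema and the string "header" are assumptions for illustration, not the paper's wire format):

```python
# Illustrative directory lookup + MAC-in-MAC encapsulation at an ingress
# server; the real scheme prepends an outer Ethernet header, not a string.

DIRECTORY = {  # hypothetical directory-service contents
    "10.1.2.3": {"tor_mac": "tor:07", "server_mac": "srv:99"},
}

def encapsulate(frame: bytes, dst_ip: str, intermediate_mac: str) -> bytes:
    entry = DIRECTORY[dst_ip]  # ask the directory where the host lives
    # The outer headers name the VLB intermediate and the destination ToR,
    # so switches only need routes to other switches, never to end hosts.
    outer = f"{intermediate_mac}|{entry['tor_mac']}|{entry['server_mac']}|"
    return outer.encode() + frame  # the original frame travels unchanged
```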

Random Facts
- Networking gear consumes 10-20% of the data center's equipment budget
- An Ethernet port is 10% to 50% the cost of an equivalent Layer 3 port
- Commodity switches can store 16K MAC entries.
