
Sunday, June 17, 2012

Applying NOX to the Datacenter

Get the paper here

The paper demonstrates that NOX can be used to implement various data center architectures (PortLand and VL2) and addressing schemes, and that it scales well while providing fine-grained QoS, access control and middlebox traversal. The paper has many good insights.

1) Data center networks are a special-case networking paradigm that allows new architectures, specialized addressing schemes and easy modification of the hypervisor and network stack. This paves the way for easier innovation, as shown by recent research work, some of which I have described in this blog. Also, since the data center is a single administrative domain, using a centralized controller is better than using distributed dynamic protocols.

2) The paper describes three types of data centers: bare-metal (without any hypervisor), virtualized (~20 VMs per physical server) and virtualized multi-tenant (like Amazon EC2).

3) Networking requirements of a data center:
i) Scaling: allocate more servers/VMs effortlessly as demand increases. The data center's architecture and addressing schemes should be able to scale to at least a million VMs, or hundreds of thousands of servers. At larger scale the forwarding tables grow, switches/routers require more ports (and hence become costlier), and broadcasts become expensive; a rough comparison of flat vs. hierarchical forwarding state is sketched after this list.
ii) VMs should be able to migrate without any service interruption.
iii) Fault tolerance to avoid service interruption.
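To put some rough numbers on the forwarding-state point, here is a back-of-the-envelope sketch (my own illustrative assumptions, not numbers from the paper): with flat addressing a switch may end up with one entry per VM, while with a hierarchical, topology-encoded scheme like PortLand's PMACs remote traffic can be aggregated to roughly one prefix per pod.

    # Back-of-the-envelope forwarding-state comparison (illustrative assumptions,
    # not numbers from the paper): flat vs. hierarchical (PortLand-style) addressing.

    def flat_table_size(num_vms):
        # with flat L2 addresses a switch may end up with one entry per VM
        return num_vms

    def hierarchical_table_size(num_pods, local_hosts):
        # remote traffic aggregates to roughly one prefix per pod;
        # directly attached hosts still get individual entries
        return (num_pods - 1) + local_hosts

    print("flat, 1M VMs:", flat_table_size(1_000_000), "entries per switch")
    print("hierarchical, 64 pods, 48 local hosts:",
          hierarchical_table_size(64, 48), "entries per edge switch")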

4) Next the paper describes VL2 and PortLand. Here are the cons:
In PortLand every core switch is connected to every pod with a single link; this will not scale well, as it requires too many ports on the core switches (a quick calculation follows below). VL2 runs OSPF over all its switches to build forwarding tables; this does not scale well either, since the link-state packets (LSPs) that get flooded are expensive. VM migration in VL2 is not easy and requires the controller to push the new state to all the switches that are sending traffic to the VM. In both schemes, ARP requests and new flows prompt sending packets to the controller, which is not scalable in my opinion. The controller should be used only for bootstrapping and on link failures; the middleboxes will take care of QoS and load balancing. Doing fine-grained control is not scalable.
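As a quick sanity check on the core-port concern (my own arithmetic using the standard k-ary fat-tree relations, not numbers from the paper): a k-ary fat tree has k pods, (k/2)^2 core switches and k^3/4 hosts, and each core switch needs one port per pod, i.e. k ports.

    # k-ary fat-tree sizing (standard formulas; illustrative, not from the paper)
    def fat_tree_stats(k):
        pods = k
        core_switches = (k // 2) ** 2
        hosts = k ** 3 // 4
        ports_per_core_switch = k          # one link to each pod
        return pods, core_switches, hosts, ports_per_core_switch

    for k in (48, 96, 160):
        pods, core, hosts, ports = fat_tree_stats(k)
        print(f"k={k}: {pods} pods, {core} core switches, "
              f"{hosts} hosts, {ports} ports per core switch")

So to reach the million-VM scale from point 3, each core switch would need on the order of 160 ports.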

5) NOX can operate in both proactive and reactive mode. In proactive mode the controller computes all the forwarding state beforehand and pushes it onto the switches. This incurs no flow-setup latency, unlike the reactive approach, where the first packet of each new flow is sent to the controller for processing.
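Here is a minimal sketch of the difference (generic Python, not the actual NOX API): in reactive mode a flow-table miss triggers a round trip to the controller before the entry is installed, while in proactive mode the table is populated up front.

    # Illustrative sketch of proactive vs. reactive flow setup (not the NOX API).

    class Controller:
        def compute_route(self, src, dst):
            # placeholder routing decision; a real controller consults the topology
            return hash((src, dst)) % 4

        def push_all_routes(self, switch, hosts):
            # proactive mode: install every entry before any traffic arrives
            for s in hosts:
                for d in hosts:
                    if s != d:
                        switch.flow_table[(s, d)] = self.compute_route(s, d)

    class Switch:
        def __init__(self, controller):
            self.flow_table = {}          # (src, dst) -> output port
            self.controller = controller

        def forward(self, src, dst):
            match = (src, dst)
            if match in self.flow_table:  # hit: forward in the fast path
                return self.flow_table[match]
            # reactive mode: miss -> ask the controller, then cache the entry
            port = self.controller.compute_route(src, dst)
            self.flow_table[match] = port
            return port

    ctrl = Controller()
    sw = Switch(ctrl)
    print(sw.forward("vm-a", "vm-b"))                     # reactive: miss, then install
    ctrl.push_all_routes(sw, ["vm-a", "vm-b", "vm-c"])    # proactive: pre-install

Proactive mode trades larger flow tables for zero first-packet latency; reactive mode keeps the table small but pays a controller round trip per new flow.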

6) An interesting point in this section is that for larger networks multiple NOX controllers can act in parallel, with each new-flow packet sent to its designated controller in reactive mode (a simple partitioning sketch is shown below). The controllers work independently but must maintain a globally consistent view of the network topology and the host address mappings. Achieving consistency incurs extra latency and complexity, although how much depends on the controller application under analysis.
The paper says an individual controller can handle 30K new flows per second with ~10ms latency.
A low-cost 1Gbps switch can support 250K source/destination pair flow entries.
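One simple way to pick the designated controller for a flow (my own sketch; the paper does not prescribe a particular scheme, and the controller names are made up) is to hash the 5-tuple so that every packet of a flow reaches the same controller:

    # Hypothetical flow-to-controller partitioning (illustrative, not from the paper).
    import hashlib

    CONTROLLERS = ["nox-1", "nox-2", "nox-3", "nox-4"]   # hypothetical controller names

    def designated_controller(src_ip, dst_ip, src_port, dst_port, proto):
        # hash the 5-tuple so all packets of a flow map to the same controller
        key = f"{src_ip}:{dst_ip}:{src_port}:{dst_port}:{proto}".encode()
        idx = int(hashlib.md5(key).hexdigest(), 16) % len(CONTROLLERS)
        return CONTROLLERS[idx]

    print(designated_controller("10.0.0.1", "10.0.0.2", 5000, 80, "tcp"))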

How to calculate over-subscription ratio: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DC_Infra2_5/DCInfra_3.html#wp1088848
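As a quick worked example (my own numbers, following the general definition in that guide): if an access switch has 48 server-facing 1 Gbps ports and 4 x 10 Gbps uplinks, the over-subscription ratio is 48 Gbps / 40 Gbps = 1.2:1.

    # Over-subscription = total downlink (server-facing) bandwidth / total uplink bandwidth.
    def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
        return (down_ports * down_gbps) / (up_ports * up_gbps)

    print(oversubscription(48, 1, 4, 10))   # -> 1.2, i.e. 1.2:1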

7) In case of a link/switch failure the controller has to install new flow entries at all the affected switches. The complexity and response time increase with multiple failures. Also, pre-installing low-priority backup flow entries will not work in the case of multiple failures, since the backup paths may themselves traverse failed links.
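A minimal sketch of what the controller-side reaction might look like (my own illustration, not the paper's algorithm): remove the failed link from the controller's topology view, recompute paths for the affected flows, and push the updated entries to the switches on the new paths.

    # Illustrative controller-side reaction to a link failure (not the paper's algorithm).
    from collections import deque

    def shortest_path(adjacency, src, dst):
        """BFS over the controller's current view of the topology."""
        prev = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                path = []
                while node is not None:
                    path.append(node)
                    node = prev[node]
                return path[::-1]
            for nxt in adjacency.get(node, ()):
                if nxt not in prev:
                    prev[nxt] = node
                    queue.append(nxt)
        return None

    def handle_link_failure(adjacency, failed_link, active_flows):
        """Drop the failed link, then recompute a new path for each affected flow."""
        a, b = failed_link
        adjacency[a].discard(b)
        adjacency[b].discard(a)
        return {flow: shortest_path(adjacency, *flow) for flow in active_flows}

    # Tiny example topology: two edge switches, two aggregation, two core-like nodes.
    topo = {"e1": {"a1"}, "a1": {"e1", "c1", "c2"}, "c1": {"a1", "a2"},
            "c2": {"a1", "a2"}, "a2": {"c1", "c2", "e2"}, "e2": {"a2"}}
    print(handle_link_failure(topo, ("a1", "c1"), [("e1", "e2")]))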




