Wednesday 31 July 2013

The Open vSwitch plugin with VLANs

This post looks at the Grizzly Open vSwitch plugin when it is configured to use VLANs to isolate the VM traffic of different Quantum networks on the physical network.

For this example the OVS section of /etc/quantum/plugins/openvswitch/ovs_quantum_plugin.ini on the controller, network and compute nodes is:


[OVS]
integration_bridge = br-int
tenant_network_type = vlan
enable_tunneling = False
network_vlan_ranges = physnet1:1000:1999
bridge_mappings = physnet1:br-eth2


These settings are described here. Each data-forwarding node (the compute and network nodes) needs the bridge br-eth2, with the NIC that connects to the physical network added to it as a port (for example with ovs-vsctl add-br br-eth2 followed by ovs-vsctl add-port br-eth2 eth2). In this case the NIC was eth2 on each node. Note that the network startup scripts need to bring this interface up with no IP address at boot time.
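
To make the format of the last two options concrete, here is a minimal Python sketch that reads the section above and splits the values apart. The helper names parse_vlan_ranges and parse_bridge_mappings are made up for this post and are not the plugin's own functions; the sketch only assumes the comma-separated physnet:min:max and physnet:bridge syntax.

```python
import configparser

# The [OVS] section from this post, inlined so the sketch is self-contained.
OVS_INI = """
[OVS]
integration_bridge = br-int
tenant_network_type = vlan
enable_tunneling = False
network_vlan_ranges = physnet1:1000:1999
bridge_mappings = physnet1:br-eth2
"""


def parse_vlan_ranges(value):
    """Turn 'physnet1:1000:1999' into {'physnet1': [(1000, 1999)]}."""
    ranges = {}
    for entry in filter(None, (e.strip() for e in value.split(","))):
        physnet, _, bounds = entry.partition(":")
        vlans = ranges.setdefault(physnet, [])
        if bounds:
            vlan_min, vlan_max = (int(v) for v in bounds.split(":"))
            if not 1 <= vlan_min <= vlan_max <= 4094:
                raise ValueError("bad VLAN range: %s" % entry)
            vlans.append((vlan_min, vlan_max))
    return ranges


def parse_bridge_mappings(value):
    """Turn 'physnet1:br-eth2' into {'physnet1': 'br-eth2'}."""
    mappings = {}
    for entry in filter(None, (e.strip() for e in value.split(","))):
        physnet, bridge = entry.split(":")
        mappings[physnet] = bridge
    return mappings


config = configparser.ConfigParser()
config.read_string(OVS_INI)
vlan_ranges = parse_vlan_ranges(config["OVS"]["network_vlan_ranges"])
bridge_mappings = parse_bridge_mappings(config["OVS"]["bridge_mappings"])
```

Read this way, physnet1 is just a label that ties a VLAN range (1000 to 1999) to the bridge that carries it (br-eth2).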

The OVS configuration on the network-node looks like this after the OVS agent starts:


root@netnode:/etc/quantum# ovs-vsctl show
6875a9c2-a5d0-4017-95c7-466a27b6362a
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
        Port "eth2"
            Interface "eth2"
    Bridge br-ex
        Port "eth3"
            Interface "eth3"
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.9.0"


The OVS agent connected the bridges br-int and br-eth2 by adding the ports int-br-eth2 and phy-br-eth2, which are the two ends of a veth pair (see here about viewing veth peers). Note that these ports have no tag, so they are trunk ports that carry traffic for every VLAN.
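
If you want to check for untagged ports without reading the output by eye, a small parser is enough. This is only a sketch in Python: parse_ports is a name I made up, and it relies on nothing more than the "Bridge", "Port" and "tag:" keywords visible in the output above (br-ex is left out to keep it short).

```python
# Abbreviated copy of the `ovs-vsctl show` output above (br-ex omitted).
SHOW_OUTPUT = """\
6875a9c2-a5d0-4017-95c7-466a27b6362a
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
        Port "eth2"
            Interface "eth2"
    Bridge br-int
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.9.0"
"""


def parse_ports(show_output):
    """Return {bridge: {port: tag or None}} from `ovs-vsctl show` text."""
    bridges = {}
    bridge = port = None
    for line in show_output.splitlines():
        words = line.split()
        if not words:
            continue
        key, value = words[0], words[-1].strip('"')
        if key == "Bridge":
            bridge = bridges.setdefault(value, {})
            port = None
        elif key == "Port" and bridge is not None:
            port = value
            bridge[port] = None          # no tag seen (yet) for this port
        elif key == "tag:" and port is not None:
            bridge[port] = int(value)    # access port on this VLAN
    return bridges


ports = parse_ports(SHOW_OUTPUT)
untagged = sorted(p for p, tag in ports["br-int"].items() if tag is None)
```

At this stage every port on br-int comes back with no tag, which matches the observation that int-br-eth2 is a trunk.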

This example requires a couple of networks to show how isolation between them is done locally and globally. I'm using the quantum and nova CLIs for this, and both can read their credentials from the environment:


controller:~$ set | grep ^OS_
OS_AUTH_URL=http://10.0.10.10:5000/v2.0/
OS_PASSWORD=secrete
OS_REGION_NAME=RegionOne
OS_TENANT_NAME=demo
OS_USERNAME=admin


The user admin has been given the admin role in Keystone, and the default Quantum policy allows the admin role to view and set the provider attributes of a Quantum network. If these attributes are not specified, the Quantum service picks a free VLAN from the configured range:


controller:~$ quantum net-create net1
Created a new network:
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 0e5954a1-3064-4e38-ac9d-aa74b35ef517 |
| name                      | net1                                 |
| provider:network_type     | vlan                                 |
| provider:physical_network | physnet1                             |
| provider:segmentation_id  | 1000                                 |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | fb19b1595ae74adf942d8d527a41edda     |
+---------------------------+--------------------------------------+


Alternatively, the admin user can specify the provider attributes explicitly:


controller:~$ quantum net-create net2 --provider:network_type vlan \
                      --provider:physical_network physnet1 \
                      --provider:segmentation_id 1234
Created a new network:
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 97ca9768-cd8a-43c6-a979-6489fdaf7f29 |
| name                      | net2                                 |
| provider:network_type     | vlan                                 |
| provider:physical_network | physnet1                             |
| provider:segmentation_id  | 1234                                 |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | fb19b1595ae74adf942d8d527a41edda     |
+---------------------------+--------------------------------------+
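
The two ways of getting a VLAN can be pictured with a toy allocator. This is a sketch in Python, not the plugin's code: the class VlanPool and its methods are invented here, and the lowest-free-id policy is an assumption that merely agrees with net1 receiving 1000. Whether the real service restricts admin-chosen ids beyond what is shown is not covered.

```python
class VlanPool:
    """Toy model of the VLAN ids of one physical network (hypothetical)."""

    def __init__(self, vlan_min, vlan_max):
        self.vlan_range = range(vlan_min, vlan_max + 1)
        self.in_use = set()

    def allocate(self):
        """No provider attributes given: hand out the lowest free id."""
        for vlan_id in self.vlan_range:
            if vlan_id not in self.in_use:
                self.in_use.add(vlan_id)
                return vlan_id
        raise RuntimeError("no free VLAN id left in the range")

    def reserve(self, vlan_id):
        """Admin asked for a specific provider:segmentation_id."""
        if not 1 <= vlan_id <= 4094:
            raise ValueError("not a usable 802.1Q VLAN id: %d" % vlan_id)
        if vlan_id in self.in_use:
            raise ValueError("VLAN id %d is already in use" % vlan_id)
        self.in_use.add(vlan_id)
        return vlan_id


physnet1 = VlanPool(1000, 1999)          # network_vlan_ranges = physnet1:1000:1999
net1_vlan = physnet1.allocate()          # 1000, as for net1
net2_vlan = physnet1.reserve(1234)       # 1234, as for net2
```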


Now create subnets and boot an instance on each network:


controller:~$ quantum subnet-create net1 172.17.17.0/24 --name sub1
...
controller:~$ quantum subnet-create net2 172.17.18.0/24 --name sub2
...

vagrant@controller:~$ quantum subnet-list
+--------------------------------------+------+----------------+--------------------------------------------------+
| id                                   | name | cidr           | allocation_pools                                 |
+--------------------------------------+------+----------------+--------------------------------------------------+
| 988a6fe9-ef1b-43f5-b31b-cc9850b753bf | sub1 | 172.17.17.0/24 | {"start": "172.17.17.2", "end": "172.17.17.254"} |
| 9d530c34-5b04-4ab4-9590-db1509d808bb | sub2 | 172.17.18.0/24 | {"start": "172.17.18.2", "end": "172.17.18.254"} |
+--------------------------------------+------+----------------+--------------------------------------------------+


controller:~$ nova boot --flavor micro --image cirros-0.3.1-x86_64 \
              --nic net-id=0e5954a1-3064-4e38-ac9d-aa74b35ef517 \
              --security-groups test-vms vm1
...

controller:~$ nova boot --flavor micro --image cirros-0.3.1-x86_64 \
              --nic net-id=97ca9768-cd8a-43c6-a979-6489fdaf7f29 \
              --security-groups test-vms vm2
...

controller:~$ nova list
+--------------------------------------+------+--------+------------------+
| ID                                   | Name | Status | Networks         |
+--------------------------------------+------+--------+------------------+
| e4ec3dc1-98a0-4eeb-8330-454084ac5305 | vm1  | ACTIVE | net1=172.17.17.2 |
| d1d399b6-1659-4b56-bb92-e703c53902a9 | vm2  | ACTIVE | net2=172.17.18.2 |
+--------------------------------------+------+--------+------------------+


Shortly after the first instances on the networks were booted, the dhcp-agent created a tap device for each network (for dnsmasq), and the openvswitch-agent then assigned a local VLAN to each of them:


root@netnode:/etc/quantum# ovs-vsctl show
...
    Bridge br-int
        Port "tap8fda0b55-e9"
            tag: 1
            Interface "tap8fda0b55-e9"
                type: internal
        Port "tap6744fc6d-91"
            tag: 2
            Interface "tap6744fc6d-91"
                type: internal
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port br-int
            Interface br-int
                type: internal
...

root@netnode:/# cat /var/log/quantum/openvswitch-agent.log
...
INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 1 as local vlan for net-id=0e5954a1-3064-4e38-ac9d-aa74b35ef517
...
INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 2 as local vlan for net-id=97ca9768-cd8a-43c6-a979-6489fdaf7f29
...
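
The mapping from network id to local VLAN can be pulled out of the agent log with a regular expression. A minimal Python sketch, assuming only the "Assigning N as local vlan for net-id=..." wording shown in the log lines above (the function name local_vlans is mine):

```python
import re

# Matches the assignment message seen in openvswitch-agent.log above.
LOCAL_VLAN_RE = re.compile(
    r"Assigning (?P<vlan>\d+) as local vlan "
    r"for net-id=(?P<net_id>[0-9a-f-]{36})"
)


def local_vlans(log_text):
    """Return {net_id: local_vlan} for every assignment found in the log."""
    return {
        match.group("net_id"): int(match.group("vlan"))
        for match in LOCAL_VLAN_RE.finditer(log_text)
    }


LOG = """\
INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 1 as local vlan for net-id=0e5954a1-3064-4e38-ac9d-aa74b35ef517
INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 2 as local vlan for net-id=97ca9768-cd8a-43c6-a979-6489fdaf7f29
"""
assignments = local_vlans(LOG)
```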


These VLANs (1 and 2) are local to br-int and isolate net1 and net2 from each other on that bridge. But if a packet needs to go to an instance running on another node, it has to go via br-eth2, where its VLAN id is translated to the provider:segmentation_id of its Quantum network by these flows:


root@netnode:/# ovs-ofctl dump-flows br-eth2 | cut -d, -f7-
NXST_FLOW reply (xid=0x4):
 priority=1 actions=NORMAL
 priority=2,in_port=4 actions=drop
 priority=4,in_port=4,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL
 priority=4,in_port=4,dl_vlan=2 actions=mod_vlan_vid:1234,NORMAL

root@netnode:/# ovs-vsctl get Interface phy-br-eth2 ofport
4


The fourth flow matches packets with VLAN id 2 that ingress port 4 (phy-br-eth2, the veth end that connects to br-int). It changes the VLAN id to 1234 and then applies the NORMAL action (ordinary MAC-learning switching). These packets are then forwarded to port eth2, which connects to the physical network.

The third flow is the same except that it translates VLAN id 1 to 1000 for packets on net1.

The second flow has a lower priority and drops any packet that comes from br-int but does not carry VLAN id 1 or 2.

The first flow has the lowest priority and applies the NORMAL action to any remaining packets that were not matched, i.e. packets from the physical network that ingress eth2.
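
The way priorities decide between these four flows can be modelled in a few lines of Python. This is a simplified sketch and not how OVS is implemented: packets are plain dicts, only in_port and dl_vlan are matched, and port number 1 for eth2 in the usage lines is made up because the post does not show it.

```python
# The four br-eth2 flows from the dump above, as plain data.
BR_ETH2_FLOWS = [
    {"priority": 1, "match": {}, "set_vlan": None, "action": "NORMAL"},
    {"priority": 2, "match": {"in_port": 4}, "set_vlan": None, "action": "drop"},
    {"priority": 4, "match": {"in_port": 4, "dl_vlan": 1},
     "set_vlan": 1000, "action": "NORMAL"},
    {"priority": 4, "match": {"in_port": 4, "dl_vlan": 2},
     "set_vlan": 1234, "action": "NORMAL"},
]


def apply_flows(flows, packet):
    """Apply the highest-priority matching flow; return (action, packet)."""
    for flow in sorted(flows, key=lambda f: f["priority"], reverse=True):
        if all(packet.get(k) == v for k, v in flow["match"].items()):
            if flow["action"] == "drop":
                return "drop", None
            out = dict(packet)
            if flow["set_vlan"] is not None:
                out["dl_vlan"] = flow["set_vlan"]   # mod_vlan_vid
            return flow["action"], out
    # Not reached with the table above: the priority=1 flow matches everything.
    return "drop", None


from_net2 = apply_flows(BR_ETH2_FLOWS, {"in_port": 4, "dl_vlan": 2})
stray = apply_flows(BR_ETH2_FLOWS, {"in_port": 4, "dl_vlan": 7})
from_wire = apply_flows(BR_ETH2_FLOWS, {"in_port": 1, "dl_vlan": 1000})
```

Here from_net2 leaves with VLAN id 1234, stray is dropped by the priority 2 flow, and from_wire falls through to the priority 1 flow untouched.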


The bridge br-int has the following flows:


root@netnode:/# ovs-ofctl dump-flows br-int | cut -d, -f7-
NXST_FLOW reply (xid=0x4):
 priority=1 actions=NORMAL
 priority=2,in_port=2 actions=drop
 priority=3,in_port=2,dl_vlan=1234 actions=mod_vlan_vid:2,NORMAL
 priority=3,in_port=2,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL

root@netnode:/# ovs-vsctl get Interface int-br-eth2 ofport
2


The fourth flow matches packets with VLAN id 1000 that ingress port 2 (int-br-eth2, the veth end that connects to br-eth2). It changes the VLAN id to 1 and applies the NORMAL action.

The third flow matches packets that ingress port 2 with VLAN id 1234, changes the VLAN id to 2 and applies the NORMAL action.

The second flow matches packets that ingress port 2 with a VLAN id other than 1000 or 1234 and drops them.

The first flow is the default (lowest priority) and applies the NORMAL action (MAC learning) to packets that don't ingress port 2. These are the packets from the attached tap devices.
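
Taken together, the flows on the two bridges are inverse translations of each other. Because the local ids only mean something on one br-int, another node is free to number them differently. The Python sketch below follows a frame from one node to another under that assumption: the netnode mapping comes from the flows above, while the compute-node mapping and all function names are invented for illustration.

```python
# provider:segmentation_id -> local VLAN id on each node's br-int
NETNODE_LOCAL_VLANS = {1000: 1, 1234: 2}    # from the flows above
COMPUTE_LOCAL_VLANS = {1000: 2, 1234: 1}    # hypothetical numbering


def outbound(local_vlans, local_vlan):
    """br-eth2 direction: local VLAN id -> segmentation id (None = drop)."""
    for segmentation_id, vlan in local_vlans.items():
        if vlan == local_vlan:
            return segmentation_id
    return None


def inbound(local_vlans, segmentation_id):
    """br-int direction: segmentation id -> local VLAN id (None = drop)."""
    return local_vlans.get(segmentation_id)


def deliver(src_vlans, dst_vlans, local_vlan):
    """Local VLAN id the frame ends up with on the destination br-int."""
    wire_vlan = outbound(src_vlans, local_vlan)
    if wire_vlan is None:
        return None                      # dropped before reaching the wire
    return inbound(dst_vlans, wire_vlan)


on_wire = outbound(NETNODE_LOCAL_VLANS, 2)                          # 1234
at_compute = deliver(NETNODE_LOCAL_VLANS, COMPUTE_LOCAL_VLANS, 2)   # 1
```

Only the segmentation id has to agree between nodes; the local ids are bookkeeping that never appears on the physical network.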




This example only shows the network node, but the compute node(s) running vm1 and vm2 have a similar setup, with flows on br-int and br-eth2 that translate the VLAN ids of inbound and outbound packets. Now the packets on eth2 look like this:


root@netnode:/# tcpdump -tnei eth2
tcpdump: WARNING: eth2: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
...
fa:16:3e:c1:63:79 > fa:16:3e:e8:b4:51, ethertype 802.1Q (0x8100), length 46: vlan 1234, p 0, ethertype ARP, Request who-has 172.17.18.2 tell 172.17.18.3, length 28
fa:16:3e:e8:b4:51 > fa:16:3e:c1:63:79, ethertype 802.1Q (0x8100), length 46: vlan 1234, p 0, ethertype ARP, Reply 172.17.18.2 is-at fa:16:3e:e8:b4:51, length 28
fa:16:3e:b3:7b:05 > fa:16:3e:7a:ad:7c, ethertype 802.1Q (0x8100), length 46: vlan 1000, p 0, ethertype ARP, Request who-has 172.17.17.2 tell 172.17.17.3, length 28
fa:16:3e:7a:ad:7c > fa:16:3e:b3:7b:05, ethertype 802.1Q (0x8100), length 46: vlan 1000, p 0, ethertype ARP, Reply 172.17.17.2 is-at fa:16:3e:7a:ad:7c, length 28
...


The MAC addresses correspond to the Quantum ports:


vagrant@controller:~$ quantum port-list -c mac_address -c fixed_ips -c id
+-------------------+------------------------------------------------------------------------------------+--------------------------------------+
| mac_address       | fixed_ips                                                                          | id                                   |
+-------------------+------------------------------------------------------------------------------------+--------------------------------------+
| fa:16:3e:7a:ad:7c | {"subnet_id": "988a6fe9-ef1b-43f5-b31b-cc9850b753bf", "ip_address": "172.17.17.2"} | 330d01ff-04c5-495f-be84-b3cbd890ee14 |
| fa:16:3e:e8:b4:51 | {"subnet_id": "9d530c34-5b04-4ab4-9590-db1509d808bb", "ip_address": "172.17.18.2"} | 5db30b6c-217a-471f-b279-215977e1f1c9 |
| fa:16:3e:c1:63:79 | {"subnet_id": "9d530c34-5b04-4ab4-9590-db1509d808bb", "ip_address": "172.17.18.3"} | 6744fc6d-917a-43dc-8255-8e5d5bc22824 |
| fa:16:3e:b3:7b:05 | {"subnet_id": "988a6fe9-ef1b-43f5-b31b-cc9850b753bf", "ip_address": "172.17.17.3"} | 8fda0b55-e9c0-48bc-b2fa-074217f8c478 |
+-------------------+------------------------------------------------------------------------------------+--------------------------------------+
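
Matching the capture against the port list by hand gets tedious, so here is a small Python sketch that does the lookup. The names PORTS, describe and tap_name are mine. The tap naming rule ("tap" plus the first 11 characters of the port id) is only what the device names on br-int in this post suggest, so treat it as an observation rather than a documented rule.

```python
import re

# MAC address -> (fixed IP, port id), copied from the port list above.
PORTS = {
    "fa:16:3e:7a:ad:7c": ("172.17.17.2", "330d01ff-04c5-495f-be84-b3cbd890ee14"),
    "fa:16:3e:e8:b4:51": ("172.17.18.2", "5db30b6c-217a-471f-b279-215977e1f1c9"),
    "fa:16:3e:c1:63:79": ("172.17.18.3", "6744fc6d-917a-43dc-8255-8e5d5bc22824"),
    "fa:16:3e:b3:7b:05": ("172.17.17.3", "8fda0b55-e9c0-48bc-b2fa-074217f8c478"),
}

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
FRAME_RE = re.compile(
    r"^(?P<src>%s) > (?P<dst>%s),.*? vlan (?P<vlan>\d+)," % (MAC, MAC)
)


def describe(line):
    """Return (source IP, destination IP, VLAN id) for a `tcpdump -e` line."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    src_ip = PORTS.get(match.group("src"), (None, None))[0]
    dst_ip = PORTS.get(match.group("dst"), (None, None))[0]
    return src_ip, dst_ip, int(match.group("vlan"))


def tap_name(port_id):
    """Device name as it appears on br-int in this post."""
    return "tap" + port_id[:11]


LINE = ("fa:16:3e:c1:63:79 > fa:16:3e:e8:b4:51, ethertype 802.1Q (0x8100), "
        "length 46: vlan 1234, p 0, ethertype ARP, "
        "Request who-has 172.17.18.2 tell 172.17.18.3, length 28")
seen = describe(LINE)
dhcp_tap = tap_name(PORTS["fa:16:3e:c1:63:79"][1])
```

The two 172.17.x.3 ports turn out to be the tap devices that the dhcp-agent added to br-int, and the .2 ports belong to vm1 and vm2.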


In this case the "physical" network is actually a VirtualBox host-only network, and it does not seem to have a problem with tagged traffic. However, real switches will probably need their ports configured to accept 802.1Q packets tagged by the hosts for every VLAN in the range.
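
For reference, the tag that the switches have to accept is four bytes inserted after the source MAC address: the value 0x8100 followed by 3 bits of priority, 1 flag bit and the 12-bit VLAN id. Here is a short Python sketch of that layout. The function names are mine, and the ARP payload is just 28 zero bytes standing in for a real request.

```python
import struct

TPID_8021Q = 0x8100      # the "ethertype 802.1Q (0x8100)" in the capture
ETHERTYPE_ARP = 0x0806


def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert an 802.1Q tag after the destination and source MACs."""
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN id must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | vlan_id     # flag bit (bit 12) left at 0
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def read_vlan_tag(frame):
    """Return (vlan_id, priority), or None if the frame is untagged."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF, tci >> 13


dst = bytes.fromhex("fa163ee8b451")
src = bytes.fromhex("fa163ec16379")
untagged = dst + src + struct.pack("!H", ETHERTYPE_ARP) + bytes(28)
tagged = add_vlan_tag(untagged, 1234)
```

The tagged frame is 46 bytes long, which agrees with the "length 46" that tcpdump reports for the ARP packets above.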