
NSX on AWS (Or Google Cloud) – Part 3

Finally, let’s stretch our vSphere clusters from AWS to Google using the L2VPN feature of NSX and Ravello! This method also applies between a physical site (a branch office, your laptop…) and a cloud infrastructure.

What we will achieve:

img_54e26bb887aaf

Site A is going to be Google Cloud, Site B AWS. The VLANs we chose are a bit different from those in the diagram above, but what we are trying to achieve is pretty much the same.

Site A does not use network virtualisation; Site B has NSX 6.1.2 and VXLANs configured. I tried 6.1.4 unsuccessfully because of a bug with self-signed certificates; no doubt 6.2 will fix that.

We will show that VM1 on this diagram can ping VM7, following this data path:

VM1 (Google) > VLAN > L2VPN Client > L2VPN Server > Logical switch (VXLAN) > VM7 (AWS)

In our setup, the “real” VLAN is VLAN501, tagged with dot1q 501, and the VXLAN is VXLAN5000.

From parts 1 and 2, your Site B should be up and running.

Our 2 cloud applications:

Screen Shot 2015-08-10 at 1.07.45 AM

Site B’s topology is:

Screen Shot 2015-08-10 at 1.03.33 AM

 

Screen Shot 2015-08-10 at 1.39.46 AM

At the top infrastructure level, we have the following:

NS1 (CentOS) is running named and NTPd (192.168.1.101/24)

vCenter 5.5 U2 appliance (192.168.1.100/24)

Jumphost is a Windows 7 VM running RDP on an elastic IP (192.168.1.5/24)

ESX MGMT is our management cluster for NSX Manager and the controllers (192.168.1.11/24)

ESX PROD simulates a compute cluster (192.168.1.12/24)

The following nested/non-nested objects are managed by our vCenter server:

DC (Datacenter)

> MGMT (Cluster)

>> esx1.tomlab.com (Host)

>>> NSX Manager (VM) (192.168.1.200/24)

>>> NSX Controller1 (VM) (192.168.1.160/24)

> PRD (Cluster)

>> esx2.tomlab.com (Host)

>>> Server1 (VM) (192.168.4.2/24) on VXLAN5000 (logical switch)

And finally, network-wise:

– 1 standard switch (1 uplink: esx1 and esx2 vnic1), no VLAN configured (192.168.1.X)

– 1 VDS with 1 trunk port group (1 uplink: esx1 and esx2 vnic2), VLANs 500-505 configured

 

Site A is much simpler and simulates a small branch infrastructure:

Screen Shot 2015-08-10 at 1.03.58 AM

 

Screen Shot 2015-08-10 at 1.43.31 AM

 

Jumphost is a Windows 7 VM running RDP on an elastic IP (192.168.2.5/24)

Remote ESX is an ESXi server (192.168.2.11/24)

vCenter 5.5 U2 appliance (192.168.2.100/24)

The following nested/non-nested objects are managed by our vCenter server:

DC2 (Datacenter)

> Cluster1 (Cluster)

>> esx1.tomlab2.com (Host)

>>> server1 (VM) (192.168.4.1/24)

Finally, network-wise:

esx1’s first NIC is on a standard switch, no VLAN configured, with access to 192.168.2.0/24.

esx1’s second NIC is on a distributed switch with 2 port groups configured: 1 access port group on VLAN 501 and 1 trunk port group (VLANs 500-505).

server1 is connected to the access port group on VLAN501.

L2 VPNs connect trunk ports together, hence the presence of trunk port groups at both sites. The creation of these port groups requires special parameters; there is some great documentation specifically about that here.

Let’s get started:

For Site B (L2VPN server):

1) In the second part of this series, we deployed a logical switch. We will now connect one interface (LIF) of an edge gateway to this logical switch, and a second interface of the same edge gateway to our uplink network (192.168.1.X/24).

Follow the steps to create and configure your edge gateway (here are screenshots of the less standard parameters). It will be our L2VPN server and reside on our production cluster in the AWS cloud:

Screen Shot 2015-08-09 at 8.17.22 PM

Screen Shot 2015-08-09 at 8.17.46 PM

Screen Shot 2015-08-09 at 8.18.22 PM

Let’s just configure the uplink for now:

Screen Shot 2015-08-09 at 8.18.44 PM

Screen Shot 2015-08-09 at 8.19.14 PM

Once the VM is deployed, power it on and edit its second interface. The interface type must be set to trunk and connected to your trunk port group on the VDS. Then add a sub-interface on the trunk with a tunnel ID; this will make your VXLAN (logical switch) part of the trunk:

Screen Shot 2015-08-09 at 9.01.16 PM

Screen Shot 2015-08-09 at 9.00.09 PM

The next step is to generate a CSR and self-sign our certificate. Head to the certificates section below your edge interface configuration, click Actions > “Generate CSR” and fill in the basic information. Once your CSR is generated, click Actions > “Self-sign certificate” and enter a validity in days.

Screen Shot 2015-08-09 at 8.52.27 PM

Screen Shot 2015-08-09 at 8.53.03 PM
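For the curious, what the edge does here is roughly equivalent to the following openssl commands (the file names are just placeholders):

# generate a private key and a CSR
openssl req -new -newkey rsa:2048 -nodes -keyout edge.key -out edge.csr
# self-sign the CSR for, say, 365 days
openssl x509 -req -days 365 -in edge.csr -signkey edge.key -out edge.crt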

 

Then head to Manage > VPN, choose L2VPN, and enter your server settings:

Screen Shot 2015-08-09 at 9.01.51 PM

Once done, we need to add the peer site (Site A, the client) with the following configuration. Choose a password here (you will reuse it during the Site A configuration later) and select the interfaces that you want to stretch to the remote site:

Screen Shot 2015-08-09 at 9.03.14 PM

 

Then just click on the “enable” button and publish your changes.

We have chosen 192.168.1.250 as the listener IP, so we will need to configure DNAT in the Ravello web UI: elastic IP > 192.168.1.250:443. The way to do this is to add a secondary IP (192.168.1.250) on your ESX PROD host and add a service on port 443, bound to an elastic IP:

Screen Shot 2015-08-10 at 11.03.32 am

 

Screen Shot 2015-08-10 at 11.03.43 am

 

This will make the L2VPN service on the edge gateway reachable from the outside on port 443. Test that everything works by issuing:

telnet <elastic IP> 443

If you get a reply, proceed to the next step; if not, review your settings.
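Note that telnet only proves the port is open; since the L2VPN listener speaks SSL, you can also check the handshake itself with something like:

openssl s_client -connect <elastic IP>:443
# you should be presented with the self-signed certificate generated on the edge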

Refer to this site if you need more help.

Now for Site A:

We first need to deploy the NSX client OVF, available with every NSX release on VMware’s download site. Just do that from your jump host using either the vSphere client or the web UI.

You will have to fill out a few configuration settings:

The “setup network” screen sets the uplink (192.168.2.0/24) and the trunk port group that we are going to use. The trunk port group MUST be different from the access port group your VM is connected to, but MUST include that port group’s VLAN in its trunked VLAN list.

In other words, we will use 3 port groups for this setup: one “VM network” (192.168.2.0/24), one “VLAN501” configured as an access port group on VLAN 501 (our Linux server is connected to that one) and one “TRUNK” port group configured as a trunk for VLANs 500-505. On the “setup network” screen, use 192.168.2.0/24 as your public source and TRUNK as your trunk source.

This shows the trunk port group configuration on Site A:
Screen Shot 2015-08-09 at 10.55.05 PM

Screen Shot 2015-08-09 at 10.30.42 PM

 

For the server address, enter your elastic IP, and for the rest, just match your server settings.

Once the appliance runs, the connection should be automatically established between your 2 sites:

Screen Shot 2015-08-10 at 12.49.44 AM

If your tunnel is not established, try the edge l2vpn troubleshooting commands from this blog.

Otherwise, you are pretty much done and should have connectivity between your Linux machines (if issues persist, they might be due to the security settings of your port groups; check out the aforementioned blog):

Screen Shot 2015-08-10 at 1.01.55 AM

Screen Shot 2015-08-10 at 1.00.57 AM
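In case you want to reproduce the test shown above, from Site A’s server1 it is simply (addresses as per the diagrams):

ping 192.168.4.2
# from server1 at Site A (192.168.4.1, VLAN 501) to Server1 at Site B (192.168.4.2, VXLAN 5000)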

That’s it… your L2 network is now stretched between Google Cloud and AWS. This opens up quite a lot of possibilities: think disaster recovery, vMotion… NSX can be used as a service over any IaaS cloud (with the extra layer Ravello provides, for now!). We will explore them in future posts.

Thanks to Ravello for their support, and let me know if you have questions!

 

 

NSX on AWS (Or Google Cloud) – Part 2

Now that we have ESXi and vCenter installed on AWS and/or Google Cloud, let’s focus on NSX. This time we will deploy our management and “prod” VMs in a nested environment. The NSX installation/configuration is quite standard; there are only a few differences, which we will outline in this post, hence the lack of detail. If you need more information on how to install and configure NSX, this page, for example, is great!

First of all, let’s add our ESXi machines to our vCenter server, creating 1 DC and 2 clusters:

– 1 MGMT (Management) Cluster for NSX Manager and controllers

– 1 PRD (Production) Cluster for compute resources and VXLANs

Your setup should look like this:

Screen Shot 2015-08-09 at 9.53.44 PM

(ignore the VMs for now; we will get to them soon)

Now let’s deploy NSX Manager (right-click on your MGMT cluster and choose Deploy OVF). The rest is pretty obvious: choose an IP in your 192.168.1.X/24 range (I picked 1.200) and follow the prompts. Just don’t tick “Power on VM” on the last screen, as we will first lower the VM’s resource consumption.

Once the manager is deployed, right-click, edit its settings, change the specs to 2 vCPUs (down from 4) and 8192 MB of RAM (down from 12 GB), and power on the VM.

After a few minutes, you should be able to access the interface; the username is “admin” and the password is what you chose during deployment.

Make sure that all the NSX services are running:

 

Screen Shot 2015-08-09 at 10.01.00 PM

Then go to “Manage” > “NSX Management Service” and configure your Lookup Service and your vCenter Server registration.

Screen Shot 2015-08-09 at 10.02.39 PM

Configure NTP as well (it’s important), along with any other parameters relevant to your configuration.

Once done, the Networking & Security section should appear in the vCenter web UI. Click on it, select “Installation”, go to the Management tab and add a controller node.

Screen Shot 2015-08-09 at 10.06.46 PM

The next step is host preparation: click on the Host Preparation tab and prepare your PRD cluster.

Then go to Logical Network Preparation: create a transport zone in unicast mode, configure your segment ID pool, and finally your VXLAN transport.
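As a rough indication of the values involved (a sketch; adapt to your own environment):

Transport zone:  unicast mode, including the PRD cluster
Segment ID pool: e.g. 5000-5999 (our first logical switch will then come up as VXLAN 5000)
VXLAN transport: PRD cluster on the trunk VDS, MTU left at the NSX default of 1600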

Screen Shot 2015-08-09 at 10.10.40 PM

It’s time to create our first VXLAN (aka logical switch):

Screen Shot 2015-08-09 at 10.12.01 PM

You should now be able to deploy a few VMs, connect them to your logical switch and test connectivity. We will re-use them for the third and final part of this series.

 

 

NSX on AWS (Or Google Cloud) – Part 1

Trying to run the mighty NSX in a home lab can be challenging… Forget about a single host, forget about 16 GB per host, and if you want to have a bit of fun, you will need at least a couple of VLANs. Enter Ravello (www.ravellosystems.com).

Thanks to this incredible product, your expensive and noisy home lab days are over.

Spin up 2 ESXi hosts and a vCenter on Google Cloud or AWS in a minute from this extremely well-crafted web interface! There you go, you have your management cluster. In reality things are a bit more challenging, but workable; we will see how in a minute. Ravello is free for 2 weeks for you to try and perform your initial installations, on the cloud platform of your choice. After the trial expires, and depending on how demanding your setup is, you will pay US$1-3/h. Reasonable if you consider that a similar physical setup would cost at least a couple of grand.

The plan:

Step 1 (this post): Run an ESXi cluster, a vCenter server and basic required services in the cloud

Step 2: Install NSX in a nested environment and configure controllers

Step 3: Connect a remote environment to the Ravello NSX setup using a Layer 2 VPN!

This will demonstrate that securely connecting a company network to a cloud instance is possible using NSX, whether you use vCloud Air (which is much simpler), AWS or Google Cloud. Ravello advertises itself as a lab/testing platform, but I don’t see how such services will not be offered for production in the future.

A couple of gotchas:

– I tried to complete this setup using vSphere 6, unsuccessfully. While ESXi 6 installs without a problem on Ravello’s ESX instances, the new version of the vCenter appliance asks for an ESXi host as a deployment target and therefore needs to run in a nested environment (deployed on a virtual ESXi host). The workload created by such an installation (4 vCPUs and 8 GB of RAM) is not (yet) supported by Ravello and WILL crash. Stick to VC5, or install VC6 on a Windows server (no nested deployment). One thing to remember: if you install ESXi 6, VC5 will not work.

– I have experienced a few “Cloud Errors” which make your setup unavailable. They usually don’t last long, but can happen at random times. (Update 11/08/15: there were 2 causes: the nested ESXi workload, and, at the time I tried this, a storage outage on Google’s side.)

– This setup assumes that you have the right (and the licenses) to download VC5 and NSX 6 virtual appliances. Licensing is not covered in this document.

– These instructions also assume that you know your way around ESX, NSX, Ravello and Linux.
– Opt for Ravello’s performance tier; otherwise NSX Manager and a single controller will crash your nested environment. You have been warned.

Some steps of my setup are similar to these instructions. They are great (have a read), but we will go a little further and try to simplify the setup.

Now, for our basic setup we will need:

– A Windows jumphost to perform vCenter deployments, access the ESXi hosts, etc.

– An authoritative DNS and NTP server for the zone tomlab.com (CentOS 6/bind/NTPd) that we can also use to SSH into the ESXi hosts.

– A software iSCSI SAN (presenting a 100GB target)

– 1 ESXi host to emulate the NSX management cluster.

– 1 vCenter appliance version 5.5

– We will use just one network for our setup to keep things simple (192.168.1.0/24):

  • Static IPs only
  • 192.168.1.1 is the default gateway (a very elegant feature of Ravello is the automatic creation of default GWs, DNS servers, etc. when you configure a VM)
  • 192.168.1.101 will be our DNS server running bind
  • 192.168.1.11: our virtual ESXi host (MGMT)
  • 192.168.1.150: our SAN (Linux machine running OpenFiler)
  • 192.168.1.100: our nested vCenter server (running on the MGMT cluster, we will set it up last)
  • 192.168.1.200: our NSX Manager appliance (also set up later)

Our setup should look like this once everything is up and running, from the Ravello side:

Screen Shot 2015-08-06 at 11.32.39 PM

Detailed network view:

Screen Shot 2015-08-06 at 11.44.27 PM

Let’s get started.

Just follow the steps:

1) Open an account on Ravello’s website

2) Once logged in, add a new application. Just click on “Applications” and “Create Application”.

Give it a name; don’t use a blueprint for now.

3) Next let’s upload:

– The ESXi 5 ISO image

– The CentOS 6.6 ISO image

– The OpenFiler ISO image (or any software SAN of your preference; I tested iSCSI, but any NFS server should also work, I just wanted to avoid the NFS/firewall setup)

– A Windows client OS image (I used Windows 7; I tried the supplied Xubuntu but had issues with X11 performance)

– The vCenter 5 appliance OVF/OVA image

While ISO images just need to be attached when deploying a new VM, you will have to configure some settings for the OVA/OVF before you can use it. Stick to LSI Logic (Parallel) for your virtual disks and the rest should be fine.

Click on Library > Disk Images and install the Windows or Mac uploader on your machine, then fire it up, log in and upload your images.

If you already have a vSphere environment running, you can even export your existing VMs and upload those instead.

Once you are done with your 5 images, proceed with the next steps. (When you add VMs to your canvas, you will have to “Update” your configuration; this publishes them to the cloud platform of your choice, so pay attention to the target cloud on the following screen.)

4) First, install your jump host by clicking on the + sign and creating an empty image. Connect the Windows ISO to it and install Windows. Give it an IP address. Once the installation is complete, go to your Ravello network preferences and make sure the IP you configured in Windows post-installation matches the Ravello one. Then enable RDP on your Windows machine and, in Ravello’s config, go to Services and add a service labelled RDP / TCP / 3389.

Using a public IP (or an elastic one if you want it to stick), test your RDP connection.
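A quick way to check that the service is reachable before firing up an RDP client:

telnet <elastic IP> 3389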

5) Install your DNS server by clicking on the + sign and creating an empty image. Do not use the supplied Ravello CentOS image unless you plan on using keys for authentication. Install CentOS 6.6 minimal from scratch and configure its IP address, DNS client settings, hostname, etc. 1 CPU, 2 GB of RAM and 20 GB of HDD will be sufficient.

The 2 services we will need are bind (named) and NTPd.

Install bind using:

yum install  bind
yum install bind-utils
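We will also need ntpd later on, so you may as well install it now (assuming the stock CentOS 6 repositories, which provide it as the ntp package):

yum install ntp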

Configure your bind server and a zone file similar to this one:

[root@ns1 ~]# cat /var/named/tomlab.zone
$TTL 86400
@       IN      SOA     ns1.tomlab.com. hostmaster.example.com. (
                        2015070701 ; serial
                        21600      ; refresh after 6 hours
                        3600       ; retry after 1 hour
                        604800     ; expire after 1 week
                        86400 )    ; minimum TTL of 1 day

        IN      NS      ns1.tomlab.com.

ns1     IN      A       192.168.1.101
esx1    IN      A       192.168.1.11
san     IN      A       192.168.1.150
nsx     IN      A       192.168.1.200
vcenter IN      A       192.168.1.100
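For completeness, the matching zone declaration in /etc/named.conf (or a file it includes) would look something like this (a sketch; adjust paths and ACLs to your own setup):

zone "tomlab.com" IN {
        type master;
        file "tomlab.zone";
        allow-query { any; };
};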

Configure your ntpd service like this:

[root@ns1 ~]# cat /etc/ntp.conf 
# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server au.pool.ntp.org iburst

# Leave the rest as is

Start your services, make sure that they are enabled on boot (with chkconfig) and that your firewall allows requests on port 53 (UDP) for DNS and port 123 (UDP) for NTP.
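On CentOS 6 this boils down to something like the following (a sketch; adapt the iptables rules to your existing firewall configuration):

service named start
service ntpd start
chkconfig named on
chkconfig ntpd on
# open DNS and NTP to the lab network
iptables -I INPUT -p udp --dport 53 -j ACCEPT
iptables -I INPUT -p udp --dport 123 -j ACCEPT
service iptables save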

6) Create another empty host using the + sign and name it san.whatever.com.

Give it a 100GB HDD (LSI Logic Generic) and attach your OpenFiler ISO to it.

Give it an IP address, configure the network settings, make them match the static settings in the Ravello interface, and create a service to access OpenFiler’s web interface. For guidance:

Screen Shot 2015-08-05 at 11.31.40 AM

Screen Shot 2015-08-05 at 11.32.22 AM

If you need help getting OpenFiler installed, you will find a lot of tutorials around; this one details the steps very well.

7) Now let’s create our nested ESXi server VMs.

On your canvas, just click on the “+” sign and add 3 empty ESX hosts. Connect your ESXi image to the 3 hosts and complete the install using the console. Tip: use the visual keyboard to press F11.

Once installed, configure your networking and enable SSH from the troubleshooting options.

Important steps to enable nested virtualisation on your ESXi hosts and avoid trouble (from Ravello’s excellent blog entry): using your freshly installed name server as a jumphost (or Windows with PuTTY, whatever you are comfortable with), SSH into your ESXi hosts and perform the following steps:

DELETE ESX UUID

  1. run “vi /etc/vmware/esx.conf”
  2. go to last line in the file in which “/system/uuid” is defined. Delete this line and save the file.

SET UNIQUE MAC ADDRESSES

  1. In Ravello GUI, in the Network tab of the VM, make sure “Auto MAC” is checked for both interfaces.
  2. run 'esxcli system settings advanced set -o /Net/FollowHardwareMac -i 1'

ENABLE NESTING ON ALL ESX GUESTS

  1. This step is important in order to be able to power on VMs running on the nested ESXi hosts. It replaces the need to configure each guest with the 'vmx.allowNested' flag.
  2. run ‘vi /etc/vmware/config’.
  3. add the following line to the file and save: vmx.allowNested = "TRUE"

ENSURE CHANGES ARE SAVED

  1. run '/sbin/auto-backup.sh' (ignore any warnings in its output)
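A quick way to confirm the last two changes took effect, from the same SSH session (a sketch):

grep allowNested /etc/vmware/config
# expected output: vmx.allowNested = "TRUE"
esxcli system settings advanced list -o /Net/FollowHardwareMac
# the "Int Value" field should now read 1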

Once you are done with this on your host, you should be able to access it from a traditional vSphere client on your Windows jumphost, but we won’t need to do this for now.

8) Let’s install the vCenter appliance: simply click on “+” again and drop your VCA image on the canvas. Give it an IP address and configure your settings (IP/GW/DNS) as per the beginning of this document. Don’t use Ravello’s DNS, as some of our infrastructure will be running in the nested environment; use your CentOS named instance instead. Once deployed, let the virtual machine boot and, using Ravello’s console, log into it (root/vmware). Then, as instructed, execute:

Screen Shot 2015-08-06 at 11.39.09 PM

And follow the menus to configure your vCenter networking, default GW, DNS…

Once done, you will be able to access https://192.168.1.100:5480 and finish your VC configuration; when it is complete, you can finally access the vCenter web client at https://192.168.1.100:9443.

It’s now time to create a DC, a cluster (call it MGMT) and add your virtual ESXi to it.

Once done, add a software iSCSI initiator and add your SAN IP (192.168.1.150) as a dynamic target.
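If you prefer the command line, the equivalent from an SSH session on the ESXi host looks roughly like this (the vmhba number is an example; check yours with "esxcli iscsi adapter list"):

# enable the software iSCSI initiator
esxcli iscsi software set --enabled=true
# assuming it shows up as vmhba33, point it at the OpenFiler target and rescan
esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.1.150:3260
esxcli storage core adapter rescan --adapter=vmhba33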

The result of all this work should give you something like this (without the error):

Screen Shot 2015-08-06 at 11.53.31 PM

That’s it for now. In our next post, we will deploy our NSX Manager and a controller, start deploying virtual switches and edge gateways, and later VPN into this setup!

Stay tuned!