Book Review: Life in Code

Life in Code is a collection of essays by Ellen Ullman, a writer with a background in programming and software engineering. In this collection, she muses on the life of an engineer, the role of privacy in the modern internet age, the rise and fall of tech economies over the last 35 years, and the sociopolitical dynamics of the tech culture of Silicon Valley and San Francisco. (Which couldn’t be any more timely, given the recent news of privacy concerns from our internet big brothers.)

I enjoyed the stories about coding back in the ’80s. I honestly can’t imagine trying to solve complex problems without online, searchable documentation, Stack Overflow, or even Google searches. I respect the willingness to dive through pages of manuals to find one nugget of inspiration for trying something a different way. And not to give too much away, but the culprits she describes while bug hunting aren’t any different from the ones we find today when troubleshooting code.

I have never worked full-time as a developer/software engineer/programmer, but I definitely found parallels to my technology career.

In one section she describes the life of software in the context of the life of a programmer, “If you are a programmer, it is guaranteed that your work has errors. These errors will be discovered over time, most coming to light after you’ve moved on to a new job… At the old job, they will say terrible things about you after you’ve gone. This is normal life for a programmer.”

My current role is sales engineer, and she bites hard at this when talking about a fellow developer who became a sales engineer; it hits a little too close to home. “When asked we said, ‘Frank is now in sales.’ This was equivalent to saying he was dead.”

She talks about the drive of an engineer, “I’m an engineer for the same reason anyone is an engineer: a certain love for the intricate lives of things, a belief in a functional definition of reality. I do believe that the operational definition of a thing — how it works — is its most eloquent self-expression.”

And here is another analogy, one relevant to the infrastructure I have spent my career defining, building, and refining: “And down under all those piles of stuff, the secret was written: we build our computers the way we build our cities — over time, without a plan, on top of ruins.”

She also tackles the subject of being a woman in a male-dominated industry. She refers a few times to the internet booms and busts in her home of San Francisco and how, during the peaks, young white and Asian men are the ones making the fortunes.

Overall, it was an enjoyable read for anyone who has lived in the technology world for any amount of time.

AWS CloudFormation

I wanted to create a very basic beginner’s guide to CloudFormation. The information I found on it went from 0 to 60 pretty quickly, and I found myself barely hanging on.

CloudFormation is a way to model AWS infrastructure in code. This gives you a reasonable way to quickly stand up new environments, verify the state of current environments, or change that state.
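To give a feel for the format, here is a minimal sketch of a template that models a single S3 bucket and nothing else; the logical name and bucket name are hypothetical.

AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal example - one S3 bucket
Resources:
  ExampleBucket:
    # Logical name used to reference this resource within the template
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-example-bucket-1234   # hypothetical; must be globally unique

You would launch it as a stack with something like: aws cloudformation create-stack --stack-name example --template-body file://example.yml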

I have written a tutorial and provided example templates over at my GitHub repo. (Because of the way WordPress formats code snippets, I left all of it over there.)

https://github.com/boblongmore/aws-cloudformation-examples

Ansible Basics

I have worked with Ansible enough to figure out how to get some basic configuration done. I wanted to document some of these basic approaches to set a baseline.

Using our Vagrant machine and a CSR router, we will go through different methods of configuring the router:

  • Copying a config from a .cfg file to the router
  • Tasks defined in the main playbook
  • Tasks using roles, in three different ways:
    • tasks defined in the role
    • tasks that import other task files
    • tasks that use a Jinja2 template to populate a config from role-specific vars

I tried to build an example around the tool we most often use today when configuring a new site–a spreadsheet.

I have a spreadsheet that collects the information you would most likely use to configure a router (SNMP, NTP, routing process, interface configuration).

I use a Python script to populate a YAML file that then, via a Jinja2 template, populates a router config. The end result is a standard router config:

hostname CSR01

ip domain-name example.com
crypto key generate rsa modulus 2048

snmp-server community publicRO

router ospf 100
 router-id 1.1.1.1

interface g1
 description to_Core
 ip address 192.168.1.1 255.255.255.252
 ip ospf 100 area 0
 no shutdown

interface g2
 description to_Core
 ip address 192.168.1.5 255.255.255.252
 ip ospf 100 area 1
 no shutdown

interface lo1
 description OSPF RID
 ip address 1.1.1.1 255.255.255.255
 ip ospf 100 area 0
 no shutdown
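The real script lives in the repo linked at the end of this post, but here is a minimal sketch of the approach, assuming hypothetical file names (site_survey.xlsx, int.j2, CSR01.cfg) and a spreadsheet whose first row holds the column headers:

#!/usr/bin/env python
# Sketch: read interface rows from a spreadsheet, dump them to the role's
# vars file, and render a config through a Jinja2 template.
import xlrd
import yaml
from jinja2 import Environment, FileSystemLoader

sheet = xlrd.open_workbook('site_survey.xlsx').sheet_by_index(0)
headers = sheet.row_values(0)  # e.g. int, description, ip, mask
rows = [dict(zip(headers, sheet.row_values(r))) for r in range(1, sheet.nrows)]

# Write the vars file the interfaces role will consume
# (path per the directory layout shown later in this post)
with open('roles/interfaces/vars/main.yml', 'w') as f:
    yaml.safe_dump({'interface': {'name': rows}}, f, default_flow_style=False)

# Render the same data straight into a router config
template = Environment(loader=FileSystemLoader('.')).get_template('int.j2')
with open('CSR01.cfg', 'w') as f:
    f.write(template.render(interface={'name': rows}))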

We can use an Ansible playbook to run that Python script, and then another playbook to apply the resulting config to a router.
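As a hypothetical sketch (the script name is made up), that first playbook could be as simple as a local task that shells out to the builder:

---
- name: build config from spreadsheet
  hosts: localhost
  connection: local
  tasks:
    - name: run the builder script
      command: python csr_builder.py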

There is a global hosts file that we can use to define hosts or groups of hosts. In this case I am using one local to this playbook directory so I can be more flexible in testing different scenarios. Here is what it looks like.

localhost ansible_connection=local

[CSR]
172.16.9.155
[CSR:vars]
ansible_connection=local

I run the playbook and include the hosts file:

ansible-playbook csr_config.yml -i ./hosts

Here is what the playbook looks like.

---
- name: Configure a CSR Router
  hosts: CSR
  vars:
    creds:
      username: admin
      password: cisco
      authorize: yes

  tasks:
    - name: apply config file to the router
      ios_config:
        provider: "{{ creds }}"
        authorize: yes
        src: "/Github/AnsibleNetExamples/CSR-Builder/cfg_files/CSR01.cfg"

NOTE: Notice we define our credentials right in the playbook and then reference them through a provider in our tasks. Since Ansible 2.3 we can pass authentication via the command line, so we don’t have to store credentials in our playbooks. You could also use Vault to store encrypted credentials (that is for another time).

We will use the method of passing credentials via command line when we start using roles.

Next we’ll define the router configuration within a single playbook. This playbook simply sets an IP address and description on an interface and sets an NTP server, using the syntax of the router itself. Notice the parents line: it allows us to put configuration in interface configuration mode.

---
- name: Apply Configuration to CSR
  hosts: CSR
  vars:
    creds:
      username: admin
      password: cisco

  tasks:
    - name: Configure Interface G2
      ios_config:
        provider: "{{ creds }}"
        lines:
          - ip address 10.1.1.1 255.255.255.252
          - description Configured by Ansible Playbook CSR-Basic
        parents: interface g1
    - name: Configure NTP
      ios_config:
        provider: "{{ creds }}"
        lines: ntp server 8.8.8.8

We could keep adding tasks to this playbook to build a complete router configuration, but instead we will start to abstract the different areas of configuration into different roles. This comes in handy for static configuration you may have. For example, things like NTP servers or SNMP servers probably don’t change as often as switchport configurations or even routing configurations. We can define an NTP role that we call from our site playbook. That way we never have to worry about accidentally changing that configuration while we are working on something else.

Let’s look at the directory structure we will need within our ansible directories to take advantage of these abstractions.

├── hosts
├── roles
│   ├── interfaces
│   │   ├── tasks
│   │   │   └── main.yml
│   │   ├── templates
│   │   │   └── int.j2
│   │   └── vars
│   │       └── main.yml
│   ├── ntp
│   │   ├── tasks
│   │   │   └── main.yml
│   │   └── templates
│   └── ospf
│       ├── tasks
│       │   ├── main.yml
│       │   ├── ospf-int.yml
│       │   └── ospf-proc.yml
│       └── templates
└── site.yml

Our site.yml file does nothing more than say what hosts we want to run the playbook on and what roles we want to run. Here is the top-level site.yml.

---
- name: CSR Roles Playbook
  hosts: CSR

  roles:
    - ntp
    - ospf
    - interfaces

We have roles for ntp, ospf, and interfaces defined. Notice that each role has its own directory structure. There are more directories we could create, but we are keeping it simple with tasks, templates, and vars.

Here is the main.yml file for the ntp role.

---
- name: set ntp server
  ios_config:
    lines:
      - ntp server 10.1.1.1

It simply sets the NTP server.

Next we’ll abstract those roles a bit more with the import_tasks functionality. For OSPF we want to set the OSPF process ID and router ID, and we also want to set the interface-level command that makes an interface part of the OSPF process.

Our main.yml for the OSPF role is as follows.

---
- import_tasks: ospf-proc.yml
- import_tasks: ospf-int.yml

This role imports its tasks from two other files in the same directory.
The ospf-proc.yml sets the process ID and router ID.

---
- name: set ospf process
  ios_config:
    lines:
      - router ospf 100

- name: set router id
  ios_config:
    lines:
      - router-id 1.1.1.1
    parents: router ospf 100

The ospf-int.yml configures the interface.

---
- name: set OSPF Interface
  ios_config:
    lines:
      - ip ospf 100 area 0
    parents:
      - interface g1

Lastly, we have the interfaces role. We are going to apply the config from the Jinja2 template included in the templates directory.

---
- name: configure interface settings
  ios_config:
    src: int.j2

Let’s look at the Jinja2 template. Here we reference variables that live in the main.yml file under the vars directory for this role.

We are creating a loop for each interface defined in the vars file.

{% for interface in interface.name %}
interface {{ interface['int'] }}
 description {{ interface['description'] }}
 ip address {{ interface['ip'] }} {{ interface['mask'] }}
 no shutdown
{% endfor %}

In this loop we create configuration lines for the interface name, description, and IP address. In our vars file, we create a dictionary for each interface and reference the dict items within our for loop in the Jinja2 template.

---
interface:
  name:
    - { int: g2, description: created by ansible, ip: 10.1.1.2, mask: 255.255.255.0 }
    - { int: g1, description: created by ansible, ip: 10.2.1.2, mask: 255.255.255.0 }

You’ll notice we never defined our provider in these examples. We will run the playbook and pass in our username and password on the command line.

ubuntu@ubuntu-xenial:/Github/AnsibleNetExamples/CSR-Roles$ ansible-playbook -i ./hosts site.yml -u admin -k
SSH password:

PLAY [CSR Roles Playbook] ********************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************
ok: [172.16.9.155]

TASK [ntp : set ntp server] ******************************************************************************************************************
changed: [172.16.9.155]

TASK [ospf : set ospf process] ***************************************************************************************************************
changed: [172.16.9.155]

TASK [ospf : set router id] ******************************************************************************************************************
changed: [172.16.9.155]

TASK [ospf : set OSPF Interface] *************************************************************************************************************
changed: [172.16.9.155]

TASK [interfaces : configure interface settings] *********************************************************************************************
changed: [172.16.9.155]

PLAY RECAP ***********************************************************************************************************************************
172.16.9.155 : ok=6 changed=5 unreachable=0 failed=0

NOTE: I did have some issues with SSH and network devices. I’ve found that if you edit the /etc/ansible/ansible.cfg file to disable host key checking under the defaults section and turn off recording of new host keys under the paramiko settings, it works a bit better.

# uncomment this to disable SSH key host checking
host_key_checking = False

...

# uncomment this line to cause the paramiko connection plugin to not record new host
# keys encountered. Increases performance on new host additions. Setting works independently of the
# host key checking setting above.
record_host_keys=False

These files are available in my GitHub repository: https://github.com/boblongmore/AnsibleNetExamples

AWS re:Invent 2017

I am not exactly Mr. Relevant on this topic–AWS re:Invent happened over a month ago in Las Vegas. I took a whole bunch of notes and spewed them into a summary as soon as I got back. I delivered a presentation to my coworkers and moved on. But as we roll into 2018, I am still thinking about what AWS is and what they have to offer businesses small and large. When I left the conference I was definitely high on what AWS had to offer. And a month later I still am. There are business problems to solve. There are conversations about what cloud native means. There are discussions about migrations and/or building greenfield. There are strategies that businesses are figuring out.

As a grizzled old network engineer, I find this world fascinating. I find these potential business transformation topics fascinating.

I’ve done a decent job of making myself uncomfortable in 2017. I changed jobs. I went to a devops conference. I focused on cloud and automation more. Now I went to a cloud conference. This was a different venue for me. There was a tangible difference in the feel of the conference. Much more talk aimed at developers and business outcomes rather than just focusing on widgets on a new platform.

It could be that because cloud technologies are newer to me, everything felt fresher and more forward-thinking than the other vendor conferences I have been to. Maybe I don’t have the battle scars, which enable some healthy skepticism yet, but I feel like diving into this universe of cloud technologies is reinvigorating my passion for the IT industry.

Another difference that struck me was the partner ecosystem. Walking around the expo floor, I saw plenty of the old-guard companies–Cisco, NetApp, Palo Alto, etc. But I saw way more companies that I had never heard of before. These companies are cloud-focused and cloud-born, solving cloud problems and filling cloud gaps for customers. I think this speaks to the different customers of cloud and the different pain points of cloud as compared to traditional data center operations.

I heard the phrase “undifferentiated heavy lifting” several times over the week-long conference. The first couple times it didn’t register, but I came to get it. What I came to realize is that AWS is just different from the traditional IT industry I have been a part of for the last 20 or so years. The effort is in a different place. For example, we may have a cool idea to implement some hot new technology in the data center. We spend months researching and testing, and then, during some middle-of-the-night maintenance window, we hope and pray that our solution is going to work. But usually, that infrastructure change is not noticeable to anyone except us in IT operations. AWS does that month-long planning and deployment of your infrastructure for you. The part that you spent a lot of time and effort on, several layers below adding value to the customer, is the starting point for AWS. You build on top of the heavy lifting AWS has done for you.

Andy Jassy, CEO of AWS, talked about the culture of builders. The people that use AWS are building cool technology on top of AWS. The goal of AWS is to eliminate that “undifferentiated heavy lifting” from the workforce and enable builders to focus on building.

During the keynote, Mark Okerstrom, the CEO of Expedia, came on stage to talk about how they are utilizing AWS. He said, “AWS is not just a data center replacement; AWS has services that make companies better.” That quote resonated greatly with me. We are not talking about simply moving existing workloads to AWS. To take advantage of everything a cloud solution offers, we need to understand those advantages. We need to not be afraid of scalability, of automating cloud infrastructure stacks, and of putting effort into monitoring security and performance. We need to think cloud native.

After a week in the desert, immersed in AWS’s version of the cloud, I really have come to believe AWS has differentiated itself. I am sure I will discover some cynicism about some of their services eventually, but right now I am buzzing from the possibilities.

They announced quite a few new and/or improved services over the course of the conference. Here are a few that really interested me:

  • DynamoDB Global Tables – This allows the AWS non-relational database to be deployed in a multi-master configuration across several regions, with asynchronous replication between multiple read-write nodes in geographically disparate areas. This brings a great deal of application redundancy and performance capability, in my opinion.

  • SageMaker – I’ll quickly admit that I don’t have deep knowledge of the machine learning space. The way it was explained to me, though, is that the SageMaker service enables quicker training and deployment of the data models associated with machine learning. An organization where machine learning was out of reach because of a manpower or knowledge deficiency now has an opportunity to accelerate a machine learning project.

  • Fargate – Managing and deploying containers without managing the underlying infrastructure. You don’t need to size the resources to allow your application to scale; Fargate manages that for you. (ECS is supported now, with Kubernetes in the form of EKS coming later this year.)

  • GuardDuty – Security analysis that looks at events across your account(s) for compromises or potential compromises.

  • PrivateLink – Allows customers to privately access SaaS services over Amazon’s backbone instead of over the internet. Current partners include Cisco Stealthwatch, CA Technologies App Experience, and Dynatrace.

My Vagrant Setup

I’ve seen and heard how Vagrant is a powerful tool for development environments. Vagrant is a tool from HashiCorp for managing virtual environments. I’d had it installed for a while but never really dug in far enough to understand the value. Well, like most things, I didn’t understand the value because I didn’t have a particular use case. Now I do…

I was having issues running multiple VMs on my MacBook Pro via Fusion. Namely, that if I want to run a CSR router, I don’t have the horsepower to also run other VMs at the same time. A coworker of mine had mentioned on an internal forum that he recommended using Vagrant for testing Ansible. I am just scratching the surface I believe, but this did solve my particular problem. I can spin up a VM using Vagrant to test my Ansible playbooks against the aforementioned CSR–all on my laptop.

I won’t cover the basic installation of Vagrant itself, but that information is available at the Vagrant website, as well as much more documentation: https://www.vagrantup.com

OK. So here is where I start customizing. When you initialize Vagrant in a directory, a file named Vagrantfile is created there, and it contains all the information about your Vagrant environment.

If you want to customize the different Ansible machines you can spin up, copy the Vagrantfile into a different directory per project.

For mine, I copied the Vagrantfile into /vagrant-ansible-example/.

I can now edit the Vagrantfile to customize this instance.

I can set which OS to spin up.

config.vm.box = "ubuntu/trusty64"

I can create a synced folder. I can have my Ansible or Python files on my desktop and mount that directory from within the Vagrant machine. This is useful for accessing playbooks or scripts that I am editing and want to run from within the VM.

config.vm.synced_folder "/Users/bob/Documents/Github", "/Github"

I can then customize my Vagrant machine by calling a bash startup script to install ansible, pip, and xlrd. (These are just examples; it could be any packages you need to install.)

  config.vm.provision "shell" do |s|
    s.path = "./setup.sh"
  end
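Putting those pieces together, a minimal sketch of the assembled Vagrantfile looks something like this (same paths as above):

Vagrant.configure("2") do |config|
  # Base box to boot
  config.vm.box = "ubuntu/trusty64"
  # Mount my working files inside the VM
  config.vm.synced_folder "/Users/bob/Documents/Github", "/Github"
  # Run the provisioning script on first boot
  config.vm.provision "shell" do |s|
    s.path = "./setup.sh"
  end
end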

Here is a snippet of what my setup script looks like.

#!/bin/bash

echo "setting up environment"
echo "installing ansible"
sudo apt-get install ansible -y
echo "installing pip"
sudo apt-get install python-pip -y
echo "installing xlrd"
sudo pip install xlrd

We spin up our machine from within the vagrant-ansible-example directory.

vagrant up

It boots up our OS and runs our startup script, then we can ssh to our machine and start testing.

The nice thing about Vagrant, I have found, is I can easily reset my machine back to this first known-good state, or change my startup parameters and spin up a new VM anytime.

I can do this by issuing a vagrant destroy command, which will wipe out any state configuration on my VM. Then I just issue vagrant up again and it spins up my VM like it was a brand new install.
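For reference, that reset cycle is just two commands:

vagrant destroy   # tear down the VM and its state
vagrant up        # rebuild it fresh from the Vagrantfile and setup.sh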

Networking in AWS – VPC Edition

I’ve been spending some time attempting to get more in-depth knowledge of Amazon Web Services. It is dead simple to start spinning up compute instances and S3 buckets. As I dive deeper, however, I have started to uncover some of the more complicated topics that a person or an organization would run into while beginning a “journey to the cloud.”

One of those areas is networking. Maybe I gravitate toward that area first since my background involves a heavy dose of traditional on-premises networking. I would imagine anyone in my shoes would have no problem grasping the concepts of VPCs, NAT gateways, and internet gateways. Still, there are a certain number of steps involved in customizing a VPC.

A Virtual Private Cloud (VPC) is a networking space in an AWS footprint. VPCs are unique to a region, meaning you can’t span a VPC between Virginia and Ireland, for example. The use cases for building your own customized VPCs include some or all of the following:

  • The ability to pick your IP addressing scheme per data center (could be important in building VPNs from a local data center to a VPC, or VPC-to-VPC peering–larger topics for another time)
  • The ability to separate networking space for different business units such as HR, Finance, IT (And the ability to apply separate, granular firewall controls through security groups and ACLs)
  • The ability to separate dev/test/prod environments from one another

There is a default VPC, which contains default subnets, and you can start deploying instances into this networking space right away. But you may want to take a more prescriptive approach to networking in AWS and build your own custom VPCs.

To create a new VPC:

  • Go to AWS Services, under “Networking and Content Delivery,” choose VPC
  • Go to “Your VPCs,” click the big blue button that says “Create VPC”
    • Give the VPC a name (I called mine Bob VPC) and a CIDR block (A /16 is the largest block you can create, you will create subnets from this block)

CreateVPC1
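If you prefer the AWS CLI, the console steps above map to something like the following (the IDs here and below are placeholders):

aws ec2 create-vpc --cidr-block 10.10.0.0/16
aws ec2 create-tags --resources vpc-0abc123 --tags Key=Name,Value="Bob VPC"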

Next, we need to create subnets within our VPC:

  • Go to “Subnets,” click “Create Subnet”
  • Give the subnet a friendly name (I chose the format of my VPC name plus the network address)
  • Choose which VPC it goes in (my Bob VPC)
  • Optionally specify the availability zone in which you want it to reside
  • Specify the CIDR block for the subnet (I chose /24s for my example)

CreateSubnet
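The CLI equivalent, again with placeholder IDs:

aws ec2 create-subnet --vpc-id vpc-0abc123 --cidr-block 10.10.1.0/24 --availability-zone us-east-1a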

In the details pane, you can see how many IP addresses are left, VPC membership, the network, and the subnet ID. AWS reserves the first four addresses and the last address in every subnet. I have created two new subnets, one for public instances and one for private instances.

SubnetPane_availableIP

By default, any subnet created in a custom VPC doesn’t assign a public IP to instances created within it. If you want instances in the subnet to automatically get a public IP, you can enable auto-assignment of those public IPs.

  • Choose which subnet (I chose my Bob 10.10.1.0/24 subnet to be my public-facing subnet)
  • From the dropdown on “Subnet Actions,” choose “Modify auto-assign IP settings”
  • Check the box for auto-assign

Create_auto_assign_public
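Or from the CLI (placeholder subnet ID):

aws ec2 modify-subnet-attribute --subnet-id subnet-0abc456 --map-public-ip-on-launch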

For the subnets with public IPs, we need to create a way for those instances to get to the internet. This is an Internet Gateway.

  • Go to “Internet Gateways”
  • Click “Create Internet Gateway”
  • Give the IGW a friendly name (I chose BobVPC IGW in this case)

Create_IGW

  • We then need to associate the IGW with the VPC
  • Choose your new IGW and click “Attach to VPC,” then select your VPC (Bob VPC)

AttachIGW_to_VPC
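The CLI sketch for both steps (placeholder IDs):

aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc789 --vpc-id vpc-0abc123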

OK, we have a VPC created, we have a subnet that auto-assigns public IPs to instances within it, we have an internet gateway. We now need to create a route for that subnet to get to the Internet Gateway.

  • Go to “Route Tables”
  • Click “Create Route Table,” let’s name it “BobVPC Public Route,” and make sure it is associated with the correct VPC

Create_PublicRouteTable

  • Our route table is created; now let’s create a route. Choose our route table
  • Click on the Routes tab and add a new route by clicking Edit
  • Click “Add another route.” Since this is a default route, use an all-0s destination (0.0.0.0/0)
  • Click in the target box and it should show you the options; choose the IGW you created, then click Save

Add_IGW_Route

  • Lastly, we need to associate the subnets with this route table; click on the Subnet Associations tab
  • Click Edit, choose the subnets you wish to route to the internet, and click Save

RouteTables_AssociateSubnets
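The whole route-table sequence from the CLI (placeholder IDs):

aws ec2 create-route-table --vpc-id vpc-0abc123
aws ec2 create-route --route-table-id rtb-0aaa111 --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0abc789
aws ec2 associate-route-table --route-table-id rtb-0aaa111 --subnet-id subnet-0abc456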

We can now create an EC2 instance. During the launch process, choose our new VPC and the subnet you want the instance to live in. Once that instance is launched into my “Bob VPC 10.10.1.0” subnet, it will get a public IP address and will be able to access the internet via our Internet Gateway.

EC2_Instance_VPCChoice

What if we don’t want our EC2 instance to get a public IP? We have our old friend NAT.

Much like we created an Internet Gateway for our public instance, we create a NAT Gateway for our private instance. The key here is that when creating this NAT Gateway, it needs to be associated with a public subnet.

  • Go to “NAT Gateways,” click “Create NAT Gateway”
  • Choose a subnet in which to place the NAT GW (this is where we need to choose our public subnet, “Bob VPC 10.10.1.0”)

NatGW_Subnet

  • We can have AWS automatically assign our external IP, which is called an Elastic IP

NATGW_Assign_EIP
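From the CLI, you allocate the Elastic IP first and hand it to the NAT Gateway (placeholder IDs):

aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway --subnet-id subnet-0abc456 --allocation-id eipalloc-0aaa222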

Next we’ll need to add a route to the NAT GW for your private subnets. You can either edit the default route table within your VPC or create a new route table. I am going to create a new route table.

  • Go to “Route Tables”
  • Click “Create Route Table,” let’s name it “Bob VPC NAT,” and make sure it is associated with the correct VPC
  • Let’s add a default route: click on the Routes tab and choose Edit
  • Add another route, with an all-0s destination again (0.0.0.0/0)
  • In the target, you should see the NAT Gateway you created; choose that and click Save

RouteTable_NAT_defaultRoute

  • Then we need to associate this route table with our private subnet; click on the Subnet Associations tab
  • Click Edit, then choose your private subnet and click Save

NAT_GW_RouteTable_SubnetAssociation
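And the CLI version of the private route table (placeholder IDs; subnet-0def789 stands in for the private subnet):

aws ec2 create-route-table --vpc-id vpc-0abc123
aws ec2 create-route --route-table-id rtb-0bbb333 --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0abc999
aws ec2 associate-route-table --route-table-id rtb-0bbb333 --subnet-id subnet-0def789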

Any new EC2 instances created in this subnet will have NAT access to the internet but will not have a public IP.

Like many of the services within AWS, there is a low barrier to entry for getting started, but once you get past the surface, there is a world of dragons. Beware.

Here is the link to the VPC user guide:

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html


DevOps is People (Or it’s not)

Google “what is devops” and you’ll be deluged by definitions that loosely land on some sort of collaboration between development groups and IT operations groups. These definitions rarely delve into what exactly those groups’ definitions are, because it is all nebulous. I have taken an interest in this little slice of the IT industry, and I had the chance to attend the recent devopsdays MSP here in Minneapolis.

As an old infrastructure guy, I definitely felt out of my element around this crew. I don’t have a dev background; I have done sysadmin work (ops), and only recently have I gotten back into some “dev” work around infrastructure automation. I put that dev in quotes because what I am doing is nothing like real application development, but I am familiarizing myself with the languages and tools used in that world; I just don’t live in it every day. My main focus over the past 15 years has been network engineering.

It really clicked for me at this conference that this devops scene is not really about tools, but processes. It’s about people in the sense that people make up a community and that community defines a culture. This community is in IT shops within organizations. A broader community of support and education is evolving around this culture. This conference was part of it.

There were many excellent talks, and as I said, not necessarily about tools, but about the culture of devops. The two that stood out to me were from Brian Liles and Pete Cheslock.

Brian gave a good overview of the state of “devops,” whatever that term means to anybody. One of the themes he touched on was the human element of working in IT. The concept of empathy as a tool is the realization that the longer you work in IT, the more closely empathy ties to success. Especially as a consultant like me, you need to attempt to understand the circumstances of the person sitting across the desk from you. Many of the speakers and conversations in the open group sessions kept coming back to empathy. I think this underlines what was most impressed upon me by this conference–people. The devops movement is about people. The community. Teams. The tools are almost secondary; it really is about culture.

Brian stressed “diversity of thought” in the enterprise as a valuable tool. He also gave a list of high-level areas that anyone working anywhere near the devops space should be familiar with.

  1. Linux – “Scripting isn’t necessarily automation.” Automation is “throw this thing over there and it just works; it complains when it doesn’t”
  2. Know networking
  3. Know how system services work – systemd, Docker
  4. Know the tools of the space
  5. The most important piece is empathy. Live through how your customer is seeing it…
  6. Continuous integration
  7. Monitoring (point in time and long-term trends) – alerts only go out when they are actionable (reduce noise)
  8. Logging

The second speaker that stood out to me was Pete Cheslock, mostly because he told a very compelling story. Even as a non-dev, I could relate to projects gone sideways due to changing requirements and unclear direction. Definitely worth your time.


I look forward to my continued immersion into this world. This event was a good way to jumpstart that journey.

ACI Config Backup via Python

The configuration backup/rollback tool in ACI is a great feature that has saved my ass many times.

Any time I start making changes, I go do a one-time backup. If I get too far down a wrong path, I can easily rollback to that snapshot to set the fabric right again.

One thing that always bugged me though, was the inability to give that backup a customizable, user-friendly name.

I discovered this week that if we trigger that backup via the Cobra SDK, we can assign any name we want. Here is the code, which is also available on my GitHub page.

#!/usr/bin/env python

# list of packages that should be imported for this code to work
import cobra.mit.access
import cobra.mit.request
import cobra.mit.session
import cobra.model.fabric
import cobra.model.pol
import cobra.model.config
import credentials
from cobra.internal.codec.xmlcodec import toXMLStr
import requests.packages.urllib3
requests.packages.urllib3.disable_warnings()

def take_backup():
    # Log in to the APIC
    ls = cobra.mit.session.LoginSession('https://' + credentials.ACI_login["ipaddr"],
                                        credentials.ACI_login["username"],
                                        credentials.ACI_login["password"])
    md = cobra.mit.access.MoDirectory(ls)
    md.login()

    # Build the object tree: policy universe -> fabric instance
    polUni = cobra.model.pol.Uni('')
    fabricInst = cobra.model.fabric.Inst(polUni)

    # One-time snapshot with a user-friendly name
    backup = cobra.model.config.ExportP(fabricInst, name="backup_created_with_python",
                                        snapshot="true", adminSt="triggered")

    # Commit the config request to trigger the backup
    c = cobra.mit.request.ConfigRequest()
    c.addMo(fabricInst)
    md.commit(c)

take_backup()

Palo Alto and NSX Configuration

VMware NSX offers a distributed firewall that applies security policy across your virtual environment and allows for what VMware has coined “micro-segmentation.” Additionally, you can add Palo Alto virtual firewalls to gain further visibility and Next-Gen Firewall capabilities. I recently deployed this in a lab environment and at a customer site. Here is an overview of my experience and understanding.

The Palo Alto documentation for inserting the PAN virtual appliance into the NSX data plane is pretty good. It gives you the step-by-step on “how” to configure it. The thing I find lacking in Palo Alto documentation is the “why.”

For all the qualms I have with Cisco, much of their documentation does a good job explaining the theory behind a feature or technology. I find that understanding the why makes the step-by-step configuration instructions more sensible.

That being said, I can expound on a few things in configuring PAN and NSX that are not explicitly mentioned in the documentation but came in handy for my understanding of what we are doing.

First, the basic overview of service insertion for this scenario.

We are going to establish a connection between Panorama and NSX manager. We are going to tell NSX what traffic to send to the PAN device for inspection. We are going to set up policies in PAN to do that inspection.

To deploy PAN with NSX, you’ll need a license for every ESXi host that is going to have a virtual firewall. You will also need the Palo Alto centralized management console, which is called Panorama. Lastly, you will need a web server that can host the .ovf and .vmdk files that make up the PAN virtual appliance.

An overview of the setup:

We are going to tell NSX manager where it can download the .OVF file for the virtual appliance and where to pull the license file. We are also going to define an IP pool for the management IP of the virtual appliances.

Once all that information is populated, in vCenter networking and security we pick on which hosts we want the virtual appliance installed. Once those are installed correctly, those will show up in Panorama.

We then create security groups in NSX using either dynamic or static membership. We create a distributed firewall rule in NSX to define which traffic is redirected to the Palo Alto appliances.

In Panorama, we define security policies based on the security groups we created in NSX.

This is an important point: the Palo Alto only inspects the traffic you redirect. You can also create exceptions in NSX to specifically define traffic you don’t want redirected. So you could have an any-any redirect rule but then define a smaller subset of traffic not to redirect.

My thinking is that if there is traffic you know you would block anyway, it doesn’t make sense to redirect that to the PAN to block, just block it in the distributed firewall in NSX.

One other thing that I ran into during installation is the pre- and post-rules in Panorama. When creating your security policies you have the option for pre- and post-rules. What does that mean?

Panorama only manages the policies that it creates. When the PAN NSX virtual appliance is installed, a local deny-all policy is installed with it. If you create a post-rule policy in Panorama, that policy will be placed after the local deny-all rule, so all traffic will be denied. When creating policies for the NSX virtual PAN, you must create pre-rule policies, which will be placed above the local rules.

prerules

It might be hard to see in the image, but there is a yellow line between the PAN-NSX policy and the default-deny policy. The PAN-NSX policy is the one created in Panorama as a pre-rule. The default-deny is the local policy.

default-deny


APIC Failure Testing

I had an interesting conversation the other day about failure of the APICs within an ACI fabric. This particular customer had been burned by another vendor’s fabric solution with regard to failure and upgrade scenarios. I mentioned that the APICs could all blow up and the data plane of the fabric would keep chugging along. The customer said, “I want to see it. I want to pull the power to the controllers and see it.”

I realized that I had been saying this scenario was possible for a while now, but I had never explicitly tested it. I figured I had better verify this behavior before I have my customer come out and pull the plug on my APICs.

Here is the setup. I have two EPGs on either side of an F5 LTM. This LTM is doing a simple round robin load balancing of some apache web servers. In my test, I am pinging the virtual server on the LTM. I also tested that I could get to the web pages served by the pool defined in the LTM.


slide1

I am advertising the bridge domain subnet (10.207.141.0/24) associated with F5 Outside EPG via an L3 Out.

I started a ping to all three controllers and to the VS on the LTM appliance (10.207.141.100). I could ping all four addresses, and I could also access web pages from servers in the Web EPG that are load balanced by the LTM.

I logged into the CIMC of all three controllers and did a reboot. At this point I lost pings to all three controllers, as expected. I did, however, keep my pings to the VS, and I was still able to access web pages.

pingtest_edit

Now I have proof that APIC failures do not affect data plane traffic. I can rest easy the next time a customer questions that statement.