Last week, I sat for the AWS DevOps Professional certification. I took the 2018 version of the test, because it’s still in rotation until sometime in February. It was an interesting, grueling slog of a test… as AWS tests usually are. It wasn’t as difficult as the SA Pro, though.
You may have figured out that the site has changed once again.
I did this once in 2016. I tried to move the site to Hugo. But I was frustrated with the need to run an EC2 instance.
No longer.
I crafted a Terraform module that creates a full AWS publishing CodePipeline for a Hugo site to an S3 bucket.
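The heavy lifting in a pipeline like that is a CodeBuild step that renders the site and syncs it to S3. As a rough sketch of what that step runs - and only a sketch, since the bucket name and Hugo version below are placeholders rather than what the module actually ships - the buildspec looks something like this:

version: 0.2
phases:
  install:
    commands:
      # grab a pinned Hugo release; the version here is just an example
      - curl -sL https://github.com/gohugoio/hugo/releases/download/v0.55.6/hugo_0.55.6_Linux-64bit.tar.gz | tar xz -C /usr/local/bin hugo
  build:
    commands:
      # render the site, then publish it to the bucket the pipeline targets
      - hugo
      - aws s3 sync public/ s3://my-hugo-site-bucket/ --delete

The CodeBuild role just needs permission to write to that bucket.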
Clever laziness.
Bye bye, Wordpress.
I just completed some work on a little project with some unique requirements. It’s a project that uses Terraform to provision infrastructure within AWS. That’s not too terribly hard. We’re trying to make the platform, infrastructure and code as reusable as possible while maintaining customer-specific privacy and security requirements.
Did you create a multi-tier Elastic Beanstalk deployment? Did you tie it to CodePipeline to deploy out of Github? Has it been working well until just recently?
…did you accidentally leave RDS attached to your worker tier?
This post is for you.
I built an Elastic Beanstalk environment for a customer with those characteristics. It worked great for about a year, until suddenly… the developer of the application reported that he was no longer able to deploy his code changes. Every deploy failed and rolled everything back to the last known good state, which included older versions of his code. This was bad news for everyone because we had a Monday-morning deadline to demo code changes to a new customer.
Sunday morning offered me a chance to sit down and focus on this. I had been trying to understand the problem for a few days, and with some quiet and coffee I was finally able to get to the bottom of it.
First, let’s cover what was actually happening. When the developer pushed his code updates through CodePipeline, Elastic Beanstalk worked through its “magic” (cough) to update the config to its “known good state” (which was wrong) and failed to apply the changes because of CloudFormation problems. That triggered a rollback of the CloudFormation stack, the CodePipeline deployment, and the Elastic Beanstalk config changes. Hence the failure.
How did it all get out of whack?
Several mistakes were committed, most of them on my part. Some of them are just problems with Elastic Beanstalk itself. Here’s the no-no list:
Don’t let Elastic Beanstalk manage your RDS instance. Remove all references to RDS in all tiers before you build your RDS instance. Even AWS tells you not to do this. I missed the one in the worker tier.
If you proceed with RDS tied to your EB environment, do NOT use the RDS console to make any changes to the RDS instance. EB won’t know about the changes and will get really angry when they don’t match. In our case, we did some performance testing and changed the instance size from db.t2.micro to db.m4.large, and the storage from 20 GB to 100 GB. We made those changes in the RDS console and not the EB console. Don’t do that.
You should change one setting in the RDS console, though: turn off automatic minor version upgrade. In our case, RDS was upgrading the minor version of the database and, once again, EB got angry. Worse yet, you can’t change the minor version in EB’s console. It’s locked. That’s EB’s fault. But whatever.
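If you take the advice in the first item and manage RDS yourself, outside of Elastic Beanstalk entirely, a minimal standalone definition might look something like this. This is a sketch, not anything pulled from the actual environment - the names, sizes, and security group ID are placeholders:

Parameters:
  DBPassword:
    Type: String
    NoEcho: true
Resources:
  AppDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot          # keep the data if the stack ever goes away
    Properties:
      Engine: postgres
      EngineVersion: '9.6.6'
      DBInstanceClass: db.m4.large
      AllocatedStorage: '100'
      MasterUsername: appuser
      MasterUserPassword: !Ref DBPassword
      AutoMinorVersionUpgrade: false  # item 3: no surprise minor version bumps
      VPCSecurityGroups:              # VPC security groups, not DBSecurityGroups
        - sg-0123456789abcdef0        # placeholder security group ID

Decoupling the database this way means EB never has to reconcile (or roll back) RDS changes at all.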
Those three items led to a huge bag of fail whenever our developer pushed changes. Elastic Beanstalk would initiate changes, see that RDS’s configuration didn’t match its own understanding, fail, and roll everything back.
But wait - there’s more!
Elastic Beanstalk was also using some very old CloudFormation to make changes to the RDS instance. It was still using DBSecurityGroups, which apparently is illegal to use now… at least in our case. We were running Postgres 9.6.6, and it looks like the RDS team has moved on from DBSecurityGroups and now enforces the use of VPC security groups. Therefore, any change to RDS would completely fail with the error:
Updating RDS database named: <instance> failed Reason: DB Security Groups can no longer be associated with this DB Instance. Use VPC Security Groups instead.
Ouch.
How do you fix all of this mess?
Let’s go over how Elastic Beanstalk actually works. I’ll be describing some of the simple concepts that are covered in documentation on the AWS site. Bookmark it and keep it handy.
First things first. You need to understand that Elastic Beanstalk is really driven by a simple YAML file. This YAML file is specific to the “environment”, which is a child of the “Application” in Elastic Beanstalk. This always confuses me because I think of an “Environment” as being a place to put an “Application,” but Elastic Beanstalk has it backwards from how I think. AWS has a pretty good document on how you can look at this YAML file and see what’s going on.
In this case, I saved the configuration as described in the AWS document. I then visited the S3 bucket and saw a few things that were making my life difficult. There was also a clue left in this document about how EB was driving changes to the RDS instance via CloudFormation. I knew this was happening. If you’re using Elastic Beanstalk, take a few minutes to go look at your CloudFormation console. You’ll see a stack in there - one for each EB “environment” you have deployed. The top of your EB environment dashboard displays an “environment ID” in a very small font. That environment ID corresponds to the stack name in the CloudFormation console. You can see the nitty-gritty of what it’s trying to do in there.
But Elastic Beanstalk was coughing up some invalid CloudFormation. How do I know? That security group error was actually coming out of CloudFormation - I could see the error event in there. CloudFormation is the service that actually triggers the rollback. CloudFormation and RDS are enforcing the change away from DBSecurityGroups to VPCSecurityGroups, but when Elastic Beanstalk generates the CloudFormation template to initiate the change, it still uses DBSecurityGroups.
I spent one troubleshooting session manually fixing the CloudFormation JSON that Elastic Beanstalk was spitting out. I pushed it through by hand and it worked. I made the changes to the security groups in the way that CloudFormation and RDS expect - however, whenever I initiated a change through Elastic Beanstalk or the developer pushed a code update, it would fail with invalid CloudFormation once again.
Let me take a quick detour and break down what’s happening here. When you make a change in Elastic Beanstalk, my new understanding is that this happens:
1. The Elastic Beanstalk console writes a new YAML config file to S3.
2. Elastic Beanstalk parses the config file and decides what changes should be made.
3. Elastic Beanstalk generates a CloudFormation JSON template.
4. Elastic Beanstalk saves the CloudFormation JSON to S3.
5. Elastic Beanstalk pokes CloudFormation and asks it to update.
6. CloudFormation updates. If a failure is encountered, it rolls back and tells Elastic Beanstalk that everything is hosed.
7. Elastic Beanstalk rolls back the deployed version of the code to a known good state.
Now I understand the root cause here. RDS made a change to enforce the security group update. Elastic Beanstalk can’t seem to figure that out.
Here’s how to resolve this.
Look at the AWS documentation on Elastic Beanstalk’s config above. Follow their steps to save the configuration file from the console. Then get your favorite code editor out, download the file, and manipulate it by hand.
I changed the RDS properties to reflect reality. EB still thought it was running Postgres 9.6.2 on a db.t2.micro with 20 GB of storage, so I updated those properties to match what was actually deployed.
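In the saved configuration, those values live under OptionSettings in the aws:rds:dbinstance namespace. Here’s a sketch of roughly what the corrected block looked like (the rest of the file is omitted):

OptionSettings:
  aws:rds:dbinstance:
    DBEngine: postgres
    DBEngineVersion: 9.6.6         # the stale config said 9.6.2
    DBInstanceClass: db.m4.large   # the stale config said db.t2.micro
    DBAllocatedStorage: '100'      # the stale config said 20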
Then I saw it. At the bottom of the file, there is a block of YAML that tells Elastic Beanstalk where to pick up the CloudFormation JSON snippet and what parameters to feed it. The default value was:
Extensions:
  RDS.EBConsoleSnippet:
    Order: null
    SourceLocation: https://s3.amazonaws.com/elasticbeanstalk-env-resources-us-east-1/eb_snippets/rds/rds.json
Take a look at that URL. Go ahead. I’ll wait.
See it?
It’s the bad CloudFormation template.
How did I resolve this? I downloaded that template and modified it in my code editor, changing the DBSecurityGroup resources into VPC security group resources. I had to manually add the SecurityGroupIngress information too, but because I speak CloudFormation, this wasn’t too hard. It’s cheating a little bit, but not a big deal.
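To illustrate the shape of that change, here’s a sketch written as CloudFormation YAML for readability (the real file is JSON). The resource and parameter names are mine, not copied from AWS’s rds.json, and the port assumes Postgres:

Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  InstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup::Id   # the environment's EC2 security group
  DBPassword:
    Type: String
    NoEcho: true
Resources:
  DBVpcSecurityGroup:                   # replaces the old AWS::RDS::DBSecurityGroup
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Postgres access from the environment's instances
      VpcId: !Ref VpcId
      SecurityGroupIngress:             # the ingress I had to add by hand
        - IpProtocol: tcp
          FromPort: 5432
          ToPort: 5432
          SourceSecurityGroupId: !Ref InstanceSecurityGroup
  Database:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: postgres
      EngineVersion: '9.6.6'
      DBInstanceClass: db.m4.large
      AllocatedStorage: '100'
      MasterUsername: appuser
      MasterUserPassword: !Ref DBPassword
      VPCSecurityGroups:                # DBSecurityGroups is gone entirely
        - !GetAtt DBVpcSecurityGroup.GroupId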
I created a new S3 bucket and uploaded my new CloudFormation JSON template into that bucket. Then, I revisited this YAML config and changed the URL to point to my new private copy of the CloudFormation template.
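The edited block ended up looking like this, with a placeholder bucket and key standing in for my actual private copy:

Extensions:
  RDS.EBConsoleSnippet:
    Order: null
    SourceLocation: https://s3.amazonaws.com/my-private-eb-snippets/rds-vpc.json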
I went back to the Elastic Beanstalk console, “loaded” the modified configuration, and wham, it worked. Everything was fine.
Now I know how Elastic Beanstalk really works, and I figured out some super advanced ways to make it do my bidding.
I hope this helps you understand Elastic Beanstalk a little more - it certainly helped me. Now I know how to trick Elastic Beanstalk into working if it hoses up again.
Since it’s working now, turn off automatic minor version upgrade in RDS to prevent this from happening again, then use your AWS support plan to tell them that Elastic Beanstalk has a bug with CloudFormation and RDS security groups :)
Happy cloud days.
DFARS.
If you don’t know what that is, look it up. I’m not going to go into it in this article. I only want to discuss the ramifications of DFARS and how it’s being interpreted/implemented.
Every federal contractor company I’ve worked for has a “matrixed” business model. This means in order to save money, they will employ you on a single federal contract - but “leverage your expertise” on other federal contracts. The end result of this is that you’ll end up working on multiple projects across multiple agencies. Because federal agencies refuse to get along and agree on standards, this means you get to go through multiple clearances and obtain multiple credentials (i.e. CAC or PIV cards and usernames/passwords).
This is a little disingenuous on the part of the contractor company. It’s been my experience that they will tie you to a single contract and then matrix you to others. But if the funding lapses on the primary contract, they’ll show you the door. Valuable employees are kept, but others at a lower level (and still matrixed!) get laid off.
That’s another issue that is between you and your company.
Anyway… DFARS. The way companies and agencies are interpreting DFARS is the subject of this article. Basically, if you’re a matrixed employee, the end result is that you will end up with one laptop and one mobile device per project.
That’s right.
If you’re matrixed across three different projects, you will end up with three laptops and three different mobile devices. None of these devices will be allowed to communicate with the other agencies. Your company will likely issue a company-specific laptop and mobile device as well. In my case, that could mean four separate devices just to do my work.
That sounds reasonable, but it’s woefully ignorant of how a matrixed employee does business. Every agency expects the employee to be devoted to their contract, even if they are on record as having only a slice of time. The agency/customer expects that employee to be available at any time… not just during certain hours of the day.
The end result is that the matrixed employee is expected to manage multiple meeting requests across multiple devices without a single integrated view of meeting and work conflicts. This means the employee will miss meetings, emails and lort knows what else.
I predict this will be rolled back within a few years.
It’s untenable.
Me? I’m going to set “out of office” replies that notify senders that I only check my email and calendar during certain parts of the day. They’ll receive that autoreply every time they email me. Sure, I could set it to reply just once a day.
I wouldn’t want to like… be annoying, or something.
Love someone in the way they need, not in the way you think they need.
Love is not selfish.
If your love needs match theirs, that’s when magic happens.
If you believe your mate’s needs are stupid, unreasonable, or anything other than something you’re willing to do…
…then you don’t deserve that person.
I just got a note that Amazon API Gateway is now available in AWS GovCloud. This makes things more interesting for GovCloud for sure, but it’s just a minor stepping stone. Remember, just because it’s in GovCloud doesn’t mean it’s FedRAMP’d (even though it probably is).
Stephen King… I used to love you. Your legends themselves were the stuff of legend. Not a single week went by without someone bringing up Danny boy, the little Gage that got hit by a truck and dragged down the street… the murderous supernatural car that ran people down just because… the girl wearing pig’s blood at the prom… and, one of my beloved favorites, the nurse that likes to crack ankles with sledgehammers… and we all learned what rabies REALLY does to a massive domesticated pet. Your source material is pretty much all the same, but with just enough schtick to keep us coming back. It was much of the same characters, much of the same town, much of the same dialog full of dialects…. but we still loved it.
Producers loved you because they mined your source material for decades. Sometimes they twisted it to fit their own agenda or make it “audience-friendly” (cough cough… who in their right mind thinks horror is audience-friendly? That’s the whole fucking point… it’s NOT).
You even tried your own hand at directing. I remember being aghast at your choice to include AC/DC as the soundtrack to Maximum Overdrive. I think you enjoyed it, but frankly, AC/DC doesn’t scare me. I marveled at your strange choices to augment your horror with unintentional slapstick. You knew you weren’t that awesome at directing and you pulled out. That’s fine. Let others collaborate with you and make it better.
My point is… you have had decades upon decades to build a massive fan base and production credibility. You could do almost anything you wanted and the sheep will follow. There isn’t a bookshelf in the country that doesn’t bear your name, and yes I’m talking about every individual household that owns a book.
You spent the time to create and write a phenomenal series of books that I have yet to finish… and may never finish (but that’s ok), and you successfully blended science fiction, fantasy and horror into one single twisted modern masterpiece that people hold in as high regard as Tolkien.
The time came for you to put together The Dark Tower. Now was the perfect time for you to cash in your accrued credibility and satisfy your fan base. Now was the time to seal your fate in the annals of pop culture. We heard the movie was coming out. We heard Idris Elba and Matthew McCoughnaheygirlwhatchauptoo were cast. There were debates. There were rages. There were wadded panties. But we all held our breath.
The trailer came out. We exhaled slightly. The trailer did wonders for our anticipation, just as a trailer should.
We all opened our mouths and waited.
You walked up, unzipped your pants, and pissed in them.
95 minutes. Out of an entire series of books and two gifted actors with fantastic star power, you gave us 95 fucking minutes.
Somehow, you let the director… the producers… the studio… someone… decide… that this sprawling horror fantasy with blood, sex and gore… should be distilled to PG-13.
In the age of Game of Thrones… Westworld… Lost… and countless other serial dramas that have overtaken our lives (THANK GOD GOODBYE REALITY SHOWS)… you… the creator of this massive, proud work… YOU… let them do this to us.
YOU, kind sir, are solely responsible for this reprehensible decision.
I don’t give a fuck if you think the studios did it. I don’t fucking care if you think the director was going to make the right choices. You did this. You should have stayed involved with your work closely enough to make sure the RIGHT DECISIONS WERE MADE.
THEY WEREN’T MADE, STEPHEN. THE RIGHT DECISIONS WERE NOT MADE.
You cashed out decades of good will and fan base on 95 minutes and a PG-13 rating that was created so a horned helmet wearing priest could pull the heart out of a sacrificial victim. You let them pick a rating that explicitly allows the use of one occurrence of the f-word.
FOR THE DARK TOWER.
I will never be able to fully register my discontent over this. The least I can do is avoid giving you my ticket money. I’ll do my best to wait and see if it comes up on HBO or something I can watch with my existing subscription. Or maybe I’ll pirate it. But I’m not sure I’ll even waste my valuable time downloading it.
The real horror story here is how a single man, full of arrogance and pride, singlehandedly murdered an entire fan base in the span of 95 minutes.
I’m so disappointed.
Idiot.
If you hired a cloud consultant who heartily recommends a “lift and shift” migration and assures you that everything will be fine…
Fire them.
It won’t be fine.
I must really be out of the loop. I had no idea Microsoft bought SwiftKey. Anyway, they are killing the Windows Phone keyboard for iOS and focusing exclusively on SwiftKey.
When Microsoft does things that makes sense, I’m always surprised. When they do things that do not make sense (like beefing Skype for the iPhone) I am rarely surprised.
Microsoft’s Windows Phone keyboard for the iPhone is dead - The Verge