Directoryless User Administration in AWS/IAM, Terraform and CI/CD

I just completed some work on a little project with unique requirements. It uses Terraform to provision infrastructure within AWS, which by itself isn’t terribly hard. The goal was to make the platform, infrastructure and code as reusable as possible while meeting customer-specific privacy and security requirements.

The requirements and curve balls were unique enough to make this project a little challenging:

  • Create and manage IAM users inside an AWS account.
  • Provision IAM roles inside subaccounts within the organization (or inside the main account if your use case is not as complex as this).
  • Provision sts-assume-role permissions on those roles based on group membership from an identity provider.

Sounds simple, right? Well, let’s add in the curve balls:

  • You cannot set sts-assume-role policies based on IAM group membership (this is an AWS limitation). You can do this with SAML and/or some kind of federated access, but in this case that was not available to us. We had to provide some way to do this without an IdP and only manage users inside IAM. If you’re using pure IAM, you can only provision users to assume roles on a user-by-user basis. Ick.
  • Do not hard-code the usernames or group membership inside the Terraform.
  • Make it work with a CI/CD deployment – this means you can’t use a local workstation tfvars file to define the users.
  • Treat the usernames and group memberships as sensitive information – which means they must be encrypted.

Setting up CI/CD to work with your Terraform deployment is outside the scope of this article. I’m only focusing on the little bits of code that I used to make this work. Let’s just assume that code that is pushed to your master branch is deployed to production within AWS.

How did we pull this off? AWS Systems Manager Parameter Store to the rescue. Parameter Store allows you to store simple key/value pairs. A value can be a String, StringList, or SecureString. A SecureString requires the use of a KMS key, so you’ll have to create a KMS key manually or through your Terraform code.
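If you’d rather manage the key with Terraform instead of creating it by hand, a minimal sketch (the resource and alias names here are my own) might look like this:

```hcl
# KMS key used to encrypt SecureString parameters in Parameter Store
resource "aws_kms_key" "parameter_store" {
  description             = "Encrypts SecureString values in Parameter Store"
  deletion_window_in_days = 7
}

resource "aws_kms_alias" "parameter_store" {
  name          = "alias/parameter-store"
  target_key_id = "${aws_kms_key.parameter_store.key_id}"
}
```

Whatever identity runs Terraform (your CI/CD execution role, for instance) needs kms:Decrypt on this key so it can read the SecureString parameters at plan/apply time.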

After the KMS key is created, set up your parameters. In my use case, I set up four parameters. The first parameter is a SecureString. It’s just a comma-delimited list of usernames you wish to create. Terraform can automagically decrypt the parameter store object through code, provided the user executing the code has access to use the key to decrypt the parameter store object. You can use the web console or AWS CLI to create this parameter store object and its value. You don’t want to create the parameter store object in your Terraform code, since one of the requirements was to NOT hard-code the user names or use tfvars.
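Creating the parameter from the AWS CLI looks something like this; the parameter name matches the data source below, but the key alias and usernames are placeholders:

```shell
aws ssm put-parameter \
  --name "iam-user-list" \
  --type "SecureString" \
  --key-id "alias/parameter-store" \
  --value "alice,bob,charlie"
```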

The Terraform code to read the value looks like this:

data "aws_ssm_parameter" "iam_user_list" {
  name = "iam-user-list"
}

All this does is set up a data source inside your Terraform that looks at the parameter store object and reads its value; you reference it elsewhere in your code as ${data.aws_ssm_parameter.iam_user_list.value}. Terraform will go out to AWS, find the parameter you supplied in the “name” argument and read it into memory. Now it’s available for use in other places.

Remember though, we supplied the value as a comma-delimited list. This is important because that’s where things get tricky.

First we have to create the users identified in that parameter store object. The best way to accomplish this is to use a local to split the comma-delimited value into a usable list, then loop through that list and create the users.

locals {
  user_list = ["${split(",", data.aws_ssm_parameter.iam_user_list.value)}"]
}

resource "aws_iam_user" "iam_users" {
  count = "${length(local.user_list)}"
  name = "${local.user_list[count.index]}"
}

Now if you run your Terraform code, you’ll end up with new IAM users named from the list you provided in the parameter store. Better yet, if you add or remove names in that string, Terraform will automatically adjust the next time you manually run the code or CI/CD executes it. Congratulations, you have basic user management! It may be even more useful to have a Lambda function run this routine on a schedule, but we didn’t do it that way for this particular use case.

This doesn’t set up the users with access keys, passwords or MFA devices. Sorry, that’s harder to do. For now I just handle that in the web console or CLI.
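If you do eventually want Terraform to handle console passwords, the provider has a resource for it, though it requires a PGP key so the generated password doesn’t land in your state file in plain text. A rough sketch (the keybase identity is a placeholder):

```hcl
# Console login profiles for each user created above
resource "aws_iam_user_login_profile" "iam_users" {
  count   = "${length(local.user_list)}"
  user    = "${element(aws_iam_user.iam_users.*.name, count.index)}"
  pgp_key = "keybase:some_person" # placeholder identity
}
```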

Next, let’s handle the really tricky part. Again, there’s no way to set up a group and use group membership to decide who should get an assume role permission. But that’s ok. We can handle this in a similar fashion. Build parameters that are similar to the iam_user_list parameter. Put a comma-delimited list of the users that should belong to the “group” in this parameter. Make sure the IAM users actually exist before you go further, because Terraform will get mad at you if you try to set up sts-assume-role policies for users that do not exist.

Just like before, set up a data object that reads your new parameter.

data "aws_ssm_parameter" "admin_iam_role_list" {
  name = "admin-iam-role-list"
}

This will expose the contents of that parameter to your Terraform template as: ${data.aws_ssm_parameter.admin_iam_role_list.value}. Apply the same locals trick as above and iterate through your list to build out the user ARNs that should be set in the assume-role permissions.

locals {
  admin_iam_role_list = ["${split(",", data.aws_ssm_parameter.admin_iam_role_list.value)}"]
}

resource "aws_iam_role" "admin_role" {
  name = "${var.admin_role_name}"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": ${jsonencode(formatlist("arn:aws:iam::%s:user/%s", var.aws_account_id, local.admin_iam_role_list))}
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

The next time your Terraform template runs, it will iterate through the comma-delimited list of users in your parameter store and add them to the sts:AssumeRole policy on your role. We’re actually using this in AWS subaccounts (using provider aliases) so that we can centrally manage IAM users in one AWS account while provisioning roles in other AWS accounts, managing the use of those roles like group membership.
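For the curious, the cross-account piece is just a provider alias that assumes a role in the subaccount; the account ID and role name here are placeholders:

```hcl
provider "aws" {
  alias  = "subaccount"
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::222222222222:role/terraform-provisioner" # placeholder
  }
}
```

Any resource that should land in the subaccount then gets provider = "aws.subaccount" in its resource block.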

There you have it: directoryless, basic IAM user and role management in Terraform with no additional infrastructure and a slightly more secure way of handling it. Best of all, your CI/CD pipeline provisions the same information as the developers who deploy the infrastructure.

How to fix an Elastic Beanstalk/RDS breakup

Did you create a multi-tier Elastic Beanstalk deployment? Did you tie it to CodePipeline to deploy out of Github? Has it been working well until just recently?

…did you accidentally leave RDS attached to your worker tier?

This post is for you.

I built an Elastic Beanstalk for a customer with those characteristics. It’s been working great for about a year, until suddenly… the developer of the application reports that he’s no longer able to deploy his code changes. It keeps failing and rolling back all of the changes to the last known good state, which includes older versions of his code. This was bad news for everyone because we had a Monday-morning deadline to demo code changes to a new customer.

Sunday morning offered me a chance to sit down and focus on this. I’d been trying to understand the problem for a few days, and it finally clicked after some quiet and coffee.

First, let’s cover what was actually happening. When the developer pushed his code updates through CodePipeline, Elastic Beanstalk was working through its “magic” (cough) to update the config to its “known good state” (which was wrong) and failed to apply the changes because of CloudFormation problems. This triggered a rollback on CloudFormation, CodePipeline, and Elastic Beanstalk config changes. Hence the failure.

How did it all get out of whack?

There were several mistakes committed, most of them on my part. Some of them are just problems with Elastic Beanstalk itself. But I’ll make the no-no list:

  1. Don’t let Elastic Beanstalk manage your RDS instance. Remove all references to RDS in all tiers before you build your RDS instance. Even AWS tells you not to do this. I missed the one in the worker tier.
  2. If you proceed forward with RDS tied to your EB, do NOT use the RDS console to make any changes to the RDS instance. EB won’t know about the changes and will get really angry when they don’t match. In our case, we did some performance testing and changed the RDS instance size from db.t2.micro to db.m4.large. We also changed the storage setting from 20 GB to 100 GB. We made those changes in the RDS console and not the EB console. Don’t do that.
  3. You should change one setting in the RDS console, though: turn off automatic minor version upgrades. In our case, RDS was upgrading the minor version of the database and once again, EB got angry. Worse yet, you can’t change the minor version in EB’s console. It’s locked. That’s EB’s fault. But whatever.
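Since the EB console won’t let you touch the minor version, disabling automatic minor version upgrades has to happen on the RDS side. From the CLI it’s something like this (the instance identifier is a placeholder):

```shell
aws rds modify-db-instance \
  --db-instance-identifier my-eb-database \
  --no-auto-minor-version-upgrade \
  --apply-immediately
```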

Those three items led to a huge bag of fail whenever our developer pushed changes. Elastic Beanstalk would initiate changes, but see that RDS’ configuration was out of whack from its understanding. It would fail and roll everything back.

But wait – there’s more!

Elastic Beanstalk was also using some very old CloudFormation to make changes to the RDS instance. It was still using DBSecurityGroups, which apparently can no longer be used… at least in our case. We were running PostgreSQL, minor version 9.6.6. It looks like the RDS team has moved on from DBSecurityGroups and now enforces the use of VPC security groups. Therefore, any change to RDS would completely fail with the error:

Updating RDS database named: <instance> failed Reason: DB Security Groups can no longer be associated with this DB Instance. Use VPC Security Groups instead.

Ouch.

How do you fix all of this mess?

Let’s go over how Elastic Beanstalk actually works. I’ll be describing some of the simple concepts that are covered in documentation on the AWS site. Bookmark it and keep it handy.

First things first. You need to understand that Elastic Beanstalk is really driven by a simple YAML file. This YAML file is specific to the “environment”, which is a child of the “application” in Elastic Beanstalk. This always confuses me because I think of an “environment” as being a place to put an “application,” but in Elastic Beanstalk it’s backwards from how I think. AWS has a pretty good document on how you can look at this YAML file and see what’s going on.

In this case, I was able to save the configuration as described in the AWS document. I then visited the S3 bucket and was able to see a few things that were making my life difficult. There was also a clue left in this document about how EB was driving changes to the RDS instance via CloudFormation. I knew this was happening. If you’re using Elastic Beanstalk, take a few minutes to go look at your CloudFormation console. You’ll see a template in there – one for each EB “environment” you have deployed. The top of your EB environment dashboard has an “environment ID” displayed in a very small font. This environment ID corresponds to the CloudFormation template ID in the CloudFormation console. You can see the nitty-gritty of what it’s trying to do in there.

But Elastic Beanstalk was coughing up some invalid CloudFormation. How do I know? That security group error was actually coming out of CloudFormation; I could see the error event in there. CloudFormation is the service that actually triggers the rollback. CloudFormation and RDS are enforcing the change away from DBSecurityGroups to VPC security groups, but when Elastic Beanstalk creates the CloudFormation template to initiate the change, it still uses DBSecurityGroups.

I used one troubleshooting session to manually fix the CloudFormation JSON that Elastic Beanstalk was spitting out. I pushed it through by hand and it worked. I made the changes to the security groups the way CloudFormation and RDS expect – however, if I initiated a change through Elastic Beanstalk or the developer pushed a code update, it would fail with invalid CloudFormation once again.

I’ll take a quick break to explain what’s happening here. When you make a change in Elastic Beanstalk, my new understanding is that the following happens:

  1. The Elastic Beanstalk console writes a new YAML config file to S3.
  2. Elastic Beanstalk parses the config file and decides what changes should be made.
  3. Elastic Beanstalk generates a CloudFormation JSON template and saves it to S3.
  4. Elastic Beanstalk pokes CloudFormation and asks it to update.
  5. CloudFormation updates. If a failure is encountered, it rolls back and tells Elastic Beanstalk that everything is hosed.
  6. Elastic Beanstalk rolls back the deployed version of code to a known good state.

Now I understand the root cause here. RDS made a change to enforce the security group update. Elastic Beanstalk can’t seem to figure that out.

Here’s how to resolve this.

Look at the AWS documentation on Elastic Beanstalk’s config above. Follow their steps to save the configuration file from the console. Then get your favorite code editor out, download the file, and manipulate it by hand.

I changed the RDS properties to reflect reality. EB still thought it was PostgreSQL 9.6.2 on a db.t2.micro with 20 GB of storage. I updated these properties to match reality.

Then, I saw it. At the bottom of the file, there is a block of YAML that tells Elastic Beanstalk where to pick up the CloudFormation JSON and feed parameters. The default value was:

Extensions:
  RDS.EBConsoleSnippet:
    Order: null
    SourceLocation: https://s3.amazonaws.com/elasticbeanstalk-env-resources-us-east-1/eb_snippets/rds/rds.json

Take a look at that URL. Go ahead. I’ll wait.

See it?

It’s the bad CloudFormation template.

How did I resolve this? Well, I took that template and downloaded it. I modified it in my code editor to change the DBSecurityGroup resources into VPC Security Group resources. I had to manually add the SecurityGroupIngress information too, but because I speak CloudFormation this wasn’t too hard. It’s cheating a little bit, but not a big deal.

I created a new S3 bucket and uploaded my new CloudFormation JSON template into that bucket. Then, I revisited this YAML config and changed the URL to point to my new private copy of the CloudFormation template.
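Nothing fancy on the bucket side; with the CLI it’s roughly this (the bucket name is a placeholder):

```shell
aws s3 mb s3://my-eb-snippets
aws s3 cp rds.json s3://my-eb-snippets/rds.json
```

Just make sure Elastic Beanstalk can read the object at the URL you put in the config.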

Go back to the Elastic Beanstalk console and “load” the configuration template and wham, it worked. Everything was fine.

Now I know how Elastic Beanstalk really works, and I figured out some super advanced ways to bend it to do my bidding.

I hope this helps you understand Elastic Beanstalk a little more – it certainly helped me. Now I know how to trick Elastic Beanstalk into working if it hoses up again.

Since it’s working again, turn off automatic minor version upgrades in RDS to prevent this from recurring, then use your AWS support plan to tell them that Elastic Beanstalk has a bug with CloudFormation and RDS security groups 🙂

Happy cloud days.

 

AWS GovCloud and CloudFormation

Be careful when you’re working with CloudFormation in the AWS GovCloud region. Almost every code snippet available on the Internet refers to the public regions of AWS. If you’re creating resources in GovCloud with CloudFormation templates, there are subtle differences.

For instance, referring to an S3 bucket in a code snippet is:

"Resource": { "Fn::Join" : ["", ["arn:aws:s3:::", { "Ref" : "myExampleBucket" } , "/*" ]]},

But if your bucket is in GovCloud, your arn is different:

"Resource": { "Fn::Join" : ["", ["arn:aws-us-gov:s3:::", { "Ref" : "myExampleBucket" } , "/*" ]]},
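If your template tooling supports them, CloudFormation’s Fn::Sub intrinsic and the AWS::Partition pseudo parameter let you avoid hard-coding the partition altogether, so the same template works in the public regions and GovCloud:

```json
"Resource": { "Fn::Sub": "arn:${AWS::Partition}:s3:::${myExampleBucket}/*" }
```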

Subtle things like that can make CloudFormation development a real hoot. Be careful.

Unpopular Opinion Post: Microsoft Azure is toast (as a public service)

I really think Microsoft Azure is screwed.

It’ll still be around to power Microsoft’s backend services, but as a public offering to compete against AWS… it’s toast.

Also… OneDrive… seriously, wtf?

The Tone of the WWDC Keynote

One thing I wanted to mention in my post about WWDC last night… did anyone feel that the tone of the overall keynote was different? It felt a little more relaxed and fun. It seems like Tim Cook has encouraged his staff to be more relaxed and at ease with what they are doing. There was more humor and more open honesty.

I think Tim is trying to strike a keen balance between old school Apple secrecy and a new humane approach to the work they are doing. I think he’s listening to the consumers about how things should be (iCloud Drive is a likely example of that).

I like Tim Cook. I like where he’s taking the company. All of you who keep crying about Apple’s lack of innovation need to look back at Microsoft’s record the past 20 years. Give Apple some time and they will surprise you. They like to lay a lot of foundation work before they spring a surprise on anyone. This WWDC was foundational. I expect a lot of interesting things this fall.

OS X 10.10 and iOS 8 Thoughts

I should be at WWDC 2014 this year, but I’m not. I work for a Microsoft-centric shoppe right now and they just don’t see the value in it. Nevertheless, I put my name in for the lottery and I didn’t win anyway.

I watched most of the keynote from afar and parts of the State of the Union address. All of it is ultra exciting. If they get Continuity, iCloud Drive (FINALLY OMG) and Messages right, this will be a killer OS combo with iPhones, iPads and Macs.

There’s a plethora of articles out there explaining what’s up. I highly recommend Anandtech write-ups in almost every scenario.

Also, I’m really interested in Swift, the new programming language. It seems quite deft. Too bad I’m not as proficient with Obj-C as I wanna be yet. When they announced the new language, my first thought was of all the people who make iOS and Mac programming courses, groaning about having to remake them… or, more likely, excited that they get to sell another round of these things for a windfall of cash.

There’s great things to come in the world of Apple. I’m looking forward to seeing where the home automation stuff goes too.

Who cares about an iWatch and TV? Whatever.

Microsoft (and Paul Thurrott) says Windows 8 sucks

Last week, Hell froze over in one of the deepest freezes in the history of the United States.

This week, Paul Thurrott finally speaks the truth about Windows 8/8.1. It’s not pretty.

“Threshold” to be Called Windows 9, Ship in April 2015 | Windows 8 content from Paul Thurrott’s SuperSite for Windows

It’s going to be a very interesting 2014 in the tech world.

Speaking of which, I hope you’re having a fantastic start to this new year. I need to get back to blogging and updating my websites.

Could a Bug be Deliberately Coded into an Open Source Project for Financial Gain?

For some bizarre reason, the thought at the top of my head last night at bedtime was… “I wonder if sometimes… open source developers deliberately code bugs or withhold fixes for financial gain?”

If you don’t follow what I mean, here’s where I was: often, large corporations or benefactors will offer a code-fix bounty or development funding for an open source project they have come to rely upon.  What if an open source developer were to deliberately code a bug into an open source project, or withhold a fix, so they might extract some financial support this way?

I brought it up in #morphix to Gandalfar, one of my trusted open source advisors.  We debated it shortly and he brought up several good points.  While this may happen, the scheme is likely to fall apart quickly.  The community is the resolver of situations like this.  If the community finds a bug and offers a fix for the problem, then the developer will find themselves in a political combat situation.  They would likely try to stifle the fix with some ridiculous excuses and/or start to censor discussion of the subject over mailing lists or on forums.  Speculation could be raised about the issue and ultimately, people could start to fork the project elsewhere, unless the license of the project disallows that.  In the long run, the community would resolve the situation by simply offering a new solution.

So while it could theoretically be achieved for short-term gain, in the long run the community makes the approach unsustainable.

Why do I bring this up?  Well, I think we all know that closed source entities often engage in this practice.  I could point out several examples that I have absolute knowledge of this happening, but I don’t think I have to.  I’m not completely absolving open source from this either – look at what “official distributions” do in some situations… Red Hat Enterprise Linux or Novell (SUSE) for example.  But in those situations, if you didn’t want to pay to upgrade the operating system and still resolve your situation, we all know that with the right application of effort and skill you could overcome it.

All in all, this whole thought process ends up with a positive note about open source.  If it’s broken, you can fix it yourself or work with others to make it happen.  The community – that incredibly large, global groupthink – keeps it all honest.

Or, you can put all your money and eggs into a closed source basket and find out you’re getting screwed when it’s too late.

It’s all about choice, right?


iPhone to be allowed on other carriers?

I really can’t believe this hasn’t been pointed out before… so I guess I’ll do the dirty work and try to fan the flames of rumor.

This started when I read Paul Thurrott’s latest blog post, with which I could not agree more.

Then, I decided it’s time to blog about this and see if anyone had noticed:

[Image: iphone_tech_specs.jpg]

Why would you have a SIM ejector tool in the 3G box if they didn’t intend you to use it? The EDGE model has a SIM ejector hole, but there’s no included tool that I can recall…

Can I start a rumor? We’ll see.

The iPhone Earthquake

Once again, the iPhone rules the press with a heavy dollop of enticing news.

There’s a lot here on the surface and a lot below the surface. Let’s scratch the surface first.

The announcements about Apple licensing ActiveSync are interesting. There was lots of speculation in this regard and greetz to those who called it. I myself lost a bet. I was thinking that Apple might actually thumb their nose at ActiveSync and employ webdav for Exchange 2003 (much like Entourage) or web services for Exchange 2007. Of course, that would not be a quick route to policy controls on the device itself (i.e. remote kill), so ActiveSync makes the most business sense both in time and money. It’s a good investment. I was just hoping they wouldn’t just… well, because.

But they did. Let’s analyze what this brings:

– Sync with email (effectively push email, but it’s not TRULY push email… ActiveSync, even on Windows Mobile, IS NOT PUSH EMAIL. It just appears that way).

– Sync with contacts

– Sync with calendars

– NOTICEABLY ABSENT: sync with tasks

– Policy control over the device. The You Had Me At EHLO blog states that this is about at the Exchange 2003 SP2 level of device control, which means it’s not as feature-rich as the BlackBerry, but it’s a good starting point.

Other items of note for enterprises:

– Cisco IPSEC and VPN clients

– Two-factor authentication

What’s missing? Well, you saw me point out that task syncing is missing… Merlin Mann is likely pissing himself right about now over that. But I noticed today that there were no federal government folks present and… here’s the bad news for those federal workers… Jobs never mentioned encryption of data at rest. Thanks to an OMB directive, encryption of data at rest is a requirement for a mobile device on a federal government network. Guess what device is the only one to meet that requirement?

If you’re thinking of a berry in the color of night, you’d be right.

You’d also be right if you’re thinking of the next version of Windows Mobile… 6.1, I believe they call it. Last I remember, that also had encryption of data at rest.

So unfortunately, this may leave the iPhone out of the federal government networks for a little while longer. Perhaps it’s an oversight that it wasn’t mentioned – but I’m betting that it was left out deliberately.

All in all, I wasn’t crazy about the iPhone before but I certainly am now. The fact that they’ve really turned it into a platform with an ecosystem makes this very, very exciting. One of the challenges of the OS X platform was the lack of an ecosystem. Now with OS X advances, the freely-available Xcode and now the freely-available iPhone SDK, Apple stands to really rock the world with an ecosystem that could quickly rival Microsoft.

To make sure they’re shaking things up, there’s that iFund thing. What a fantastic idea. Folks, when was the last time Microsoft paid you to develop applications for their platform? If you want to get into the Microsoft developmental mafia, you’re likely looking at an MSDN subscription ($2500 or so the first year, $1500 each year afterwards… PER SEAT!)… you’re looking at heavy software licensing costs and hell, they don’t even distribute the application or updates for you.

Apple is not only making the price of entry into their ecosystem dirt cheap ($99), the development software is free and they will distribute your applications/updates. Folks, this is a hell of a deal and I’m betting there are small businesses and garage developers everywhere getting excited about this.

I really, really think Microsoft is in trouble on many fronts. It’s going to be hard to stop this kind of excitement. I don’t even intend to develop apps for the iPhone or the Mac and I’m excited.

Truly, there was an earthquake today in California. It may have been a subtle earthquake for some, but I felt it quite strongly here on the other side of the states. I’m excited about computing again – and that’s something to cheer about.
