re:Invent 2019 Day 4

By day 4, my attention span and body were starting to give out. I really needed to hang in there, though, because this was the night of re:Play, the big party. I always enjoy going to this party to see Jen Lasher play. She is super energetic, talented at mixing beats, and really fun to watch. Her enthusiasm is infectious.

She did not disappoint at all.

I knocked out one session of notes - the other sessions I attended did not garner anything interesting to write down. Read on for the notes on architecting serverless apps at scale.

SVS407: Architecting and operating resilient serverless apps at scale

Lessons learned, under the hood, techniques for serverless architecting

Overload

Bank teller analogy used throughout presentation.

  • You can parallelize a system to the point where contention becomes the bottleneck

  • Servers are too optimistic

** Clients time out & retry -> creates a cascading failure (brownout)

** Try to never get into this scenario

** Cheaply reject excess work. Prefer to reject requests.

** Servers do not know when clients have timed out, causing wasted work

** Server-side timeouts in Lambda - use them to minimize server waste

*** What if a client sends an expensive request?

*** Bounded work - input size validation & checkpointing
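To make the "cheaply reject" and "bounded work" points concrete for myself, here is a minimal sketch in Python. It is not from the session: the limits (`MAX_ITEMS`, `MAX_IN_FLIGHT`, `DEADLINE_SECONDS`) and the `handle` function are names I made up, and a real Lambda function would lean on the configured function timeout rather than a hand-rolled deadline.

```python
import time

# All three limits are assumed values for illustration only.
MAX_ITEMS = 1000          # input size validation: largest request we accept
MAX_IN_FLIGHT = 8         # beyond this, reject instead of queueing
DEADLINE_SECONDS = 2.0    # server-side time budget per request


class Rejected(Exception):
    """Raised before any expensive work has been done."""


def handle(items, in_flight, process_item, now=time.monotonic):
    # The cheap checks come first, so rejecting costs almost nothing.
    if in_flight >= MAX_IN_FLIGHT:
        raise Rejected("overloaded, try again later")
    if len(items) > MAX_ITEMS:
        raise Rejected("input too large")

    deadline = now() + DEADLINE_SECONDS
    done = 0
    for item in items:
        # Stop once the client has probably given up, to avoid wasted work.
        if now() >= deadline:
            break
        process_item(item)
        done += 1
    # Checkpoint: the caller can resume from items[done:].
    return done
```

The return value acts as the checkpoint: if it is smaller than `len(items)`, the remaining items can be submitted as a new, smaller request.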

  • Lambda execution environment - each request gets its own execution environment

  • More bank analogies: database vs. caching

  • Dependency isolation in Lambda

** Cold invoke vs. warm invoke. A cold invoke goes all the way to the placement manager, which has to spin up an execution environment, so it is slightly slower. A warm invoke skips that extra step and reuses an existing execution environment.

** Warm invoke > cold invoke
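A small sketch of how the cold/warm difference shows up in function code, assuming the usual Python handler shape `handler(event, context)`. Module-level code runs once when the execution environment is created, and later invokes that land on the same environment reuse that state. The counter and the returned fields are my own illustration, not something shown in the session.

```python
import time

# Module-level code runs once per execution environment (the cold invoke).
# This is where expensive setup such as clients or connections would go.
_ENVIRONMENT_STARTED_AT = time.time()
_invocations = 0


def handler(event, context):
    global _invocations
    _invocations += 1
    return {
        # True only for the first invoke served by this environment.
        "cold": _invocations == 1,
        "invocations": _invocations,
        "environment_age_seconds": time.time() - _ENVIRONMENT_STARTED_AT,
    }
```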

  • Queue backlogs are bad

** FIFO vs LIFO. Make FIFO behave like LIFO. Use 2 SQS queues: high priority vs low priority. In a chat app, all requests are high priority. Put some messages into low priority.

** Use TTL and drop expired messages

** Use an SQS dead-letter queue as a Lambda timeout (new)
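Here is one way I read the two-queue and TTL advice, sketched with in-memory deques standing in for the two SQS queues. The thresholds (`FRESH_SECONDS`, `TTL_SECONDS`), the `sent_at` field, and both function names are assumptions of mine. The idea is that fresh messages are served first, stale ones are demoted to the low-priority queue, and expired ones are dropped.

```python
from collections import deque

# Assumed thresholds, for illustration only.
FRESH_SECONDS = 5.0    # newer than this: high priority
TTL_SECONDS = 60.0     # older than this: not worth processing


def route(message, now, high, low):
    """Place a message on the high or low priority queue, or drop it."""
    age = now - message["sent_at"]
    if age > TTL_SECONDS:
        return "dropped"
    if age > FRESH_SECONDS:
        low.append(message)
        return "low"
    high.append(message)
    return "high"


def next_message(now, high, low):
    """Drain high priority first and skip anything that expired while queued."""
    for queue in (high, low):
        while queue:
            message = queue.popleft()
            if now - message["sent_at"] <= TTL_SECONDS:
                return message
    return None
```

During a backlog, the newest traffic keeps flowing through the high-priority queue while the old backlog drains (or expires) in the background, which is what makes the FIFO setup feel closer to LIFO.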

  • Use polling

** But polling is not free

  • Use shuffle-sharding (look up this documentation)

** Fixed number of N queues
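Since I still need to look up the documentation, here is only a rough sketch of the general shuffle-sharding idea as I understand it: with a fixed pool of N queues, each customer is deterministically assigned a small subset, so one noisy customer can only back up its own few queues. `N_QUEUES`, `SHARD_SIZE`, and both functions are made-up names and values, not anything from the session or from AWS.

```python
import hashlib
import random

# Assumed sizes, for illustration only.
N_QUEUES = 8      # fixed number of queues
SHARD_SIZE = 2    # queues assigned to each customer


def shard_for(customer_id, n_queues=N_QUEUES, shard_size=SHARD_SIZE):
    """Return the same pseudo-random subset of queue indexes for a customer."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_queues), shard_size))


def pick_queue(customer_id, queue_depths):
    """Choose the least loaded queue within the customer's own shard."""
    shard = shard_for(customer_id, len(queue_depths))
    return min(shard, key=lambda index: queue_depths[index])
```

Two customers only collide completely if they draw the same subset, which gets less likely as N grows relative to the shard size.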