Implementation Lessons

I had implemented this in a different codebase, in a different language, and wrongly assumed that translating it would be a simple task. The architectural changes proved fairly challenging to overcome.

POST Multipart

Parsing multipart form data is more complex than it needs to be. In Python it required multiple packages, which had to be added to the Lambda function as layers.
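For comparison, here is a minimal sketch of parsing a multipart body with only the standard library's email package, which avoids extra layers. This is not the approach I used above, and it assumes the event carries "headers", "body" and "isBase64Encoded" fields as Lambda proxy-style events do. The parse_multipart name is made up for illustration, and I have only reasoned about it for text uploads such as CSV files.

```python
import base64
from email.parser import BytesParser
from email.policy import default


def parse_multipart(event):
    """Return {field name: bytes} for a multipart/form-data Lambda event.

    Hypothetical helper: assumes a proxy-style event with "headers",
    "body" and "isBase64Encoded" keys.
    """
    # Header names may arrive in any case, so normalise them first.
    headers = {k.lower(): v for k, v in event.get("headers", {}).items()}
    content_type = headers["content-type"]

    body = event["body"]
    if event.get("isBase64Encoded"):
        raw = base64.b64decode(body)
    else:
        raw = body.encode("utf-8")

    # The email parser needs the Content-Type header (it holds the boundary)
    # in front of the body to split the parts.
    message = BytesParser(policy=default).parsebytes(
        b"Content-Type: " + content_type.encode("utf-8") + b"\r\n\r\n" + raw
    )

    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True)
    return fields
```

Whether this is enough depends on the uploads involved. A dedicated multipart package may still be the safer choice for large or binary files.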

Can't use API Gateway with long processing times

Once I moved past a test file with 5 addresses, the service started timing out. It turns out there is a time limit of about 30 seconds for responses when using API Gateway. Longer-running processes need to use Lambda function URLs instead.

Lambda timeout

Lambda has a function timeout that needs to be extended for processes that will run longer than the default 3 seconds. Because of the multiple external API calls, I had to extend it several times, eventually to over 5 minutes.
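The timeout can be changed in the console, but it can also be scripted. Below is a minimal sketch using boto3's update_function_configuration call. The set_function_timeout helper and the "address-lookup" function name are made up for illustration, and the client is passed in so the helper can be exercised without AWS credentials.

```python
MAX_LAMBDA_TIMEOUT_SECONDS = 900  # Lambda's upper limit is 15 minutes


def set_function_timeout(lambda_client, function_name, timeout_seconds):
    """Set the timeout of a Lambda function (hypothetical helper).

    lambda_client is expected to be a boto3 Lambda client.
    """
    if not 1 <= timeout_seconds <= MAX_LAMBDA_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 and {MAX_LAMBDA_TIMEOUT_SECONDS} seconds"
        )
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )


# Usage (needs boto3 and AWS credentials; the function name is an assumption):
#   import boto3
#   set_function_timeout(boto3.client("lambda"), "address-lookup", 330)
```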

Lambda Memory

By default Lambda is given 128 MB of RAM. The function was failing when merging two data frames because it needed more than that. CloudWatch shows how much memory each invocation used, so after bumping the limit to 256 MB I could see that with a large file the function used 180 MB. I decided to keep it at 256 MB to have a small buffer.
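The memory figures come from the REPORT line that Lambda writes to CloudWatch at the end of each invocation. As a rough sketch, the headroom can be pulled out of such a line with a regular expression. The memory_headroom name is made up, and the sample line in the usage comment is illustrative rather than copied from my logs.

```python
import re

# Matches the memory fields of a Lambda REPORT log line.
REPORT_PATTERN = re.compile(
    r"Memory Size: (?P<size>\d+) MB\s+Max Memory Used: (?P<used>\d+) MB"
)


def memory_headroom(report_line):
    """Return configured, used and spare memory in MB, or None if no match."""
    match = REPORT_PATTERN.search(report_line)
    if match is None:
        return None
    size = int(match["size"])
    used = int(match["used"])
    return {"configured_mb": size, "used_mb": used, "headroom_mb": size - used}


# Usage with an illustrative line (request id and durations are invented):
#   memory_headroom(
#       "REPORT RequestId: abc Duration: 1200.00 ms Billed Duration: 1200 ms "
#       "Memory Size: 256 MB Max Memory Used: 180 MB"
#   )
```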

Boto3 DynamoDB batch_get_item

One headache I hadn't expected, and spent more time than I care to admit working out, was deduplicating a batch lookup of DynamoDB items. I had not considered that the list of keys might contain duplicates, but duplicates cause the lookup to fail, and it does so silently. Initial tests worked because there were no overlapping locations, but in QA testing the entire function would fail with no feedback.
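A minimal sketch of the deduplication step, assuming keys in the boto3 resource format such as {"location_id": "A1"}. The table name, the location_id attribute and the helper names are all made up for illustration. The sketch also splits the keys into groups of 100, the most that batch_get_item accepts per request, and retries anything returned under UnprocessedKeys.

```python
def dedupe_keys(keys):
    """Drop repeated key dicts while keeping the original order."""
    seen = set()
    unique = []
    for key in keys:
        marker = tuple(sorted(key.items()))  # hashable form of the key dict
        if marker not in seen:
            seen.add(marker)
            unique.append(key)
    return unique


def chunked(items, size=100):
    """Yield slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_get(dynamodb, table_name, keys):
    """Fetch items for `keys`; `dynamodb` is assumed to be a boto3 resource."""
    items = []
    for chunk in chunked(dedupe_keys(keys)):
        request = {table_name: {"Keys": chunk}}
        while request:
            response = dynamodb.batch_get_item(RequestItems=request)
            items.extend(response.get("Responses", {}).get(table_name, []))
            # Anything DynamoDB could not process is sent again.
            request = response.get("UnprocessedKeys") or None
    return items


# Usage (needs boto3 and AWS credentials; names are assumptions):
#   import boto3
#   batch_get(boto3.resource("dynamodb"), "locations", [{"location_id": "A1"}])
```

A production version would probably want a delay between retries of unprocessed keys, which is left out here to keep the sketch short.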

NumPy int64 and truncation

One of the steps in the process takes some numbers, converts them to strings, and concatenates them. This became awkward when it produced concatenated strings such as NaNNaNNaN or NoneNaNNaN, so each row needed to be processed individually. That added a new problem: the numbers were given a trailing ".0", which then ended up in the strings. To solve this I used math.trunc(). This appeared fine at first but later would fail silently, leading to a considerable amount of time spent debugging. In the end I used numpy.array() and cast the result to the int type with .astype(int).
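A minimal sketch of that final approach, assuming each row is a short sequence of numbers that may contain None or NaN. The join_number_parts name is made up, and returning None for rows with missing values is my assumption for the example, not necessarily what the real function does.

```python
import numpy as np


def join_number_parts(rows):
    """Concatenate the numbers in each row as integer strings.

    Rows containing None or NaN yield None instead of a string such as
    "NaNNaNNaN" (an assumption made for this sketch).
    """
    results = []
    for row in rows:
        # Treat None as NaN so the whole row can be held as floats.
        values = np.array(
            [np.nan if value is None else value for value in row], dtype=float
        )
        if np.isnan(values).any():
            results.append(None)
            continue
        # Casting to int drops the ".0" before the values become strings.
        results.append("".join(values.astype(int).astype(str)))
    return results


# Usage:
#   join_number_parts([[12.0, 3.0, 45.0], [1.0, None, 2.0]])
```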


Retrospective

I spent multiple weeks on what I had initially assumed would be a simple conversion. I had created API calls in lambda before. I had attached functions to API gateways before. I had all the component pieces I needed and just had to put them together. Or so I thought.

Hours were spent in VS Code and Postman submitting POST requests to the endpoints and reviewing CloudWatch logs to understand what was happening. A lot of time went into switching between the local IDE and deploying new versions of the code to Lambda. This was a reminder that things may not always go as planned, but with time and focus you can find your way through.

In the future I might try the local development tools for Lambda to spend less time in the AWS console.