Project Reasoning

The reasoning behind this project was twofold, with the primary focus being to build on my existing cloud knowledge as well as the theoretical principles from the AWS Cloud Practitioner certification. What I found particularly interesting about this project was its real-world use case and how someone like myself, who does freelance photography, could utilise it. There are of course many existing image compression tools of varying complexity, but this project revealed a programmatic approach to a problem I come across quite often: storage capacity limitations. I also use image compression as a way to protect my intellectual property; I can send watermarked, low-resolution proofs so clients get an idea of what the photo will look like, but cannot claim it prior to payment.

Project Aim

The aim of the project is to create a Lambda function that is triggered when a user uploads an image to a selected S3 bucket. This function will compress the image to thumbnail size and save it back into the same bucket under a new name. Data about the new image will be saved to a DynamoDB table. An API will also be set up so users can view the images as well as the metadata pertaining to them.

Creating the Environment

At the time of creating this project, I had a high-level understanding of Python; however, I was fairly new to using it for development in the cloud. Knowing the importance of infrastructure as code, and how AWS favours this way of working as part of the operational excellence pillar, I thought it best to work through as much of this project as possible using the CLI. My IDE of choice was VS Code, as I have plenty of experience using it for locally hosted projects, and it allows plugins and extensions to be installed with ease. At this stage, I had already installed the Serverless Framework whilst working through the AWS Cloud Practitioner material, and it was already configured against one of my IAM user accounts. So I began by creating a new project using the “Python Starter” template.

Setting Up the Infrastructure

Now that the environment was set up, I began building out my YAML file to define and grant permissions for the necessary infrastructure. This included setting up the following:

  • Region: us-east – Although I am based in the UK, this region has the widest availability of services and resources.

  • Profile: I created a specific profile, with the appropriate permissions, which I use for serverless deployments.

  • Roles: I set up an “Allow” role for S3, which was needed to read from and write to the bucket.

  • Environment variables: I created environment variables such as the region and thumbnail size so they could be referenced in the Python (handler.py) file (see the sketch after this list).
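
As a rough illustration of how those variables can then be picked up on the Python side, the snippet below reads them with os.environ. The variable names REGION_NAME and THUMBNAIL_SIZE, and their defaults, are assumptions for this sketch rather than the project's exact values.

```python
import os

import boto3

# Values defined in the serverless YAML file; names and defaults here are
# assumptions for illustration only
REGION_NAME = os.environ.get("REGION_NAME", "us-east-1")
THUMBNAIL_SIZE = int(os.environ.get("THUMBNAIL_SIZE", 128))

# Clients are created once at module level so they are reused across invocations
s3 = boto3.client("s3", region_name=REGION_NAME)
dynamodb = boto3.resource("dynamodb", region_name=REGION_NAME)
```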

Developing the Lambda Function

Once the main infrastructure had been defined, it was time to work on the Python code. I began by importing all the dependencies that I knew I might need for the project, including json, boto3, and datetime. The next step was to define any global variables I planned to use throughout, such as references to the database, S3, and the image size. From there I was then able to start defining the functions necessary to carry out the image conversion. The first of these was “s3_thumbnail_generator”, which parses the event raised when an image is uploaded to the nominated S3 bucket. The function checks that the new file name does not already exist in the bucket. If the name is unique and the file is of type “png”, a chain of functions runs to obtain the image, carry out the conversion, and rename the file using string concatenation.
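
A minimal sketch of what that handler might look like is shown below. It follows the standard S3 event structure, but the thumbnail naming convention, the duplicate check, and the THUMBNAIL_SIZE environment variable are assumptions rather than the project's exact code.

```python
import os
from io import BytesIO

import boto3
from PIL import Image, ImageOps

s3 = boto3.client("s3")
THUMBNAIL_SIZE = int(os.environ.get("THUMBNAIL_SIZE", 128))  # assumed env var

def s3_thumbnail_generator(event, context):
    # Pull the bucket and object key out of the S3 "ObjectCreated" event
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]

    # Only process original PNG uploads; skip thumbnails we have already written back
    if not key.lower().endswith(".png") or "_thumbnail" in key:
        return {"statusCode": 200, "body": "skipped"}

    # Download the original image from S3 and open it with Pillow
    obj = s3.get_object(Bucket=bucket, Key=key)
    image = Image.open(BytesIO(obj["Body"].read()))

    # Shrink the image to the configured thumbnail size
    thumbnail = ImageOps.fit(image, (THUMBNAIL_SIZE, THUMBNAIL_SIZE), Image.LANCZOS)

    # Build the new key by string concatenation and save it back to the same bucket
    new_key = key.rsplit(".", 1)[0] + "_thumbnail.png"
    buffer = BytesIO()
    thumbnail.save(buffer, "PNG")
    buffer.seek(0)
    s3.put_object(Bucket=bucket, Key=new_key, Body=buffer, ContentType="image/png")

    return {"statusCode": 200, "body": new_key}
```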

Deployment & Setbacks

Now that most of the local development was complete, I wanted to deploy what had been created to test its functionality. The first issue I faced was with the PIL (Pillow) Python module used for image processing. Although I had installed Pillow locally in the project root and imported it into the project, AWS does not natively bundle every module referenced within deployed code. It was at this point I was introduced to Lambda Layers, and how they can either be referenced directly in the code or attached to the function as an ARN. Because AWS was not picking up the import, the S3 upload was also not triggering the Lambda function, so I had to test the code from the Lambda console. I also had a few smaller issues with Python versions, as well as more trivial logical problems such as the casing of the “PNG” extension.
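
As an aside, the function can also be exercised without the S3 trigger by invoking it directly with a hand-built S3 event. This is only a sketch: the function name, bucket, and key below are placeholders, not the project's actual values.

```python
import json

import boto3

lambda_client = boto3.client("lambda")

# A minimal, hand-built S3 "ObjectCreated" event (bucket and key are placeholders)
test_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-images-bucket"}, "object": {"key": "example.png"}}}
    ]
}

# Invoke the deployed function directly and print whatever it returns
response = lambda_client.invoke(
    FunctionName="s3-thumbnail-generator",  # placeholder function name
    Payload=json.dumps(test_event),
)
print(response["Payload"].read().decode())
```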

Database

The next stage of this project involved configuring the database to store the metadata surrounding the image. As the database had already been created, a put request containing key-value pairs was used to store the following (a rough sketch of this call follows the list):

  • ID

  • URL (link to the new image)

  • Approximate new size

  • Date created

  • Date updated
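
A sketch of that put request is shown below, assuming a table handle already exists; the attribute names and the use of a UUID for the ID are illustrative assumptions.

```python
import os
import uuid
from datetime import datetime

import boto3

# Table name is taken from an assumed environment variable for this sketch
table = boto3.resource("dynamodb").Table(os.environ.get("DYNAMODB_TABLE", "images-table"))

def save_thumbnail_metadata(url, approx_size):
    # Store one item per generated thumbnail, keyed by a generated ID
    now = datetime.utcnow().isoformat()
    item = {
        "id": str(uuid.uuid4()),
        "url": url,                        # link to the new image in S3
        "approxReducedSize": approx_size,  # e.g. "12.4 KB"
        "createdAt": now,
        "updatedAt": now,
    }
    table.put_item(Item=item)
    return item
```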

REST API

The final and most challenging step of the process was creating API requests. I began by updating the YAML file with the 4 API requests that I thought would be most appropriate for this project.

List: To list all the images stored within the S3 bucket and their attributes.

Get: To get the meta-data of an image by its ID.

Delete: To delete an image from the S3 bucket.

It was then time to create the functions in the Python (handler.py) file.

A get item function was created to retrieve an item (image) by its ID. The function makes a call to the DynamoDB table and returns the item along with a 200 status code for confirmation and debugging.
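
A minimal version of that handler might look like the following; the path parameter name, table name, and attribute layout are assumptions.

```python
import json
import os

import boto3

table = boto3.resource("dynamodb").Table(os.environ.get("DYNAMODB_TABLE", "images-table"))

def get_image(event, context):
    # The image ID arrives as a path parameter on the GET request
    image_id = event["pathParameters"]["id"]

    # Fetch the matching metadata item from DynamoDB
    response = table.get_item(Key={"id": image_id})
    item = response.get("Item", {})

    # A 200 status code is returned with the item for confirmation and debugging
    return {"statusCode": 200, "body": json.dumps(item, default=str)}
```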

The delete function was written in a similar way; however, it takes the item ID from the event and uses it to trigger the deletion of the image. It also returns a status code to confirm the deletion.
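
A comparable sketch of the delete handler is below. The bucket name, the assumption that the object key can be derived from the stored URL, and the attribute names are all illustrative.

```python
import json
import os

import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table(os.environ.get("DYNAMODB_TABLE", "images-table"))
BUCKET = os.environ.get("BUCKET_NAME", "my-images-bucket")  # placeholder bucket name

def delete_image(event, context):
    # The item ID arrives on the DELETE request and drives both deletions
    image_id = event["pathParameters"]["id"]

    # Look up the stored metadata so we know which S3 object to remove
    item = table.get_item(Key={"id": image_id}).get("Item", {})
    if "url" in item:
        # Assumes the object key is the last path segment of the stored URL
        s3.delete_object(Bucket=BUCKET, Key=item["url"].rsplit("/", 1)[-1])

    # Remove the metadata record and confirm with a status code
    table.delete_item(Key={"id": image_id})
    return {"statusCode": 200, "body": json.dumps({"deleted": image_id})}
```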

A list function was created which calls the DynamoDB table, scans through it using a while loop, and returns each image and its metadata in JSON format.
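
A sketch of that scan loop, again with the table name assumed:

```python
import json
import os

import boto3

table = boto3.resource("dynamodb").Table(os.environ.get("DYNAMODB_TABLE", "images-table"))

def list_images(event, context):
    # Scan the table, looping while DynamoDB reports that more pages remain
    response = table.scan()
    items = response["Items"]
    while "LastEvaluatedKey" in response:
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
        items.extend(response["Items"])

    # Return every image's metadata as JSON (default=str handles DynamoDB Decimals)
    return {"statusCode": 200, "body": json.dumps(items, default=str)}
```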