Trying Out New Relic APM for Lambda With an Image Recognition Prototype App



  • Lately, I’ve been playing around with Amazon Rekognition, which you can use to analyze images and video in real time to recognize things like faces, objects, and text. The video below is an example of how I used Rekognition to pick up the New Relic logo on a coffee mug:

    But that wasn’t enough. I wanted to see what else I could do with it … and I decided to see if I could use Rekognition—and New Relic—to track the exact number of seconds that the New Relic logo appeared on screen during the telecast of the 2018 Masters Tournament. You see, we sponsor golfer Marc Leishman, who was paired with Tiger Woods in the tournament’s opening round and was just two strokes off the lead heading into Saturday. We estimate that 10 million people could have seen the New Relic logo on Marc’s shirt, and in fact, this visibility correlated with an approximate 34% traffic spike in our web properties over that weekend.

    To calculate the number of seconds that the New Relic logo appeared on screen, I architected a prototype app using footage of the Masters and several AWS services—Rekognition, AWS Lambda, Amazon API Gateway, Amazon Kinesis, Amazon CloudWatch, and Amazon S3. And, of course, I included some instrumentation to gain visibility and send custom data to New Relic. Hopefully, my project will spark ideas on other ways to use New Relic to monitor images and video.

    Inside my Amazon Rekognition prototype app

    image recognition prototype app

    Here’s how it works: An Android mobile app streams the relevant video from my phone to a Kinesis video stream. When I’m ready to process part of the stream, I manually invoke the API Gateway to kick off my Lambda function (in a real-world, practical application, I’d script this). The Lambda function grabs the most recent footage from the video stream, uses the Rekognition service to recognize any “New” and “Relic” text from the footage, writes the resultant image to an S3 bucket, and returns the image in base64 encoding as the API response. It’s all pretty slick.
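
    For a sense of what that manual kick-off looks like, here is a minimal sketch of calling the endpoint and saving the returned frame. The invoke URL below is a placeholder rather than my real endpoint, and depending on how binary media types are configured, API Gateway may return the decoded image instead of a base64 string:

    import base64
    import requests

    # Placeholder invoke URL; substitute your own API Gateway stage and resource
    API_URL = "https://abc123.execute-api.eu-west-1.amazonaws.com/prod/process"

    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()

    # The Lambda function returns the S3 key of the saved image in a custom
    # response header and the annotated frame as a base64 string in the body
    print("Saved to S3 at:", resp.headers.get("s3_file_path"))
    with open("latest_frame.png", "wb") as f:
        f.write(base64.b64decode(resp.text))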

    Of course, no project would be complete without some New Relic instrumentation. This time, I had the pleasure of using an alpha release of New Relic’s APM for Lambda, now available in preview. Using APM for Lambda, I was able to gather and visualize some custom data (in this case, how many seconds the New Relic logo appeared in the video stream) and have enhanced monitoring of my main Lambda function processing the stream. To gather this data, I used the New Relic log ingestion function (described below) to pipe a stream of CloudWatch logs generated by my main Lambda function into New Relic.

    A closer look at each part of the pipeline

    The Kinesis app

    Luckily, I didn’t have to develop my own application to stream video from my Android phone to the Kinesis video stream. Instead, I used the Android Producer Library, which provides a sample app for setting up a Kinesis Video Streams producer client in an Android application and sending data to a video stream. In other words, the client ran on my phone and sent the relevant footage to the Kinesis service.

    I had to set up an Amazon Cognito user pool before the sample app would run, but once I did that, things worked “right out of the box.”
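
    If you would rather script that step too, the user pool itself takes only a couple of boto3 calls. This is just a sketch with placeholder names, and the sample app still needs the resulting pool and client IDs wired into its configuration:

    import boto3

    cognito = boto3.client("cognito-idp", region_name="eu-west-1")

    # Create a user pool and an app client for the producer sample app
    # (the names are placeholders)
    pool = cognito.create_user_pool(PoolName="kvs-demo-users")
    app_client = cognito.create_user_pool_client(
        UserPoolId=pool["UserPool"]["Id"],
        ClientName="kvs-demo-client",
    )
    print(pool["UserPool"]["Id"], app_client["UserPoolClient"]["ClientId"])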

    The Kinesis Video Stream

    I created the Kinesis video stream in the AWS Management Console. The console provides a dashboard that shows what’s currently flowing into the stream. I was able to scroll backwards and forwards through any part of the stream until it was 24 hours old, at which point the system purges it.
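
    The console was enough for this prototype, but the same stream can also be created programmatically; a minimal sketch with boto3, using an example stream name:

    import boto3

    kvs = boto3.client("kinesisvideo", region_name="eu-west-1")

    # 24 hours of retention matches the window described above, after which
    # the footage is purged
    kvs.create_stream(StreamName="my-stream", DataRetentionInHours=24)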

    The AWS Lambda processing function

    I consider this function, written in Python, to be the “brain” of the pipeline, as it processes the video stream and moves it through the rest of the application—a lot of heavy lifting. Specifically, the function:

    1. Gathers about one second of video from the Kinesis video stream
    2. Writes that video to a temporary file
    3. Grabs the first frame from the temporary video file, creates a grayscale frame, and encodes the grayscale frame to PNG format
    4. Sends the grayscale frame to Rekognition for text recognition
    5. Uses text recognition responses to draw a bounding box on the frame and encodes the resulting frame
    6. Writes the new frame to a temporary file and saves it in an S3 bucket
    7. Encodes the saved file to base64 for transmission and returns the results through the API Gateway in a JSON-formatted response

    The function’s code could use some refactoring and optimization, but it worked for my prototype.

    import cv2
    import boto3
    import tempfile
    import os
    import json
    from base64 import b64encode
    import re
    import time
    import newrelic

    print('Loading function')

    @newrelic.monitor_lambda
    def lambda_handler(event, context):
        # Get about a second of video from the Kinesis video stream
        video_client = boto3.client('kinesis-video-media',
                                    endpoint_url='https://s-5xxxxxx4.kinesisvideo.eu-west-1.amazonaws.com',
                                    region_name='eu-west-1')
        stream = video_client.get_media(
            StreamARN='arn:aws:kinesisvideo:eu-west-1:36xxxxxxx4:stream/my-stream/55555555555',
            StartSelector={'StartSelectorType': 'NOW'})
        streaming_body = stream["Payload"]
        datafeed = streaming_body.read(40000)

        # Write the video fragment to a temporary file
        new_file, filename = tempfile.mkstemp()
        os.write(new_file, datafeed)
        os.close(new_file)

        # Grab the first frame from the temporary video file, create a grayscale
        # copy, and encode the grayscale frame to PNG
        cap = cv2.VideoCapture(filename)
        ret, frame = cap.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        rekog_frame = cv2.imencode('.png', gray)[1]

        # Send the grayscale frame to Rekognition for text recognition
        client = boto3.client('rekognition', 'eu-west-1')
        response = client.detect_text(Image={'Bytes': rekog_frame.tobytes()})

        # Use the text recognition response to draw a bounding box on the frame
        # and encode the resulting frame
        mod_frame = bounding_box(response, 'new relic', frame)
        return_frame = cv2.imencode('.png', mod_frame)[1]

        # Write the resulting frame to a temporary file and save it to S3
        cv2.imwrite('/tmp/image1.png', mod_frame)
        s3_file_path = save_to_s3('/tmp/image1.png')

        # Encode the resulting image to base64 for transmission
        base64_bytes = b64encode(return_frame.tobytes())
        base64_string = base64_bytes.decode()

        # Return the results in an API Gateway-accepted format
        return {
            "isBase64Encoded": True,
            "statusCode": 200,
            "headers": {"s3_file_path": s3_file_path},
            "body": base64_string
        }

    def bounding_box(response, word_array, frame):
        # word_array is unused in this prototype; the search terms below are hardcoded.
        # Rekognition returns polygon points as ratios of the image dimensions,
        # so keep the frame height and width around for scaling.
        height, width = frame.shape[:2]
        x1_coord = y1_coord = x2_coord = y2_coord = None
        for entry in response['TextDetections']:
            if entry['Type'] == "LINE" and re.search('new relic', entry['DetectedText'], flags=re.IGNORECASE):
                x1_coord = entry['Geometry']['Polygon'][3]['X']
                y1_coord = entry['Geometry']['Polygon'][3]['Y']
                x2_coord = entry['Geometry']['Polygon'][1]['X']
                y2_coord = entry['Geometry']['Polygon'][1]['Y']
                break
            if entry['Type'] == "WORD":
                if re.search('New', entry['DetectedText'], flags=re.IGNORECASE):
                    print('Detected "new"')
                    x1_coord = entry['Geometry']['Polygon'][1]['X']
                    y1_coord = entry['Geometry']['Polygon'][1]['Y']
                    if x2_coord is None and y2_coord is None:
                        x2_coord = entry['Geometry']['Polygon'][3]['X']
                        y2_coord = entry['Geometry']['Polygon'][3]['Y']
                if re.search('Relic', entry['DetectedText'], flags=re.IGNORECASE):
                    print('Detected "relic"')
                    if x1_coord is None and y1_coord is None:
                        x1_coord = entry['Geometry']['Polygon'][1]['X']
                        y1_coord = entry['Geometry']['Polygon'][1]['Y']
                    x2_coord = entry['Geometry']['Polygon'][3]['X']
                    y2_coord = entry['Geometry']['Polygon'][3]['Y']
        try:
            cv2.rectangle(frame,
                          (round(x1_coord * width), round(y1_coord * height)),
                          (round(x2_coord * width), round(y2_coord * height)),
                          (255, 0, 0), 5)
            newrelic.record_custom_event('rekogEvent', {'logo_detection': 1})
        except Exception as e:
            print(e)
        return frame

    def save_to_s3(filename):
        bucket = 'rekognitionseantest'
        timestr = time.strftime("%Y%m%d-%H%M%S")
        key_name = timestr + '-rekog.png'
        directory = "Test_images/"
        s3 = boto3.client('s3')
        s3.upload_file(filename, bucket, directory + key_name,
                       ExtraArgs={'ContentType': "image/png", 'ACL': "public-read"})
        return directory + key_name

    The Amazon Rekognition service

    This part of the pipeline is essentially an API to which I send an image and from which I receive a response indicating whether it recognized the specific text I asked it to search for—in this case, the words “New” and “Relic.” Unfortunately, Rekognition can’t recognize text in video, but I got around that by extracting frames from the Kinesis video stream (as I mentioned above when I described the Lambda function).
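
    Stripped of the rest of the pipeline, that call is only a few lines; here is a standalone sketch of DetectText run against a local image file (the file name is just an example):

    import boto3

    rekognition = boto3.client("rekognition", region_name="eu-west-1")

    with open("frame.png", "rb") as image_file:
        response = rekognition.detect_text(Image={"Bytes": image_file.read()})

    # Each detection is either a LINE or a WORD, with the detected text, a
    # confidence score, and a bounding polygon in relative coordinates
    for detection in response["TextDetections"]:
        print(detection["Type"], detection["DetectedText"], detection["Confidence"])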

    New Relic log ingestion (from CloudWatch)

    The New Relic log ingestion function—the second function in my app—is an AWS Lambda function that consumes the CloudWatch logs generated when my main Lambda function executes. Specifically, it watches for CloudWatch log entries that contain the NR_LAMBDA_MONITORING marker. This marker is produced by a wrapper (@newrelic.monitor_lambda) that I added to my main Lambda function. The custom data I collect with my function (newrelic.record_custom_event('rekogEvent', {'logo_detection': 1})) is also captured in the CloudWatch logs.
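
    To be clear, New Relic supplies the ingestion function, so the sketch below is not its actual code; it just shows the general shape of a CloudWatch Logs subscription handler and where the NR_LAMBDA_MONITORING marker gets picked up, with the forwarding step reduced to a placeholder:

    import base64
    import gzip
    import json

    def handler(event, context):
        # CloudWatch Logs delivers subscribed log events as a gzipped,
        # base64-encoded JSON document
        payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
        for log_event in payload["logEvents"]:
            if "NR_LAMBDA_MONITORING" in log_event["message"]:
                forward_to_new_relic(log_event["message"])

    def forward_to_new_relic(message):
        # Placeholder: the real ingestion function posts the payload to New Relic
        print(message)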

    Here’s an example log entry with details about my function and the custom event:

    {
        "marker": "NR_LAMBDA_MONITORING",
        "version": "Alpha-4",
        "context": {
            "functionName": "cloud9-python3videostream-python3videostreamtodata-7xxxxxx5",
            "functionArn": "arn:aws:lambda:eu-west-1:36xxxxxxx4:function:cloud9-python3videostream-python3videostreamtodata-6VL57MEK2E26",
            "functionVersion": "$LATEST",
            "functionRuntime": "python3.6",
            "functionRegion": "eu-west-1",
            "awsRequestId": "f3xxxxxxxxxxxxxxxxxxf",
            "sourceInfo": null,
            "containerInfo": {
                "isColdStart": true,
                "startTime": 1528975664117,
                "executions": 1,
                "guid": "30xxxxxxxxxxxx8"
            }
        },
        "traceEvent": {
            "startTime": 1528975664118,
            "endTime": 1528975666044,
            "failed": false,
            "memoryMB": 63.15625
        },
        "operationEvents": [
            {
                "service": "kinesis-video-media",
                "operation": "GetMedia",
                "startTime": 1528975664206,
                "endTime": 1528975664291,
                "failed": false
            },
            {
                "service": "rekognition",
                "operation": "DetectText",
                "startTime": 1528975664650,
                "endTime": 1528975665621,
                "failed": false
            },
            {
                "service": "s3",
                "operation": "PutObject",
                "startTime": 1528975665925,
                "endTime": 1528975666040,
                "failed": false,
                "target": "rekognitionseantest"
            }
        ],
        "customEvents": [
            {
                "eventType": "rekogEvent",
                "logo_detection": 1,
                "timestamp": 1528975665632
            }
        ]
    }

    In New Relic Insights, I set up a dashboard that captures my custom events (length of logo appearance, in seconds) and metrics about my Lambda function, such as memory usage, function execution counts, and the duration of each function execution.

    Insights dashboard tracking Lambda function
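
    Because the custom data lands in Insights as regular rekogEvent events, it can also be queried directly; here is a rough sketch against the Insights query API, with a placeholder account ID and query key:

    import requests

    ACCOUNT_ID = "1234567"       # placeholder New Relic account ID
    QUERY_KEY = "NRIQ-xxxxxxxx"  # placeholder Insights query key

    nrql = "SELECT sum(logo_detection) FROM rekogEvent SINCE 1 week ago"
    resp = requests.get(
        "https://insights-api.newrelic.com/v1/accounts/{}/query".format(ACCOUNT_ID),
        headers={"X-Query-Key": QUERY_KEY},
        params={"nrql": nrql},
    )
    print(resp.json())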

    In some cases, the New Relic logo was on screen for up to 90 seconds at a time. Apparently, that was enough to make plenty of people curious enough to visit our web properties.

    AWS and APM—always raising the bar

    For every image Rekognition analyzes, AWS charges about $0.01. That may not look expensive at first glance, but depending on the video format your stream uses, you could have 25–50 frames per second; multiply that by minutes or hours of footage, and a project like this could really add up. (The pricing for video analysis is more reasonable, at $0.10 per minute of footage.)
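
    To put that in perspective, here is the back-of-the-envelope arithmetic for an hour of 25-frames-per-second footage, using the prices quoted above:

    # Rough cost comparison using the prices quoted above
    PRICE_PER_IMAGE = 0.01          # USD per frame sent to DetectText
    PRICE_PER_VIDEO_MINUTE = 0.10   # USD per minute of stored-video analysis

    fps = 25                        # could be as high as 50, depending on the format
    minutes_of_footage = 60

    frames = fps * 60 * minutes_of_footage
    print("Frame-by-frame analysis: ${:,.2f}".format(frames * PRICE_PER_IMAGE))                     # $900.00
    print("Video analysis:          ${:,.2f}".format(minutes_of_footage * PRICE_PER_VIDEO_MINUTE))  # $6.00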

    AWS is setting a high bar as it uses Rekognition and related services to bring AI and related technology “to the masses.” And the company is making it easier than ever to add various types of automation capabilities to homegrown applications.

    AWS Lambda makes it easier than ever to write and quickly execute your code, since you don’t have to worry about infrastructure management. However, removing a layer of abstraction (or, one could argue, adding another one!) doesn’t make it easier to monitor what’s going on in the code while it’s running. I found that APM for Lambda opened the “black box” surrounding my Lambda function and gave me real-time, code-level APM insights—and even allowed me to track custom metrics.

    If you’re interested in getting early access to New Relic APM for Lambda and want to contribute feedback, sign up here. Unfortunately, not everyone who signs up will be invited to the private beta, but everyone will receive updates as we hit critical milestones on the way to general availability.

    Now I wonder: What should I use this app to track next?




