|
Measures reported by the AWS Lambda Test
AWS Lambda is a compute service that lets you run code without provisioning or managing servers. In other words, AWS Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code monitoring and logging. All you need to do is author your code in a language that AWS Lambda supports (currently Node.js, Java, C#, Go and Python), and upload your application code to AWS Lambda in the form of one or more Lambda functions. Using AWS Lambda, you can even maintain multiple versions of your in-production function code, and can also create aliases for each of your function versions for easy reference.
Typically, AWS Lambda is used to run code in response to events, such as changes to data in an Amazon S3 bucket or an Amazon DynamoDB table; to run your code in response to HTTP requests using Amazon API Gateway; or invoke your code using API calls made using AWS SDKs.
In such scenarios, if the Lambda function code fails or takes too long to execute, it can stall or even completely stop data/request processing by critical AWS services (e.g., Amazon S3, Amazon DynamoDB, Amazon API Gateway) that rely on that code for their operations. Likewise, if any Lambda function utilizes CPU/memory excessively, it can degrade the performance of other functions that share those resources. Furthermore, if the concurrency capacity of a function is not correctly set, latencies in function execution become inevitable. To pre-empt such anomalies, administrators need to monitor each Lambda function that these services use and promptly identify problematic functions. This is exactly what the AWS Lambda test does!
This test automatically discovers the Lambda functions, monitors the invocations of each function, and in the process, reports latencies and errors/failures in function execution. This enables administrators to quickly and accurately identify slow and/or buggy functions, so that they can take those functions and their code up for closer review and fine-tuning.
Optionally, you can configure this test to report metrics for each version of a function or for every alias of a function version. This enables administrators to quickly compare the performance of different versions or aliases of a function, and then decide which version/alias to use in the production environment.
Additionally, the test reports metrics for a Summary descriptor, using which administrators can audit/track changes to the AWS Lambda service for every region - i.e., track the addition/deletion of Lambda functions per region.
Also, the test measures the resource usage of individual functions, thus rapidly turning administrator attention to resource-hungry functions. The adequacy of the concurrency configuration of every function is also checked periodically, so that administrators can be promptly alerted to inadequacies.
Outputs of the test: One set of results for each Lambda function / version / alias in every region.
First-level descriptor: AWS Region
Second-level descriptor: Function / Version / Alias, depending upon the option chosen from the LAMBDA FILTER NAME parameter of this test
A few metrics are also reported for a Summary descriptor per region
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Invocations |
By default, this measure represents the number of times this function was invoked in response to an event or invocation API call.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of times this version of a function was invoked in response to an event or invocation API call.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of times this alias was invoked in response to an event or invocation API call. |
Number |
Compare the value of this measure across functions to know which function is used the most. The failure of such a function will naturally have a more adverse impact on performance and productivity than that of other functions. |
| Invoctn_Error |
By default, this measure represents the number of invocations of this function that failed due to errors (response code 4xx).
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of invocations of this version of a function that failed due to errors.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of invocations of this alias that failed due to errors. |
Number |
Ideally, the value of this measure should be 0. A non-zero value is indicative of one or more errors in the function code.
To know what errors occurred during invocation, check the logs. Typically, each time the code is executed in response to an event, it writes a log entry into the log group associated with a Lambda function, which is /aws/lambda/<function name>.
Following are some examples of errors that might show up in the logs:
If you see a stack trace in your log, there is probably an error in your code. Review your code and debug the error that the stack trace refers to.
If you see a permissions denied error in the log, the IAM role you have provided as an execution role may not have the necessary permissions. Check the IAM role and verify that it has all of the necessary permissions to access any AWS resources that your code references.
If you see a timeout exceeded error in the log, the run time of your function code exceeds your timeout setting. This may be because the timeout is set too low, or because the code is taking too long to execute.
If you see a memory exceeded error in the log, your memory setting is too low. Set it to a higher value. Typically, when creating a function, you need to mention the amount of memory that should be given to that function. Lambda uses this memory size to infer the amount of CPU and memory allocated to your function. Your function use-case determines your CPU and memory requirements. For example, a database operation might need less memory compared to an image processing function. The default value is 128 MB. The value must be a multiple of 64 MB.
|
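Since diagnosing invocation errors depends on what the function writes to its log group, a minimal sketch of a handler that logs enough context to make those log entries useful may help. This is a hypothetical handler (the event field `bucket` and the status codes are illustrative, not part of any real function); in Lambda, everything written via the `logging` module lands in /aws/lambda/&lt;function name&gt;:

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context=None):
    """Hypothetical handler that logs each step, so that errors and
    stack traces show up in the function's CloudWatch log group."""
    try:
        bucket = event["bucket"]          # raises KeyError on a malformed event
        logger.info("processing bucket %s", bucket)
        return {"statusCode": 200, "body": json.dumps({"bucket": bucket})}
    except KeyError as exc:
        # logger.exception writes the stack trace you would review in the logs
        logger.exception("missing field in event: %s", exc)
        return {"statusCode": 400, "body": "bad event"}
```

Invoking the handler with a malformed event produces both the 4xx-style response counted by this measure and a stack trace in the log for review.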
| Letter_Error |
By default, this measure represents the number of times Lambda could not write the failure of this function to the configured dead letter queues.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of times Lambda could not write the failure of this version of a function to the configured dead letter queues.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of times Lambda could not write the failure of this alias to the configured dead letter queues. |
Number |
By default, a failed Lambda function invoked asynchronously is retried twice, and then the event is discarded. Using Dead Letter Queues (DLQ), you can indicate to Lambda that unprocessed events should be sent to an Amazon SQS queue or Amazon SNS topic instead, where you can take further action.
If the value of this measure keeps increasing, it implies that the event payload is consistently failing to reach the dead letter queue. Probable causes for this include permissions errors, misconfigured resources, or event payload size limits.
|
| Invoctn_Duraton |
By default, this measure indicates the average elapsed wall clock time from when this function's code starts executing because of an invocation to when it stops executing.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the average elapsed wall clock time from when this version of a function's code starts executing because of an invocation to when it stops executing.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the average elapsed wall clock time from when this alias starts executing because of an invocation to when it stops executing. |
Seconds |
Ideally, the value of this measure should be low. A high value indicates that a function/version/alias is taking too long to execute.
To determine why there is increased latency in the execution of a Lambda function, do the following:
Test your code with different memory settings: If your code is taking too long to execute, it could be that it does not have enough compute resources to execute its logic. Try increasing the memory allocated to your function and testing the code again, using the Lambda console's test invoke functionality. You can see the memory used, code execution time, and memory allocated in the function log entries. Changing the memory setting can change how you are charged for duration.
Investigate the source of the execution bottleneck using logs: You can test your code locally, as you would any other function, or you can test it within Lambda using the test invoke capability on the Lambda console, or by invoking it asynchronously using the AWS CLI. Each time the code is executed in response to an event, it writes a log entry into the log group associated with a Lambda function, which is named /aws/lambda/<function name>. Add logging statements around various parts of your code, such as callouts to other services, to see how much time it takes to execute different parts of your code.
|
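The wall-clock duration this measure reports can be reproduced locally with a simple timing wrapper. This is a sketch for local testing only (the handler passed in is any callable you want to profile), not how Lambda itself computes the Duration metric:

```python
import time

def timed_invoke(handler, event):
    """Invokes a handler and measures its wall-clock execution time,
    analogous to the duration reported by this measure."""
    start = time.perf_counter()
    result = handler(event)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Wrapping suspect sections of code this way during local testing helps confirm where the latency reported by this measure is actually spent.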
| Invoctn_Throtle |
By default, this measure indicates the number of invocation attempts for this Lambda function that were throttled due to invocation rates exceeding the customer's concurrent limits (error code 429).
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of invocation attempts that were throttled for this version of the Lambda function due to invocation rates exceeding the customer’s concurrent limits (error code 429).
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of invocation attempts that were throttled for the version of the function that maps to this alias, due to invocation rates exceeding the customer’s concurrent limits (error code 429). |
Number |
The unit of scale for AWS Lambda is a concurrent execution (see Understanding Scaling Behavior for more details). However, scaling indefinitely is not desirable in all scenarios. For example, you may want to control your concurrency for cost reasons, or to regulate how long it takes you to process a batch of events, or to simply match it with a downstream resource. To assist with this, Lambda provides a concurrent execution limit control at both the account level and the function level.
On reaching the concurrency limit associated with a function, any further invocation requests to that function are throttled, i.e. the invocation doesn't execute your function. Each throttled invocation increases the value of this measure for the corresponding function.
AWS Lambda handles throttled invocation requests differently, depending on their source:
Event sources that aren't stream-based: Some of these event sources invoke a Lambda function synchronously, and others invoke it asynchronously. Handling is different for each:
Synchronous invocation: If the function is invoked synchronously and is throttled, Lambda returns a 429 error and the invoking service is responsible for retries. The ThrottledReason error code explains whether you ran into a function level throttle (if specified) or an account level throttle. Each service may have its own retry policy. For example, CloudWatch Logs retries the failed batch up to five times with delays between retries. For a list of event sources and their invocation type, see Supported Event Sources.
Asynchronous invocation: If your Lambda function is invoked asynchronously and is throttled, AWS Lambda automatically retries the throttled event for up to six hours, with delays between retries. Remember, asynchronous events are queued before they are used to invoke the Lambda function.
Stream-based event sources: For stream-based event sources (Kinesis and DynamoDB streams), AWS Lambda polls your stream and invokes your Lambda function. When your Lambda function is throttled, Lambda attempts to process the throttled batch of records until the time the data expires. This time period can be up to seven days for Kinesis. The throttled request is treated as blocking per shard, and Lambda doesn't read any new records from the shard until the throttled batch of records either expires or succeeds. If there is more than one shard in the stream, Lambda continues invoking on the non-throttled shards until one gets through.
|
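The throttling behavior above - invocations beyond the concurrency limit receiving a 429 - can be illustrated with a toy in-memory limiter. This is a simplified local model (the class and its limit are illustrative), not the Lambda service's actual admission logic:

```python
class ConcurrencyLimiter:
    """Toy model of a per-function concurrency limit: invocations past
    the limit are throttled with a 429, as described above."""
    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0

    def try_invoke(self):
        if self.in_flight >= self.limit:
            return 429                    # throttled invocation
        self.in_flight += 1
        return 200                        # invocation admitted

    def finish(self):
        self.in_flight -= 1               # an in-flight invocation completed
```

Each 429 returned by the model corresponds to one increment of this measure for the throttled function.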
| Total_functions |
Indicates the total number of Lambda functions available in this region. |
Number |
This measure is available only for the Summary descriptor.
Use the detailed diagnosis of this measure, if enabled, to know which functions are available for each region, the runtime environment of each function, and when the function was last modified.
Note that detailed diagnostics will be reported for this measure only if the DD FOR TOTAL flag is set to Yes. |
| New_functions |
Indicates the number of Lambda functions that were created in this region during the last measurement period. |
Number |
This measure is available only for the Summary descriptor.
Use the detailed diagnosis of this measure to know which Lambda functions were recently created in this region. |
| Deleted_functions |
Indicates the number of Lambda functions that were deleted in this region during the last measurement period. |
Number |
This measure is available only for the Summary descriptor.
Use the detailed diagnosis of this measure to know which Lambda functions were recently deleted in this region. |
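The New_functions / Deleted_functions tracking above reduces to set differences between two consecutive snapshots of the function names in a region. A minimal sketch of that comparison (the function names are illustrative):

```python
def function_churn(previous, current):
    """Compares two snapshots of function names in a region and returns
    the functions added and deleted between them."""
    prev, curr = set(previous), set(current)
    return {"new": sorted(curr - prev), "deleted": sorted(prev - curr)}
```

Running this against the previous and current measurement periods' function lists yields the per-region counts these two measures report.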
| Cpu_total_time |
By default, this measure indicates the total time for which this Lambda function used CPU for user and system-related processing.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the total duration for which Lambda functions of this version used the CPU for user and system-related processing.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the total duration for which Lambda functions that are of a version that maps to this alias, used the CPU for user and system-related processing. |
Seconds |
Compare the value of this measure across functions to know which function was hogging the CPU. Likewise, compare the value of this measure across aliases / versions to know whether the functions mapped to a particular alias / version are hogging the CPU. |
| Init_duration |
By default, this measure indicates the time spent by this Lambda function in the init phase of the Lambda execution environment lifecycle.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the time that Lambda functions of this version spent in the init phase of the Lambda execution environment lifecycle.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the time that Lambda functions of a version mapped to this alias spent in the init phase of the Lambda execution environment lifecycle. |
Seconds |
In the Init phase, Lambda creates or unfreezes an execution environment with the configured resources, downloads the code for the function and all layers, initializes any extensions, initializes the runtime, and then runs the function's initialization code (the code outside the main handler). The Init phase happens either during the first invocation, or in advance of function invocations if you have enabled provisioned concurrency.
Compare the value of this measure across functions to know which function spent the maximum time in the init phase. Likewise, compare the value of this measure across aliases / versions to know whether the functions mapped to a particular alias / version were delayed in the init phase. |
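The lifecycle above - the init cost being paid only when a new execution environment is created - can be sketched with a toy model. This is a local illustration only (the class and its callbacks are hypothetical), not Lambda's actual environment manager:

```python
import time

class ExecutionEnvironment:
    """Toy model of the execution environment lifecycle: the first
    invocation pays the init cost (a cold start); subsequent invocations
    on the same environment do not."""
    def __init__(self, init_fn, handler):
        self._init_fn = init_fn
        self._handler = handler
        self._initialized = False
        self.init_duration = 0.0          # what this measure reports

    def invoke(self, event):
        if not self._initialized:
            start = time.perf_counter()
            self._init_fn()               # runs once: code outside the handler
            self.init_duration = time.perf_counter() - start
            self._initialized = True
        return self._handler(event)
```

This is why moving expensive setup outside the main handler, or enabling provisioned concurrency, shifts the cost into a one-time Init phase instead of every invocation.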
| Total_memory |
By default, this measure indicates the amount of memory allocated to this Lambda function.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the total memory allocated to Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the total memory allocated to Lambda functions of versions mapped to this alias. |
MB |
|
| Used_memory |
By default, this measure reports the maximum amount of memory used by this Lambda function.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the maximum memory used by Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the maximum memory used by Lambda functions of a version mapped to this alias. |
MB |
|
| Memory_util |
By default, this measure reports the maximum memory used by this Lambda function, expressed as a percentage of the total memory allocated to it.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the percentage of maximum memory used by Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the percentage of maximum memory used by Lambda functions of a version mapped to this alias. |
Percent |
Compare the value of this measure across functions to know which function is utilizing the largest share of its allocated memory.
Likewise, compare the value of this measure across aliases/versions to understand if functions of a specific version are memory-hungry. |
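The relationship between Used_memory, Total_memory, and this percentage is a simple ratio; a one-line sketch of the computation:

```python
def memory_utilization(used_mb, total_mb):
    """Memory_util: maximum memory used, as a percentage of the memory
    allocated to the function."""
    return round(100.0 * used_mb / total_mb, 2)
```

For example, a function allocated 128 MB that peaks at 64 MB reports 50% utilization.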
| Read_data |
By default, this measure reports the amount of data read by this Lambda function.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the amount of data read by Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the amount of data read by Lambda functions of a version mapped to this alias. |
KB |
|
| Write_data |
By default, this measure reports the amount of data written by this Lambda function.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the amount of data written by Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the amount of data written by Lambda functions of a version mapped to this alias. |
KB |
|
| Total_network |
By default, this measure reports the sum of data reads and writes for this Lambda function.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the total amount of data read and written by all Lambda functions of this version.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the total amount of data read and written by all Lambda functions of a version mapped to this alias. |
KB |
Compare the value of this measure across Lambda functions to know which function is I/O-intensive.
You can also compare the value of this measure across aliases/versions to determine if Lambda functions of any specific alias/version are consistently I/O-intensive. |
| Post_extenton |
By default, this measure reports the cumulative amount of time that the runtime of this Lambda function spends running code for extensions after the function code has completed.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the cumulative amount of time that all Lambda functions of this version spend running code for extensions after the function code has completed.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the cumulative amount of time that all Lambda functions mapped to this alias spend running code for extensions after the function code has completed. |
Seconds |
Compare the value of this measure across Lambda functions to know which function's runtime is spending more time running code for extensions.
You can also compare the value of this measure across aliases/versions to determine if Lambda functions of any specific alias/version are spending more time running code for extensions. |
| Conc_executon |
By default, this measure reports the number of instances of this function that are concurrently processing events.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of instances of Lambda functions of this version that are concurrently processing events.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of instances of Lambda functions mapped to this alias that are concurrently processing events. |
Number |
Concurrency is the number of requests that your function is serving at any given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function code finishes running, it can handle another request. If the function is invoked again while a request is still being processed, another instance is allocated, which increases the function's concurrency. The total concurrency for all of the functions in your account is subject to a per-region quota.
For an initial burst of traffic, your functions' cumulative concurrency in a region can reach an initial level of between 500 and 3000, which varies per region. You can also allocate capacity on a per-function basis with Reserved concurrency. This guarantees the maximum number of concurrent instances for the function. When a function has reserved concurrency, no other function can use that concurrency. Alternatively, you can configure Provisioned concurrency for a function. This initializes a requested number of execution environments so that they are prepared to respond immediately to your function's invocations.
By observing the variations to this measure for each function over time, you can quickly isolate functions requiring more concurrency capacity. You can then allocate the additional capacity to that function by setting Reserved concurrency or Provisioned concurrency. |
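A common rule of thumb for sizing Reserved or Provisioned concurrency - not something this test computes, but a useful companion to this measure - is Little's law: steady-state concurrency equals the invocation arrival rate multiplied by the average duration. A minimal sketch:

```python
def required_concurrency(invocations_per_second, avg_duration_seconds):
    """Estimates steady-state concurrent executions via Little's law:
    concurrency = arrival rate x average duration per invocation."""
    return invocations_per_second * avg_duration_seconds
```

For instance, 100 invocations per second at an average duration of 0.5 seconds implies roughly 50 concurrent executions, which you can compare against this measure and against the function's configured capacity.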
| Delivery_fail |
By default, this measure reports the number of times this Lambda function attempted to send an event to a destination but failed.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of times functions of this version attempted to send an event to a destination but failed.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of times functions mapped to this alias attempted to send an event to a destination but failed. |
Number |
Ideally, the value of this measure should be 0 for any function/alias/version. A non-zero value is a cause for concern, as it implies that a Lambda function or functions mapped to an alias/version (as the case may be) failed to send an event to a destination. Delivery errors can occur due to permissions errors, misconfigured resources, or size limits. |
| Prov_conc_invocation |
By default, this measure reports the number of times this function's code is executed on provisioned concurrency.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of times functions of this version were executed on provisioned concurrency.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of times functions mapped to this alias were executed on provisioned concurrency. |
Number |
Provisioned Concurrency is a feature that keeps functions initialized and hyper-ready to respond in double-digit milliseconds. This is ideal for implementing interactive services, such as web and mobile backends, latency-sensitive microservices, or synchronous APIs.
When you invoke a Lambda function, the invocation is routed to an execution environment to process the request. When a function has not been used for some time, when you need to process more concurrent invocations, or when you update a function, new execution environments are created. The creation of an execution environment takes care of installing the function code and starting the runtime. Depending on the size of your deployment package, and the initialization time of the runtime and of your code, this can introduce latency for the invocations that are routed to a new execution environment. This latency is usually referred to as a ‘cold start’. For most applications this additional latency is not a problem. For some applications, however, this latency may not be acceptable.
When you enable Provisioned Concurrency for a function, the Lambda service will initialize the requested number of execution environments so they can be ready to respond to invocations. |
| Spill_invocation |
By default, this measure reports the number of times this function's code was executed on standard concurrency when all provisioned concurrency is in use.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of times functions of this version were executed on standard concurrency when all provisioned concurrency is in use.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of times functions mapped to this alias were executed on standard concurrency when all provisioned concurrency is in use. |
Number |
If the value of this measure is consistently high for any function, it could indicate that the function has outgrown its provisioned concurrency capacity. You may then want to allocate more concurrency capacity to that function by fine-tuning its Provisioned concurrency setting. |
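The split between invocations served on provisioned concurrency and those that spill over to standard concurrency follows directly from the configured capacity. A toy sketch of that arithmetic (a local illustration, not the Lambda scheduler):

```python
def split_invocations(concurrent_requests, provisioned_capacity):
    """Splits in-flight requests between provisioned concurrency and
    standard-concurrency spillover, mirroring the two measures above."""
    on_provisioned = min(concurrent_requests, provisioned_capacity)
    spillover = max(0, concurrent_requests - provisioned_capacity)
    return on_provisioned, spillover
```

With 10 units of provisioned concurrency and 12 concurrent requests, 10 run on provisioned capacity and 2 spill over to standard concurrency.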
| Prov_conc_executon |
By default, this measure reports the number of instances of this function that are processing events on provisioned concurrency.
If the LAMBDA FILTER NAME is set to Version, then this measure represents the number of instances of functions of this version that are processing events on provisioned concurrency.
If the LAMBDA FILTER NAME is set to Alias, then this measure represents the number of instances of functions mapped to this alias that are processing events on provisioned concurrency. |
Number |
|
| Conc_utilization |
By default, this measure reports the percentage of allocated Provisioned concurrency capacity that is currently in use.
| Percent |
A value close to 100% for any function indicates that the concurrency capacity configured for that function is inadequate. In such cases, a spill over becomes inevitable. To avoid this, you may want to consider increasing the Provisioned Concurrency capacity for that function. Optionally, you may want to configure Reserved Concurrency for that function, as it guarantees the maximum number of concurrent instances. |
| Unreserv_conc |
By default, this measure reports the number of events that are being processed by functions that do not have reserved concurrency.
| Number |
This measure is available only for the Summary descriptor. |
|