Measures reported by AWSKinesiStremsTest
Amazon Kinesis Data Streams enables you to build custom applications that process or analyze streaming data for specialized needs. Kinesis Data Streams can continuously capture and store terabytes of data per hour from hundreds of thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs, and location-tracking events. With the Kinesis Client Library (KCL), you can build Kinesis Applications and use streaming data to power real-time dashboards, generate alerts, implement dynamic pricing and advertising, and more. You can also emit data from Kinesis Data Streams to other AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon EMR, and AWS Lambda.
The following diagram illustrates the high-level architecture of Kinesis Data Streams. A Kinesis data stream is an ordered sequence of data records. A data record is the unit of data stored in a Kinesis data stream. A shard is a uniquely identified group of data records in a stream. A stream is composed of one or more shards, each of which provides a fixed unit of capacity. Producers continually push data records to Kinesis Data Streams, and consumers process the data in real time. Consumers (such as a custom application running on Amazon EC2, or an Amazon Kinesis Data Firehose delivery stream) can store their results using an AWS service such as Amazon DynamoDB, Amazon Redshift, or Amazon S3.
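Because each shard provides a fixed unit of capacity, the total capacity of a stream scales with its shard count. The sketch below estimates that capacity. It assumes the per-shard limits that AWS documents for provisioned-mode streams (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads); these figures are not stated on this page, and the names `StreamCapacity` and `stream_capacity` are hypothetical.

```python
from dataclasses import dataclass

# Assumed per-shard limits for provisioned-mode streams, as documented by AWS.
# Verify them against the current service quotas before relying on them.
WRITE_MB_PER_SEC_PER_SHARD = 1
WRITE_RECORDS_PER_SEC_PER_SHARD = 1000
READ_MB_PER_SEC_PER_SHARD = 2


@dataclass(frozen=True)
class StreamCapacity:
    write_mb_per_sec: int
    write_records_per_sec: int
    read_mb_per_sec: int


def stream_capacity(shard_count: int) -> StreamCapacity:
    """Estimate the aggregate capacity of a stream from its shard count."""
    if shard_count < 1:
        raise ValueError("a stream has at least one shard")
    return StreamCapacity(
        write_mb_per_sec=shard_count * WRITE_MB_PER_SEC_PER_SHARD,
        write_records_per_sec=shard_count * WRITE_RECORDS_PER_SEC_PER_SHARD,
        read_mb_per_sec=shard_count * READ_MB_PER_SEC_PER_SHARD,
    )
```

For example, `stream_capacity(4)` estimates 4 MB/s and 4,000 records/s of write capacity and 8 MB/s of read capacity. Sustained traffic beyond these estimates is the kind of load that would be expected to show up as throttling.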
Typically, you work with data streams (for example, putting records into a stream or reading records from it) using the Amazon Kinesis Data Streams API. At run time, any delay that a custom application experiences when streaming or analyzing data can often be attributed to delays in the execution of these API calls. To identify the Kinesis data stream that is experiencing slowness, and to pinpoint the source of that slowness, use the AWSKinesiStremsTest test.
This test automatically discovers the Kinesis Data Streams, and for each stream reports the time taken to put records in and get records from that stream. The test also promptly notifies administrators if any API operation fails, thus enabling administrators to troubleshoot and fix the failure before it causes any serious damage to application performance. The actual and provisioned throughput for each stream is tracked, and any throttling that occurs due to a throughput threshold breach is brought to the attention of administrators, so that the stream capacity/configuration can be changed according to the load.
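To cross-check the latency and failure figures from the application side, an API call can be wrapped in a small timer. The following is a minimal sketch, not a description of how the test itself collects its values; `timed_call` is a hypothetical helper, and the operation passed to it is whatever call your application already makes.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(operation: Callable[[], Any]) -> Tuple[Any, float, bool]:
    """Run `operation` and return (result_or_exception, elapsed_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        result = operation()
        succeeded = True
    except Exception as exc:  # a failed API call is reported, not re-raised
        result = exc
        succeeded = False
    elapsed = time.perf_counter() - start
    return result, elapsed, succeeded
```

With a boto3 Kinesis client, the helper could be used as `timed_call(lambda: client.put_record(StreamName="my-stream", Data=b"payload", PartitionKey="key-1"))`, where the stream name, payload, and partition key are placeholders.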
Outputs of the test: One set of results for each Kinesis data stream / shard.
First-level descriptor: AWS Region
Second-level descriptor: Kinesis data stream / shard, depending upon the option chosen against the KINESIS FILTER NAME parameter.
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Getrecrd_byts |
Indicates the amount of data retrieved by the GetRecords API operation from this data stream. |
KB |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported. |
| Getrecrd_itrage |
By default, this measure represents the age of the last record in all GetRecords calls made to all shards in this data stream.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the age of the last record in all GetRecords calls made to this shard. |
Number |
Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. A value of zero indicates that the records being read are completely caught up with the stream. |
| Getrecrd_latncy |
Indicates the time taken per GetRecords operation performed on this data stream. |
Secs |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Ideally, the value of this measure should be low. A high value indicates that the GetRecords API operation is very slow. Compare the value of this measure across data streams to know for which data stream the GetRecords operation is slowest. |
| Getrecrd_recrds |
Indicates the number of records retrieved from this data stream by the GetRecords API operation. |
Number |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported. |
| Getrecrd_success |
Indicates the number of successful GetRecords API operations for this data stream. |
Number |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Ideally, the value of this measure should be high. |
| Incum_byts |
By default, this measure represents the total amount of data put into all shards in this data stream, by PutRecord and PutRecords operations.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the total amount of data put into this shard by PutRecord and PutRecords operations. |
KB |
|
| Incum_recrds |
By default, this measure represents the total number of records put into all shards in this data stream, by PutRecord and PutRecords operations.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the total number of records put into this shard by PutRecord and PutRecords operations. |
Number |
|
| Outgo_byts |
By default, this measure represents the total amount of data retrieved from all shards in this data stream.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the total amount of data retrieved from this shard. |
KB |
|
| Outgo_recrds |
By default, this measure represents the total number of records retrieved from all shards in this data stream.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the total number of records retrieved from this shard. |
Number |
|
| Putrecrd_byts |
Indicates the amount of data put into this data stream by the PutRecord API operation. |
KB |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Each call to PutRecord operates on a single record. Prefer the PutRecords operation unless your application specifically needs to send a single record per request, or there is some other reason why PutRecords cannot be used. |
| Putrecrd_latncy |
Indicates the time taken per PutRecord API operation performed on this data stream. |
Secs |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Ideally, the value of this measure should be low. A high value indicates that the PutRecord API operation is very slow. Compare the value of this measure across data streams to know for which data stream the PutRecord operation is slowest. |
| Putrecrd_suces |
Indicates the number of successful PutRecord API operations for this data stream. |
Number |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Ideally, the value of this measure should be high. |
| Putrecrds_byts |
Indicates the amount of data put into this data stream by the PutRecords API operation. |
KB |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
The PutRecords operation sends multiple records to Kinesis Data Streams in a single request. By using PutRecords, producers can achieve higher throughput when sending data to their Kinesis data stream. Each record in the request can be as large as 1 MB, up to a limit of 5 MB for the entire request. |
| Putrecrds_latncy |
Indicates the time taken per PutRecords API operation performed on this data stream. |
Secs |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported.
Ideally, the value of this measure should be low. A high value indicates that the PutRecords API operation is very slow. Compare the value of this measure across data streams to know for which data stream the PutRecords operation is slowest. |
| Putrecrds_recrds |
Indicates the number of records put into this data stream by the PutRecords API operation. |
Number |
This measure is not reported for a shard - i.e., if the KINESIS FILTER NAME is set to ShardId, then this measure will not be reported. |
| Putrecrds_sucess |
Indicates the number of PutRecords API operations, where at least one record succeeded for this data stream. |
Number |
Ideally, the value of this measure should be high. A low value is indicative of a high failure rate of PutRecords operations.
By default, failure of individual records within a request does not stop the processing of subsequent records in a PutRecords request. This means that a response Records array includes both successfully and unsuccessfully processed records. You must detect unsuccessfully processed records and include them in a subsequent call. |
| Red_prov_throughput |
By default, this measure represents the number of GetRecords calls throttled for this data stream.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the number of GetRecords calls throttled for this shard. |
Number |
The maximum size of data that GetRecords can return is 10 MB. If a call returns this amount of data, subsequent calls made within the next five seconds throw ProvisionedThroughputExceededException. If there is insufficient provisioned throughput on the stream, subsequent calls made within the next one second throw ProvisionedThroughputExceededException.
This exception implies that the request rate for the stream is too high, or the requested data is too large for the available throughput. The recommended solution for this problem is to reduce the frequency or size of your requests. |
| Wrt_prov_throughput |
By default, this measure represents the number of PutRecord and PutRecords calls throttled for this data stream.
If the KINESIS FILTER NAME is set to ShardId, then this measure represents the number of PutRecord and PutRecords calls throttled for this shard. |
Number |
If a PutRecord request cannot be processed because of insufficient provisioned throughput on the shard involved in the request, PutRecord throws ProvisionedThroughputExceededException.
This exception implies that the request rate for the stream is too high, or the requested data is too large for the available throughput. The recommended solution for this problem is to reduce the frequency or size of your requests. |
|
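The interpretation of Putrecrds_sucess notes that unsuccessfully processed records must be detected and resent, and the throttling measures recommend reducing request frequency. The sketch below combines the two. It assumes the response shape returned by boto3's `put_records` (a `Records` list aligned by position with the request, where failed entries carry an `ErrorCode`). The function name, retry count, and delays are illustrative assumptions, not recommended values.

```python
import time
from typing import Callable, Dict, List


def put_records_with_retry(
    put_records: Callable[..., Dict],
    stream_name: str,
    records: List[Dict],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Send records, resending only those that failed.

    Returns the records that still failed after `max_attempts` requests.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        if not pending:
            break
        response = put_records(StreamName=stream_name, Records=pending)
        # Each response entry corresponds to the request entry at the same index;
        # entries that failed contain an "ErrorCode" key.
        pending = [
            record
            for record, result in zip(pending, response["Records"])
            if "ErrorCode" in result
        ]
        if pending and attempt < max_attempts - 1:
            # Back off exponentially so that retries lower the request rate.
            sleep(base_delay * (2 ** attempt))
    return pending
```

With a boto3 client this could be called as `put_records_with_retry(client.put_records, "my-stream", records)`, where each record is a dict with `Data` and `PartitionKey`. The sketch handles only per-record failures reported in the response; an exception raised by the request itself would still propagate to the caller.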