Overview
Apache Kafka is a cornerstone of modern real-time data pipelines, facilitating high-throughput, low-latency event streaming.
Managing Kafka infrastructure, particularly at scale, presents significant operational challenges. To address this, Amazon Managed Streaming for Apache Kafka (MSK) provides a fully managed service - simplifying the provisioning, configuration, patching, and scaling of Kafka clusters. While MSK handles the infrastructure heavy lifting, effective management and control are still crucial for maintaining cluster health, performance, and reliability.
This article provides a comprehensive walkthrough of setting up an Amazon MSK cluster and integrating it with Kpow for Apache Kafka, a powerful tool for managing and monitoring Kafka environments. It covers provisioning AWS infrastructure, configuring authentication with the OAUTHBEARER mechanism using AWS IAM, setting up a client EC2 instance within the same VPC, deploying Kpow via Docker, and using Kpow's UI to monitor and manage brokers, topics, and messages.
Whether you manage production Kafka workloads or are evaluating management solutions, this guide provides practical steps for effectively managing and monitoring Kafka clusters on AWS.
About Factor House
Factor House is a leader in real-time data tooling, empowering engineers with innovative solutions for Apache Kafka® and Apache Flink®.
Our flagship product, Kpow for Apache Kafka, is the market-leading enterprise solution for Kafka management and monitoring.
Explore our live multi-cluster demo environment or grab a free Community license and dive into streaming tech on your laptop with Factor House Local.
Set up an EC2 instance
For this post, we're utilizing an Ubuntu-based EC2 instance. Since the MSK cluster will be configured to accept traffic only from within the same VPC, this instance will serve as our primary access point for interacting with the cluster. To ensure connectivity and control, the instance must:
- Be launched in the same VPC as the MSK cluster
- Allow inbound HTTP (port 80) and SSH (port 22) traffic via its security group
We use the AWS Command Line Interface (CLI) to provision and manage AWS resources throughout the demo. If the CLI is not already installed, follow the official AWS CLI user guide for setup and configuration instructions.
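Before provisioning anything, a quick sanity check (sketch below; it requires valid AWS credentials) confirms the CLI is installed and shows which IAM identity subsequent commands will run as:

```shell
# Confirm the AWS CLI is installed and on the PATH.
aws --version

# Show the account and IAM ARN that subsequent commands will use.
aws sts get-caller-identity

# If no credentials are configured yet, set them up interactively.
aws configure
```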
As Kpow is designed to manage and monitor Kafka clusters and their associated resources, we can grant it administrative privileges. For more fine-grained access control within a Kafka cluster, we can rely on Apache Kafka ACLs, and the Enterprise Edition of Kpow provides robust support for them - see Kpow's ACL management documentation for more details.
The example policies below can be attached to the IAM identity that Kpow will use.
Option 1: Admin Access to ALL MSK Clusters in the Region/Account
This policy allows listing/describing all clusters and performing any data-plane action (`kafka-cluster:*`) on any cluster within the specified region and account.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "kafka-cluster:*",
      "Resource": "arn:aws:kafka:<REGION>:<ACCOUNT-ID>:cluster/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "kafka:ListClusters",
        "kafka:DescribeCluster",
        "kafka:GetBootstrapBrokers"
      ],
      "Resource": "*"
    }
  ]
}
```
Option 2: Admin Access to a Specific LIST of MSK Clusters
This policy allows listing/describing all clusters but restricts the powerful `kafka-cluster:*` data-plane actions to only the specific clusters listed in the `Resource` array.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "kafka-cluster:*",
      "Resource": [
        "arn:aws:kafka:<REGION>:<ACCOUNT-ID>:cluster/<CLUSTER-NAME-1>/<GUID-1>",
        "arn:aws:kafka:<REGION>:<ACCOUNT-ID>:cluster/<CLUSTER-NAME-2>/<GUID-2>"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "kafka:ListClusters",
        "kafka:DescribeCluster",
        "kafka:GetBootstrapBrokers"
      ],
      "Resource": "*"
    }
  ]
}
```

Add more cluster ARNs to the `Resource` array as needed, following the same pattern. Note that IAM policy documents are strict JSON and do not allow comments.
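Either policy can then be attached to the IAM identity whose credentials Kpow will use. As a hedged sketch (the user name `kpow-user` and the file name `kpow-msk-policy.json` are illustrative placeholders, not values from this walkthrough), an inline policy could be attached like so:

```shell
# Attach the policy JSON (saved locally) as an inline policy on the IAM user
# whose credentials Kpow will use. The user name and file name are examples.
aws iam put-user-policy \
  --user-name kpow-user \
  --policy-name kpow-msk-admin \
  --policy-document file://kpow-msk-policy.json
```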
Create an MSK Cluster
While Kpow supports both provisioned and serverless MSK clusters, we'll use an MSK Serverless cluster in this post.
First, create a security group for the Kafka cluster. This security group allows traffic on Kafka port 9098 from:
- Itself (for intra-cluster communication), and
- The EC2 instance's security group (for Kpow access within the same VPC).
```shell
VPC_ID=<vpc-id>
SUBNET_ID1=<subnet-id-1>
SUBNET_ID2=<subnet-id-2>
SUBNET_ID3=<subnet-id-3>
EC2_SG_ID=<ec2-security-group-id>
CLUSTER_NAME=<cluster-name>
REGION=<aws-region>

SG_ID=$(aws ec2 create-security-group \
  --group-name ${CLUSTER_NAME}-sg \
  --description "Security group for $CLUSTER_NAME" \
  --vpc-id "$VPC_ID" \
  --region "$REGION" \
  --query 'GroupId' --output text)

## Allow traffic from itself
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" \
  --protocol tcp \
  --port 9098 \
  --source-group "$SG_ID" \
  --region "$REGION"

## Allow traffic from EC2 instance
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" \
  --protocol tcp \
  --port 9098 \
  --source-group "$EC2_SG_ID" \
  --region "$REGION"
```
Next, create an MSK Serverless cluster. We use the `aws kafka create-cluster-v2` command with a JSON configuration that specifies:
- VPC subnet and security group,
- SASL/IAM-based client authentication.
```shell
read -r -d '' SERVERLESS_JSON <<EOF
{
  "VpcConfigs": [
    {
      "SubnetIds": ["$SUBNET_ID1", "$SUBNET_ID2", "$SUBNET_ID3"],
      "SecurityGroupIds": ["$SG_ID"]
    }
  ],
  "ClientAuthentication": {
    "Sasl": { "Iam": { "Enabled": true } }
  }
}
EOF

aws kafka create-cluster-v2 \
  --cluster-name "$CLUSTER_NAME" \
  --serverless "$SERVERLESS_JSON" \
  --region "$REGION"
```
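Cluster creation takes a few minutes. The commands below sketch how to look up the new cluster's ARN, wait for it to report `ACTIVE`, and retrieve the SASL/IAM bootstrap endpoint that Kpow's `BOOTSTRAP` setting will need later (these require live AWS credentials and the variables defined above):

```shell
# Look up the ARN of the cluster we just created.
CLUSTER_ARN=$(aws kafka list-clusters-v2 \
  --cluster-name-filter "$CLUSTER_NAME" \
  --region "$REGION" \
  --query 'ClusterInfoList[0].ClusterArn' --output text)

# Check the cluster state; repeat until it reports ACTIVE.
aws kafka describe-cluster-v2 \
  --cluster-arn "$CLUSTER_ARN" \
  --region "$REGION" \
  --query 'ClusterInfo.State' --output text

# Fetch the SASL/IAM bootstrap endpoint for client connections.
aws kafka get-bootstrap-brokers \
  --cluster-arn "$CLUSTER_ARN" \
  --region "$REGION" \
  --query 'BootstrapBrokerStringSaslIam' --output text
```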
Launch a Kpow Instance
We'll connect to the EC2 instance via SSH and install Docker Engine, as Kpow runs as a Docker container. For detailed instructions, please refer to Docker's official installation and post-installation guides.
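One commonly used shortcut is Docker's own convenience script, which is fine for a demo instance like this one (less so for production hosts); the post-install step lets the login user run `docker` without `sudo`:

```shell
# Install Docker Engine on Ubuntu via Docker's convenience script.
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Post-installation: allow the current user to run docker without sudo,
# then refresh group membership in the current session.
sudo usermod -aG docker $USER
newgrp docker
```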
With Docker ready, we'll create Kpow's configuration file (e.g., `aws-trial.env`). This file defines Kpow's core settings for connecting to the MSK cluster and includes Kafka connection details, licensing information, and AWS credentials.
The main section defines how Kpow connects to the MSK cluster:
- `ENVIRONMENT_NAME`: A human-readable name for the Kafka environment shown in the Kpow UI.
- `BOOTSTRAP`: The Kafka bootstrap server URL for the MSK Serverless cluster (e.g., `boot-xxxxxxxx.c2.kafka-serverless.<region>.amazonaws.com:9098`).
- `KAFKA_VARIANT`: Set this to `MSK_SERVERLESS` to ensure Kpow creates its internal topics with the constrained topic configuration properties and service limitations specific to MSK Serverless.
Secure communication with the cluster is established using SASL over SSL:
- `SECURITY_PROTOCOL`: Set to `SASL_SSL` to enable encrypted client-server communication.
- `SASL_MECHANISM`: Set to `AWS_MSK_IAM` to use AWS IAM for Kafka client authentication.
- `SASL_JAAS_CONFIG`: Specifies the use of the `IAMLoginModule` provided by Amazon for secure authentication.
- `SASL_CLIENT_CALLBACK_HANDLER_CLASS`: Points to `IAMClientCallbackHandler`, which automates the process of retrieving and refreshing temporary credentials via IAM.
Finally, the configuration file includes Kpow license details and AWS credentials. These are essential not only to activate and run Kpow but also for it to access the Kafka cluster.
```
## Managed Service for Apache Kafka Cluster Configuration
ENVIRONMENT_NAME=MSK Serverless
BOOTSTRAP=boot-<cluster-identifier>.c2.kafka-serverless.<aws-region>.amazonaws.com:9098
KAFKA_VARIANT=MSK_SERVERLESS
SECURITY_PROTOCOL=SASL_SSL
SASL_MECHANISM=AWS_MSK_IAM
SASL_JAAS_CONFIG=software.amazon.msk.auth.iam.IAMLoginModule required;
SASL_CLIENT_CALLBACK_HANDLER_CLASS=software.amazon.msk.auth.iam.IAMClientCallbackHandler

## Your License Details
LICENSE_ID=<license-id>
LICENSE_CODE=<license-code>
LICENSEE=<licensee>
LICENSE_EXPIRY=<license-expiry>
LICENSE_SIGNATURE=<license-signature>

## AWS Credentials
AWS_ACCESS_KEY_ID=<aws-access-key>
AWS_SECRET_ACCESS_KEY=<aws-secret-access-key>
AWS_SESSION_TOKEN=<aws-session-token> # Optional
AWS_REGION=<aws-region>
```
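A missing line in this file typically only surfaces later as a container startup failure, so it can help to verify the file defines every key before launching. The small shell function below is a sanity-check sketch; the key list matches this walkthrough rather than being an exhaustive Kpow reference:

```shell
# Check that a Kpow env file defines each expected key; prints any that are missing.
check_env_file() {
  file=$1
  [ -f "$file" ] || { echo "File not found: $file"; return 1; }
  missing=0
  for key in ENVIRONMENT_NAME BOOTSTRAP KAFKA_VARIANT SECURITY_PROTOCOL \
             SASL_MECHANISM SASL_JAAS_CONFIG SASL_CLIENT_CALLBACK_HANDLER_CLASS \
             LICENSE_ID AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    if ! grep -q "^${key}=" "$file"; then
      echo "Missing required key: $key"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] && echo "$file looks complete"
}

# Run it against the configuration file before starting Kpow:
# check_env_file aws-trial.env
```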
With the `aws-trial.env` file created, we'll use the following `docker run` command to launch Kpow. This command forwards Kpow's internal UI port (3000) to port 80 on the host EC2 instance, enabling us to access the Kpow UI in a browser at `http://<ec2-public-ip>` without specifying a port.
```shell
docker run --pull=always -p 80:3000 --name kpow \
  --env-file aws-trial.env -d factorhouse/kpow-ce:latest
```
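Once the container starts, a quick check on the instance (a sketch; the container name matches the `docker run` command above) confirms Kpow is up before opening the browser:

```shell
# Confirm the container is running and inspect its startup logs.
docker ps --filter name=kpow
docker logs --tail 50 kpow

# From the EC2 instance itself, check that the UI responds on port 80.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost
```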
Monitor and Manage Resources
With Kpow launched, we can now step through a typical workflow using its user-friendly UI: from monitoring brokers and creating a topic, to sending a message and observing its journey to consumption.
Conclusion
In summary, this guide walked through setting up a fully managed Kafka environment on AWS using Amazon MSK Serverless and Kpow. By leveraging MSK Serverless for Kafka infrastructure and Kpow for observability and control, we can streamline operations while gaining deep insight into our data pipelines. The process included provisioning AWS resources, configuring a secure cluster with IAM-based authentication, and deploying Kpow via Docker with environment-specific and security-conscious settings.
Once connected, Kpow provides an intuitive interface to monitor brokers, manage topics, produce and consume messages, and track consumer lag in real time. Beyond the basics, it offers advanced features like schema inspection, Kafka Connect monitoring, RBAC enforcement, and audit visibility - helping teams shift from reactive troubleshooting to proactive, insight-driven operations. Together, Amazon Managed Streaming for Apache Kafka (MSK) and Kpow form a robust foundation for building and managing high-performance, secure, real-time streaming applications on AWS.