GenAI Powered Security Self-Service Chatbot


A PoC AWS Bedrock GenAI-powered security self-service chatbot

Published on June 10, 2024 by Dai Tran

project selfservice-security ai-powered-application blog

Introduction

This proof-of-concept (PoC) investigates the application of generative AI to security enablement and self-service. The primary objective is to harness generative AI alongside a front-end chatbot to enhance the experience of security consumers, particularly in large enterprise settings. By equipping the chatbot with specialized security knowledge, it can educate and mentor users on security principles, protocols, and procedures, ultimately enabling them to address their security needs independently through GitOps workflows such as raising git pull requests. While Cloudflare serves as the example security technology in this PoC, the methodology is broadly applicable to any self-service workflow that captures security configuration, compliance, and governance processes as code in a version control system.

High Level Architecture and Design

(Figure: GenAI-powered security self-service chatbot high-level architecture)

The PoC generative AI-powered security self-service chatbot is built upon the following building blocks:

Infrastructure Setup

AWS IAM OIDC Provider for GitHub

This OIDC provider authenticates the GitHub Actions Knowledge Base update workflow and authorizes it to assume the IAM role GitHubActionsAssumeRoleForSelfServiceChatBot, whose trust policy references the OIDC provider.

GitHub Actions Role
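
The role's trust policy is what ties it to the OIDC provider. Below is a minimal sketch of how that trust relationship could be created with boto3, assuming a placeholder repository your-org/selfservice-security and an OIDC provider already registered for token.actions.githubusercontent.com (the actual PoC may scope the condition differently):

# Sketch: create the GitHub Actions role with an OIDC trust policy.
import json
import boto3

iam = boto3.client("iam")
account_id = boto3.client("sts").get_caller_identity()["Account"]

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                },
                "StringLike": {
                    # Placeholder repository; narrow this to specific branches in practice
                    "token.actions.githubusercontent.com:sub": "repo:your-org/selfservice-security:*"
                },
            },
        }
    ],
}

iam.create_role(
    RoleName="GitHubActionsAssumeRoleForSelfServiceChatBot",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

The role also needs a permissions policy that lets the workflow upload the Markdown documents to the knowledge base S3 bucket (for example s3:PutObject on that bucket), which is not shown here.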

AWS Knowledge Base

AWS Bedrock Knowledge Base is the AWS Bedrock orchestration service that coordinates the following AWS services to convert data from the data source into indexed vectors in the vector database (a sketch of the autosync Lambda that kicks off this ingestion follows the list):

  • AWS S3
  • AWS Lambda function
  • AWS Bedrock Knowledge Base
  • AWS Bedrock text embeddings model
  • AWS OpenSearch Serverless vector database
  • Anthropic Claude 3 foundation model (on AWS Bedrock)
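
The Lambda function in this list is the autosync glue: when documents land in the S3 bucket, it asks the Knowledge Base to re-ingest the data source. A minimal sketch of its handler, assuming placeholder environment variables KNOWLEDGE_BASE_ID and DATA_SOURCE_ID:

# Sketch: autosync Lambda that starts a Knowledge Base ingestion job
# whenever the S3 data source changes.
import os
import boto3

bedrock_agent = boto3.client("bedrock-agent")

def lambda_handler(event, context):
    # The S3 event is only the trigger; the Knowledge Base service reads the
    # changed documents from the bucket itself during ingestion.
    job = bedrock_agent.start_ingestion_job(
        knowledgeBaseId=os.environ["KNOWLEDGE_BASE_ID"],
        dataSourceId=os.environ["DATA_SOURCE_ID"],
        description="Triggered by S3 document upload",
    )
    return {"ingestionJobId": job["ingestionJob"]["ingestionJobId"]}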

AWS Bedrock Model Access

AWS Bedrock Knowledge Base Configurations

AWS IAM Service Role for Bedrock Knowledge Base

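The exact policy used in the PoC is not reproduced here. As a rough sketch with placeholder resource and role names, the service role broadly needs read access to the data source S3 bucket, permission to invoke the text embeddings model, and access to the OpenSearch Serverless collection:

# Sketch: the kinds of permissions the Knowledge Base service role needs.
# Bucket name, collection ARN, and role name below are placeholders.
import json
import boto3

iam = boto3.client("iam")

kb_service_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Read source documents from the data source bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::your-selfservice-docs-bucket",
                "arn:aws:s3:::your-selfservice-docs-bucket/*",
            ],
        },
        {   # Invoke the text embeddings model used to vectorise the chunks
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": ["arn:aws:bedrock:*::foundation-model/*"],
        },
        {   # Read and write vectors in the OpenSearch Serverless collection
            "Effect": "Allow",
            "Action": ["aoss:APIAccessAll"],
            "Resource": ["arn:aws:aoss:*:*:collection/your-collection-id"],
        },
    ],
}

iam.put_role_policy(
    RoleName="BedrockKnowledgeBaseServiceRole",
    PolicyName="KnowledgeBaseAccess",
    PolicyDocument=json.dumps(kb_service_policy),
)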

AWS IAM Service Role for Knowledge Base Autosync Lambda

(Screenshots: Knowledge Base autosync Lambda permission configurations and its IAM service role)

Application Workflows

Knowledge Base Update Workflow

  1. Process owners (security engineers) create a documentation branch from the main branch of the code repository
  2. Process owners (security engineers) document and/or update self-service workflows and security knowledge in Markdown files, push the changes to the documentation branch, and create a PR to the main branch
  3. The GitHub Actions workflow triggered by the PR merge uploads the new and updated documents to an AWS S3 bucket
  4. The document upload to the S3 bucket emits an event that triggers the Lambda function that performs the Knowledge Base sync:
    1. The Knowledge Base service retrieves the updated source data from the S3 bucket
    2. It splits the data into chunks (i.e. performs data chunking)
    3. It sends the data chunks to the text embeddings model to create vectors
    4. It indexes and saves the created vectors in the AWS OpenSearch Serverless vector database
  5. Test the updated knowledge base by running a query to generate responses (see the retrieval sketch after this list). If the responses are not accurate, repeat the whole workflow to fine-tune the answers.
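
A quick way to perform the test in step 5 is to query the knowledge base directly and inspect the retrieved chunks. A minimal sketch, assuming a placeholder environment variable KNOWLEDGE_BASE_ID:

# Sketch: post-sync sanity check that the new documents were indexed.
import os
import boto3

bedrock_runtime = boto3.client("bedrock-agent-runtime")

response = bedrock_runtime.retrieve(
    knowledgeBaseId=os.environ["KNOWLEDGE_BASE_ID"],
    retrievalQuery={"text": "What does Cloudflare web application firewall do?"},
)

# Print the retrieved chunks; if the expected content is missing or stale,
# refine the Markdown documents and rerun the workflow.
for result in response["retrievalResults"]:
    print(result["content"]["text"][:200])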

Security Self-Service Chatbot Workflow

  1. Security consumers ask the chatbot security domain questions such as "What does Cloudflare web application firewall do?" and "What does Cloudflare rate limit do?" to gain an understanding of the security technologies in question
    (Screenshots: WAF and rate limit question answers)
  2. Security consumers ask the chatbot how to create and manage the security configuration using an IaC tool like Terraform and receive suggested sample code. Example prompts are "Tell me how to create Cloudflare WAF managed ruleset in Terraform" and "Show me Terraform code to create a Cloudflare HTTP rate limit resource"
    (Screenshots: WAF ruleset and HTTP rate limit prompt answers)
  3. Security consumers compose their security self-service requests based on the suggested sample code and ask the chatbot to create a GitHub pull request on their behalf, using a prompt such as:
    createpr->resource "cloudflare_rate_limit" "example" {
      zone_id     = "your_zone_id"
      name        = "example-rate-limit"
      description = "Example rate limit"
      disabled    = false
      match {
        request {
          methods = ["GET", "POST"]
          schemes = ["HTTP", "HTTPS"]
          path {
            values = ["/example/*"]
          }
        }
      }
      threshold = 10
      period    = 1
      action {
        mode = "simulate"
      }
    }
  4. The chatbot creates the PR on the target repository (a minimal sketch of this step follows the list)
    (Screenshots: rate limit PR created by the chatbot and shown on the GitHub repo)
  5. Process owners (security engineers) review and merge the PR, which triggers the configuration management pipeline
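
Step 4 can be implemented in several ways; below is a minimal sketch of one approach (an assumption, not necessarily how the PoC code does it), in which the chatbot strips the createpr-> trigger, commits the remaining Terraform snippet to a new branch, and opens a PR through the GitHub REST API using the GITHUB_PAT described in the hosting section. Repository name, branch naming, and file path are placeholders:

# Sketch: turn a "createpr->..." prompt into a GitHub pull request.
import base64
import os
import time
import requests

GITHUB_API = "https://api.github.com"
REPO = "your-org/selfservice-security"  # placeholder repository
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_PAT']}",
    "Accept": "application/vnd.github+json",
}

def create_pr_from_prompt(prompt: str) -> str:
    # Recover the Terraform snippet after the chatbot's createpr-> trigger
    terraform_code = prompt.split("createpr->", 1)[1].strip()
    branch = f"selfservice-request-{int(time.time())}"

    # 1. Branch off the current main commit
    main_sha = requests.get(
        f"{GITHUB_API}/repos/{REPO}/git/ref/heads/main", headers=HEADERS
    ).json()["object"]["sha"]
    requests.post(
        f"{GITHUB_API}/repos/{REPO}/git/refs",
        headers=HEADERS,
        json={"ref": f"refs/heads/{branch}", "sha": main_sha},
    )

    # 2. Commit the Terraform snippet to the new branch (placeholder path)
    requests.put(
        f"{GITHUB_API}/repos/{REPO}/contents/cloudflare/{branch}.tf",
        headers=HEADERS,
        json={
            "message": "Self-service security configuration request",
            "content": base64.b64encode(terraform_code.encode()).decode(),
            "branch": branch,
        },
    )

    # 3. Open the PR against main for the process owners to review
    pr = requests.post(
        f"{GITHUB_API}/repos/{REPO}/pulls",
        headers=HEADERS,
        json={
            "title": "Self-service security configuration request",
            "head": branch,
            "base": "main",
            "body": "PR raised by the security self-service chatbot",
        },
    ).json()
    return pr["html_url"]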

Chatbot Front-End Hosting

Local Hosting/Testing

Set up the following environment variables (use SET in a Windows command prompt or export in a Linux/macOS shell):

SET GITHUB_PAT=ghp_*******************
SET AWS_ACCESS_KEY_ID=*******************
SET AWS_SECRET_ACCESS_KEY=*******************
SET AWS_SESSION_TOKEN=*******************

export GITHUB_PAT=ghp_*******************
export AWS_ACCESS_KEY_ID=*******************
export AWS_SECRET_ACCESS_KEY=*******************
export AWS_SESSION_TOKEN=*******************

Install Python packages:

pip install -r requirements.txt

Run the chatbot:

streamlit run selfservicebot.py
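
For reference, here is a minimal sketch of a Streamlit chat loop in the spirit of selfservicebot.py (not the actual PoC code), assuming placeholder environment variables KNOWLEDGE_BASE_ID and MODEL_ARN for the knowledge base ID and the Claude 3 model ARN; the createpr-> pull request flow from the workflow above is omitted here:

# Sketch: minimal Streamlit chat front end over the Bedrock Knowledge Base.
import os
import boto3
import streamlit as st

bedrock_runtime = boto3.client("bedrock-agent-runtime")

st.title("Security Self-Service Chatbot")

# Keep the conversation in the Streamlit session so it survives reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

if prompt := st.chat_input("Ask a security question..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Retrieval-augmented generation against the Bedrock Knowledge Base
    response = bedrock_runtime.retrieve_and_generate(
        input={"text": prompt},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": os.environ["KNOWLEDGE_BASE_ID"],
                "modelArn": os.environ["MODEL_ARN"],
            },
        },
    )
    answer = response["output"]["text"]

    with st.chat_message("assistant"):
        st.markdown(answer)
    st.session_state.messages.append({"role": "assistant", "content": answer})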

Hosting via AWS

TBD

References

  1. Security self service GitHub code repository
  2. Use IAM roles to connect GitHub Actions to actions in AWS
  3. Implementing RAG App Using Knowledge Base from Amazon Bedrock and Streamlit
  4. Amazon Bedrock & AWS Generative AI - Beginner to Advanced
  5. Build a basic LLM chat app
  6. AWS: Integrate Bedrock Knowledge Base (AWS SDK & LangChain)