4 July 2023
Securing a cloud infrastructure involves several measures, such as enforcing the principle of least privilege through IAM and preventing attacks on our information system (IS). The first thing to do is always to control the entry points to the system and the path requests take to reach it, which is where a WAF comes in.
Architecture
The target architecture of this tutorial looks like this. Its aim is to maintain a list of secure, trusted IPs. In other words, we whitelist only the IPs we know, for example those of a VPN used by our users, or of a specific Wi-Fi network. Let's go into more detail about our WAF.
AWS WAF (Web Application Firewall) is a security system that controls incoming traffic to an infrastructure. Generally speaking, the WAF is placed upstream of the services that are in direct contact with the end user, such as a CloudFront distribution, in order to control their inputs.
WAF can incorporate rules that can, for example, impose restrictions on specific IP addresses, HTTP headers, and URI strings. AWS WAF rules help prevent common Web attacks, such as SQL injection and Cross-Site Scripting, which exploit application vulnerabilities.
- DNS (Cloudflare, Amazon Route 53) is used to resolve domain names into IP addresses. In our infrastructure, the DNS layer also adds headers when it detects bots or suspicious behavior. This is what we call a DNS filter.
- WAF applies the filtering rules. For example, if DNS sends a request containing the header "requestanomaly", the response must be a 403.
- CloudFront is the front-end entry point on AWS: it serves our web page and forwards requests to our business logic.
- S3 and Lambda make up our serverless back end.
In this article, we'll explain how to easily connect a whitelist system to a WAF. As a bonus, we'll show how easy it is to add rules to increase security.
All this will be done with Terraform, an infrastructure-as-code tool whose configuration language lets us describe what our infrastructure looks like. For this example, the cloud provider is AWS, but the structure would be similar, with equivalent components, on other cloud providers.
Step-by-step summary
TL;DR: if you're a smart aleck who doesn't need a detailed description of the steps, you can check directly on this GitHub.
- Create a totally blocking WAF
- Add a rule to allow certain IPs
- (Optional) Add a rule to block bad requests
- Connect the WAF to a CloudFront distribution
Prerequisites
- An AWS account
- terraform
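The snippets in the following steps reference a provider alias named aws.us-east-1. A minimal sketch of the provider setup they seem to assume could look like this; the default region eu-west-1 is only an example and is not taken from the original setup.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Default provider for regional resources.
# Assumption: eu-west-1 is a placeholder, use your own region.
provider "aws" {
  region = "eu-west-1"
}

# Aliased provider referenced as aws.us-east-1 by the WAF resources.
provider "aws" {
  alias  = "us-east-1"
  region = "us-east-1"
}
```

Resources that do not set the provider argument use the default provider; the WAF resources select the alias explicitly.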
Step 1: Blocking WAF
The first step is to create a totally blocking system. This means that all requests, regardless of their origin, are rejected and never reach our CloudFront.
Let's take a moment to look at the parameters:
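resource "aws_wafv2_web_acl" "production_waf_acl" {
  # provider: the us-east-1 alias is required. CloudFront is not regional (it
  # runs as close as possible to the user, like any CDN), and a WAF must be
  # created in us-east-1 to be connected to CloudFront.
  provider    = aws.us-east-1
  name        = "production-waf-acl"
  description = "production-waf-acl"
  # scope: CLOUDFRONT, because the ACL sits outside the regional system
  scope       = "CLOUDFRONT"

  # Block every request by default
  default_action {
    block {}
  }

  # The rule blocks from the following steps go here.
  # visibility_config is required at the ACL level as well.
  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "production-waf-acl"
    sampled_requests_enabled   = true
  }
} # end of block aws_wafv2_web_acl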
We've therefore described, in Terraform, a WAF that refuses every request it receives, whatever its content, and returns a 403 error to the user trying to reach the service behind it.
Step 2: add a rule to authorize certain IPs
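# Goes inside the aws_wafv2_web_acl resource from step 1
rule {
  name = "ip-whitelisting"
  # Lower numbers are evaluated first and priorities must be unique in the ACL.
  # 10 leaves room for the blocking rules of step 3, which use 1 and 2.
  priority = 10

  action {
    allow {}
  }

  statement {
    ip_set_reference_statement {
      arn = aws_wafv2_ip_set.production.arn
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "false"
    sampled_requests_enabled   = true
  }
}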
resource "aws_wafv2_ip_set" "production" {
  provider           = aws.us-east-1
  name               = "production-ip-whitelisting-rules"
  scope              = "CLOUDFRONT"
  ip_address_version = "IPV4"
  addresses = [
    "127.0.0.1/32", # your IP address
  ]
}
The rule block is added inside the aws_wafv2_web_acl resource, which can hold several of them. Rules are evaluated in ascending order of their "priority" parameter, lowest number first, and each priority must be unique within the ACL. As soon as a request matches a rule whose action is allow or block, evaluation stops and the request is accepted or refused accordingly.
If no rule matches the request, the default action applies. In our case, that action is block.
Addresses not present in our IP set are therefore refused automatically.
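Since a rule block only makes sense inside the web ACL, here is a minimal sketch of how the pieces from steps 1 and 2 might fit together. It restates the same resource rather than adding a new one. The top-level visibility_config block and its metric name are assumptions, added because the resource requires one, and the priority value is arbitrary.

```hcl
resource "aws_wafv2_web_acl" "production_waf_acl" {
  provider    = aws.us-east-1
  name        = "production-waf-acl"
  description = "production-waf-acl"
  scope       = "CLOUDFRONT"

  # Anything that matches no rule is blocked.
  default_action {
    block {}
  }

  # Requests coming from the IP set are allowed.
  rule {
    name     = "ip-whitelisting"
    priority = 10 # assumption: any unique number works, lower numbers run first

    action {
      allow {}
    }

    statement {
      ip_set_reference_statement {
        arn = aws_wafv2_ip_set.production.arn
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name                = "ip-whitelisting" # hypothetical name
      sampled_requests_enabled   = true
    }
  }

  # ACL-level visibility settings (assumed values).
  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "production-waf-acl" # hypothetical name
    sampled_requests_enabled   = true
  }
}
```

Any further rule is added as another rule block in the same resource, with its own priority.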
Step 3: Add a rule to block bad requests
rule {
  name     = "bot-detection"
  priority = 2

  action {
    block {}
  }

  statement {
    regex_pattern_set_reference_statement {
      arn = aws_wafv2_regex_pattern_set.bot_regex_pattern.arn

      field_to_match {
        single_header {
          name = "akamai-bot"
        }
      }

      text_transformation {
        priority = 2
        type     = "LOWERCASE"
      }
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "false"
    sampled_requests_enabled   = true
  }
}
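We've also added a regex pattern set that acts as the condition for our rule: if the header value matches one of the patterns, the request is rejected. For this to hold even when our IP is whitelisted, the rule has to be evaluated before ip-whitelisting (that is, have a lower priority number), because an allow action ends the evaluation.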
resource "aws_wafv2_regex_pattern_set" "bot_regex_pattern" {
  provider = aws.us-east-1
  name     = "pattern"
  scope    = "CLOUDFRONT"

  regular_expression {
    regex_string = "requestanomaly|scraperreputation|wrong-format-bmp-endpoint|100-continue|wrong-capitalization|backslash|urlencoded|suspicious-hash|evasion-backslash|evasion-urlencoded"
  }
}
The CDN takes care of adding these headers. This means that users cannot modify the headers of their own requests on the fly in order to reach our CloudFront.
It is also possible to add the managed rules provided by AWS as an extra layer of security. This part needs to be adapted to your own business logic and to what actually runs behind it (EC2 with Linux, MongoDB databases, serverless).
rule {
  name     = "AWS-AWSManagedRulesKnownBadInputsRuleSet"
  priority = 1

  override_action {
    none {}
  }

  statement {
    managed_rule_group_statement {
      name        = "AWSManagedRulesKnownBadInputsRuleSet"
      vendor_name = "AWS"
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "false"
    sampled_requests_enabled   = true
  }
}
Step 4: Connect WAF to the front end
resource "aws_cloudfront_distribution" "tf" {
  origin {
    domain_name = aws_s3_bucket.bucketExample.bucket_regional_domain_name
    origin_id   = "myS3Origin"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.origin_access_identity.cloudfront_access_identity_path
    }
  }

  web_acl_id          = aws_wafv2_web_acl.production_waf_acl.arn
  enabled             = true
  default_root_object = "index.html"

  default_cache_behavior {
    viewer_protocol_policy = "redirect-to-https"
    compress               = true
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "myS3Origin"

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }
  }

  logging_config {
    include_cookies = false
    bucket          = "mylogs.s3.amazonaws.com"
    prefix          = "logFolder"
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
It's a basic CloudFront distribution containing just the web ACL reference (web_acl_id, which for a WAFv2 ACL takes its ARN) and an S3 origin. The default cache behavior redirects HTTP requests to HTTPS.
Most of the other parameters are mandatory and come from the CloudFront starter Terraform configuration.
And a basic S3 bucket, for example:
resource "aws_s3_bucket" "bucketExample" {
  bucket        = "my-tf-bucket-122113222"
  force_destroy = true

  tags = {
    Name        = "My bucket"
    Environment = "Dev"
  }
}
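The distribution references aws_cloudfront_origin_access_identity.origin_access_identity, which the snippets above do not define. A minimal sketch of what that resource and a matching bucket policy could look like follows, assuming the bucket should be readable only through CloudFront. The names s3_read and allow_cloudfront are hypothetical.

```hcl
# Identity that CloudFront uses when fetching objects from the bucket.
resource "aws_cloudfront_origin_access_identity" "origin_access_identity" {
  comment = "Access identity for the example bucket"
}

# Read-only policy granting that identity access to the bucket objects.
data "aws_iam_policy_document" "s3_read" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.bucketExample.arn}/*"]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn]
    }
  }
}

# Attach the policy to the example bucket.
resource "aws_s3_bucket_policy" "allow_cloudfront" {
  bucket = aws_s3_bucket.bucketExample.id
  policy = data.aws_iam_policy_document.s3_read.json
}
```

With this in place, the bucket does not need to be public: objects are served through the distribution, and therefore through the WAF.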
Conclusion
So we've just deployed a CloudFront distribution and its WAF, which filters incoming requests. With this architecture, your WAF helps protect against many of the well-known security problems, such as those listed in the OWASP Top 10. What's more, your CloudFront can now only be reached from IP addresses you trust.
Your CloudFront is ready to use, so go and test it!
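To make that test easier, one option is to expose the distribution's domain name as a Terraform output. This is only a sketch, and the output name cloudfront_domain_name is hypothetical.

```hcl
# Prints the CloudFront domain after "terraform apply".
output "cloudfront_domain_name" {
  description = "Domain name of the CloudFront distribution protected by the WAF"
  value       = aws_cloudfront_distribution.tf.domain_name
}
```

Opening that domain from a whitelisted IP should return the page, while a request from any other address should receive a 403.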