VPC Flow Logs Deep Dive - How They Work and Why You Need Them?

Updated Feb 10, 2026 · 15 min read

VPC Flow Logs is a powerful, built-in network monitoring feature provided by cloud platforms (AWS, GCP, Azure), designed to capture detailed metadata about IP traffic flowing through your network interfaces. By enabling seamless logging of accepted and rejected traffic flows, it empowers administrators to troubleshoot connectivity issues, detect anomalies, and comply with regulatory requirements, all without the need for additional agents or hardware.

Each flow log entry (a flow log record) documents a single network flow, characterized by source IP, destination IP, source port, destination port, and protocol. These records are aggregated over a configurable time window (called the aggregation interval) and then delivered to a destination of your choice for storage and analysis. Because flow log data is collected outside the path of your network traffic, enabling flow logs does not affect network throughput or latency, and it does not slow down your instances or degrade application performance.

Why Do You Need VPC Flow Logs?

Before VPC flow logging, teams often relied on host-based tooling or third-party network sensors, which increased operational overhead and usually lacked VPC-wide coverage. This approach was complex, added load to each instance, and provided a view limited to traffic visible at the instance level. VPC Flow Logs remove those constraints by providing centralized, agent-free visibility at three levels: the entire VPC, a specific subnet, or an individual network interface (ENI).

Here are the practical reasons to use VPC Flow Logs:

Security Monitoring and Threat Detection

VPC Flow Logs help you detect port scanning, network enumeration attempts, lateral movement after a host is compromised, and data exfiltration. If an attacker communicates with a compromised system on a non-standard port (say, HTTPS-style traffic sent to port 465, which is normally used for SMTPS), the flow log records show the unusual port and the peer address.

Troubleshooting Connectivity Issues

When an application can't reach a database or a service times out, flow logs help you pinpoint whether the traffic is being rejected by a security group, blocked by a network ACL, or never reaching the intended destination at all.

Compliance Monitoring

Many compliance standards, like PCI DSS and HIPAA, require organizations to monitor and log network traffic. VPC Flow Logs provide the comprehensive record needed for auditing and compliance reporting. AWS itself recommends enabling VPC Flow Logs for every VPC as a security best practice.

Cost Optimization

By analyzing traffic patterns, you can identify top talkers in your network, discover unnecessary cross-AZ traffic that's inflating your bill, and understand how your network resources are actually being utilized.
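As a rough illustration of the "top talkers" idea, the sketch below sums bytes per source address. It assumes records in the default 14-field, space-delimited format described later in this guide; the function name `top_talkers` and the sample records are made up for the example.

```python
from collections import Counter


def top_talkers(records, n=3):
    """Return the n source addresses that sent the most bytes."""
    bytes_by_src = Counter()
    for line in records:
        fields = line.split()
        # Default format: srcaddr is the 4th field, bytes the 10th,
        # log-status the 14th. Skip NODATA/SKIPDATA or malformed lines.
        if len(fields) != 14 or fields[13] != "OK":
            continue
        bytes_by_src[fields[3]] += int(fields[9])
    return bytes_by_src.most_common(n)


# Hypothetical sample records (default format).
sample = [
    "2 123456789012 eni-abcdef01 10.0.1.1 203.0.113.5 54321 80 6 10 3456 1510868790 1510868850 ACCEPT OK",
    "2 123456789012 eni-abcdef01 10.0.1.2 203.0.113.5 54400 443 6 4 900 1510868790 1510868850 ACCEPT OK",
    "2 123456789012 eni-abcdef01 10.0.1.1 203.0.113.9 54322 443 6 8 2000 1510868790 1510868850 ACCEPT OK",
]
print(top_talkers(sample))
```

At real volumes you would more likely run this kind of aggregation in a query engine than in a script, but the grouping logic is the same.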

How Do VPC Flow Logs Work?

The architecture behind VPC Flow Logs involves three stages: Capture, Aggregate, and Deliver.

A high-level overview of the VPC Flow Logs architecture

1. Capture

When you enable VPC Flow Logs, the cloud provider begins capturing metadata about IP traffic going to and from network interfaces. On AWS, the flow log service captures connection-level metadata for IP traffic flows passing through the elastic network interfaces (ENIs) within your configured scope (VPC, subnet, or individual ENI). In practice, not all traffic types are logged, and in rare cases records can be skipped (log-status = SKIPDATA). The capture happens outside the data path, so there is no performance penalty.

2. Aggregate

Captured flow metadata is grouped into flow log records, one per flow (source IP, destination IP, source port, destination port, protocol), within a time window called the aggregation interval. By default, this interval is 10 minutes on AWS. You can reduce it to 1 minute for more granular data, but this significantly increases the volume (and cost) of log records.

On Nitro-based instances (most modern EC2 instance types), the aggregation interval is always 1 minute or less, regardless of the configured setting.
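To make the aggregation step concrete, here is a simplified model of it. This is only a conceptual sketch, not how AWS implements the service: the `aggregate` function and the packet tuples are hypothetical, and real windows are not necessarily aligned the way this code aligns them.

```python
from collections import defaultdict


def aggregate(packets, interval=600):
    """Group packets into flow records keyed by (window, 5-tuple).

    packets: iterable of
        (timestamp, srcaddr, dstaddr, srcport, dstport, protocol, size)
    interval: aggregation interval in seconds (600 = 10 min, 60 = 1 min)
    """
    flows = defaultdict(
        lambda: {"packets": 0, "bytes": 0, "start": None, "end": None}
    )
    for ts, src, dst, sport, dport, proto, size in packets:
        window = ts - ts % interval  # start of the window this packet falls in
        rec = flows[(window, src, dst, sport, dport, proto)]
        rec["packets"] += 1
        rec["bytes"] += size
        rec["start"] = ts if rec["start"] is None else min(rec["start"], ts)
        rec["end"] = ts if rec["end"] is None else max(rec["end"], ts)
    return dict(flows)


# Three hypothetical packets of the same TCP flow, 30 and 90 seconds apart.
packets = [
    (600, "10.0.1.1", "203.0.113.5", 54321, 80, 6, 1500),
    (630, "10.0.1.1", "203.0.113.5", 54321, 80, 6, 1500),
    (690, "10.0.1.1", "203.0.113.5", 54321, 80, 6, 456),
]
print(len(aggregate(packets, interval=600)))  # one record per 10-minute window
print(len(aggregate(packets, interval=60)))   # more records with 1-minute windows
```

The same three packets produce more records at the shorter interval, which is why a 1-minute setting increases log volume.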

3. Deliver

After aggregation, the flow log records are delivered to your configured destination. On AWS, you have three options:

| Destination | Best For |
| --- | --- |
| Amazon CloudWatch Logs | Real-time analysis, quick queries using CloudWatch Logs Insights, metric filters and alerts |
| Amazon S3 | Long-term archival, cost-effective storage, big data analysis with Athena |
| Amazon Data Firehose | Streaming to third-party tools (SigNoz, Datadog, Splunk, etc.), real-time processing pipelines |

VPC Flow Logs Across Different Cloud Providers

While this guide primarily focuses on AWS, VPC Flow Logs are available across all major cloud providers. Here's a quick comparison:

| Feature | AWS VPC Flow Logs | GCP VPC Flow Logs | Azure NSG Flow Logs |
| --- | --- | --- | --- |
| Scope | VPC, Subnet, ENI | VPC Network, Subnet, VLAN Attachment, VPN Tunnel | Network Security Group |
| Aggregation Interval | 1 min or 10 min | 5 sec (default), 30 sec, 1 min, 5 min, 10 min, 15 min | 1 min |
| Log Destinations | CloudWatch, S3, Firehose | Cloud Logging | Azure Storage (Blob) as the primary destination, with optional downstream analytics/export (e.g., Traffic Analytics, SIEM tooling) |
| Sampling | All flows captured | Configurable (50% default, adjustable) | All flows captured |
| Format | Space-delimited or Parquet | JSON | JSON |
| Container-level visibility | Yes (ECS, Version 7) | Yes (GKE nodes) | Yes (AKS Container Network Logs / Advanced Container Networking Services) |

GCP's VPC Flow Logs work differently in a few important ways. They use a dynamic primary sampling rate that varies with host load. You also get a configurable secondary sampling rate (default 50%), meaning not all flows are guaranteed to be logged. The tradeoff is lower cost for high-traffic environments.

Azure has evolved from NSG Flow Logs (which log at the security group level) to the newer VNet Flow Logs, which provide virtual network-level logging, closer to what AWS and GCP offer at the VPC level.

VPC Flow Logs Format: Understanding the Record Structure

A flow log record is a space-delimited string, with each field representing a component of the network flow. Understanding the format is critical for writing queries, building dashboards, and debugging network issues.

Default Format (Version 2)

The default format includes 14 fields:

<version> <account-id> <interface-id> <srcaddr> <dstaddr> <srcport> <dstport> <protocol> <packets> <bytes> <start> <end> <action> <log-status>

Here's what a real flow log record looks like:

2 123456789012 eni-abcdef01 10.0.1.1 203.0.113.5 54321 80 6 10 3456 1510868790 1510868850 ACCEPT OK

Let's break down each field:

| Field | Value in Example | What it Means |
| --- | --- | --- |
| version | 2 | Flow log format version |
| account-id | 123456789012 | Your AWS account ID |
| interface-id | eni-abcdef01 | The ENI where traffic was captured |
| srcaddr | 10.0.1.1 | Source IP address |
| dstaddr | 203.0.113.5 | Destination IP address |
| srcport | 54321 | Source port |
| dstport | 80 | Destination port (HTTP) |
| protocol | 6 | IANA protocol number (6 = TCP, 17 = UDP, 1 = ICMP) |
| packets | 10 | Number of packets in the aggregation interval |
| bytes | 3456 | Number of bytes in the aggregation interval |
| start | 1510868790 | Start time of the capture window (Unix epoch seconds) |
| end | 1510868850 | End time of the capture window (Unix epoch seconds) |
| action | ACCEPT | Whether traffic was accepted or rejected |
| log-status | OK | Logging status (OK, NODATA, or SKIPDATA) |
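The field breakdown above maps directly onto a small parser. The following is a minimal sketch that assumes the default 14-field format; `parse_default` and `DEFAULT_FIELDS` are names chosen for this example, not part of any AWS SDK.

```python
DEFAULT_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]
NUMERIC_FIELDS = ("srcport", "dstport", "protocol", "packets", "bytes", "start", "end")
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}  # IANA protocol numbers


def parse_default(line):
    """Parse one default-format flow log record into a dict."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}")
    rec = dict(zip(DEFAULT_FIELDS, values))
    for name in NUMERIC_FIELDS:
        # NODATA and SKIPDATA records use "-" where no value was recorded.
        rec[name] = None if rec[name] == "-" else int(rec[name])
    return rec


rec = parse_default(
    "2 123456789012 eni-abcdef01 10.0.1.1 203.0.113.5 54321 80 6 10 3456 "
    "1510868790 1510868850 ACCEPT OK"
)
print(PROTOCOLS[rec["protocol"]], rec["dstport"], rec["end"] - rec["start"])
```

Keeping the field list in one place makes it easy to swap in a custom format later: only the list of names changes.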

Custom Format Fields (Version 3+)

The default format gives you the basics, but AWS has introduced additional fields across multiple versions that provide significantly deeper visibility. You access these by choosing a custom format when creating the flow log.

When using a custom format, the ${version} field is automatically set to the highest version number among all the fields you include; you don't specify the version manually. For example, if you include reject-reason (v8), your log records will show version = 8.
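The rule can be expressed as a simple maximum over per-field versions. The sketch below is illustrative only: `FIELD_VERSIONS` is a partial, hand-written mapping based on the field tables in this guide, and `effective_version` is a hypothetical helper, not an AWS API.

```python
# Partial mapping of field name -> version that introduced it,
# taken from the tables in this guide (not an exhaustive list).
FIELD_VERSIONS = {
    "srcaddr": 2, "dstaddr": 2, "srcport": 2, "dstport": 2, "action": 2,
    "vpc-id": 3, "subnet-id": 3, "tcp-flags": 3, "pkt-srcaddr": 3,
    "region": 4, "az-id": 4,
    "flow-direction": 5, "traffic-path": 5,
    "reject-reason": 8,
    "resource-id": 9,
    "encryption-status": 10,
}


def effective_version(fields):
    """Return the version a record would report for the given custom fields."""
    return max(FIELD_VERSIONS[name] for name in fields)


print(effective_version(["srcaddr", "dstaddr", "action", "reject-reason"]))
print(effective_version(["srcaddr", "vpc-id", "flow-direction"]))
```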

Version 3 fields (VPC/Subnet/Instance context):

| Field | Purpose |
| --- | --- |
| vpc-id | Identifies which VPC the traffic belongs to |
| subnet-id | Identifies the subnet |
| instance-id | Maps traffic to a specific EC2 instance |
| tcp-flags | Bitmask of TCP flags (SYN, ACK, FIN, RST); critical for understanding connection state |
| type | IPv4 or IPv6 |
| pkt-srcaddr | Original packet source IP (important when traffic passes through NAT or load balancers) |
| pkt-dstaddr | Original packet destination IP |

Version 4 fields (Location awareness):

| Field | Purpose |
| --- | --- |
| region | AWS region of the flow |
| az-id | Availability Zone ID |
| sublocation-type | Type of sublocation (e.g., wavelength) |
| sublocation-id | ID of the sublocation |

Version 5 fields (Service and direction awareness):

| Field | Purpose |
| --- | --- |
| pkt-src-aws-service | AWS service name for the source (e.g., S3, DYNAMODB) |
| pkt-dst-aws-service | AWS service name for the destination |
| flow-direction | Whether traffic is ingress or egress |
| traffic-path | The path egress traffic takes to its destination (e.g., through an internet gateway, NAT gateway, or VPC peering connection) |

Version 7 fields (ECS container visibility):

Version 7, introduced in 2024, adds 10 new fields specific to Amazon ECS workloads, giving you visibility into container-level traffic for the first time, including task IDs, cluster ARNs, container instance IDs, and service names.

Version 8+ fields (Enhanced troubleshooting):

| Field | Version | Purpose |
| --- | --- | --- |
| reject-reason | 8 | Reason traffic was rejected (currently BPA or EC; ‘-’ otherwise) |
| resource-id | 9 | Resource ID associated with the flow (e.g., NAT gateway ID, transit gateway ID) |
| encryption-status | 10 | Encryption status for the flow |

If you're setting this up for the first time, here is a recommended custom format that balances depth and cost. It covers security, troubleshooting, and cost optimization use cases:

${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status} ${vpc-id} ${subnet-id} ${instance-id} ${tcp-flags} ${type} ${pkt-srcaddr} ${pkt-dstaddr} ${flow-direction} ${traffic-path}

Add pkt-src-aws-service and pkt-dst-aws-service if you need to trace traffic to specific AWS services.

Log File Format: Plain Text vs Parquet

When publishing to S3, you can choose between two file formats:

  • Plain Text: The default. Human-readable, space-delimited records. Simple but slower to query at scale.
  • Apache Parquet: A columnar data format. AWS documentation states that queries on Parquet data are 10 to 100 times faster compared to plain text, and the format compresses better, reducing storage costs significantly.

For any production deployment publishing to S3, use Parquet.
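As a sketch of how the recommended custom format and the Parquet choice might come together, the helper below only assembles request parameters for the EC2 CreateFlowLogs API as exposed by boto3's `create_flow_logs`. The function name, VPC ID, and bucket ARN are placeholders, and the actual API call is left as a comment because it needs boto3, credentials, and an existing bucket.

```python
RECOMMENDED_FIELDS = (
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes", "start", "end",
    "action", "log-status", "vpc-id", "subnet-id", "instance-id",
    "tcp-flags", "type", "pkt-srcaddr", "pkt-dstaddr",
    "flow-direction", "traffic-path",
)


def build_flow_log_request(vpc_id, bucket_arn, fields=RECOMMENDED_FIELDS):
    """Assemble keyword arguments for ec2.create_flow_logs (S3 + Parquet)."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",            # or "ACCEPT" / "REJECT"
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        # Custom format string: "${version} ${account-id} ..."
        "LogFormat": " ".join("${%s}" % name for name in fields),
        "MaxAggregationInterval": 600,   # seconds; 60 for 1-minute records
        "DestinationOptions": {"FileFormat": "parquet"},
    }


# Placeholder identifiers, for illustration only.
request = build_flow_log_request("vpc-abc123", "arn:aws:s3:::example-flow-logs-bucket")
print(request["LogFormat"])

# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   boto3.client("ec2").create_flow_logs(**request)
```

Building the parameters separately from the call makes the format string easy to review and test before anything is created in the account.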

VPC Flow Logs Examples: Reading and Interpreting Records

Understanding how to read flow log records is essential for effective troubleshooting. Here are the most common patterns and what they mean:

Example 1: Security Group Rejecting Inbound Traffic

2 123456789012 eni-abcdef01 203.0.113.50 10.0.1.15 49152 3389 6 20 4000 1510868790 1510868850 REJECT OK

This record shows someone from an external IP 203.0.113.50 trying to connect to your instance on port 3389 (RDP). The traffic was rejected, which means your security group or network ACL is working correctly. However, if you see hundreds of these from different source IPs, it likely indicates a brute-force or port scanning attempt.
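One way to spot the "hundreds of these from different source IPs" pattern is to count distinct rejected sources per destination port. This is a minimal sketch assuming default-format records; `rejected_sources_by_port` and the sample lines are hypothetical, and any alerting threshold would be your own choice.

```python
from collections import defaultdict


def rejected_sources_by_port(lines):
    """Map destination port -> number of distinct source IPs that were rejected."""
    sources = defaultdict(set)
    for line in lines:
        f = line.split()
        # Default format: srcaddr=f[3], dstport=f[6], action=f[12], log-status=f[13]
        if len(f) == 14 and f[12] == "REJECT" and f[13] == "OK":
            sources[int(f[6])].add(f[3])
    return {port: len(srcs) for port, srcs in sources.items()}


# Hypothetical records: three different sources probing RDP, one accepted HTTPS flow.
lines = [
    "2 123456789012 eni-abcdef01 203.0.113.50 10.0.1.15 49152 3389 6 20 4000 1510868790 1510868850 REJECT OK",
    "2 123456789012 eni-abcdef01 203.0.113.51 10.0.1.15 49153 3389 6 12 2400 1510868790 1510868850 REJECT OK",
    "2 123456789012 eni-abcdef01 203.0.113.52 10.0.1.15 49154 3389 6 9 1800 1510868790 1510868850 REJECT OK",
    "2 123456789012 eni-abcdef01 203.0.113.60 10.0.1.15 50000 443 6 30 9000 1510868790 1510868850 ACCEPT OK",
]
print(rejected_sources_by_port(lines))
```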

Example 2: Accepted Traffic to an Unexpected Port

2 123456789012 eni-abcdef01 10.0.1.15 198.51.100.25 43567 8443 6 15 5200 1510868790 1510868850 ACCEPT OK

Your instance at 10.0.1.15 is sending traffic to an external IP on port 8443. If this isn't expected (for example, if your application should only communicate with known API endpoints on port 443), it could indicate a compromised instance or a misconfigured application.
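A check like this can be automated with an allow-list of expected destination ports. The sketch below assumes default-format records, a VPC CIDR of 10.0.0.0/16, and that only port 443 is expected; all three are assumptions for illustration, as is the function name.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")  # assumed VPC range
ALLOWED_EGRESS_PORTS = {443}                    # assumed allow-list


def unexpected_egress(lines, allowed_ports=ALLOWED_EGRESS_PORTS):
    """Return (srcaddr, dstaddr, dstport) for accepted flows that leave the
    VPC on a destination port outside the allow-list."""
    hits = []
    for line in lines:
        f = line.split()
        if len(f) != 14 or f[12] != "ACCEPT" or f[13] != "OK":
            continue
        src = ipaddress.ip_address(f[3])
        dst = ipaddress.ip_address(f[4])
        dstport = int(f[6])
        # The default format has no flow-direction field, so "source inside
        # the VPC, destination outside" is used here as a rough proxy.
        if src in VPC_CIDR and dst not in VPC_CIDR and dstport not in allowed_ports:
            hits.append((f[3], f[4], dstport))
    return hits


lines = [
    "2 123456789012 eni-abcdef01 10.0.1.15 198.51.100.25 43567 8443 6 15 5200 1510868790 1510868850 ACCEPT OK",
    "2 123456789012 eni-abcdef01 10.0.1.15 198.51.100.30 43568 443 6 15 5200 1510868790 1510868850 ACCEPT OK",
]
print(unexpected_egress(lines))
```

Because response packets of inbound connections also have an in-VPC source, a custom format with flow-direction gives a more reliable signal than this proxy.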

Example 3: NODATA Records

2 123456789012 eni-abcdef01 - - - - - - - 1510868790 1510868850 - NODATA

No traffic was recorded for this ENI during the aggregation window. This is normal for idle interfaces.

Example 4: Custom Format with Flow Direction and AWS Services

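Using a version 5 custom format (the 14 default fields followed by vpc-id, subnet-id, instance-id, tcp-flags, type, pkt-srcaddr, pkt-dstaddr, pkt-src-aws-service, pkt-dst-aws-service, flow-direction, and traffic-path), here's a record showing an EC2 instance calling Amazon S3:

5 123456789012 eni-abcdef01 10.0.1.15 52.217.33.14 43567 443 6 15 5200 1510868790 1510868850 ACCEPT OK vpc-abc123 subnet-def456 i-0123456789 2 IPv4 10.0.1.15 52.217.33.14 - S3 egress 8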

The traffic-path value of 8 indicates the traffic went through an internet gateway. The pkt-dst-aws-service shows it's going to S3. If you're expecting this instance to access S3 via a VPC endpoint (private path), this record reveals it's actually going over the public internet, a common misconfiguration that's both slower and more expensive.
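The same check can be scripted. This sketch assumes a 25-field custom format (the 14 default fields followed by the version 3 and version 5 fields listed in `CUSTOM_FIELDS`) and uses a hypothetical sample record in which pkt-src-aws-service is "-" because the source is a private address. The helper names are invented for the example.

```python
CUSTOM_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes", "start", "end",
    "action", "log-status", "vpc-id", "subnet-id", "instance-id",
    "tcp-flags", "type", "pkt-srcaddr", "pkt-dstaddr",
    "pkt-src-aws-service", "pkt-dst-aws-service",
    "flow-direction", "traffic-path",
]
INTERNET_GATEWAY_PATH = "8"  # traffic-path value for an internet gateway


def parse_custom(line, fields=CUSTOM_FIELDS):
    """Parse a custom-format record, given the field order used at creation."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    return dict(zip(fields, values))


def s3_over_internet_gateway(rec):
    """True if the record is egress traffic to S3 that left via an internet gateway."""
    return (
        rec["pkt-dst-aws-service"] == "S3"
        and rec["flow-direction"] == "egress"
        and rec["traffic-path"] == INTERNET_GATEWAY_PATH
    )


record = (
    "5 123456789012 eni-abcdef01 10.0.1.15 52.217.33.14 43567 443 6 15 5200 "
    "1510868790 1510868850 ACCEPT OK vpc-abc123 subnet-def456 i-0123456789 "
    "2 IPv4 10.0.1.15 52.217.33.14 - S3 egress 8"
)
print(s3_over_internet_gateway(parse_custom(record)))
```

The field list must match the order you chose when creating the flow log, since custom-format records carry no header in plain text.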

Example 5: TCP Flags for Connection Analysis

When using a custom format with tcp-flags, you can track the full lifecycle of TCP connections:

  • SYN (2): Connection initiation; a client is trying to connect.
  • SYN-ACK (18): Server acknowledging the connection.
  • FIN (1): Graceful connection close.
  • RST (4): Abrupt connection termination; this often indicates a problem.

If you see many SYN packets from a single source, but no SYN-ACK responses, it's likely a SYN flood or port scan.
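Since tcp-flags is a bitmask, it can be decoded with bitwise tests. The sketch below uses the values listed above, with ACK = 16 inferred from SYN-ACK = 18; the function names and the sample observations are hypothetical.

```python
from collections import Counter

# Bit values for the flags discussed above (ACK = 16, since SYN-ACK = 2 + 16).
TCP_FLAG_BITS = {"FIN": 1, "SYN": 2, "RST": 4, "ACK": 16}


def decode_tcp_flags(value):
    """Return the names of the flags set in a tcp-flags bitmask."""
    return [name for name, bit in TCP_FLAG_BITS.items() if value & bit]


def syn_only_counts(observations):
    """Count records per source whose tcp-flags value is SYN only (2).

    observations: iterable of (srcaddr, tcp_flags) pairs taken from records.
    """
    return Counter(src for src, flags in observations if flags == 2)


print(decode_tcp_flags(18))  # SYN and ACK bits set
print(decode_tcp_flags(4))   # RST only

# Hypothetical observations: one source sends only SYNs, another completes handshakes.
observations = [
    ("203.0.113.50", 2), ("203.0.113.50", 2), ("203.0.113.50", 2),
    ("10.0.1.20", 18), ("10.0.1.20", 1),
]
print(syn_only_counts(observations).most_common(1))
```

Flags seen during one aggregation interval can be OR-ed together in a single record, so a short-lived connection may show a combined value such as 3 (SYN and FIN) rather than separate records per flag.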

Next Step

VPC Flow Logs provide comprehensive, agent-free visibility into your network traffic at the VPC, subnet, or ENI level. Understanding the architecture, how logs are captured, aggregated, and delivered, and the record format, including its various versions, helps you make informed decisions about which fields to include and how to structure your logging strategy.

The next part, "How to enable VPC Flow Logs in AWS?", walks you through the practical steps: enabling flow logs via the Console and CLI, querying them with CloudWatch Logs Insights and Athena, and setting up advanced monitoring for unified observability across your entire stack.

FAQs

Are VPC Flow Logs free?

No. VPC Flow Logs are billed as CloudWatch Vended Logs (data ingestion + archival) regardless of destination, plus destination-specific charges (e.g., S3 storage/requests, CloudWatch Logs storage/queries, Firehose delivery stream costs). Pricing is region and tier dependent.

Do VPC Flow Logs affect network performance?

No. Flow log data is collected outside the path of your network traffic. Enabling them has no impact on network throughput or latency.

Can I enable VPC Flow Logs on a peered VPC?

Yes, but you can only enable flow logs for a peered VPC if the peer VPC is in your own AWS account. You cannot enable flow logs for VPCs peered with VPCs in other accounts.

How long does it take for flow logs to appear?

Flow logs typically appear within about 5 minutes in CloudWatch Logs and about 10 minutes in S3. Delivery is best-effort, so records may occasionally be delayed.

Can I capture only rejected traffic?

Yes. When creating a flow log, set the filter to Reject. This significantly reduces data volume and cost while still providing security-relevant information.

What is the difference between srcaddr and pkt-srcaddr?

srcaddr shows the IP address of the network interface from which traffic was captured. pkt-srcaddr (available in version 3+) shows the original source IP from the packet header. They differ when traffic passes through intermediate services like NAT gateways or load balancers.

How do VPC Flow Logs differ from VPC Traffic Mirroring?

VPC Flow Logs capture connection-level metadata (IP addresses, ports, protocols, and byte counts). VPC Traffic Mirroring captures actual packet contents for deep inspection. Flow logs are for monitoring and analysis; traffic mirroring is for forensic investigation and protocol analysis.

Can I modify an existing VPC Flow Log?

No. Once created, you cannot change a flow log's configuration. You need to delete it and create a new one with the desired settings.

What traffic is not captured by VPC Flow Logs?

Flow logs do not capture DNS queries to Amazon DNS servers, DHCP traffic, instance metadata requests (169.254.169.254), Windows license activation traffic, Amazon Time Sync Service traffic, or traffic to the VPC router's reserved IP address. Queries to your own DNS servers are captured.

How do I search VPC Flow Logs?

If your logs are in CloudWatch, use CloudWatch Logs Insights, select the log group containing your flow logs, and use the built-in query language to filter by source IP, port, action, etc. If your logs are in S3, use Amazon Athena to run SQL queries directly against the stored files. See the action guide for example queries.
