API Investigation

Detective Sarah Kim got the call at 3:47 AM. Quantum Financial Services had lost $15 million overnight, but nothing looked wrong. Fifty thousand transactions, all under $500, all from authenticated users, all following proper API protocols.

"Walk me through what controls should have stopped this," Sarah said.

That question changed everything. Attackers had obtained credentials for 10,000 accounts through a breach at an unrelated company where users had reused passwords. Using these stolen credentials, they logged into Quantum's mobile API and fired several small overdraft requests at each account in parallel. Each overdraft was within policy limits. But the overdraft protection logic had a flaw: it checked the account balance once, before processing a request, and never rechecked or locked it while the debit executed. By sending requests faster than the system could update balances, attackers withdrew more than each account held.

Sarah's job was to reconstruct what happened, when it started, and what evidence could identify the attackers. The challenge: every individual transaction looked legitimate. The crime only became visible in patterns across thousands of accounts.

This story is fictional, but the patterns are real.

Why This Matters

Business Logic Attacks explained how these exploits work. This article covers what happens after: how investigators piece together what occurred, what evidence exists, and why API-based attacks are harder to investigate than traditional breaches.

Understanding investigation concepts won't make you a forensic analyst. But it will help you recognize what evidence matters when you spot suspicious patterns, and why some attacks leave clear trails while others don't.

Why API Investigations Are Different

Traditional digital forensics looks for artifacts: files copied, databases accessed, malware installed. API forensics is different.

The evidence isn't in individual requests. A single API call that transfers $500 looks identical whether it's legitimate or fraudulent. The evidence lives in relationships between calls: timing patterns, sequences, volumes, and correlations across accounts.

There's no single crime scene. An API attack might touch authentication servers, business logic services, databases, payment processors, and logging systems. Each component has partial evidence. Reconstructing the full picture requires correlating across all of them.

The attacker used the front door. Traditional intrusions leave traces of breaking in: exploited vulnerabilities, unauthorized access, privilege escalation. Business logic attacks use valid credentials and authorized actions. The system worked correctly; the rules were wrong.

Speed matters more. File-based evidence persists. API logs often rotate quickly, and the attack might still be running. Evidence preservation in the first hour can determine whether investigation succeeds or fails.

What Evidence Exists

API attacks leave traces at multiple layers. Understanding what exists helps you know what to look for.

Request Evidence

Every API call generates data:

Evidence Type	What It Shows
HTTP method and endpoint	What action was attempted
Request parameters	What values were submitted
Response codes	Whether the request succeeded
Response times	How long processing took
Timestamps	When the request occurred
Session and authentication tokens	Who made the request
IP addresses and user agents	Where and how the request originated

This data appears in API gateway logs, application logs, and web server access logs. The challenge isn't finding it; it's correlating it across systems and time.

Business Logic Evidence

Beyond raw requests, systems track business state:

Workflow progression: Did the user complete checkout steps in order, or skip directly to confirmation?
State transitions: Did a claim move from "pending" to "approved" without passing through "reviewed"?
Calculated values: Do the numbers in the transaction match what the business rules should produce?
Timing relationships: Did balance checks happen before or after transfers executed?

This evidence often lives in application databases and audit logs rather than API logs.
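
The timing-relationship question above can be checked mechanically when an audit log records both the balance check and the debit for each request. The following is a minimal sketch under stated assumptions: the record shape (`account`, `request_id`, `kind`, `ts`) and the kind names `balance_check` and `debit` are hypothetical, not taken from any real system. It flags requests whose debit relied on a balance that another request had already changed.

```python
from collections import defaultdict
from typing import NamedTuple


class AuditEvent(NamedTuple):
    # Hypothetical audit-log record; all field names are assumptions.
    account: str
    request_id: str
    kind: str   # "balance_check" or "debit"
    ts: float   # seconds since some common epoch


def find_stale_checks(events):
    """Return sorted (account, request_id) pairs where another request's
    debit on the same account landed between this request's balance
    check and its own debit, so the check was stale when it was used."""
    by_account = defaultdict(dict)
    for e in events:
        by_account[e.account].setdefault(e.request_id, {})[e.kind] = e.ts

    stale = []
    for account, requests in by_account.items():
        # Only requests with both a check and a debit can be compared.
        complete = {
            rid: r for rid, r in requests.items()
            if "balance_check" in r and "debit" in r
        }
        for rid, r in complete.items():
            for other_id, other in complete.items():
                if other_id == rid:
                    continue
                if r["balance_check"] < other["debit"] < r["debit"]:
                    stale.append((account, rid))
                    break
    return sorted(stale)


events = [
    # acct-1: both checks happen before either debit (interleaved).
    AuditEvent("acct-1", "r1", "balance_check", 0.000),
    AuditEvent("acct-1", "r2", "balance_check", 0.001),
    AuditEvent("acct-1", "r1", "debit", 0.010),
    AuditEvent("acct-1", "r2", "debit", 0.011),
    # acct-2: each request checks and debits before the next one starts.
    AuditEvent("acct-2", "r3", "balance_check", 0.000),
    AuditEvent("acct-2", "r3", "debit", 0.010),
    AuditEvent("acct-2", "r4", "balance_check", 0.020),
    AuditEvent("acct-2", "r4", "debit", 0.030),
]
stale_requests = find_stale_checks(events)
```

In this sample only request `r2` on `acct-1` is flagged. A flag is a lead rather than proof: it shows the ordering that makes an overdraw possible, not that one occurred.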

Behavioral Evidence

Patterns across requests can suggest automation, but sophisticated attackers know this and adapt:

Timing regularity: Unsophisticated bots show mechanical precision, with near-identical gaps between requests. Attackers who know they're being watched add random delays. But uniformly random delays can themselves be a signal, since real humans cluster around certain intervals rather than spreading evenly.
Impossible geography: A request from London followed by one from Tokyo three minutes later. No human traveled that fast. But residential proxy networks make this less reliable than it used to be.
Scale and duration: A human might make 50 requests in a session. An attack might make 50,000. But distributed attacks across many accounts can hide this volume.
Error response patterns: Real users read error messages and change behavior. Automated attacks often retry immediately, or cycle through a list regardless of responses.
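
One rough way to quantify timing regularity is the coefficient of variation of the gaps between an account's requests. This is a sketch only: the function name and the sample timestamps are illustrative assumptions, and a low value is a hint to look closer, not a verdict, since attackers can add jitter as noted above.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation (stdev / mean) of the gaps between
    consecutive requests. Values near 0 mean near-constant spacing.
    Returns None when there are too few requests to judge."""
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg


# Illustrative request times in seconds (made-up sample data).
bot_times = [0.0, 2.0, 4.0, 6.0, 8.0]        # fixed 2-second cadence
human_times = [0.0, 1.5, 9.0, 10.0, 42.0]    # bursts and pauses

bot_cv = interval_cv(bot_times)
human_cv = interval_cv(human_times)
```

Here the fixed cadence yields a value of zero while the irregular sequence yields a value above one. Any cut-off between "mechanical" and "human" would have to be calibrated against the system's own baseline traffic.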

What's Often Missing

Some evidence you might expect doesn't exist:

Intent. Logs show what happened, not why. A negative transfer amount could be a bug, a test, or an attack.
Attribution. IP addresses identify machines, not people. Session tokens identify accounts, not who controls them.
Complete timelines. Log retention varies. Some systems keep days of history; others keep hours.
Unlogged actions. If the system didn't log a particular field or action, that evidence is gone.

How Timelines Get Reconstructed

The core of API investigation is building a chronological picture of what happened. This sounds simple but requires correlating data from multiple sources.

Finding the Starting Point

Investigations often begin at discovery, not at the attack's start. Working backward usually follows these steps:

Identify affected accounts or transactions from the known damage
Pull all activity for those accounts in a reasonable time window
Look for the earliest anomaly that fits the attack pattern
Expand the search to related accounts, IPs, or sessions

The attack often started days or weeks before the damage became visible. Reconnaissance and testing phases leave traces if you know to look.

Correlating Across Systems

A single transaction might appear in:

API gateway logs (request received)
Authentication service logs (token validated)
Business logic service logs (rules applied)
Database audit logs (data changed)
Payment processor logs (funds moved)

Each system has its own timestamp format, its own log structure, and its own retention policy. Matching events across systems requires common identifiers: transaction IDs, session IDs, or correlation IDs that flow through the entire chain.

When those identifiers don't exist, investigators fall back on timing and sequence: what happened within milliseconds of what else?
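
As a minimal sketch of cross-system correlation, the code below normalizes two different timestamp formats to UTC and groups records by a shared correlation ID. The record keys (`correlation_id`, `ts`), the system names, and the sample values are assumptions for illustration; real logs will need their own parsers.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_iso(ts):
    # ISO 8601 with an explicit UTC offset, e.g. "2024-01-01T12:00:00+09:00".
    return datetime.fromisoformat(ts).astimezone(timezone.utc)


def parse_epoch_ms(ms):
    # Milliseconds since the Unix epoch.
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def build_timeline(sources):
    """sources: iterable of (system_name, records, timestamp_parser).
    Each record is a dict with 'correlation_id' and 'ts' keys (assumed).
    Returns {correlation_id: [(utc_time, system_name), ...]}, time-sorted."""
    timeline = defaultdict(list)
    for system, records, parse in sources:
        for rec in records:
            timeline[rec["correlation_id"]].append((parse(rec["ts"]), system))
    for entries in timeline.values():
        entries.sort()
    return dict(timeline)


# Made-up sample records: the gateway logs local time with an offset,
# the database audit log uses epoch milliseconds.
gateway_logs = [{"correlation_id": "c-1", "ts": "2024-01-01T12:00:00+09:00"}]
db_audit = [{"correlation_id": "c-1", "ts": 1704078000250}]

timeline = build_timeline([
    ("database", db_audit, parse_epoch_ms),
    ("gateway", gateway_logs, parse_iso),
])
```

After normalization both records fall within the same second of UTC time, with the gateway entry first. Without the conversion the two timestamps could not be compared at all.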

Recognizing Coordinated Attacks

Individual account analysis (covered above) reveals whether one account behaves like a bot. Cross-account analysis reveals whether multiple accounts are working together:

Shared infrastructure: Different accounts using the same IP addresses, device fingerprints, or session patterns suggest common control.
Synchronized timing: When 500 accounts all perform the same action within a 30-second window, that's not coincidence. Sarah's case showed 50,000 overdrafts across 10,000 accounts clustered in a two-hour window.
Identical behavior sequences: Different accounts following the exact same click paths, in the same order, with similar timing, indicates automation from a single script.
Complementary actions: One set of accounts creates value (generating refunds, earning rewards); another set extracts it (cashing out, transferring funds). The accounts look independent until you map the money flow.
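
The synchronized-timing signal can be approximated with a sliding window over events grouped by action. The sketch below assumes events arrive as `(timestamp_seconds, account_id, action)` tuples, which is a hypothetical shape, and reports the largest number of distinct accounts seen performing each action inside any one window.

```python
from collections import Counter, defaultdict


def peak_synchrony(events, window_seconds=30):
    """events: iterable of (timestamp_seconds, account_id, action).
    Returns {action: max distinct accounts performing that action
    within any span of window_seconds}."""
    by_action = defaultdict(list)
    for ts, account, action in events:
        by_action[action].append((ts, account))

    peaks = {}
    for action, items in by_action.items():
        items.sort()
        in_window = Counter()  # account -> events currently in the window
        left = 0
        best = 0
        for ts, account in items:
            in_window[account] += 1
            # Drop events older than the window that ends at ts.
            while items[left][0] < ts - window_seconds:
                old_account = items[left][1]
                in_window[old_account] -= 1
                if in_window[old_account] == 0:
                    del in_window[old_account]
                left += 1
            best = max(best, len(in_window))
        peaks[action] = best
    return peaks


# Made-up sample: five accounts overdraft within five seconds,
# while two logins are far apart.
events = [(100 + i, f"acct-{i}", "overdraft") for i in range(5)]
events += [(0, "acct-a", "login"), (1000, "acct-b", "login")]

peaks = peak_synchrony(events)
```

What counts as suspicious depends on normal traffic: a payday or a marketing push can also synchronize legitimate users, so any threshold applied to these peaks is a judgment call.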

Evidence Preservation

API evidence is volatile. Logs rotate, systems restart, and attackers may still be active. What happens in the first hours matters.

The Golden Hour

When an API attack is discovered, the priority is capturing evidence before it disappears:

Freeze log rotation so historical data isn't overwritten
Snapshot current system state including running processes, active sessions, and configuration
Capture network traffic if the attack is ongoing
Document the discovery including what was noticed, when, and by whom

This isn't about analysis yet. It's about ensuring evidence exists to analyze later.

Chain of Custody

For evidence to be useful in legal proceedings or formal investigations, it needs documentation:

What was collected: File names, sizes, sources
When it was collected: Timestamps for each acquisition
Who collected it: Names and roles
How it was preserved: Hashes proving files weren't modified, secure storage locations
Who accessed it: Log of everyone who touched the evidence

Without this documentation, evidence may be challenged or excluded.
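
The "how it was preserved" item usually rests on cryptographic hashes. As a minimal sketch, the code below hashes an evidence file with SHA-256 and builds a record covering several of the fields listed above. The function names and record keys are illustrative assumptions, not a standard format, and a real process would also log every later access.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large log files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path, collected_by, source):
    """Build a record of what was collected, when, by whom, and its hash."""
    p = Path(path)
    return {
        "file_name": p.name,
        "size_bytes": p.stat().st_size,
        "source": source,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "collected_by": collected_by,
        "sha256": sha256_file(p),
    }


def verify(path, record):
    """True if the file still matches the hash taken at collection time."""
    return sha256_file(path) == record["sha256"]
```

A call such as `custody_record("gateway.log", "analyst name", "api-gateway")` (hypothetical file and names) would be made at acquisition, and `verify` rerun whenever the file is handed over or analyzed. A mismatch means the copy has changed since collection.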

What Gets Lost

Some evidence is gone before anyone knows to look:

Ephemeral logs that rotate hourly or daily
Session state that exists only in memory
Intermediate calculations that aren't logged
Network traffic that wasn't captured
Attacker cleanup if they had time to delete traces

Investigations often work with incomplete pictures. Knowing what's missing is as important as knowing what's present.

The Attribution Challenge

Identifying what happened is hard. Identifying who did it is harder.

What Evidence Shows

API evidence typically reveals:

Which accounts were involved
Which IP addresses made requests
Which devices (via fingerprinting) were used
What timing patterns characterized the activity

What Evidence Doesn't Show

Who controlled the accounts. Stolen credentials mean the account owner isn't the attacker.
Who controlled the IPs. VPNs, proxies, and botnets obscure origin.
Who controlled the devices. Malware can automate attacks from compromised machines.
Organizational structure. Whether attackers are individuals, groups, or state-sponsored.

Attribution in API attacks often stops at "we know what accounts and infrastructure were used." Going further requires evidence beyond API logs: intelligence sources, law enforcement resources, or operational security mistakes by attackers.

Key Takeaways

API evidence lives in relationships. Individual requests look normal. Patterns across requests, accounts, and time reveal attacks.
Speed matters. Log rotation and ongoing attacks mean evidence preservation in the first hours is critical. What's not captured then may be gone forever.
Business logic knowledge is essential. Investigators need to understand what the system should do to recognize what it shouldn't have done.
Attribution is limited. API logs show accounts and infrastructure, not people. Proving who controlled them requires evidence from beyond the logs.
Incomplete pictures are normal. Investigations work with what's available. Knowing what's missing helps interpret what's present.

What's next: This completes the API Abuse module. For related concepts, see Following the Money for financial investigation techniques, or return to Fraud Basics to continue the learning path.

Key Terms

API forensics: The discipline of investigating crimes committed through API abuse, focusing on request patterns and business logic exploitation rather than traditional file-based artifacts.
Correlation ID: A unique identifier that flows through all systems involved in a transaction, enabling investigators to match related events across logs.
Log rotation: Automatic deletion or archiving of old log files, which can destroy evidence if not paused during investigation.
Chain of custody: A documented record of what evidence was collected, by whom, when, and how it was stored and accessed, used to show it hasn't been tampered with. Typically required for legal proceedings.
Behavioral fingerprint: Patterns in request timing, sequences, and errors that distinguish automated attacks from human activity.
Golden hour: The critical first hour after attack discovery when evidence preservation determines investigation success.
Attribution: Determining who is responsible for an attack, which is harder than determining what happened.

Generated with AI assistance. Reviewed by humans for accuracy.
