Thursday, March 31, 2005

remote hardware fingerprinting - how it works

Tadayoshi Kohno, Andre Broido, and kc claffy at the San Diego Supercomputer Center have published a paper called "Remote physical device fingerprinting." The paper discusses a technique for remotely identifying physical computer hardware. This article goes into how it works. I'll follow up with another that talks more about what it means.

First, here's the explanation for non-techies.

Their technique relies on a few things. (1) Every clock runs a little fast or a little slow. (2) Computers have clocks. (3) If the clock on a computer tends to gain 15 seconds per day, that rate stays pretty much constant, and the same goes for a clock that loses time. It doesn't gain 15 seconds one day and lose 15 seconds the next. (4) You can remotely read the computer's clock.

The practical upshot is that, if you're clever, you can find out that computer X has a clock that tends to gain, say, 15 seconds per day, while computer Y has a clock that tends to lose, say, 7 seconds per day. You can use that information to tell whether you're talking to computer X or computer Y regardless of where it connects to the Internet, or if it hides behind a firewall, or even if it's halfway around the world. This technique works even if the user is trying to stay anonymous.

Now, the turbo-geek version.

Kohno et al. measure clock skew. They mostly focus on the TCP Timestamp option, which is built into most TCP protocol stacks. However, their technique also works with ICMP timestamp requests. They've even presented a rather clever way to generalize the attack: most operating systems batch their packet transmissions at 10ms or 100ms intervals, but those intervals are also subject to clock skew. That means if you can accurately measure packet interarrival times and perform a Fourier transform on the difference between the batching frequency and the interarrival times, the peak frequency will show you the clock skew. (It's not clear to me whether transmission line characteristics or intermediate routers would obscure this last approach. Also, they don't present a way to automate it.)

Getting back to TCP Timestamp, most TCP stacks offer the timestamp in the SYN packet. Windows NT and XP are notable exceptions: they don't offer it. However, if the responding system breaks the protocol and requests TCP timestamps in the SYN-ACK, XP and NT will still turn on timestamp service. I don't know if Linux or BSD will do the same.
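For the curious, here is a minimal sketch of what "reading the timestamp" looks like once you have the raw option bytes from a captured TCP header. The option layout (kind 8, length 10, then a 4-byte TSval and a 4-byte TSecr) comes from the TCP timestamps specification; the function name and the idea of handing it a bare options buffer are my own assumptions for illustration, not anything from the paper.

```python
import struct
from typing import Optional, Tuple

TCPOPT_EOL = 0        # end of option list
TCPOPT_NOP = 1        # one-byte padding, has no length field
TCPOPT_TIMESTAMP = 8  # kind 8, length 10: TSval (4 bytes) + TSecr (4 bytes)


def parse_tcp_timestamp(options: bytes) -> Optional[Tuple[int, int]]:
    """Return (TSval, TSecr) from raw TCP option bytes, or None if absent.

    Hypothetical helper: `options` is assumed to be just the options
    portion of a TCP header, already sliced out of the packet.
    """
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated: no length byte
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed option, stop rather than guess
        if kind == TCPOPT_TIMESTAMP and length == 10:
            return struct.unpack("!II", options[i + 2:i + 10])
        i += length
    return None


if __name__ == "__main__":
    # MSS option, two NOPs, then a timestamp option with TSval=123456.
    opts = b"\x02\x04\x05\xb4\x01\x01\x08\x0a" + struct.pack("!II", 123456, 0)
    print(parse_tcp_timestamp(opts))
```

The TSval is the sender's timestamp clock reading, which is the number the skew measurement cares about.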

Kohno et al. also look at the effect of NTP synchronization. On most operating systems, NTP adjusts the system clock but not the clock that runs the TCP timestamp (which presumably derives directly from system up-time). So, if you turn on frequent NTP synchronization, the system clock's skew drops to near zero, but the TCP timestamp clock's skew remains largely unaffected.

To measure skew, they require a number of samples, though I'm not sure at this point how many they need. Then they use a linear programming approach to combine the samples to get skew.
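Here is a rough sketch of that estimation step as I read it. Each sample becomes a point: x is the time since the first packet by the measurer's clock, and y is how far the remote timestamp clock has drifted from it. Network delay can only push points down, so the fit is the line that stays on or above every point while minimizing the average gap, and its slope is the skew. For a two-variable problem like this the answer sits on an edge of the upper convex hull, so the sketch uses that shortcut instead of a general LP solver. The function names, the `hz` parameter (the remote clock's assumed tick rate), and the sample format are my assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def offsets_from_timestamps(samples: Sequence[Tuple[float, int]],
                            hz: float) -> List[Point]:
    """Turn (local_receive_time_seconds, remote_TSval) pairs into (x, y) points.

    Assumes samples are in arrival order and `hz` is the remote tick rate.
    """
    t0, ts0 = samples[0]
    points = []
    for t, ts in samples:
        x = t - t0                    # elapsed time by the measurer's clock
        ticks = (ts - ts0) % 2**32    # TSval is 32 bits and may wrap
        points.append((x, ticks / hz - x))
    return points


def _cross(o: Point, a: Point, b: Point) -> float:
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def estimate_skew(points: Sequence[Point]) -> Tuple[float, float]:
    """Return (alpha, beta) for the line alpha*x + beta that lies on or above
    every point and minimizes the mean vertical distance to the points.
    alpha is the skew in seconds per second (multiply by 1e6 for ppm)."""
    if not points:
        raise ValueError("no samples")
    x_mean = sum(x for x, _ in points) / len(points)
    top = {}
    for x, y in points:               # only the highest y per x can touch the line
        if x not in top or y > top[x]:
            top[x] = y
    pts = sorted(top.items())
    if len(pts) < 2:
        raise ValueError("need samples at two or more distinct times")
    x_mean = min(max(x_mean, pts[0][0]), pts[-1][0])  # guard against rounding
    hull: List[Point] = []
    for p in pts:                     # upper hull, left to right
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x_mean <= x2:        # the hull edge above the mean x is optimal
            alpha = (y2 - y1) / (x2 - x1)
            return alpha, y1 - alpha * x1
    raise AssertionError("unreachable: hull spans the full x range")
```

Feeding the output of `offsets_from_timestamps` into `estimate_skew` gives a skew estimate for one capture. How many samples you need for a stable answer is a separate question that this sketch doesn't address.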

It looks very much like you could implement the approach through a web browser using the high-resolution clock available on most Pentium systems.
