IPs as evidence

I have been reading a lot about file sharing and flawed evidence lately. One of the common misconceptions is that IPs can be arbitrarily spoofed. This isn’t really true. You can put a forged source IP on the packets you send, but the replies go to the forged address rather than to you, so it’s kind of a pointless exercise unless you’re doing something cunning. It’s not practical for web usage. Most of the time it’s very hard to do, because the data transfer protocol you use will most likely be built on TCP, which requires a back-and-forth handshake to establish a connection first, and it’s hard to bluff your way through the handshake if you can’t see what the other end is sending back. Or as nmap says: TCP Sequence Prediction: Difficulty=204 (Good luck!).
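To make the handshake point concrete, here is a toy model in Python. It is only a sketch: `ToyServer`, `honest_connect` and `spoofed_connect` are names I made up, no real packets are sent, and it assumes the server picks its initial sequence number (ISN) uniformly at random and that the spoofer cannot see traffic on the path.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit


class ToyServer:
    """Toy model of the server side of the TCP three-way handshake."""

    def __init__(self, rng):
        self.rng = rng
        self.pending = {}  # claimed source address -> ISN we sent to it

    def on_syn(self, claimed_src):
        # The SYN-ACK (carrying our ISN) is sent to the *claimed* source,
        # wherever the SYN really came from.
        isn = self.rng.randrange(SEQ_SPACE)
        self.pending[claimed_src] = isn
        return claimed_src, isn  # (where the SYN-ACK goes, ISN inside it)

    def on_ack(self, claimed_src, ack_number):
        # The connection is established only if the ACK echoes ISN + 1.
        expected = (self.pending[claimed_src] + 1) % SEQ_SPACE
        return ack_number == expected


def honest_connect(server, my_addr):
    dest, isn = server.on_syn(my_addr)
    assert dest == my_addr  # the reply reaches us, so we can read the ISN
    return server.on_ack(my_addr, (isn + 1) % SEQ_SPACE)


def spoofed_connect(server, forged_addr, rng):
    server.on_syn(forged_addr)  # the SYN-ACK goes to forged_addr, not to us
    guess = rng.randrange(SEQ_SPACE)  # blind guess at ISN + 1
    return server.on_ack(forged_addr, guess)


rng = random.Random(0)
server = ToyServer(rng)
print("honest client connects:", honest_connect(server, "198.51.100.7"))
hits = sum(spoofed_connect(server, "203.0.113.9", rng) for _ in range(10000))
print("blind spoofed handshakes completed out of 10000:", hits)
```

Each blind guess has a 1 in 2^32 chance of landing, which is the "Good luck!" that nmap is talking about. Real stacks add further checks, so if anything the toy flatters the spoofer.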

But I think this is all missing the point a bit. The IP is such flimsy and meaningless evidence in the first place. There are too many humans in the chain who stand to gain from just making them up, and it’s trivial to do so. Between a law firm and a ‘monitoring’ firm, you’ve got two separate entities whose entire business depends on being able to come up with a set number of IP addresses per month. It isn’t a difficult task to add on an extra 10% to ensure you get a nice bonus. It isn’t even a difficult task to fabricate 95% of your data.

If I were doing this, I’d probably start by recompiling a torrent client to write a list of IPs for each torrent to a text file (filtering by UK IPs belonging to ISPs who don’t protect their customers). Then, if I were a greedy cretin, I’d most likely bump up the number like so:

#!/usr/bin/env python3
import ipaddress
import random
import re
import subprocess

def ip32(dotted):
  # pack a dotted-quad string into a 32-bit integer
  return sum(int(x) << 8 * (3 - i) for i, x in enumerate(dotted.split(".")))

def random_ip_from_subnet(network, netmask):
  # copy the bits covered by the netmask, randomise the host bits
  ip = 0
  for i in range(32)[::-1]:
    snbit = (netmask >> i) & 1
    ip = ip << 1
    ip |= ((network >> i) & 1) if snbit else random.randint(0, 1)

  return ip

users = [] # read user list in here as list of (ip, date) tuples, date as a Unix timestamp
newusers = []

for ip, date in users:
  newusers += [(ip, date)]
  # assumes the WHOIS record has a single RIPE-style "route:" line
  subnet = subprocess.getoutput("whois {0} | grep route | awk '{{print $2}}'".format(ip)).strip()
  if re.match(r"^(\d{1,3}\.){3}\d{1,3}/\d+$", subnet):
    network, mask = subnet.split("/")
    networkip = ip32(network)
    # netmask with the top `mask` bits set, e.g. /24 -> 0xFFFFFF00
    maskip = (0xFFFFFFFF << (32 - int(mask))) & 0xFFFFFFFF
    for i in range(3):
      # convert back to dotted-quad so fake entries look like real ones
      fakeip = str(ipaddress.ip_address(random_ip_from_subnet(networkip, maskip)))
      fakedate = date + (random.random() - 0.5) * (60*60*24*28)
      newusers += [(fakeip, fakedate)]

# write newusers to file here.

(note: untested code)

And there we go, I’ve just increased my income by 300% in about 30 lines of code. Each IP I generate is on the same ISP as one of the ones I’ve really captured, so I know that I’m not going to have to submit multiple court orders. Some of these will likely not have been allocated at the time of the alleged infringement, so the ISP won’t be able to match them to a customer, but IPv4 is oversubscribed so we’re not just taking wild stabs in the dark. A lot of these will be allocated. I’ve looked through ACS:Law’s spreadsheets and on one of the BSkyB ones, 1387 out of 5699 records are ‘unknown’. On another, 1325 out of 5358 are unknown. That is roughly a quarter of the records in each case. What’s going on here? It doesn’t say why they’re unknown, but it’s a massive failure rate.
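For anyone who wants to check the arithmetic, here is a quick sketch. The helper names are mine, and the 1000 "real" captures are just a placeholder; the unknown counts are the ones quoted above.

```python
def percent_increase(real, fakes_per_real):
    # total records after padding each real capture with invented ones
    total = real * (1 + fakes_per_real)
    return 100.0 * (total - real) / real


def unknown_rate(unknown, total):
    # share of records the ISP could not match, as a percentage
    return 100.0 * unknown / total


print(percent_increase(1000, 3))           # 300.0
print(round(unknown_rate(1387, 5699), 1))  # 24.3
print(round(unknown_rate(1325, 5358), 1))  # 24.7
```

Three invented records per real one is where the 300% comes from, whatever the starting number.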

The basic idea of this code is that it takes a known IP and looks up that IP’s WHOIS record, which tells us what range (subnet) it’s on. Then we start generating IPs at random from that range. We also shift the date at random by up to 14 days either way (although if I was making a serious effort, I’d then truncate the hours/minutes/seconds and instead choose the time of day randomly from a non-uniform distribution to favour evenings during the week (negatively skewed normal distribution, mean 8pm) and daytime at weekends; this would take an extra 10 lines or so but I’d have to read up on both the random and date modules so I didn’t bother). Also if I was making a serious effort I’d drop or regenerate IPs whose last byte is 0 or 0xFF, since those are often network or broadcast addresses.
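If the bit-twiddling is hard to follow, the same idea can be sketched with Python’s standard `ipaddress` module. For a /N prefix the top N bits identify the range and the remaining 32-N bits are free. The function names here are my own, 192.0.2.0/24 is a documentation-only range, and the sketch assumes an IPv4 prefix shorter than /32.

```python
import ipaddress
import random


def describe_prefix(cidr):
    # netmask marks the fixed (network) bits, hostmask the free (host) bits
    net = ipaddress.ip_network(cidr)
    host_bits = 32 - net.prefixlen
    return int(net.netmask), int(net.hostmask), host_bits


def random_address_in(cidr, rng):
    # keep the network bits, fill the host bits at random,
    # and retry if the last byte comes out as 0 or 0xFF
    net = ipaddress.ip_network(cidr)
    host_bits = 32 - net.prefixlen
    while True:
        candidate = int(net.network_address) | rng.getrandbits(host_bits)
        if candidate & 0xFF not in (0x00, 0xFF):
            return ipaddress.ip_address(candidate)


netmask, hostmask, host_bits = describe_prefix("192.0.2.0/24")
print(hex(netmask), hex(hostmask), host_bits)  # 0xffffff00 0xff 8
print(random_address_in("192.0.2.0/24", random.Random()))
```

The point stands either way: nothing about the output distinguishes an address that was observed from one that was drawn out of a hat.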

When someone says “I’ve got an IP address” the correct response is not “alright here’s a court order to get the ISP to tell you who it belongs to so you can send them a letter demanding £500”; the correct response is “so?”.

