In the first article on IP to location I described the overall concept. In this edition we'll look at the implementation and how to scale the tracing to a reasonable speed.
With the examples from the last article you can also implement everything to build your own database. Before you jump into it take a minute to look at the implications.
Taking the example from the previous article
184.108.40.206/12 for the German Telekom we have to trace 1mio addresses and scanning them all will take a while:
# time traceroute -w 1 -m 8 -q 1 220.127.116.11 traceroute to 18.104.22.168 (22.214.171.124), 8 hops max, 40 byte packets 2 126.96.36.199 36.974 ms 3 ke-eb1.ke.de.net.dtag.de (188.8.131.52) 54.454 ms 4 184.108.40.206 54.610 ms 5 * * * 6 * * * 7 * * * 8 * * * real 0m3.244s
If we can make the make the host lookup optional we can reduce it to:
# time traceroute -n -w 1 -m 8 -q 1 220.127.116.11 traceroute to 18.104.22.168 (22.214.171.124), 8 hops max, 40 byte packets 2 126.96.36.199 37.636 ms 3 188.8.131.52 55.631 ms 4 184.108.40.206 55.614 ms 5 * * * 6 * * * 7 * * * 8 * * * real 0m1.053s
Scanning everything with one
ips * time_per_scan / num_of_scanners = scan_time 1 048 576 IPs * 1s / 1 scanner = 12 days
traceroute you can use something like
Net::Traceroute with Perl. It will start parallel
traceroute instances for you and you should be able to scan 32 IPs in parallel. It would reduce the over all time down to 9hrs, for scanning not even a 1/1000 of the internet.
time-per-scan is mostly controlled by the network-latency:
# traceroute -w 1 -m 6 -q 1 cnn.com 3 220.127.116.11 <b>38.105 ms *DSL-line*</b> 4 f-ec1.f.net.dtag.de (18.104.22.168) <b>52.038 ms *across-Germany*</b> 5 pop1-frr-s2-2-0.atdn.net (22.214.171.124) <b>136.859 ms *across-the-ocean*</b>
To scan worldwide you should have scanners in each continent. The sea-cable or satellite connections would add too much latency otherwise.
As there will always be slow IPs to connect, you have to send more packets in parallel before you get back a response.
Build your own traceroute
Quite some time is spent in starting and stopping the programs as
traceroute itself is not really spending CPU cycles for its work. A traceroute-daemon would be a first step to scan more in parallel.
Checking the net you will find several traceroute implementations for UDP, TCP and other protocols. You can either take their source and use
libnet to build your own.
libnet is for building the UDP packets with a increasing TTL and
libpcap is to receive the ICMP messages for
As we want to scale nicely up to 1000 parallel scans we should also take a look at event-driven, non-blocking IO to handle all the scans in a single binary.
In a example implementation it can look like:
$ ./loc-scanner usage: ./loc-scanner <hostmask> <steps> <regex> <type> <children> hostmask: 192.168.0.0/16 steps : 64 regex : '[a-z]' type : (dns|traceroute|whois) children : number of parallel children $ ./loc-scanner 126.96.36.199/16 32 'dtag.de' traceroute 16 IPs: 65536, childs: 16, jobs/childs: 16 childs started: 16 ................................................................ 4.92 63488 ke-eb1.KE.DE.net.DTAG.DE,188.8.131.52 ................................................................ 6.74 61440 ke-eb1.KE.DE.net.DTAG.DE,184.108.40.206 ................................................................ 7.68 59392 ke-eb1.KE.DE.net.DTAG.DE,220.127.116.11
After some warm up we go up to 12 scans/s with 16 childs, with larger networks we can use more scanners and increase the scan-rate to 40 scan/s. This is currently limited by the network latency on my network.
If we can't add more scanners and can't reduce the latency even more, we have to reduce the number of IPs that we scan. This is a balance between accuracy and speed.
IP-allocations are controlled in blocks. To reduce management overhead and keep the routing-tables small. Depending on the provider this might be done in blocks of 32 addresses or even in blocks of 256k. Assuming that the smallest size is 16 IPs in a block we can pretty only scan every 16th address and reduce the scan-time by 16.
We can also be a bit more predictive and start with larger blocks, like 128 IPs and probe a few address in the block if they are handled by the same router. Instead of random probes we can use the assumption of block-allocations again and test the IPs at the border of the next possible smaller block:
Address block (/24) ------------------- Block: .0 .255 Probes next to the border of /25 (.126 and .129) .1 .126 .129 .254
If all 4 are handled by the same router, we can assume that the whole block is handled the same block. If not, you divide the block and repeat until the condition is met.
Scanning 4 IPs to predict 256 IPs. That's a reduction of factor 64. A lot better than just assuming that the smallest block is 16 IPs.
... and next week ?
In the next article we will look into mapping the collected data to a useful data-structure and compress it nicely.