Ok. First, I'll just adjust what I wrote earlier to include another column in IP_Subnets. iplast will hold the last IP address in the subnet range (which means that the column I previously called ipstart should be called ipfirst, but never mind). This isn't vital, but it stops the query getting messy.
DROP TABLE IP_Subnets
CREATE TABLE IP_Subnets (
ipstart binary(4) PRIMARY KEY,
iplast binary(4) NOT NULL,
netmask binary(4) NOT NULL,
country char(2) NOT NULL
)
INSERT INTO IP_Subnets (ipstart, iplast, netmask, country)
SELECT ipstart,
CAST(CAST(ipstart AS int) | (netsize - 1) AS binary(4)) AS iplast,
netmask, country
FROM (
SELECT
CAST(CAST(PARSENAME(ipstart, 4) AS tinyint) AS binary(1)) +
CAST(CAST(PARSENAME(ipstart, 3) AS tinyint) AS binary(1)) +
CAST(CAST(PARSENAME(ipstart, 2) AS tinyint) AS binary(1)) +
CAST(CAST(PARSENAME(ipstart, 1) AS tinyint) AS binary(1)) AS ipstart,
netsize,
CAST(netsize - 1 AS binary(4)) AS netmask,
country
FROM IP_Staging
WHERE type = 'ipv4' AND ipstart <> '*'
) A
Next, we'll need some test addresses. Since arin.20021201.txt covers about half the 32 bit space, choosing random 32 bit values seems fairly reasonable.
CREATE TABLE IP_TestAddress (ip binary(4))
INSERT INTO IP_TestAddress
SELECT CAST(NEWID() AS binary(4))
FROM Numbers
WHERE n < 10000
Numbers is just a tally table being used to generate 10000 rows (mine happens to be 0-based and have 100000, hence the WHERE). It's relying on the 32 bits from NEWID() being "random enough". If you're on NT4, this won't be the case.
Given all that, you might think that it's just a question of doing a simple join on the two tables:
SELECT ip, ipstart, netmask, country
FROM IP_TestAddress
LEFT JOIN IP_Subnets ON ip BETWEEN ipstart AND iplast
This works, but it's really slow because SQL Server doesn't understand the correlation between ipstart and iplast -- in this case, that they don't overlap at all.
A faster way is to adapt what I wrote earlier:
SELECT ip, ipstart, netmask, country
FROM IP_TestAddress
LEFT JOIN IP_Subnets
ON ipstart = (
SELECT MAX(ipstart)
FROM IP_Subnets
WHERE ip >= ipstart
)
AND ip <= iplast
Again, I've used a LEFT join so you can see the ones that fail to match a subnet.
I'm sure there are faster ways than that, but hopefully it'll do for now.