Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Query for ip and associated country

Author  Topic 

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-28 : 12:16:10
Hi,

I have created some statistics for my site based on the ip of the visitors. For that reason I have created two tables
1. One table with a column which hold the ip number of the visitor
Table: AccessLog
Column: IP

2. One table with the ip range and the countries
Table: Countries
Columns: ip_to, ip_from, country_name

I would like to create a query which will bring the country name of the visitor IP which IP will be like "ip_to <= ip <= ip_from"

Thanks

Denis

www.tabletennis.gr

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2007-12-28 : 13:22:53
Denis,

Depending on your performace needs you might want to consider breaking the IP addresses into Octets. Here is an example of what I mean:
DECLARE @AccessLog TABLE
(
IPAddress varchar(15) NOT NULL,
Octet1 tinyint NOT NULL,
Octet2 tinyint NOT NULL,
Octet3 tinyint NOT NULL,
Octet4 tinyint NOT NULL
)

INSERT @AccessLog
(
IPAddress,
Octet1,
Octet2,
Octet3,
Octet4
)
SELECT
IPAddress,
PARSENAME(IPAddress, 4) AS Octet1,
PARSENAME(IPAddress, 3) AS Octet2,
PARSENAME(IPAddress, 2) AS Octet3,
PARSENAME(IPAddress, 1) AS Octet4
FROM
-- Derived table of IPs
(
SELECT '172.1.1.1' AS IPAddress
UNION ALL SELECT '198.123.56.230'
UNION ALL SELECT '10.124.1.189'
) AS Temp
That way you cana index the octets and get some pretty screaming performance. Otherwise I think you'll have to apply functions to the IPs to break them into octets at query time.
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-28 : 14:10:13
Hi Lamprey,

I have already convert the ip's into numbers using the following function using .asp
<%
Private Function ipAd2ipNum(ipA)
strO = ipA
pos1 = InStr(strO, ".")
intA = CInt(Left(strO, (pos1-1)))
strO2 = Mid(strO, pos1+1, len(strO))
pos2 = InStr(strO2, ".")
intB = CInt(Left(strO2, (pos2-1)))
strO3 = Mid(strO2, pos2+1, len(strO2))
pos3 = InStr(strO3, ".")
intC = CInt(Left(strO3, (pos3-1)))
intD = CInt(Mid(strO3, pos3+1, len(strO3)))

intConvert = (intA*(256*256*256)) + (intB*(256*256)) + (intC*256) + intD
ipAd2ipNum = Trim(intConvert)
End Function
%>

So, in the AccessLog table I have one column with the IP (like 192.168.1.1) and the converted one in octet style. This is created on the fly using the function above..:-)
Also the columns "ip_to" and "ip_from" in the countries table are octets.

I just need to match the octet ip of the visitor between the static octet range (ip_to, ip_from) of the countries table in order to take the country name.

Thanks.

Denis

www.tabletennis.gr
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2007-12-28 : 16:13:16
I'm not sure if I understand the issue then. If you have a number and you want to see if that number is between two other numbers you can use BETWEEN or just normal comparisons (>= and <=). If they are already in numeric form can you do something like:
SELECT
AccessLog.IP,
Countries.Country_name
FROM
AccessLog
INNER JOIN
Countries
ON AccessLog.IPNumber BETWEEN Countries.IP_From_Number AND Countries.IP_To_NumberSELECT
AccessLog.IP,
Countries.Country_name
FROM
AccessLog
INNER JOIN
Countries
ON AccessLog.IPNumber BETWEEN Countries.IP_From_Number AND Countries.IP_To_Number
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-29 : 05:21:47
Thanks Lamprey..Now you understood what I am trying to do.

I changed a little bit your query

SELECT countries.country
FROM AccessLog INNER JOIN
countries ON AccessLog.LogIP_NO BETWEEN countries.ip_from AND countries.ip_to
GROUP BY AccessLog.LogIP_NO, countries.country

Now I want to count the records from each country_name and take a table with something like that

eg.
USA 14
UK 35

Is it possible?

Thanks,

Denis

www.tabletennis.gr
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-30 : 14:06:01
Any help?

www.tabletennis.gr
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-30 : 14:31:00
assuming your query (above) works and you simply want to group by only by country:

SELECT countries.country, count(*)
FROM AccessLog INNER JOIN
countries ON AccessLog.LogIP_NO BETWEEN countries.ip_from AND countries.ip_to
GROUP BY countries.country



Be One with the Optimizer
TG
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-31 : 03:28:51
quote:
Originally posted by TG

assuming your query (above) works and you simply want to group by only by country:

SELECT countries.country
FROM AccessLog INNER JOIN
countries ON AccessLog.LogIP_NO BETWEEN countries.ip_from AND countries.ip_to
GROUP BY countries.country, AccessLog.IP_NO


Be One with the Optimizer
TG


I have already tried that.
It gives Timeout expired error as it takes too long to execute.

Maybe I could use the query to write the results on a temporary table and then use Count for the country but i don't know how to do that. Something like..

SELECT countries.country
INTO #ccountry_name
FROM AccessLog INNER JOIN
countries ON AccessLog.LogIP_NO BETWEEN countries.ip_from AND countries.ip_to
GROUP BY countries.country, AccessLog.IP_NO

However I tried it also without success. Any ideas?
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-31 : 09:15:35
please post the DDL of these two tables (including primary key and index definitions):

AccessLog
Countries

also, how many rows are in each table.


Be One with the Optimizer
TG
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-31 : 12:32:33
quote:
Originally posted by TG

please post the DDL of these two tables (including primary key and index definitions):

AccessLog
Countries

also, how many rows are in each table.


Be One with the Optimizer
TG



What you mean DDL?

AccessLog:
Lid float(8)
LogDate smalldatetime (4)
LogMonth smallint (2)
LogYear int (4)
LogRemote_Addr nvarchar (15)
LogPath_Info nvarchar (50)
LogQuery_String nvarchar (255)
LogIP_NO float (8)

Current total rows: 40,000

countries
ip_from float (8)
ip_to float (8)
country nvarchar (50)

Total rows: 63726

www.tabletennis.gr
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-31 : 12:49:47
>>What you mean DDL?
I just meant the results of if you used enterprise manager to generate a sql script to create the tables and indexes.
ie: CREATE TABLE [AccessLog] (...

So does this mean your tables have no primary key and no indexes at all?

What do the values of ip_from and ip_to look like. I suspect float may not be a good candidate for these columns. float as a non-precise datatype is designed to allow for a very large range of values very small numbers to very large numbers). Same for the [Lid] column Is that just an ID column?

Is there any overlap in your (ip_from, ip_to) columns from row to row? in other words is there any logID_No that would join to more that one countries row?

Be One with the Optimizer
TG
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-31 : 13:48:10
I created in my Enterprise Manager locally. I didn't use SQL systax to create the tables.
There is no primary key.

Lid is an id column.

ip_to and ip_from are numeric values which show the range of the converted ip like:
33996344 33996351 UNITED KINGDOM
50331648 69956103 UNITED STATES
69956104 69956111 BERMUDA
69956112 83886079 UNITED STATES
94585424 94585439 SWEDEN
and so on....till
3740270592 3740925951 CHINA
3740925952 3741024255 TAIWAN
3741024256 3741057023 REPUBLIC OF KOREA
3741057024 3741319167 VIET NAM
3758096384 4294967295 UNITED STATES

As you can see there is no overlap from row to row so the LogIP_NO can only between one row range.

(In the AccessLog table the IP_NO can be repeated a lot of times as the visitor visits more than one page. But this has already solved using the GROUP By clause.)

Thanks for you help..I appreciate it

www.tabletennis.gr
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-31 : 14:49:51
ok, I'm not sure what kind of flexiblity you have in structure changes and of course any changes should be done in a test/dev environment but I would consider the following changes:

I don't think float is a good datatype for primary keys so...
consider changing datatype of ip_from and ip_to to bigint
consider changing datatype lid to int

add Primary Key to AccessLog on (Lid)
add index to accessLog on (LogIP_NO)

add primary key clustered to countries on (ip_from, ip_to)
add index to countries on (country)

EDIT:
on second thought maybe add country as the 3rd key to the index and omit the secondary index. This table is pretty static right, not many changes? Just 3 columns - May as well make the entire table the convered index.

Even without any changes (above) this change should help your query:

select c.country
,count(*) [distinct IP Count by Country]
from
(--narrow down to distinct IP first
select LogIP_NO
from AccessLog
group by LogIP_NO
) IP
join countries c
on ip.LogIP_NO >= countries.ip_from
and ip.LogIP_NO <= c.ip_to
group by c.country


>>Thanks for you help..I appreciate it
No problem - I hope it helps

Be One with the Optimizer
TG
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-31 : 15:04:21
quote:
It gives Timeout expired error as it takes too long to execute.

I would also suggest that you get this worked out in query analyzer before trying to implement it through your application. At least that way you'll know not to even try until we get the execution time well below your timeout limit.

Be One with the Optimizer
TG
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-31 : 15:24:03
quote:
Originally posted by TG

quote:
It gives Timeout expired error as it takes too long to execute.

I would also suggest that you get this worked out in query analyzer before trying to implement it through your application. At least that way you'll know not to even try until we get the execution time well below your timeout limit.

Be One with the Optimizer
TG



Thank you...I will try it and i will let you know..
Since then...I wish the best for the NEW YEAR!

www.tabletennis.gr
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2007-12-31 : 20:03:18
quote:
Originally posted by TG

select c.country
,count(*) [distinct IP Count by Country]
from
(--narrow down to distinct IP first
select LogIP_NO
from AccessLog
group by LogIP_NO
) IP
join countries c
on ip.LogIP_NO >= countries.ip_from
and ip.LogIP_NO <= c.ip_to
group by c.country
[/code]



Unfortunately it didn't work.
I run this:
select countries.country, count(*)
from
(select AccessLog.LogIP_NO
from AccessLog
group by AccessLog.LogIP_NO)
join countries.country
on AccessLog.LogIP_NO >= countries.ip_from
and AccessLog.LogIP_NO <= c.ip_to
group by countries.country

AND then..this:

select countries.country, count(*)
from
(select AccessLog.LogIP_NO
from AccessLog
group by AccessLog.LogIP_NO)
AccessLog
join countries.country
on AccessLog.LogIP_NO >= countries.ip_from
and AccessLog.LogIP_NO <= c.ip_to
group by countries.country

I didn't do any change.
Would u mind to give you an example .bak file to check it by yourself?

www.tabletennis.gr
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-12-31 : 21:30:26
what error did you get?

oh, I think I forgot to use a table alias. Try this:

select c.country
,count(*) [distinct IP Count by Country]
from
(--narrow down to distinct IP first
select LogIP_NO
from AccessLog
group by LogIP_NO
) IP
join countries c
on ip.LogIP_NO >= c.ip_from
and ip.LogIP_NO <= c.ip_to
group by c.country


Be One with the Optimizer
TG
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2008-01-01 : 04:55:29
Used the following:
select countries.country, count(*)
from
(select AccessLog.LogIP_NO
from AccessLog
group by AccessLog.LogIP_NO)
AccessLog
join countries.country
on AccessLog.LogIP_NO >= countries.ip_from
and AccessLog.LogIP_NO <= country.ip_to
group by countries.country

I took the following in Query Analyzer:
Server: Msg 208, Level 16, State 1, Line 1
Invalid object name 'countries.country'

however the object name is ok.

www.tabletennis.gr
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2008-01-01 : 04:56:43
Then I used the previous query in query analyzer:
SELECT countries.country, count(*)
FROM AccessLog INNER JOIN
countries ON AccessLog.LogIP_NO BETWEEN countries.ip_from AND countries.ip_to
GROUP BY countries.country
..and it worked locally although it took 0.30 min to complete.

When I tried to run it online I took error:
Microsoft OLE DB Provider for SQL Server error '80040e14'

The query has been canceled because the estimated cost of this query (901) exceeds the configured threshold of 60. Contact the system administrator.

www.tabletennis.gr
Go to Top of Page

dimoss
Yak Posting Veteran

52 Posts

Posted - 2008-01-01 : 05:24:10
Then I used the following:

select country
,count(*) [distinct IP Count by Country]
from
(--narrow down to distinct IP first
select LogIP_NO
from AccessLog
group by LogIP_NO
) AccessLog
join countries
on AccessLog.LogIP_NO >= countries.ip_from
and AccessLog.LogIP_NO <= countries.ip_to
group by country

..and it worked locally but it took 0.25 min to complete.

When I tried to run it online I took error:
Microsoft OLE DB Provider for SQL Server error '80040e14'

The query has been canceled because the estimated cost of this query (635) exceeds the configured threshold of 60. Contact the system administrator.

www.tabletennis.gr
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2008-01-01 : 08:36:35
couple things:

- As I said earlier, don't even try to run it with your application until you get it working quickly from query analyzer.

- It is unreasonable to expect fast results from large tables that have no primary and no indexes. Are you working on this project solo? Is there is no database person that can help with the DB side? Databases that support applications MUST be properly desinged and maintained otherwise they are useless.

- Even with indexes, as the log table grows, realtime querying from it may not be realistic. You may need to have an offline process regularly summarize this data. Then query from the summary table from the application.

Be One with the Optimizer
TG
Go to Top of Page
    Next Page

- Advertisement -