[MySQL] Parse Top Level Domain from URL, 리퍼러 > db

본문 바로가기

db

[MySQL] Parse Top Level Domain from URL, 리퍼러

페이지 정보

작성자 서방님 댓글 0건 조회 3,006회 작성일 12-09-26 16:06

본문

출처 : http://blog.netnerds.net/2007/01/t-sql-parse-top-level-domain-from-url/

 

I keep track of all hits to my website RealCajunRecipes.com in a SQL table called hitcounter which has columns for the user's IP, browser, referring URL and the date. Recently, I saw a surge in traffic and wanted to know which domains were sending the traffic our way. After getting tired of issuing ad-hoc queries that included WHERE clauses like "where referer like '%google%'", I created a SQL Server user-defined function (UDF) to extract the domain from the referring URL.

parseurl.sql

CREATE      FUNCTION [dbo].[parseURL]  (@strURL varchar(1000))
RETURNS varchar(1000)
AS
BEGIN
IF CHARINDEX('http://',@strURL) > 0 OR CHARINDEX('https://',@strURL) > 0
-- Ghetto-tastic
SELECT @strURL = REPLACE(@strURL,'https://','')
SELECT @strURL = REPLACE(@strURL,'http://','')
SELECT @strURL = REPLACE(@strURL,'www','')
-- Remove everything after "/" if one exists
IF CHARINDEX('/',@strURL) > 0 (SELECT @strURL = LEFT(@strURL,CHARINDEX('/',@strURL)-1))

-- Optional: Remove subdomains but differentiate between www.google.com and www.google.com.au
IF (LEN(@strURL)-LEN(REPLACE(@strURL,'.','')))/LEN('.') < 3 -- if there are less than 3 periods
SELECT @strURL = PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
ELSE -- It's likely a google.co.uk, or google.com.au
SELECT @strURL = PARSENAME(@strURL,3) + '.' + PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
RETURN @strURL
END

This script does the following:
1. Checks to see if the string is an URL
    (example: str = http://www.search.google.com.au/?q=netnerds)
2. Removes http, https and www (str = search.google.com.au/?q=netnerds)
3. Removes everything after the slash (str = search.google.com.au)
4. Removes excessive subdomains (str = google.com.au)

The script isn't perfect; I saw things like mysearch.myway.com get by but it's good enough for general use. If you'd like to see the entire domain, just remove the 4 line chunk marked "Optional."

To call this using SQL, modify this sample script to suite your environment:

SELECT COUNT(*) as theCount, dbo.parseURL("referer) as referer FROM hitcounter
WHERE referer IS NOT NULL
GROUP BY dbo.parsedomain(referer)
ORDER BY thecount DESC

Your results should look something like this

11831 google.com
10542 yahoo.com
9101 msn.com
746 google.ca
624 google.co.uk

Note: NULLs aren't parsed and thsu won't kill this function..they'll just show up as NULL.

댓글목록

등록된 댓글이 없습니다.

Total 447건 1 페이지
게시물 검색

회원로그인

접속자집계

오늘
146
어제
365
최대
592
전체
40,120

그누보드5
Copyright © seobangnim.com All rights reserved.
자바스크립트를 활성화 하세요![ 브라우저에서 자바스크립트를 활성화하는 방법 ]