Google Analytics: How It Works, with a Worked Example [repost]

I have recently been working on web statistics, and while looking things up I came across this analysis of how GA works :)

The tracking request

GET http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1261523910&utmcs=gb2312&utmsr=1400x1050&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=-&utmhn=www.mydll.com&utmr=-&utmp=/gg.htm&utmac=UA-2789145-1&utmcc=__utma%3D251296922.1430927915.1192194210.1192194210.1192194210.1%3B%2B__utmb%3D251296922%3B%2B__utmc%3D251296922%3B%2B__utmz%3D251296922.1192194210.1.1.utmccn%3D(direct)%7Cutmcsr%3D(direct)%7Cutmcmd%3D(none)%3B%2B HTTP/1.1
Accept: */*
Referer: http://www.mydll.com/gg.htm
Accept-Language: zh-cn
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Connection: Keep-Alive
Host: www.google-analytics.com
Pragma: no-cache

HTTP/1.1 200 OK
Pragma: no-cache
Cache-Control: private, no-cache, no-cache="Set-Cookie", proxy-revalidate
Expires: Fri, 04 Aug 1978 12:00:00 GMT
Content-Type: image/gif
Server: ucfe
Content-Length: 35
Date: Fri, 12 Oct 2007 13:04:04 GMT
Via: 1.1 HttpSpy

Detailed breakdown of the tracking request

http://www.google-analytics.com/__utm.gif?

utmwv=1& # constant 1
utmn=1261523910& # Math.round(Math.random()*2147483647);
utmcs=gb2312& # page charset
utmsr=1400x1050& # screen resolution (width x height)
utmsc=32-bit& # screen.colorDepth
utmul=en-us& # navigator.language.toLowerCase();
utmje=1& # navigator.javaEnabled() ? 1 : 0;
utmfl=-& # _uFlash
utmhn=www.mydll.com& # JsUrlEncode(location.hostname)
utmr=-& # document.referrer ("-" when there is none, as in this direct visit)
utmp=/gg.htm& # location.pathname+location.search, unless the caller supplies a page, which takes priority
utmac=UA-2789145-1& # site ID, set by the user: _uacct = "UA-2789145-1";
utmcc=__utma%3D251296922.1430927915.1192194210.1192194210.1192194210.1%3B%2B__utmb%3D251296922%3B%2B__utmc%3D251296922%3B%2B__utmz%3D251296922.1192194210.1.1.utmccn%3D(direct)%7Cutmcsr%3D(direct)%7Cutmcmd%3D(none)%3B%2B
utmcc=__utma=251296922.1430927915.1192194210.1192194210.1192194210.1;+__utmb=251296922;+__utmc=251296922;+__utmz=251296922.1192194210.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);+
utmcc=
__utma=251296922.1430927915.1192194210.1192194210.1192194210.1;+
__utmb=251296922;+
__utmc=251296922;+
__utmz=251296922.1192194210.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);+
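
The manual decoding above can also be done mechanically. The following is a minimal sketch, assuming a JavaScript runtime with the standard `URL` class (a modern browser or Node.js); `parseUtmGif` is a hypothetical helper name, not something taken from Google's script.

```javascript
// Hypothetical helper (not part of Google's script): split a captured
// __utm.gif URL into its query parameters and unpack the utmcc bundle.
function parseUtmGif(requestUrl) {
  // searchParams percent-decodes each value: %3D -> "=", %3B -> ";", %2B -> "+".
  const params = Object.fromEntries(new URL(requestUrl).searchParams);
  const cookies = {};
  // The decoded utmcc value looks like "name=value;+name=value;+".
  for (const part of (params.utmcc || "").split(";+")) {
    if (part === "") continue;    // the trailing ";+" leaves an empty tail
    const eq = part.indexOf("="); // split on the first "=" only
    cookies[part.slice(0, eq)] = part.slice(eq + 1);
  }
  return { params, cookies };
}

// Shortened version of the captured request (some parameters omitted).
const sample =
  "http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1261523910" +
  "&utmhn=www.mydll.com&utmp=/gg.htm&utmac=UA-2789145-1" +
  "&utmcc=__utma%3D251296922.1430927915.1192194210.1192194210.1192194210.1%3B%2B" +
  "__utmb%3D251296922%3B%2B__utmc%3D251296922%3B%2B" +
  "__utmz%3D251296922.1192194210.1.1.utmccn%3D(direct)%7Cutmcsr%3D(direct)%7Cutmcmd%3D(none)%3B%2B";

const parsed = parseUtmGif(sample);
console.log(parsed.params.utmac, parsed.cookies.__utmb); // UA-2789145-1 251296922
```

Splitting on the first "=" matters because the __utmz value itself contains "=" characters.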

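The value 251296922 is a hash computed from the root domain of the current page; for this request it was computed from mydll.com.
The second field of __utma, 1430927915, is a random number generated with Math.round(Math.random()*2147483647).
The third, fourth and fifth fields of __utma, 1192194210, are not random: they are the current Unix time in seconds, computed with _ust=Math.round((new Date()).getTime()/1000); 1192194210 is 12 Oct 2007 13:03:30 GMT, which agrees with the Date header of the response.
The sixth field of __utma, 1, is constant in this first-visit capture; it acts as a visit counter rather than a true constant.
The first field of __utmz is the domain hash.
The second field of __utmz, 1192194210, is the same timestamp, again computed with _ust=Math.round((new Date()).getTime()/1000);
The third field of __utmz, 1, is the same value as the sixth field of __utma.
The fourth field of __utmz is 1 on the first request; later requests are not considered here.
The last part of __utmz, utmccn=(direct)|utmcsr=(direct)|utmcmd=(none), describes the traffic source; this is the default for a direct visit.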
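
As a rough illustration of the field layout just described, the sketch below assembles the four cookie values for a first, direct visit. It is not Google's code: `firstVisitCookies` is a hypothetical name, the domain hash is passed in because the hashing routine is not shown in this article, and the clock and random source are parameters only so that the result can be reproduced.

```javascript
// Sketch of the first-visit cookie values, following the field layout in the text.
// domainHash is an input: how it is derived from the domain is not covered here.
function firstVisitCookies(domainHash, nowMs = Date.now(), rand = Math.random) {
  const visitorId = Math.round(rand() * 2147483647); // second field of __utma
  const ust = Math.round(nowMs / 1000);              // Unix time in seconds
  const source = "utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)"; // direct-visit default
  return {
    // hash.random.time.time.time.1 : the three timestamps are equal on a first visit
    __utma: [domainHash, visitorId, ust, ust, ust, 1].join("."),
    __utmb: String(domainHash),
    __utmc: String(domainHash),
    // hash.time.1.1.<source info>
    __utmz: [domainHash, ust, 1, 1, source].join("."),
  };
}

// Feeding in the captured hash, timestamp and random number gives back
// the cookie values seen in the request.
const captured = firstVisitCookies(
  251296922,
  1192194210 * 1000,
  () => 1430927915 / 2147483647
);
console.log(captured.__utma);
console.log(captured.__utmz);
```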

When the visitor arrives through a link on another site, the __utmz value looks like this:

__utmz=251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral;+
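utmccn - the campaign name; (referral) indicates that the visit has a referrer
utmcsr - the source, i.e. the host name of the referring site
utmcct - the path of the referring page
utmcmd - the medium of the visit; it is set to referral here, and to (none) for direct traffic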
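
To make the structure of a __utmz value concrete, here is a small parsing sketch. `parseUtmz` and the property names `sessionNumber` and `campaignNumber` are illustrative labels chosen for this example, not identifiers from Google's script.

```javascript
// Hypothetical parser for a __utmz value (without the trailing ";+").
function parseUtmz(value) {
  // The first four dot-separated fields are numeric. Everything after the
  // fourth dot is the source string, which may itself contain dots
  // (for example "yx8.com"), so it is re-joined before further splitting.
  const parts = value.split(".");
  const [domainHash, timestamp, sessionNumber, campaignNumber] =
    parts.slice(0, 4).map(Number);
  const campaign = {};
  for (const pair of parts.slice(4).join(".").split("|")) {
    const eq = pair.indexOf("=");
    campaign[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return { domainHash, timestamp, sessionNumber, campaignNumber, campaign };
}

const z = parseUtmz(
  "251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral"
);
console.log(z.campaign.utmcsr, z.campaign.utmcct); // yx8.com /temp/togg.html
```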

Request sent when arriving from an external link

GET http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1389663121&utmcs=gb2312&utmsr=1400x1050&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=-&utmcn=1&utmhn=www.mydll.com&utmr=http://www.yx8.com/temp/togg.html&utmp=/gg.htm&utmac=UA-2789145-1&utmcc=__utma%3D251296922.1389663121.1192220231.1192220231.1192220231.1%3B%2B__utmb%3D251296922%3B%2B__utmc%3D251296922%3B%2B__utmz%3D251296922.1192220231.1.1.utmccn%3D(referral)%7Cutmcsr%3Dyx8.com%7Cutmcct%3D%2Ftemp%2Ftogg.html%7Cutmcmd%3Dreferral%3B%2B HTTP/1.1
Accept: */*
Referer: http://www.mydll.com/gg.htm
Accept-Language: zh-cn
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Connection: Keep-Alive
Host: www.google-analytics.com

HTTP/1.1 200 OK
Pragma: no-cache
Cache-Control: private, no-cache, no-cache="Set-Cookie", proxy-revalidate
Expires: Fri, 04 Aug 1978 12:00:00 GMT
Content-Type: image/gif
Server: ucfe
Content-Length: 35
Date: Fri, 12 Oct 2007 20:17:14 GMT
Via: 1.1 HttpSpy

Parameter breakdown

http://www.google-analytics.com/__utm.gif?

utmwv=1&
utmn=1389663121&
utmcs=gb2312&
utmsr=1400x1050&
utmsc=32-bit&
utmul=en-us&
utmje=1&
utmfl=-&
utmcn=1&
utmhn=www.mydll.com&
utmr=http://www.yx8.com/temp/togg.html&
utmp=/gg.htm&
utmac=UA-2789145-1&
utmcc=__utma%3D251296922.1389663121.1192220231.1192220231.1192220231.1%3B%2B__utmb%3D251296922%3B%2B__utmc%3D251296922%3B%2B__utmz%3D251296922.1192220231.1.1.utmccn%3D(referral)%7Cutmcsr%3Dyx8.com%7Cutmcct%3D%2Ftemp%2Ftogg.html%7Cutmcmd%3Dreferral%3B%2B
utmcc=__utma=251296922.1389663121.1192220231.1192220231.1192220231.1;+__utmb=251296922;+__utmc=251296922;+__utmz=251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral;+
utmcc=
__utma=251296922.1389663121.1192220231.1192220231.1192220231.1;+
__utmb=251296922;+
__utmc=251296922;+
__utmz=251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral;+
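
Going in the other direction, the encoded utmcc parameter can be rebuilt from the individual cookie values. This is only a sketch: `buildUtmcc` is a hypothetical helper, and `encodeURIComponent` is used because it reproduces the captured string for this data; the escaping routine of the original script is not shown in this article.

```javascript
// Hypothetical helper: rebuild the utmcc parameter from individual cookie values.
function buildUtmcc(cookies) {
  // Each cookie is written as "name=value;+" and the pieces are concatenated.
  const raw = Object.entries(cookies)
    .map(([name, value]) => `${name}=${value};+`)
    .join("");
  // Percent-encode the whole string: "=" -> %3D, ";" -> %3B, "+" -> %2B,
  // "|" -> %7C, "/" -> %2F. Parentheses and dots are left untouched.
  return encodeURIComponent(raw);
}

const utmcc = buildUtmcc({
  __utma: "251296922.1389663121.1192220231.1192220231.1192220231.1",
  __utmb: "251296922",
  __utmc: "251296922",
  __utmz:
    "251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral",
});
console.log("utmcc=" + utmcc);
```

The order of the cookies in the output follows the insertion order of the object's keys, so the keys are listed in the same order as in the capture.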

Subsequent page request to the site, carrying the cookies

GET http://www.mydll.com/51la.htm HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, */*
Accept-Language: zh-cn
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Host: www.mydll.com
Connection: Keep-Alive
Cookie: __utma=251296922.1389663121.1192220231.1192220231.1192220231.1; __utmb=251296922; __utmc=251296922; __utmz=251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral
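
The Cookie header in this request holds the same four values that the tracking request packs into utmcc. A minimal sketch for pulling them out of such a header follows; `utmCookiesFromHeader` is a hypothetical name, and the extra `sid` cookie in the usage line is invented purely to show the filtering.

```javascript
// Hypothetical helper: pick the __utm* cookies out of a Cookie header value.
function utmCookiesFromHeader(cookieHeader) {
  const out = {};
  // Cookies in the header are separated by ";" followed by optional whitespace.
  for (const pair of cookieHeader.split(/;\s*/)) {
    const eq = pair.indexOf("="); // split on the first "=" only
    if (eq < 0) continue;         // skip malformed or empty pieces
    const name = pair.slice(0, eq);
    if (name.startsWith("__utm")) out[name] = pair.slice(eq + 1);
  }
  return out;
}

const header =
  "sid=abc; __utma=251296922.1389663121.1192220231.1192220231.1192220231.1; " +
  "__utmb=251296922; __utmc=251296922; " +
  "__utmz=251296922.1192220231.1.1.utmccn=(referral)|utmcsr=yx8.com|utmcct=/temp/togg.html|utmcmd=referral";
console.log(Object.keys(utmCookiesFromHeader(header)).join(","));
```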