HTTP access log recording and analysis

2010-04-09  Source: original to this site  Category: Tech  Views: 189

The embedded browser has been running for some time since its release, but we have never collected detailed statistics on it. A requirement has now been raised for more detailed statistics.

The browser accesses Web content through a proxy server. The proxy converts each HTML page into a proprietary binary protocol before returning it to the browser, which saves traffic and speeds up browsing. All of the statistics we need can therefore be collected on the proxy server alone.

The first question is how to record users' access data. The company already has a data warehouse and a data analysis system, along with dedicated staff who provide data analysis. The first option considered was therefore to write users' HTTP access records directly to a database, load them into the data warehouse, and have the data warehouse staff analyze them. However, the data warehouse currently takes a fairly long time to process and analyze data, so the resulting statistics would not be timely.

In the end we decided to record accesses to log files, using Apache's HTTP log format:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
Because users visit many different sites, a Host field is added at the front, as follows:
"%Host %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

The logs are then analyzed directly with AWStats.
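
For ad hoc checks outside the analyzer, a line in the host-prefixed format could be parsed roughly as follows. This is a minimal sketch, not the proxy's actual code: the regular expression, the field names, the `parse_line` helper, and the sample line (including the `EmbeddedBrowser/1.0` user agent) are all illustrative assumptions. It also assumes the Host field contains no spaces and that quoted fields contain no embedded double quotes.

```python
import re
from datetime import datetime

# Assumed layout: target Host first, then the Apache "combined" fields.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Apache-style timestamp, e.g. 09/Apr/2010:10:15:30 +0800
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # "-" means no body bytes were sent
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


# Hypothetical sample line, invented for illustration only.
sample = ('www.example.com 10.0.0.1 - - [09/Apr/2010:10:15:30 +0800] '
          '"GET /index.html HTTP/1.1" 200 1024 "-" "EmbeddedBrowser/1.0"')
record = parse_line(sample)
```

Returning `None` for lines that do not match lets a caller count and skip malformed entries instead of aborting a whole run.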

Because the proxy service is deployed on a number of nodes spread across different IDCs, we also have to deal with merging the logs. For now, only the logs of a single server are merged; data from different servers cannot be merged.
However, the HTTP logs can easily be imported into the data warehouse, so overall statistical analysis of the logs can be handled later in the data warehouse.
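
If logs from several nodes were ever combined before analysis, one possible approach is a streaming merge keyed on each line's timestamp. The sketch below is only an illustration of that idea, not something the setup described here does. It assumes each node's log is already in time order and that timestamps carry a timezone offset; the `timestamp` and `merge_logs` helpers and the sample lines are hypothetical.

```python
import heapq
import re
from datetime import datetime

# The timestamp is the first bracketed field in each line.
TIME_FIELD = re.compile(r'\[([^\]]+)\]')


def timestamp(line):
    """Extract the Apache-style timestamp from a log line as an aware datetime."""
    m = TIME_FIELD.search(line)
    if m is None:
        raise ValueError("no timestamp in line: %r" % line)
    return datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*sources):
    """Lazily merge iterables of log lines, each already sorted by time."""
    return heapq.merge(*sources, key=timestamp)


# Hypothetical lines from two nodes, invented for illustration only.
node_a = [
    'a.example.com 10.0.0.1 - - [09/Apr/2010:10:00:00 +0800] "GET / HTTP/1.1" 200 10 "-" "UA"',
    'a.example.com 10.0.0.1 - - [09/Apr/2010:10:00:05 +0800] "GET /x HTTP/1.1" 200 10 "-" "UA"',
]
node_b = [
    'b.example.com 10.0.0.2 - - [09/Apr/2010:10:00:02 +0800] "GET / HTTP/1.1" 200 10 "-" "UA"',
]
merged = list(merge_logs(node_a, node_b))
```

Because `heapq.merge` consumes its inputs lazily, open file objects can be passed in place of the lists, so large logs need not be loaded into memory at once.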
