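Nginx is a high-performance HTTP server and reverse proxy. Its access log records every request made to the web application: the request method (POST/GET), client IP, remote user, request time, response status code, requested host, response size, referer, X-Forwarded-For address, and so on. The access log format is not fixed and can be customized. The exact format and the location of the log file on the server are set in the nginx.conf configuration file.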
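Setting the nginx log format
Default format: log_format combined '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"';
$remote_addr: the client IP address. This holds when clients connect directly; behind a proxy or load balancer it is the proxy's address.
$remote_user: the user name the client authenticated with, when nginx has user authentication enabled.
$time_local: the local time of the log entry.
$request: the request method, URL, and HTTP protocol version.
$status: the response status code, such as 200 or 404.
$body_bytes_sent: the number of bytes nginx sent to the client in the response, not counting the response header.
$http_referer: the page the request came from. For example, if clicking a link on page A produced this request, the variable holds the URL of page A.
$http_user_agent: the client software, such as the browser name and version.
Additional variables: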
'"$http_host" "$request_time" "$upstream_response_time" "$upstream_connect_time" "$upstream_header_time"';
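$http_host: the requested host, i.e. the address (IP or domain name) entered in the browser.
$request_time: the total time spent processing the request, including the time spent receiving data from the client.
$upstream_response_time: the time between establishing the connection and receiving the last byte of the response body from the upstream server.
$upstream_connect_time: the time spent establishing the connection to the upstream server.
$upstream_header_time: the time between establishing the connection and receiving the first byte of the response header from the upstream server.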
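Comparing $request_time with $upstream_response_time gives a rough idea of how much of a request was spent outside the upstream server (receiving the request from the client, nginx's own work, and sending the response back). The following is a minimal sketch of that subtraction, not part of the original post; the class and method names are hypothetical, and it assumes the two values are passed in as the raw strings from the log.

```java
class UpstreamTimingDemo {
    /**
     * Seconds of $request_time not accounted for by $upstream_response_time,
     * clamped at zero. Both arguments are the raw strings from the log.
     */
    static double secondsOutsideUpstream(String requestTime, String upstreamResponseTime) {
        double total = Double.parseDouble(requestTime);
        double upstream = 0.0;
        // With several upstream attempts nginx separates the times with commas
        // or colons; "-" stands for "no value", so it is skipped.
        for (String part : upstreamResponseTime.split("[,:]")) {
            String value = part.trim();
            if (!value.isEmpty() && !value.equals("-")) {
                upstream += Double.parseDouble(value);
            }
        }
        // Millisecond rounding can make the upstream time look slightly
        // larger than the total, so the result is clamped at zero.
        return Math.max(0.0, total - upstream);
    }

    public static void main(String[] args) {
        System.out.println(secondsOutsideUpstream("0.446", "0.445"));
        System.out.println(secondsOutsideUpstream("0.153", "0.154"));
    }
}
```

The clamp matters in practice: because both values are rounded to milliseconds, a log entry can show an upstream time one millisecond larger than the request time.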
Custom format after the change:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'"$http_host" "$request_time" "$upstream_response_time" "$upstream_connect_time" "$upstream_header_time"';
For detailed nginx configuration, see: http://www.zsythink.net/archives/3213
Parsing the nginx log
Log contents:
192.168.100.175 - - [14/Jun/2020:20:18:59 +0800] "POST /mall_open_api.php?token=9f1063d6bf1ec1 HTTP/1.0" 300 1829 "-" "Dart/2.8 (dart:io)" "-" "mall-api.carisok.com" "0.556" "0.556" "0.000" "0.556"
192.168.100.173 - - [14/Jun/2020:20:18:59 +0800] "POST /mall_open_api.php?token=a3a9c69 HTTP/1.0" 200 1829 "-" "Dart/2.8 (dart:io)" "-" "mall-api.carisok.com" "0.446" "0.445" "0.000" "0.445"
192.168.100.175 - - [14/Jun/2020:20:19:03 +0800] "POST /mall_open_api.php?token=b9b0eadf7 HTTP/1.0" 200 731 "-" "Dart/2.8 (dart:io)" "-" "mall-api.carisok.com" "0.153" "0.154" "0.000" "0.154"
192.168.100.176 - - [14/Jun/2020:20:19:14 +0800] "POST /mall_open_api.php?token=242b947c HTTP/1.0" 200 97 "-" "Dart/2.8 (dart:io)" "-" "mall-api.carisok.com" "0.425" "0.425" "0.000" "0.425"
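Each of these lines follows the custom format, so its fields can be pulled out with a regular expression that uses named groups. The sketch below is a simplified illustration rather than the post's own pattern: it covers only the leading fields (up to the byte count), and the class name, method name, and group names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Simplified pattern: IP, identity, user, [time], "METHOD URL PROTOCOL", status, bytes.
    static final Pattern LINE = Pattern.compile(
            "^(?<ip>\\S+) \\S+ (?<user>\\S+) \\[(?<time>[^\\]]+)\\] "
          + "\"(?<method>[A-Z]+) (?<url>\\S+) (?<protocol>[^\"]+)\" "
          + "(?<status>\\d{3}) (?<bytes>\\d+)");

    /** Returns {ip, method, url, status}, or null when the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Named groups are read back by name instead of by position.
        return new String[] {
            m.group("ip"), m.group("method"), m.group("url"), m.group("status")
        };
    }

    public static void main(String[] args) {
        String line = "192.168.100.173 - - [14/Jun/2020:20:18:59 +0800] "
                + "\"POST /mall_open_api.php?token=a3a9c69 HTTP/1.0\" 200 1829 \"-\"";
        String[] fields = parse(line);
        System.out.println(String.join(" | ", fields));
    }
}
```

Here each group captures only the field value, so no separator has to be stripped afterwards.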
Log parsing: a regular expression with named groups, read through Java's Pattern and Matcher classes.
String pattern = "(?<ip>\\d+\\.\\d+\\.\\d+\\.\\d+)(?<datetime> - - \\[(.*?)])(?<t1>\\s[\\\\\"]+)(?<requestMethod>[A-Z]+)(?<t2> )(?<requestUrl>\\S+\\s)(?<protocol>\\S+\")(?<status> \\d+)(?<bytes> \\d+)" +
"(?<referer> \"(.*?)\")(?<agent> \"(.*?)\")(?<forwarded> \"(.*?)\")(?<host> \"(.*?)\")" +
"(?<requestTime> \"(.*?)\")(?<responseTime> \"(.*?)\")(?<connectTime> \"(.*?)\")(?<headerTime> \"(.*?)\")";
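Implementation: read the log file line by line into a list, then use the ForkJoin parallel framework to parse every entry in the list and merge the statistics of identical URIs. The result is the day's access statistics for each URI (total requests, successful requests, average response time, and so on).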
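The divide-and-merge idea can be seen in isolation in a smaller sketch that only counts how often each key occurs. It is an illustration under assumptions, not the post's code: the class names and the THRESHOLD constant are hypothetical, and unlike the full listing it stops splitting once a range is small enough instead of going down to single lines.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class UriCountSketch {
    // Hypothetical cut-off: ranges at or below this size are counted sequentially.
    static final int THRESHOLD = 2;

    static class CountTask extends RecursiveTask<Map<String, Integer>> {
        private final List<String> keys;
        private final int start;
        private final int end;

        CountTask(List<String> keys, int start, int end) {
            this.keys = keys;
            this.start = start;
            this.end = end;
        }

        @Override
        protected Map<String, Integer> compute() {
            if (end - start <= THRESHOLD) {
                // Small (or empty) range: count directly.
                Map<String, Integer> counts = new HashMap<>();
                for (int i = start; i < end; i++) {
                    counts.merge(keys.get(i), 1, Integer::sum);
                }
                return counts;
            }
            int mid = (start + end) >>> 1;
            CountTask left = new CountTask(keys, start, mid);
            CountTask right = new CountTask(keys, mid, end);
            left.fork();                                   // run the left half asynchronously
            Map<String, Integer> result = right.compute(); // do the right half in this thread
            left.join().forEach((k, v) -> result.merge(k, v, Integer::sum));
            return result;
        }
    }

    static Map<String, Integer> count(List<String> keys) {
        return ForkJoinPool.commonPool().invoke(new CountTask(keys, 0, keys.size()));
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("/a", "/b", "/a", "/c", "/a")));
    }
}
```

The full program below follows the same fork, compute, join, merge shape, but carries a statistics object per URI instead of a plain counter.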
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class forkJoin_log {
    public static void main(String[] args) {
        long start_time = System.currentTimeMillis();
        // Read the log file line by line
        String fileName = "C:\\Users\\Administrator\\Desktop\\log\\access_20200615.log";
        ArrayList<String> logList = readFileByLines(fileName);
        System.out.println("log size: " + logList.size());
        long load_file_time = System.currentTimeMillis();
        System.out.println("load file time: " + (load_file_time - start_time));
        // Create the fork/join thread pool
        ForkJoinPool fjp = new ForkJoinPool(5);
        // Create the divide-and-conquer task
        Log task = new Log(logList, 0, logList.size());
        // Run the task
        Map<String, RequestMsg> result = fjp.invoke(task);
        // Print the results
        long end_time = System.currentTimeMillis();
        System.out.println("result size: " + result.size());
        System.out.println("forkjoin compute_time: " + (end_time - load_file_time));
        System.out.println("total time: " + (end_time - start_time));
    }
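    // Divide-and-conquer task
    static class Log extends RecursiveTask<Map<String, RequestMsg>> {
        private int start;
        private int end;
        private ArrayList<String> logList;

        Log(ArrayList<String> logList, int start, int end) {
            this.logList = logList;
            this.start = start;
            this.end = end;
        }

        @Override
        protected Map<String, RequestMsg> compute() {
            // An empty range (for example an empty log file) has nothing to parse;
            // without this check the split below would recurse forever
            if (end - start <= 0)
                return new HashMap<>();
            // Base case: a single line cannot be split further, so parse it
            if (end - start == 1)
                return calc(logList.get(start));
            int mid = (start + end) / 2;
            Log f1 = new Log(logList, start, mid);
            Log f2 = new Log(logList, mid, end);
            // Fork f1 as a subtask and compute f2 in the current thread,
            // so this thread does real work instead of only handing out tasks
            f1.fork();
            // Wait for the subtask and merge the two results
            return merge(f2.compute(), f1.join());
        }
    }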
    public static Map<String, RequestMsg> merge(Map<String, RequestMsg> requestMap1, Map<String, RequestMsg> requestMap2) {
        Map<String, RequestMsg> result = new HashMap<>();
        result.putAll(requestMap1);
        // Fold the entries of the second map into the result
        requestMap2.forEach((k, v) -> {
            RequestMsg rm = result.get(k);
            if (rm != null) {
                Integer succeed_visit_times = rm.getSucceed_visit_times() + v.getSucceed_visit_times();
                if (succeed_visit_times != 0) {
                    if (rm.getSucceed_visit_times() == 0) {
                        // Only v has successful requests: take its statistics as they are
                        rm.setMax_request_time(v.getMax_request_time());
                        rm.setMin_request_time(v.getMin_request_time());
                        rm.setMax_response_time(v.getMax_response_time());
                        rm.setMin_response_time(v.getMin_response_time());
                        rm.setAverage_request_time(v.getAverage_request_time());
                        rm.setAverage_response_time(v.getAverage_response_time());
                        rm.setSucceed_visit_times(v.getSucceed_visit_times());
                    }
                    else if (v.getSucceed_visit_times() == 0) {
                        // Only rm has successful requests: its statistics stay unchanged
                    }
                    else {
                        // Both sides have successful requests: widen max/min first
                        rm.setMax_request_time(Math.max(rm.getMax_request_time(), v.getMax_request_time()));
                        rm.setMin_request_time(Math.min(rm.getMin_request_time(), v.getMin_request_time()));
                        rm.setMax_response_time(Math.max(rm.getMax_response_time(), v.getMax_response_time()));
                        rm.setMin_response_time(Math.min(rm.getMin_response_time(), v.getMin_response_time()));
                        // Then combine the averages, weighted by each side's success count
                        Double all_request_time = rm.getAverage_request_time() * rm.getSucceed_visit_times() + v.getAverage_request_time() * v.getSucceed_visit_times();
                        Double all_response_time = rm.getAverage_response_time() * rm.getSucceed_visit_times() + v.getAverage_response_time() * v.getSucceed_visit_times();
                        BigDecimal b1 = new BigDecimal(all_request_time);
                        BigDecimal b2 = new BigDecimal(all_response_time);
                        BigDecimal b3 = new BigDecimal(succeed_visit_times);
                        Double average_request_time = b1.divide(b3, 3, java.math.RoundingMode.HALF_UP).doubleValue();
                        Double average_response_time = b2.divide(b3, 3, java.math.RoundingMode.HALF_UP).doubleValue();
                        rm.setAverage_request_time(average_request_time);
                        rm.setAverage_response_time(average_response_time);
                        rm.setSucceed_visit_times(succeed_visit_times);
                    }
                }
                rm.setVisit_times(rm.getVisit_times() + v.getVisit_times());
            }
            else
                result.put(k, v);
        });
        return result;
    }
    public static Map<String, RequestMsg> calc(String log) {
        Map<String, RequestMsg> result = new HashMap<>();
        parseLine(log, result);
        return result;
    }

    public static ArrayList<String> readFileByLines(String fileName) {
        ArrayList<String> logList = new ArrayList<>();
        System.out.println("Reading the file one line at a time:");
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            String line;
            // readLine() returns null at the end of the file
            while ((line = reader.readLine()) != null) {
                logList.add(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return logList;
    }
    // Compiled once and reused; Pattern instances are safe to share between threads.
    // Note: some fields are not separated by a space, and most groups capture
    // their leading separator, which is stripped below.
    private static final Pattern LINE_PATTERN = Pattern.compile(
            "(?<ip>\\d+\\.\\d+\\.\\d+\\.\\d+)(?<datetime> - - \\[(.*?)])(?<t1>\\s[\\\\\"]+)(?<requestMethod>[A-Z]+)(?<t2> )(?<requestUrl>\\S+\\s)(?<protocol>\\S+\")(?<status> \\d+)(?<bytes> \\d+)" +
            "(?<referer> \"(.*?)\")(?<agent> \"(.*?)\")(?<forwarded> \"(.*?)\")(?<host> \"(.*?)\")" +
            "(?<requestTime> \"(.*?)\")(?<responseTime> \"(.*?)\")(?<connectTime> \"(.*?)\")(?<headerTime> \"(.*?)\")");
    // Matches the path of a URL up to the "?" that starts the query string
    private static final Pattern PATH_PATTERN = Pattern.compile("\\/(.*?)\\?");

    public static RequestMsg parseLine(String str, Map<String, RequestMsg> url) {
        RequestMsg requestMsg = new RequestMsg();
        Matcher m = LINE_PATTERN.matcher(str);
        if (m.find()) {
            String ip = m.group("ip");
            String datetime = m.group("datetime").replaceAll(" - - |\\[|\\]", "");
            String requestMethod = m.group("requestMethod");
            // Take requestUrl and drop the query string
            String requestUrl = m.group("requestUrl");
            Matcher m1 = PATH_PATTERN.matcher(requestUrl);
            while (m1.find()) requestUrl = m1.group().replaceAll(" |\\?", "");
            requestUrl = requestUrl.replaceAll(" ", "");
            requestUrl = requestUrl.length() > 255 ? requestUrl.substring(0, 250) : requestUrl;
            requestUrl = requestUrl.indexOf(",") != -1 ? requestUrl.substring(requestUrl.indexOf(",") + 1) : requestUrl;
            String protocol = m.group("protocol").replaceAll("\"", "");
            String status = m.group("status").substring(1);
            String bytes = m.group("bytes").substring(1);
            String referer = m.group("referer").substring(1).replaceAll("\"", "");
            String requestTime = m.group("requestTime").substring(1).replaceAll("\"", "");
            requestTime = requestTime.equals("-") ? "0.000" : requestTime;
            String responseTime = m.group("responseTime").substring(1).replaceAll("\"", "");
            // With several upstream attempts nginx logs comma-separated times; keep the last one
            responseTime = responseTime.substring(responseTime.lastIndexOf(",") + 1).trim();
            responseTime = responseTime.equals("-") ? "0.000" : responseTime;
            // Record the details of this request
            requestMsg.setVisit_times(1);
            requestMsg.setRemote_addr(ip);
            requestMsg.setTime_local(datetime);
            requestMsg.setRequestMethod(requestMethod);
            requestMsg.setRequestUrl(requestUrl);
            requestMsg.setProtocol(protocol);
            requestMsg.setStatus(Integer.parseInt(status));
            requestMsg.setBytes(Integer.parseInt(bytes));
            requestMsg.setHttp_referer(referer);
            requestMsg.setRequest_time(Double.valueOf(requestTime));
            requestMsg.setResponse_time(Double.valueOf(responseTime));
            // Record the statistics that only apply to successful requests
            if (status.equals("200")) {
                requestMsg.setMax_request_time(Double.valueOf(requestTime));
                requestMsg.setMin_request_time(Double.valueOf(requestTime));
                requestMsg.setMax_response_time(Double.valueOf(responseTime));
                requestMsg.setMin_response_time(Double.valueOf(responseTime));
                requestMsg.setAverage_request_time(Double.valueOf(requestTime));
                requestMsg.setAverage_response_time(Double.valueOf(responseTime));
                requestMsg.setSucceed_visit_times(1);
            } else {
                requestMsg.setAverage_request_time(0.0);
                requestMsg.setAverage_response_time(0.0);
                requestMsg.setSucceed_visit_times(0);
            }
            url.put(requestUrl, requestMsg);
        }
        return requestMsg;
    }
}
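The listing relies on a RequestMsg class that the post does not show. The following is a minimal sketch of what it might look like; the field names and types are assumptions inferred from the getter and setter calls in the listing, and the real class may differ.

```java
// Hypothetical reconstruction: one RequestMsg holds the statistics of one URI.
class RequestMsg {
    // Details of a single request
    private String remote_addr, time_local, requestMethod, requestUrl, protocol, http_referer;
    private Integer status, bytes;
    private Double request_time, response_time;
    // Aggregates; max/min stay null until a successful (status 200) request is recorded
    private Integer visit_times = 0, succeed_visit_times = 0;
    private Double max_request_time, min_request_time, max_response_time, min_response_time;
    private Double average_request_time = 0.0, average_response_time = 0.0;

    public void setRemote_addr(String v) { this.remote_addr = v; }
    public void setTime_local(String v) { this.time_local = v; }
    public void setRequestMethod(String v) { this.requestMethod = v; }
    public void setRequestUrl(String v) { this.requestUrl = v; }
    public void setProtocol(String v) { this.protocol = v; }
    public void setHttp_referer(String v) { this.http_referer = v; }
    public void setStatus(Integer v) { this.status = v; }
    public void setBytes(Integer v) { this.bytes = v; }
    public void setRequest_time(Double v) { this.request_time = v; }
    public void setResponse_time(Double v) { this.response_time = v; }

    public String getRequestUrl() { return requestUrl; }
    public Integer getStatus() { return status; }

    public Integer getVisit_times() { return visit_times; }
    public void setVisit_times(Integer v) { this.visit_times = v; }
    public Integer getSucceed_visit_times() { return succeed_visit_times; }
    public void setSucceed_visit_times(Integer v) { this.succeed_visit_times = v; }

    public Double getMax_request_time() { return max_request_time; }
    public void setMax_request_time(Double v) { this.max_request_time = v; }
    public Double getMin_request_time() { return min_request_time; }
    public void setMin_request_time(Double v) { this.min_request_time = v; }
    public Double getMax_response_time() { return max_response_time; }
    public void setMax_response_time(Double v) { this.max_response_time = v; }
    public Double getMin_response_time() { return min_response_time; }
    public void setMin_response_time(Double v) { this.min_response_time = v; }

    public Double getAverage_request_time() { return average_request_time; }
    public void setAverage_request_time(Double v) { this.average_request_time = v; }
    public Double getAverage_response_time() { return average_response_time; }
    public void setAverage_response_time(Double v) { this.average_response_time = v; }
}
```

Boxed types (Integer, Double) are used here because the listing assigns the getter results to Integer and Double variables; primitives would work equally well with the same calls.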
References:
https://www.aboutyun.com//forum.php/?mod=viewthread&tid=20709&extra=page%3D1&page=1&