Downloading and Merging SAE Logs with a Python Script
For various reasons I needed the log files of a site hosted on SAE. SAE only lets you download them one day at a time, and processing the downloads by hand is painful, especially when there are many of them. Fortunately, SAE provides an API that returns the download URLs in bulk, so I wrote a Python script to download and merge these files automatically.
Calling the API to get download URLs
The documentation is here: http://sae.sina.com.cn/?m=devcenter&catId=281
Set your application and download parameters
The variables that need to be set in the request are as follows.
The code is as follows:
api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'xxxxx'
from_date = '20140101'
to_date = '20140116'
url_type = 'http'    # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http: access|debug|error|warning|notice|resources
secret_key = 'xxxxx'
Building the request URL
The official docs describe how the request URL must be built:
1. Sort the parameters
2. Build the request string from them and strip out the '&' separators
3. Append the access_key
4. Take the MD5 of the resulting string to produce the sign
5. Append the sign to the request string
The implementation code is as follows:
params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type
if url_type == 'http':
    params['type2'] = url_type2
params = collections.OrderedDict(sorted(params.items()))

request = ''
for k, v in params.iteritems():
    request += k + '=' + v + '&'

sign = request.replace('&', '')
sign += secret_key
md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()

request = api_url + request + 'sign=' + sign

response = urllib2.urlopen(request).read()
response = json.loads(response)
if response['errno'] != 0:
    print '[!] ' + response['errmsg']
    exit()
print '[#] request success'
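The five signing steps can be sanity-checked in isolation. The sketch below does this in Python 3 (the script itself targets Python 2); the key, appname, and base URL are made-up values for illustration only:

```python
import collections
import hashlib

# Toy values, not a real SAE application
secret_key = 'toykey'
params = collections.OrderedDict(sorted({
    'act': 'log',
    'appname': 'demo',
    'from': '20140101',
    'to': '20140116',
    'type': 'http',
    'type2': 'access',
}.items()))  # step 1: sort the parameters

# step 2: build the query string, then strip the '&' separators for signing
request = ''.join(k + '=' + v + '&' for k, v in params.items())
sign_src = request.replace('&', '') + secret_key          # steps 2-3
sign = hashlib.md5(sign_src.encode('utf-8')).hexdigest()  # step 4
request_url = 'http://example.com/interapi.php?' + request + 'sign=' + sign  # step 5

print(request_url)
```

Note that `sorted()` orders the keys alphabetically, so `type2` lands right after `type`, which is exactly what the API's "sort the parameters" step requires.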
Downloading the log files
SAE packages each day's logs as a tar.gz archive, so we just download and save each one, naming the file after its date as date.tar.gz.
The code is as follows:
log_files = list()
for down_url in response['data']:
    file_name = re.compile(r'\d{4}-\d{2}-\d{2}').findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, "wb") as f:
        f.write(data)
print '[#] you got %d log files' % len(log_files)
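The regex in the loop pulls the date out of each download URL to name the local archive. A quick Python 3 check with a made-up URL (the real URL shape comes from SAE's response):

```python
import re

# Hypothetical download URL, standing in for one returned by the SAE API
down_url = 'http://dl.example.com/logs/2014-01-05_access.tar.gz?token=abc'

# Extract the first YYYY-MM-DD date found in the URL
date_re = re.compile(r'\d{4}-\d{2}-\d{2}')
file_name = date_re.findall(down_url)[0] + '.tar.gz'
print(file_name)  # 2014-01-05.tar.gz
```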
Merging the files
To merge them, use the tarfile library to extract each archive and append its contents to access_log.
The code is as follows:
# merge these files into access_log
access_log = open('access_log', 'w')
for log_file in log_files:
    tar = tarfile.open(log_file)
    log_name = tar.getnames()[0]
    tar.extract(log_name)
    # append to access_log
    data = open(log_name).read()
    access_log.write(data)
    os.remove(log_name)
access_log.close()
print '[#] all files have been written to access_log'
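The same merge flow can be exercised end to end without SAE by fabricating two day archives locally. This is a Python 3 sketch (the post's own code is Python 2); the dates and log contents are invented for the demonstration:

```python
import os
import tarfile
import tempfile

# Build two small day archives in a temp dir to stand in for the downloads
workdir = tempfile.mkdtemp()
os.chdir(workdir)

for day, text in [('2014-01-01', 'line from day 1\n'),
                  ('2014-01-02', 'line from day 2\n')]:
    log_name = day + '.log'
    with open(log_name, 'w') as f:
        f.write(text)
    with tarfile.open(day + '.tar.gz', 'w:gz') as tar:
        tar.add(log_name)
    os.remove(log_name)

# Merge exactly as in the script: extract each member, append it, clean up
log_files = ['2014-01-01.tar.gz', '2014-01-02.tar.gz']
with open('access_log', 'w') as access_log:
    for log_file in log_files:
        with tarfile.open(log_file) as tar:
            log_name = tar.getnames()[0]
            tar.extract(log_name)
        with open(log_name) as f:
            access_log.write(f.read())
        os.remove(log_name)

merged = open('access_log').read()
print(merged)
```

Because the archives are processed in list order, the merged access_log keeps the days in chronological order.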
Complete code
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: Su Yan <http://yansu.org>
# @Date: 2014-01-17 12:05:19
# @Last Modified by: Su Yan
# @Last Modified time: 2014-01-17 14:15:41

import os
import collections
import hashlib
import urllib2
import json
import re
import tarfile

# settings
# documents: http://sae.sina.com.cn/?m=devcenter&catId=281
api_url = 'http://dloadcenter.sae.sina.com.cn/interapi.php?'
appname = 'yansublog'
from_date = '20140101'
to_date = '20140116'
url_type = 'http'    # http|taskqueue|cron|mail|rdc
url_type2 = 'access' # only when type=http: access|debug|error|warning|notice|resources
secret_key = 'zwzim4zhk35i50003kz2lh3hyilz01m03515j0i5'

# encode request
params = dict()
params['act'] = 'log'
params['appname'] = appname
params['from'] = from_date
params['to'] = to_date
params['type'] = url_type
if url_type == 'http':
    params['type2'] = url_type2
params = collections.OrderedDict(sorted(params.items()))

request = ''
for k, v in params.iteritems():
    request += k + '=' + v + '&'

sign = request.replace('&', '')
sign += secret_key
md5 = hashlib.md5()
md5.update(sign)
sign = md5.hexdigest()

request = api_url + request + 'sign=' + sign

# request api
response = urllib2.urlopen(request).read()
response = json.loads(response)
if response['errno'] != 0:
    print '[!] ' + response['errmsg']
    exit()
print '[#] request success'

# download and save files
log_files = list()
for down_url in response['data']:
    file_name = re.compile(r'\d{4}-\d{2}-\d{2}').findall(down_url)[0] + '.tar.gz'
    log_files.append(file_name)
    data = urllib2.urlopen(down_url).read()
    with open(file_name, "wb") as f:
        f.write(data)
print '[#] you got %d log files' % len(log_files)

# merge these files into access_log
access_log = open('access_log', 'w')
for log_file in log_files:
    tar = tarfile.open(log_file)
    log_name = tar.getnames()[0]
    tar.extract(log_name)
    # append to access_log
    data = open(log_name).read()
    access_log.write(data)
    os.remove(log_name)
access_log.close()
print '[#] all files have been written to access_log'