Heritrix Study Notes 1: Heritrix defined codes
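This post was translated by the blogger; please credit the source when reposting. If any part of the translation is inaccurate, please point it out so it can be corrected. Thanks.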
1 Successful DNS lookup
0 Fetch never tried (perhaps protocol unsupported or illegal URI)
-1 DNS lookup failed
-2 HTTP connect failed
-3 HTTP connect broken
-4 HTTP timeout (before any meaningful response received)
-5 Unexpected runtime exception; see runtime-errors.log
-6 Prerequisite domain-lookup failed, precluding fetch attempt
-7 URI recognized as unsupported or illegal
-8 Multiple retries all failed, retry limit reached (the retry limit is configurable)
-50 Temporary status assigned to URIs awaiting preconditions; appearance in logs may be a bug
-60 Failure status assigned to URIs which could not be queued by the Frontier, Heritrix's scheduler (and may in fact be unfetchable)
-61 Prerequisite robots.txt fetch failed, precluding a fetch attempt
-62 Some other prerequisite failed, precluding a fetch attempt
-63 A prerequisite (of any type) could not be scheduled, precluding a fetch attempt
-3000 Severe Java 'Error' conditions (OutOfMemoryError, StackOverflowError, etc.) during URI processing
-4000 'Chaff' detection of traps/content of negligible value applied
-4001 Too many link hops away from seed
-4002 Too many embed/transitive hops away from last URI in scope
-5000 Out of scope upon reexamination (only happens if scope changes during crawl)
-5001 Blocked from fetch by user setting
-5002 Blocked by a custom processor
-5003 Blocked due to exceeding an established quota
-5004 Blocked due to exceeding an established runtime
-6000 Deleted from Frontier by user
-7000 Processing thread was killed by the operator (perhaps because of a hung condition)
-9998 Robots.txt rules precluded fetch
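
As a quick way to put the list above to use, here is a minimal Python sketch of a lookup table for the Heritrix-defined codes. The descriptions are copied from the list; the names `HERITRIX_CODES`, `describe_heritrix_code`, and `summarize` are made up for illustration and are not part of Heritrix, and the sample input is invented rather than taken from a real crawl.

```python
from collections import Counter

# Descriptions copied from the list above; keys are Heritrix-defined codes.
HERITRIX_CODES = {
    1: "Successful DNS lookup",
    0: "Fetch never tried (perhaps protocol unsupported or illegal URI)",
    -1: "DNS lookup failed",
    -2: "HTTP connect failed",
    -3: "HTTP connect broken",
    -4: "HTTP timeout (before any meaningful response received)",
    -5: "Unexpected runtime exception; see runtime-errors.log",
    -6: "Prerequisite domain-lookup failed, precluding fetch attempt",
    -7: "URI recognized as unsupported or illegal",
    -8: "Multiple retries all failed, retry limit reached",
    -50: "Temporary status assigned to URIs awaiting preconditions",
    -60: "Failure status assigned to URIs which could not be queued by the Frontier",
    -61: "Prerequisite robots.txt fetch failed, precluding a fetch attempt",
    -62: "Some other prerequisite failed, precluding a fetch attempt",
    -63: "A prerequisite (of any type) could not be scheduled",
    -3000: "Severe Java 'Error' conditions during URI processing",
    -4000: "'Chaff' detection of traps/content of negligible value applied",
    -4001: "Too many link hops away from seed",
    -4002: "Too many embed/transitive hops away from last URI in scope",
    -5000: "Out of scope upon reexamination",
    -5001: "Blocked from fetch by user setting",
    -5002: "Blocked by a custom processor",
    -5003: "Blocked due to exceeding an established quota",
    -5004: "Blocked due to exceeding an established runtime",
    -6000: "Deleted from Frontier by user",
    -7000: "Processing thread was killed by the operator",
    -9998: "Robots.txt rules precluded fetch",
}


def describe_heritrix_code(code: int) -> str:
    """Return the description for a Heritrix-defined code, or a fallback."""
    return HERITRIX_CODES.get(code, f"Unknown Heritrix code {code}")


def summarize(codes):
    """Count how often each code occurs and attach its description."""
    counts = Counter(codes)
    return [(code, n, describe_heritrix_code(code))
            for code, n in counts.most_common()]


if __name__ == "__main__":
    sample = [1, -9998, -2, -2, -5003]  # made-up values, not real crawl output
    for code, n, text in summarize(sample):
        print(f"{code:>6}  x{n}  {text}")
```

A plain dictionary keeps the table easy to compare line by line against the list, and the fallback string makes codes that are not listed here visible instead of raising an error.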
HTTP codes
1xx Informational
100 Continue
101 Switching Protocols
2xx Successful
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
3xx Redirection
300 Multiple Choices
301 Moved Permanently
302 Found
303 See Other
304 Not Modified
305 Use Proxy
307 Temporary Redirect
4xx Client Error
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request-URI Too Long
415 Unsupported Media Type
416 Requested Range Not Satisfiable
417 Expectation Failed
5xx Server Error
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
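
Since the two lists use disjoint number ranges, a status value can be sorted into its family with simple arithmetic. The sketch below is one possible way to do that in Python, assuming that values from 100 to 599 are HTTP codes and that values of 1 or lower are Heritrix-defined, as the lists suggest. The function names `classify_status` and `reason_phrase` are hypothetical; `http.HTTPStatus` is from the Python standard library.

```python
from http import HTTPStatus

# Class names as used in the headings of the HTTP list above.
HTTP_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def classify_status(code: int) -> str:
    """Sort a status value into an HTTP class or the Heritrix-defined family."""
    if 100 <= code <= 599:
        hundreds = code // 100  # first digit selects the class
        return f"HTTP {hundreds}xx {HTTP_CLASSES[hundreds]}"
    if code <= 1:
        # Assumption: 1, 0 and negative values are Heritrix-defined codes.
        return "Heritrix-defined code"
    return "Unrecognized code"


def reason_phrase(code: int):
    """Return the standard-library reason phrase, or None if it has none."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return None


if __name__ == "__main__":
    for value in (200, 404, 503, 1, -9998):
        print(value, classify_status(value), reason_phrase(value))
```

Note that the phrases returned by `HTTPStatus` follow the Python version in use and may differ slightly from the wording in the list above for a few codes.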