Job configuration experience

We often run into the same problems: a site clearly has plenty of links, so why do we crawl so few of them?
Why is the crawl rate as slow as a snail? And why are the downloaded links not the ones we want?
Here are a few small fixes.
Too few downloaded links usually means the domain restrictions are too narrow. With the restrictions of a DecidingScope, for example, if the links sit under a different second-level domain than the entry (seed) URL, they cannot be extracted, so very little gets downloaded. I personally recommend BroadScope.
However, with BroadScope too much gets downloaded, because it applies no restrictions at all. Many of the files are not what we want, such as js, css and jpg. To filter them out we need to extend its Extractor or Scheduler.
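As a rough illustration of the kind of check such an extension would perform, here is a minimal sketch in plain Java. It is not Heritrix code: the class name StaticResourceFilter and the method isWanted are hypothetical, and the suffix list contains the js, css and jpg mentioned above plus a few similar ones added as an assumption. Wiring the check into a real Extractor or Scheduler depends on the Heritrix version and is left out.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Heritrix: decides whether a URL is worth downloading.
final class StaticResourceFilter {

    // Matches URLs whose path ends in an unwanted suffix, optionally followed
    // by a query string or fragment. png, gif and ico are assumed additions.
    private static final Pattern UNWANTED = Pattern.compile(
            "^[^?#]*\\.(js|css|jpe?g|png|gif|ico)([?#].*)?$");

    // Returns true when the URL does not look like a static resource.
    static boolean isWanted(String url) {
        return !UNWANTED.matcher(url.toLowerCase(Locale.ROOT)).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWanted("http://example.com/news/1.html"));
        System.out.println(isWanted("http://example.com/static/app.js?v=3"));
    }
}
```

The pattern only looks at the path, so a page such as view.jsp or a query value that happens to contain ".js" is still treated as wanted.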
But extending these is quite troublesome. As we all know from how Heritrix works, the scheduler decides which links get downloaded, and the URLs parsed out of each downloaded page are handed on in turn, so in the end every URL on every page is found and downloaded. A custom regular expression filter must therefore cover the link hierarchy layer by layer, with no gaps; if one layer is missing, the pages beneath it are never reached. Done properly, this quickly downloads just the pages we need. I suggest extending the Scheduler, because with an Extractor you often have to write the URL extraction yourself, and an unsatisfactory regular expression leaves you with very few extracted URLs.
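The "layer by layer, with no gaps" idea can be sketched as follows. This is plain Java rather than Heritrix code, and everything site-specific is an assumption made for illustration: the class name LayeredUrlFilter, the method shouldSchedule, the host www.example.com, and the three-layer layout (home page, list pages, article pages). In a real job, a check like this would be called from a custom scheduler before a URL is queued.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Heritrix: a whitelist with one pattern per link layer.
final class LayeredUrlFilter {

    // Assumed site layout: home page -> list pages -> article pages.
    // Every layer needs its own pattern, otherwise the crawl stops at the gap.
    private static final Pattern[] LAYERS = {
        Pattern.compile("^http://www\\.example\\.com/?$"),
        Pattern.compile("^http://www\\.example\\.com/list/\\d+/?$"),
        Pattern.compile("^http://www\\.example\\.com/article/\\d+\\.html$")
    };

    // Returns true when the URL belongs to one of the known layers.
    static boolean shouldSchedule(String url) {
        for (Pattern layer : LAYERS) {
            if (layer.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldSchedule("http://www.example.com/list/2"));
        System.out.println(shouldSchedule("http://www.example.com/static/site.css"));
    }
}
```

If the list-page pattern were removed, the article pages would match the whitelist but never be discovered, because the only pages that link to them would no longer be downloaded.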

