Comments on Paul Melson's Blog: Monkey-Spider (feed by PaulM, updated 2022-03-30T14:14:56.448-05:00)

jbmoore, 2010-02-09T10:25:15.933-05:00:

You have likely read this already, but http://monkeyspider.sourceforge.net/documentation.html contains this blurb:

Step 1 Seeding:

The Heritrix crawler starts crawling with a plain text file called seeds.txt inside of the standard crawl profile. There are four different methods to generate starting seeds for the crawler:
Manual URL addition: URL entries can be added manually during the crawl configuration or directly to the seeds.txt file if we want to analyze a known predefined set of Web sites.

So modifying the crawler component's seeds.txt is the first thing to try. Alternatively, you could just use Malware Domain List, http://www.malwaredomainlist.com/ , and Wepawet, http://wepawet.iseclab.org/ , to correlate and analyze your web traffic. Submitting to VirusTotal and CWSandbox, http://www.sunbeltsoftware.com/Malware-Research-Analysis-Tools/Sunbelt-CWSandbox/ , wouldn't hurt either.
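
For the manual route, here is a rough Python sketch of adding URLs to seeds.txt without duplicating entries. It assumes the seed file holds one URL per line and that lines starting with # are comments; check that against your own crawl profile before relying on it. The function names (is_crawlable, merge_seeds, add_seeds) are made up for this example and are not part of Monkey-Spider or Heritrix.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_crawlable(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def merge_seeds(existing_text: str, new_urls: list[str]) -> str:
    """Return seed-file text with new_urls appended, skipping duplicates.

    Assumption: one URL per line, '#' starts a comment line.
    """
    lines = existing_text.splitlines()
    # URLs already in the file (blank lines and comments are ignored).
    seen = {
        line.strip()
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    }
    for url in new_urls:
        url = url.strip()
        if is_crawlable(url) and url not in seen:
            lines.append(url)
            seen.add(url)
    return "\n".join(lines) + "\n"


def add_seeds(seed_file: Path, new_urls: list[str]) -> None:
    """Rewrite seed_file in place with the merged seed list."""
    current = seed_file.read_text(encoding="utf-8") if seed_file.exists() else ""
    seed_file.write_text(merge_seeds(current, new_urls), encoding="utf-8")
```

You would call something like add_seeds(Path("seeds.txt"), urls), where the path points at the seeds.txt of whichever crawl profile you are using and urls is your own list of sites to analyze. Anything that is not a plain http or https URL is silently dropped, which may or may not be what you want.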