Does Googlebot use inference when spidering? Having crawled site.com/article/page1.htm and /page2.htm, can it guess that a /page3.htm exists and crawl it? Or does it stick entirely to what it finds via the link graph and/or Sitemaps/feeds?
Matt Cutts explains: