Advanced

The Advanced tab in the Download Settings dialog provides the following settings:

Check All Links

Check this box to have SiteSucker check all links in all downloaded HTML files — including links to files that you are not downloading — and log any errors that occur. With this option turned on, SiteSucker will report many errors that you normally wouldn't see. This setting is intended as a debugging tool for Web designers who want to see if their own sites have any bad links.

Delete Small Images

Check this box to have SiteSucker delete small images after they are downloaded. SiteSucker will delete any image if its width is less than half the width of the main screen and its height is less than half the height of the main screen. You can use this feature to delete thumbnails and other small images.
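
The size rule above can be written as a single predicate. This is only a minimal sketch of the rule as described, not SiteSucker's actual code; the function name and the way the screen size is passed in are assumptions made for illustration.

```python
def is_small_image(width, height, screen_width, screen_height):
    """Return True when an image is below BOTH half-screen thresholds.

    Hypothetical helper: all sizes are in pixels, and the main screen
    dimensions are supplied by the caller.
    """
    return width < screen_width / 2 and height < screen_height / 2


# On a 1920 x 1080 main screen the thresholds are 960 x 540.
thumbnail_is_small = is_small_image(200, 150, 1920, 1080)   # both below
banner_is_small = is_small_image(1200, 150, 1920, 1080)     # width too large
```

Because both conditions must hold, a wide but short image (such as a banner) would be kept under this reading of the rule.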

Suppress Login Dialog

Whenever SiteSucker encounters a page that requires authentication, it first looks for the appropriate credentials in the Keychain. If nothing is found in the Keychain, it displays the Login Dialog.

Check this box to suppress display of the Login Dialog and skip the download of any pages that require authentication. For more information on authentication, see Password-protected Sites.

Ignore Robot Exclusions

Check this box to have SiteSucker ignore robots.txt exclusions and the Robots META tag.

Warning: Ignoring robot exclusions is not recommended. Robot exclusions are usually put in place for a good reason and should be obeyed.

By default, SiteSucker honors robots.txt exclusions and the Robots META tag. The robots.txt file allows a Web site administrator to define which parts of a site are off-limits to specific robots, like SiteSucker. For example, Web administrators can disallow access to CGI, private, and temporary directories because they do not want pages in those areas downloaded. In addition to server-wide robot control using robots.txt, Web page creators can also use the Robots META tag to specify that the links on a page should not be followed by robots.
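
To illustrate how a robots.txt exclusion is evaluated, here is a short sketch using Python's standard urllib.robotparser module. It does not show how SiteSucker itself processes the file; the robots.txt contents, the "ExampleBot" robot name, and the example.com URLs are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that closes off three directories to all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(robots_text):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is an invented robot name used only for this illustration.
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/private/notes.html")
```

A robot that honors the exclusions would download the first URL and skip the second.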

Assume Ambiguous URLs Are Files

Check this box to have SiteSucker treat ambiguous URLs as files. If a URL does not end with a '/' and the last path component does not have a file extension, SiteSucker considers it to be ambiguous. When this option is off, SiteSucker adds a '/' to the end of ambiguous URLs.
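
The ambiguity test described above can be sketched as follows. This is a literal reading of the rule under stated assumptions, not SiteSucker's implementation: the function names and the assume_files parameter are hypothetical, and edge cases (such as a URL with no path at all) may be handled differently by the application.

```python
import posixpath
from urllib.parse import urlsplit


def is_ambiguous(url):
    """True if the path has no trailing '/' and no file extension."""
    path = urlsplit(url).path
    if path.endswith("/"):
        return False
    last_component = posixpath.basename(path)
    return posixpath.splitext(last_component)[1] == ""


def resolve(url, assume_files):
    """Apply the setting: leave the URL alone, or append '/' to the path.

    assume_files stands in for the checkbox state (hypothetical name).
    """
    if assume_files or not is_ambiguous(url):
        return url
    parts = urlsplit(url)
    # Append the slash to the path only, so any query string is preserved.
    return parts._replace(path=parts.path + "/").geturl()


as_folder = resolve("https://www.example.com/docs", assume_files=False)
as_file = resolve("https://www.example.com/docs", assume_files=True)
```

With the option off, the example URL is treated as a folder and gains a trailing '/'; with the option on, it is left unchanged and treated as a file.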

Save Web URL as Spotlight Comment

Check this box to have SiteSucker store the Web URL of each downloaded file in the file's Spotlight Comments field.

Download Attempts

Use this control to specify the number of times SiteSucker should attempt to download a file. SiteSucker will only retry downloading a file if a timeout error occurs.
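
The retry rule (try again only after a timeout) might look like the following sketch. The function and parameter names are hypothetical, and fetch stands for any callable that downloads a URL and raises Python's built-in TimeoutError when the server does not respond in time; none of this reflects SiteSucker's internals.

```python
def download_with_retries(fetch, url, attempts):
    """Call fetch(url) up to `attempts` times, retrying only on timeouts.

    Any other exception (for example, a missing file) is raised
    immediately without using up the remaining attempts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return fetch(url)
        except TimeoutError as error:
            last_error = error  # remember the timeout and try again
    raise last_error
```

In this sketch the attempt count includes the first try, so a value of 1 means no retries.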

Download Timeout

Use this control to select the length of time that SiteSucker should wait for a response from the server.

Download Delay

Use this control to specify the length of time that SiteSucker should delay before it downloads a file. This feature can allow you to download sites while using very little bandwidth and can help avoid anti-mining safeguards employed by some sites.

The delay can be set to None or to a fixed range of values (e.g., 20 - 40 seconds). If you select None, SiteSucker downloads the site as quickly as possible. If you select a delay range, SiteSucker will add a random delay (within the selected range) before it downloads a file. Furthermore, if a delay is specified, SiteSucker will only use a single active connection to download files since the whole purpose of using multiple connections is to reduce delays.
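
As a rough illustration of the behavior described above, the sketch below picks a random delay inside the selected range and falls back to a single connection whenever a delay is set. The names, the use of None for the "None" setting, and the max_connections parameter are assumptions for the example; the source does not say how SiteSucker draws its random value.

```python
import random


def pick_delay(delay_range):
    """Return a delay in seconds: 0 for None, else a random value in range."""
    if delay_range is None:
        return 0.0
    low, high = delay_range
    return random.uniform(low, high)


def connection_count(delay_range, max_connections):
    """Use a single connection whenever a delay range is selected."""
    return max_connections if delay_range is None else 1


delay = pick_delay((20, 40))                 # somewhere between 20 and 40 seconds
connections = connection_count((20, 40), 4)  # drops to 1 because a delay is set
```

A downloader built this way would sleep for the returned number of seconds before requesting each file.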

Identity

Use this control to customize the way SiteSucker identifies itself when making a request. Some sites are very particular about which browsers they will allow. You can use this feature to "fool" the site into thinking that you are using an approved browser.

To change SiteSucker's identity, simply click on this control and select one of the Web browsers listed. (If you choose "None", SiteSucker will not include any identifying information when making requests.)
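
One way to picture the effect of this setting is as a choice of request headers. The sketch below assumes the identity is carried in the standard HTTP User-Agent header, which the source implies but does not state outright; the function name and the sample identity string are invented for illustration.

```python
def build_headers(identity):
    """Return request headers for the chosen identity.

    identity is a user agent string, or None for the "None" setting,
    in which case no identifying header is included (assumed behavior).
    """
    if identity is None:
        return {}
    return {"User-Agent": identity}


# Made-up identity string, not one taken from SiteSucker's list.
browser_headers = build_headers("ExampleBrowser/1.0")
anonymous_headers = build_headers(None)
```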

You can customize the list of available Web browsers by editing the user agent property list in your home folder at ~/Library/Application Support/SiteSucker/UserAgent.plist.