ATOMSEO

Broken Link Checker — Advanced Settings

If you need more than a check of a single URL or a whole website, you can use the advanced settings to configure the Atomseo crawler. You can include or exclude URLs, specific folders, files, and so on. You can also switch between user agents, set the crawl speed, and exclude external links from crawling.

Note: All of the options described here are available on the Professional, Enterprise, and Premium plans.

How to Use It

Option ‘Include’

Lets you specify which URLs to check, for example only a particular folder, a subfolder, certain page types, or images.
To check all pages under a folder, use, for example:
http://www.example.com/info/*

It will check only pages starting with http://www.example.com/info/, e.g. http://www.example.com/info/page1, http://www.example.com/info/page2

To check every page under a subfolder, whatever its parent folder is, use
http://www.example.com/*/subfolder/*

In this case, results are returned for every page under /subfolder/, regardless of the parent folder, which can be anything.

For example, this rule matches
http://www.example.com/folder1/subfolder/page1
http://www.example.com/folder1/subfolder/page2

and
http://www.example.com/folder2/subfolder/page1
http://www.example.com/folder2/subfolder/page2

To include only specific file types in the check, use

http://www.example.com/*jpg (checks only .jpg files)
http://www.example.com/*pdf (checks only .pdf files)

If you want to crawl pages with specific parameters, e.g. ?utm, ?price, use

http://www.example.com/?utm=* or http://www.example.com/?price=*
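
The Include patterns above can be tried out locally before they are entered in the settings. The sketch below is only an illustration: it assumes that `*` stands for any run of characters and that every other character is matched literally. How the Atomseo crawler actually evaluates patterns is not documented here, and the helper name `matches` is hypothetical.

```python
import re

def matches(pattern: str, url: str) -> bool:
    """Return True if url fits pattern, treating '*' as 'any characters'.

    Assumption: only '*' is special; '?', '.', '/' etc. are literal.
    """
    # Escape the whole pattern, then turn each escaped '*' into '.*'.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, url) is not None

# Patterns taken from the examples above, paired with sample URLs.
checks = [
    ("http://www.example.com/info/*", "http://www.example.com/info/page1"),
    ("http://www.example.com/*/subfolder/*", "http://www.example.com/folder2/subfolder/page1"),
    ("http://www.example.com/*jpg", "http://www.example.com/images/photo.jpg"),
    ("http://www.example.com/?utm=*", "http://www.example.com/?utm=newsletter"),
]

for pattern, url in checks:
    print(pattern, url, matches(pattern, url))
```

The sample URLs `images/photo.jpg` and `?utm=newsletter` are made up for the demonstration; the other two come from the examples above.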

Option ‘Exclude’

If you need to check the whole website but want to exclude certain pages, or to save your crawl budget, you can configure exclusions.
For example, to skip a single page, add its full address:

http://www.example.com/url-not-for-scanning

If you need to avoid folders or subfolders, just use
http://www.example.com/login/*

This skips all pages under the /login/ folder.

You can also use any of the patterns shown above in the Include section.

Important! You can use any number of exclude and include rules: just add them line by line in the appropriate section. Make sure the rules are written correctly, otherwise the results will be incorrect. If you have any doubts, write to us at info@atomseo.com
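
To check a rule set before running a crawl, the interplay of Include and Exclude rules can be simulated. This is a rough sketch under stated assumptions, not a description of the crawler itself: rules are entered one per line, `*` matches any run of characters, an empty Include list means "everything", and an Exclude match takes priority over an Include match. The function names are hypothetical.

```python
import re

def parse_rules(text: str) -> list:
    """Split a multi-line settings field into rules, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def rule_matches(rule: str, url: str) -> bool:
    """Assumption: '*' is the only wildcard; everything else is literal."""
    regex = re.escape(rule).replace(r"\*", ".*")
    return re.fullmatch(regex, url) is not None

def should_crawl(url: str, include_rules: list, exclude_rules: list) -> bool:
    # Assumption: an exclude rule always wins over an include rule.
    if any(rule_matches(rule, url) for rule in exclude_rules):
        return False
    # Assumption: no include rules means the whole site is in scope.
    if not include_rules:
        return True
    return any(rule_matches(rule, url) for rule in include_rules)

exclude = parse_rules("""
http://www.example.com/url-not-for-scanning
http://www.example.com/login/*
""")

for url in [
    "http://www.example.com/info/page1",
    "http://www.example.com/login/reset",
    "http://www.example.com/url-not-for-scanning",
]:
    print(url, should_crawl(url, include_rules=[], exclude_rules=exclude))
```

Running the rules against a handful of known URLs first is a cheap way to catch a mistyped pattern before it consumes crawl budget.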

Option ‘User-Agent’

By default, ATOMSEO uses this User-Agent:
Mozilla/5.0 (compatible; Atomseobot/2.0; +http://https://error404.atomseo.com/)

However, it also has built-in preset user agents for various browsers, so you can switch between them quickly when required.
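
To see how your own server responds to a given User-Agent, you can prepare a request with that header yourself. The following is a minimal sketch using Python's standard library; the default string is copied verbatim from this page, `build_request` is a hypothetical helper, and the browser string in the second call is only an illustrative placeholder. The request is built but not sent.

```python
from urllib.request import Request

# Copied as shown on this page; not verified against live crawler traffic.
ATOMSEO_DEFAULT_UA = (
    "Mozilla/5.0 (compatible; Atomseobot/2.0; "
    "+http://https://error404.atomseo.com/)"
)

def build_request(url: str, user_agent: str = ATOMSEO_DEFAULT_UA) -> Request:
    """Prepare (but do not send) a request carrying the chosen User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})

default_req = build_request("http://www.example.com/info/page1")
browser_req = build_request(
    "http://www.example.com/info/page1",
    user_agent="Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",  # placeholder
)

# urllib stores header names capitalized as "User-agent".
print(default_req.get_header("User-agent"))
print(browser_req.get_header("User-agent"))
```

Passing a prepared request to `urllib.request.urlopen` would send it, which makes it possible to compare status codes for different user agents.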

Custom HTTP Headers

You can add custom headers in the form [HttpHeaderName]: [HttpHeaderValue].
This may be useful for getting past anti-spam protection or accessing hidden pages.
You can add any number of custom HTTP headers, for example:

Cache-Control: no-cache
Pragma: no-cache
Last-Modified:

And others.
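
Before pasting headers into the settings, it can help to confirm that each line has the Name: Value shape. The sketch below is one possible check, assuming one header per line and the header-name character set defined by the HTTP specification; it is not the validation Atomseo itself performs, and `parse_headers` is a hypothetical name.

```python
import re

# Characters allowed in an HTTP header name (the "token" rule).
HEADER_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

def parse_headers(text: str) -> dict:
    """Parse 'Name: Value' lines into a dict; raise ValueError on bad lines."""
    headers = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or not HEADER_NAME.fullmatch(name):
            raise ValueError(f"Not an HTTP header: {line!r}")
        headers[name] = value.strip()  # an empty value is kept as ""
    return headers

example = """Cache-Control: no-cache
Pragma: no-cache
Last-Modified:"""

print(parse_headers(example))
```

A line without a colon, or with spaces inside the header name, raises `ValueError` instead of being silently dropped.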
