Create or modify a web robot access policy record for your web site

$13
Delivery in 1 day


What you get with this Hourlie

I will deliver a new or modified record in a robots.txt file.

The record will specify a policy on any access that uses both of the following:
- a particular scheme
- a particular authority

The record will specify the policy for your choice of one of the following:
- a particular web robot
- a particular group of web robots
- general web robots
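
For illustration, this choice corresponds to the 'User-agent' line or lines of the record; the robot names and path below are placeholders, not real robots:

```
# A particular web robot
User-agent: ExampleBot
Disallow: /private/

# A particular group of web robots (one record, several User-agent lines)
User-agent: ExampleBot
User-agent: OtherExampleBot
Disallow: /private/

# General web robots ('*' matches any robot)
User-agent: *
Disallow: /private/
```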

The record will specify the policy to address one or more of the following:
- which locations to avoid accessing
- which locations may be accessed
- the maximum rate of access
- which mirror to access
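
As a sketch, each item above maps to a directive in the record. 'Allow', 'Crawl-delay' and 'Host' are extensions that only some web robots understand; the paths and host name here are placeholders:

```
User-agent: *
Disallow: /private/       # where to avoid accessing
Allow: /private/public/   # where may be accessed
Crawl-delay: 10           # maximum rate of access (here, one access per 10 seconds)
Host: www.name.example    # which mirror to access
```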

When creating or modifying a record for a particular web robot or a particular group of web robots, I will also tell you which parts of the specification that robot or group is known to understand.


Below is some information relating to this hourlie that may be helpful.

URLs:
- a URL identifies a resource
- 'http://name.example/webpage.html' is a URL
- 'http://name.example:8080/webpage.html' is a URL
- 'https://www.name.example/directory/' is a URL
- 'ftp://ftp.name.example/directory/leaflet.pdf' is a URL
- 'http', 'https' and 'ftp' are schemes
- 'name.example', 'name.example:8080', 'www.name.example' and 'ftp.name.example' are authorities
- '/webpage.html', '/directory/' and '/directory/leaflet.pdf' are paths
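
The parts listed above can be read off one of the example URLs like this:

```
http://name.example:8080/webpage.html
\__/   \_______________/\___________/
scheme     authority        path
```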

Web robots:
- a web robot is a computer program that automatically accesses the internet
- a web robot may ignore or follow part or all of a policy while accessing resources

Robots.txt files:
- a robots.txt file is a computer file that may contain a single record or multiple records
- a record specifies a policy on access to a single resource or multiple resources
- a policy is specified for a single web robot or multiple web robots
- a robots.txt file may contain multiple records to specify multiple policies
- a web site may contain multiple robots.txt files to cover access that uses multiple schemes and/or multiple authorities
- 'http://name.example/robots.txt' is a URL identifying an example robots.txt file
- the example robots.txt file covers any access that uses the 'http' scheme and the 'name.example' authority, for example access using 'http://name.example/webpage.html'
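
A minimal sketch of what the example file at 'http://name.example/robots.txt' might contain; it holds two records, and the robot name and paths are placeholders:

```
# Record for one particular web robot
User-agent: ExampleBot
Disallow: /drafts/

# Record for general web robots
User-agent: *
Disallow: /private/
```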

Multiple instances of this hourlie can be combined to create or modify multiple records in one go.

To work on this hourlie, I need confirmation that all aspects of the relevant web site are permitted by all applicable laws.

Please contact me if you have any questions.

What the Seller needs to start the work

All occurrences below of the phrase 'the buyer' refer to the person purchasing this hourlie.
All occurrences below of the phrase 'the web site' refer to the web site for which I will deliver the record.

To work on this hourlie, I will need the buyer to deliver all of the following:
- confirmation by the buyer that all aspects of the web site are permitted by all applicable laws
- the scheme or schemes used to access the web site
- the authority or authorities used to access the web site
- confirmation by the buyer that they have permission to add a robots.txt file to the web site
- a specification that the record is for a particular web robot, a particular group of web robots or general web robots
- if required, a description of the particular web robot or group of web robots
- a specification of which locations a web robot is to avoid accessing and/or may access
- if relevant, any URL identifying a sitemap for the web site
- if relevant, a number representing how many accesses a web robot may make per minute
- if relevant, a URL identifying a mirror for the web site

To specify a location on the web site, please use any of the following:
- a URL, with scheme and authority, to specify a resource, for example 'http://name.example/webpage.html'
- a URL, with scheme and authority, to specify a resource and any resource within it, for example 'https://www.name.example/directory/'
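
As a sketch, a location given as 'https://www.name.example/directory/' would appear in the record as a path rule in the robots.txt file served with that scheme and authority:

```
# In the robots.txt file at https://www.name.example/robots.txt
User-agent: *
Disallow: /directory/   # covers /directory/ and any resource within it
```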

Please contact me for information about describing a particular web robot or group of web robots.