Here are some suggestions for creating the robots.txt file, especially for Google user agents:
1) The file should follow the Robots Exclusion Standard.
2) It can include one or more rules that allow or block a specified crawler's access to a particular path on the site.
3) A webmaster should be familiar with almost all of the robots.txt syntax in order to understand the subtle behaviour of each directive.
4) A site cannot have more than one robots.txt file.
5) The file can be served on a subdomain (like http://website.demo.com/robots.txt) and on a non-standard port (like http://demo:8181/robots.txt).
6) If you do not know whether you have access to the root folder of your website, or cannot reach it, it is best to ask the web hosting service provider to place the robots.txt file there. If you cannot access the website root at all, use meta tags as an alternative blocking method.
7) More than one group of directives or rules (written one per line) can be included in the robots.txt file.
8) The file should be plain text; Google expects UTF-8 encoding (of which ASCII is a subset), and characters outside that range may be ignored.
9) A group specifies to whom it applies (the user agent) and which files or directories that agent can or cannot access. The directives are processed from top to bottom, and a web bot associates itself with only one rule set: the group that names it specifically, or otherwise the first group that matches it (see the sample file after this list).
10) By default, a bot is assumed to be able to crawl any directory or page that is not blocked by a "Disallow:" rule; an empty "Disallow:" likewise blocks nothing.
11) The rules are case-sensitive with respect to paths; for example, Disallow: /one.xml does not apply to /ONE.xml.
12) The file applies to the full domain it is served on, whether the site uses the http or the https protocol.
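To make points 7, 9 and 10 concrete, here is a minimal sample robots.txt file; the paths and the second user agent are placeholders chosen only for illustration:

User-agent: Googlebot
Disallow: /private/
Allow: /private/public-page.html

User-agent: *
Disallow: /tmp/
Disallow: /cart/

Here Googlebot follows only the first group and ignores the second, while every other crawler follows the group under User-agent: *. Any path not covered by a Disallow rule, such as /blog/, remains crawlable by default.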
Usually, the user agents of Google and Bing follow the most specific matching group of directives, but other search engine bots may simply obey the first matching group, since different web bots interpret the directives in different ways (see the example below).
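For example, with placeholder paths again, suppose a file contains:

User-agent: *
Disallow: /downloads/

User-agent: Googlebot
Disallow: /downloads/reports/

Googlebot obeys only the more specific User-agent: Googlebot group, so it may still crawl the rest of /downloads/, whereas a crawler that simply takes the first matching group would stay out of /downloads/ entirely.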
It is also suggested that webmasters avoid the crawl-delay directive as much as possible in their robots.txt file, so that the total crawl time of the search engine bots is not increased unnecessarily.
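For reference, the directive looks like the single line below; some crawlers, Bing for instance, treat the value as a number of seconds to wait between requests, while Googlebot ignores it altogether, so in practice it mainly slows down the crawlers that do honour it:

Crawl-delay: 10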