libwget-robots − Robots Exclusion file parser
struct wget_robots_st
#define parse_record_field(d, f)
int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
void wget_robots_free (wget_robots **robots)
int wget_robots_get_path_count (wget_robots *robots)
wget_string * wget_robots_get_path (wget_robots *robots, int index)
int wget_robots_get_sitemap_count (wget_robots *robots)
const char * wget_robots_get_sitemap (wget_robots *robots, int index)
These functions parse a Robots Exclusion Standard file (robots.txt) into a data structure for easy access.
Value:
parse_record_field(d, f, sizeof(f) - 1)
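The two-argument macro forwards to a three-argument form and passes sizeof(f) - 1 as the length of the field name. This works when f is a string literal, because sizeof on a literal counts the trailing NUL byte. The following is a minimal sketch of that idiom only: match_field_n and match_field are hypothetical names, and the matching rules shown (case-insensitive name followed by a colon) are an assumption, not the libwget implementation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of libwget: if *d starts with the field
 * name f (flen bytes, compared case-insensitively) followed by ':',
 * advance *d past the colon and any blanks and return true. */
static bool match_field_n(const char **d, const char *f, size_t flen)
{
    const char *p = *d;

    for (size_t i = 0; i < flen; i++)
        if (tolower((unsigned char)p[i]) != tolower((unsigned char)f[i]))
            return false;   /* also stops safely at the end of a short line */

    p += flen;
    if (*p != ':')
        return false;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;

    *d = p;
    return true;
}

/* sizeof("Disallow") is 9 (8 characters plus NUL), so sizeof(f) - 1 is
 * the name length, computed at compile time. f must be a string literal;
 * with a char pointer, sizeof would yield the pointer size instead. */
#define match_field(d, f) match_field_n(d, f, sizeof(f) - 1)
```

A call such as match_field(&line, "Disallow") then needs no explicit length argument and no strlen() call at run time.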
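Parameters
_robots Pointer that receives the allocated wget_robots structure
data Memory with robots.txt content (with trailing 0-byte)
client Name of the client / user-agent
Returns
Returns an int status code; the allocated wget_robots structure is handed back through _robots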
The function parses the robots.txt data in accordance with https://www.robotstxt.org/orig.html#format and produces a ROBOTS structure that holds a list of the disallowed paths and a list of the sitemap files.
The ROBOTS structure has to be freed by calling wget_robots_free().
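To make the parsing step concrete, here is a simplified, self-contained sketch of collecting Disallow paths for one client plus all Sitemap lines. It is not the libwget code: robots_sketch, span, parse_robots_sketch and the fixed MAX_ENTRIES limit are hypothetical, and details such as how an empty Disallow value or a '#' inside a URL is treated are simplifying assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 32   /* arbitrary limit for this sketch */

/* Hypothetical result types; entries point into the input text. */
typedef struct { const char *p; size_t len; } span;
typedef struct {
    span paths[MAX_ENTRIES];    size_t path_count;
    span sitemaps[MAX_ENTRIES]; size_t sitemap_count;
} robots_sketch;

/* Case-insensitive check that s starts with the n bytes of prefix. */
static bool starts_with_nocase(const char *s, const char *prefix, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)s[i]) != tolower((unsigned char)prefix[i]))
            return false;
    return true;
}

/* Return the value after "<name>:" and blanks, or NULL if no match. */
static const char *field_value(const char *line, const char *name)
{
    size_t n = strlen(name);

    if (!starts_with_nocase(line, name, n) || line[n] != ':')
        return NULL;
    line += n + 1;
    while (*line == ' ' || *line == '\t')
        line++;
    return line;
}

static void parse_robots_sketch(robots_sketch *out, const char *data,
                                const char *client)
{
    size_t client_len = strlen(client);
    bool collect = false;         /* current record applies to client */
    bool prev_was_agent = false;  /* consecutive User-agent lines share a record */

    memset(out, 0, sizeof(*out));
    while (*data) {
        const char *v;

        if ((v = field_value(data, "User-agent"))) {
            bool match = *v == '*' ||
                (client_len && starts_with_nocase(v, client, client_len));
            collect = prev_was_agent ? (collect || match) : match;
            prev_was_agent = true;
        } else {
            prev_was_agent = false;
            if (collect && (v = field_value(data, "Disallow"))) {
                size_t len = strcspn(v, " \t\r\n#");  /* stop at blank or comment */
                if (len && out->path_count < MAX_ENTRIES)
                    out->paths[out->path_count++] = (span){ v, len };
            } else if ((v = field_value(data, "Sitemap"))) {
                size_t len = strcspn(v, " \t\r\n#");
                if (len && out->sitemap_count < MAX_ENTRIES)
                    out->sitemaps[out->sitemap_count++] = (span){ v, len };
            }
        }
        data += strcspn(data, "\n");  /* move to the next line */
        if (*data == '\n')
            data++;
    }
}
```

The sketch stores pointers into the caller's buffer and therefore needs no cleanup. The library hands back an allocated structure instead, which is why the result of wget_robots_parse() has to be released with wget_robots_free().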
Parameters
robots Pointer to pointer to wget_robots structure
wget_robots_free() frees the previously allocated wget_robots structure.
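Free functions that take a pointer to the caller's pointer commonly release the object and then reset the caller's handle, so a stale pointer cannot be reused by accident. The sketch below shows that general pattern with a hypothetical stand-in type (demo_robots, demo_robots_free). Whether wget_robots_free() also sets the pointer to NULL is an assumption here and is not stated in this page.

```c
#include <stdlib.h>

/* Hypothetical stand-in type; not the libwget structure. */
typedef struct { int placeholder; } demo_robots;

/* Release the object and clear the caller's pointer.
 * Passing NULL, or a pointer to NULL, is a harmless no-op. */
static void demo_robots_free(demo_robots **robots)
{
    if (robots && *robots) {
        free(*robots);
        *robots = NULL;   /* the caller's handle no longer dangles */
    }
}
```

With this pattern, calling the free function twice on the same handle is safe, because the second call sees NULL.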
Parameters
robots Pointer to instance of wget_robots
Returns
Returns the number of paths listed in robots
Parameters
robots Pointer to instance of wget_robots
index Index of the wanted path
Returns
Returns the path at index or NULL
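In the original robots.txt format, a Disallow value is a path prefix: any URL path that begins with it should not be retrieved. A caller could therefore test a candidate path against each entry in turn. The sketch below assumes that wget_string is a pointer plus a length and mirrors it with a hypothetical path_view type; is_disallowed is likewise a hypothetical helper, not a libwget function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for wget_string, assumed to be a pointer plus a length. */
typedef struct { const char *p; size_t len; } path_view;

/* Hypothetical helper: true if url_path begins with any listed prefix.
 * Comparing by length avoids relying on NUL termination of the entries. */
static bool is_disallowed(const path_view *paths, size_t count,
                          const char *url_path)
{
    size_t url_len = strlen(url_path);

    for (size_t i = 0; i < count; i++) {
        if (paths[i].len <= url_len &&
            memcmp(url_path, paths[i].p, paths[i].len) == 0)
            return true;
    }
    return false;
}
```

With libwget, the entries would presumably be obtained by calling wget_robots_get_path() for every index below wget_robots_get_path_count().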
Parameters
robots Pointer to instance of wget_robots
Returns
Returns the number of sitemaps listed in robots
Parameters
robots Pointer to instance of wget_robots
index Index of the wanted sitemap URL
Returns
Returns the sitemap URL at index or NULL
Generated automatically by Doxygen for wget2 from the source code.