libwget-robots - Robots Exclusion file parser

NAME

libwget-robots − Robots Exclusion file parser

SYNOPSIS

Data Structures

struct wget_robots_st

Macros

#define parse_record_field(d, f)

Functions

int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
void wget_robots_free (wget_robots **robots)
int wget_robots_get_path_count (wget_robots *robots)
wget_string * wget_robots_get_path (wget_robots *robots, int index)
int wget_robots_get_sitemap_count (wget_robots *robots)
const char * wget_robots_get_sitemap (wget_robots *robots, int index)

Detailed Description

These functions parse a Robots Exclusion Standard file (robots.txt) into a data structure for easy access.

Macro Definition Documentation

#define parse_record_field( d, f)

Value:
parse_record_field(d, f, sizeof(f) - 1)
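
The definition can look circular because the macro and the function it calls share a name. In C, a macro's own name is not expanded again inside its replacement text, so the two-argument macro forwards to a three-argument function and passes sizeof(f) - 1, the length of the string literal f without its terminating 0-byte. The three-argument function is not documented on this page, so the following is only a minimal sketch of the pattern: demo_record_field and its matching behaviour are hypothetical stand-ins, not libwget code.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three-argument function: if *data begins
 * with field (compared case-insensitively), skip past it and return true. */
static bool demo_record_field(const char **data, const char *field, size_t field_length)
{
    for (size_t i = 0; i < field_length; i++) {
        /* A 0-byte in *data never equals a field character, so the loop
         * stops before reading past the end of the input. */
        if (tolower((unsigned char) (*data)[i]) != tolower((unsigned char) field[i]))
            return false;
    }

    *data += field_length;
    return true;
}

/* Same-name wrapper: f must be a string literal (or char array), so that
 * sizeof(f) - 1 is its length without the terminating 0-byte. */
#define demo_record_field(d, f) demo_record_field(d, f, sizeof(f) - 1)
```

With this in place, demo_record_field(&p, "Disallow:") passes a length of 9. Passing a char * variable instead of a literal would make sizeof yield the pointer size, so the pattern is only safe with literals.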

Function Documentation

int wget_robots_parse (wget_robots ** _robots, const char * data, const char * client)

Parameters

_robots Pointer that receives the allocated wget_robots structure
data Memory with robots.txt content (with trailing 0-byte)
client Name of the client / user-agent

Returns

An int status code: WGET_E_SUCCESS on success, otherwise a WGET_E_* error code. On success, the newly allocated wget_robots structure is stored in *_robots.

The function parses the robots.txt data in accordance with https://www.robotstxt.org/orig.html#format and produces a wget_robots structure that holds a list of the disallowed paths and a list of the sitemap files.

The wget_robots structure must be freed by calling wget_robots_free().
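
To make the record format concrete, here is a deliberately simplified, self-contained sketch of this kind of parsing. It is not libwget's implementation: struct demo_robots, the demo_* functions, and the fixed-size buffers are hypothetical, and matching the client name against the beginning of the User-agent value is a simplifying assumption. The sketch collects Disallow paths from records that apply to the client and Sitemap URLs from anywhere in the file.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16
#define DEMO_MAX_LEN 128

/* Hypothetical result type for this sketch, not libwget's wget_robots. */
struct demo_robots {
    char paths[DEMO_MAX_ENTRIES][DEMO_MAX_LEN];
    int path_count;
    char sitemaps[DEMO_MAX_ENTRIES][DEMO_MAX_LEN];
    int sitemap_count;
};

/* Case-insensitive test whether s begins with prefix. */
static bool demo_has_prefix(const char *s, const char *prefix)
{
    for (; *prefix; s++, prefix++) {
        if (tolower((unsigned char) *s) != tolower((unsigned char) *prefix))
            return false;
    }
    return true;
}

/* If line begins with field, return the value after it (blanks skipped), else NULL. */
static const char *demo_field_value(const char *line, const char *field)
{
    if (!demo_has_prefix(line, field))
        return NULL;

    line += strlen(field);
    while (*line == ' ' || *line == '\t')
        line++;
    return line;
}

/* Copy the value up to the next whitespace, truncating to the buffer size. */
static void demo_copy_value(char *dst, const char *value)
{
    size_t n = strcspn(value, " \t\r\n");

    if (n >= DEMO_MAX_LEN)
        n = DEMO_MAX_LEN - 1;
    memcpy(dst, value, n);
    dst[n] = '\0';
}

static void demo_robots_parse(struct demo_robots *r, const char *data, const char *client)
{
    bool collect = false;    /* current record applies to client */
    bool prev_agent = false; /* previous line was a User-agent line */

    memset(r, 0, sizeof(*r));

    while (*data) {
        const char *v = demo_field_value(data, "User-agent:");
        bool is_agent = (v != NULL);

        if (is_agent) {
            /* "*" matches every client; otherwise compare the name's beginning */
            bool match = (*v == '*') || (client && *client && demo_has_prefix(v, client));
            /* consecutive User-agent lines belong to the same record */
            collect = prev_agent ? (collect || match) : match;
        } else if (collect && (v = demo_field_value(data, "Disallow:"))) {
            /* an empty Disallow value disallows nothing, so it is skipped */
            if (*v && !isspace((unsigned char) *v) && r->path_count < DEMO_MAX_ENTRIES)
                demo_copy_value(r->paths[r->path_count++], v);
        } else if ((v = demo_field_value(data, "Sitemap:"))) {
            if (r->sitemap_count < DEMO_MAX_ENTRIES)
                demo_copy_value(r->sitemaps[r->sitemap_count++], v);
        }
        prev_agent = is_agent;

        /* advance to the start of the next line */
        data += strcspn(data, "\n");
        if (*data == '\n')
            data++;
    }
}
```

With libwget itself, the same information would come from wget_robots_parse(), followed by loops over wget_robots_get_path_count() / wget_robots_get_path() and wget_robots_get_sitemap_count() / wget_robots_get_sitemap(), and a final call to wget_robots_free().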

void wget_robots_free (wget_robots ** robots)

Parameters

robots Pointer to a pointer to the wget_robots structure

wget_robots_free() frees the previously allocated wget_robots structure.
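
This page does not say why the function takes a pointer to a pointer, or whether it resets the caller's pointer. The sketch below shows the idiom that such a signature usually enables, using a hypothetical struct demo_list and demo_list_free(); treat it as an illustration of the pattern, not as a description of libwget's internals.

```c
#include <stdlib.h>

/* Hypothetical owning structure, standing in for wget_robots. */
struct demo_list {
    char **items;
    int count;
};

/* Free *list and everything it owns, then reset the caller's pointer. */
static void demo_list_free(struct demo_list **list)
{
    if (!list || !*list)
        return; /* nothing to free */

    for (int i = 0; i < (*list)->count; i++)
        free((*list)->items[i]);
    free((*list)->items);
    free(*list);

    *list = NULL; /* the caller's pointer no longer dangles */
}
```

Because the pointer is reset, calling demo_list_free(&list) a second time is a harmless no-op in this sketch.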

int wget_robots_get_path_count (wget_robots * robots)

Parameters

robots Pointer to instance of wget_robots

Returns

Returns the number of paths listed in robots

wget_string * wget_robots_get_path (wget_robots * robots, int index)

Parameters

robots Pointer to instance of wget_robots
index Index of the wanted path

Returns

Returns the path at index or NULL

int wget_robots_get_sitemap_count (wget_robots * robots)

Parameters

robots Pointer to instance of wget_robots

Returns

Returns the number of sitemaps listed in robots

const char * wget_robots_get_sitemap (wget_robots * robots, int index)

Parameters

robots Pointer to instance of wget_robots
index Index of the wanted sitemap URL

Returns

Returns the sitemap URL at index or NULL
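
The count and get functions are meant to be used as a pair: ask for the count, then fetch each entry by index and treat NULL as "no such entry". The sketch below shows that contract with a hypothetical struct demo_sitemaps and demo_* accessors. Zero-based indexing and the handling of a NULL list are assumptions of the sketch; this page does not state them for libwget.

```c
#include <stddef.h>

/* Hypothetical read-only list, standing in for the sitemap list in wget_robots. */
struct demo_sitemaps {
    const char *const *urls;
    int count;
};

/* Return the number of entries; a NULL list counts as empty (assumption). */
static int demo_get_sitemap_count(const struct demo_sitemaps *s)
{
    return s ? s->count : 0;
}

/* Return the URL at index, or NULL if index is out of range. */
static const char *demo_get_sitemap(const struct demo_sitemaps *s, int index)
{
    if (!s || index < 0 || index >= s->count)
        return NULL;
    return s->urls[index];
}
```

A caller would typically loop with for (int i = 0; i < demo_get_sitemap_count(s); i++) and still check each returned pointer before using it.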

Author

Generated automatically by Doxygen for wget2 from the source code.


Updated 2026-06-01 - jenkler.se | uex.se