LuaSocket
Network support for the Lua language

URL

The url namespace provides functions to parse, protect, and build URLs, as well as functions to compose absolute URLs from base and relative URLs, according to RFC 2396.

To obtain the url namespace, run:

-- loads the URL module 
local url = require("socket.url")

A URL is defined by the following grammar:

<url> ::= [<scheme>:][//<authority>][/<path>][;<params>][?<query>][#<fragment>]
<authority> ::= [<userinfo>@]<host>[:<port>]
<userinfo> ::= <user>[:<password>]
<path> ::= {<segment>/}<segment>

url.absolute(base, relative)

Builds an absolute URL from a base URL and a relative URL.

Base is a string with the base URL or a parsed URL table. Relative is a string with the relative URL.

The function returns a string with the absolute URL.

Note: The rules that govern the composition are fairly complex and are described in detail in RFC 2396. The examples below, all resolved against the same base URL, should give an idea of what the rules are.

http://a/b/c/d;p?q

+

g:h      =  g:h
g        =  http://a/b/c/g
./g      =  http://a/b/c/g
g/       =  http://a/b/c/g/
/g       =  http://a/g
//g      =  http://g
?y       =  http://a/b/c/?y
g?y      =  http://a/b/c/g?y
#s       =  http://a/b/c/d;p?q#s
g#s      =  http://a/b/c/g#s
g?y#s    =  http://a/b/c/g?y#s
;x       =  http://a/b/c/;x
g;x      =  http://a/b/c/g;x
g;x?y#s  =  http://a/b/c/g;x?y#s
.        =  http://a/b/c/
./       =  http://a/b/c/
..       =  http://a/b/
../      =  http://a/b/
../g     =  http://a/b/g
../..    =  http://a/
../../   =  http://a/
../../g  =  http://a/g
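
The following is a minimal sketch of how a few rows of the table above could be reproduced with url.absolute. It assumes LuaSocket is installed and loadable as "socket.url"; the cases table and the loop are illustrative only.

```lua
-- loads the URL module
local url = require("socket.url")

local base = "http://a/b/c/d;p?q"

-- relative URLs picked from the table above
local cases = { "g", "./g", "/g", "//g", "../g", "g:h" }

for _, relative in ipairs(cases) do
  -- base may also be a parsed URL table instead of a string
  local result = url.absolute(base, relative)
  print(string.format("%-5s =  %s", relative, result))
end
```

A relative URL that carries its own scheme, such as g:h, is returned unchanged.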

url.build(parsed_url)

Rebuilds a URL from its parts.

Parsed_url is a table with the same components returned by parse. Lower-level components, if specified, take precedence over higher-level components of the URL grammar (for example, host takes precedence over authority).

The function returns a string with the built URL.
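
As a rough illustration of the precedence rule, the sketch below passes both an authority and a host/port pair to url.build. The host names and port are made up for the example, and LuaSocket is assumed to be installed.

```lua
-- loads the URL module
local url = require("socket.url")

local built = url.build({
  scheme = "http",
  authority = "other.example.org", -- higher-level component
  host = "www.example.com",        -- lower-level components, expected to win
  port = "8080",
  path = "/cgilua/index.lua",
  query = "a=2",
  fragment = "there",
})

print(built)
```

The resulting string should carry www.example.com:8080 as its authority, not other.example.org.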

url.build_path(segments, unsafe)

Builds a <path> component from a list of <segment> parts. Before composition, any reserved characters found in a segment are escaped into their protected form, so that the resulting path is a valid URL path component.

Segments is a list of strings with the <segment> parts. If unsafe is given and is neither nil nor false, reserved characters are left untouched.

The function returns a string with the built <path> component.
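
The effect of the unsafe argument can be sketched as follows. The segment names are hypothetical, and LuaSocket is assumed to be installed.

```lua
-- loads the URL module
local url = require("socket.url")

local segments = { "docs", "my file.txt" }

-- default: characters that are not allowed in a segment are escaped
local protected = url.build_path(segments)

-- unsafe: segments are joined as they are
local raw = url.build_path(segments, true)

print(protected)
print(raw)
```

Here the space in the second segment is the only character that needs protection, so the two results differ only in how that space is written.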

url.escape(content)

Applies the URL escaping content coding to a string. Each byte is encoded as a percent character followed by the two-digit hexadecimal representation of its integer value.

Content is the string to be encoded.

The function returns the encoded string.

-- load url module
local url = require("socket.url")

local code = url.escape("/#?;")
-- code = "%2f%23%3f%3b"

url.parse(url, default)

Parses a URL given as a string into a Lua table with its components.

Url is the URL to be parsed. If the default table is present, it is used to store the parsed fields. Only fields present in the URL are overwritten. Therefore, this table can be used to pass default values for each field.

The function returns a table with all the URL components:

parsed_url = {
  url = string,
  scheme = string,
  authority = string,
  path = string,
  params = string,
  query = string,
  fragment = string,
  userinfo = string,
  host = string,
  port = string,
  user = string,
  password = string
}

-- load url module
local url = require("socket.url")

parsed_url = url.parse("http://www.example.com/cgilua/index.lua?a=2#there")
-- parsed_url = {
--   scheme = "http",
--   authority = "www.example.com",
--   path = "/cgilua/index.lua",
--   query = "a=2",
--   fragment = "there",
--   host = "www.example.com",
-- }

parsed_url = url.parse("ftp://root:passwd@unsafe.org/pub/virus.exe;type=i")
-- parsed_url = {
--   scheme = "ftp",
--   authority = "root:passwd@unsafe.org",
--   path = "/pub/virus.exe",
--   params = "type=i",
--   userinfo = "root:passwd",
--   host = "unsafe.org",
--   user = "root",
--   password = "passwd",
-- }
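
The use of the default table can be sketched as below. The default values are illustrative, and the sketch only reads the returned table, since whether the default table itself is modified is not something this example relies on.

```lua
-- load url module
local url = require("socket.url")

-- values used for any field the URL does not specify
local defaults = { port = "80", path = "/" }

-- no port and no path in the URL: both defaults survive
local plain = url.parse("http://www.example.com", defaults)
print(plain.host, plain.port, plain.path)

-- port and path present in the URL: both defaults are overwritten
local explicit = url.parse("http://www.example.com:8080/index.html", defaults)
print(explicit.host, explicit.port, explicit.path)
```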

url.parse_path(path)

Breaks a <path> URL component into all its <segment> parts.

Path is a string with the path to be parsed.

The function returns a list with all the parsed segments. Because reserved characters must be escaped whenever they appear in a <path> component, the function removes the escaping from each segment before returning the list.
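
A short sketch of parse_path on a path with one escaped segment follows. The path is made up for the example, and LuaSocket is assumed to be installed.

```lua
-- load url module
local url = require("socket.url")

local segments = url.parse_path("/pub/my%20docs/file.txt")

-- each segment comes back with its escaping removed
for i, segment in ipairs(segments) do
  print(i, segment)
end
```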

url.unescape(content)

Removes the URL escaping content coding from a string.

Content is the string to be decoded.

The function returns the decoded string.
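
As a quick check that unescape reverses escape, the sketch below round-trips an arbitrary sample string. LuaSocket is assumed to be installed.

```lua
-- load url module
local url = require("socket.url")

local original = "a b&c=d/e"
local encoded = url.escape(original)
local decoded = url.unescape(encoded)

print(encoded)
print(decoded)

-- decoding the result of the escape example shown earlier
local reserved = url.unescape("%2f%23%3f%3b")
print(reserved)
```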