Tutorial: A Simple HTTP Server
The very first use-case for NMFU was to simplify the protocol layer of an embedded HTTP server, and so that's what the first tutorial will cover.
The goal here is to have an API that we can feed received HTTP data into as it arrives, and get out an object containing information about the request to be serviced.
Defining the parser
We start by defining what output we want our parser to generate -- for now, let's just get the requested path.
Observe the explicit size here: NMFU doesn't do dynamic allocation, and so all fields must have a defined length -- the size here is the length of the underlying buffer, which by default is null-terminated.
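To make the fixed-size, null-terminated buffer idea concrete, here is a minimal hand-written C sketch of a bounded string append. It is not NMFU's generated code: the names `fixed_str`, `fixed_str_init`, `fixed_str_append` and the capacity `PATH_CAPACITY` are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity: total buffer size, including the terminator. */
#define PATH_CAPACITY 32

typedef struct {
    char data[PATH_CAPACITY];
    size_t length; /* number of characters stored, excluding the terminator */
} fixed_str;

/* Resets the string to empty. */
static void fixed_str_init(fixed_str *s) {
    s->length = 0;
    s->data[0] = '\0';
}

/* Appends one character; returns false (and stores nothing) when the
 * buffer has no room left for both the character and the terminator. */
static bool fixed_str_append(fixed_str *s, char c) {
    if (s->length + 1 >= PATH_CAPACITY) {
        return false;
    }
    s->data[s->length++] = c;
    s->data[s->length] = '\0';
    return true;
}
```

With a buffer of size 32 handled this way, at most 31 characters fit, because one byte is always reserved for the terminator.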
Next, we tell NMFU what we want our parser to parse inside the parser block:
So what is this doing? An NMFU parser at its core is a sequence of matches -- here we have three different types: the direct-match which matches an exact sequence of characters, the regex-match which matches a regular expression, and the wait-match which discards all non-matching input until its argument matches.
The first line of the parser, "GET /", matches those 5 characters in order. The next line is an append-statement, which takes whatever its argument matches and appends it to a given string output. The regex shown here matches a typical URL. Notice how we've placed the first "/" in the preceding match: all URLs should start with one, and so we can save a byte of RAM by not putting it into the string.
Then, we use a combination of direct and regex matches to match the last part of the first line of the HTTP request. We could have combined the two into a single regex, but doing this avoids having to escape the slash and dot.
Finally, we wait for the end of the request, signified by an empty line (or, equivalently, two newlines with nothing between them).
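The behaviour of a wait-match can be sketched by hand in C. The following is an illustration only, under the assumption that the terminator is "\r\n\r\n"; it is not the code NMFU generates, and `wait_state` and `wait_feed` are hypothetical names. It discards input until the terminator has been seen, and keeps its progress between calls so the terminator may be split across two chunks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of terminator characters matched so far. */
typedef struct {
    size_t matched;
} wait_state;

/* Consumes [start, end) and returns true once "\r\n\r\n" has been seen.
 * Everything that is not part of the terminator is simply discarded. */
static bool wait_feed(wait_state *w, const char *start, const char *end) {
    static const char terminator[] = "\r\n\r\n";
    const size_t terminator_len = sizeof terminator - 1;

    for (const char *p = start; p != end; ++p) {
        if (*p == terminator[w->matched]) {
            if (++w->matched == terminator_len) {
                w->matched = 0; /* ready to be reused */
                return true;
            }
        } else {
            /* On a mismatch, a '\r' may still begin a new terminator. */
            w->matched = (*p == '\r') ? 1 : 0;
        }
    }
    return false;
}
```

The restart rule in the mismatch branch is specific to this terminator; a general pattern would need a proper failure table.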
So what can we do with this? Well, let's assume this full parser is in a file called http_server.nmfu; we can then compile it into C with NMFU, which will generate two files, http_server.c and http_server.h, in the same directory as http_server.nmfu, containing our parser.
Using the parser
NMFU tries to keep its generated API as simple as possible. The entire parser state is contained within the http_server_state struct, which has a helper typedef defined as http_server_state_t.
Note
All definitions generated by NMFU are based on the output filename without extension (which can be customized with the -o command line option), defaulting to the input filename.
We initialize this state object with the http_server_start function, e.g. http_server_start(&parser); where parser is an http_server_state_t.
Then, we can provide input to the parser via the http_server_feed function, which takes two pointers, the start and (exclusive) end of the data to read. For example, if we were reading from stdin, it would look something like:
    int count = 0;
    char buf[32];
    while ((count = read(STDIN_FILENO, buf, 32)) > 0) {
        http_server_feed(&parser, buf, buf + count);
    }
This, however, is not complete. We need to deal with the return value from _feed, which is an enum with three possible values. Either the parser encountered an error (such as out of space in a string or no match), the parser reached the end of its program, or the parser is waiting for more input. These correspond to the results HTTP_SERVER_FAIL, HTTP_SERVER_DONE or HTTP_SERVER_OK respectively.
Updating our example, we might have something like
    while ((count = read(STDIN_FILENO, buf, 32)) > 0) {
        switch (http_server_feed(&parser, buf, buf + count)) {
            case HTTP_SERVER_OK:
                continue;
            case HTTP_SERVER_FAIL:
                fprintf(stderr, "invalid input");
                return;
            case HTTP_SERVER_DONE:
                goto finished;
        }
    }
    if (count < 0) {
        perror("read error");
        return;
    }
    finished:
    // do something with the output
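The three results are easier to see with a stand-in that can be compiled without the generated files. The sketch below is a hand-written state machine that only matches the literal "GET /"; it mimics the start/feed shape described above, but every name in it (`demo_state_t`, `demo_start`, `demo_feed`, `DEMO_OK`, `DEMO_FAIL`, `DEMO_DONE`) is hypothetical and not part of NMFU's generated API.

```c
#include <stddef.h>

typedef enum { DEMO_OK, DEMO_FAIL, DEMO_DONE } demo_result_t;

/* All parser state lives in one struct, so feeding can stop and resume
 * at any byte boundary. */
typedef struct {
    size_t position; /* how many characters of "GET /" have matched */
} demo_state_t;

static void demo_start(demo_state_t *state) {
    state->position = 0;
}

/* Consumes [start, end): DEMO_FAIL on the first mismatch, DEMO_DONE when
 * the whole literal has matched, DEMO_OK when more input is needed. */
static demo_result_t demo_feed(demo_state_t *state,
                               const char *start, const char *end) {
    static const char expected[] = "GET /";
    const size_t expected_len = sizeof expected - 1;

    for (const char *p = start; p != end; ++p) {
        if (*p != expected[state->position]) {
            return DEMO_FAIL;
        }
        if (++state->position == expected_len) {
            return DEMO_DONE;
        }
    }
    return DEMO_OK;
}
```

Feeding "GE" and then "T /" yields DEMO_OK followed by DEMO_DONE, while feeding "GER" to a freshly started state yields DEMO_FAIL, which is the same pattern the switch above has to handle.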
Warning
Note that if you want to try this in a terminal you'll probably want to change the \r\n (which is what the HTTP RFC specifies) into just \n to test.
Finally, we just need to extract the data from the parser. All the output variables are placed inside the c subobject in the parser state, so we can just use printf("got request for /%s\n", parser.c.request_path); (adding the / back, since we omitted it from the string).
Putting this all together, we might have something along the lines of
    #include <stdio.h>
    #include <unistd.h>
    #include "http_server.h" // generated by NMFU alongside http_server.c

    int main(void) {
        http_server_state_t parser;
        int count = 0;
        char buf[32];

        http_server_start(&parser);

        // Feed the parser in small chunks until it finishes or fails.
        while ((count = read(STDIN_FILENO, buf, 32)) > 0) {
            switch (http_server_feed(&parser, buf, buf + count)) {
                case HTTP_SERVER_OK: // needs more input
                    continue;
                case HTTP_SERVER_FAIL:
                    fprintf(stderr, "invalid input");
                    return 2;
                case HTTP_SERVER_DONE:
                    goto finished;
            }
        }
        if (count < 0) {
            perror("read error");
            return 1;
        }

    finished:
        // The leading "/" was matched separately, so add it back here.
        printf("got request for /%s\n", parser.c.request_path);
        return 0;
    }
which should read an HTTP 1.x request from stdin and print out the path being requested.
Simple conditionals: Handling request methods
Now, this server is only barely functional. Let's provide it with slightly more functionality by getting it to recognize different request methods.
First, we'll declare another output variable, this time using the enumeration syntax.
Then, we'll replace the first part of our parser with
    parser {
        case {
            "GET " -> {method = GET;}
            "POST " -> {method = POST;}
            else -> {method = UNSUPPORTED; wait " ";}
        }
        "/";
We've introduced a new statement, the case-statement. This statement tries to match all of the expressions given to it simultaneously, and whichever one successfully terminates first determines the next set of statements to execute. Multiple branches can be matching at the same time; however, ambiguity as to which branch to execute is not allowed. For example,
will work fine, despite both matches starting with the same letter, but
would not, since both conditions would match "PUT".
Note that the case statement also introduces us to the first way NMFU can deal with errors, with the else condition. If all of the conditions fail to match after a certain input character, control is immediately transferred to the body of the else condition. Specifically, if we gave GER to our parser, the first two letters GE would be consumed by the GET option, but the R would not match anything. Therefore, it gets "sent" to the body of the else condition, which in this case winds up being the wait match, which will then discard it as expected.
Regardless, our parser should now be capable of differentiating between different request methods, and even give useful error information if it gets one that it doesn't recognize (perhaps to generate a 405 Method Not Allowed response).
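The control flow of the case-statement and its else condition can be imitated in plain C. This is a rough, hand-written sketch and not what NMFU emits: `sketch_method` and `match_method` are hypothetical names, and for simplicity the function works on a whole NUL-terminated string rather than on streamed chunks. Both branches are tracked at once, the first to complete wins, and the character that kills the last live branch is handed to the else body, which then behaves like wait " ".

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_GET, SKETCH_POST, SKETCH_UNSUPPORTED } sketch_method;

/* Matches the request method at the start of input and stores a pointer
 * to the first unconsumed character in *rest. */
static sketch_method match_method(const char *input, const char **rest) {
    static const char *const branches[] = { "GET ", "POST " };
    static const sketch_method results[] = { SKETCH_GET, SKETCH_POST };
    bool alive[2] = { true, true };
    size_t i = 0;

    for (; input[i] != '\0'; ++i) {
        bool any_alive = false;
        for (size_t b = 0; b < 2; ++b) {
            if (!alive[b]) continue;
            if (branches[b][i] != input[i]) { alive[b] = false; continue; }
            if (branches[b][i + 1] == '\0') {
                *rest = input + i + 1; /* this branch finished first */
                return results[b];
            }
            any_alive = true;
        }
        if (!any_alive) {
            /* else branch: the failing character is not lost, it is the
             * first thing seen by the equivalent of wait " ". */
            const char *p = input + i;
            while (*p != '\0' && *p != ' ') ++p;
            *rest = (*p == ' ') ? p + 1 : p;
            return SKETCH_UNSUPPORTED;
        }
    }
    /* Input ended mid-match; a streaming parser would wait for more. */
    *rest = input + i;
    return SKETCH_UNSUPPORTED;
}
```

Given "GER /x", the G and E keep the GET branch alive, the R kills it, and the else path skips ahead to just past the space, leaving *rest pointing at "/x".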
The enumeration we defined will be exposed to C as the enum http_server_method, with another helper typedef http_server_method_t, with values HTTP_SERVER_METHOD_GET, HTTP_SERVER_METHOD_POST, etc.
We can access it from the state object with
    printf("got request for /%s\n", parser.c.request_path);
    switch (parser.c.method) {
        case HTTP_SERVER_METHOD_GET:
            puts("with a get request");
            break;
        case HTTP_SERVER_METHOD_POST:
            puts("with a post request");
            break;
        default:
            puts("with an unknown method");
            break;
    }
Error handling: the try statement
Let's go back to how we read the request path. What if we wanted our server to offer up a useful error message if the request path was too long? There is a defined status code for this: 414 URI Too Long.
We can use the try-catch functionality of NMFU to accomplish this. Let's add an output flag to indicate if this happens with
Note
In a full server, this would probably be a "status" enumeration as opposed to individual flags, but for now this will work.
Then, we can replace the line reading request_path with
    try {
        request_path += /[\/a-zA-Z0-9.\-_?=&+]+/;
    }
    catch (outofspace) {
        uri_too_long = true;
        wait "\r\n\r\n";
        finish;
    }
The try block here adds an error handler in much the same way else does for the case statement. While there is a nomatch error which functions identically to the else condition of the case statement, here we're using the outofspace error, which will fire when trying to append a character to a full output. Here, we just set the uri_too_long output to true, wait for the end of the request, and then terminate the parser early with the finish; statement.
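As a rough analogy for what the try/catch block does, the C sketch below copies path characters into a small fixed buffer and sets a flag instead of failing when the buffer fills up. It is a hand-written illustration, not generated code: `sketch_outputs`, `is_path_char`, `read_path` and the deliberately tiny `SKETCH_PATH_CAPACITY` are assumptions, and it ignores the nomatch case (an empty path).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Deliberately tiny so the overflow path is easy to trigger. */
#define SKETCH_PATH_CAPACITY 8

typedef struct {
    char request_path[SKETCH_PATH_CAPACITY];
    size_t length;
    bool uri_too_long;
} sketch_outputs;

/* Same character class as the regex [\/a-zA-Z0-9.\-_?=&+]. */
static bool is_path_char(char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9')) {
        return true;
    }
    return c != '\0' && strchr("/.-_?=&+", c) != NULL;
}

/* Appends path characters from a NUL-terminated input until a character
 * outside the class is seen, or until the buffer is full. */
static void read_path(sketch_outputs *out, const char *input) {
    out->length = 0;
    out->request_path[0] = '\0';
    out->uri_too_long = false;

    for (const char *p = input; is_path_char(*p); ++p) {
        if (out->length + 1 >= SKETCH_PATH_CAPACITY) {
            /* Plays the role of catch (outofspace): record the problem
             * and stop, rather than reporting a parse failure. */
            out->uri_too_long = true;
            return;
        }
        out->request_path[out->length++] = *p;
        out->request_path[out->length] = '\0';
    }
}
```

A caller could then check the flag and answer with 414 URI Too Long instead of treating the request as malformed.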
We could read this from C by checking the parser.c.uri_too_long field of the state object.
This concludes the first part of the HTTP tutorial; the second part covers handling headers and numbers with loops.