diff --git a/app-guide.gmi b/app-guide.gmi index 8571643..8251923 100644 --- a/app-guide.gmi +++ b/app-guide.gmi @@ -26,7 +26,7 @@ To get a sense of what Gemini apps can be like in practice, here is a sampling o  ## Overview of the guide  -* Section 1 "Getting started with CGI": what is CGI and how you use it. +* Section 1 "Getting started with CGI": what is CGI and how to use it.  * Section 2 "User interface": the look and feel of a Gemini application; how it appears and behaves from the user's point of view.  @@ -56,11 +56,11 @@ TODO: (minimal and naive example of a Gemini CGI app using Python)  # 2. User interface  -A key part of application design, and also programming in general, is to separate the public interface from the internal implementation. The needs of the human user and the internal technical implementation are very different and often at odds with each other. Nevertheless, both facets of an application are crucial and neither is compromised by the other. +A key part of application design, and also programming in general, is to separate the public interface from the internal implementation. The needs of the human user and the internal technical implementation are very different and often at odds with each other. Nevertheless, both facets of an application are crucial and should be considered on their own terms.  In practice, when it comes to Gemini applications, the user interface is built out of one or more dynamically generated Gemtext ("text/gemini") pages.  -Gemtext is quite a limited format for presenting a UI, which makes for an interesting design challenge. One issue is that the UI of your application should work — or at least strive to work — equally well with every Gemini client out there. Especially you should be wary of testing your application only on high-end graphical clients, where you have sophisticated page layout, multiple fonts, and color schemes that clarify the structure of the page. 
When viewing such pages in a terminal-based client, things may look different in unexpected ways. For example, clients may display links in different ways, and since links are used for most user actions and menus, it is important for them to remain legible and accessible. Still, Gemtext is simple enough that by following a few basic rules, you can achieve good results everywhere. One rule of thumb is that your UI should be comprehensible even when viewed as a plain-text Gemtext source file without any visual formatting. +Gemtext is quite a limited format for user interface presentation, which makes for an interesting design challenge. One issue is that the UI of your application should work — or at least strive to work — equally well with every Gemini client out there. Especially you should be wary of testing your application only on high-end graphical clients, where you have sophisticated page layout, multiple fonts, and color schemes that clarify the structure of the page. When viewing such pages in a terminal-based client, things may look different in unexpected ways. For example, clients may display links in different ways, and since links are used for most user actions and menus, it is important for them to remain legible and accessible. Still, Gemtext is simple enough that by following a few basic rules, you can achieve good results everywhere. One rule of thumb is that your UI should be comprehensible even when viewed as a plain-text Gemtext source file without any visual formatting.  In this section, we will take a closer look at the options and tools at your disposal when it comes to the UI.  @@ -146,16 +146,17 @@ The header and footer actions can be chosen as appropriate for the application,  ## 2.5 Navigation links  -* help the user get around and understand where they are -* breadcrumbs? placement? -* navigation structures: query strings vs. directory structure ("Go Up" / "Go to Parent" navigation!), "Back to X" pages vs. 
knowing where the user came from -* client-side navigation: small pages can be fully cached, presented as-is from history; no cache control; user may see obsolete/old content; also benefits, like ability to see old versions of edited pages without the server having to save any history +If your application is complex enough to have multiple pages or a directory hierarchy, you should carefully consider how the user will be navigating inside it. In practice, this happens via navigation links. It is important to present these links consistently, as it will help the user understand the application's structure and make using the application more fluent.  -* remember client may have go up/root actions +The primary navigation actions are typically found at the top menu of the page where they are instantly visible to the reader. Secondary navigation actions could be placed at the bottom of the page, for reaching locations that are less frequently needed or only indirectly related to the current page. + +If you provide "Back to X" actions for returning to a previous location, note that you may not actually know what "X" is supposed to be. Your application would need to record the previous request(s) performed by the user to know if returning to a particular page is appropriate. Even so, the user could manually type in a URL or access a specific page via a bookmark, without visiting any previous page beforehand. Generally speaking, relying on the client's built-in backwards navigation can be more reliable, although if the application generates its pages dynamically, obsolete data may then be seen by the user. The Gemini protocol has no way to control clientside caching, so clients can decide to cache visited pages as they see fit. + +If the application is more of a single-page state machine, like a game where you perform actions but always stay on a status page, query strings could be used as the primary navigation method. 
You should not mix query strings and directory hierarchies, though. Query strings are generally handled as input from the user, while the directory hierarchy is a structure that can be navigated with Go Up/Parent clientside actions. (TODO: Why)  ## 2.6 Tips  -* To get started, open up a blank .gmi file in your favorite text editor and write, by hand, a prototype version of your UI. You can preview the results in a client and fine-tune the prototype until you are happy with how it looks. Don't forget to try it in both graphical and text-based clients. +* Open up a blank .gmi file in your favorite text editor and write, by hand, a prototype version of your UI. You can preview the results in a client and fine-tune the prototype until you are happy with how it looks. Don't forget to try it in both graphical and text-based clients. When you are happy with the prototype, you can use it as a concrete starting point for building the application.  * `PATH_INFO` can be useful for structuring the application if your server only supports CGI via executable programs. The executable can then represent the root directory of the application and all the subdirectories and files exist only virtually as part of the URL path. Remember, the path in the URL and the files on disk do not have to match in any way.  @@ -178,7 +179,7 @@ On server-side, the application typically runs inside a CGI environment launched  There may be server-specific differences in the CGI environment, but typically the following variables are available:  -* (environment variables) +* (TODO: environment variables)  ## 3.1 Queries  @@ -224,10 +225,12 @@ Ways to deal with the URL length limitation of 1024 bytes:  Once a user has submitted content to your application, they may need to later modify it.  
+TODO: * no support for prefilled query content * make it easy for the user to copy and paste previous content * careful with line wrapping: copying from a hard-wrapped terminal, for example * the user must somehow copy and paste from somewhere (side note for consideration: Lagrange has a "Paste Preceding Line" feature that automates copy-pasting the Gemtext source line immediately preceding the link line that opened the query prompt; you may want to consider placing an "Edit" link immediately below such an editable line; but always ensure your app works in any client, a strength of Gemini is the software diversity, so interoperability must be paid attention to) +* pages in clientside cached history may be available for restoring earlier versions of edited content; e.g., make accidental edit, then navigate back to copy/paste the previous version  # 4. Sessions and users  @@ -249,11 +252,11 @@ Therefore, it is good to understand some details about TLS and client certificat  ### X.509  -Client certificates are a standard part of TLS. They are sometimes used on the web as well, and for things like securing connections to enterprise email servers. However, web browsers do not typically use them for identifying individual users. +The X.509 standard defines the format of public key certificates used in various internet protocols, including TLS.  -TODO: (DIAGRAM? X.509 certificate as a concept, key pair, metadata, fields) +Client certificates are a part of TLS. They are sometimes used on the web as well, and for things like securing connections to enterprise email servers. However, web browsers do not typically use them for identifying individual users as is done on Gemini.  -The encryption in TLS is based on key pairs. The Gemini client (or user) is in possession of a private key with which the client (or user) can create X.509 certificates. 
These certificates can optionally be included as part of Gemini requests, containing information like the certificate public key, expiration date, and what the certificate is supposed to be used for. When such a client certificate is active in a TLS session, the client certificate's keys are verified during the TLS handshake, so successful communication is only possible when both the server's and client's certificates and key pairs are valid. The server and the application can therefore both be cryptographically certain of the client's identity — or at least whether the client is in possession of the private key of the client certificate. +TLS relies on asymmetric key pairs: a public key and a matching private key. The Gemini client (or user) is in possession of a private key with which the client (or user) can create X.509 certificates. These certificates can optionally be included in Gemini requests, containing information like the certificate public key, expiration date, and what the certificate is supposed to be used for. When such a client certificate is active in a TLS session, the client certificate's keys are verified during the TLS handshake, so successful communication is only possible when both the server's and client's certificates and key pairs are valid. The server and the application can therefore both be cryptographically certain of the client's identity, or at least of whether the client is in possession of the private key of the client certificate.  => https://en.wikipedia.org/wiki/X.509 See also: X.509 in Wikipedia  @@ -273,13 +276,9 @@ Due to Gemini's URL-prefix based client certificate activation, you must structu  ### Generating client certificates  -There are a few things you should know about where client certificates come from, as it may impact how well they work in your application. +You may need to generate multiple client certificates when developing and testing your application. 
Some Gemini clients allow you to generate new client certificates as needed, but certificates created with any X.509 software can also be used. The `gemcert` utility by Solderpunk and the OpenSSL command line tools are good choices for generating certificates, the former being specifically written for Gemini and the latter being widely available.  -TODO: -* see Solderpunk's documentation, `gemcert` -* some clients can do it for you -* anything to note about cert versions and ciphers? you may need to experiment a little with different clients and/or servers, don't get too exotic -* basic OpenSSL CLI example, useful for scripted testing +=> https://git.sr.ht/~solderpunk/gemcert gemcert: A simple tool for creating self-signed certs for use in Geminispace  ## 4.2 User accounts  @@ -499,7 +498,9 @@ A malicious user could create 10000 unique accounts using a script in a short pe  Check if your Gemini server provides rate limiting suitable for your application. However, when it comes to the server, it most likely has been implemented with the goal of remaining responsive under heavy load to serve as many requests as possible, instead of trying to prevent malice. You may find that implementing a more adaptive rate limit is necessary. For example, only certain actions in your application might warrant strict rate limiting while most pages can be served with the server's generic, more generous limits. In particular, user registration and publishing content may be considered for stricter limits. Appropriate limits may also depend on the type of user account, with "trusted" users (e.g., administrators) having unlimited access.  -(TODO: Notes about implementing limits.) Rate limiting by definition requires you to keep a log of the incoming requests. Any logging that you perform should be done in privacy-sensitive manner; record hashes of client IP addresses instead of the actual plain addresses, for example. 
Recording the client certificate hash is preferable to the IP address, if the action is performed with certificate activated. +Rate limiting by definition requires you to keep a log of incoming requests. Any logging that you perform should be done in a privacy-sensitive manner: store hashes of client IP addresses instead of the actual plain addresses, for example. It is standard internet security practice to not store this kind of sensitive information plainly accessible in a database, in case a third party gains access to it. Recording the client certificate hash is preferable to the IP address, if the action is performed with a certificate activated. + +A very basic rate limiter would count the number of requests that have occurred inside a given time window (per action/user), and reject further requests if a predefined threshold has been reached. You should make any threshold values easily adjustable so they can be tuned to the current circumstances. If a more robust algorithm is needed, you should check out the leaky bucket algorithm:  => https://en.wikipedia.org/wiki/Leaky_bucket Leaky bucket (algorithm)  @@ -511,46 +512,73 @@ You should treat client certicates as sensitive information. If your application  Your application should have adequate administrative features for cleaning up messes caused by malicious users. For example, you may need a way to quickly and easily delete thousands of accounts that were created in a scripted attack, without having to roll the database back to an earlier backup.  -# 6. Miscellaneous details +## 5.5 Path handling  -Some lower-level notes about Gemini-specific CGI. However, this guide is not meant to be a technical reference about CGI; for getting CGI up and running,see @tomasino's Gemini CGI videos (tbd: other CGI tutorials?) 
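The path checks discussed in this section can be illustrated with a minimal Python sketch. The function name `safe_relative_path` is hypothetical, and the sketch is deliberately stricter than strictly necessary: it rejects every ".." segment rather than only those that would escape the application root.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote


def safe_relative_path(path_info):
    """Return a cleaned relative path, or None if the request should be rejected.

    Hypothetical helper: path_info is the raw PATH_INFO value (or URL path).
    """
    parts = []
    # Decode percent-escapes first so that "%2e%2e" cannot sneak through.
    for part in unquote(path_info).split("/"):
        if part in ("", "."):
            continue  # Skip empty segments and "current directory" references.
        if part.startswith("."):
            return None  # Rejects ".." as well as hidden Unix files.
        parts.append(part)
    return PurePosixPath(*parts)
```

The returned path is relative, so it can be joined onto the application's own root directory. A request that yields None could be answered with status 51 (not found) so that the existence of hidden files is not revealed.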
+If you find yourself implementing URL path handling, for example as part of processing PATH_INFO or in a custom-built Gemini server, note the common pitfalls in mapping the requested path to actual files: you should prevent access to hidden Unix files (whose names start with a period) and reject extraneous ".." references that attempt to access out-of-bounds parent directories.  -* CGI environment, cf. official/Sean's documentation, server's own documentation -* what to do with the cert/pubkey hash; storing in database or persistent storage, security implications (TLS-backed); what do the variables mean and how to use them -* example: generating a SHA-256 fingerprint using pyOpenSSL from the peer X.509 cert -* pitfalls in mapping the request path to files (prepare for "." and ".." references) -* similarities to web apps: prepare for parallel request processing; full-blown vs. SQLite databases -* Python, PHP, Lua, Rust (?); libraries and frameworks for implementing a Gemini server +# 6. Technical notes  ## 6.1 Alternatives to CGI  -While you can implement fully-fledged applications with CGI, it still assumes that input comes in via environment variables and output goes to stdout, with each request spawning a CGI process. This can be too performance-intensive for the server and inoptimal for your application. Some Gemini servers support other interfaces like SCGI and FastCGI for handling requests more efficiently. +While you can implement fully-fledged applications with CGI, it still assumes that input comes in via environment variables and output goes to stdout, with each request spawning a CGI process. This can be too performance-intensive for the server and suboptimal for your application, especially if the server is running on low-end hardware.  
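The CGI model described above (input via environment variables, output to stdout, one process per request) can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the server passes the query in the standard `QUERY_STRING` variable and expects the script to write the complete Gemini response including the header line, which may differ between servers. The function name `build_response` is hypothetical.

```python
#!/usr/bin/env python3
import os
import sys
from urllib.parse import unquote


def build_response(environ):
    """Return the full Gemini response (header line plus optional body)."""
    query = unquote(environ.get("QUERY_STRING", ""))
    # Collapse whitespace so user input cannot inject extra Gemtext lines.
    name = " ".join(query.split())
    if not name:
        # Status 10 asks the client to prompt the user for input.
        return "10 What is your name?\r\n"
    # Status 20 means success; the meta field is the MIME type of the body.
    return "20 text/gemini\r\n" + f"# Hello, {name}!\n"


if __name__ == "__main__":
    # Write bytes directly so the CRLF header terminator is sent unmodified.
    sys.stdout.buffer.write(build_response(os.environ).encode("utf-8"))
```

Every request starts a fresh interpreter and runs this script from the top, which is the per-request overhead that the alternative interfaces try to avoid.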
-See also: -=> gemini://gemini.bunburya.eu/gemlog/posts/2021-04-07-dynamic-content-scgi-gemini.gmi bunburya: Using SCGI to serve dynamic content over the Gemini protocol +Some Gemini servers support other interfaces like SCGI and FastCGI for handling requests more efficiently. A sensible approach could be to get started with basic CGI and move on to more efficient interfaces or a customized server when encountering performance or API limitations. + +### FastCGI + +=> https://en.wikipedia.org/wiki/FastCGI FastCGI: +> FastCGI is a binary protocol for interfacing interactive programs with a web server. It is a variation on the earlier Common Gateway Interface (CGI). FastCGI's main aim is to reduce the overhead related to interfacing between web server and CGI programs, allowing a server to handle more web page requests per unit of time. + +One Gemini server that supports FastCGI is gmid. + +=> gemini://gmid.omarpolo.com/ gmid + +### SCGI  -### Custom servers +=> https://en.wikipedia.org/wiki/Simple_Common_Gateway_Interface SCGI: Simple Common Gateway Interface: +> SCGI is a protocol for applications to interface with HTTP servers, as an alternative to the CGI protocol. It is similar to FastCGI but is designed to be easier to parse. Unlike CGI, it permits a long-running service process to continue serving requests, thus avoiding delays in responding to requests due to setup overhead (such as connecting to a database).  -A bespoke server optimized for a single application is not too difficult to implement thanks to Gemini's simplicity. One could write such a server from scratch or use an off-the-shelf solution that supports extensions or provides a suitable low-level framework: +SCGI is supported at least by the Molly Brown and GLV-1.12556 servers.  -* Sean Conner's GLV (Lua, ?) 
-* mozz's JetForce (Python) -* skyjake's GmCapsule (Python) +=> https://tildegit.org/solderpunk/molly-brown/src/branch/master/README.md Molly Brown's README +=> https://github.com/spc476/GLV-1.12556 GLV-1.12556 (GitHub)  -(TODO: add some detail here) +For more information about using SCGI:  -A sensible approach could be to get started with standard CGI and move onto a more customized server should the need arise due to performance or API reasons. +=> gemini://gemini.bunburya.eu/gemlog/posts/2021-04-07-dynamic-content-scgi-gemini.gmi bunburya: Using SCGI to serve dynamic content over the Gemini protocol + +### Extensible and custom servers + +A bespoke server optimized for a single application is not too difficult to implement thanks to Gemini's simplicity. One could write such a server from scratch or use existing software that supports extensions or provides a suitable low-level framework: + +=> https://github.com/spc476/GLV-1.12556 GLV-1.12556 (Lua) +=> https://pypi.org/project/gmcapsule/ GmCapsule (Python) +=> https://pypi.org/project/Jetforce/ JetForce (Python)  Once you implement a custom server, you can also implement support for requests with custom URI schemes for additional flexibility, targeting specialized clients. However, that is outside the scope of this guide.  -### Proxy applications +## 6.2 Parallel processing + +Gemini applications typically have not encountered high levels of traffic and therefore the need to handle multiple requests in parallel is not a fundamental requirement. To keep things simple, you could handle a single request at a time. This sidesteps multiple issues with managing application state, such as making simultaneous updates to its database. This makes it possible to rely on SQLite, for instance, simplifying the implementation. 
However, if the processing of a request takes a long time, or when multiple people do happen to use the application simultaneously, it is good to consider how parallel processing is supported by the application.  -(TODO:) Gemini proxy servers are a bit under-documented; a Gemini server can respond to any given URL (scheme, hostname, etc.); an app could be built around this, and a proxy server that fetches remote content could be built as a stateful app, too -- use of client certificates is not well-defined, though, but generally clients activate certs based on the request URL; the proxy server would receive this certificate instead, but cannot redo the request with cert unless it possesses a copy of the same private key -- this area is open for experimentation +In practice, the implementation details are similar to web applications. Commonly applications rely on a full-fledged database server like PostgreSQL and build their internal processing around that. The database itself can then be used for keeping transactions correct and atomic as needed. ("ACID": atomicity, consistency, isolation, and durability.) Your Gemini server may impose some limitations on handling of parallel requests, though. For example, a Python-based server that handles requests using multiple threads can only transmit data in parallel while actually executing code in only one request at a time. (For more information, read about the Python Global Interpreter Lock (GIL).) Externally running CGI programs naturally can run in parallel regardless. Please refer to your server's documentation for more details.  -### Mailto links +## 6.3 Mailto links  The "mailto" URI scheme can be quite convenient for certain applications. You can use these in your application to enable the user to conveniently send an email to some destination address, optionally with a predefined subject and body as well. Some Gemini clients are able to open these links in an email client, much like a web browser would. 
For example:  => mailto:jaakko.keranen@iki.fi?subject=Lagrange%20commit%2092190836&body=%3D>%20gemini%3A//git.skyjake.fi%3A1965/lagrange/release/commits/92190836356c43238e29856629e47d80e82dd7e3 Send email about Lagrange commit 92190836  The prefilled message subject and body could be used for temporary session tokens or other metadata as required by the application, or to associate the message with a particular user. However, email is usually transmitted as plain text, so the usual privacy concerns apply. It would be possible to incorporate PGP into these emails, but that would have to be done manually by the user, so it becomes cumbersome from a user experience point of view. However, encrypted emails do have the benefit of verifying the identity of the sender without any session tickets or single-use tokens. + +## 6.4 Proxy applications + +Gemini proxy servers can respond to Gemini URLs located on hosts other than themselves and they may handle non-Gemini URIs as well. When it comes to TLS, the client's connection is to the proxy server; the proxy server does its own, independent TLS requests to the destination hosts. + +An example of a proxy application is one that responds to HTTPS URLs and converts the corresponding web pages to Gemtext: + +=> gemini://gemi.dev/stargate.gmi Public Stargate Proxy + +Other kinds of applications could be built as a proxy server. A Gemini server can respond to any given URL (scheme, hostname, etc.), opening the door to custom URI schemes and virtual path hierarchies. However, Gemini clients typically only recognize a handful of URI schemes, so in practice custom schemes may be useful mostly for special-purpose Gemini clients. + +A proxy server that fetches remote content could be built in a stateful manner, too, because it receives the client certificate if one is enabled for the requested URI. As explained in section 4, the certificate could be used as a session ticket for keeping track of the proxy application's state. 
However, the proxy cannot perform further requests with the client's certificate, limiting what is possible in practice. This area remains open for experimentation.