repo: gemini-spec action: commit revision: path_from: revision_from: 968dfb40664c31fac3ed97202338b68fa09655e4: path_to: revision_to:
commit 968dfb40664c31fac3ed97202338b68fa09655e4
Author: Sean Conner
Date: Tue Mar 2 18:06:27 2021 -0500

    Format changes for editing

    I've reflowed the documents to make it easier for me to edit, and to make diffs easier to read as I make changes to these documents. Once everything is finalized, they'll be reformatted back to a more Gemini friendly version, so don't worry.

diff --git a/best-practices.gmi b/best-practices.gmi
--- a/best-practices.gmi
+++ b/best-practices.gmi
@@ -2,61 +2,131 @@

 ## Introduction

-This document describes various conventions and snippets of advice for implementing and using the Gemini protocol which, while not mandated by the protocol specification, are generally considered a good idea. If you're writing Gemini software or building a Gemini site, you should generally follow the advice given here unless you have good reasons not to.
+This document describes various conventions and snippets of advice for
+implementing and using the Gemini protocol which, while not mandated by the
+protocol specification, are generally considered a good idea. If you're
+writing Gemini software or building a Gemini site, you should generally
+follow the advice given here unless you have good reasons not to.

 ## Filenames

-Gemini servers need to inform clients of the MIME type of the files they are serving. The most convenient way for servers to figure out the MIME type of files is via the extension of the filename. These mappings are mostly well-standardised (and unix systems often have an /etc/mime.types file full of them), but the question remains as to how servers should recognise files to be served with the text/gemini type defined by Gemini.
+Gemini servers need to inform clients of the MIME type of the files they are
+serving. The most convenient way for servers to figure out the MIME type of
+files is via the extension of the filename. These mappings are mostly
+well-standardised (and unix systems often have an /etc/mime.types file full
+of them), but the question remains as to how servers should recognise files
+to be served with the text/gemini type defined by Gemini.

-Current Gemini servers seem to use .gmi or .gemini extensions for this purpose, and new servers are strongly encouraged to support one or both of these options instead of adding a new one to the mix.
+Current Gemini servers seem to use .gmi or .gemini extensions for this
+purpose, and new servers are strongly encouraged to support one or both of
+these options instead of adding a new one to the mix.

-Following the convention for webservers, if a request is received for a path which maps to a directory in the server's filesystem and a file named index.gmi or index.gemini exists in that directory, it is served up for that
+Following the convention for webservers, if a request is received for a path
+which maps to a directory in the server's filesystem and a file named
+index.gmi or index.gemini exists in that directory, it is served up for that
 path.

 ## File size

-Gemini servers do not inform clients of the size of files they are serving, which can make it difficult to detect if a connection is closed prematurely due to a server fault. This risk of this happening increases with file size.
+Gemini servers do not inform clients of the size of files they are serving,
+which can make it difficult to detect if a connection is closed prematurely
+due to a server fault. This risk of this happening increases with file
+size.

-Gemini also has no support for compression of large files, or support for checksums to enable detection of file corruption, the risk of which also increases with file size.
+Gemini also has no support for compression of large files, or support for
+checksums to enable detection of file corruption, the risk of which also
+increases with file size.

-For all of these reasons, Gemini is not well suited to the transfer of "very large" files. Exactly what counts as "very large" depends to some extent on the speed and reliability of the internet connections involved, and the patience of the users. As a rule of thumb, files larger than 100MiB might be thought of as best served some other way.
+For all of these reasons, Gemini is not well suited to the transfer of "very
+large" files. Exactly what counts as "very large" depends to some extent on
+the speed and reliability of the internet connections involved, and the
+patience of the users. As a rule of thumb, files larger than 100MiB might
+be thought of as best served some other way.

-Of course, because Gemini supports linking to other online content via any protocol with a URL scheme, it's still possible to link from a Gemini document to a large file served via HTTPS, BitTorrent, IPFS or whatever else tickles your fancy.
+Of course, because Gemini supports linking to other online content via any
+protocol with a URL scheme, it's still possible to link from a Gemini
+document to a large file served via HTTPS, BitTorrent, IPFS or whatever else
+tickles your fancy.

 ## Text encoding

-Gemini supports any text encoding you like via the "charset" parameter of text/* MIME types. This allows serving "legacy" text content in obscure regional encoding schemes.
+Gemini supports any text encoding you like via the "charset" parameter of
+text/* MIME types. This allows serving "legacy" text content in obscure
+regional encoding schemes.

-For new content, please, please, please just use UTF-8. The Gemini specification mandates that clients be able to handle UTF-8 text. Support for any other encoding is up to the client and is not guaranteed. Serving your content as UTF-8 maximises its accessibility and maximises the utility of simple clients which support only UTF-8.
+For new content, please, please, please just use UTF-8. The Gemini
+specification mandates that clients be able to handle UTF-8 text. Support
+for any other encoding is up to the client and is not guaranteed. Serving
+your content as UTF-8 maximises its accessibility and maximises the utility
+of simple clients which support only UTF-8.

 ## Redirects

 ### General remarks

-Redirects were included in Gemini primarily to permit the restructuring of sites or the migration of sites between servers without breaking existing links. A large, interconnected space of documents without such a facility inevitably becomes "brittle".
+Redirects were included in Gemini primarily to permit the restructuring of
+sites or the migration of sites between servers without breaking existing
+links. A large, interconnected space of documents without such a facility
+inevitably becomes "brittle".

-However, redirects are, generally speaking, nasty things. They reduce the transparency of a protocol and make it harder for people to make informed choices about which links to follow, and they can leak information about people's online activity to third parties. They are not as bad in Gemini as in HTTP (owing to the lack of cookies, referer headers, etc.), but they remain at best a necessary evil.
+However, redirects are, generally speaking, nasty things. They reduce the
+transparency of a protocol and make it harder for people to make informed
+choices about which links to follow, and they can leak information about
+people's online activity to third parties. They are not as bad in Gemini as
+in HTTP (owing to the lack of cookies, referer headers, etc.), but they
+remain at best a necessary evil.

-As such, please refrain from using redirects frivolously! Things like URL-shorteners are almost totally without merit. In general, think long and hard about using redirects to do anything other than avoid link breakage.
+As such, please refrain from using redirects frivolously! Things like
+URL-shorteners are almost totally without merit. In general, think long and
+hard about using redirects to do anything other than avoid link breakage.

 ### Redirect limits

-Clients may prompt their users for decisions as to whether or not to follow a redirect, or they may follow redirects automatically. If you write a client which follows redirects automatically, you should keep the following issues in mind.
+Clients may prompt their users for decisions as to whether or not to follow
+a redirect, or they may follow redirects automatically. If you write a
+client which follows redirects automatically, you should keep the following
+issues in mind.

-Misconfigured or malicious Gemini servers may serve redirects in such a way that a client which follows them blindly gets trapped in an infinite loop of redirects, or otherwise has to complete a very long chain of redirects. Robust clients will need to be smart enough to detect these conditions and act accordingly. The simplest implementation is to refuse to follow more than N consecutive redirects. It is recommended that N be set no higher than 5. This is inline with the original recommenation for HTTP (see RFC-2068).
+Misconfigured or malicious Gemini servers may serve redirects in such a way
+that a client which follows them blindly gets trapped in an infinite loop of
+redirects, or otherwise has to complete a very long chain of redirects.
+Robust clients will need to be smart enough to detect these conditions and
+act accordingly. The simplest implementation is to refuse to follow more
+than N consecutive redirects. It is recommended that N be set no higher
+than 5. This is inline with the original recommenation for HTTP (see
+RFC-2068).

 ### Cross-protocol redirects

-Cross-protocol redirects (i.e. redirects from Gemini to something else, like Gopher) are possible within Gemini, but are very heavily discouraged. However, misconfigured or malicious servers will always be able to serve such redirects, so well-written clients should be ready to detect them and respond accordingly.
+Cross-protocol redirects (i.e. redirects from Gemini to something else,
+like Gopher) are possible within Gemini, but are very heavily discouraged.
+However, misconfigured or malicious servers will always be able to serve
+such redirects, so well-written clients should be ready to detect them and
+respond accordingly.

-It is strongly recommended that even clients which generally follow redirects automatically alert the user and ask for explicit confirmation when served a redirect to a non-TLS-secured protocols like HTTP or Gopher, assuming the client implements support for these protocols. This avoids unintentional plaintext transfers.
+It is strongly recommended that even clients which generally follow
+redirects automatically alert the user and ask for explicit confirmation
+when served a redirect to a non-TLS-secured protocols like HTTP or Gopher,
+assuming the client implements support for these protocols. This avoids
+unintentional plaintext transfers.

 ### TLS Cipher suites

-TLS 1.2 is reluctantly permitted in Gemini despite TLS 1.3 being drastically simpler and removing many insecure cryptographic primitives. This is because only OpenSSL seems to currently have good support for TLS 1.3 and so requiring TLS 1.3 or higher would discourage the use of libraries like LibreSSL or BearSSL, which otherwise have much to recommend them over OpenSSL.
+TLS 1.2 is reluctantly permitted in Gemini despite TLS 1.3 being drastically
+simpler and removing many insecure cryptographic primitives. This is
+because only OpenSSL seems to currently have good support for TLS 1.3 and so
+requiring TLS 1.3 or higher would discourage the use of libraries like
+LibreSSL or BearSSL, which otherwise have much to recommend them over
+OpenSSL.

-Client and server authors who choose to support TLS 1.2 should ideally only permit the use of ciphersuites which offer similar security to TLS 1.3. In particular, such software should:
+Client and server authors who choose to support TLS 1.2 should ideally only
+permit the use of ciphersuites which offer similar security to TLS 1.3. In
+particular, such software should:
+
+* Use only Ephemeral Diffie-Hellman (DHE) Ephermeral Eliptic Curve
+ Diffie-Hellman (ECDHE) for key agreement, in order to provide forward
+ secrecy.

-* Use only Ephemeral Diffie-Hellman (DHE) Ephermeral Eliptic Curve Diffie-Hellman (ECDHE) for key agreement, in order to provide forward secrecy.
 * Use AES or ChaCha20 as bulk ciphers
+
 * Use SHA2 or SHA3 family hash functions for message authentication.
diff --git a/faq.gmi b/faq.gmi
--- a/faq.gmi
+++ b/faq.gmi
@@ -6,60 +6,138 @@ Last updated: 2021-02-21

 ### 1.1 What is Gemini?

-Gemini is a new application-level internet protocol for the distribution of arbitrary files, with some special consideration for serving a lightweight hypertext format which facilitates linking between files. You may think of Gemini as "the web, stripped right back to its essence" or as "Gopher, souped up and modernised just a little", depending upon your perspective (the latter view is probably more accurate). Gemini may be of interest to people who are:
+Gemini is a new application-level internet protocol for the distribution of
+arbitrary files, with some special consideration for serving a lightweight
+hypertext format which facilitates linking between files. You may think of
+Gemini as "the web, stripped right back to its essence" or as "Gopher,
+souped up and modernised just a little", depending upon your perspective
+(the latter view is probably more accurate). Gemini may be of interest to
+people who are:

 * Opposed to the web's ubiquitous tracking of users
-* Tired of nagging pop-ups, obnoxious adverts, autoplaying videos and other misfeatures of the modern web
-* Interested in low-power computing and/or low-speed networks, either by choice or necessity

-Gemini is intended to be simple, but not necessarily as simple as possible. Instead, the design strives to maximise its "power to weight ratio", while keeping its weight within acceptable limits. Gemini is also intended to be very privacy conscious, to be difficult to extend in the future (so that it will *stay* simple and privacy conscious), and to be compatible with a "do it yourself" computing ethos. For this last reason, Gemini is technically very familiar and conservative: it's a protocol in the traditional client-server request-response paradigm, and is built on mature, standardised technology like URIs, MIME media types, and TLS.
+* Tired of nagging pop-ups, obnoxious adverts, autoplaying videos and other
+ misfeatures of the modern web
+
+* Interested in low-power computing and/or low-speed networks, either by
+ choice or necessity
+
+Gemini is intended to be simple, but not necessarily as simple as possible.
+Instead, the design strives to maximise its "power to weight ratio", while
+keeping its weight within acceptable limits. Gemini is also intended to be
+very privacy conscious, to be difficult to extend in the future (so that it
+will *stay* simple and privacy conscious), and to be compatible with a "do
+it yourself" computing ethos. For this last reason, Gemini is technically
+very familiar and conservative: it's a protocol in the traditional
+client-server request-response paradigm, and is built on mature,
+standardised technology like URIs, MIME media types, and TLS.

 ### 1.2 How old is Gemini?

-Project Gemini started in June 2019. While the protocol itself is largely finalised, the available software, resources and community are still in a relatively early (though thriving!) state of development.
+Project Gemini started in June 2019. While the protocol itself is largely
+finalised, the available software, resources and community are still in a
+relatively early (though thriving!) state of development.

 ### 1.3 Who is in charge of Gemini?

-Project Gemini was originally started by Solderpunk, who remains the "Benevolent Dictator" of the project. However, the protocol has been designed in collaboration with a loose and informal community of many interested parties via emails, posts in Gopher's "phlogosphere" and toots in the Fediverse. Many people have shaped significant parts of the protocol, so despite having a single leader, Gemini should not be thought of as the work of a single person.
+Project Gemini was originally started by Solderpunk , who remains the "Benevolent Dictator" of the project. However,
+the protocol has been designed in collaboration with a loose and informal
+community of many interested parties via emails, posts in Gopher's
+"phlogosphere" and toots in the Fediverse. Many people have shaped
+significant parts of the protocol, so despite having a single leader, Gemini
+should not be thought of as the work of a single person.

-In February 2021, long time Gemini contributor Sean Conner was granted some decision making authority to help finalise the Gemini specification during a time when Solderpunk was unable to dedicate the necessary time and energy to the project.
+In February 2021, long time Gemini contributor Sean Conner was granted some
+decision making authority to help finalise the Gemini specification during a
+time when Solderpunk was unable to dedicate the necessary time and energy to
+the project.

 ### 1.4 How large is "Geminispace"?

-It's difficult to know exactly. Counting unique hostnames of Gemini servers is likely to exaggerate the size of the space, since some multi-user sites give each user their own subdomain. On the other hand, counting unique IP addresses is likely to underestimate the size, as Gemini allows multiple different domains to be served from the same IP. At any rate, as of early 2021 there were about 200,000 known Gemini URLs, spread across about 750 "capsules" (the Gemini community's term for "sites"), 500 domains and 600 IP addresses. The space is growing rapidly, though. You can find the latest statistics as the link below.
+It's difficult to know exactly. Counting unique hostnames of Gemini servers
+is likely to exaggerate the size of the space, since some multi-user sites
+give each user their own subdomain. On the other hand, counting unique IP
+addresses is likely to underestimate the size, as Gemini allows multiple
+different domains to be served from the same IP. At any rate, as of early
+2021 there were about 200,000 known Gemini URLs, spread across about 750
+"capsules" (the Gemini community's term for "sites"), 500 domains and 600 IP
+addresses. The space is growing rapidly, though. You can find the latest
+statistics as the link below.

 => gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi Geminispace statistics provided by Stéphane Bortzmeyer's "Lupa" crawler

 ### 1.5 What stage of its lifecycle is the project in?

-The current (informal) specification of the protocol is largely frozen, modulo small changes to remove ambiguity and address edge cases. Suggestions for new features will not be considered, as the protocol is considered feature complete. Going forward, the main focus of the project now is on growing the community around the protocol, as well as working on translating the existing specification into a more precise and formal version which might be considered for submission to internet standards bodies such as IETF and IANA.
+The current (informal) specification of the protocol is largely frozen,
+modulo small changes to remove ambiguity and address edge cases.
+Suggestions for new features will not be considered, as the protocol is
+considered feature complete. Going forward, the main focus of the project
+now is on growing the community around the protocol, as well as working on
+translating the existing specification into a more precise and formal
+version which might be considered for submission to internet standards
+bodies such as IETF and IANA.

 ### 1.6 Do you really think you can replace the web?

-Not for a minute! Nor does anybody involved with Gemini want to destroy Gopherspace. Gemini is not intended to replace either Gopher or the web, but to co-exist peacefully alongside them as one more option which people can freely choose to use if it suits them. In the same way that some people currently serve the same content via gopher and the web, people will be able to "bihost" or "trihost" content on whichever combination of protocols they think offer the best match to their technical, philosophical and aesthetic requirements and those of their intended audience.
+Not for a minute! Nor does anybody involved with Gemini want to destroy
+Gopherspace. Gemini is not intended to replace either Gopher or the web,
+but to co-exist peacefully alongside them as one more option which people
+can freely choose to use if it suits them. In the same way that some people
+currently serve the same content via gopher and the web, people will be able
+to "bihost" or "trihost" content on whichever combination of protocols they
+think offer the best match to their technical, philosophical and aesthetic
+requirements and those of their intended audience.

 ### 1.7 What's with the name?

-It's a reference to the pre-shuttle era of US manned spaceflight, which consisted of three projects. The first was Project Mercury, which was a fairly minimalist "proof of concept" and part of the race to put a human in space soonest (which the Soviet Union won with their Vostok project). Mercury was a one-man capsule with no ability to adjust to its own orbit after launch and only one Mercury flight lasted longer than a single day. The last was Project Apollo, which was large, heavy, complicated and expensive but could, of course, fly three men to the moon and back.
-
-Less well known to the modern public, Project Gemini was the "middle child": a two person capsule which could rendezvous and dock with other craft in orbit, could be depressurised and repressurised in orbit to facilitate spacewalks, and whose longest flight was almost two weeks - longer than any Apollo mission! In terms of size, weight and cost Gemini was much closer to Mercury than to Apollo, but in terms of capabilities it was the other way around - there were even plans (which never eventuated) to do circumlunar Gemini flights!
-
-Hopefully the analogy is obvious: Gopher is akin to Mercury, and the web is akin to Apollo. Gemini hopes to sit between the two, doing more with less.
-
-Gemini very deliberately didn't receive a name which had *anything* to do with gophers, or other rodents, or even other animals. During the earliest phlog-based discussions which eventually grew into Project Gemini, a lack of careful writing meant it was sometimes unclear whether people were talking about replacing Gopher outright, or adding unofficial, compatibility-breaking upgrades into existing Gopher clients and servers. When idle discussion turned into an actual project, it seemed wise to send a clearer message.
+It's a reference to the pre-shuttle era of US manned spaceflight, which
+consisted of three projects. The first was Project Mercury, which was a
+fairly minimalist "proof of concept" and part of the race to put a human in
+space soonest (which the Soviet Union won with their Vostok project).
+Mercury was a one-man capsule with no ability to adjust to its own orbit
+after launch and only one Mercury flight lasted longer than a single day.
+The last was Project Apollo, which was large, heavy, complicated and
+expensive but could, of course, fly three men to the moon and back.
+
+Less well known to the modern public, Project Gemini was the "middle child":
+a two person capsule which could rendezvous and dock with other craft in
+orbit, could be depressurised and repressurised in orbit to facilitate
+spacewalks, and whose longest flight was almost two weeks - longer than any
+Apollo mission! In terms of size, weight and cost Gemini was much closer to
+Mercury than to Apollo, but in terms of capabilities it was the other way
+around - there were even plans (which never eventuated) to do circumlunar
+Gemini flights!
+
+Hopefully the analogy is obvious: Gopher is akin to Mercury, and the web is
+akin to Apollo. Gemini hopes to sit between the two, doing more with less.
+
+Gemini very deliberately didn't receive a name which had *anything* to do
+with gophers, or other rodents, or even other animals. During the earliest
+phlog-based discussions which eventually grew into Project Gemini, a lack of
+careful writing meant it was sometimes unclear whether people were talking
+about replacing Gopher outright, or adding unofficial,
+compatibility-breaking upgrades into existing Gopher clients and servers.
+When idle discussion turned into an actual project, it seemed wise to send a
+clearer message.

 ### 1.8 Where can I learn more?

-The official home of Project Gemini is the gemini.circumlunar.space server. It serves the latest version of this FAQ document, as well the protocol specification, recommended best practices and other official documentation via Gemini, Gopher and HTTPS, on IPv4 and IPv6.
+The official home of Project Gemini is the gemini.circumlunar.space server.
+It serves the latest version of this FAQ document, as well the protocol
+specification, recommended best practices and other official documentation
+via Gemini, Gopher and HTTPS, on IPv4 and IPv6.

 Official discussion regarding Gemini happens on a mailing list:

 => https://lists.orbitalfox.eu/listinfo/gemini Subscribe to the list and view archives via the web
 => gemini://rawtext.club:1965/~sloum/geminilist/ View list archives via Gemini

-Anybody who is running a Gemini server or implementing a Gemini client or server software is strongly encouraged to subscribe to the list.
+Anybody who is running a Gemini server or implementing a Gemini client or
+server software is strongly encouraged to subscribe to the list.

-Casual discussion regarding Gemini also happens in the #gemini channel on the tilde.chat IRC server:
+Casual discussion regarding Gemini also happens in the #gemini channel on
+the tilde.chat IRC server:

 => gemini://makeworld.gq/cgi-bin/gemini-irc View IRC logs via Gemini
@@ -67,151 +145,416 @@ Casual discussion regarding Gemini also happens in the #gemini channel on the ti

 ## 2.1 What are the design criteria for Gemini?

-The following criteria were informally put in place at the beginning of the project. It's debatable how closely some of these goals have been met, but in general Gemini is still quite close to this target.
+The following criteria were informally put in place at the beginning of the
+project. It's debatable how closely some of these goals have been met, but
+in general Gemini is still quite close to this target.

 ### 2.1.1 Simplicity

-In particular, Gemini strives for simplicity of client implementation. Modern web browsers are so complicated that they can only be developed by very large and expensive projects. This naturally leads to a very small number of near-monopoly browsers, which stifles innovation and diversity and allows the developers of these browsers to dictate the direction in which the web evolves.
-
-Gemini aims to be simple, but not *too* simple. Gopher is simpler at a protocol level, but as a consequence the client is eternally uncertain: what character encoding is this text in? Is this text the intended content or an error message from the server? What kind of file is this binary data? Because of this, a robust Gopher client is made *less* simple by needing to infer or guess missing information.
-
-Early Gemini discussion included three clear goals with regard to simplicity:
-
-* It should be possible for somebody who had no part in designing the protocol to accurately hold the entire protocol spec in their head after reading a well-written description of it once or twice.
-* A basic but usable (not ultra-spartan) client should fit comfortably within 50 or so lines of code in a modern high-level language. Certainly not more than 100.
-* A client comfortable for daily use which implements every single protocol feature should be a feasible weekend programming project for a single developer.
-
-It's debatable to what extent these goals have been met. Experiments suggest that a very basic interactive client takes more like a minimum of 100 lines of code, and a comfortable fit and moderate feature completeness need more like 200 lines. But Gemini still seems to be in the ballpark of these goals.
+In particular, Gemini strives for simplicity of client implementation.
+Modern web browsers are so complicated that they can only be developed by
+very large and expensive projects. This naturally leads to a very small
+number of near-monopoly browsers, which stifles innovation and diversity and
+allows the developers of these browsers to dictate the direction in which
+the web evolves.
+
+Gemini aims to be simple, but not *too* simple. Gopher is simpler at a
+protocol level, but as a consequence the client is eternally uncertain: what
+character encoding is this text in? Is this text the intended content or an
+error message from the server? What kind of file is this binary data?
+Because of this, a robust Gopher client is made *less* simple by needing to
+infer or guess missing information.
+
+Early Gemini discussion included three clear goals with regard to
+simplicity:
+
+* It should be possible for somebody who had no part in designing the
+ protocol to accurately hold the entire protocol spec in their head after
+ reading a well-written description of it once or twice.
+
+* A basic but usable (not ultra-spartan) client should fit comfortably
+ within 50 or so lines of code in a modern high-level language. Certainly
+ not more than 100.
+
+* A client comfortable for daily use which implements every single protocol
+ feature should be a feasible weekend programming project for a single
+ developer.
+
+It's debatable to what extent these goals have been met. Experiments
+suggest that a very basic interactive client takes more like a minimum of
+100 lines of code, and a comfortable fit and moderate feature completeness
+need more like 200 lines. But Gemini still seems to be in the ballpark of
+these goals.

 ### 2.1.2 Privacy

-Gemini is designed with an acute awareness that the modern web is a privacy disaster, and that the internet is not a safe place for plaintext. Things like browser fingerprinting and Etag-based "supercookies" are an important cautionary tale: user tracking can and will be snuck in via the backdoor using protocol features which were not designed to facilitate it. Thus, protocol designers must not only avoid designing in tracking features (which is easy), but also assume active malicious intent and avoid designing anything which could be subverted to provide effective tracking. This concern manifests as a deliberate non-extensibility in many parts of the Gemini protocol.
+Gemini is designed with an acute awareness that the modern web is a privacy
+disaster, and that the internet is not a safe place for plaintext. Things
+like browser fingerprinting and Etag-based "supercookies" are an important
+cautionary tale: user tracking can and will be snuck in via the backdoor
+using protocol features which were not designed to facilitate it. Thus,
+protocol designers must not only avoid designing in tracking features (which
+is easy), but also assume active malicious intent and avoid designing
+anything which could be subverted to provide effective tracking. This
+concern manifests as a deliberate non-extensibility in many parts of the
+Gemini protocol.

 ### 2.1.3 Generality

-The "first class" application of Gemini is human consumption of predominantly written material - to facilitate something like gopherspace, or like "reasonable webspace" (e.g. something which is comfortably usable in Lynx or Dillo). But, just like HTTP can be, and is, used for much, much more than serving HTML, Gemini should be able to be used for as many other purposes as possible without compromising the simplicity and privacy criteria above. This means taking into account possible applications built around non-text files and non-human clients.
+The "first class" application of Gemini is human consumption of
+predominantly written material - to facilitate something like gopherspace,
+or like "reasonable webspace" (e.g. something which is comfortably usable
+in Lynx or Dillo). But, just like HTTP can be, and is, used for much, much
+more than serving HTML, Gemini should be able to be used for as many other
+purposes as possible without compromising the simplicity and privacy
+criteria above. This means taking into account possible applications built
+around non-text files and non-human clients.

 ## 2.2 Which shortcomings of Gopher does Gemini overcome?

 Gemini allows for:

 * Unambiguous use of arbitrary non-ASCII character sets.
-* Identifying binary content using MIME types instead of a small set of badly outdated item types.
+
+* Identifying binary content using MIME types instead of a small set of
+ badly outdated item types.
+
 * Clearly distinguishing successful transactions from failed ones.
-* Linking to resources served over other protocols via simple URLs, without ugly hacks.
+ +* Linking to resources served over other protocols via simple URLs, without + ugly hacks. + * Redirects to prevent broken links when content moves or is rearranged. + * Domain-based virtual hosting. -Text in Gemini documents is wrapped by the client to fit the device's viewport, rather than being "hard wrapped" at ~80 characters with newline characters. This means content displays equally well on phones, tablets, laptops and desktops. +Text in Gemini documents is wrapped by the client to fit the device's +viewport, rather than being "hard wrapped" at ~80 characters with newline +characters. This means content displays equally well on phones, tablets, +laptops and desktops. -Gemini does away with Gopher's strict directory / text dichotomy and lets you insert links in prose. +Gemini does away with Gopher's strict directory / text dichotomy and lets +you insert links in prose. Gemini mandates the use of TLS encryption. ## 2.3 Is Gopher's directory / text dichotomy *really* a shortcoming? -Modern usage habits in the phlogosphere would seem to suggest that many people think it is. An increasing number of users are serving content which is almost entirely text as item type 1, so that they can insert a relatively small number of "in line" links to other gopher content, providing some semblance of HTML's hyperlinking - a perfectly reasonable and inoffensive thing to want to do. Without taking this approach, the best Gopher content authors can do is to paste a list of URLs at the bottom of their document, for their readers to manually copy and paste into their client. This is not exactly a pleasant user experience. But forcing hyperlinks into Gopher this way isn't just an abuse of the semantics of the Gopher protocol, it's also a surprisingly inefficient way to serve text, because every single line has to have an item type of i and a phony selector, hostname and port transmitted along with it to make a valid Gopher menu. 
Any and all claims to simplicity and beauty which Gopher might have are destroyed by this. Gemini takes the simple approach of letting people insert as many or as few links as they like into their text content, with extremely low overhead, but retains the one-link-per-line limitation of Gopher which results in clean, list-like organisation of content. It's hard to see this as anything other than an improvement. - -Of course, if you really like the Gopher way, nothing in Gemini stops you from duplicating it. You can serve item type 0 content with a MIME type of text/plain, and you can write text/gemini documents where every single line is a link line, replicating the look and feel of a RFC1436-fearing Gopher menu without that pesky non-standard i item type. +Modern usage habits in the phlogosphere would seem to suggest that many +people think it is. An increasing number of users are serving content which +is almost entirely text as item type 1, so that they can insert a relatively +small number of "in line" links to other gopher content, providing some +semblance of HTML's hyperlinking - a perfectly reasonable and inoffensive +thing to want to do. Without taking this approach, the best Gopher content +authors can do is to paste a list of URLs at the bottom of their document, +for their readers to manually copy and paste into their client. This is not +exactly a pleasant user experience. But forcing hyperlinks into Gopher this +way isn't just an abuse of the semantics of the Gopher protocol, it's also a +surprisingly inefficient way to serve text, because every single line has to +have an item type of i and a phony selector, hostname and port transmitted +along with it to make a valid Gopher menu. Any and all claims to simplicity +and beauty which Gopher might have are destroyed by this. 
Gemini takes the +simple approach of letting people insert as many or as few links as they +like into their text content, with extremely low overhead, but retains the +one-link-per-line limitation of Gopher which results in clean, list-like +organisation of content. It's hard to see this as anything other than an +improvement. + +Of course, if you really like the Gopher way, nothing in Gemini stops you +from duplicating it. You can serve item type 0 content with a MIME type of +text/plain, and you can write text/gemini documents where every single line +is a link line, replicating the look and feel of a RFC1436-fearing Gopher +menu without that pesky non-standard i item type. ## 2.4 Which shortcomings of the web does Gemini overcome? -Gemini contains no equivalent of User-Agent or Referer headers, and the request format is not extensible so that these cannot be shoehorned in later. In fact, Gemini requests contain nothing other than the URL of the resource being requested. This goes a very long way to preventing user tracking. +Gemini contains no equivalent of User-Agent or Referer headers, and the +request format is not extensible so that these cannot be shoehorned in +later. In fact, Gemini requests contain nothing other than the URL of the +resource being requested. This goes a very long way to preventing user +tracking. -The "native content type" of Gemini (analogous to HTML for HTTP(S) or plain text for Gopher) never requires additional network transactions (there are no in-line images, external stylesheets, fonts or scripts, no iframes, etc.). This allows for quick browsing even on slow connections and for full awareness of and control over which hosts connections are made to. +The "native content type" of Gemini (analogous to HTML for HTTP(S) or plain +text for Gopher) never requires additional network transactions (there are +no in-line images, external stylesheets, fonts or scripts, no iframes, +etc.). 
This allows for quick browsing even on slow connections and for full +awareness of and control over which hosts connections are made to. -The native content type of Gemini is strictly a document, with no facility for scripting, allowing for easy browsing even on old computers with limited processor speed or memory. +The native content type of Gemini is strictly a document, with no facility +for scripting, allowing for easy browsing even on old computers with limited +processor speed or memory. ## 2.5 Why not just use a subset of HTTP and HTML? -Many people are confused as to why it's worth creating a new protocol to address perceived problems with optional, non-essential features of the web. Just because websites *can* track users and run CPU-hogging JavaScript and pull in useless multi-megabyte header images or even larger autoplaying videos, doesn't mean they *have* to. Why not just build non-evil websites using the existing technology? - -Of course, this is possible. "The Gemini experience" is roughly equivalent to HTTP where the only request header is "Host" and the only response header is "Content-type" and HTML where the only tags are <p>, <pre>, <a>, <h1> through <h3>, <ul> and <li> and <blockquote> - and the https://gemini.circumlunar.space website offers pretty much this experience. We know it can be done. - -The problem is that deciding upon a strictly limited subset of HTTP and HTML, slapping a label on it and calling it a day would do almost nothing to create a clearly demarcated space where people can go to consume *only* that kind of content in *only* that kind of way. It's impossible to know in advance whether what's on the other side of a https:// URL will be within the subset or outside it. It's very tedious to verify that a website claiming to use only the subset actually does, as many of the features we want to avoid are invisible (but not harmless!) to the user. It's difficult or even impossible to deactivate support for all the unwanted features in mainstream browsers, so if somebody breaks the rules you'll pay the consequences. Writing a dumbed down web browser which gracefully ignores all the unwanted features is much harder than writing a Gemini client from scratch. Even if you did it, you'd have a very difficult time discovering the minuscule fraction of websites it could render. - -Alternative, simple-by-design protocols like Gopher and Gemini create alternative, simple-by-design spaces with obvious boundaries and hard restrictions. You know for sure when you enter Geminispace, and you can know for sure and in advance when following a certain link will cause you to leave it. While you're there, you know for sure and in advance that everybody else there is playing by the same rules. You can relax and get on with your browsing, and follow links to sites you've never heard of before, which just popped up yesterday, and be confident that they won't try to track you or serve you garbage because they *can't*. You can do all this with a client you wrote yourself, so you *know* you can trust it. It's a very different, much more liberating and much more empowering experience than trying to carve out a tiny, invisible sub-sub-sub-sub-space of the web.
+Many people are confused as to why it's worth creating a new protocol to +address perceived problems with optional, non-essential features of the web. +Just because websites *can* track users and run CPU-hogging JavaScript and +pull in useless multi-megabyte header images or even larger autoplaying +videos, doesn't mean they *have* to. Why not just build non-evil websites +using the existing technology? + +Of course, this is possible. "The Gemini experience" is roughly equivalent +to HTTP where the only request header is "Host" and the only response header +is "Content-type" and HTML where the only tags are <p>, <pre>, <a>, <h1> +through <h3>, <ul> and <li> and <blockquote> - and the +https://gemini.circumlunar.space website offers pretty much this experience. +We know it can be done. + +The problem is that deciding upon a strictly limited subset of HTTP and +HTML, slapping a label on it and calling it a day would do almost nothing to +create a clearly demarcated space where people can go to consume *only* that +kind of content in *only* that kind of way. It's impossible to know in +advance whether what's on the other side of a https:// URL will be within +the subset or outside it. It's very tedious to verify that a website +claiming to use only the subset actually does, as many of the features we +want to avoid are invisible (but not harmless!) to the user. It's difficult +or even impossible to deactivate support for all the unwanted features in +mainstream browsers, so if somebody breaks the rules you'll pay the +consequences. Writing a dumbed down web browser which gracefully ignores +all the unwanted features is much harder than writing a Gemini client from +scratch. Even if you did it, you'd have a very difficult time discovering +the minuscule fraction of websites it could render. + +Alternative, simple-by-design protocols like Gopher and Gemini create +alternative, simple-by-design spaces with obvious boundaries and hard +restrictions. You know for sure when you enter Geminispace, and you can +know for sure and in advance when following a certain link will cause you +to leave it. While you're there, you know for sure and in advance that +everybody else there is playing by the same rules. You can relax and get on +with your browsing, and follow links to sites you've never heard of before, +which just popped up yesterday, and be confident that they won't try to +track you or serve you garbage because they *can't*. You can do all this +with a client you wrote yourself, so you *know* you can trust it.
It's a +very different, much more liberating and much more empowering experience +than trying to carve out a tiny, invisible sub-sub-sub-sub-space of the web. ## 2.6 Does Gemini have any shortcomings of its own? Naturally! -Gemini has no support for caching, compression, or resumption of interrupted downloads. As such, it's not very well suited to distributing large files, for values of "large" which depend upon the speed and reliability of your network connection. +Gemini has no support for caching, compression, or resumption of interrupted +downloads. As such, it's not very well suited to distributing large files, +for values of "large" which depend upon the speed and reliability of your +network connection. ## 2.7 How can you say Gemini is simple if it uses TLS? -Some people are upset that the TLS requirement means they need to use a TLS library to write Gemini code, whereas e.g. Gopher allows them full control by writing everything from scratch themselves. +Some people are upset that the TLS requirement means they need to use a TLS +library to write Gemini code, whereas e.g. Gopher allows them full control +by writing everything from scratch themselves. -Of course, even a "from scratch" Gopher client actually depends crucially on thousands of lines of complicated code written by other people in order to provide a functioning IP stack, DNS resolver and filesystem. Using a TLS library to provide a trustworthy implementation of cryptography is little different. +Of course, even a "from scratch" Gopher client actually depends crucially on +thousands of lines of complicated code written by other people in order to +provide a functioning IP stack, DNS resolver and filesystem. Using a TLS +library to provide a trustworthy implementation of cryptography is little +different. -Gemini also turns TLS client certificates - very rarely seen on the web - into a first-class citizen with in-band signalling of their requirement.
This allows restricting access to Gemini resources to certain parties, or voluntarily establishing "sessions" with server-side applications, without having to pass around cookies, passwords, authentication tokens or anything else you may be used to. It's much closer to SSH's notion of "authorized keys" and is, in fact, a much simpler approach to user authentication. +Gemini also turns TLS client certificates - very rarely seen on the web - +into a first-class citizen with in-band signalling of their requirement. +This allows restricting access to Gemini resources to certain parties, or +voluntarily establishing "sessions" with server-side applications, without +having to pass around cookies, passwords, authentication tokens or anything +else you may be used to. It's much closer to SSH's notion of "authorized +keys" and is, in fact, a much simpler approach to user authentication. -## 2.8 Why use TLS for crypto instead of something more modern like the Noise protocol? +## 2.8 Why use TLS for crypto instead of something more modern like the + Noise protocol? TLS is certainly not without its shortcomings, but: -* There are bindings to TLS libraries available for almost every programming language under the sun -* Many developers are already at least partially familiar with TLS and therefore don't need to learn anything new to implement Gemini -* Most users are already trusting TLS to secure their web browsing and email, and therefore don't need to decide whether or not they want to trust some unfamiliar technology to start using Gemini -* TLS is a deeply entrenched industry standard, whose definition and implementations will both continue to be scrutinised and improved by security experts for the foreseeable future, and that work will happen for reasons entirely unrelated to Gemini - it makes a lot of sense for a small project to "freeride" like this. 
+* There are bindings to TLS libraries available for almost every programming + language under the sun -## 2.9 Why didn't you just use Markdown instead of defining text/gemini? +* Many developers are already at least partially familiar with TLS and + therefore don't need to learn anything new to implement Gemini -The text/gemini markup borrows heavily from Markdown, which might prompt some people to wonder "Why not just use Markdown as the default media type for Gemini? Sure, it's complicated to implement, but like TLS there are plenty of libraries available in all the major languages". Reasons not to go down this route include: +* Most users are already trusting TLS to secure their web browsing and + email, and therefore don't need to decide whether or not they want to + trust some unfamiliar technology to start using Gemini -* There are actually many subtly different and incompatible variants of Markdown in existence, so unlike TLS all the different libraries are not guaranteed to behave similarly. -* The vast majority of Markdown libraries don't actually do anything more than convert Markdown to HTML, which for a Gemini client is a needless intermediary format which is heavier than the original! -* Many Markdown variants permit features which were not wanted for Gemini, e.g. inline images. -* A desire to preserve Gopher's requirement of "one link per line" on the grounds that it encourages extremely clear site designs. +* TLS is a deeply entrenched industry standard, whose definition and + implementations will both continue to be scrutinised and improved by + security experts for the foreseeable future, and that work will happen for + reasons entirely unrelated to Gemini - it makes a lot of sense for a small + project to "freeride" like this. -Of course, it is possible to serve Markdown over Gemini. The inclusion of a text/markdown Media type in the response header will allow more advanced clients to support it. 
+## 2.9 Why didn't you just use Markdown instead of defining text/gemini? -## 2.10 Why doesn't text/gemini have support for in-line links? +The text/gemini markup borrows heavily from Markdown, which might prompt +some people to wonder "Why not just use Markdown as the default media type +for Gemini? Sure, it's complicated to implement, but like TLS there are +plenty of libraries available in all the major languages". Reasons not to +go down this route include: -Because text/gemini is an entirely new format defined from scratch for Gemini, client authors will typically need to write their own code to parse and render the format from scratch, without being able to rely on a pre-existing, well-tested library implementation. Therefore, it is important that the format is extremely simple to handle correctly. The line-based format where text lines and link lines are separate concepts achieves this. There is no need for clients to scan each line character-by-character, testing for the presence of some special link syntax. Even the simplest special link syntax introduces the possibility of malformed syntax which clients would need to be robust against, and has edge cases whose handling would either need to be explicitly addressed in the protocol specification (leading to a longer, more tedious specification which was less fun to read and harder to hold in your head), or left undefined (leading to inconsistent behaviour across different clients). Even though in-line links may be a more natural fit for some kinds of content, they're just not worth the increased complexity and fragility they would inevitably introduce to the protocol. +* There are actually many subtly different and incompatible variants of + Markdown in existence, so unlike TLS all the different libraries are not + guaranteed to behave similarly. -It's true that you need to shift your thinking a bit to get used to the one link per line writing style, but it gets easier over time. 
There are benefits to the style as well. It encourages including only the most important or relevant links, organising links into related lists, and giving each link a maximally descriptive label without having to worry about whether or not that label fits naturally into the flow of your main text. +* The vast majority of Markdown libraries don't actually do anything more + than convert Markdown to HTML, which for a Gemini client is a needless + intermediary format which is heavier than the original! -## 2.11 Why doesn't text/gemini have support for styling? +* Many Markdown variants permit features which were not wanted for Gemini, + e.g. inline images. -Some people have expressed a desire for something similar to CSS in Gemini. While it's true that something much simpler and lighter than CSS could easily be designed, Gemini instead takes the position that visual styling of Gemini content should be under the sole and direct control of the reader, not the writer. Not everybody has the same taste in colours and fonts, and no single way of styling a page will be optimal for all readers, all devices and all lighting conditions. There is much more at stake here than the age old divide in preference for dark text on a light background or vice versa. People with reading disabilities like dyslexia may benefit tremendously from using specially designed fonts, for example. A simple "one size fits all" styling system where content looks the same everywhere is guaranteed to perform poorly for a lot of people. A more complicated styling system which can specify different looks for different devices and contexts burdens every individual author with the task of making sure their capsule is usable everywhere. Experience from the web suggests that accessibility issues will often be an afterthought at best. It's much simpler, and in fact much more liberating for content authors, to let content just be content, and leave styling to the client.
Some Gemini clients might look dull and boring, but there's no reason this has to be the case. If there is demand for clients with high quality font rendering and beautiful typography, such clients will eventually be developed - and when they are, users who value those things can enjoy that reading experience everywhere in Geminispace, even when reading content written by authors who don't care about styling at all. +* A desire to preserve Gopher's requirement of "one link per line" on the + grounds that it encourages extremely clear site designs. -## 2.12 Why isn't there an equivalent of the HTTP Content-length header? - -Non-extensibility of the protocol was a major design principle for Gemini. Things like cookies, Etags and other tracking tools were not present in the original design of HTTP, but could be seamlessly added later because the HTTP response format is open-ended and allows the easy inclusion of new headers. To minimise the risk of Gemini slowly mutating into something more web-like, it was decided to include one and exactly one piece of information in the response header for successful requests. Including two pieces of information with a specified delimiter would provide a very obvious path for later adding a third piece - just use the same delimiter again. There is basically no stable position between one piece of information and arbitrarily many pieces of information, so Gemini sticks hard to the former option, even if it means having to sacrifice some nice and seemingly harmless functionality. Given this restriction, including only an equivalent of Content-type seemed clearly more useful than including only an equivalent of Content-length. The same is true for other harmless and useful HTTP headers, like Last-Modified. +Of course, it is possible to serve Markdown over Gemini. The inclusion of a +text/markdown Media type in the response header will allow more advanced +clients to support it. 
-Gopher also has no equivalent of the Content-length header, and this has not proven to be a practical obstacle in Gopherspace. +## 2.10 Why doesn't text/gemini have support for in-line links? -Even without this header, it is possible (unlike in Gopher) for clients to distinguish between a Gemini transaction which has completed successfully and one which has dropped out mid-transfer due to a network fault or malicious attack via the presence or absence of a TLS Shutdown message. +Because text/gemini is an entirely new format defined from scratch for +Gemini, client authors will typically need to write their own code to parse +and render the format from scratch, without being able to rely on a +pre-existing, well-tested library implementation. Therefore, it is +important that the format is extremely simple to handle correctly. The +line-based format where text lines and link lines are separate concepts +achieves this. There is no need for clients to scan each line +character-by-character, testing for the presence of some special link +syntax. Even the simplest special link syntax introduces the possibility of +malformed syntax which clients would need to be robust against, and has edge +cases whose handling would either need to be explicitly addressed in the +protocol specification (leading to a longer, more tedious specification +which was less fun to read and harder to hold in your head), or left +undefined (leading to inconsistent behaviour across different clients). +Even though in-line links may be a more natural fit for some kinds of +content, they're just not worth the increased complexity and fragility they +would inevitably introduce to the protocol. + +It's true that you need to shift your thinking a bit to get used to the one +link per line writing style, but it gets easier over time. There are +benefits to the style as well. 
It encourages including only the most +important or relevant links, organising links into related lists, and giving +each link a maximally descriptive label without having to worry about +whether or not that label fits naturally into the flow of your main text. -It is true that the inability for clients to tell users how much more of a large file still has to be downloaded and to estimate how long this may take means Gemini cannot provide a very user-friendly experience for large file downloads. However, this would be the case even if Content-length were specified, as such an experience would also require other complications to be added to the protocol e.g. the ability to resume interrupted downloads. Gemini documents can of course straightforwardly link to resources hosted via HTTPS, BitTorrent, IPFS, DAT, etc. and this may be the best option for very large files. +## 2.11 Why doesn't text/gemini have support for styling? -## 2.13 Why isn't a protocol version number included with requests or responses? +Some people have expressed a desire for something similar to CSS in Gemini. +While it's true that something much simpler and lighter than CSS could +easily be designed, Gemini instead takes the position that visual styling of +Gemini content should be under the sole and direct control of the reader, +not the writer. Not everybody has the same taste in colours and fonts, and +no single way of styling a page will be optimal for all readers, all devices +and all lighting conditions. There is much more at stake here than the age +old divide in preference for dark text on a light background or vice versa. +People with reading disabilities like dyslexia may benefit tremendously from +using specially designed fonts, for example. A simple "one size fits all" +styling system where content looks the same everywhere is guaranteed to +perform poorly for a lot of people.
A more complicated styling system which +can specify different looks for different devices and contexts burdens every +individual author with the task of making sure their capsule is usable +everywhere. Experience from the web suggests that accessibility issues will +often be an afterthought at best. It's much simpler, and in fact much more +liberating for content authors, to let content just be content, and leave +styling to the client. Some Gemini clients might look dull and boring, but +there's no reason this has to be the case. If there is demand for clients +with high quality font rendering and beautiful typography, such clients will +eventually be developed - and when they are, users who value those things +can enjoy that reading experience everywhere in Geminispace, even when +reading content written by authors who don't care about styling at all. -This would only be useful if there were plans to smoothly upgrade to a "Gemini 2.0" in the future - and there aren't! Gemini is a "less is more" reaction against web browsers and servers becoming too complicated and too powerful. It makes no sense to plan to add more functionality to Gemini later. Instead the plan is to "get it right the first time", as much as possible, then freeze the protocol specification forever after, without upgrades, enhancements or extensions. +## 2.12 Why isn't there an equivalent of the HTTP Content-length header? -This may seem radical or unwise, but we're cautiously optimistic. The Gopher specification has not been changed in about 30 years, and only a very small number of quite minor unofficial changes to that spec are in common use in today's Gopherspace, which is actually growing in popularity. 
Gemini combines mature, ubiquitous internet primitives like URIs, MIME media types and TLS in a very straightforward way, and seeks to foster a culture of working within - and even embracing - carefully chosen limitations, rather than removing each constraint as it is encountered to make anything possible. There are plenty of things that Gemini is useful for and good at right now, and there is no reason to think it won't be useful for and good at those same things decades from now. +Non-extensibility of the protocol was a major design principle for Gemini. +Things like cookies, Etags and other tracking tools were not present in the +original design of HTTP, but could be seamlessly added later because the +HTTP response format is open-ended and allows the easy inclusion of new +headers. To minimise the risk of Gemini slowly mutating into something more +web-like, it was decided to include one and exactly one piece of information +in the response header for successful requests. Including two pieces of +information with a specified delimiter would provide a very obvious path for +later adding a third piece - just use the same delimiter again. There is +basically no stable position between one piece of information and +arbitrarily many pieces of information, so Gemini sticks hard to the former +option, even if it means having to sacrifice some nice and seemingly +harmless functionality. Given this restriction, including only an +equivalent of Content-type seemed clearly more useful than including only an +equivalent of Content-length. The same is true for other harmless and +useful HTTP headers, like Last-Modified. + +Gopher also has no equivalent of the Content-length header, and this has not +proven to be a practical obstacle in Gopherspace. 
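
Editorial illustration (not part of the patch): the single-field response header described in 2.12 can be sketched in a few lines of Python. This is only a minimal sketch. It assumes the header layout from the protocol specification (a two-digit status, one space, a single META string, then CRLF), and the function name parse_response_header is hypothetical rather than taken from any existing library.

```python
from typing import Tuple


def parse_response_header(raw: bytes) -> Tuple[int, str]:
    """Split one Gemini response header into (status, meta).

    Assumed layout (from the protocol specification):
    <STATUS><SPACE><META><CR><LF>
    """
    if not raw.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    line = raw[:-2].decode("utf-8")

    # Everything after the first space is META; it is never split further.
    status, sep, meta = line.partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError("status must be exactly two ASCII digits")
    if sep != " ":
        raise ValueError("status must be followed by a single space")

    # For successful responses META is the MIME type. It is kept as one
    # opaque string here, mirroring the "one piece of information" rule.
    return int(status), meta


if __name__ == "__main__":
    print(parse_response_header(b"20 text/gemini; charset=utf-8\r\n"))
```

The sketch has no header dictionary and no loop over fields. A client reads one line, gets one status and one string, and everything after that is the response body.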
+ +Even without this header, it is possible (unlike in Gopher) for clients to +distinguish between a Gemini transaction which has completed successfully +and one which has dropped out mid-transfer due to a network fault or +malicious attack via the presence or absence of a TLS Shutdown message. + +It is true that the inability for clients to tell users how much more of a +large file still has to be downloaded and to estimate how long this may take +means Gemini cannot provide a very user-friendly experience for large file +downloads. However, this would be the case even if Content-length were +specified, as such an experience would also require other complications to +be added to the protocol e.g. the ability to resume interrupted downloads. +Gemini documents can of course straightforwardly link to resources hosted +via HTTPS, BitTorrent, IPFS, DAT, etc. and this may be the best option for +very large files. + +## 2.13 Why isn't a protocol version number included with requests or + responses? + +This would only be useful if there were plans to smoothly upgrade to a +"Gemini 2.0" in the future - and there aren't! Gemini is a "less is more" +reaction against web browsers and servers becoming too complicated and too +powerful. It makes no sense to plan to add more functionality to Gemini +later. Instead the plan is to "get it right the first time", as much as +possible, then freeze the protocol specification forever after, without +upgrades, enhancements or extensions. + +This may seem radical or unwise, but we're cautiously optimistic. The +Gopher specification has not been changed in about 30 years, and only a very +small number of quite minor unofficial changes to that spec are in common +use in today's Gopherspace, which is actually growing in popularity. 
Gemini +combines mature, ubiquitous internet primitives like URIs, MIME media types +and TLS in a very straightforward way, and seeks to foster a culture of +working within - and even embracing - carefully chosen limitations, rather +than removing each constraint as it is encountered to make anything +possible. There are plenty of things that Gemini is useful for and good at +right now, and there is no reason to think it won't be useful for and good +at those same things decades from now. ## 2.14 Why don't you care about retrocomputing support? -Gopher is so simple that computers from the 80s or 90s can easily implement the protocol, and for some people this is one of the great virtues of Gopher. The TLS requirement of Gemini limits it to more modern machines. - -Old machines are awesome, and keeping them running, online and useful for as long as possible is an awesome thing to do. But it also makes no sense for the vast majority of internet users to sacrifice any and all privacy protection to facilitate this. Remember, though, that Gemini does not aim to replace Gopher, so the retro-compatible internet is not directly endangered by it. In fact, people serving content via Gopher right now are strongly encouraged to start also serving that same content via Gemini simultaneously. Retrocomputing fans can continue to access the content via Gopher, while modern computer users who wish to can switch to Gemini and reap some benefits. +Gopher is so simple that computers from the 80s or 90s can easily implement +the protocol, and for some people this is one of the great virtues of +Gopher. The TLS requirement of Gemini limits it to more modern machines. + +Old machines are awesome, and keeping them running, online and useful for as +long as possible is an awesome thing to do. But it also makes no sense for +the vast majority of internet users to sacrifice any and all privacy +protection to facilitate this. 
Remember, though, that Gemini does not aim +to replace Gopher, so the retro-compatible internet is not directly +endangered by it. In fact, people serving content via Gopher right now are +strongly encouraged to start also serving that same content via Gemini +simultaneously. Retrocomputing fans can continue to access the content via +Gopher, while modern computer users who wish to can switch to Gemini and +reap some benefits. # 3. Getting started in Geminispace ## 3.1 I'm curious about Geminispace, how can I check it out? -The lowest commitment way to explore Geminispace is to use a web proxy or "portal", such as one of the following: +The lowest commitment way to explore Geminispace is to use a web proxy or +"portal", such as one of the following: => https://portal.mozz.us/gemini/gemini.circumlunar.space/ The mozz.us Gemini portal => https://proxy.vulpes.one/gemini/gemini.circumlunar.space The vulpes.one Gemini portal -This will allow you to use your regular web browser to explore Geminispace. If you like what you see, you might want to consider installing a dedicated Gemini client, which will typically offer a better and more complete browsing experience. You can find a list of clients (and other software) at the link below. There are even clients available for mobile platforms like Android and iOS! +This will allow you to use your regular web browser to explore Geminispace. +If you like what you see, you might want to consider installing a dedicated +Gemini client, which will typically offer a better and more complete +browsing experience. You can find a list of clients (and other software) at +the link below. There are even clients available for mobile platforms like +Android and iOS! 
=> /software/ Gemini software list -If you have an ssh client installed, you can try some terminal clients out without installing them by running: +If you have an ssh client installed, you can try some terminal clients out +without installing them by running: > ssh kiosk@gemini.circumlunar.space @@ -219,34 +562,50 @@ This Gemini kiosk was inspired by the Gopher kiosk at bitreich.org! ## 3.2 Okay, I've got a client, where can I find content? -For now, Geminispace is still small enough that it's feasible to use directories as a way to discover what is out there. Some of these are listed below: +For now, Geminispace is still small enough that it's feasible to use +directories as a way to discover what is out there. Some of these are +listed below: => gemini://medusae.space/ The medusae.space Gemini directory has a list of capsules divided into thematic categories => gemini://gus.guru/known-hosts The GUS search engine's list of known Gemini hosts => /servers/ A historic list of the first 50 Gemini servers -If you are looking for something in particular, Gemini has two search engines: +If you are looking for something in particular, Gemini has two search +engines: => gemini://gus.guru GUS, the first Gemini search engine => gemini://houston.coder.town Houston, the second Gemini search engine -There are two public aggregators which attempt to make it easier to find recently-updated material in Geminispace: +There are two public aggregators which attempt to make it easier to find +recently-updated material in Geminispace: => /capcom/ CAPCOM, which aggregates Atom feeds of Gemini content => gemini://rawtext.club:1965/~sloum/spacewalk.gmi Spacewalk, which uses change-detection to find new content ## 3.3 How can I put some content of my own in Geminspace? -One option is to set up your own Gemini server on a VPS or a computer in your home (small SBCs like the RaspberryPi are perfectly capable of acting as Gemini servers). 
There is a wide range of server software available to choose from: +One option is to set up your own Gemini server on a VPS or a computer in +your home (small SBCs like the RaspberryPi are perfectly capable of acting +as Gemini servers). There is a wide range of server software available to +choose from: => /software/ Gemini software list -Alternatively, you can find somewhere else to host your content for you. Gemini hosting is also available from the following providers: +Alternatively, you can find somewhere else to host your content for you. +Gemini hosting is also available from the following providers: => gemini://idf.looting.uk/hosting idf.looting.uk => gemini://srht.site/ SourceHut (including support for custom domains!) -A number of "pubnix" or "tilde" communities (multi-user unix systems where users interact with one another by sshing in and using local email, chat and BBS applications) also offer Gemini hosting (typically alongside web and/or Gopher hosting). You may be able to get an account of one of the communities listed below. Please note that most of these communities are older than Gemini itself, and may be focussed on other services, or may be specific to a particular theme or interest. Research your choices carefully and join somewhere you think you might fit in well overall, rather than just treating these amazing little worlds as free space to dump your stuff. +A number of "pubnix" or "tilde" communities (multi-user unix systems where +users interact with one another by sshing in and using local email, chat and +BBS applications) also offer Gemini hosting (typically alongside web and/or +Gopher hosting). You may be able to get an account of one of the +communities listed below. Please note that most of these communities are +older than Gemini itself, and may be focussed on other services, or may be +specific to a particular theme or interest. 
Research your choices carefully +and join somewhere you think you might fit in well overall, rather than just +treating these amazing little worlds as free space to dump your stuff. => gemini://gemini.ctrl-c.club Ctrl-C.club => gemini://envs.net envs.net @@ -255,18 +614,26 @@ A number of "pubnix" or "tilde" communities (multi-user unix systems where users => gemini://rawtext.club Raw Text Club, aka RTC => gemini://breadpunk.club/ Breadpunk.club, a baking-themed server -If you belong to a pubnix community which doesn't offer Gemini hosting, it can't hurt to ask the admin(s) if they are interested in adding this service! +If you belong to a pubnix community which doesn't offer Gemini hosting, it +can't hurt to ask the admin(s) if they are interested in adding this +service! -If you do not feel comfortable with the technologies needed to make use of pubnix hosting (ssh or sftp, terminal text editors, unix file permissions, etc) you can get free accounts at the services below which will allow you to maintain a capsule via a web application: +If you do not feel comfortable with the technologies needed to make use of +pubnix hosting (ssh or sftp, terminal text editors, unix file permissions, +etc) you can get free accounts at the services below which will allow you to +maintain a capsule via a web application: => https://gemlog.blue Gemlog Blue, featuring an ultralight interface with no cookies or Javascript => https://flounder.online/ Flounder, where your content will be available via Gemini and the web simultaneously ## 3.4 I set up my own Gemini server, is there anything I should do? -Please consider joining the mailing list (see question 1.3) so that you can announce your new server to the community, and keep up to date with e.g. updates to your server software or to the Gemini protocol itself. +Please consider joining the mailing list (see question 1.3) so that you can +announce your new server to the community, and keep up to date with e.g. 
+updates to your server software or to the Gemini protocol itself. -You can submit your server's URL to the GUS search engine so that it gets crawled, via the link below: +You can submit your server's URL to the GUS search engine so that it gets +crawled, via the link below: => gemini://gus.guru/add-seed Submit a URL to GUS @@ -274,42 +641,96 @@ You can submit your server's URL to the GUS search engine so that it gets crawle ## 4.1 I like the sound of the Gemini project, how can I help? -Gemini already has a surprising number of client and server implementations in existence - which isn't to say more aren't welcome, but the real shortage right now is not of software but of content. The more interesting and exciting stuff people find in Geminispace, the more likely they are to want to add content of their own. So, the greatest contribution you can make to the project is to be a part of this process. See question 3.3 above for details on how to get your content into Geminispace. - -If you have the necessary technical skills, you can make a major contribution to the growth of Geminispace by providing a hosting service which people can use to publish content. This can be as simple as setting up sftp-only user accounts on a VPS. Offering hosting doesn't necessarily need to be a big committment. You can use the cheapest VPS services on offer to very comfortably host a dozen or so users. A large number of hosts each serving the content of a relatively small number of users is a much more robust and sustainable ecosystem than a small number of servers each hosting hundreds or thousands of users! - -If you really want to write some software, a powerful tool for expanding Geminispace could be a single piece of software which simultaneously provides a Gemini server and a way for multiple users to easily manage the content provided by said server, e.g. 
via an interactive web interface or by sending emails full of content; Something like the Gemlog Blue and Flounder services (see question 3.3 again), but packaged up and documented in such a way that it's easy for people to deploy and customise their own multiuser sites, much like e.g. a Mastodon instance. - -You can also help the project by contributing corrections and additions to or translations of the official site and documentation (see questions 4.2 and 4.3 below). +Gemini already has a surprising number of client and server implementations +in existence - which isn't to say more aren't welcome, but the real shortage +right now is not of software but of content. The more interesting and +exciting stuff people find in Geminispace, the more likely they are to want +to add content of their own. So, the greatest contribution you can make to +the project is to be a part of this process. See question 3.3 above for +details on how to get your content into Geminispace. + +If you have the necessary technical skills, you can make a major +contribution to the growth of Geminispace by providing a hosting service +which people can use to publish content. This can be as simple as setting +up sftp-only user accounts on a VPS. Offering hosting doesn't necessarily +need to be a big committment. You can use the cheapest VPS services on +offer to very comfortably host a dozen or so users. A large number of hosts +each serving the content of a relatively small number of users is a much +more robust and sustainable ecosystem than a small number of servers each +hosting hundreds or thousands of users! + +If you really want to write some software, a powerful tool for expanding +Geminispace could be a single piece of software which simultaneously +provides a Gemini server and a way for multiple users to easily manage the +content provided by said server, e.g. 
via an interactive web interface or +by sending emails full of content; Something like the Gemlog Blue and +Flounder services (see question 3.3 again), but packaged up and documented +in such a way that it's easy for people to deploy and customise their own +multiuser sites, much like e.g. a Mastodon instance. + +You can also help the project by contributing corrections and additions to +or translations of the official site and documentation (see questions 4.2 +and 4.3 below). ## 4.2 How do I contribute to the official Gemini site and documentation? -All the documentation hosted at gemini.circumlunar.space, including the FAQ you're reading now, lives in a single git repository, which has read-only access open to the public. You can clone the repo as follows: +All the documentation hosted at gemini.circumlunar.space, including the FAQ +you're reading now, lives in a single git repository, which has read-only +access open to the public. You can clone the repo as follows: > git clone git://gemini.circumlunar.space/gemini-site -Then, make your suggested changes to the relevant files (the structure of the URLs mirrors the structure of the repository exactly, so e.g. gemini://gemini.circumlunar.space/docs/faq.gmi lives at docs/faq.gmi in the repo). Commit your changes with meaningful commit messages (make sure to set your name and email address so people can see who did your work!), with one commit per conceptual change. You then have two options to send your work upstream. +Then, make your suggested changes to the relevant files (the structure of +the URLs mirrors the structure of the repository exactly, so e.g. +gemini://gemini.circumlunar.space/docs/faq.gmi lives at docs/faq.gmi in the +repo). Commit your changes with meaningful commit messages (make sure to +set your name and email address so people can see who did your work!), with +one commit per conceptual change. You then have two options to send your +work upstream. 
-If you have git's send-email command configured (see below for a link to a tutorial), you can email patches containing your commits towith a single command. Otherwise, you can simply run: +If you have git's send-email command configured (see below for a link to a +tutorial), you can email patches containing your commits to with a single command. Otherwise, you can simply run: > git format-patch origin -to create a set of patch files, which you can manually attach to an email using your ordinary mail client of choice. +to create a set of patch files, which you can manually attach to an email +using your ordinary mail client of choice. => https://git-send-email.io/ A friendly tutorial on configuring git send-email ## 4.3 I'd like to translate some Gemini documentation into my native language, how can I do that? -Thank you! Volunteering to translate documentation is a wonderful way to help the project. +Thank you! Volunteering to translate documentation is a wonderful way to +help the project. -To do so, first clone the git repository as described in question 4.2 above. Change into the `doc` directory of the repository, and create a new subdirectory with your language's two letter ISO 639-1 code, e.g. Finnish translations should live in `doc/fi/`, Japanese translations in `doc/jp/`, etc. You can find a complete list of codes at Wikipedia, linked below. If you are translating into a region-specific variant of a language, you can use RFC4646-style codes instead, e.g. pt-PT or pt-BR for the Portuguese as spoken in Portugal or Brazil, respectively. +To do so, first clone the git repository as described in question 4.2 above. +Change into the `doc` directory of the repository, and create a new +subdirectory with your language's two letter ISO 639-1 code, e.g. Finnish +translations should live in `doc/fi/`, Japanese translations in `doc/jp/`, +etc. You can find a complete list of codes at Wikipedia, linked below. 
If +you are translating into a region-specific variant of a language, you can +use RFC4646-style codes instead, e.g. pt-PT or pt-BR for the Portuguese as +spoken in Portugal or Brazil, respectively. => https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes List of language codes at Wikipedia -For each English file which lives in `doc` which you want to translate, create a corresponding file in your language's subdirectory. It's okay to change the file name as part of the translation, e.g. the German translation of `doc/specification.gmi` might be called `doc/spezifikation.gmi`. You can translate as many or as few of the files in `doc` as you have time and energy for. Don't be shy about submitting partial translations! Once somebody else who speaks your language sees your effort, they might provide some or all of the remaining work. Having some content translated is better than none. - -Once you're done, copy across the `doc/index.gmi` file and modify it to match your translated filenames and document titles, and remove links for any of the original documents which you haven't translated yet. - -Finally, update `doc/translations.gmi` to include a link to your new subdirectory. - -Commit your translations to the repository and send Solderpunk the patch as described in question 4.2 above. +For each English file which lives in `doc` which you want to translate, +create a corresponding file in your language's subdirectory. It's okay to +change the file name as part of the translation, e.g. the German +translation of `doc/specification.gmi` might be called +`doc/spezifikation.gmi`. You can translate as many or as few of the files +in `doc` as you have time and energy for. Don't be shy about submitting +partial translations! Once somebody else who speaks your language sees your +effort, they might provide some or all of the remaining work. Having some +content translated is better than none. 
+ +Once you're done, copy across the `doc/index.gmi` file and modify it to +match your translated filenames and document titles, and remove links for +any of the original documents which you haven't translated yet. + +Finally, update `doc/translations.gmi` to include a link to your new +subdirectory. + +Commit your translations to the repository and send Solderpunk the patch as +described in question 4.2 above. diff --git a/specification.gmi b/specification.gmi --- a/specification.gmi +++ b/specification.gmi @@ -4,19 +4,35 @@ v0.14.3, November 29th 2020 -This is an increasingly less rough sketch of an actual spec for Project Gemini. Although not finalised yet, further changes to the specification are likely to be relatively small. You can write code to this pseudo-specification and be confident that it probably won't become totally non-functional due to massive changes next week, but you are still urged to keep an eye on ongoing development of the protocol and make changes as required. +This is an increasingly less rough sketch of an actual spec for Project +Gemini. Although not finalised yet, further changes to the specification +are likely to be relatively small. You can write code to this +pseudo-specification and be confident that it probably won't become totally +non-functional due to massive changes next week, but you are still urged to +keep an eye on ongoing development of the protocol and make changes as +required. -This is provided mostly so that people can quickly get up to speed on what I'm thinking without having to read lots and lots of old phlog posts and keep notes. +This is provided mostly so that people can quickly get up to speed on what +I'm thinking without having to read lots and lots of old phlog posts and +keep notes. -Feedback on any part of this is extremely welcome, please email solderpunk@posteo.net. +Feedback on any part of this is extremely welcome, please email +solderpunk@posteo.net. 
# 1 Overview -Gemini is a client-server protocol featuring request-response transactions, broadly similar to gopher or HTTP. Connections are closed at the end of a single transaction and cannot be reused. When Gemini is served over TCP/IP, servers should listen on port 1965 (the first manned Gemini mission, Gemini 3, flew in March '65). This is an unprivileged port, so it's very easy to run a server as a "nobody" user, even if e.g. the server is written in Go and so can't drop privileges in the traditional fashion. +Gemini is a client-server protocol featuring request-response transactions, +broadly similar to gopher or HTTP. Connections are closed at the end of a +single transaction and cannot be reused. When Gemini is served over TCP/IP, +servers should listen on port 1965 (the first manned Gemini mission, Gemini +3, flew in March '65). This is an unprivileged port, so it's very easy to +run a server as a "nobody" user, even if e.g. the server is written in Go +and so can't drop privileges in the traditional fashion. ## 1.1 Gemini transactions -There is one kind of Gemini transaction, roughly equivalent to a gopher request or a HTTP "GET" request. Transactions happen as follows: +There is one kind of Gemini transaction, roughly equivalent to a gopher +request or a HTTP "GET" request. Transactions happen as follows: C: Opens connection S: Accepts connection @@ -31,21 +47,39 @@ C: Handles response (see 3.4) ## 1.2 Gemini URI scheme -Resources hosted via Gemini are identified using URIs with the scheme "gemini". This scheme is syntactically compatible with the generic URI syntax defined in RFC 3986, but does not support all components of the generic syntax. In particular, the authority component is allowed and required, but its userinfo subcomponent is NOT allowed. The host subcomponent is required. The port subcomponent is optional, with a default value of 1965. 
The path, query and fragment components are allowed and have no special meanings beyond those defined by the generic syntax. Spaces in gemini URIs should be encoded as %20, not +. +Resources hosted via Gemini are identified using URIs with the scheme +"gemini". This scheme is syntactically compatible with the generic URI +syntax defined in RFC 3986, but does not support all components of the +generic syntax. In particular, the authority component is allowed and +required, but its userinfo subcomponent is NOT allowed. The host +subcomponent is required. The port subcomponent is optional, with a default +value of 1965. The path, query and fragment components are allowed and have +no special meanings beyond those defined by the generic syntax. Spaces in +gemini URIs should be encoded as %20, not +. # 2 Gemini requests -Gemini requests are a single CRLF-terminated line with the following structure: +Gemini requests are a single CRLF-terminated line with the following +structure: <URL><CR><LF> -<URL> is a UTF-8 encoded absolute URL, including a scheme, of maximum length 1024 bytes. +<URL> is a UTF-8 encoded absolute URL, including a scheme, of maximum length +1024 bytes. -Sending an absolute URL instead of only a path or selector is effectively equivalent to building in a HTTP "Host" header. It permits virtual hosting of multiple Gemini domains on the same IP address. It also allows servers to optionally act as proxies. Including schemes other than "gemini" in requests allows servers to optionally act as protocol-translating gateways to e.g. fetch gopher resources over Gemini. Proxying is optional and the vast majority of servers are expected to only respond to requests for resources at their own domain(s). +Sending an absolute URL instead of only a path or selector is effectively +equivalent to building in a HTTP "Host" header. It permits virtual hosting +of multiple Gemini domains on the same IP address. It also allows servers +to optionally act as proxies.
Including schemes other than "gemini" in +requests allows servers to optionally act as protocol-translating gateways +to e.g. fetch gopher resources over Gemini. Proxying is optional and the +vast majority of servers are expected to only respond to requests for +resources at their own domain(s). # 3 Gemini responses -Gemini response consist of a single CRLF-terminated header line, optionally followed by a response body. +Gemini response consist of a single CRLF-terminated header line, optionally +followed by a response body. ## 3.1 Response headers @@ -53,144 +87,324 @@ Gemini response headers look like this: <STATUS><SPACE><META><CR><LF> -<STATUS> is a two-digit numeric status code, as described below in 3.2 and in Appendix 1. +<STATUS> is a two-digit numeric status code, as described below in 3.2 and +in Appendix 1. <SPACE> is a single space character, i.e. the byte 0x20. -<META> is a UTF-8 encoded string of maximum length 1024 bytes, whose meaning is <STATUS> dependent. +<META> is a UTF-8 encoded string of maximum length 1024 bytes, whose meaning +is <STATUS> dependent. <STATUS> and <META> are separated by a single space character. -If <STATUS> does not belong to the "SUCCESS" range of codes, then the server MUST close the connection after sending the header and MUST NOT send a response body. +If <STATUS> does not belong to the "SUCCESS" range of codes, then the server +MUST close the connection after sending the header and MUST NOT send a +response body. -If a server sends a <STATUS> which is not a two-digit number or a <META> which exceeds 1024 bytes in length, the client SHOULD close the connection and disregard the response header, informing the user of an error. +If a server sends a <STATUS> which is not a two-digit number or a <META> +which exceeds 1024 bytes in length, the client SHOULD close the connection +and disregard the response header, informing the user of an error. ## 3.2 Status codes Gemini uses two-digit numeric status codes. Related status codes share the same first digit.
Importantly, the first digit of Gemini status codes do not group codes into vague categories like "client error" and "server error" as per HTTP. Instead, the first digit alone provides enough information for a client to determine how to handle the response. By design, it is possible to write a simple but feature complete client which only looks at the first digit. The second digit provides more fine-grained information, for unambiguous server logging, to allow writing comfier interactive clients which provide a slightly more streamlined user interface, and to allow writing more robust and intelligent automated clients like content aggregators, search engine crawlers, etc. -The first digit of a response code unambiguously places the response into one of six categories, which define the semantics of the <META> line. +The first digit of a response code unambiguously places the response into +one of six categories, which define the semantics of the <META> line. ### 3.2.1 1x (INPUT) Status codes beginning with 1 are INPUT status codes, meaning: -The requested resource accepts a line of textual user input. The <META> line is a prompt which should be displayed to the user. The same resource should then be requested again with the user's input included as a query component. Queries are included in requests as per the usual generic URL definition in RFC3986, i.e. separated from the path by a ?. Reserved characters used in the user's input must be "percent-encoded" as per RFC3986, and space characters should also be percent-encoded. +The requested resource accepts a line of textual user input. The <META> +line is a prompt which should be displayed to the user. The same resource +should then be requested again with the user's input included as a query +component. Queries are included in requests as per the usual generic URL +definition in RFC3986, i.e. separated from the path by a ?.
Reserved +characters used in the user's input must be "percent-encoded" as per +RFC3986, and space characters should also be percent-encoded. ### 3.2.2 2x (SUCCESS) Status codes beginning with 2 are SUCCESS status codes, meaning: -The request was handled successfully and a response body will follow the response header. The <META> line is a MIME media type which applies to the response body. +The request was handled successfully and a response body will follow the +response header. The <META> line is a MIME media type which applies to the +response body. ### 3.2.3 3x (REDIRECT) Status codes beginning with 3 are REDIRECT status codes, meaning: -The server is redirecting the client to a new location for the requested resource. There is no response body. <META> is a new URL for the requested resource. The URL may be absolute or relative. The redirect should be considered temporary, i.e. clients should continue to request the resource at the original address and should not performance convenience actions like automatically updating bookmarks. There is no response body. +The server is redirecting the client to a new location for the requested +resource. There is no response body. <META> is a new URL for the requested +resource. The URL may be absolute or relative. The redirect should be +considered temporary, i.e. clients should continue to request the resource +at the original address and should not performance convenience actions like +automatically updating bookmarks. There is no response body. ### 3.2.4 4x (TEMPORARY FAILURE) Status codes beginning with 4 are TEMPORARY FAILURE status codes, meaning: -The request has failed. There is no response body. The nature of the failure is temporary, i.e. an identical request MAY succeed in the future. The contents of <META> may provide additional information on the failure, and should be displayed to human users. +The request has failed. There is no response body. The nature of the +failure is temporary, i.e. an identical request MAY succeed in the future.
+The contents of <META> may provide additional information on the failure, +and should be displayed to human users. ### 3.2.5 5x (PERMANENT FAILURE) Status codes beginning with 5 are PERMANENT FAILURE status codes, meaning: -The request has failed. There is no response body. The nature of the failure is permanent, i.e. identical future requests will reliably fail for the same reason. The contents of <META> may provide additional information on the failure, and should be displayed to human users. Automatic clients such as aggregators or indexing crawlers should not repeat this request. +The request has failed. There is no response body. The nature of the +failure is permanent, i.e. identical future requests will reliably fail for +the same reason. The contents of <META> may provide additional information +on the failure, and should be displayed to human users. Automatic clients +such as aggregators or indexing crawlers should not repeat this request. ### 3.2.6 6x (CLIENT CERTIFICATE REQUIRED) -Status codes beginning with 6 are CLIENT CERTIFICATE REQUIRED status codes, meaning: +Status codes beginning with 6 are CLIENT CERTIFICATE REQUIRED status codes, +meaning: -The requested resource requires a client certificate to access. If the request was made without a certificate, it should be repeated with one. If the request was made with a certificate, the server did not accept it and the request should be repeated with a different certificate. The contents of <META> (and/or the specific 6x code) may provide additional information on certificate requirements or the reason a certificate was rejected. +The requested resource requires a client certificate to access. If the +request was made without a certificate, it should be repeated with one. If +the request was made with a certificate, the server did not accept it and +the request should be repeated with a different certificate.
The contents +of <META> (and/or the specific 6x code) may provide additional information +on certificate requirements or the reason a certificate was rejected. ### 3.2.7 Notes -Note that for basic interactive clients for human use, errors 4 and 5 may be effectively handled identically, by simply displaying the contents of <META> under a heading of "ERROR". The temporary/permanent error distinction is primarily relevant to well-behaving automated clients. Basic clients may also choose not to support client-certificate authentication, in which case only four distinct status handling routines are required (for statuses beginning with 1, 2, 3 or a combined 4-or-5). +Note that for basic interactive clients for human use, errors 4 and 5 may be +effectively handled identically, by simply displaying the contents of <META> +under a heading of "ERROR". The temporary/permanent error distinction is +primarily relevant to well-behaving automated clients. Basic clients may +also choose not to support client-certificate authentication, in which case +only four distinct status handling routines are required (for statuses +beginning with 1, 2, 3 or a combined 4-or-5). -The full two-digit system is detailed in Appendix 1. Note that for each of the six valid first digits, a code with a second digit of zero corresponds is a generic status of that kind with no special semantics. This means that basic servers without any advanced functionality need only be able to return codes of 10, 20, 30, 40 or 50. +The full two-digit system is detailed in Appendix 1. Note that for each of +the six valid first digits, a code with a second digit of zero corresponds +is a generic status of that kind with no special semantics. This means that +basic servers without any advanced functionality need only be able to return +codes of 10, 20, 30, 40 or 50.
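As an illustration of the note above, a basic client can dispatch on the first digit of the status alone. The following is a minimal sketch, not part of the specification: the function name `classify_status` and the category labels are hypothetical, and it assumes the response header consists of a two-digit status, a single space, and the remainder of the line.

```python
def classify_status(header: str) -> tuple[str, str]:
    """Return (category, meta) for a response header, using only the first digit.

    Assumes the header looks like "<two digits><space><rest of line>".
    The category labels are illustrative names, not defined by the spec.
    """
    status, _, meta = header.rstrip("\r\n").partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError(f"malformed status: {status!r}")
    categories = {
        "1": "INPUT",
        "2": "SUCCESS",
        "3": "REDIRECT",
        "4": "ERROR",        # temporary failure, shown like any other error
        "5": "ERROR",        # permanent failure, shown like any other error
        "6": "CERTIFICATE",  # a basic client may simply report this as unsupported
    }
    if status[0] not in categories:
        raise ValueError(f"unknown status class: {status!r}")
    return categories[status[0]], meta
```

An automated client such as a crawler would need to keep 4x and 5x apart, since only the latter tells it not to repeat the request.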
-The Gemini status code system has been carefully designed so that the increased power (and correspondingly increased complexity) of the second digits is entirely "opt-in" on the part of both servers and clients. +The Gemini status code system has been carefully designed so that the +increased power (and correspondingly increased complexity) of the second +digits is entirely "opt-in" on the part of both servers and clients. ## 3.3 Response bodies -Response bodies are just raw content, text or binary, ala gopher. There is no support for compression, chunking or any other kind of content or transfer encoding. The server closes the connection after the final byte, there is no "end of response" signal like gopher's lonely dot. - -Response bodies only accompany responses whose header indicates a SUCCESS status (i.e. a status code whose first digit is 2). For such responses, <META> is a MIME media type as defined in RFC 2046. - -Internet media types are registered with a canonical form. Content transferred via Gemini MUST be represented in the appropriate canonical form prior to its transmission except for "text" types, as defined in the next paragraph. - -When in canonical form, media subtypes of the "text" type use CRLF as the text line break. Gemini relaxes this requirement and allows the transport of text media with plain LF alone (but NOT a plain CR alone) representing a line break when it is done consistently for an entire response body. Gemini clients MUST accept CRLF and bare LF as being representative of a line break in text media received via Gemini. - -If a MIME type begins with "text/" and no charset is explicitly given, the charset should be assumed to be UTF-8. Compliant clients MUST support UTF-8-encoded text/* responses. Clients MAY optionally support other encodings. Clients receiving a response in a charset they cannot decode SHOULD gracefully inform the user what happened instead of displaying garbage.
- -If <META> is an empty string, the MIME type MUST default to "text/gemini; charset=utf-8". The text/gemini media type is defined in section 5. +Response bodies are just raw content, text or binary, ala gopher. There is +no support for compression, chunking or any other kind of content or +transfer encoding. The server closes the connection after the final byte, +there is no "end of response" signal like gopher's lonely dot. + +Response bodies only accompany responses whose header indicates a SUCCESS +status (i.e. a status code whose first digit is 2). For such responses, +<META> is a MIME media type as defined in RFC 2046. + +Internet media types are registered with a canonical form. Content +transferred via Gemini MUST be represented in the appropriate canonical form +prior to its transmission except for "text" types, as defined in the next +paragraph. + +When in canonical form, media subtypes of the "text" type use CRLF as the +text line break. Gemini relaxes this requirement and allows the transport +of text media with plain LF alone (but NOT a plain CR alone) representing a +line break when it is done consistently for an entire response body. Gemini +clients MUST accept CRLF and bare LF as being representative of a line break +in text media received via Gemini. + +If a MIME type begins with "text/" and no charset is explicitly given, the +charset should be assumed to be UTF-8. Compliant clients MUST support +UTF-8-encoded text/* responses. Clients MAY optionally support other +encodings. Clients receiving a response in a charset they cannot decode +SHOULD gracefully inform the user what happened instead of displaying +garbage. + +If <META> is an empty string, the MIME type MUST default to "text/gemini; +charset=utf-8". The text/gemini media type is defined in section 5. ## 3.4 Response body handling -Response handling by clients should be informed by the provided MIME type information.
Gemini defines one MIME type of its own (text/gemini) whose handling is discussed below in section 5. In all other cases, clients should do "something sensible" based on the MIME type. Minimalistic clients might adopt a strategy of printing all other text/* responses to the screen without formatting and saving all non-text responses to the disk. Clients for unix systems may consult /etc/mailcap to find installed programs for handling non-text types. +Response handling by clients should be informed by the provided MIME type +information. Gemini defines one MIME type of its own (text/gemini) whose +handling is discussed below in section 5. In all other cases, clients +should do "something sensible" based on the MIME type. Minimalistic clients +might adopt a strategy of printing all other text/* responses to the screen +without formatting and saving all non-text responses to the disk. Clients +for unix systems may consult /etc/mailcap to find installed programs for +handling non-text types. # 4 TLS Use of TLS for Gemini transactions is mandatory. -Use of the Server Name Indication (SNI) extension to TLS is also mandatory, to facilitate name-based virtual hosting. +Use of the Server Name Indication (SNI) extension to TLS is also mandatory, +to facilitate name-based virtual hosting. ## 4.1 Version requirements -Servers MUST use TLS version 1.2 or higher and SHOULD use TLS version 1.3 or higher. TLS 1.2 is reluctantly permitted for now to avoid drastically reducing the range of available implementation libraries. Hopefully TLS 1.3 or higher can be specced in the near future. Clients who wish to be "ahead of the curve MAY refuse to connect to servers using TLS version 1.2 or lower. +Servers MUST use TLS version 1.2 or higher and SHOULD use TLS version 1.3 or +higher. TLS 1.2 is reluctantly permitted for now to avoid drastically +reducing the range of available implementation libraries. Hopefully TLS 1.3 +or higher can be specced in the near future. 
Clients who wish to be "ahead +of the curve MAY refuse to connect to servers using TLS version 1.2 or +lower. ## 4.2 Server certificate validation -Clients can validate TLS connections however they like (including not at all) but the strongly RECOMMENDED approach is to implement a lightweight "TOFU" certificate-pinning system which treats self-signed certificates as first- class citizens. This greatly reduces TLS overhead on the network (only one cert needs to be sent, not a whole chain) and lowers the barrier to entry for setting up a Gemini site (no need to pay a CA or setup a Let's Encrypt cron job, just make a cert and go). - -TOFU stands for "Trust On First Use" and is public-key security model similar to that used by OpenSSH. The first time a Gemini client connects to a server, it accepts whatever certificate it is presented. That certificate's fingerprint and expiry date are saved in a persistent database (like the .known_hosts file for SSH), associated with the server's hostname. On all subsequent connections to that hostname, the received certificate's fingerprint is computed and compared to the one in the database. If the certificate is not the one previously received, but the previous certificate's expiry date has not passed, the user is shown a warning, analogous to the one web browser users are shown when receiving a certificate without a signature chain leading to a trusted CA. - -This model is by no means perfect, but it is not awful and is vastly superior to just accepting self-signed certificates unconditionally. +Clients can validate TLS connections however they like (including not at +all) but the strongly RECOMMENDED approach is to implement a lightweight +"TOFU" certificate-pinning system which treats self-signed certificates as +first- class citizens. 
This greatly reduces TLS overhead on the network +(only one cert needs to be sent, not a whole chain) and lowers the barrier +to entry for setting up a Gemini site (no need to pay a CA or setup a Let's +Encrypt cron job, just make a cert and go). + +TOFU stands for "Trust On First Use" and is public-key security model +similar to that used by OpenSSH. The first time a Gemini client connects to +a server, it accepts whatever certificate it is presented. That +certificate's fingerprint and expiry date are saved in a persistent database +(like the .known_hosts file for SSH), associated with the server's hostname. +On all subsequent connections to that hostname, the received certificate's +fingerprint is computed and compared to the one in the database. If the +certificate is not the one previously received, but the previous +certificate's expiry date has not passed, the user is shown a warning, +analogous to the one web browser users are shown when receiving a +certificate without a signature chain leading to a trusted CA. + +This model is by no means perfect, but it is not awful and is vastly +superior to just accepting self-signed certificates unconditionally. ## 4.3 Client certificates -Although rarely seen on the web, TLS permits clients to identify themselves to servers using certificates, in exactly the same way that servers traditionally identify themselves to the client. Gemini includes the ability for servers to request in-band that a client repeats a request with a client certificate. This is a very flexible, highly secure but also very simple notion of client identity with several applications: - -* Short-lived client certificates which are generated on demand and deleted immediately after use can be used as "session identifiers" to maintain server-side state for applications. 
In this role, client certificates act as a substitute for HTTP cookies, but unlike cookies they are generated voluntarily by the client, and once the client deletes a certificate and its matching key, the server cannot possibly "resurrect" the same value later (unlike so-called "super cookies"). -* Long-lived client certificates can reliably identify a user to a multi-user application without the need for passwords which may be brute-forced. Even a stolen database table mapping certificate hashes to user identities is not a security risk, as rainbow tables for certificates are not feasible. -* Self-hosted, single-user applications can be easily and reliably secured in a manner familiar from OpenSSH: the user generates a self-signed certificate and adds its hash to a server-side list of permitted certificates, analogous to the .authorized_keys file for SSH). - -Gemini requests will typically be made without a client certificate. If a requested resource requires a client certificate and one is not included in a request, the server can respond with a status code of 60, 61 or 62 (see Appendix 1 below for a description of all status codes related to client certificates). A client certificate which is generated or loaded in response to such a status code has its scope bound to the same hostname as the request URL and to all paths below the path of the request URL path. E.g. if a request for gemini://example.com/foo returns status 60 and the user chooses to generate a new client certificate in response to this, that same certificate should be used for subsequent requests to gemini://example.com/foo, gemini://example.com/foo/bar/, gemini://example.com/foo/bar/baz, etc., until such time as the user decides to delete the certificate or to temporarily deactivate it. Interactive clients for human users are strongly recommended to make such actions easy and to generally give users full control over the use of client certificates. 
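The scoping rule for client certificates can be sketched as a simple check on hostname and path prefix. This is only an illustration under stated assumptions: the function name `cert_in_scope` is hypothetical, and "below the path" is read here as a match on whole path segments, so that /foobar is not treated as lying below /foo. The specification text does not spell out that detail.

```python
from urllib.parse import urlsplit


def cert_in_scope(activation_url: str, request_url: str) -> bool:
    """Return True if a certificate activated for activation_url applies to request_url.

    Assumption: scope is the same hostname plus the activation path and
    everything beneath it, compared segment by segment.
    """
    scope = urlsplit(activation_url)
    target = urlsplit(request_url)
    if scope.hostname != target.hostname:
        return False
    base = scope.path.rstrip("/")
    return target.path == base or target.path.startswith(base + "/")
```

With a certificate generated in response to a request for gemini://example.com/foo, this check accepts gemini://example.com/foo/bar/baz and rejects any URL on a different hostname.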
+Although rarely seen on the web, TLS permits clients to identify themselves +to servers using certificates, in exactly the same way that servers +traditionally identify themselves to the client. Gemini includes the +ability for servers to request in-band that a client repeats a request with +a client certificate. This is a very flexible, highly secure but also very +simple notion of client identity with several applications: + +* Short-lived client certificates which are generated on demand and deleted + immediately after use can be used as "session identifiers" to maintain + server-side state for applications. In this role, client certificates act + as a substitute for HTTP cookies, but unlike cookies they are generated + voluntarily by the client, and once the client deletes a certificate and + its matching key, the server cannot possibly "resurrect" the same value + later (unlike so-called "super cookies"). + +* Long-lived client certificates can reliably identify a user to a + multi-user application without the need for passwords which may be + brute-forced. Even a stolen database table mapping certificate hashes to + user identities is not a security risk, as rainbow tables for certificates + are not feasible. + +* Self-hosted, single-user applications can be easily and reliably secured + in a manner familiar from OpenSSH: the user generates a self-signed + certificate and adds its hash to a server-side list of permitted + certificates, analogous to the .authorized_keys file for SSH). + +Gemini requests will typically be made without a client certificate. If a +requested resource requires a client certificate and one is not included in +a request, the server can respond with a status code of 60, 61 or 62 (see +Appendix 1 below for a description of all status codes related to client +certificates). 
A client certificate which is generated or loaded in +response to such a status code has its scope bound to the same hostname as +the request URL and to all paths below the path of the request URL path. +E.g. if a request for gemini://example.com/foo returns status 60 and the +user chooses to generate a new client certificate in response to this, that +same certificate should be used for subsequent requests to +gemini://example.com/foo, gemini://example.com/foo/bar/, +gemini://example.com/foo/bar/baz, etc., until such time as the user decides +to delete the certificate or to temporarily deactivate it. Interactive +clients for human users are strongly recommended to make such actions easy +and to generally give users full control over the use of client +certificates. # 5 The text/gemini media type ## 5.1 Overview -In the same sense that HTML is the "native" response format of HTTP and plain text is the native response format of gopher, Gemini defines its own native response format - though of course, thanks to the inclusion of a MIME type in the response header Gemini can be used to serve plain text, rich text, HTML, Markdown, LaTeX, etc. +In the same sense that HTML is the "native" response format of HTTP and +plain text is the native response format of gopher, Gemini defines its own +native response format - though of course, thanks to the inclusion of a MIME +type in the response header Gemini can be used to serve plain text, rich +text, HTML, Markdown, LaTeX, etc. + +Response bodies of type "text/gemini" are a kind of lightweight hypertext +format, which takes inspiration from gophermaps and from Markdown. The +format permits richer typographic possibilities than the plain text of +Gopher, but remains extremely easy to parse. The format is line-oriented, +and a satisfactory rendering can be achieved with a single pass of a +document, processing each line independently. 
As per gopher, links can only +be displayed one per line, encouraging neat, list-like structure. + +Similar to how the two-digit Gemini status codes were designed so that +simple clients can function correctly while ignoring the second digit, the +text/gemini format has been designed so that simple clients can ignore the +more advanced features and still remain very usable. + +## 5.2 Parameters -Response bodies of type "text/gemini" are a kind of lightweight hypertext format, which takes inspiration from gophermaps and from Markdown. The format permits richer typographic possibilities than the plain text of Gopher, but remains extremely easy to parse. The format is line-oriented, and a satisfactory rendering can be achieved with a single pass of a document, processing each line independently. As per gopher, links can only be displayed one per line, encouraging neat, list-like structure. +As a subtype of the top-level media type "text", "text/gemini" inherits the +"charset" parameter defined in RFC 2046. However, as noted in 3.3, the +default value of "charset" is "UTF-8" for "text" content transferred via +Gemini. + +A single additional parameter specific to the "text/gemini" subtype is +defined: the "lang" parameter. The value of "lang" denotes the natural +language or language(s) in which the textual content of a "text/gemini" +document is written. The presence of the "lang" parameter is optional. +When the "lang" parameter is present, its interpretation is defined entirely +by the client. For example, clients which use text-to-speech technology to +make Gemini content accessible to visually impaired users may use the value +of "lang" to improve pronunciation of content. Clients which render text to +a screen may use the value of "lang" to determine whether text should be +displayed left-to-right or right-to-left. Simple clients for users who only +read languages written left-to-right may simply ignore the value of "lang". 
+When the "lang" parameter is not present, no default value should be assumed +and clients which require some notion of a language in order to process the +content (such as text-to-speech screen readers) should rely on user-input to +determine how to proceed in the absence of a "lang" parameter. + +Valid values for the "lang" parameter are comma-separated lists of one or +more language tags as defined in RFC4646. For example: -Similar to how the two-digit Gemini status codes were designed so that simple clients can function correctly while ignoring the second digit, the text/gemini format has been designed so that simple clients can ignore the more advanced features and still remain very usable. +* "text/gemini; lang=en" Denotes a text/gemini document written in English -## 5.2 Parameters +* "text/gemini; lang=fr" Denotes a text/gemini document written in French -As a subtype of the top-level media type "text", "text/gemini" inherits the "charset" parameter defined in RFC 2046. However, as noted in 3.3, the default value of "charset" is "UTF-8" for "text" content transferred via Gemini. +* "text/gemini; lang=en,fr" Denotes a text/gemini document written in a + mixture of English and French -A single additional parameter specific to the "text/gemini" subtype is defined: the "lang" parameter. The value of "lang" denotes the natural language or language(s) in which the textual content of a "text/gemini" document is written. The presence of the "lang" parameter is optional. When the "lang" parameter is present, its interpretation is defined entirely by the client. For example, clients which use text-to-speech technology to make Gemini content accessible to visually impaired users may use the value of "lang" to improve pronunciation of content. Clients which render text to a screen may use the value of "lang" to determine whether text should be displayed left-to-right or right-to-left. 
Simple clients for users who only read languages written left-to-right may simply ignore the value of "lang". When the "lang" parameter is not present, no default value should be assumed and clients which require some notion of a language in order to process the content (such as text-to-speech screen readers) should rely on user-input to determine how to proceed in the absence of a "lang" parameter. +* "text/gemini; lang=de-CH" Denotes a text/gemini document written in Swiss + German -Valid values for the "lang" parameter are comma-separated lists of one or more language tags as defined in RFC4646. For example: +* "text/gemini; lang=sr-Cyrl" Denotes a text/gemini document written in + Serbian using the Cyrllic script -* "text/gemini; lang=en" Denotes a text/gemini document written in English -* "text/gemini; lang=fr" Denotes a text/gemini document written in French -* "text/gemini; lang=en,fr" Denotes a text/gemini document written in a mixture of English and French -* "text/gemini; lang=de-CH" Denotes a text/gemini document written in Swiss German -* "text/gemini; lang=sr-Cyrl" Denotes a text/gemini document written in Serbian using the Cyrllic script -* "text/gemini; lang=zh-Hans-CN" Denotes a text/gemini document written in Chinese using the Simplified script as used in mainland China +* "text/gemini; lang=zh-Hans-CN" Denotes a text/gemini document written in + Chinese using the Simplified script as used in mainland China ## 5.3 Line-orientation -As mentioned, the text/gemini format is line-oriented. Each line of a text/gemini document has a single "line type". It is possible to unambiguously determine a line's type purely by inspecting its first three characters. A line's type determines the manner in which it should be presented to the user. Any details of presentation or rendering associated with a particular line type are strictly limited in scope to that individual line. - -There are 7 different line types in total. 
However, a fully functional and specification compliant Gemini client need only recognise and handle 4 of them - these are the "core line types", (see 5.4). Advanced clients can also handle the additional "advanced line types" (see 5.5). Simple clients can treat all advanced line types as equivalent to one of the core line types and still offer an adequate user experience. +As mentioned, the text/gemini format is line-oriented. Each line of a +text/gemini document has a single "line type". It is possible to +unambiguously determine a line's type purely by inspecting its first three +characters. A line's type determines the manner in which it should be +presented to the user. Any details of presentation or rendering associated +with a particular line type are strictly limited in scope to that individual +line. + +There are 7 different line types in total. However, a fully functional and +specification compliant Gemini client need only recognise and handle 4 of +them - these are the "core line types", (see 5.4). Advanced clients can +also handle the additional "advanced line types" (see 5.5). Simple clients +can treat all advanced line types as equivalent to one of the core line +types and still offer an adequate user experience. ## 5.4 Core line types @@ -198,21 +412,57 @@ The four core line types are: ### 5.4.1 Text lines -Text lines are the most fundamental line type - any line which does not match the definition of another line type defined below defaults to being a text line. The majority of lines in a typical text/gemini document will be text lines. - -Text lines should be presented to the user, after being wrapped to the appropriate width for the client's viewport (see below). Text lines may be presented to the user in a visually pleasing manner for general reading, the precise meaning of which is at the client's discretion. 
For example, variable width fonts may be used, spacing may be normalised, with spaces between sentences being made wider than spacing between words, and other such typographical niceties may be applied. Clients may permit users to customise the appearance of text lines by altering the font, font size, text and background colour, etc. Authors should not expect to exercise any control over the precise rendering of their text lines, only of their actual textual content. Content such as ASCII art, computer source code, etc. which may appear incorrectly when treated as such should be enclosed between preformatting toggle lines (see 5.4.3). - -Blank lines are instances of text lines and have no special meaning. They should be rendered individually as vertical blank space each time they occur. In this way they are analogous to <br/> tags in HTML. Consecutive blank lines should NOT be collapsed into a fewer blank lines. Note also that consecutive non-blank text lines do not form any kind of coherent unit or block such as a "paragraph": all text lines are independent entities. - -Text lines which are longer than can fit on a client's display device SHOULD be "wrapped" to fit, i.e. long lines should be split (ideally at whitespace or at hyphens) into multiple consecutive lines of a device-appropriate width. This wrapping is applied to each line of text independently. Multiple consecutive lines which are shorter than the client's display device MUST NOT be combined into fewer, longer lines. - -In order to take full advantage of this method of text formatting, authors of text/gemini content SHOULD avoid hard-wrapping to a specific fixed width, in contrast to the convention in Gopherspace where text is typically wrapped at 80 characters or fewer. Instead, text which should be displayed as a contiguous block should be written as a single long line. Most text editors can be configured to "soft-wrap", i.e. to write this kind of file while displaying the long lines wrapped at word boundaries to fit the author's display device. - -Authors who insist on hard-wrapping their content MUST be aware that the content will display neatly on clients whose display device is as wide as the hard-wrapped length or wider, but will appear with irregular line widths on narrower clients. +Text lines are the most fundamental line type - any line which does not +match the definition of another line type defined below defaults to being a +text line. The majority of lines in a typical text/gemini document will be +text lines. + +Text lines should be presented to the user, after being wrapped to the +appropriate width for the client's viewport (see below). Text lines may be +presented to the user in a visually pleasing manner for general reading, the +precise meaning of which is at the client's discretion.
For example, +variable width fonts may be used, spacing may be normalised, with spaces +between sentences being made wider than spacing between words, and other +such typographical niceties may be applied. Clients may permit users to +customise the appearance of text lines by altering the font, font size, text +and background colour, etc. Authors should not expect to exercise any +control over the precise rendering of their text lines, only of their actual +textual content. Content such as ASCII art, computer source code, etc. +which may appear incorrectly when treated as such should be enclosed between +preformatting toggle lines (see 5.4.3). + +Blank lines are instances of text lines and have no special meaning. They +should be rendered individually as vertical blank space each time they +occur. In this way they are analogous to <br/> tags in HTML. Consecutive +blank lines should NOT be collapsed into a fewer blank lines. Note also +that consecutive non-blank text lines do not form any kind of coherent unit +or block such as a "paragraph": all text lines are independent entities. + +Text lines which are longer than can fit on a client's display device SHOULD +be "wrapped" to fit, i.e. long lines should be split (ideally at whitespace +or at hyphens) into multiple consecutive lines of a device-appropriate +width. This wrapping is applied to each line of text independently. +Multiple consecutive lines which are shorter than the client's display +device MUST NOT be combined into fewer, longer lines. + +In order to take full advantage of this method of text formatting, authors +of text/gemini content SHOULD avoid hard-wrapping to a specific fixed width, +in contrast to the convention in Gopherspace where text is typically wrapped +at 80 characters or fewer. Instead, text which should be displayed as a +contiguous block should be written as a single long line. Most text editors +can be configured to "soft-wrap", i.e. to write this kind of file while +displaying the long lines wrapped at word boundaries to fit the author's +display device. + +Authors who insist on hard-wrapping their content MUST be aware that the +content will display neatly on clients whose display device is as wide as +the hard-wrapped length or wider, but will appear with irregular line widths +on narrower clients. ### 5.4.2 Link lines -Lines beginning with the two characters "=>" are link lines, which have the following syntax: +Lines beginning with the two characters "=>" are link lines, which have the +following syntax: ``` =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>] @@ -234,44 +484,105 @@ All the following examples are valid link lines: => gopher://example.org:70/1 A gopher link ``` -URLs in link lines must have reserved characters and spaces percent-encoded as per RFC 3986.
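A link line can be split into its URL and optional label with very little code. The sketch below is an illustration only, with a hypothetical function name `parse_link_line`. It assumes the URL is the first whitespace-delimited token after the "=>" marker and that any remaining text is the user-friendly link name; it does not validate the URL or its percent-encoding.

```python
from typing import Optional, Tuple


def parse_link_line(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (url, label) for a link line, or None if the line is not a link.

    Assumption: the URL contains no whitespace because spaces are
    percent-encoded, so the first token after "=>" is the whole URL.
    """
    if not line.startswith("=>"):
        return None
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return None  # "=>" with nothing after it carries no URL
    url = parts[0]
    label = parts[1] if len(parts) > 1 else None
    return url, label
```

A client that receives no label would typically display the URL itself in its place.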
+URLs in link lines must have reserved characters and spaces percent-encoded +as per RFC 3986. -Note that link URLs may have schemes other than gemini. This means that Gemini documents can simply and elegantly link to documents hosted via other protocols, unlike gophermaps which can only link to non-gopher content via a non-standard adaptation of the `h` item-type. +Note that link URLs may have schemes other than gemini. This means that +Gemini documents can simply and elegantly link to documents hosted via other +protocols, unlike gophermaps which can only link to non-gopher content via a +non-standard adaptation of the `h` item-type. -Clients can present links to users in whatever fashion the client author wishes, however clients MUST NOT automatically make any network connections as part of displaying links whose scheme corresponds to a network protocol (e.g. links beginning with gemini://, gopher://, https://, ftp:// , etc.). +Clients can present links to users in whatever fashion the client author +wishes, however clients MUST NOT automatically make any network connections +as part of displaying links whose scheme corresponds to a network protocol +(e.g. links beginning with gemini://, gopher://, https://, ftp:// , etc.). ### 5.4.3 Preformatting toggle lines -Any line whose first three characters are "```" (i.e. three consecutive back ticks with no leading whitespace) are preformatted toggle lines. These lines should NOT be included in the rendered output shown to the user. Instead, these lines toggle the parser between preformatted mode being "on" or "off". Preformatted mode should be "off" at the beginning of a document. The current status of preformatted mode is the only internal state a parser is required to maintain. When preformatted mode is "on", the usual rules for identifying line types are suspended, and all lines should be identified as preformatted text lines (see 5.4.4). 
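The single piece of parser state described above can be sketched as follows. This is a simplified illustration rather than a reference implementation: the function name `classify_lines` and the type labels are hypothetical, only the core line types are distinguished, and any advanced line type is treated as an ordinary text line.

```python
FENCE = "`" * 3  # the three-backtick preformatting toggle marker


def classify_lines(text: str) -> list:
    """Return (line_type, line) pairs for a text/gemini document.

    Toggle lines flip the preformatted flag and are not emitted, so any
    text on a toggle line is discarded by this simple client.
    """
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a trailing line break does not start another line
    preformatted = False  # preformatted mode is "off" at the start of a document
    result = []
    for raw in lines:
        line = raw[:-1] if raw.endswith("\r") else raw  # accept CRLF and bare LF
        if line.startswith(FENCE):
            preformatted = not preformatted
        elif preformatted:
            result.append(("preformatted", line))
        elif line.startswith("=>"):
            result.append(("link", line))
        else:
            result.append(("text", line))
    return result
```

Because the toggle check comes first and the usual rules are skipped while the flag is set, a line beginning with "=>" inside a preformatted block is reported as preformatted text rather than as a link.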
- -Preformatting toggle lines can be thought of as analogous to <pre> and </pre> tags in HTML. - -Any text following the leading "```" of a preformat toggle line which toggles preformatted mode on MAY be interpreted by the client as "alt text" pertaining to the preformatted text lines which follow the toggle line. Use of alt text is at the client's discretion, and simple clients may ignore it. Alt text is recommended for ASCII art or similar non-textual content which, for example, cannot be meaningfully understood when rendered through a screen reader or usefully indexed by a search engine. Alt text may also be used for computer source code to identify the programming language which advanced clients may use for syntax highlighting. - -Any text following the leading "```" of a preformat toggle line which toggles preformatted mode off MUST be ignored by clients. +Any line whose first three characters are "```" (i.e. three consecutive +back ticks with no leading whitespace) are preformatted toggle lines. These +lines should NOT be included in the rendered output shown to the user. +Instead, these lines toggle the parser between preformatted mode being "on" +or "off". Preformatted mode should be "off" at the beginning of a document. +The current status of preformatted mode is the only internal state a parser +is required to maintain. When preformatted mode is "on", the usual rules +for identifying line types are suspended, and all lines should be identified +as preformatted text lines (see 5.4.4). + +Preformatting toggle lines can be thought of as analogous to <pre> and +</pre> tags in HTML. + +Any text following the leading "```" of a preformat toggle line which +toggles preformatted mode on MAY be interpreted by the client as "alt text" +pertaining to the preformatted text lines which follow the toggle line. Use +of alt text is at the client's discretion, and simple clients may ignore it. 
+Alt text is recommended for ASCII art or similar non-textual content which, +for example, cannot be meaningfully understood when rendered through a +screen reader or usefully indexed by a search engine. Alt text may also be +used for computer source code to identify the programming language which +advanced clients may use for syntax highlighting. + +Any text following the leading "```" of a preformat toggle line which +toggles preformatted mode off MUST be ignored by clients. ### 5.4.4 Preformatted text lines -Preformatted text lines should be presented to the user in a "neutral", monowidth font without any alteration to whitespace or stylistic enhancements. Graphical clients should use scrolling mechanisms to present preformatted text lines which are longer than the client viewport, in preference to wrapping. In displaying preformatted text lines, clients should keep in mind applications like ASCII art and computer source code: in particular, source code in languages with significant whitespace (e.g. Python) should be able to be copied and pasted from the client into a file and interpreted/compiled without any problems arising from the client's manner of displaying them. +Preformatted text lines should be presented to the user in a "neutral", +monowidth font without any alteration to whitespace or stylistic +enhancements. Graphical clients should use scrolling mechanisms to present +preformatted text lines which are longer than the client viewport, in +preference to wrapping. In displaying preformatted text lines, clients +should keep in mind applications like ASCII art and computer source code: in +particular, source code in languages with significant whitespace (e.g. +Python) should be able to be copied and pasted from the client into a file +and interpreted/compiled without any problems arising from the client's +manner of displaying them. ## 5.5 Advanced line types -The following advanced line types MAY be recognised by advanced clients. 
Simple clients may treat them all as text lines as per 5.4.1 without any loss of essential function. +The following advanced line types MAY be recognised by advanced clients. +Simple clients may treat them all as text lines as per 5.4.1 without any +loss of essential function. ### 5.5.1 Heading lines -Lines beginning with "#" are heading lines. Heading lines consist of one, two or three consecutive "#" characters, followed by optional whitespace, followed by heading text. The number of # characters indicates the "level" of header; #, ## and ### can be thought of as analogous to <h1>, <h2> and <h3> in HTML. - -Heading text should be presented to the user, and clients MAY use special formatting, e.g. a larger or bold font, to indicate its status as a header (simple clients may simply print the line, including its leading #s, without any styling at all). However, the main motivation for the definition of heading lines is not stylistic but to provide a machine-readable representation of the internal structure of the document. Advanced clients can use this information to, e.g. display an automatically generated and hierarchically formatted "table of contents" for a long document in a side-pane, allowing users to easily jump to specific sections without excessive scrolling. CMS-style tools automatically generating menus or Atom/RSS feeds for a directory of text/gemini files can use first -heading in the file as a human-friendly title. +Lines beginning with "#" are heading lines. Heading lines consist of one, +two or three consecutive "#" characters, followed by optional whitespace, +followed by heading text. The number of # characters indicates the "level" +of header; #, ## and ### can be thought of as analogous to <h1>, <h2> and +<h3> in HTML. + +Heading text should be presented to the user, and clients MAY use special +formatting, e.g. a larger or bold font, to indicate its status as a header +(simple clients may simply print the line, including its leading #s, without +any styling at all). However, the main motivation for the definition of +heading lines is not stylistic but to provide a machine-readable +representation of the internal structure of the document. Advanced clients +can use this information to, e.g. display an automatically generated and +hierarchically formatted "table of contents" for a long document in a +side-pane, allowing users to easily jump to specific sections without +excessive scrolling. CMS-style tools automatically generating menus or +Atom/RSS feeds for a directory of text/gemini files can use first heading in +the file as a human-friendly title. ### 5.5.2 Unordered list items -Lines beginning with "* " are unordered list items. This line type exists purely for stylistic reasons. The * may be replaced in advanced clients by a bullet symbol. Any text after the "* " should be presented to the user as if it were a text line, i.e. wrapped to fit the viewport and formatted "nicely". Advanced clients can take the space of the bullet symbol into account when wrapping long list items to ensure that all lines of text corresponding to the item are offset an equal distance from the left of the screen. +Lines beginning with "* " are unordered list items. This line type exists +purely for stylistic reasons. The * may be replaced in advanced clients by +a bullet symbol. Any text after the "* " should be presented to the user as +if it were a text line, i.e. wrapped to fit the viewport and formatted +"nicely". Advanced clients can take the space of the bullet symbol into +account when wrapping long list items to ensure that all lines of text +corresponding to the item are offset an equal distance from the left of the +screen. 
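The line-type rules of sections 5.4 and 5.5 can be read as a small single-pass classifier. The following is a minimal Python sketch under stated assumptions, not part of the specification: the names `Line` and `parse_gemtext` are hypothetical, treating a link line with no URL as a plain text line is an assumption, and quote lines (defined in 5.5.3) are included only so the classifier covers every line type.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # the three back ticks that toggle preformatted mode


@dataclass
class Line:
    kind: str        # "text", "link", "pre", "heading", "list" or "quote"
    text: str        # text to show the user
    url: str = ""    # set for link lines only
    level: int = 0   # 1-3 for heading lines, 0 otherwise
    alt: str = ""    # alt text from the opening toggle line, for "pre" lines


def parse_gemtext(source: str) -> list[Line]:
    """Classify each line of a text/gemini document (hypothetical helper)."""
    result = []
    preformatted = False  # the only state the parser has to keep
    alt = ""
    for raw in source.splitlines():
        if raw.startswith(FENCE):
            # Toggle lines are never rendered. Text after the fence is alt
            # text when switching on and is ignored when switching off.
            preformatted = not preformatted
            alt = raw[3:].strip() if preformatted else ""
            continue
        if preformatted:
            # Usual line-type rules are suspended; keep the line untouched.
            result.append(Line("pre", raw, alt=alt))
        elif raw.startswith("=>"):
            parts = raw[2:].split(maxsplit=1)
            if not parts:
                # Assumption: no URL present, so fall back to a text line.
                result.append(Line("text", raw))
            else:
                label = parts[1].strip() if len(parts) > 1 else parts[0]
                result.append(Line("link", label, url=parts[0]))
        elif raw.startswith("#"):
            level = 1
            while level < 3 and raw[level:level + 1] == "#":
                level += 1
            result.append(Line("heading", raw[level:].strip(), level=level))
        elif raw.startswith("* "):
            result.append(Line("list", raw[2:]))
        elif raw.startswith(">"):
            result.append(Line("quote", raw[1:].lstrip()))
        else:
            result.append(Line("text", raw))
    return result
```

A simple client could ignore `kind` for everything except links and preformatted lines and print the rest as text, which matches the statement that advanced line types may be treated as text lines without loss of essential function.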
### 5.5.3 Quote lines -Lines beginning with ">" are quote lines. This line type exists so that advanced clients may use distinct styling to convey to readers the important semantic information that certain text is being quoted from an external source. For example, when wrapping long lines to the the viewport, each resultant line may have a ">" symbol placed at the front. +Lines beginning with ">" are quote lines. This line type exists so that +advanced clients may use distinct styling to convey to readers the important +semantic information that certain text is being quoted from an external +source. For example, when wrapping long lines to the the viewport, each +resultant line may have a ">" symbol placed at the front. # Appendix 1. Full two digit status codes @@ -281,7 +592,10 @@ As per definition of single-digit code 1 in 3.2. ## 11 SENSITIVE INPUT -As per status code 10, but for use with sensitive input such as passwords. Clients should present the prompt as per status code 10, but the user's input should not be echoed to the screen to prevent it being read by "shoulder surfers". +As per status code 10, but for use with sensitive input such as passwords. +Clients should present the prompt as per status code 10, but the user's +input should not be echoed to the screen to prevent it being read by +"shoulder surfers". ## 20 SUCCESS @@ -293,7 +607,15 @@ As per definition of single-digit code 3 in 3.2. ## 31 REDIRECT - PERMANENT -The requested resource should be consistently requested from the new URL provided in future. Tools like search engine indexers or content aggregators should update their configurations to avoid requesting the old URL, and end-user clients may automatically update bookmarks, etc. Note that clients which only pay attention to the initial digit of status codes will treat this as a temporary redirect. 
They will still end up at the right place, they just won't be able to make use of the knowledge that this redirect is permanent, so they'll pay a small performance penalty by having to follow the redirect each time. +The requested resource should be consistently requested from the new URL +provided in future. Tools like search engine indexers or content +aggregators should update their configurations to avoid requesting the old +URL, and end-user clients may automatically update bookmarks, etc. Note +that clients which only pay attention to the initial digit of status codes +will treat this as a temporary redirect. They will still end up at the +right place, they just won't be able to make use of the knowledge that this +redirect is permanent, so they'll pay a small performance penalty by having +to follow the redirect each time. ## 40 TEMPORARY FAILURE @@ -305,15 +627,19 @@ The server is unavailable due to overload or maintenance. (cf HTTP 503) ## 42 CGI ERROR -A CGI process, or similar system for generating dynamic content, died unexpectedly or timed out. +A CGI process, or similar system for generating dynamic content, died +unexpectedly or timed out. ## 43 PROXY ERROR -A proxy request failed because the server was unable to successfully complete a transaction with the remote host. (cf HTTP 502, 504) +A proxy request failed because the server was unable to successfully +complete a transaction with the remote host. (cf HTTP 502, 504) ## 44 SLOW DOWN -Rate limiting is in effect. <META> is an integer number of seconds which the client must wait before another request is made to this server. (cf HTTP 429) +Rate limiting is in effect. <META> is an integer number of seconds which +the client must wait before another request is made to this server. (cf +HTTP 429) ## 50 PERMANENT FAILURE @@ -321,19 +647,27 @@ As per definition of single-digit code 5 in 3.2. ## 51 NOT FOUND -The requested resource could not be found but may be available in the future. 
(cf HTTP 404) (struggling to remember this important status code? Easy: you can't find things hidden at Area 51!) +The requested resource could not be found but may be available in the +future. (cf HTTP 404) (struggling to remember this important status code? +Easy: you can't find things hidden at Area 51!) ## 52 GONE -The resource requested is no longer available and will not be available again. Search engines and similar tools should remove this resource from their indices. Content aggregators should stop requesting the resource and convey to their human users that the subscribed resource is gone. (cf HTTP 410) +The resource requested is no longer available and will not be available +again. Search engines and similar tools should remove this resource from +their indices. Content aggregators should stop requesting the resource and +convey to their human users that the subscribed resource is gone. (cf HTTP +410) ## 53 PROXY REQUEST REFUSED -The request was for a resource at a domain not served by the server and the server does not accept proxy requests. +The request was for a resource at a domain not served by the server and the +server does not accept proxy requests. ## 59 BAD REQUEST -The server was unable to parse the client's request, presumably due to a malformed request. (cf HTTP 400) +The server was unable to parse the client's request, presumably due to a +malformed request. (cf HTTP 400) ## 60 CLIENT CERTIFICATE REQUIRED @@ -341,8 +675,16 @@ As per definition of single-digit code 6 in 3.2. ## 61 CERTIFICATE NOT AUTHORISED -The supplied client certificate is not authorised for accessing the particular requested resource. The problem is not with the certificate itself, which may be authorised for other resources. +The supplied client certificate is not authorised for accessing the +particular requested resource. The problem is not with the certificate +itself, which may be authorised for other resources. 
## 62 CERTIFICATE NOT VALID -The supplied client certificate was not accepted because it is not valid. This indicates a problem with the certificate in and of itself, with no consideration of the particular requested resource. The most likely cause is that the certificate's validity start date is in the future or its expiry date has passed, but this code may also indicate an invalid signature, or a violation of a X509 standard requirements. The <META> should provide more information about the exact error. +The supplied client certificate was not accepted because it is not valid. +This indicates a problem with the certificate in and of itself, with no +consideration of the particular requested resource. The most likely cause +is that the certificate's validity start date is in the future or its expiry +date has passed, but this code may also indicate an invalid signature, or a +violation of a X509 standard requirements. The <META> should provide more +information about the exact error.
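
The appendix notes that a client which only looks at the first digit of a status code still behaves correctly, just with less precision (for example, treating 31 as a temporary redirect). A minimal Python sketch of that fallback follows. The function names, the `advanced` flag and the action strings are hypothetical and chosen only for illustration. The table of specific codes covers just a few of the codes described above.

```python
# Coarse behaviour keyed on the first digit of the status code.
FIRST_DIGIT_ACTIONS = {
    1: "prompt for input",
    2: "success",
    3: "follow redirect",
    4: "temporary failure",
    5: "permanent failure",
    6: "client certificate required",
}

# Finer behaviour an advanced client may choose for some two-digit codes.
SPECIFIC_ACTIONS = {
    11: "prompt for input without echo",
    31: "follow redirect and remember new URL",
    44: "wait before retrying",
    52: "gone: remove from indices",
    61: "certificate not authorised",
    62: "certificate not valid",
}


def action_for(status: int, advanced: bool = True) -> str:
    """Return an illustrative action label for a two-digit status code."""
    if not 10 <= status <= 69:
        raise ValueError(f"not a valid two-digit status code: {status}")
    if advanced and status in SPECIFIC_ACTIONS:
        return SPECIFIC_ACTIONS[status]
    # Simple clients, and codes without a specific entry, use the first digit.
    return FIRST_DIGIT_ACTIONS[status // 10]


def slow_down_seconds(meta: str) -> int:
    """For status 44, META is an integer number of seconds to wait."""
    return int(meta.strip())
```

For example, `action_for(31, advanced=False)` yields the same label as any other 3x code, which mirrors the small performance penalty described under 31: the redirect is followed, but the client does not learn that it is permanent.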