
Guidelines For Building A Static Site Generator

Introduction

I plan to keep working on my static site generator for the next twenty years. This is the list of things I want to keep in mind should I ever decide to start over.

This is a brain dump. It's not ordered in a meaningful way, there's duplication, etc... Proceed accordingly.

The Guidelines


-- list

- Start with the templates (e.g. make a basic set to
produce a sample site and then send hard coded
data into them). The idea is to figure out the
API you want to use in the templates themselves

- Be able to access metadata about each level of 
content from all the levels above it. E.g. using
a template structure of

site_meta - page - sections - blocks - tokens 

The `blocks`` template should know that it's in
a page built from the `music`` template. I didn't
do this well, and it's the biggest thing I'm working
on changing.

There's not really a site_meta template. That's just
the metadata for the site that should be available 
to everything under it. That includes things like
the title. Also, things like collections of links

- `section`` level templates should be able to
include other sections.

- Stick with one way to interact with data in the 
templates (e.g. always use methods in minijinja
instead of sometimes methods and sometimes the
data directly. Doing so means there's less 
syntax to remember)

- Define what should be included in the raw content
source for every piece of content. I use: id, 
date, status, template.

- For any section type that can wrap other sections
create `section_name_start`` and `section_name_end``
templates. (This worked for me and I'm keeping
it in mind but depending on implementation might
not make much sense)

- Provide a way to create a unique ID for each 
piece of content that doesn't change even if the
content/title/whatever does

- Don't use anything other than the content's
ID when creating URLs. Using anything other than 
a base directory and the unique ID for a post adds 
a lot of work.

- Use KSUIDs for the IDs. They sort by date which is nice. 
(I actually clip mine down to eight characters because
that looks nicer, but the base is the same with the date
sort followed by some randomness)

- Don't try to categorize URLs. Just do everything
for the main content with something like "/pages/a1b2c3/"

- Have a config file pretty quickly

- Stop at the directory level for each page. That is,
do this "/pages/a1b2c3/" instead of "/pages/a1b2c3/index.html".
That'll let you change up the back end tech without having
to set up redirects for the URLs

- Suck in all the content via ASTs to start

- Load all the ASTs for all the content on the site before
doing any rendering. (i.e. create a universe that you can
pull stuff out of)

- Create templates at the page level, the component level, and
the token level (e.g. A page can have a title that shows up
in the metadata as well as an H1 on the page itself. By having
templates at the token level styling can be added to the H1
while keeping plain-text in the meta tag)

- Create a set of test templates with various sets of 
features in them that are independent from the templates
you use for the site output

- Make atom/rss feed templates at the same time you're making
the main HTML templates. (Helps ensure a more flexible 
structure than trying to bolt things on later)

- Don't compile templates into the builder. Make sure they
can be edited independently with no other code changes and
applied with only a build run

- Create an internal content tag to link pages by their ID
(e.g. I use `<<ilink|some text|a1b2c3>>``). That provides
auto linking to every page without having to use the URL.

- Be able to style content pieces

- Create content items in the AST with abandon

- Centralize on standard attributes for content items/components.
(e.g. `id`` always corresponds to an `id`` attribute
and is available in basically everything)

- Allow for custom attributes. Both `data-whatever`` as well
as completely custom things

- Make accessing individual parts of the content AST as
fine grained as possible and build up from there. (e.g.
a link token would be the href, the link text, and list
of attributes, each one of which is addressable individually)

- Do the Atom/RSS feed sooner rather than later. Like
right after you have the initial list of pages showing up
on the home page

- Get something showing up as soon as possible. Minimum setup
for me is a home page with no content other than a list of links
to individual pages and then the pages themselves. Once that
base is built, you can decide where to go from there

- Use an ID only style system for accessing images. e.g. require
only the file name and let the system find the file anywhere
inside a dumping ground. That'll let you organize things
in a way that makes sense but not have to worry about
paths when dropping stuff in

- Figure out a way to deal with alt text so you can reuse
the images in multiple places without having to enter
the same text multiple times

- Make sure your ASTs have a solid test suite

- Make multiple template types: e.g. home_page, post, 
feed_post.

- Be able to query the system to get any content inside
any template (sometimes it won't make sense and give
you broken looking stuff, but the goal is be able
to access anything you need without having to build
more cases on top of the existing code)

- Keep templates separate from categories. (e.g.
separate page level templates might exist for bookmarks,
videos, and posts. Categories might exist for Rust,
JavaScript, WebDev. You want to be able to tag
any type of content in any template with any
of the categories)

- By using only the unique ID you can shift stuff
between categories and/or templates without having
to worry about changing URLs

- Allow for pages to call custom things in the header
(external CSS, JS, etc...)

- Allow for defining explicit paths for key pages.
That is, not everything would be just an ID. Landing
pages for anything you want a named URL for should
be possible. e.g. every page should automatically have
its own ID, but you should be able to override the
URL to anything you want

- Make sure you can still link to pages internally
based off their ID even if they have a custom path.
(e.g. `<<ilink|some text|a1b2c3>>`` would point
to `/band-names/`` if the page had been updated to
use that path)

- Make sure other content can work on the site.
(e.g. if you want to throw a raw HTML page up with
a different foundation you should be able to just
throw it in its own directory and work on it
directly. The engine should leave it completely alone)

- Test ingestion of the content and building the AST
independently and first, before moving on to the
template output, which should be tested on its own.

- Make templates at the block level too. E.g. paragraphs
would have their own template. So, you could have something
like "body_paragraph" and "li_paragraph" that would have
the ability to add different default styles to them

- Provide a way to append classes to the template output. 
(e.g. if the default output for something is 
`<div class="alfaClass"></div>`` make sure you can add a 
class in the content to append to the output so you can 
get `<div class="alfaClass bravoClass"></div>``)

- Don't make things be in a required order in the output. 
This goes back to the AST. (e.g. be able to make 
reference sections anywhere in the content and then 
have them show up in their original position, or all
aggregated at the bottom depending on which template 
you use)

- Keep a single UUIDv4 for atom feed top level ID
that's hard coded (i.e. it's the ID of the feed itself)

- Use UUIDv5 based off the feed's UUIDv4 and the content
ID. That way it'll stay the same between builds and 
can be used to signal updates to readers. 

- Start by just doing a full site build every time
a change is made. Only add in incremental builds if
the times get longer than you'd like. (i.e. that's an 
optimization to add after the site is live)

- Figure out how you want to handle HTML escaping
in each template including code and pre sections

- Provide blurb functionality for each page that
can use explicit text or grab the first few lines.

- Automatically use a default template for everything but
allow overrides. (i.e. you shouldn't have to explicitly call
`template: default`` or whatever in the code every time.
Only set it if you want to use something else)

- Maintain a date object for every page that you can use
to generate any date format you want. (Dealing with
timezones is left as an exercise for the reader)

- Provide for including content generated at the
universal level above the pages via a single call
with parameters. e.g. on a post's page, be able
to call `site.category_links("music")`` to get that
data.

- Provide for internal reference links. e.g. 
be able to just drop in an id in a reference call
and have it automatically grab the title text, blurb,
and link for the page and drop it in. 

- Provide for a way to show pages in dev that don't
show up in prod. (e.g. make a drafts content type
with a listing page that only gets generated in 
dev)

- I haven't dealt with pagination yet. Notes
on that will happen when I get to it

- Store external IDs instead of URLs where possible.
E.g. with YouTube, I grab the ID and only store
that. I like being able to assemble the URL 
in the template instead of having it be what's
in the content

- Let pages be assembled from other pages. (e.g.
provide a way at the site level to create a page
showing the last two pages with a template type
of "music")

- Ensure there's a way to delete files when content
is removed from the content source. 

- Provide access to every section via `#`` links

- Provide a way to call external processes, send
data to them for processing, and drop the results
into the output

- Each template should have things it expects (e.g.
posts should have titles). Decide for each one
if the content should be skipped if it doesn't
contain them.

- Think about content files as individual items
instead of necessarily pages. e.g. there might
be a piece of content with just an image and the
base metadata in it. That could have its own
output page, but might only be part of
a different group.

- Provide for groups of content independent of
templates and categories. E.g. a "funk" group
for "music" template pages that builds a
collection page. This is independent from
the category of funk which would cross all
page types (e.g. "posts" and "bookmarks").

I haven't really started in on this one yet. 
It may not become a thing, but I keep running
into things where something like this would
be nice. 

- Create internal link references that 
auto pull the title for you too. Could
do something like `<<ilink|some text|a1b2c3>>`` for
linking the text "some text", and then use
this `<<ilink|EMPYT_SPACE_HERE_THAT_DOES_NOT_PARSE_YET|a1b2c3>>`` to default to just
grabbing the title of the page. 

- Make an auto reference link with title thing
like the internal ids. e.g. `-- iref`` with the
id that would just make a reference with that 
page title and blurb. Should be able to override
the blurb though

- Use HTML elements as the starting components
and token elements. (e.g. blockquotes, and asides)

- Allow override template from directly in the content. 
e.g. for outputting a code section, create a template
with and without line numbers

- Create a top level config page with metadata for 
the site

- Ability to add css and javascript in the content
that gets added directly into the head of the 
document (in addition to being able to call
external files)

- Include `span`` as a wrapper for text tokens

- Provide for data payload in content. E.g.
JSON blobs that would be processed by the 
template. 

- Provide a way to execute inline code on the
page to build out content sections. (e.g.
provide a `-- data`` section with a JSON
list of band names and then a `-- exe`` section
with some python code that runs to sort
the output)

I've got this working from inside my notes,
but not executing on build yet. It's one of
the key reasons I built all the stuff, so
that I can ensure that my code snippets
actually work when I publish them

- In the templates, wrap every section in
a top level element where custom classes
and attributes are applied.

- Be able to send content to dev and prod
and whatever else you want output 
directories

- When parsing things like class attributes,
make sure to split each class into its
own item so they can be parsed
individually

- Create a grammar for the parser to 
start with at the lowest level first.

- Create full inline token tags 
as well as shorthand ones. Identify
which one was used in the AST so
it can be used to output a valid
version of the source with the same
token set used

- Make an output that always shows the latest
edit at the same URL (e.g. `/last-edit``).
Make sure it doesn't show up in
the main feed outputs

- Set up flags to keep individual pages
out of lists and feeds

- Generate lists dynamically as their
own data feeds that can be used in
their own templates 

- Provide for both an originally
published and an updated date on pages

- Don't require any given section to exist
in the main content bin other than metadata 
which houses the ID.

- Allow files that are not generated by
the engine to be mixed into
the same directories as files that are.
(e.g. if a content page builds out to
`/pages/a1b2c3/`` allow for manually
dropping in a file at `/pages/a1b2c3/script.js``)

- Add link checking that happens after the build
(e.g. generate all the content so it's ready 
for deploy and then check everything after
that so the checks don't delay the build)
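
The internal link guidelines above (link by ID, keep default URLs to a base directory plus the ID, allow a custom path, fall back to the page title when no link text is given) can be sketched in a few lines of Python. This is only a rough illustration under assumptions: the `Page` class, `page_url`, and `resolve_ilinks` are hypothetical names, and treating an empty text slot as "use the title" is a guess at how that form might behave once it parses.

```python
import html
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical page record; the field names are assumptions."""
    id: str
    title: str
    custom_path: Optional[str] = None  # e.g. "/band-names/"


def page_url(page: Page) -> str:
    # Default URLs are a base directory plus the ID, stopping at the directory.
    return page.custom_path or f"/pages/{page.id}/"


# Matches <<ilink|some text|a1b2c3>>; the text slot may be empty.
ILINK = re.compile(r"<<ilink\|([^|>]*)\|([^|>]+)>>")


def resolve_ilinks(source: str, pages: dict[str, Page]) -> str:
    def replace(match: re.Match) -> str:
        text = match.group(1).strip()
        page_id = match.group(2).strip()
        page = pages.get(page_id)
        if page is None:
            # Leave unknown IDs untouched so a later link check can flag them.
            return match.group(0)
        label = text or page.title  # empty text falls back to the page title
        return f'<a href="{page_url(page)}">{html.escape(label)}</a>'

    return ILINK.sub(replace, source)
```

Because the content only ever stores the ID, giving a page a custom path later changes what `page_url` returns without touching any of the pages that link to it.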

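The two feed ID guidelines above (one hard coded UUIDv4 for the feed, UUIDv5 for each entry) map directly onto Python's standard `uuid` module. A minimal sketch, assuming a made-up feed UUID and a hypothetical `entry_id` helper; a real feed would generate its own value once with `uuid.uuid4()` and never change it.

```python
import uuid

# Hypothetical hard coded feed ID (a made-up UUIDv4 for illustration only).
FEED_ID = uuid.UUID("2f1d0c6e-8a4b-4c3d-9e7f-1a2b3c4d5e6f")


def entry_id(content_id: str) -> str:
    # uuid5 hashes the namespace UUID together with the name, so the same
    # feed ID and content ID always produce the same entry ID on every build.
    return uuid.uuid5(FEED_ID, content_id).urn


print(FEED_ID.urn)           # the feed's own ID
print(entry_id("a1b2c3"))    # stable ID for the page with content ID a1b2c3
```

The `urn:uuid:` form is used here because it is a common shape for Atom `<id>` values; that choice is an assumption, not something the guidelines require.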

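The guidelines above about appending classes to template output and splitting class attributes into individual items can be sketched like this. The function names `merge_classes` and `render_div` are hypothetical, and skipping duplicate class names is an added assumption rather than something the guidelines call for.

```python
import html


def merge_classes(default_classes: str, content_classes: str = "") -> str:
    # Split on whitespace so each class is its own item, then append the
    # classes from the content after the template defaults.
    merged: list[str] = []
    for name in default_classes.split() + content_classes.split():
        if name not in merged:  # assumption: drop repeated class names
            merged.append(name)
    return " ".join(merged)


def render_div(default_classes: str, content_classes: str = "") -> str:
    classes = html.escape(merge_classes(default_classes, content_classes), quote=True)
    return f'<div class="{classes}"></div>'


print(render_div("alfaClass"))
print(render_div("alfaClass", "bravoClass"))
```

Keeping the classes as separate items until the last step means a template can still inspect or filter them one at a time before they get joined back into a single attribute.
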
-- categories
-- WebDev 
-- Neopolitan
-- Neopoligen


-- metadata
-- date: 2024-01-02 12:09:08
-- id: 2aplckzd
-- site: aws
-- type: post
-- status: published