Map

[Graph: every page on the website drawn as a node, with edges connecting related pages]
You can click on each node: they are links!

This is an automatically generated graph containing all pages on my website, along with the connections calculated using sentence embeddings. If you're interested, you can read the source code.

How is this thing generated?

Explained non-technically

  1. Using an AI-esque tool, I'm generating a mathematical representation of what each page on my site contains.
  2. I'm laying out each page on a graph, so that it is placed close to pages with similar contents and far away from pages with different contents. E.g. programming-related stuff will be grouped together, far away from something travel-related.
  3. I'm drawing links between pages which are the closest. This also generates the "related posts" section at the bottom of each page. The drawn links only serve aesthetic purposes.
  4. Posts are colored depending on their relatedness to 3 topics:
    • More red: art-related
    • More green: computer-related
    • More blue: music-related
    • I'm working on better coloring algorithms based on various gradients

The gory technical details

  1. All of the posts are fed through an embeddings generator; I'm using the Sentence Transformers Python library.
  2. The embeddings are passed to UMAP, a dimensionality reduction algorithm, which takes the multi-dimensional embeddings and projects them down to a 2D representation that can be drawn as a graph. The projection is done so that the high-level "structure" of the data is preserved (at least that's what the UMAP paper states; I'm no data scientist to argue with the experts).
  3. I'm connecting each post with its 2 nearest posts (using more clutters up the map).
  4. Coloring is done by calculating cosine similarity between the post content embeddings and embeddings of simple tag-based sentences, such as "music, melodies" or "art, beauty". Currently the gradient is dead-simple: each similarity directly affects the corresponding R/G/B channel.
  5. Graphviz renders the graphs and outputs them as SVGs.
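
Steps 3 to 5 can be sketched in plain Python. This is only an illustration under stated assumptions, not the site's actual source: the tiny 3-dimensional vectors stand in for real Sentence Transformers embeddings, the 2D positions that UMAP would supply are left out, and the names `cosine_similarity`, `nearest_posts`, `rgb_hex` and `to_dot` are made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_posts(embeddings, k=2):
    """Map each post title to its k most similar other posts."""
    neighbours = {}
    for title, vector in embeddings.items():
        others = [t for t in embeddings if t != title]
        others.sort(key=lambda t: cosine_similarity(vector, embeddings[t]),
                    reverse=True)
        neighbours[title] = others[:k]
    return neighbours

def rgb_hex(vector, topic_vectors):
    """Turn similarity to the (art, computers, music) topics into a colour."""
    channels = []
    for topic in topic_vectors:
        # Negative similarities are clamped to 0 so each channel stays in 0..255.
        similarity = max(0.0, cosine_similarity(vector, topic))
        channels.append(round(similarity * 255))
    return "#{:02x}{:02x}{:02x}".format(*channels)

def to_dot(embeddings, topic_vectors, k=2):
    """Build Graphviz DOT source: one coloured node per post, up to k edges each."""
    lines = ["graph site {", "  node [style=filled];"]
    for title, vector in embeddings.items():
        colour = rgb_hex(vector, topic_vectors)
        lines.append(f'  "{title}" [fillcolor="{colour}"];')
    seen = set()
    for title, close in nearest_posts(embeddings, k).items():
        for other in close:
            edge = tuple(sorted((title, other)))
            if edge not in seen:  # undirected graph: skip the reverse duplicate
                seen.add(edge)
                lines.append(f'  "{edge[0]}" -- "{edge[1]}";')
    lines.append("}")
    return "\n".join(lines)

# Toy stand-ins for real embeddings (assumed values, for illustration only).
posts = {
    "Paintings": [0.9, 0.1, 0.2],
    "Rust": [0.1, 0.9, 0.3],
    "NixOs": [0.2, 0.8, 0.1],
    "Piano": [0.2, 0.2, 0.9],
}
topics = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # art, computers, music

dot_source = to_dot(posts, topics)
print(dot_source)
```

Feeding the printed text to Graphviz (for instance `dot -Tsvg`) renders an SVG. In this sketch Graphviz chooses the layout itself, and titles containing double quotes would need escaping first.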

A much better description of a similar idea can be found on Simon Willison's blog.

Future plans

  1. Make this thing look more "map-like", whatever that might mean.
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags. Automatically assign posts to categories with cosine distances.
  4. Introduce color-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Other connections generated by NLP
  5. Check out KagiSearch/vectordb
  6. Color gradients with Python
  7. Circos
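
Plan 3 (automatic tagging by cosine distance) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names `cosine_distance` and `assign_tags`, the `max_distance` threshold of 0.4, and the toy 3-dimensional vectors that stand in for real embeddings of posts and tag sentences.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 means same direction, 1.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def assign_tags(post_vector, tag_vectors, max_distance=0.4):
    """Return every tag within max_distance of the post, closest first."""
    close = [tag for tag, vector in tag_vectors.items()
             if cosine_distance(post_vector, vector) <= max_distance]
    return sorted(close, key=lambda tag: cosine_distance(post_vector, tag_vectors[tag]))

# Hypothetical tag embeddings and one post that mixes computers and music.
tag_vectors = {
    "#art": [1, 0, 0],
    "#computers": [0, 1, 0],
    "#music": [0, 0, 1],
}
post = [0.1, 0.7, 0.7]

print(assign_tags(post, tag_vectors))
```

A post can receive several tags or none at all; the threshold would need tuning against real embeddings.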