Map

[Graph: every page on the website drawn as a node, with lines connecting related pages.]
You can click on each node; they are all links!

This is an automatically generated graph containing all pages on my website, along with the connections calculated using sentence embeddings. If you're interested, you can read the source code.

How is this thing generated?

Explained non-technically

  1. Using an AI-esque tool, I'm generating a mathematical representation of what each page on my site contains.
  2. I'm laying out each page on a graph, so that it is placed close to pages with similar contents and far away from pages with different contents. E.g. programming-related stuff will be grouped together, far away from anything travel-related.
  3. I'm drawing links between each page and the pages closest to it. The same closeness data also generates the "related posts" section at the bottom of each page. The drawn links serve only aesthetic purposes.
  4. Posts are colored depending on their relatedness to 3 topics:
    • More red: art-related
    • More green: computers-related
    • More blue: music-related
    • I'm working on better coloring algorithms based on various gradients

The gory technical details

  1. All of the posts are fed through an embeddings generator; I'm using the Sentence Transformers Python library.
  2. The embeddings are passed to UMAP, a dimensionality reduction algorithm, which takes in multi-dimensional embeddings and projects them down to a 2D representation, which can be drawn as a graph. The projection is done so that the high-level "structure" of the data is preserved (at least that's what the UMAP paper states; I'm no data scientist to argue with the experts).
  3. I'm connecting each post to its 2 nearest posts (using more clutters up the map).
  4. Coloring is done by calculating the cosine similarity between the post content embeddings and the embeddings of simple tag-based sentences, such as "music, melodies" or "art, beauty". Currently the gradient is dead simple: the similarity to each topic directly affects the corresponding R, G or B channel.
  5. Graphviz renders the graphs and outputs them as SVGs.
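
Steps 3 and 4 above can be sketched in a few lines of plain Python. This is only an illustration, not the site's actual source: the toy 3-D vectors stand in for real sentence embeddings, the helper names (`nearest_posts`, `post_colour`) are made up, and I'm assuming that "nearest" means highest cosine similarity and that a similarity is clamped to 0..1 before being scaled to a 0..255 channel. The embedding and UMAP steps are left out.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_posts(embeddings, k=2):
    """Map each post title to the titles of its k most similar posts."""
    neighbours = {}
    for title, vec in embeddings.items():
        scored = [
            (cosine_similarity(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != title
        ]
        scored.sort(reverse=True)  # most similar first
        neighbours[title] = [other for _, other in scored[:k]]
    return neighbours

def post_colour(vec, topic_vectors):
    """Hex colour whose R/G/B channels follow similarity to three topics."""
    channels = []
    for topic in ("art", "computers", "music"):  # red, green, blue
        sim = cosine_similarity(vec, topic_vectors[topic])
        # Assumption: clamp to 0..1, then scale linearly to 0..255.
        channels.append(round(255 * max(0.0, min(1.0, sim))))
    return "#{:02x}{:02x}{:02x}".format(*channels)

# Hypothetical toy data; real embeddings have hundreds of dimensions.
posts = {
    "Paintings": [0.9, 0.1, 0.2],
    "NixOs": [0.1, 0.9, 0.1],
    "Rust": [0.2, 0.8, 0.3],
    "Piano": [0.2, 0.1, 0.9],
}
topics = {
    "art": [1.0, 0.0, 0.0],
    "computers": [0.0, 1.0, 0.0],
    "music": [0.0, 0.0, 1.0],
}

edges = nearest_posts(posts, k=2)
colours = {title: post_colour(vec, topics) for title, vec in posts.items()}
```

With this toy data, "Rust" ends up linked to "NixOs" and "Piano", and "Paintings" comes out mostly red. Raising `k` adds more edges per node, which is exactly what clutters the map.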

A much better description of a similar idea can be found on Simon Willison's blog.

Future plans

  1. Make this thing look more "map-like", whatever that might mean.
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags. Automatically assign posts to categories with cosine distances.
  4. Introduce color-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Other connections generated by NLP
  5. Check out KagiSearch/vectordb
  6. Color gradients with Python
  7. Circos
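
Plan 3 above (assigning posts to #tags with cosine distances) could look roughly like the sketch below. Nothing here exists on the site yet: the `assign_tags` helper, the toy vectors and the `max_distance=0.5` cut-off are all assumptions picked for illustration, and a real version would compare sentence embeddings of posts and tag descriptions instead.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def assign_tags(post_vec, tag_vectors, max_distance=0.5):
    """Return the tags within max_distance of the post, closest first."""
    scored = sorted(
        (cosine_distance(post_vec, vec), tag)
        for tag, vec in tag_vectors.items()
    )
    return [tag for dist, tag in scored if dist <= max_distance]

# Hypothetical toy vectors standing in for real embeddings.
tag_vectors = {
    "#art": [1.0, 0.0, 0.0],
    "#computers": [0.0, 1.0, 0.0],
    "#music": [0.0, 0.0, 1.0],
}
custom_synth = [0.1, 0.8, 0.6]  # mostly computers, a good deal of music

tags = assign_tags(custom_synth, tag_vectors)
```

Here the post gets both `#computers` and `#music`, in that order, while `#art` falls outside the cut-off. The threshold would need tuning against real embeddings, since their cosine distances tend to sit in a much narrower range than these toy values.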