Map

[Graph: every page on the site drawn as a node, with lines connecting each page to the pages most similar to it.]
You can click on each node; they are all links!

This is an automatically generated graph containing all pages on my website, along with the connections calculated using sentence embeddings. If you're interested, you can read the source code.

How is this thing generated?

Explained non-technically

  1. Using an AI-esque tool, I'm generating a mathematical representation of what each page on my site contains.
  2. I'm laying out each page on a graph, so that it is placed close to pages with similar contents and far away from pages with different contents. E.g. programming-related stuff will be grouped together, far away from something travel-related.
  3. I'm drawing links between pages which are the closest. This also generates the "related posts" section at the bottom of each page. The drawn links only serve aesthetic purposes.
  4. Posts are colored depending on their relatedness to 3 topics:
    • More red: art-related
    • More green: computers-related
    • More blue: music-related
    • I'm working on better coloring algorithms based on various gradients

The gory technical details

  1. All of the posts are fed through an embeddings generator. I'm using the Sentence Transformers Python library.
  2. The embeddings are passed to UMAP, a dimensionality reduction algorithm, which takes in multi-dimensional embeddings and projects them down to a 2D representation, which can be drawn as a graph. The projection is done so that the high-level "structure" of the data is preserved (at least that's what the UMAP paper states; I'm no data scientist to argue with the experts).
  3. I'm connecting each post with its 2 nearest posts (using more than 2 clutters up the map).
  4. Coloring is done by calculating the cosine similarity between the post content embeddings and the embeddings of simple tag-based sentences, such as "music, melodies" or "art, beauty". Currently the gradient is dead simple: each similarity directly affects one of the R/G/B channels.
  5. Graphviz renders the graphs and outputs them as SVGs.
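
A minimal sketch of steps 3 to 5, not the site's actual source code. It assumes the embeddings already exist. Tiny hand-made 3D vectors stand in for the Sentence Transformers output here, and the UMAP layout is left out. All function names, the sample vectors and the "one topic per channel" mapping are hypothetical, made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_neighbours(embeddings, k=2):
    """Map each post title to the titles of its k most similar other posts."""
    neighbours = {}
    for title, vec in embeddings.items():
        scored = [
            (cosine_similarity(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != title
        ]
        scored.sort(reverse=True)  # highest similarity first
        neighbours[title] = [other for _, other in scored[:k]]
    return neighbours

def similarity_to_color(post_vec, topic_vecs):
    """Turn similarity to three topics (art, computers, music) into a hex RGB color."""
    channels = []
    for topic_vec in topic_vecs:
        sim = max(0.0, cosine_similarity(post_vec, topic_vec))  # clamp negatives to 0
        channels.append(round(sim * 255))
    return "#{:02x}{:02x}{:02x}".format(*channels)

def to_dot(embeddings, topic_vecs, k=2):
    """Build an undirected Graphviz DOT graph with colored nodes and neighbour edges.

    Assumes titles contain no double quotes (they would need escaping).
    """
    lines = ["graph site_map {"]
    for title, vec in embeddings.items():
        color = similarity_to_color(vec, topic_vecs)
        lines.append(f'  "{title}" [style=filled, fillcolor="{color}"];')
    seen = set()
    for title, others in top_k_neighbours(embeddings, k).items():
        for other in others:
            edge = tuple(sorted((title, other)))
            if edge not in seen:  # draw each undirected edge only once
                seen.add(edge)
                lines.append(f'  "{edge[0]}" -- "{edge[1]}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical stand-ins for real embeddings: axes are (art, computers, music).
POSTS = {
    "Rust": [0.1, 0.9, 0.3],
    "NixOs": [0.0, 1.0, 0.0],
    "Piano": [0.2, 0.1, 0.9],
    "Paintings": [1.0, 0.0, 0.1],
}
TOPICS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print(to_dot(POSTS, TOPICS))
```

The printed text can be saved to a file and handed to Graphviz, for example with `dot -Tsvg map.dot -o map.svg`. In the real pipeline the topic vectors would be embeddings of sentences like "music, melodies", so the similarities would rarely hit exactly 0 or 1.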

A much better description can be found on Simon Willison's blog.

Future plans

  1. Make this thing look more "map-like", whatever that might mean.
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags. Automatically assign posts to categories with cosine distances.
  4. Introduce color-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Other connections generated by NLP
  5. Check out KagiSearch/vectordb
  6. Color gradients with Python
  7. Circos
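
Plan 3 (assigning tags automatically with cosine distances) could look roughly like the sketch below. Nothing here exists on the site yet. The tag names, the toy vectors and the `max_distance` threshold are all assumptions picked for illustration, and a real version would use actual sentence embeddings for both the posts and the tags.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def assign_tags(post_vec, tag_vecs, max_distance=0.5):
    """Return every tag whose embedding lies within max_distance of the post.

    Tags are ordered from closest to farthest; ties are broken alphabetically.
    """
    scored = [(cosine_distance(post_vec, vec), tag) for tag, vec in tag_vecs.items()]
    return [tag for dist, tag in sorted(scored) if dist <= max_distance]

# Hypothetical tag embeddings; axes are (art, computers, music).
TAGS = {
    "art": [1.0, 0.0, 0.0],
    "computers": [0.0, 1.0, 0.0],
    "music": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(assign_tags([0.1, 0.9, 0.3], TAGS))  # a mostly computer-related post
    print(assign_tags([0.0, 0.7, 0.7], TAGS))  # a post halfway between two tags
```

A threshold lets a post carry several tags or none at all. Picking only the single closest tag would be simpler, but it would force every post into exactly one category.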