[Graph: a map of the pages on this site. Each node is a page, and each edge links a page to one of its nearest neighbours.]
*Note*: you can click on each page; all nodes are links!

This is an automatically generated graph containing all pages on my website, along with the connections between them, calculated using sentence embeddings. If you're interested, you can read the source code.

How is this thing generated?

  1. All of the posts are fed through an embeddings generator. I'm using the Sentence Transformers Python library.
  2. The embeddings are passed to UMAP, a dimensionality reduction algorithm, which takes in multi-dimensional embeddings and projects them down to a 2D representation that can be drawn as a graph. The projection is done so that the high-level "structure" of the data is preserved.
  3. I'm connecting each post with its top 2 nearest posts (using more than 2 clutters up the map).
  4. Graphviz renders the graph and outputs it as an SVG file.
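
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration and not the actual source code of this site. It assumes the embeddings from step 1 are already available as plain lists of floats (the vectors below are made-up toy values), it assumes "nearest" means highest cosine similarity between embeddings, and the helper names `cosine_similarity`, `nearest_neighbours` and `to_dot` are hypothetical. The optional `positions` argument stands in for the 2D coordinates that UMAP would produce in step 2.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbours(embeddings, k=2):
    """Map each post title to the titles of its k most similar other posts."""
    neighbours = {}
    for title, vector in embeddings.items():
        scored = [
            (cosine_similarity(vector, other_vector), other_title)
            for other_title, other_vector in embeddings.items()
            if other_title != title
        ]
        scored.sort(reverse=True)  # most similar first
        neighbours[title] = [other_title for _, other_title in scored[:k]]
    return neighbours


def quote(title):
    """Quote a title for use as a DOT identifier."""
    return '"' + title.replace('"', '\\"') + '"'


def to_dot(neighbours, positions=None):
    """Build DOT source for an undirected graph of posts.

    positions (assumption): optional {title: (x, y)} coordinates, written
    as pinned `pos` attributes that layout engines such as neato can use.
    """
    lines = ["graph posts {"]
    for title in neighbours:
        attrs = ""
        if positions and title in positions:
            x, y = positions[title]
            attrs = f' [pos="{x},{y}!"]'
        lines.append(f"  {quote(title)}{attrs};")
    for title, others in neighbours.items():
        for other in others:
            lines.append(f"  {quote(title)} -- {quote(other)};")
    lines.append("}")
    return "\n".join(lines)


# Toy, made-up embeddings standing in for the Sentence Transformers output.
embeddings = {
    "Rust": [0.9, 0.1, 0.0],
    "Custom synth": [0.8, 0.3, 0.1],
    "Custom sequencer": [0.7, 0.2, 0.2],
    "Travel": [0.0, 0.1, 0.9],
}

neighbours = nearest_neighbours(embeddings, k=2)
dot_source = to_dot(neighbours)
print(dot_source)
```

Saved to a file such as `posts.dot`, the output could then be rendered with the Graphviz command line, for example `neato -Tsvg posts.dot -o posts.svg`. Because every post contributes its own two edges, a pair of posts that pick each other ends up connected twice, which a real implementation might want to deduplicate.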

A much better description can be found on Simon Willison's blog.

Future plans

  1. Make this thing look more "map-like", whatever that might mean.
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags.
  4. Introduce colour-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Other connections generated by NLP
  5. Check out KagiSearch/vectordb