[Graph: a map of every page on the site, drawn as nodes, with each page connected to its nearest pages.]
*Note*: you can click on each page; all nodes are links!

This is an automatically generated graph containing all pages on my website, along with the connections calculated using sentence embeddings. If you're interested, you can read the source code.

How is this thing generated?

  1. All of the posts are fed through an embeddings generator; I'm using the Sentence Transformers Python library.
  2. The embeddings are passed to UMAP, a dimensionality reduction algorithm, which takes in high-dimensional embeddings and projects them down to a 2D representation that can be drawn as a graph. The projection is done so that the high-level "structure" of the data is preserved.
  3. I'm connecting each post with its top 2 nearest posts (using more clutters up the map).
  4. Graphviz renders the graph and outputs it as an SVG file.
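
Steps 3 and 4 can be sketched in plain Python. This is only an illustration, not the site's actual source code: the function names, the toy vectors and the toy coordinates are all made up, and using cosine similarity to pick the nearest posts is an assumption. In the real pipeline the vectors would come from Sentence Transformers and the 2D positions from UMAP.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_edges(embeddings, k=2):
    """Link every post to its k most similar posts; return undirected edges."""
    edges = set()
    for name, vec in embeddings.items():
        scored = [
            (cosine_similarity(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != name
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for _, other in scored[:k]:
            # Sort the pair so A--B and B--A collapse into one edge.
            edges.add(tuple(sorted((name, other))))
    return sorted(edges)


def quote(name):
    """Wrap a node name in double quotes for DOT, escaping inner quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(edges, positions):
    """Build Graphviz DOT source for an undirected graph with pinned nodes."""
    lines = ["graph posts {"]
    for name, (x, y) in positions.items():
        # The trailing "!" asks neato to keep the node at this position.
        lines.append(f'  {quote(name)} [pos="{x},{y}!"];')
    for a, b in edges:
        lines.append(f"  {quote(a)} -- {quote(b)};")
    lines.append("}")
    return "\n".join(lines)


# Toy stand-ins for real sentence embeddings (hypothetical values).
toy_embeddings = {
    "Piano": [0.9, 0.1, 0.0],
    "Listening": [0.8, 0.2, 0.1],
    "Music Transcribing": [0.85, 0.15, 0.05],
    "NixOs": [0.0, 0.9, 0.4],
    "Caddy": [0.1, 0.8, 0.5],
}

# Toy stand-ins for the 2D coordinates UMAP would produce (hypothetical values).
toy_positions = {
    "Piano": (0.0, 1.0),
    "Listening": (1.0, 1.5),
    "Music Transcribing": (0.5, 0.5),
    "NixOs": (4.0, 1.0),
    "Caddy": (4.5, 2.0),
}

edges = nearest_edges(toy_embeddings, k=2)
dot_source = to_dot(edges, toy_positions)
print(dot_source)
```

Saving the printed text to a file such as `posts.dot` and running `neato -Tsvg posts.dot -o posts.svg` should produce an SVG. Because the edges are stored as sorted pairs in a set, two posts that pick each other as neighbours are joined by a single line rather than two.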

A much better description is available on Simon Willison's blog.

Future plans

  1. Make this thing look more "map-alike", whatever that might mean.
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags.
  4. Introduce color-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Other connections generated by NLP
  5. Check out KagiSearch/vectordb