Website map

[Interactive graph: every page of the website is a node, and edges connect each page to the pages most similar to it.]

Note: you can click on each page; all nodes are links!

This is an automatically generated graph of all the pages on my website, along with the connections chosen by a text vectorisation algorithm.

If you're interested, you can read the source code.

How is this thing generated?

TODO: update once I'm finished with UMAP

  1. All of the posts are fed through a stemming algorithm, which reduces each word to its root form (think "doing" -> "do", "derivation" -> "deriv").
  2. All the posts are then fed into a TF-IDF vectorizer, which assigns a unique index to each unique word and weights it by how often it appears in a post, discounted by how many posts contain it. For example, the word "to" probably won't be an important word, as opposed to "circumvolution" or "simulacrum". After this procedure, each post becomes a vector of numbers: the number of occurrences of each unique word found across all of the posts, multiplied by that word's weight.
  3. Post similarity is measured with a metric called cosine similarity, calculated from the vectors obtained in step 2.
  4. Each post is connected to its 3 most similar posts (using 4 or more leads to clutter).
  5. The graph is drawn by Graphviz, which generates the layout automatically.
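
The steps above can be sketched in a few dozen lines of Python. This is only a rough illustration, not the site's actual source code: the `stem` function is a toy suffix stripper standing in for a real stemmer, the smoothed IDF formula is one common variant and may differ from the one really used, and all function names and the sample post texts are made up for the example. The output is plain Graphviz DOT text, which can then be handed to a layout tool such as `dot` or `neato`.

```python
import math
import re
from collections import Counter


def stem(word):
    # Toy suffix stripper (assumption: a real stemmer such as Porter is used in practice).
    for suffix in ("ation", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, keep alphabetic runs only, then stem every word.
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def tfidf_vectors(posts):
    # posts: {title: text}. Returns {title: {term: weight}} as sparse vectors.
    counts = {title: Counter(tokenize(text)) for title, text in posts.items()}
    n = len(posts)
    # Document frequency: in how many posts does each term appear?
    df = Counter(term for c in counts.values() for term in c)
    # Smoothed IDF: terms found in most posts get a low weight, rare terms a high one.
    idf = {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}
    return {title: {t: tf * idf[t] for t, tf in c.items()} for title, c in counts.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def top_neighbours(vectors, k=3):
    # Connect every post to its k most similar posts; edges are undirected and deduplicated.
    edges = set()
    for title, vec in vectors.items():
        scored = sorted(
            ((cosine(vec, other), name) for name, other in vectors.items() if name != title),
            reverse=True,
        )
        for _, name in scored[:k]:
            edges.add(tuple(sorted((title, name))))
    return sorted(edges)


def to_dot(edges):
    # Emit an undirected Graphviz graph; double quotes in titles are escaped.
    def quote(s):
        return '"' + s.replace('"', '\\"') + '"'

    lines = ["graph website_map {"]
    lines += [f"    {quote(a)} -- {quote(b)};" for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical post texts, invented purely for this example.
posts = {
    "GPU Synth": "synth running audio shaders on the gpu",
    "Resources on audio & DSP": "audio dsp resources for synth design",
    "NixOs": "nixos system configuration and packages",
    "Caddy web server": "caddy web server configuration on nixos",
}
print(to_dot(top_neighbours(tfidf_vectors(posts), k=1)))
```

With only four sample posts, `k=1` keeps the toy graph readable; on the real site the same idea would run with `k=3`, as described in step 4.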

Future plans

  1. Make this thing look more map-like
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags.
  4. Introduce colour-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Last edited time (check pygit)
    • Other connections generated by NLP
  5. Generate a mini-map below each page, containing links to the posts "around" it.