Website map

[Graph: one node per page of the website, with edges connecting pages the algorithm considers similar]
*Note*: you can click on each page; all nodes are links!

This is an automatically generated graph of all the pages on my website, with the connections between them chosen by a text vectorisation algorithm.

If you're interested, you can read the source code.

How is this thing generated?

TODO: update once I'm finished with UMAP

  1. All of the posts are fed through a stemming algorithm, which reduces each word to its root form (think "doing" -> "do", "derivation" -> "deriv").
  2. The stemmed posts are fed into a TF-IDF vectorizer, which assigns a unique index to each unique word and weights each word by how often it appears in a post, discounted by how many posts contain it. For example, the word "to" appears almost everywhere, so it probably won't be an important word, as opposed to "circumvolution" or "simulacrum". After this procedure, each post becomes a vector of numbers: the number of occurrences of each unique word found across all of the posts, multiplied by the weight of that word.
  3. Post similarity is measured with a metric called cosine similarity, calculated from the vectors obtained in step 2.
  4. I connect each post with its 3 most similar posts (using 4 or more leads to clutter).
  5. The graph is drawn by Graphviz, which generates the layout automatically.
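
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the actual generator behind this page: the sample posts, the crude suffix-stripping `stem` function and all function names are made up for the example, the weighting is the plain tf * log(N / df) variant of TF-IDF, and a real setup would more likely rely on a proper stemmer and a library vectorizer. The result is Graphviz DOT text.

```python
import math
import re
from collections import Counter

# Hypothetical toy posts; a real generator would read the site's own pages.
POSTS = {
    "Rust": "Writing audio synths in Rust. Rust makes DSP code safe.",
    "GPU Synth": "A synth running on the GPU: DSP kernels, audio output.",
    "NixOs": "Configuring my system with NixOs and declarative packages.",
    "My system configuration": "My system configuration: NixOs packages and dotfiles.",
}

# Crude stand-in for a real stemming algorithm (assumption for this sketch).
SUFFIXES = ("ation", "ing", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least two letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Step 1: lowercase, split into words, stem each word."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def tfidf_vectors(docs):
    """Step 2: map each title to a sparse {term: tf * idf} vector."""
    tokens = {title: tokenize(text) for title, text in docs.items()}
    # Document frequency: in how many posts does each term appear?
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    return {
        title: {t: c * math.log(n / df[t]) for t, c in Counter(toks).items()}
        for title, toks in tokens.items()
    }


def cosine(a, b):
    """Step 3: cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def top_similar(vectors, k=3):
    """Step 4: for each post, the titles of its k most similar other posts."""
    result = {}
    for title, vec in vectors.items():
        scored = [(cosine(vec, other), name)
                  for name, other in vectors.items() if name != title]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by name
        result[title] = [name for _, name in scored[:k]]
    return result


def to_dot(neighbours):
    """Step 5: emit an undirected Graphviz graph, one line per unique edge."""
    edges = sorted({tuple(sorted((a, b))) for a, bs in neighbours.items() for b in bs})
    lines = ["graph sitemap {"]
    lines += [f'  "{a}" -- "{b}";' for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # k=1 only because the toy corpus has four posts; the list above uses 3.
    print(to_dot(top_similar(tfidf_vectors(POSTS), k=1)))
```

Saving the printed text to a file and running it through Graphviz's `dot -Tsvg` should give a small drawing of the toy graph. The sketch deduplicates edges and does not escape double quotes in titles, both of which are simplifications.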

Future plans

  1. Make this thing look more map-like
  2. Experiment with text clustering & dimensionality reduction algorithms, such as:
    • t-SNE
    • K-means clustering
    • UMAP
    • Latent Dirichlet allocation
    • DBSCAN
  3. Add #tags.
  4. Introduce colour-coding and other visual markers, allowing viewers to make sense of the data based on different metrics:
    • Post tags
    • Links to/from other posts
    • Links outside (to the netsphere)
    • Last edited time (check pygit)
    • Other connections generated by NLP
  5. Generate a mini-map below each page, containing links to the posts "around" it.
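
One possible way to approach the mini-map idea is to treat "around" as "within a few edges of the page" in the existing graph. The sketch below assumes exactly that; the `EDGES` list is a small hand-picked sample of connections from the current graph, and `build_adjacency` and `mini_map` are hypothetical names, not part of the real generator.

```python
from collections import defaultdict

# A few edges from the current graph, hard-coded for illustration.
EDGES = [
    ("Luthier", "Rust"),
    ("Luthier", "Content creation workflow"),
    ("Content creation workflow", "Rust"),
    ("Bookmarks", "Content creation workflow"),
    ("About", "NixOs"),
]


def build_adjacency(edges):
    """Turn an undirected edge list into {page: set of neighbouring pages}."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # skip self-loops, they add nothing to a mini-map
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def mini_map(adjacency, page, hops=1):
    """Return the sorted titles of pages within `hops` edges of `page`."""
    seen = {page}
    frontier = {page}
    for _ in range(hops):
        # Expand one step outwards, keeping only pages not visited yet.
        frontier = {n for p in frontier for n in adjacency.get(p, ())} - seen
        seen |= frontier
    return sorted(seen - {page})


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    print(mini_map(adjacency, "Rust"))
    print(mini_map(adjacency, "Rust", hops=2))
```

With `hops=1` the mini-map is just the direct neighbours; raising it widens the neighbourhood at the cost of more clutter.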