Giving LLMs Psychedelics

April 1, 2026

Digital brain visualization

Most LLM interpretability research tries to make models more consistent and lower variance. But what if we want our LLMs to actually be good at brainstorming, i.e. higher variance?

Empirically, one of the main failure modes of agent swarms today is that LLMs don't come up with more ideas when you throw 1k agents at a problem, even with different system prompts. But we know 1k people would have a ton of different ideas; the variance in behavior is obviously much, much higher in people.

What if we want the Beatles or Steve Jobs of LLMs, not the robot? Temperature only intervenes at the output logits, but by that point the novel ideas have already been compressed away to 0 in the weights.

Some psychedelic inference-time LLM manipulation ideas

(Claude told me no one has tried most of these[?])

1. Super Subconscious

Psychedelics cause thalamic gating collapse, allowing more ideas to go from the subconscious to the conscious brain.

The equivalent is an inverted DoLa layer. Chuang et al. (2024) and Banerjee et al. (2025) showed that Decoding by Contrasting Layers (DoLa) at inference time suppresses divergent signal from earlier layers. Reversing the contrast would preserve more of the thoughts from the earlier layers that get forcibly suppressed by later layers.
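A minimal numpy sketch of the flipped contrast, assuming we already have vocab logits projected from an early layer and from the final layer. The `alpha` knob is made up here; `alpha=0` is vanilla decoding, and standard DoLa corresponds to pushing the contrast in the opposite direction.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax."""
    z = logits - logits.max()
    return z - np.log(np.exp(z).sum())

def inverted_dola(early_logits, final_logits, alpha=1.0):
    """Standard DoLa contrasts the final layer against an early
    ("premature") layer, suppressing what the early layer believed.
    Here the contrast is flipped: boost what the early layer believed
    before later layers suppressed it. `alpha` is a hypothetical knob."""
    lp_early = log_softmax(np.asarray(early_logits, dtype=float))
    lp_final = log_softmax(np.asarray(final_logits, dtype=float))
    return lp_final + alpha * (lp_early - lp_final)

# Toy 3-token vocab: the early layer likes token 2, the final
# layer suppresses it in favor of token 0.
early = np.array([1.0, 0.0, 2.0])
final = np.array([3.0, 0.0, 0.5])

assert inverted_dola(early, final, alpha=0.0).argmax() == 0  # vanilla pick
assert inverted_dola(early, final, alpha=1.0).argmax() == 2  # early-layer pick
```

In a real model you would read the early-layer logits off the residual stream via the unembedding (logit lens style) rather than receiving them directly.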

2. Idea Connector

Psychedelics cause hyperassociative thinking, which increases semantic network activation. So thinking "dog" would also make you think "evolution from wolves, loyalty" instead of just "fur, pet".

The equivalent is feed-forward network gate widening. Research on gate activations runs entirely in the opposite direction: activating more sparsely so the network gets faster. But what if we want to make these weird (and usually irrelevant) connections between thoughts?
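A toy sketch of what widening might look like, assuming a SwiGLU-style gated FFN with ReLU standing in for the usual smooth activation. The `widen` knob is hypothetical: shifting the gate pre-activation upward lets marginal neurons fire that would normally stay silent.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gated_ffn(x, w_gate, w_up, w_down, widen=0.0):
    """Toy gated FFN (SwiGLU-style shape, ReLU for clarity).
    `widen` is a hypothetical knob that opens more of the gates,
    mixing in weakly-related hidden features."""
    gate = relu(w_gate @ x + widen)
    return w_down @ (gate * (w_up @ x))

# The effect on sparsity, shown on explicit gate pre-activations:
pre = np.array([-3.0, -1.5, -0.5, 0.2, 1.0])
assert int((relu(pre) > 0).sum()) == 2        # normal: 2 of 5 neurons fire
assert int((relu(pre + 1.0) > 0).sum()) == 3  # widened: 3 of 5 fire
```

The newly opened neurons contribute small, off-topic features to the output, which is exactly the "dog → evolution from wolves" flavor of association we're after.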

3. Dreamy

Psychedelics increase neural entropy at higher cortical layers, causing variance in thinking about abstract beliefs and habits.

The equivalent would be to inject noise vectors into some of the middle layers of the LLM, then have humans evaluate which ones make the model more of a vibe. Cluster the noise patterns into reusable "moods".
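A sketch of what a reusable mood could be, assuming a mood is just a seeded noise vector added to one middle layer's hidden states. The `scale` knob is made up; in practice you'd sweep seeds and scales and keep whichever ones humans rate as interestingly weird.

```python
import numpy as np

def mood_vector(seed, d_model, scale=0.1):
    """A reusable "mood": a fixed noise vector keyed by its seed.
    `scale` is a hypothetical knob controlling how hard it trips."""
    return np.random.default_rng(seed).normal(scale=scale, size=d_model)

def apply_mood(hidden, mood):
    """Add the same mood vector to every token's hidden state in one
    middle layer (hidden: [seq_len, d_model])."""
    return hidden + mood

hidden = np.zeros((3, 8))                 # toy hidden states
dreamy = mood_vector(seed=42, d_model=8)
out = apply_mood(hidden, dreamy)

# The same seed reproduces the same mood exactly, so moods can be
# clustered, named, and reused across prompts.
assert np.allclose(out - hidden, mood_vector(seed=42, d_model=8))
```

In a real model this would hang off a forward hook on the chosen layer rather than a standalone function.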

4. No Priors

Psychedelics reduce the brain's confidence in its "dominant high-level beliefs", relaxing its priors (the REBUS model: relaxed beliefs under psychedelics).

Dampen the weights in the later attention layers to pull the model back toward a word-association model and away from the chatbot. Some experiments have blended RLHF'd weights with base-model weights to "recover lost capabilities", but this would be a layer-by-layer tune aimed at increasing novelty.
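A sketch of a layer-by-layer dampening schedule, assuming we have a list of per-layer attention weight matrices. The `start` and `min_scale` knobs are made up: layers before `start` (as a fraction of depth) are untouched, then the scale ramps linearly down to `min_scale` at the final layer.

```python
import numpy as np

def dampen_late_attention(layer_weights, start=0.75, min_scale=0.5):
    """Scale down per-layer attention weights, deeper layers more.
    `start` and `min_scale` are hypothetical knobs to tune for novelty."""
    n = len(layer_weights)
    out = []
    for i, w in enumerate(layer_weights):
        frac = i / max(n - 1, 1)
        if frac < start:
            scale = 1.0                           # early layers untouched
        else:
            t = (frac - start) / (1.0 - start)    # 0 -> 1 over late layers
            scale = 1.0 - t * (1.0 - min_scale)
        out.append(scale * w)
    return out

weights = [np.ones((2, 2)) for _ in range(8)]
damped = dampen_late_attention(weights)
assert np.allclose(damped[0], weights[0])          # early layers untouched
assert np.allclose(damped[-1], 0.5 * weights[-1])  # final layer halved
```

The weight-blending variant mentioned above would instead compute `(1 - lam) * w_rlhf + lam * w_base` with a per-layer `lam`, interpolating toward the base model rather than toward zero.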


Should I try some of these out? It should only cost a grand or two on a 70B model.