Wolf Alignment and AI Alignment

good doggos

Sep 03, 2025

Thinking about the AI alignment problem, it seems to me humans have done something similar before: aligning wolves to be friendly towards us.

The result, of course, is a dog.

The fact that it's possible to breed dogs to be friendly towards humans means that there are things in their genome that can be tweaked in a way that makes them, in practice, friendly towards us.

Could AIs be aligned in a similar way?

That is, could we breed AIs in the same way that we have bred dogs?

One problem is that there's nothing in an AI that's precisely equivalent to a dog's genome. Yes, there is its architecture, which is defined in, for example, a Python program, but that's quite small informationally (c. 10-100 kB versus 700 MB for a dog genome).

Maybe there could be initial weights attached to some neurons before training? Note that the dog connectome is much bigger than the genome: dogs have about 2 billion neurons, each with about 1000 connections, so that's about 2e12 connections, or roughly 2 TB of information if each connection takes about one byte. This is around 3000 times bigger than the genome.

Looking at AI architectures, GPT-4 reportedly has about 2 T connections, and it has been speculated that GPT-5 has about 3 T to 5 T connections, so about the same as a dog. By comparison, the human connectome is a lot bigger: it has an estimated 1e14 to 1e15 connections (100 T to 1000 T).
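For anyone who wants to check the arithmetic, here is a quick sketch in Python. The figures are just the rough estimates quoted above, and the one-byte-per-connection figure is my own simplifying assumption, not a measured value.

```python
# Back-of-envelope figures from the post; all are rough estimates.
GENOME_BYTES = 700e6            # dog genome, about 700 MB
NEURONS = 2e9                   # dog neurons, about 2 billion
CONNECTIONS_PER_NEURON = 1000   # rough average
BYTES_PER_CONNECTION = 1        # assumption: one byte per connection

dog_connections = NEURONS * CONNECTIONS_PER_NEURON
dog_connectome_bytes = dog_connections * BYTES_PER_CONNECTION
connectome_to_genome = dog_connectome_bytes / GENOME_BYTES

GPT4_CONNECTIONS = 2e12         # reported, not confirmed
HUMAN_CONNECTIONS_LOW = 1e14    # low estimate
HUMAN_CONNECTIONS_HIGH = 1e15   # high estimate

human_to_gpt4_low = HUMAN_CONNECTIONS_LOW / GPT4_CONNECTIONS
human_to_gpt4_high = HUMAN_CONNECTIONS_HIGH / GPT4_CONNECTIONS

print(f"dog connectome: {dog_connections:.0e} connections")
print(f"connectome / genome: about {connectome_to_genome:.0f}x")
print(f"human / GPT-4: {human_to_gpt4_low:.0f}x to {human_to_gpt4_high:.0f}x")
```

On these numbers the dog connectome comes out a little under 3000 times the size of the genome, and the human connectome is somewhere between 50 and 500 times the reported GPT-4 figure.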

Or maybe there could be a pre-training phase before ordinary training, and the data in the pre-training phase could be used as an equivalent to a dog's genome: i.e. randomly change it, see which of the resulting AIs are best aligned to us, and breed from those.
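To make the breeding idea concrete, here is a toy sketch of the loop in Python. Everything in it is hypothetical: the "genome" is just a short list of numbers standing in for the pre-training data, and `alignment_score` is a placeholder I made up (closer to all ones counts as "better"). A real version would have to train an AI from each genome and somehow measure how aligned it is, which is the hard part.

```python
import random

def alignment_score(genome):
    # Hypothetical stand-in for "how aligned is the AI grown from this genome".
    # Here, genomes closer to all ones score higher (0 is the best possible).
    return -sum((g - 1.0) ** 2 for g in genome)

def mutate(genome, rng, scale=0.1):
    # Randomly nudge every gene a little.
    return [g + rng.gauss(0.0, scale) for g in genome]

def crossover(a, b, rng):
    # Each gene is taken from one parent or the other at random.
    return [rng.choice(pair) for pair in zip(a, b)]

def breed(generations=50, pop_size=20, genome_len=8, keep=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []  # best score seen in each generation
    for _ in range(generations):
        # Select: keep only the best-scoring genomes as parents.
        population.sort(key=alignment_score, reverse=True)
        parents = population[:keep]
        history.append(alignment_score(parents[0]))
        # Breed: fill the rest of the population with mutated offspring.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - keep)
        ]
        population = parents + children
    best = max(population, key=alignment_score)
    return best, history

best, history = breed()
print(f"best score in first generation: {history[0]:.3f}")
print(f"best score at the end: {alignment_score(best):.3f}")
```

Because the parents are carried over unchanged into the next generation, the best score can never get worse from one generation to the next. That is roughly what a dog breeder does too: keep the friendliest animals and breed from them.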

Or maybe entirely new AI architectures would be needed to reliably breed AIs.

Is something like this feasible? I've no idea; it's just a thought that struck me.

Pontifex Minimus
