Promoting simple Django fields to ForeignKeys
Posted on 2024-01-23, by Racum.
Have you ever changed your mind about your Django database structure? Sometimes you need a reference between models but, for some reason, you didn’t use a ForeignKey field. There are good reasons for that: for example, the ID originally came from a different system, and later you decided to keep a local copy of that data. In this article I show how to convert such a field the proper way.
If you decide to transform that field into a ForeignKey, everything depends on how you named it. If you didn’t add an _id suffix, then it is just a matter of changing the field and applying a single migration: close this article and go eat some ice cream, you earned it!
But if you happen to have a field like something_id, things can get dangerous and complicated. Follow the steps below carefully to avoid big problems like data loss or broken migrations.
Scenario
Consider the models of a toy example like this:
File short_posts/models.py:
from uuid import uuid4

from django.db import models


class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post_id = models.UUIDField(null=True)  # <-- Weak reference field.


class ShortPost(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts')
    creation_time = models.DateTimeField(auto_now_add=True)
    message = models.TextField()
In this example, the field pinned_post_id from the model User is the one that you want to transform into a ForeignKey. But first, let’s add some data to it via ./manage.py shell:
>>> from short_posts.models import User, ShortPost
>>> user = User(name='Racum')
>>> user.save()
>>> post1 = ShortPost(user=user, message='Hello World!')
>>> post1.save()
>>> post2 = ShortPost(user=user, message='Second message')
>>> post2.save()
>>> user.pinned_post_id = post1.id
>>> user.save()
>>> ShortPost.objects.get(id=user.pinned_post_id).message
'Hello World!'
After the data gets added, the database should look like this:
sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------------------------
1   Racum  d94c57edf6064d5090459be0de12c5c8
sqlite> select * from short_posts_shortpost;
id                                creation_time               message         user_id
--------------------------------  --------------------------  --------------  -------
d94c57edf6064d5090459be0de12c5c8  2024-01-22 22:47:12.700014  Hello World!    1
0bb7c64e3c6c444981216382eab459b2  2024-01-22 22:47:12.702336  Second message  1
I’m using SQLite here for simplicity, but this technique works with all Django-supported DB engines.
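Because pinned_post_id is only a weak reference, nothing in the database guarantees that every stored ID matches an existing post. Before promoting the field, it may be worth checking for dangling values, since a real foreign-key constraint could plausibly reject them. The sketch below is only an illustration with Python's standard sqlite3 module: the table names come from the output above, while the simplified columns and the extra "Ghost" and "NoPin" rows are made up for the demo.

```python
import sqlite3


def find_dangling_pins(conn):
    """Return (user_id, pinned_post_id) pairs that match no existing post."""
    query = """
        SELECT u.id, u.pinned_post_id
        FROM short_posts_user AS u
        LEFT JOIN short_posts_shortpost AS p ON p.id = u.pinned_post_id
        WHERE u.pinned_post_id IS NOT NULL AND p.id IS NULL
        ORDER BY u.id
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build an in-memory copy of both tables (simplified columns, invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE short_posts_shortpost (id TEXT PRIMARY KEY, message TEXT, user_id INTEGER);
        CREATE TABLE short_posts_user (id INTEGER PRIMARY KEY, name TEXT, pinned_post_id TEXT);
        INSERT INTO short_posts_shortpost VALUES ('d94c57edf6064d5090459be0de12c5c8', 'Hello World!', 1);
        INSERT INTO short_posts_user VALUES (1, 'Racum', 'd94c57edf6064d5090459be0de12c5c8');
        -- Hypothetical rows: one dangling reference and one user without a pin.
        INSERT INTO short_posts_user VALUES (2, 'Ghost', 'ffffffffffffffffffffffffffffffff');
        INSERT INTO short_posts_user VALUES (3, 'NoPin', NULL);
    """)
    return conn


print(find_dangling_pins(build_demo_db()))
# [(2, 'ffffffffffffffffffffffffffffffff')]
```

An empty list means every non-null pinned_post_id points to a real post. On other database engines the same LEFT JOIN idea should apply, but the connection code would differ.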
Naive (and destructive) approach
If you make all the changes at once, meaning that you remove the _id suffix at the same time as you change the field type, you will lose data!
Consider this change to the User model:
class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- All changes at once.
        'short_posts.ShortPost',
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )
Look what happens when you try to migrate it:
$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0002_remove_user_pinned_post_id_user_pinned_post.py
    - Remove field pinned_post_id from user
    - Add field pinned_post to user
$ ./manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, sessions, short_posts
Running migrations:
  Applying short_posts.0002_remove_user_pinned_post_id_user_pinned_post... OK
The reference is lost! Django can’t detect that the field merely changed, so it removes the old field (and its data) and creates a new (empty) one. Here is what the user table looks like if you apply this migration:
sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------
1   Racum
Notice that the previous ID in pinned_post_id is lost: the old ID references were replaced by null values.
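To see why "remove field, then add field" cannot keep the values, here is a rough simulation of those two operations with Python's standard sqlite3 module. It is a sketch under assumptions, not the SQL that Django actually emits: the table is simplified to three columns, and the helper name simulate_naive_migration is made up for this example.

```python
import sqlite3


def simulate_naive_migration():
    """Drop the old column, add a new one with the same name, return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE short_posts_user (id INTEGER PRIMARY KEY, name TEXT, pinned_post_id TEXT);
        INSERT INTO short_posts_user VALUES (1, 'Racum', 'd94c57edf6064d5090459be0de12c5c8');

        -- "Remove field": rebuild the table without the old column.
        CREATE TABLE new_user (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO new_user SELECT id, name FROM short_posts_user;
        DROP TABLE short_posts_user;
        ALTER TABLE new_user RENAME TO short_posts_user;

        -- "Add field": a brand-new, empty column that happens to reuse the name.
        ALTER TABLE short_posts_user ADD COLUMN pinned_post_id TEXT NULL;
    """)
    return conn.execute("SELECT id, name, pinned_post_id FROM short_posts_user").fetchall()


print(simulate_naive_migration())
# [(1, 'Racum', None)]
```

The column name survives, but it belongs to a new column, so the old values have nowhere to come from.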
Promoting to ForeignKey the right way
Step 0: Make a backup
I can’t believe I have to make this explicit, but you know the risks of copying and pasting commands from a random article on the Internet. Please make sure your working code is committed to git and your database is properly backed up. Don’t complain to me if you break your system!
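For a SQLite database like the one in this article, one possible way to take that backup is the online backup API of Python's standard sqlite3 module. This is a minimal sketch, assuming SQLite; the function name backup_sqlite and the file paths are placeholders, and other engines have their own dump tools.

```python
import sqlite3


def backup_sqlite(source_path, backup_path):
    """Copy the SQLite database at source_path into a new file at backup_path."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup() copies every page of the source database.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

A call could look like backup_sqlite("db.sqlite3", "db.backup.sqlite3"), where both file names are assumptions about your project layout. Note that sqlite3.connect() creates an empty file if the source path does not exist, so double-check the path first.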
Step 1: Transform the original field into a ForeignKey
Apply these changes:
- Keep the same name (with the _id suffix).
- Make it a ForeignKey.
- Make the db_column explicit with the full name of the field (also including _id).
- Set db_constraint=False. This is temporary, but needed to tell Django that the DB checks should be ignored at this time.
- If the original field was nullable, set null=True and on_delete=models.SET_NULL. If not, pick the value of on_delete that best fits your case.
The model, at this step, should look like this:
class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post_id = models.ForeignKey(  # <-- Transformed into ForeignKey.
        'short_posts.ShortPost',
        db_column='pinned_post_id',
        db_constraint=False,
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )
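The explicit db_column matters because of how Django derives column names: a plain field uses its own name, while a ForeignKey appends _id to the field name unless db_column overrides it. The following is a simplified model of that rule written in plain Python, not Django's own code; both function names are made up for illustration.

```python
def column_for_plain_field(field_name, db_column=None):
    """A plain field stores its value in a column named after the field."""
    return db_column or field_name


def column_for_foreign_key(field_name, db_column=None):
    """A ForeignKey stores the related ID in "<field name>_id" by default."""
    return db_column or f"{field_name}_id"


# The original weak reference.
print(column_for_plain_field("pinned_post_id"))
# Step 1 without db_column would point at a different column.
print(column_for_foreign_key("pinned_post_id"))
# Step 1 with the explicit db_column keeps the existing column.
print(column_for_foreign_key("pinned_post_id", db_column="pinned_post_id"))
# After the rename, the default already matches the existing column.
print(column_for_foreign_key("pinned_post"))
```

The second call yields pinned_post_id_id, while the other three yield pinned_post_id. That is why the temporary db_column pins the ForeignKey to the column that already holds the data.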
Then run makemigrations, but DO NOT run migrate:
$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0002_alter_user_pinned_post_id.py
    - Alter field pinned_post_id on user
Step 2: Remove the "_id" suffix
Apply these changes:
- Simply remove the _id suffix from the field name.
- Keep everything else as is.
The model, at this step, should look like this:
class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- Removed the "_id" suffix.
        'short_posts.ShortPost',
        db_column='pinned_post_id',
        db_constraint=False,
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )
Again, run makemigrations (you need to confirm the renaming this time), but DO NOT run migrate:
$ ./manage.py makemigrations
Was user.pinned_post_id renamed to user.pinned_post (a ForeignKey)? [y/N] y
Migrations for 'short_posts':
  short_posts/migrations/0003_rename_pinned_post_id_user_pinned_post.py
    - Rename field pinned_post_id on user to pinned_post
Step 3: Remove the "helper" attributes
Now that the name of the field is no longer an issue, you can safely remove the parameters that we added just to allow the change:
- Remove the db_column parameter.
- Remove the db_constraint parameter.
The model, at this step, should look like this:
class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- Removed db_column and db_constraint.
        'short_posts.ShortPost',
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )
One more time, run makemigrations, and, of course, DO NOT run migrate:
$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0004_alter_user_pinned_post.py
    - Alter field pinned_post on user
Step 4: Squash the migrations
At this point, you should have 3 migration files waiting to run, but, before that, we should combine them, so the whole process can run as a single migration step.
This is the syntax for squashmigrations:
./manage.py squashmigrations --squashed-name [name] [django_app] [first_migration] [last_migration]
Explaining:
- name: just a descriptive name; if you omit it, Django will pick an automatic dumb filename, since it lacks the context of why we are doing this.
- django_app: self-descriptive; in my case it is short_posts, please adapt it accordingly.
- first_migration: take note of the file prefix from Step 1; in my case it was 0002 (since 0001 was my initial migration).
- last_migration: take note of the file prefix from Step 3; in my case it was 0004.
$ ./manage.py squashmigrations --squashed-name promoted_pinned_post short_posts 0002 0004
Will squash the following migrations:
 - 0002_alter_user_pinned_post_id
 - 0003_rename_pinned_post_id_user_pinned_post
 - 0004_alter_user_pinned_post
Do you wish to proceed? [yN] y
Optimizing...
  No optimizations possible.
Created new squashed migration short_posts/migrations/0002_promoted_pinned_post.py
After this you can safely remove the intermediate migrations, keeping only the squashed (combined) one:
$ rm short_posts/migrations/0002_alter_user_pinned_post_id.py
$ rm short_posts/migrations/0003_rename_pinned_post_id_user_pinned_post.py
$ rm short_posts/migrations/0004_alter_user_pinned_post.py
The migrations directory of your app should look like this:
$ ls short_posts/migrations/
0001_initial.py __init__.py
0002_promoted_pinned_post.py __pycache__
And the combined migration should have auto-generated content like this:
# Generated by Django 5.0.1 on 2024-01-22 23:32

import django.db.models.deletion
from django.db import migrations, models


class Migration(migrations.Migration):

    replaces = [
        ('short_posts', '0002_alter_user_pinned_post_id'),
        ('short_posts', '0003_rename_pinned_post_id_user_pinned_post'),
        ('short_posts', '0004_alter_user_pinned_post'),
    ]

    dependencies = [
        ('short_posts', '0001_initial'),
    ]

    operations = [
        migrations.AlterField(
            model_name='user',
            name='pinned_post_id',
            field=models.ForeignKey(
                db_column='pinned_post_id',
                db_constraint=False,
                null=True,
                on_delete=django.db.models.deletion.SET_NULL,
                related_name='pinned_by_user',
                to='short_posts.shortpost',
            ),
        ),
        migrations.RenameField(
            model_name='user',
            old_name='pinned_post_id',
            new_name='pinned_post',
        ),
        migrations.AlterField(
            model_name='user',
            name='pinned_post',
            field=models.ForeignKey(
                null=True,
                on_delete=django.db.models.deletion.SET_NULL,
                related_name='pinned_by_user',
                to='short_posts.shortpost',
            ),
        ),
    ]
Notice that there are no removal operations: the pinned_post_id column is never dropped, so its data stays in place. The operations change how Django refers to the column (and, in the last one, re-enable the constraint that Step 1 switched off).
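If you prefer not to rely on eyeballing the file, a small script can list the operations in a migration and flag the destructive ones. This is a rough sketch based on a regular expression over the file text. The helper names and the DESTRUCTIVE set are assumptions made for this example, and both sample strings are abbreviated stand-ins for real migration files.

```python
import re

# Operation names treated as destructive in this sketch (an assumption, not a Django list).
DESTRUCTIVE = {"RemoveField", "DeleteModel"}


def list_operations(migration_source):
    """Return the names used in migrations.<Name>(...) calls, in source order."""
    return re.findall(r"\bmigrations\.(\w+)\(", migration_source)


def destructive_operations(migration_source):
    """Return only the operations that appear in DESTRUCTIVE."""
    return [op for op in list_operations(migration_source) if op in DESTRUCTIVE]


# Abbreviated version of the squashed migration from this article.
SQUASHED = """
class Migration(migrations.Migration):
    operations = [
        migrations.AlterField(model_name='user', name='pinned_post_id'),
        migrations.RenameField(model_name='user', old_name='pinned_post_id', new_name='pinned_post'),
        migrations.AlterField(model_name='user', name='pinned_post'),
    ]
"""

# Abbreviated stand-in for the naive migration (remove field, then add field).
NAIVE = """
class Migration(migrations.Migration):
    operations = [
        migrations.RemoveField(model_name='user', name='pinned_post_id'),
        migrations.AddField(model_name='user', name='pinned_post'),
    ]
"""

print(list_operations(SQUASHED))       # ['AlterField', 'RenameField', 'AlterField']
print(destructive_operations(NAIVE))   # ['RemoveField']
```

For a real check you would read the text of short_posts/migrations/0002_promoted_pinned_post.py and pass it to destructive_operations(); an empty list is what you want to see before running the migration.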
Step 5: Run the combined migration
You can now finally run it:
$ ./manage.py migrate short_posts
Operations to perform:
  Apply all migrations: short_posts
Running migrations:
  Applying short_posts.0002_promoted_pinned_post... OK
Outcome
If you managed to follow all the steps, here is the reward:
sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------------------------
1   Racum  d94c57edf6064d5090459be0de12c5c8
The user table looks exactly the same, no data gets lost!
And now you can call the referenced model as a proper ForeignKey on your root model:
>>> User.objects.get(id=1).pinned_post.message
'Hello World!'
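On a table with more than one row, comparing by eye does not scale. One option is to snapshot the pinned_post_id values before running the migration and compare them afterwards. The sketch below uses Python's standard sqlite3 module on an in-memory table; the function names are made up for this example, and the UPDATE only simulates a lossy migration for the demo.

```python
import sqlite3


def snapshot_pins(conn):
    """Return {user_id: pinned_post_id} for every row of short_posts_user."""
    return dict(conn.execute("SELECT id, pinned_post_id FROM short_posts_user"))


def changed_pins(before, after):
    """Return the sorted user IDs whose pinned_post_id differs between snapshots."""
    all_ids = before.keys() | after.keys()
    return sorted(uid for uid in all_ids if before.get(uid) != after.get(uid))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE short_posts_user (id INTEGER PRIMARY KEY, name TEXT, pinned_post_id TEXT)")
conn.execute("INSERT INTO short_posts_user VALUES (1, 'Racum', 'd94c57edf6064d5090459be0de12c5c8')")

before = snapshot_pins(conn)
print(changed_pins(before, snapshot_pins(conn)))  # [] when nothing was touched

# Simulate a lossy migration to show what a failure looks like.
conn.execute("UPDATE short_posts_user SET pinned_post_id = NULL")
print(changed_pins(before, snapshot_pins(conn)))  # [1]
```

In a real project you would take the first snapshot from your database before Step 5 and the second one right after it; an empty list confirms that no reference changed.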
Conclusion
This article is based on a real-world case that I had to fix. Even with the great Django documentation and the amazing community, my case was too specific to find a ready-made answer, so I had to follow the path alone until I found a solution by myself. Now that I know the way, this is my contribution to guide you!