Promoting simple Django fields to ForeignKeys


Posted on 2024-01-23, by Racum. Tags: Django Python

Have you ever changed your mind about your Django database structure? Sometimes you need a reference between models but, for some reason, you didn’t use a ForeignKey field. There are good reasons for that: for example, the ID came from a different system, and only later did you decide to keep a local copy. In this article I show how to convert such a field the proper way.

If you decide to transform that field into a ForeignKey, everything depends on how you named it. If you didn’t add an _id suffix, then it is just a matter of changing the field and applying a single migration: close this article and go eat some ice cream, you earned it!

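But if you happen to have a field like something_id, things can get dangerous and complicated. By default, a ForeignKey named something stores its value in a column called something_id, so the new field collides with the old one. Follow the steps below carefully to avoid big problems like data loss or broken migrations.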

Scenario

Consider the models of a toy-example like this:

File short_posts/models.py:

from uuid import uuid4
from django.db import models

class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post_id = models.UUIDField(null=True)  # <-- Weak reference field.

class ShortPost(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts')
    creation_time = models.DateTimeField(auto_now_add=True)
    message = models.TextField()

In this example, the field pinned_post_id from the model User is the one that you want to transform into a ForeignKey, but first, let’s add some data into it via ./manage.py shell:

>>> from short_posts.models import User, ShortPost

>>> user = User(name='Racum')
>>> user.save()

>>> post1 = ShortPost(user=user, message='Hello World!')
>>> post1.save()
>>> post2 = ShortPost(user=user, message='Second message')
>>> post2.save()

>>> user.pinned_post_id = post1.id
>>> user.save()

>>> ShortPost.objects.get(id=user.pinned_post_id).message
'Hello World!'

After the data gets added, the database should look like this:

sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------------------------
1   Racum  d94c57edf6064d5090459be0de12c5c8

sqlite> select * from short_posts_shortpost;
id                                creation_time               message         user_id
--------------------------------  --------------------------  --------------  -------
d94c57edf6064d5090459be0de12c5c8  2024-01-22 22:47:12.700014  Hello World!    1
0bb7c64e3c6c444981216382eab459b2  2024-01-22 22:47:12.702336  Second message  1

I’m using SQLite here for simplicity, but this technique works with all Django-supported DB engines.

Naive (and destructive) approach

If you make all the changes at once, that is, remove the _id suffix at the same time as you change the field type, you will lose data!

Consider this change to the User model:

class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- All changes at once.
        'short_posts.ShortPost',
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )

Look what happens when you try to migrate it:

$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0002_remove_user_pinned_post_id_user_pinned_post.py
    - Remove field pinned_post_id from user
    - Add field pinned_post to user

$ ./manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, sessions, short_posts
Running migrations:
  Applying short_posts.0002_remove_user_pinned_post_id_user_pinned_post... OK

The reference is lost! Django can’t detect that the field was renamed and changed type, so it removes the old field (and its data) and creates a new (empty) one. Here is what the user table looks like after applying this migration:

sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------
1   Racum

Notice that the previous ID in pinned_post_id is gone: the old ID references were replaced by null values.
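To see why the value disappears, here is a minimal sketch that imitates the two operations ("Remove field", then "Add field") directly on an in-memory SQLite database. This is not the SQL that Django actually emits, just a simplified stand-in under the assumption that the column is dropped and then re-created with the same name; the temporary table name new_user is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE short_posts_user ("
    "id INTEGER PRIMARY KEY, name TEXT, pinned_post_id TEXT)"
)
conn.execute(
    "INSERT INTO short_posts_user (name, pinned_post_id) VALUES (?, ?)",
    ("Racum", "d94c57edf6064d5090459be0de12c5c8"),
)

# Imitation of "Remove field pinned_post_id from user":
# rebuild the table without the column (hypothetical helper table new_user).
conn.execute("CREATE TABLE new_user (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO new_user (id, name) SELECT id, name FROM short_posts_user")
conn.execute("DROP TABLE short_posts_user")
conn.execute("ALTER TABLE new_user RENAME TO short_posts_user")

# Imitation of "Add field pinned_post to user":
# a brand-new nullable column that happens to reuse the old column name.
conn.execute("ALTER TABLE short_posts_user ADD COLUMN pinned_post_id TEXT NULL")

rows = conn.execute(
    "SELECT id, name, pinned_post_id FROM short_posts_user"
).fetchall()
print(rows)  # [(1, 'Racum', None)]
```

The column name is the same before and after, but the value stored in it did not survive the round trip.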

Promoting to ForeignKey the right way

Step 0: Make a backup

I can’t believe I have to make this explicit, but you know the risks of copying and pasting commands from a random article on the Internet. Please make sure your working code is committed to git and your database is properly backed up. Don’t complain to me if you break your system!
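If your database is a SQLite file, as in this article, one possible way to take that backup is the online backup API from Python's sqlite3 module. This is only a sketch: the helper name backup_sqlite and the file names in the usage comment are my assumptions, and other engines have their own tools (pg_dump, mysqldump and so on).

```python
import sqlite3


def backup_sqlite(source_path, backup_path):
    """Copy a SQLite database file to backup_path using SQLite's backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies the whole database into the target connection.
        source.backup(target)
    finally:
        target.close()
        source.close()


# Hypothetical usage, assuming the project database lives in db.sqlite3:
# backup_sqlite("db.sqlite3", "db.before_promotion.sqlite3")
```

Keep the copy around until you have checked the outcome of the migration.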

Step 1: Transform the original field into a ForeignKey

Apply these changes:

  • Keep the same name (with the _id suffix).
  • Make it a ForeignKey.
  • Make the db_column explicit with the full name of the field (also including _id).
  • Set db_constraint=False. This is temporary, but it tells Django not to create the database-level foreign-key constraint at this stage.
  • If the original field was nullable, set null=True and on_delete=models.SET_NULL. If not, pick the value of on_delete that best fits your case.

The model, at this step, should look like this:

class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post_id = models.ForeignKey(  # <-- Transformed into ForeignKey.
        'short_posts.ShortPost',
        db_column='pinned_post_id',
        db_constraint=False,
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )

Then run makemigrations, but DO NOT run migrate:

$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0002_alter_user_pinned_post_id.py
    - Alter field pinned_post_id on user

Step 2: Remove the "_id" suffix

Apply these changes:

  • Simply remove the _id suffix from the field name.
  • Keep everything else as is.

The model, at this step, should look like this:

class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- Removed the "_id" suffix.
        'short_posts.ShortPost',
        db_column='pinned_post_id',
        db_constraint=False,
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )

Again, run makemigrations (you need to confirm the renaming this time), but DO NOT run migrate:

$ ./manage.py makemigrations
Was user.pinned_post_id renamed to user.pinned_post (a ForeignKey)? [y/N] y
Migrations for 'short_posts':
  short_posts/migrations/0003_rename_pinned_post_id_user_pinned_post.py
    - Rename field pinned_post_id on user to pinned_post

Step 3: Remove the "helper" attributes

Now that the name of the field is no longer an issue, you can safely remove the parameters that we added just to allow the change:

  • Remove the db_column parameter.
  • Remove the db_constraint parameter.

The model, at this step, should look like this:

class User(models.Model):
    name = models.CharField(max_length=100)
    pinned_post = models.ForeignKey(  # <-- Removed db_column and db_constraint.
        'short_posts.ShortPost',
        null=True,
        on_delete=models.SET_NULL,
        related_name='pinned_by_user',
    )

One more time, run makemigrations, and, of course, DO NOT run migrate:

$ ./manage.py makemigrations
Migrations for 'short_posts':
  short_posts/migrations/0004_alter_user_pinned_post.py
    - Alter field pinned_post on user
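Once db_constraint=False is gone, the database is expected to enforce the reference, so rows pointing at posts that no longer exist may make the migration fail on engines that check foreign keys. Before migrating, it can be worth looking for such orphans. The sketch below does it with plain SQL through Python's sqlite3 module; the table and column names come from this article's example, while the function name find_orphans is my own.

```python
import sqlite3

# Users whose pinned_post_id is set but matches no row in the posts table.
ORPHAN_QUERY = """
SELECT u.id, u.pinned_post_id
FROM short_posts_user AS u
LEFT JOIN short_posts_shortpost AS p ON p.id = u.pinned_post_id
WHERE u.pinned_post_id IS NOT NULL
  AND p.id IS NULL
"""


def find_orphans(conn):
    """Return (user_id, pinned_post_id) pairs that reference a missing post."""
    return conn.execute(ORPHAN_QUERY).fetchall()
```

An empty result means every stored ID has a matching post. If some rows come back, decide what to do with them (for a nullable field, setting them to NULL is one option) before going on.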

Step 4: Squash the migrations

At this point, you should have 3 migration files waiting to run, but, before that, we should combine them, so the whole process can run as a single migration step.

This is the syntax for squashmigrations:

./manage.py squashmigrations --squashed-name [name] [django_app] [first_migration] [last_migration]

Explaining:

  • name: just a descriptive name. If you omit it, Django will pick an automatic dumb filename, since it lacks the context of why we are doing this.
  • django_app: self-descriptive, in my case it is short_posts, please adapt it accordingly.
  • first_migration: take note of the file prefix from Step 1, in my case it was 0002 (since 0001 was my initial migration).
  • last_migration: take note of the file prefix from Step 3, in my case it was 0004.

$ ./manage.py squashmigrations --squashed-name promoted_pinned_post short_posts 0002 0004
Will squash the following migrations:
 - 0002_alter_user_pinned_post_id
 - 0003_rename_pinned_post_id_user_pinned_post
 - 0004_alter_user_pinned_post
Do you wish to proceed? [yN] y
Optimizing...
  No optimizations possible.
Created new squashed migration short_posts/migrations/0002_promoted_pinned_post.py

After this you can safely remove the intermediate migrations, keeping only the squashed (combined) one:

$ rm short_posts/migrations/0002_alter_user_pinned_post_id.py
$ rm short_posts/migrations/0003_rename_pinned_post_id_user_pinned_post.py
$ rm short_posts/migrations/0004_alter_user_pinned_post.py

The migrations directory of your app should look like this:

$ ls short_posts/migrations/
0001_initial.py                 __init__.py
0002_promoted_pinned_post.py    __pycache__

And the combined migration should have auto-generated content like this:

# Generated by Django 5.0.1 on 2024-01-22 23:32

import django.db.models.deletion
from django.db import migrations, models


class Migration(migrations.Migration):

    replaces = [
        ('short_posts', '0002_alter_user_pinned_post_id'),
        ('short_posts', '0003_rename_pinned_post_id_user_pinned_post'),
        ('short_posts', '0004_alter_user_pinned_post'),
    ]

    dependencies = [
        ('short_posts', '0001_initial'),
    ]

    operations = [
        migrations.AlterField(
            model_name='user',
            name='pinned_post_id',
            field=models.ForeignKey(
                db_column='pinned_post_id',
                db_constraint=False,
                null=True,
                on_delete=django.db.models.deletion.SET_NULL,
                related_name='pinned_by_user',
                to='short_posts.shortpost',
            ),
        ),
        migrations.RenameField(
            model_name='user',
            old_name='pinned_post_id',
            new_name='pinned_post',
        ),
        migrations.AlterField(
            model_name='user',
            name='pinned_post',
            field=models.ForeignKey(
                null=True,
                on_delete=django.db.models.deletion.SET_NULL,
                related_name='pinned_by_user',
                to='short_posts.shortpost',
            ),
        ),
    ]

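Notice that there are no removal operations and no data gets changed: the pinned_post_id column and its values stay in place. The first two operations keep the column’s name and type and only change how Django refers to it, and the last one adds the database-level foreign-key constraint that db_constraint=False was holding back.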

Step 5: Run the combined migration

You can now finally run it:

$ ./manage.py migrate short_posts
Operations to perform:
  Apply all migrations: short_posts
Running migrations:
  Applying short_posts.0002_promoted_pinned_post... OK

Outcome

If you managed to follow all the steps, here is the reward:

sqlite> select * from short_posts_user;
id  name   pinned_post_id
--  -----  --------------------------------
1   Racum  d94c57edf6064d5090459be0de12c5c8

The user table looks exactly the same, no data gets lost!

And now you can call the referenced model as a proper ForeignKey on your root model:

>>> User.objects.get(id=1).pinned_post.message
'Hello World!'
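If you also want to confirm at the database level that the column is now a real foreign key, SQLite can list the constraints of a table through PRAGMA foreign_key_list. Here is a small sketch around it, assuming a SQLite database; the helper name foreign_keys is mine, and other engines expose this information differently.

```python
import sqlite3


def foreign_keys(conn, table):
    """Return (column, referenced_table, referenced_column) for each FK on table.

    The table name is interpolated into the statement because PRAGMA does not
    accept bound parameters, so only pass trusted names.
    """
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Row layout: id, seq, table, from, to, on_update, on_delete, match
    return [(row[3], row[2], row[4]) for row in rows]
```

On the example from this article, calling foreign_keys(conn, "short_posts_user") after Step 5 should include an entry for pinned_post_id pointing at short_posts_shortpost.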

Conclusion

This article is based on a real-world case that I had to fix. Even with the great Django documentation and the amazing community, my case was too specific to find a ready-made answer, so I had to walk the path alone until I found a solution by myself. Now that I know the way, this is my contribution to guide you!