Self-referencing many-to-many relationships using Ecto

EDIT: I've changed the last section of this article to reflect my findings in actually using this in production.


I've decided to do the backend for a new app I'm creating myself, using Elixir/Phoenix. I've been learning Elixir for the past few months and thought it would be a great opportunity to put what I've learned to good use.

There's a lot of documentation about the new many_to_many macro in Ecto 2, but it was still a bit of a pain to get right the first time. I couldn't find a concrete example of how a self-referencing many-to-many relationship would work with Ecto, and after some digging and the help of the awesome Elixir community, I got it working. I'm writing this post to put it out there so you can get through this quicker.

In the app, a User can have contacts, which are themselves Users. (Let's omit the authentication stuff for now.) The migration looks like this:

# priv/repo/migrations/create_users_table.exs
defmodule MyApp.Repo.Migrations.CreateUsersTable do
  use Ecto.Migration

  def change do
    create table(:users) do
      add :username, :string

      # The User schema calls timestamps(), so the table needs these columns too
      timestamps()
    end

    create unique_index(:users, [:username])
  end
end

Then, the module that defines the schema looks like this:

defmodule MyApp.User do
  use MyApp.Web, :model

  schema "users" do
    field :username, :string

    timestamps()
  end

  # Omitting changesets
end

To create a many-to-many relationship correctly we need a join table, which we'll call "contacts". This is the migration for it:

defmodule MyApp.Repo.Migrations.CreateContactsTable do
  use Ecto.Migration

  def change do
    create table(:contacts) do
      add :user_id, references(:users, on_delete: :nothing), primary_key: true
      add :contact_id, references(:users, on_delete: :nothing), primary_key: true
      add :status, :integer

      timestamps()
    end
  end
end

To expand a bit, the "contacts" table has:

  1. a column user_id, which is the id for the user that initiated the "friend request".
  2. a column contact_id, which is the id for the user that must accept or reject the "friend request".
  3. a column status, which represents the friendship status (0=pending, 1=accepted, -1=rejected).
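
The integer codes in the status column are easy to mix up in application code. As a minimal sketch, assuming a hypothetical helper module named ContactStatus (it is not part of the app described here), the mapping from the list above could be kept in one place:

```elixir
defmodule ContactStatus do
  # Hypothetical helper, not part of the original app.
  # Mirrors the codes listed above: 0=pending, 1=accepted, -1=rejected.
  @statuses [pending: 0, accepted: 1, rejected: -1]

  # Converts a status atom to the integer stored in the contacts table.
  # Raises KeyError for an unknown atom.
  def to_integer(status) when is_atom(status) do
    Keyword.fetch!(@statuses, status)
  end

  # Converts a stored integer back to its status atom.
  # Raises ArgumentError for an unknown code.
  def from_integer(value) when is_integer(value) do
    case List.keyfind(@statuses, value, 1) do
      {status, ^value} -> status
      nil -> raise ArgumentError, "unknown contact status: #{inspect(value)}"
    end
  end
end
```

With this in place, ContactStatus.to_integer(:accepted) gives the value to write to the column, and ContactStatus.from_integer/1 turns a stored value back into an atom.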

Now for the interesting part: we have to create a Contact schema that Ecto can work with in order for the association to work correctly. Here it is:

defmodule MyApp.Contact do
  use MyApp.Web, :model

  alias MyApp.User

  schema "contacts" do
    # Mirror the columns created in the contacts migration
    field :status, :integer
    belongs_to :user, User
    belongs_to :contact, User

    timestamps()
  end
end

We define the Contact module, which belongs to both a user and a contact, both of them Users. We now have to update the User model to reflect the many-to-many relation:

defmodule MyApp.User do
  use MyApp.Web, :model

  schema "users" do
    field :username, :string
    # Add the many-to-many association
    many_to_many :contacts, MyApp.User, join_through: "contacts", on_replace: :delete
    timestamps()
  end

  # Omitting changesets
end

Using the many_to_many macro, we only have to declare the association on the User model and tell Ecto which table to use as the join table, and that's it!

You can now do this in your code:

user = Repo.get(User, 1) |> Repo.preload(:contacts)
contact = hd(user.contacts)

However...

A member of the Elixir community told me on Slack that this looks more like a one-to-many relationship, and that I should try to represent it that way in my app. It would look like this:

defmodule MyApp.User do
  use MyApp.Web, :model
  alias MyApp.Contact

  schema "users" do
    field :username, :string
    # Rows in the join table where this user initiated the request
    has_many :_contacts, MyApp.Contact
    # The Users on the other side of those rows
    has_many :contacts, through: [:_contacts, :contact]
    timestamps()
  end

  # Omitting changesets
end

It works, but I don't like that it clutters the model, which now has both _contacts and contacts in it. On top of that, Ecto models are backed by a plain struct, so there's no way to "hide" details (as in: make _contacts private). I initially decided to stick with the many_to_many version of this code because it hides the join table quite nicely.

After playing with this for a bit, though, I realised that the has_many version is the correct approach, even though it clutters the User struct.

During actual testing in my Phoenix app, I noticed that the query Ecto generated for the many_to_many association was the wrong one. Switching from many_to_many to has_many, as above, fixed the issue.
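
A plausible cause of the wrong query (an assumption, not something confirmed here): when many_to_many is given no :join_keys option, Ecto derives both join columns from the schema module names, so a User-to-User association would end up using user_id on both sides of the join. An untested sketch of naming the columns explicitly, reusing the table and column names from the migration above:

```elixir
defmodule MyApp.User do
  use MyApp.Web, :model

  schema "users" do
    field :username, :string

    # Untested sketch: spell out both join columns so the two sides of the
    # self-reference don't both resolve to user_id.
    many_to_many :contacts, MyApp.User,
      join_through: "contacts",
      join_keys: [user_id: :id, contact_id: :id],
      on_replace: :delete

    timestamps()
  end

  # Omitting changesets
end
```

The first entry in :join_keys says how the join table reaches the current user and the second how it reaches the associated one. This only addresses reading contacts; the has_many version above is what actually worked for me.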

Have something to add to this article, or did I miss something? Hit me up on Twitter.