How to build a REAL chat system for your startup

David Qorashi
7 min read · Nov 24, 2017

Every other day I see a blog post on how to create a chat system using WebSockets, or how to use the magic of Firebase to build a majestic one. But let’s be honest: building a REAL customizable chat system is a whole different story.

Now, let’s consider a complicated use case:
We have a product consisting of two parts: 1) a mobile app that our end users install and 2) a web dashboard. Administrators use the dashboard to target a certain audience group and send them messages. The selected audience receives the messages in their mobile apps.

Consider that we need to provide solutions for the following use-cases:

  • 1–1 chat between two users
  • Group chats among users
  • Sending bulk messages (we refer to them as `campaigns`) from dashboard to users
  • Sending individual messages to a specific user from the dashboard

Campaign messages break down into the following categories:

  • The dashboard’s admin creates a group chat and adds an audience group to it.
  • The dashboard’s admin creates individual channels between the dashboard and each user and drops a message in each channel. The users, as members, receive the messages in their mobile apps.
  • The dashboard’s admin sends a uni-directional message to the users. Each user receives the message either as an SMS or as a chat message.

Now the question is: how do we implement this?

Solutions

One solution is to reinvent the wheel. We have all heard of Firebase and its greatness. We can use Firebase’s Realtime Database to build our own chat infrastructure.

Let’s see what the implementation looks like.

You will end up with the following Firebase tree:

  • campaignMessageQueue
  • campaignQueue
  • campaignStatus
  • chatMessages
  • chatReadReceipts
  • chats
  • chatTypingIndicators
  • emailNotifications
  • messageQueue
  • userChats
  • userPushTokens
  • nonChatSMSQueue

Scary! Isn’t it?
Does it work? Yes.
Does it scale? Kind of.
Is it easy to deal with? Hell, no!
To support this beast, you need to dedicate a lot of time and energy.
This solution reinvents every aspect of a chat system, including read receipts, typing indicators, notifications, and case status for chats.
You need to set up several Node workers, each reading from a different queue in the Firebase tree.

Here’s a description of each worker:

campaignWorker

The worker operates on campaignQueue.
The worker receives jobs from web dashboard UI, fetches distribution list, creates new chats with users, and then places the campaign message in them. It also has the responsibility of creating jobs in campaignMessageQueue.

messageWorker

The worker operates on messageQueue.
It handles all notification tasks for normal/non-campaign messages.

campaignMessageWorker

The worker operates on campaignMessageQueue. It works exactly like messageWorker; it just operates on a different queue.

nonChatSMSWorker

The worker operates on nonChatSMSQueue.
It handles sending SMS-only messages that do not create a message within a real chat.

campaignMessageWorker

The worker operates on campaignMessageQueue.
It handles sending mass non-campaign messages for a dashboard.
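
To make the moving parts concrete, here is a minimal in-memory sketch of the queue-and-worker layout described above. It is written in Ruby to match the snippets later in this post (the real workers were Node processes reading from Firebase), and the `QUEUES` hash, the `drain` helper, and the job fields are hypothetical names made up for illustration.

```ruby
# In-memory stand-ins for the Firebase queues; names mirror the tree above.
QUEUES = Hash.new { |hash, name| hash[name] = [] }

# campaignWorker: takes one campaign job and fans it out into one
# per-recipient job on campaignMessageQueue.
def campaign_worker(job)
  job[:recipients].each do |user_id|
    QUEUES[:campaignMessageQueue] << { user_id: user_id, body: job[:body] }
  end
end

# campaignMessageWorker: handles the notification for a single campaign
# message. Here it only records what it would have sent.
def campaign_message_worker(job, outbox)
  outbox << "notify #{job[:user_id]}: #{job[:body]}"
end

# Drain a queue, handing each job to the given handler block.
def drain(queue_name)
  queue = QUEUES[queue_name]
  yield queue.shift until queue.empty?
end

QUEUES[:campaignQueue] << { recipients: %w[u1 u2 u3], body: 'Hello!' }

outbox = []
drain(:campaignQueue) { |job| campaign_worker(job) }
drain(:campaignMessageQueue) { |job| campaign_message_worker(job, outbox) }
```

Even in this toy version, one dashboard action passes through two queues and two workers before anything reaches a user, which hints at why the real thing is hard to operate.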

Sounds like a lot of work, no? Implementing and maintaining all this code is not the only problem: sometimes the Node workers clog up, and other times they crash and need to be restarted.
The cost of maintenance is so high that you will soon be looking for an alternative. Your goal will be to eliminate the reliance on Firebase as your chat database. The question is: how?

That’s where Twilio comes into play. They are mostly known for their SMS API, but they also provide a chat API.

Twilio’s Programmable Chat API

Programmable Chat is a cloud-based chat product which provides a number of client SDKs and a REST API for use in integrating Chat capabilities into applications and websites. It is modeled after Extensible Messaging and Presence Protocol (XMPP) and revolves around the concept of Service Instances. Chat Services are where all the Channels, Messages, Users and other resources within a Chat deployment live.

  • Each service instance could have many channels.
  • Each channel could have many members.
  • Each channel could have many messages.

Service instances are isolated silos; two different service instances cannot communicate with one another. To send a message between two entities, a channel must first exist between them, and both entities must be added to it as members. Every message dropped in the channel is published to its members. Using the client SDKs (available for web, iOS and Android), we can retrieve the list of Subscribed Channels that a User is a Member of (or has been invited to) once that user logs into the client.
This model is super flexible.
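
The hierarchy can be pictured with a toy model. The sketch below is plain Ruby and does not talk to Twilio at all; the `Service` and `Channel` classes and their methods are hypothetical, and returning "everyone except the sender" from `post` is a simplification chosen for this example.

```ruby
# A channel holds members and messages.
class Channel
  attr_reader :unique_name, :members, :messages

  def initialize(unique_name)
    @unique_name = unique_name
    @members = []
    @messages = []
  end

  def add_member(identity)
    @members << identity unless @members.include?(identity)
  end

  # Stores the message and returns the identities that would be notified:
  # every member except the sender.
  def post(from, body)
    raise ArgumentError, "#{from} is not a member" unless @members.include?(from)

    @messages << { from: from, body: body }
    @members - [from]
  end
end

# A service instance is an isolated container of channels.
class Service
  attr_reader :channels

  def initialize
    @channels = {}
  end

  # Finds a channel by unique name, creating it on first use.
  def channel(unique_name)
    @channels[unique_name] ||= Channel.new(unique_name)
  end

  # Channels a given identity belongs to, similar in spirit to the
  # subscribed-channels list a client sees after login.
  def channels_for(identity)
    @channels.values.select { |c| c.members.include?(identity) }
  end
end

service = Service.new
general = service.channel('general')
general.add_member('alice')
general.add_member('bob')
recipients = general.post('alice', 'hi')
```

Because each `Service` owns its own `@channels` hash, nothing created in one service is visible from another, which mirrors the isolation described above.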

Now let’s go back to the original problem we solved previously via Firebase. We will try to implement it in Twilio:

First, we create a default service instance. The dashboard uses this service instance to send messages to the mobile users. Our users can also send private messages to the dashboard’s admin. All of these channels reside in this service instance.

Using the Programmable Chat API, we can completely eliminate the reliance on Firebase. We can remove the aforementioned Node workers and simplify the structure of our project.

Now let’s talk about Twilio’s Chat REST API. We can use it to implement all the use-cases discussed earlier. The API is used by our backend and is intended for system usage. It can orchestrate every part of Programmable Chat: you can add members, send arbitrary messages, change users’ roles, and so on. Basically, it’s a god mode for our chat system.

Here is a brief description of how each scenario was implemented:

1. User to user chat (mobile): Whenever a user needs to initiate a chat with another user, the mobile client sends a request to an endpoint on our backend. The backend tries to find a channel with the unique name [user1.uuid, user2.uuid].sort.join(':') in the default service instance. If no channel with that unique name is found, our REST client creates a new channel with that name and adds both users to it as members. The endpoint returns the appropriate Channel ID to the mobile client in the response payload. From there, the mobile client is able to send messages on that channel. Using Twilio’s REST API, we make sure that we never create duplicate channels between two users, and we preserve the history of their conversations. The next time one of them wants to message the other, the same channel is used.
def create_1_to_1_chat(u1, u2)
  # Sorting the UUIDs gives the same unique name regardless of argument order.
  unique_name = [u1.uuid, u2.uuid].sort.join(':')
  name = User.where(uuid: [u1.uuid, u2.uuid].sort).map(&:name).join(' - ')
  # The helpers below wrap Twilio's REST API (definitions not shown).
  c = add_channel(unique_name, name, channel_type: :public)
  c = update_attribute(c, 'topic', name)
  add_member(c, u1)
  add_member(c, u2)
  c.sid
end
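
The property that prevents duplicates is that the unique name does not depend on who starts the conversation. The following self-contained sketch demonstrates this with an in-memory hash standing in for Twilio; `ChatUser`, `CHANNEL_STORE`, `unique_name_for`, and `find_or_create_channel` are hypothetical names for illustration, not the helpers used above.

```ruby
ChatUser = Struct.new(:uuid, :name)

# Sorting the UUIDs first makes the name independent of argument order.
def unique_name_for(u1, u2)
  [u1.uuid, u2.uuid].sort.join(':')
end

# Hypothetical in-memory channel store keyed by unique name.
CHANNEL_STORE = {}

# Returns the existing channel for the pair, or creates one on first use.
def find_or_create_channel(u1, u2)
  key = unique_name_for(u1, u2)
  CHANNEL_STORE[key] ||= {
    sid: "CH#{CHANNEL_STORE.size + 1}",
    members: [u1.uuid, u2.uuid].sort
  }
end

alice = ChatUser.new('b2f0', 'Alice')
bob   = ChatUser.new('a91c', 'Bob')

first  = find_or_create_channel(alice, bob)
second = find_or_create_channel(bob, alice)
```

Both calls resolve to the same entry, so the conversation history stays in one place whichever side opens the chat.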

2. User to dashboard chat (mobile): The process is very similar to the previous use-case. The biggest difference is that the backend tries to find or initialize the channel in the dashboard’s service instance (and not the default service instance):

def create_user_to_dashboard_chat(u)
  # The user's UUID doubles as the channel's unique name.
  c = add_channel(u.uuid, "#{u.name}-#{@dashboard.name} #{u.uuid}")
  c = update_attribute(c, 'topic', @dashboard.name)
  c = update_attribute(c, 'dashboard_uuid', @dashboard.uuid)
  add_member(c, u)
  add_member(c, @dashboard, role_sid: :channel_admin)
  TwilioChannel.load_channel(@dashboard, c)
end

3. Dashboard to user chat (web): Sometimes we would like to send a message to an individual user. The logic is exactly the same as the one we used for the “user-to-dashboard” chat.

4. Dashboard to users (a.k.a. campaigns): As mentioned before, we needed to support two different types of dashboard-to-users communication.
The first use-case is when the dashboard needs to create a group chat with some users in it. The dashboard sends a request to the backend along with the list of search params that define the audience. The backend receives the request, asynchronously creates a new channel, and adds the chosen audience to it as members.
The second use-case is when the dashboard needs to open individual channels between itself and the mobile users. An admin normally uses this communication type when a response is required from the audience. The dashboard sends a request to the backend along with the list of search params that define the audience identities. The backend receives the request, creates a separate channel for each user, and adds that user to it as a member. We use Sidekiq for efficient background processing.
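
The difference between the two campaign types comes down to how many channels get created. Here is a rough sketch in plain Ruby that only plans the work and enqueues nothing; the channel naming scheme, the `plan_*` methods, and the batching step are assumptions made for this example, not details from our Sidekiq jobs.

```ruby
# Group campaign: one channel, with every audience member added to it.
def plan_group_campaign(campaign_id, audience)
  [{ channel: "campaign:#{campaign_id}", members: audience.dup }]
end

# Individual campaign: one channel per user, so each reply stays private.
def plan_individual_campaign(campaign_id, audience)
  audience.map do |user_id|
    { channel: "campaign:#{campaign_id}:#{user_id}", members: [user_id] }
  end
end

# Split a large audience into batches so each background job stays small.
def batches(audience, size)
  audience.each_slice(size).to_a
end

audience = %w[u1 u2 u3 u4 u5]
group_plan = plan_group_campaign(42, audience)
individual_plan = plan_individual_campaign(42, audience)
jobs = batches(audience, 2)
```

For an audience of N users, the group plan costs one channel and N memberships, while the individual plan costs N channels. That is why the second type benefits most from background processing.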

Webhook Events

Programmable Chat event callbacks allow you to monitor and intercept specific events in your backend service. The two categories of events are “Pre-Event” (synchronous) and “Post-Event” (asynchronous). When a callback is specified, Twilio makes an HTTP request to the designated webhook URL. This request contains all relevant variable data.
We used post-event callbacks for analytics purposes and also to re-add the dashboards as members to the channels. According to the docs, “Post-Event” webhooks are “notify only” and provide information after the action has completed. Unlike Pre-Event webhooks, these callbacks are purely informational and do not block.
We specifically relied on the onMessageSent and onChannelAdded event handlers.
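
A post-event endpoint mostly boils down to a dispatch on the event type. The sketch below takes an already-parsed params hash, so it runs without a web framework. The `handle_post_event` method and its `analytics` and `reopen` arrays are hypothetical, and the field names (`EventType`, `ChannelSid`, `From`) are my reading of Twilio’s webhook payload, so check them against the docs before relying on them.

```ruby
# Dispatch a post-event webhook payload by its event type.
# `params` stands in for the parsed form body of Twilio's HTTP request.
def handle_post_event(params, analytics:, reopen:)
  channel_sid = params['ChannelSid']

  case params['EventType']
  when 'onMessageSent'
    analytics << [:message, channel_sid, params['From']]
    # Someone wrote into the channel, so flag it for the dashboard to rejoin.
    reopen << channel_sid unless reopen.include?(channel_sid)
  when 'onChannelAdded'
    analytics << [:channel, channel_sid]
  end

  # Post-event hooks are notify-only, so all we return is a success status.
  200
end

analytics = []
reopen = []

handle_post_event(
  { 'EventType' => 'onMessageSent', 'ChannelSid' => 'CH123', 'From' => 'alice' },
  analytics: analytics, reopen: reopen
)
status = handle_post_event(
  { 'EventType' => 'onChannelAdded', 'ChannelSid' => 'CH456' },
  analytics: analytics, reopen: reopen
)
```

Unknown event types fall through the `case` untouched and still get a success status, which keeps the endpoint tolerant of events we did not subscribe to on purpose.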

Limitations

The Programmable Chat API is still young and a work in progress. While we were building our brand new chat experience, we ran into the following limitations:

  • In the case of campaigns (bulk messages initiated from the dashboard), we don’t initially add the dashboard’s admins as members of the channel. One limitation of the current Twilio Chat API is that an entity can be a member of at most 1000 channels. According to Twilio, this is a soft limit that lets them scale better, but it is still restrictive. Thus, we only add the dashboard as a member once there is an incoming message from a mobile user. This limitation also means that we need to remove the dashboard from the channel when the admin is done processing it (the Open vs. Closed Channels concept mentioned earlier).
  • Currently, we can set custom attributes on channels and messages, but there is no way to query by them. This is super limiting and, in my opinion, the most important missing feature.
  • Each channel can have at most 1000 members. To be fair, I think any group chat with more than 20 members is a place of utter chaos, so I don’t consider this a real limitation.
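
The workaround for the first limitation can be sketched as a small bookkeeping class: join a channel only when a user writes in, and leave it when the admin closes the conversation. This is an illustration under assumptions, not our production code; `DashboardMembership`, `join`, and `leave` are hypothetical names, and a real version would call the REST API wherever the comments say so.

```ruby
# Tracks which channels the dashboard identity currently belongs to and
# refuses to exceed a cap. 1000 is the per-identity limit quoted above.
class DashboardMembership
  LIMIT = 1000

  def initialize(limit: LIMIT)
    @limit = limit
    @joined = []
  end

  # Called when a user message arrives. Returns true if the dashboard is
  # (now) a member, false if the cap would be exceeded.
  def join(channel_sid)
    return true if @joined.include?(channel_sid)
    return false if @joined.size >= @limit

    @joined << channel_sid # a real version would add the member via REST here
    true
  end

  # Called when the admin is done with the conversation.
  # Returns true if the dashboard was a member and has been removed.
  def leave(channel_sid)
    !@joined.delete(channel_sid).nil?
  end

  def joined_count
    @joined.size
  end
end

# A tiny limit makes the behaviour easy to see.
membership = DashboardMembership.new(limit: 2)
results = %w[CH1 CH2 CH3].map { |sid| membership.join(sid) }
membership.leave('CH1')
retry_result = membership.join('CH3')
```

With a limit of 2, the third join is refused until one conversation is closed, which is the same pressure the 1000-channel cap puts on a busy dashboard.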

Conclusion

Overall, I think Twilio provides users with a much smoother experience building a chat system compared to Firebase.
In the long term, a chat system powered by a Pub-Sub protocol will be the real winner in terms of scalability and maintenance.
Keep in mind that there are different solutions for building a chat system. When you need to decide which one works better for you, make a list of pros and cons. Reinventing the wheel is not always a bad idea, but you also need to consider the resources at hand. For a startup, it’s vital to move fast, and with the right tool you can move fast without sacrificing quality.

Resources:
https://www.twilio.com/docs/api/chat/
https://firebase.google.com/
