Solved

Usage of Front's Events API and the q query string

  • 6 December 2022
  • 8 replies
  • 211 views

Badge +2
Our operations team would like to take advantage of Events data and I’m working on interpreting the API documentation.

The documentation says:
 Lists all the detailed events which occured (sic) in the inboxes of the company ordered in reverse chronological order (newest first).
But we are given the following query param:
q (string)
Search query object with optional properties before, after, or types. before and after should be a timestamp in seconds with up to 3 decimal places. types should be a list of event types.

 
I did some digging and discovered how the Search Query Object is defined, kind of. Looks like we can search using the before: and after: available here, but this appears to only index on conversations created_at, created before/after the supplied parameter. Given that this is an event endpoint, I find this behavior confusing. Surely multiple events can occur after a conversation is created. Is the proposed way of interacting with this endpoint to:
  1. run a backfill, or at least for messages created after some timeframe.
  2. run incrementally, since events are pulled in reverse chronological order, and stop when the contents reach messages we’ve already seen.
If this is the case, then making this incremental requires us to inspect every result for either the event ID (IDs are non-incrementing alphanumeric keys, and comparing all of these would be a nightmare, performance-wise) or the updated_at value.
 
Question: Is my above understanding correct, and can we rely on this updated_at value? If there are other examples of customers utilizing this endpoint, I would love to hear more. My current reading of the documentation, such as it is, is that we cannot rely on the Events API to incrementally ingest all events data, or to filter it programmatically in a meaningful way, for an analytics ingestion.
 
Thank you

Best answer by Javier - Developer Relations 6 December 2022, 01:35


Userlevel 5
Badge +8

For Event objects returned from the API (specifically for Message events), we include the following attributes:

  • emitted_at: Indicates the time this Event was generated (i.e. the time Front processed this event)
  • target.data.created_at: Indicates the timestamp on this event. For a Message, this might be the value of the <date> email message header.

 

The order and filtering of the List Events endpoint is based on the creation, or emission, of the Event. However, for Message events, the target.data.created_at and emitted_at timestamps may differ.

 

One reason for this is when a new channel is connected to Front.  For some channels, Front will perform an import of message history.  The time of the import can differ from the time of the message. In that case, the emitted_at timestamp would be today, and the target.data.created_at timestamp would be the time the email was sent.

 

Also, please know that the ids of events are sequential globally and never reused.  After dropping the `evt_` prefix, the remaining string is a base 36 conversion of the decimal id.  You can always trust that a larger id is more recent.
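As a rough illustration of the ID scheme described above, the sketch below decodes an event ID into an integer and uses it as a high-water mark. The helper names (`event_seq`, `newer_than`) and the sample IDs are hypothetical, and it assumes each event is a dict with an `id` key of the form `evt_<base 36>`.

```python
def event_seq(event_id: str) -> int:
    """Decode an event ID such as 'evt_10' into its decimal sequence number."""
    prefix = "evt_"
    if not event_id.startswith(prefix):
        raise ValueError(f"unexpected event id: {event_id!r}")
    # int() with base 36 accepts the digits 0-9 and the letters a-z.
    return int(event_id[len(prefix):], 36)


def newer_than(events, last_seen_id):
    """Keep only the events whose ID is larger (more recent) than last_seen_id."""
    floor = event_seq(last_seen_id)
    return [e for e in events if event_seq(e["id"]) > floor]
```

Comparing decoded integers avoids string comparison, which would sort IDs of different lengths incorrectly. Storing the largest decoded ID after each run gives a single value to compare against on the next run.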

 

Kind regards,



Evan

Badge +2
This is super helpful, thank you. I'd like to clarify a couple of additional things:
 
Do the before or after properties in the q query param allow me to actually filter on emitted_at? When I fussed with this a few weeks ago I got results that suggested it was filtering on the conversation created_at, but I'm not so sure.
 
I'd really like to make requests to the API in a programmatic, limited fashion rather than requesting all data for all time every day for the purpose of analytics. How would I go about doing that in the event that after is not filtered on emitted_at?
 
Also, the documentation I found mentions the "List Events" endpoint, but you mention Message Events - is that just a subset of message types from the Events endpoint? If I'm able to filter and list by emitted_at, the difference in behavior is good to know for downstream users, and I'll pass it on, but I don't think it should impact my ingestion design.
Userlevel 5
Badge +8

The sort field used depends on which API endpoint you call, and it sounds like you were working with a different endpoint before.  Events are sorted on "created_at", which usually matches, but are filtered on "emitted_at".

 

Message events refer to events with a target type message listed here: https://dev.frontapp.com/reference/events

 

Is there any reason you're not using the Analytics Export endpoint to ingest message and activity data?  https://dev.frontapp.com/reference/create-analytics-export

Badge +2

We do already utilize an analytics export, but it's my understanding that Events captures some fine-grained detail about tags, conversations moving, and so on that our current analytics export does not.

 

A question from me - what would building that q param look like, for filtering on the event emitted_at? Particularly if I'm concerned that created_at =/= emitted_at and I am interested in 1) incrementally querying the API, let's say once a day, and 2) being idempotent without missing any data. The idempotence part is easy (some deduping before loading to our final table) - it's the created_at =/= emitted_at that I'm worried about, if one is filtered on, but the other is sorted on. If they don't match, is this a matter of milliseconds? Days? I want to make sure I understand those two fields 100%.

Userlevel 5
Badge +8

That difference can be arbitrarily large. Again, this is about imported messages. Let's say you import messages from 2010 in 2022. Later, you want to filter for all events in 2022. Those messages from 2010 should still appear, as 2022 is when they were imported, but at the end of the list to reflect the message's real creation date in an external system. This allows your team to keep track of both the import timestamp (emitted_at) and the message's timestamp (created_at).
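For the daily incremental pull discussed above, a minimal sketch of building the q parameter might look like the following. It rests on several assumptions not confirmed in this thread: the base URL, the bracketed `q[after]` / `q[before]` / `q[types][]` encoding, and the helper names are all hypothetical, so check them against the API reference before relying on them.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed endpoint URL; confirm against the API reference.
EVENTS_URL = "https://api2.frontapp.com/events"


def daily_window(now: datetime, overlap: timedelta = timedelta(hours=1)):
    """Return (after, before) covering the last day plus an overlap margin."""
    return now - timedelta(days=1) - overlap, now


def build_events_url(after: datetime, before: datetime, types=()) -> str:
    """Build a List Events URL filtered by time window and optional types."""
    params = [
        # Timestamps in seconds with up to 3 decimal places, as documented.
        ("q[after]", f"{after.timestamp():.3f}"),
        ("q[before]", f"{before.timestamp():.3f}"),
    ]
    # Assumed list encoding: one q[types][] entry per event type.
    params += [("q[types][]", t) for t in types]
    return f"{EVENTS_URL}?{urlencode(params)}"


def dedupe(events):
    """Drop repeated events by id, keeping the first occurrence."""
    seen, unique = set(), []
    for event in events:
        if event["id"] not in seen:
            seen.add(event["id"])
            unique.append(event)
    return unique
```

Because the window filters on emitted_at, an imported message from 2010 still falls inside the run that covers its import time. The overlap margin re-reads a little of the previous window, and `dedupe` keeps the load idempotent.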

 

Have you considered webhooks for all events?  This would move to an asynchronous, event-based approach rather than a polling one.

Badge +2

I like the webhooks discussion, but will need to think more about it.

 

Thanks for the detailed breakdown on `emitted_at` vs `created_at`. I've got enough to go on and have demonstrated a working query. Thank you.

Badge +2

Our team reviewed the Events spec, and identified what (common) fields we need.

 

While that helps me identify which fields need be written, I'm concerned about the volume of read/transmission. What I'm getting at: is there a way to filter down the fields returned from the Events endpoint? I really don't want the gigantic `body` field included in all emails. Feels like a waste, since I won't be writing it.

 

I know I can filter on event types, but I think our team doesn't know which ones we need just yet.

 

Userlevel 5
Badge +8

Front's API doesn't support "graph"-style queries, but I have expressed your interest in this to our API product manager.  But, if your team used webhooks for the events, you can opt not to receive the full event payload for the integration.  This would omit fields like Subject, Status, Assignee, Recipient, Tags and the message itself in the webhook payload.

 

Overall, an event-based system with light webhook payloads is my recommendation if the volume of data is any concern for your team.
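Since the endpoint cannot trim fields server-side, one option when polling is to drop unwanted fields on the client before writing. The sketch below is hypothetical: it assumes the large `body` field sits under `target.data`, alongside the `created_at` field mentioned earlier in the thread, and `slim_event` is an illustrative name.

```python
def slim_event(event: dict, drop=("body",)) -> dict:
    """Return a copy of the event without the listed target.data fields."""
    slim = dict(event)
    target = event.get("target")
    if isinstance(target, dict) and isinstance(target.get("data"), dict):
        data = {k: v for k, v in target["data"].items() if k not in drop}
        # Rebuild target so the original event is left unmodified.
        slim["target"] = {**target, "data": data}
    return slim
```

This reduces what gets stored, not what is transmitted. Only the lighter webhook payloads described above reduce the volume sent over the wire.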
