Not a strictly SQL Server-related topic, but I know many of you want to know about this. Let’s talk about how I built the Data Community Events Aggregator.
When the Great Lockdowns of 2020 started, most user groups and community conferences went online. This created an opportunity for people from all over the world to join user groups and conferences they couldn’t attend before. For example, for the first time, I was able to “zoom” into a SQL Server User Group in Los Angeles without having to fly there.
But the same opportunity also created a problem: how do you find all those events? As I was eager to get the most out of it, I decided to build a Data Community Events Aggregator to list all (well, most) events in a single place, and to publish it for others to use as well.
Getting the data
The first and most important problem was finding events. The ambition was to have a crawler that would just magically go through the web and find all relevant events. Whilst the attempt somewhat worked, it quickly turned out that this was not what I wanted. What I wanted was a “curated” list of data community events, not just random events that have the word SQL in them. This meant that the portal was going to require my manual input to validate events.
Sure, I hear you: “Build a machine learning model on top of the crawler that will eventually learn what the data community events are.” Absolutely, that would be the ultimate goal, but remember, I wanted this up and running in a couple of days without having to spend much time and money.
Step 1: Finding existing events
This was relatively easy. I could just search through meetup.com, eventbrite.com and Google for the existing events.
Step 2: Finding new events in the future
This was more complicated. As I did not want to spend too much time on this, I needed a relatively easy, automated or semi-automated way to find any new events. I built a simple Power Automate flow that was “listening” to social media, capturing keywords of interest and sending me a summary email every couple of days. I was then able to review the results and decide what went into the aggregator.
Step 3: Allowing community submissions
As the portal was growing and gaining more and more interest, I was getting requests from organisers to add their events to the calendar. I needed a way to make this all work together without spending too much time on it.
This functionality meant that I needed to have a submission form on the website for people to submit their events. The website was running on WordPress, and the simplest way to achieve this was to store the submissions directly in the WordPress database.
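For illustration, a stored submission only needs a couple of fields. The shape below is a hypothetical example of what such a record might look like when read back out of WordPress (the field names are my assumption, not the portal’s actual schema):

```json
{
  "id": 101,
  "title": "My Local Data User Group",
  "ical_url": "http://mylocalusergroup.com/events/ics/event.ics",
  "status": "approved"
}
```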
Coming up with the data standard
One of the conditions I had was for the portal to be automated, especially when it comes to recurring events. Once I found an event or user group, I wanted to add it and forget about it. I did not want to type in every single event detail or recurrence manually. Ideally, I wanted every single event to be in the iCalendar *.ics format. I also quickly found out that 90% of the user groups and data community events were on the meetup.com portal. Effectively, all I was building was a list of “curated” Meetup events.
So the data standard was:
+--------------------+--------------------------------------------------+
| Event Organiser    | iCal feed                                        |
+--------------------+--------------------------------------------------+
| One-off data event | http://oneoffdataevent.com/events/ics/event.ics  |
| Monthly User Group | http://mylocalusergroup.com/events/ics/event.ics |
+--------------------+--------------------------------------------------+
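For reference, a minimal iCalendar feed behind one of those URLs looks roughly like this (a sketch of the format, not an actual feed):

```
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//My Local User Group//Events//EN
BEGIN:VEVENT
UID:20200915-meeting@mylocalusergroup.com
DTSTART;TZID=America/Los_Angeles:20200915T180000
DTEND;TZID=America/Los_Angeles:20200915T200000
SUMMARY:Monthly SQL Server User Group Meeting
END:VEVENT
END:VCALENDAR
```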
Dealing with non-standard feeds
There are a number of one-off events that do not have an iCal feed. If I become aware of one, I often add it manually for the benefit of the Data Community, but I have always pushed for iCal feeds, even for one-off events. It just makes sense, if only so that individuals can integrate the events into their own schedules.
Since all events were now in WordPress’s database, I could easily access them from within the blog itself. WordPress is written in PHP, which I used to be relatively familiar with. All I needed to do was to get the records from the table, then loop through each *.ics feed, parse the event details and display the results.
That did not quite go as planned. I quickly learned that my PHP skills were a bit rusty, and dealing with all the different time zones and translating them into UTC, so that everything could be shown on the same calendar, was a bit too much for me to handle.
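To give a flavour of why this was painful: every DTSTART can carry a different TZID, and converting a wall-clock time in an arbitrary zone to UTC takes real work. A minimal JavaScript sketch of just that step (the function name is mine; real feeds have many more edge cases, such as all-day events, “Z”-suffixed UTC times and DST transitions):

```javascript
// Convert an iCal DTSTART line with a TZID into a UTC Date.
// Sketch only: assumes the "DTSTART;TZID=<zone>:YYYYMMDDTHHMMSS" form.
function icsDtStartToUtc(line) {
  const m = line.match(
    /^DTSTART;TZID=([^:]+):(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})$/
  );
  if (!m) throw new Error("Unsupported DTSTART format: " + line);
  const [, zone, y, mo, d, hh, mi, ss] = m;
  // Pretend the wall-clock time is UTC, then correct by the zone's offset.
  const asUtc = Date.UTC(+y, +mo - 1, +d, +hh, +mi, +ss);
  const dtf = new Intl.DateTimeFormat("en-US", {
    timeZone: zone, hourCycle: "h23",
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  });
  const p = {};
  for (const { type, value } of dtf.formatToParts(new Date(asUtc))) {
    p[type] = value;
  }
  const wall = Date.UTC(+p.year, +p.month - 1, +p.day, +p.hour, +p.minute, +p.second);
  // (asUtc - wall) is the zone's UTC offset at that instant; add it back.
  return new Date(asUtc + (asUtc - wall));
}

// 6pm in Los Angeles (UTC-7 in September) is 1am UTC the next day.
console.log(icsDtStartToUtc("DTSTART;TZID=America/Los_Angeles:20200915T180000").toISOString());
// → 2020-09-16T01:00:00.000Z
```

Multiply that by every recurrence rule and exception a feed can contain, and hand-rolling it in PHP stops looking attractive.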
I needed a better approach. If only there were an excellent online iCalendar parser that could do the job for me.
Hello Google Calendar Service.
How could I not think of the Google Calendar Service before? All I needed was a simple Google Apps Script that would pull the iCal URLs from the WordPress table via the WordPress API, pull the event data from each iCal feed, parse it and build a single, aggregated calendar that I could then consume back into the WordPress blog. Super-duper easy.
The Apps Script runs every few hours on a time trigger, pulls the list of feeds from my WordPress database via the REST API, then goes through every single feed, gets the details, puts them into a new Google Calendar and then pushes the result back into my WordPress blog.
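The skeleton of such a script is roughly the sketch below. The endpoint path, calendar ID and field names are my assumptions, and the per-event import step is elided; the real script will differ:

```javascript
// Google Apps Script sketch: aggregate community iCal feeds into one calendar.
// Assumed: a WordPress REST endpoint returning [{title, ical_url}, ...] and a
// dedicated Google Calendar to aggregate into. Runs on a time-driven trigger.
var FEED_ENDPOINT = "https://example.com/wp-json/dce/v1/feeds"; // hypothetical
var CALENDAR_ID = "aggregated-events@group.calendar.google.com"; // hypothetical

// Pure helper: pull the iCal URLs out of the endpoint's JSON payload.
function buildFeedList(json) {
  return JSON.parse(json)
    .map(function (row) { return row.ical_url; })
    .filter(function (url) { return !!url; });
}

function syncCalendars() {
  var response = UrlFetchApp.fetch(FEED_ENDPOINT);
  var feeds = buildFeedList(response.getContentText());
  var calendar = CalendarApp.getCalendarById(CALENDAR_ID);
  feeds.forEach(function (url) {
    var ics = UrlFetchApp.fetch(url).getContentText();
    // Creating/updating the events in `calendar` goes here (e.g. via
    // calendar.createEvent); Google handles the timezone maths from there.
  });
}
```

The time trigger itself is configured once in the Apps Script editor (or with `ScriptApp.newTrigger("syncCalendars").timeBased().everyHours(6).create()`).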
Once I had a single, unified calendar with standardised times and dates, I quickly parsed it in PHP and displayed it on the website.
I was keen to have “reminders” of some sort for upcoming Data Community Events, so I created a new Data Community Events Twitter account. I achieved this with Microsoft Power Automate and the WordPress API again: the flow queried the API for the next upcoming Data Community Events and posted a tweet every few hours.
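The flow itself is low-code glue, but the logic it performs is simple enough to sketch in JavaScript (the event shape and tweet wording here are my assumptions):

```javascript
// Sketch of the reminder logic the flow performs: given upcoming events from
// the WordPress API, build a short tweet for the soonest one.
function buildReminderTweet(events, now) {
  const upcoming = events
    .filter(function (e) { return new Date(e.start) > now; })
    .sort(function (a, b) { return new Date(a.start) - new Date(b.start); });
  if (upcoming.length === 0) return null; // nothing to announce
  const next = upcoming[0];
  return "Upcoming Data Community Event: " + next.title +
    " on " + new Date(next.start).toUTCString() + " " + next.url;
}
```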
The Data Community Events Tracker was a fun project to do, especially for a “database guy”. It was certainly different to writing T-SQL. It is primarily serverless, apart from the WordPress bit, and end to end it took me a couple of days, which I was happy with.