Haskell is our primary backend language at Wagon— it helps us (safely) iterate faster, build for multiple environments, and attracts great engineering talent. It is fun to meet engineers from startups and large companies at Haskell meetups, BayHac, and in the functional programming Slack channel. Many engineers are curious how we use Haskell in production and have invited us to speak at their companies.
Wagon’s CTO Mike Craig spoke last month at Twitter (and earlier this year at Square) on the pros and cons of using Haskell in production. Thanks to Twitter’s Peter Seibel for inviting us and to their team for posting the video.
How engaged are your users? How frequently do they visit your website or app? Analytics services like Google Analytics and MixPanel calculate basic counts of daily, weekly, and monthly active users, but it’s difficult to customize or join these results with other data. Writing this query in SQL gives you more control. Let’s do it!
Here’s a table of user logins by day. How many users were active in the last week and month?
In either approach, it’s helpful to have a table of logins per user for each day, even if the user didn’t login (null in this example). If your data isn’t already organized like this, you can generate a table with a row per day, per user, with the following query (this is Postgres syntax, for other databases, modify the generate_series function to generate a table of dates).
To use this data, you can create a temporary table, use a common table expression, or include it as a subselect.
Approach 1: Self Join
A self join is when you join a table with itself. How meta is that? For each row, we ask how many logins that user had in the last week. The join condition requires emails to match and for the date to be within the last 7 days. In line 5, the query sums num_logins for those dates. The case statement identifies the user as active on that day if she had any logins in the prior week.
This query generates a table that tells us which users are seven-day-active over time. This result can be aggregated further, filtered for specific dates, used to find inactive users, and joined with other data. In Wagon, we can create a graph of the number of 7 day active users over time.
Approach 2: Window Functions
The self join works great, but modern databases have a more efficient way to get the same results. With window functions, we can explicitly aggregate only over rows that we care about with just a single pass through the data. If you have millions or billions of rows (lucky you), the self join will take a long time to compute. In line 5, the query sums num_logins for the user’s previous 14 days. It first partitions the table by email, then evaluates over a set of rows - in this case we’re looking at a specific date range. The case statement classifies the user as active or not just as before.
This query makes it easier to add additional metrics for 7 and 30 day active users. As expected, the wider your definition of active user, the more you’ll have. Use these new powers carefully!
Want to learn more SQL? Join us on Monday, November 16 at the Wagon office in San Francisco for a free SQL workshop. Please RSVP!
Electron has momentum. The open source project for building cross platform desktop apps with web technologies now has 6000+ commits from 200+ contributors, a 3500+ person Slack room, and now it’s 4th meetup with 90+ RSVPs (the 1st was a few of us at a bar).
On Monday, the Bay Area Electron group met at Microsoft Reactor, an event space in San Francisco. A few members of Microsoft’s open source team were in town to hear how people are using Electron to build Windows apps. The team is dedicated to helping projects run well with Microsoft platforms. It’s also exciting to see that Visual Studio Code is using Electron. Two of the five talks were Windows related:
Kevin Sawicki from GitHub spoke about how to test Electron apps using ChromeDriver and his project Spectron. Try this in your next project.
Felix Rieseberg from Microsoft’s open source team gave us walkthrough of WinJS. It seems like a great way to make Electron apps look native on Windows 10. Here are the slides.
John Haley from Axosoft’s GitKraken team talked about how they handle task scheduling in their Electron apps. This strategy seems like a new standard way to handle both UI and processing intensive applications.
The Square engineering team invited Wagon to give a tech talk on how we use Haskell in production. Their teams are interested in functional programming and we were honored to walkthrough our experience building a modern analytics tool using Haskell, React, and Electron. Thanks Square for welcoming us!
If you’re using Haskell at your company, let us know. We’d love to trade notes.
How many users joined in the last 5 months? What were total sales in Q2? How much revenue came from the March sign up cohort?
Although these questions can be answered with a single number, it can be useful to see a running total over time: how many unique users joined, or how much cumulative revenue was received by day over some period.
Usually, data is stored incrementally. For example, here’s a table of sales per day:
How do we generate the following table of cumulative sales over time? In SQL, there are two typical approaches: a self join or a window function.
Running Total of Sales
A self join is a query that compares a table to itself. In this case, we’re comparing each date to any date less than or equal to it in order to calculate the running total. Concretely, we take the sum of sales in the second table over every row that has a date less than or equal to the date coming from the first table. This is Postgres/Redshift syntax, but other SQL dialects are very similar.
This is not a bad approach; it is a nice showcase of how extensible SQL can be using only select, from, join, and group by statements.
But it is a lot of code for a simple task. Let’s try a window function. They are designed to calculate a metric over a set of rows. In our case, we want to sum every row where the date is less than or equal to the date in the current row.
The window function can filter and arrange the set of rows to run the function over. Here the order by date rows unbounded preceding limits the sum function to only sales before the date of the current row. Window functions are incredibly useful for time-based analytical queries; to learn more, the Postgres docs are a great place to start.
The final step of creating a chart and sharing it triumphantly with your teammates is easily accomplished using Wagon. Window functions for the win!
Wagon is a modern SQL editor for analysts and engineers: write queries, visualize data, and share charts with your team. Signup for free:
We’re building a hybrid web/native application that runs both in the browser and as a downloadable desktop app. Analysts use Wagon to query, analyze, visualize, and share data: the app is highly interactive and data-heavy. It has to be fast, furious, and stable even when used for hours.
It ain’t all gravy: it’s difficult to maintain a UI with asynchronous updates, large scale data manipulation, and many other cross-cutting concerns. How can we build a sane frontend codebase, without losing our ability to iterate and ship quickly? The answer is to separate concerns. We break our UI into small self-contained components, and we isolate state and manage it separately from the UI. Facebook’s React and Flux libraries make this practical.
The big idea behind React is this: a UI component is just a function from its inputs to its content. All a component needs is a render() method that returns the elements we want the user to see. As an example, here’s a component that takes a size and color and renders a div displaying a filled-in square. Notice that users of this component don’t need to know about how it is implemented.
React components are simple to reuse because they nest like HTML elements. It’s easy to wrap an existing component to add additional styles or behavior—React favors composition over inheritance. We can forget about carefully maintaining the DOM to avoid excessive redraws and flicker: we declare what our components should look like, and React makes it so.
React is great for organizing view elements, but an application is more than static UI. Users generate events and we need to capture them, update state, and direct how the app should respond. Flux manages this flow by clearly seperating user action events from application responses.
Actions encapsulate events. They’re the application logic that runs in response to users doing stuff. In our example, when a user clicks a colored square, we update the server and dispatch to let the rest of app know what happened:
Stores encapsulate state. They listen to state changes dispatched from actions, and they update themselves to record the changes.
UI components listen to stores and re-render when relevant state changes.
Building a solid, maintainable frontend is still difficult despite these great libraries. Here are a few other strategies we’re using:
Build pure React components whenever possible. A pure component doesn’t make any external calls from render()—it’s a pure function of the component’s properties and state. Components like this are much easier to test, debug, and reuse. It’s such a good idea it’s included in React itself!
Separate React components that listen to Flux stores from those that render the UI. Wrapping UI components in container components is another win for reuse. Jason Bonta mentioned this in his great talk at React.js Conf 2015.
Take care when integrating non-React UI components. Mixing React’s declarative API with another library’s imperative API can be painful. Build wrapper components around external libraries, and use React’s lifecycle methods to handle setup and teardown. When possible, avoid exposing direct-update methods like drawChart() or setCursorPosition()—manage state through component properties or Flux stores.
Split Flux actions into modules by UX concern. We separate navigation and authentication from running queries and making charts. Carve out submodules for cross-cutting concerns, like AJAX requests or logging.
Split Flux stores by domain. It’s helpful to separate persisted server-side state from ephemeral page state, for example. We hide the state of the URL bar behind a store, too!
We’re tackling fun engineering challenges at Wagon. If you want to learn more or work everyday on these technologies, check out our jobs page and get in touch!
Many Wagon users previously used SQL Workbench/J to query Amazon Redshift. Older SQL tools are focused on DBA tasks like managing tables, updating schemas, and provisioning users. Analysts just want a simple way to query data, analyze it, visualize it, and collaborate with others. It’s no surprise that we’re frequently asked how to move from legacy tools like SQL Workbench/J to Wagon. It’s super easy.
If you are currently using SQL Workbench/J and want to try Wagon, here are the quick steps to connect to Redshift in Wagon:
In SQL Workbench/J, open the connection window
Grab the hostname, port, and database from the URL, the username, and the password (in the Redshift interface, the URL is called the JDBC URL)
Paste into Wagon (no need to install any drivers!)