<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Wagon Blog</title>
    <description>Wagon is a simple way to see, explore, and collaborate on data.</description>
    <link>http://www.wagonhq.com/blog</link>
    <atom:link href="http://www.wagonhq.com/blog/feed.xml" rel="self" type="application/rss+xml" />
    
      
        <item>
          <title>Wagon Joins Box</title>
          
            <dc:creator>Team Wagon</dc:creator>
          
          
            <description>&lt;p&gt;We’re excited to share that the Wagon team will be joining the &lt;a href=&quot;https://www.box.com/home&quot;&gt;Box&lt;/a&gt; family! We are thrilled to bring our data analytics and insights expertise to the Box enterprise platform.&lt;/p&gt;

&lt;div class=&quot;text-center&quot;&gt;
  &lt;p class=&quot;text-right&quot; style=&quot;&quot;&gt;
    &lt;img src=&quot;/images/box-wagon.png&quot; style=&quot;
      width: 100%;
      margin-bottom: 20px;
      margin-top: 10px;
    &quot; /&gt;
      &lt;br /&gt;
    &lt;code style=&quot;font-size: smaller; font-style: italic; color: #888; position: relative; top: -12px&quot; class=&quot;text-right&quot;&gt;INSERT INTO &quot;Box&quot; VALUES (&#39;Wagon&#39;);&lt;/code&gt;
  &lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;Wagon and Box believe that shared knowledge empowers individuals and unites teams. Box helps enterprises do more with their content and get their work done faster by acting as the central, modern content platform. Fortune 500 companies across every industry trust Box to sit at the center of their businesses. We’re excited to build data analytics products at Box to help people understand their data and work better, together.&lt;/p&gt;

&lt;p&gt;Our team will be focused on building new products at Box, so we’ll be shutting down the current Wagon product on October 3rd, 2016. Please read our &lt;a href=&quot;/faq&quot;&gt;transition FAQ&lt;/a&gt; for details.&lt;/p&gt;

&lt;p&gt;We started Wagon over two years ago to help teams collaborate on data analysis. Tens of thousands of SQL savvy people use Wagon to understand their data. We’re proud of what we’ve accomplished and are ready to continue building at Box.&lt;/p&gt;

&lt;p&gt;We’re humbled by the support from our users, partners, investors, advisors, friends (#bandwagon), and family.  Thank you for your feedback, bug reports, demands, and kind words.&lt;/p&gt;

&lt;p&gt;Please read Box CEO Aaron Levie’s &lt;a href=&quot;http://blog.box.com/blog/wagon-box&quot;&gt;blog post&lt;/a&gt;.&lt;/p&gt;

&lt;p style=&quot;float: right;&quot;&gt;
We&#39;ll be rooting you all on. &lt;i&gt;Gogogo!&lt;/i&gt;&lt;br /&gt;
&amp;mdash; &lt;a href=&quot;/about&quot;&gt;Team Wagon&lt;/a&gt;
&lt;/p&gt;
</description>
          
          <pubDate>Wed, 31 Aug 2016 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/wagon-joins-box</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/wagon-joins-box</guid>
        </item>
      
    
      
        <item>
          <title>Your customer data, together at last</title>
          
            <dc:creator>Andy Granowitz</dc:creator>
          
          
            <description>&lt;p&gt;Customers interact with your company through many paths — they engage inside your product, open emails, view ads, answer surveys, etc. To understand your customers, you have to combine product engagement with their other interactions.&lt;/p&gt;

&lt;p&gt;Unfortunately, it’s difficult to analyze customer data across these different experiences.  Customer behavior in your product is usually measured through a 3rd party tracking service or by analyzing raw application logs.  Other interactions, such as customer support chats or sales calls, are stored in SaaS apps or homegrown internal tools.  You can only access this data through each of their limited reporting UIs, exports, or possibly an API.  There’s no magic wand for joining these disparate streams.&lt;/p&gt;

&lt;p&gt;There are two common approaches to join these multiple datasets: (1) send all data to one tracking service or (2) move the data to a common place. Sending all the data to a single tracking service, like Google Analytics or Mixpanel, is cumbersome, nearly impossible to implement, and inflexible to query. Dead end.&lt;/p&gt;

&lt;p&gt;The better approach is to move all customer data into a common store. Amazon Redshift, Google BigQuery, and Azure SQL Warehouse are our favorite cloud data stores for large scale analytics. They’re fast, easy to manage, and decreasing in price. But moving data from 3rd party sources to these cloud databases is still hard. You have to work with multiple APIs, find join keys, schedule data pulls, handle backfill, etc… It’s a huge headache.&lt;/p&gt;

&lt;p&gt;There are two strategies: build or buy. Building your own is complicated - we have a &lt;a href=&quot;/blog/building-an-analytics-pipeline&quot;&gt;post&lt;/a&gt; about it!&lt;/p&gt;

&lt;p style=&quot;max-width: 600px; margin: auto;&quot;&gt;
	&lt;img src=&quot;/images/partners/segment-sources.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h2 class=&quot;text-center&quot; style=&quot;font-size: 45px; margin-bottom: 0px;&quot;&gt;
  &lt;a href=&quot;/partners/segment&quot; class=&quot;no-underline&quot;&gt;
    &lt;img src=&quot;/images/wagon.png&quot; id=&quot;wagon-partner-logo&quot; /&gt;
      &amp;nbsp;&amp;nbsp;💖&amp;nbsp;&amp;nbsp;
    &lt;span&gt;
      &lt;img src=&quot;/images/partners/segment-logo.svg&quot; id=&quot;segment-logo&quot; /&gt;
    &lt;/span&gt;
  &lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Today, thanks to &lt;a href=&quot;https://segment.com/sources/?utm_medium=blog&amp;amp;utm_source=wagon&amp;amp;utm_campaign=sources&quot;&gt;Segment Sources&lt;/a&gt;, bringing all of your customer touch points together got easier. They’ve made it easy to pull customer data from many sources, such as Salesforce, Zendesk, and behavioral data from your app or website, into a single warehouse.  We worked with Segment to write some &lt;a href=&quot;/partners/segment/queries&quot;&gt;starter queries&lt;/a&gt; - give them a go. If you’re already using Wagon and connect a Segment Warehouse, you can open these queries right in the app.&lt;/p&gt;

&lt;p&gt;Segment will be expanding its Source catalog in the coming months, and you can always check out the current &lt;a href=&quot;https://segment.com/catalog/?utm_medium=blog&amp;amp;utm_source=wagon&amp;amp;utm_campaign=sources&quot;&gt;catalog&lt;/a&gt; to see the latest additions. We’re excited to partner with Segment on our shared mission to make answering questions with data easy.&lt;/p&gt;

&lt;p&gt;We also recommend other tools to help unify your various customer data sets: &lt;a href=&quot;https://fivetran.com/&quot;&gt;Fivetran&lt;/a&gt;, &lt;a href=&quot;https://rjmetrics.com/product/pipeline&quot;&gt;RJ Metrics Pipeline&lt;/a&gt;, and &lt;a href=&quot;http://snowplowanalytics.com/&quot;&gt;Snowplow&lt;/a&gt; – so that you’ll always have access to the data you need in Wagon, no matter how you choose to set up your pipeline.&lt;/p&gt;

&lt;p&gt;Modern ETL + modern SQL tools = win!&lt;/p&gt;

&lt;script src=&quot;/js/public-query.js&quot;&gt;&lt;/script&gt;

&lt;script src=&quot;/js/protocol-url.js&quot;&gt;&lt;/script&gt;

</description>
          
          <pubDate>Wed, 06 Apr 2016 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/customer-data-together</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/customer-data-together</guid>
        </item>
      
    
      
        <item>
          <title>Deploying Electron</title>
          
            <dc:creator>Matt DeLand</dc:creator>
          
          
            <description>&lt;p&gt;&lt;a href=&quot;http://electron.atom.io/&quot;&gt;Electron&lt;/a&gt; makes it easy to write cross-platform desktop apps using web technology— but how do you build and deploy these hybrid apps in production?&lt;/p&gt;

&lt;p&gt;At January’s &lt;a href=&quot;http://www.meetup.com/Bay-Area-Electron-User-Group/events/228010482/&quot;&gt;Bay Area Electron Meetup&lt;/a&gt;, we presented how Wagon builds and delivers new updates to our users. This talk covers both standard and custom Electron usage for background updating, uninterrupted usage, helper processes, cross-platform builds, OS notifications, and the occasional hotfix.&lt;/p&gt;

&lt;script async=&quot;&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;1b22323b1ac34371a4d75a5813c17f86&quot; data-ratio=&quot;1.29456384323641&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;

&lt;hr /&gt;

&lt;p&gt;Thanks to &lt;a href=&quot;https://twitter.com/bengotow&quot;&gt;Ben Gotow&lt;/a&gt; at Nylas for organizing!&lt;/p&gt;
</description>
          
          <pubDate>Tue, 23 Feb 2016 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/deploying-electron</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/deploying-electron</guid>
        </item>
      
    
      
    
      
    
      
    
      
    
      
        <item>
          <title>Haskell for commercial software development</title>
          
            <dc:creator>Mike Craig</dc:creator>
          
          
            <description>&lt;p&gt;Inquiring minds on Quora want to know, &lt;em&gt;&lt;a href=&quot;https://www.quora.com/Is-Haskell-suitable-for-commercial-software-development&quot;&gt;is Haskell suitable for commercial software development?&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I’m looking for an alternative to Python mainly for reasons of Python having so much trouble with
workable concurrency, and Go seemed great at first look (it’s still not bad, but indeed error
handling and lack of generics are a few features cut away too far).&lt;/p&gt;

  &lt;p&gt;This pretty much leaves Haskell as workable alternative. But is it really a practical alternative?&lt;/p&gt;

  &lt;p&gt;My biggest fear that while Haskell seems very powerful, it is also difficult and so it may be too
difficult for person A to read code written by person B (what seems to have killed Lisp in my view
is that problem and person-specific DSLs written in Lisp were simply unreadable for too many other
people).&lt;/p&gt;

  &lt;p&gt;PS, in other words: I do not ask about strong sides of Haskell (or language X), those are well
known, but rather about a lack of show-stoppers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Absolutely, yes! At Wagon, we’ve built server- and client-side systems in Haskell, running in AWS and distributed to our users in a cross-platform desktop app.&lt;/p&gt;

&lt;p style=&quot;max-width: 600px; margin: auto;&quot;&gt;
  &lt;img src=&quot;/images/posts/quora.png&quot; alt=&quot;Wagon on Quora&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Possible show-stoppers to using Haskell commercially:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Community support&lt;/strong&gt;&lt;br /&gt; A good open-source community is a big deal. It’s the difference between fixing other people’s code and finding an open GitHub issue with a workaround and an in-progress fix. With Haskell, we find the latter much more frequently than the former. The community is positive, active, and generally helpful when problems come up. Lively discussion takes place on &lt;a href=&quot;https://www.reddit.com/r/haskell/&quot;&gt;Reddit&lt;/a&gt;, &lt;a href=&quot;https://wiki.haskell.org/IRC_channel&quot;&gt;IRC&lt;/a&gt;, a &lt;a href=&quot;http://fpchat.com/&quot;&gt;public Slack channel&lt;/a&gt;, and several &lt;a href=&quot;https://wiki.haskell.org/Mailing_lists&quot;&gt;mailing lists&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hiring&lt;/strong&gt;&lt;br /&gt; There are many factors involved in hiring: team size, location, remote vs on-site, other expertise required, etc. At Wagon we’re an on-site team of ~10 in San Francisco, with lots of web- and database-related problems to work on. Finding qualified Haskellers has not been an issue.&lt;/p&gt;

&lt;p&gt;Haskell is different from the languages lots of developers are used to. ML-style syntax, strong types, and lazy evaluation make for a steep initial learning curve. Anecdotally, we’ve found Haskell’s secondary learning curve smooth and productive. Intermediate Haskellers quickly pick up advanced concepts on the job: parsers, monad transformers, higher-order types, profiling and optimization, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Available industrial-strength libraries&lt;/strong&gt;&lt;br /&gt; Haskell lends itself to lightweight-but-powerful libraries that do one thing very well. We get a lot done by composing these small libraries rather than relying on big feature-complete frameworks. Some libraries have become de-facto standards that are stable and performant, and &lt;a href=&quot;http://www.stackage.org/&quot;&gt;Stackage&lt;/a&gt; gives us a reliable way to find additional packages that usually “just work”.&lt;/p&gt;

&lt;p&gt;We do sometimes find gaps in Haskell’s available libraries. For example, Haskell can’t yet match Python’s &lt;a href=&quot;http://pandas.pydata.org/&quot;&gt;tools for numerical computing&lt;/a&gt;. And like every open-source ecosystem, there is some bad code out there. We rely on community recommendations and a sense of dependency hygiene to stay productive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;&lt;br /&gt; A quick Google search turns up benchmarks comparing Haskell to anything else you can imagine. In practice, we’re interested in tools with two properties:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Fast enough under most circumstances. “Fast enough” means performance is never an issue for a given piece of code.&lt;/li&gt;
  &lt;li&gt;Easy to optimize when needed. When we do need more speed, we’d like to get it incrementally rather than immediately rewrite that code in C.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Haskell has both of these properties. We almost never have to address performance in our Haskell code. When we do, we turn to GHC’s profiler and its many performance knobs: optimization flags, inlining, strictness and memory layout controls, etc. Haskell also has good interoperability with C for when that’s appropriate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory leaks&lt;/strong&gt;&lt;br /&gt; There’s a lot of talk about memory leaks caused by Haskell’s lazy evaluation model. We run into this very infrequently in practice, and in those rare situations GHC’s profiler has led us to solutions quickly. The real problem is not lazy evaluation but &lt;a href=&quot;http://dev.stephendiehl.com/hask/#lazy-io&quot;&gt;lazy IO&lt;/a&gt;, which we avoid with tools like &lt;a href=&quot;http://www.stackage.org/package/conduit&quot;&gt;conduit&lt;/a&gt; and &lt;a href=&quot;http://www.stackage.org/package/pipes&quot;&gt;pipes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debuggability&lt;/strong&gt;&lt;br /&gt; Haskell doesn’t have a debugger in the same sense as Python or Java. This is mostly a non-issue, because exceptions are rare and GHCi gives us a flexible way to run code interactively. Nonetheless, hunting down problems in a big application can be difficult. This has improved in recent versions of GHC with support for proper &lt;a href=&quot;https://downloads.haskell.org/~ghc/latest/docs/html/libraries/base-4.8.2.0/GHC-Stack.html&quot;&gt;stack traces&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Haskellers reading other Haskellers’ code&lt;/strong&gt;&lt;br /&gt; This hasn’t been a problem for us. Haskell is a flexible language both in syntax and semantics, but this leads to better solutions more than it leads to opaque or unreadable code. Heavy indirection—via embedded DSLs or deep typeclass hierarchies—is unusual in practice. We reach for those tools as a last resort, knowing that they come with a serious maintenance cost. Agreeing on a basic &lt;a href=&quot;https://github.com/tibbe/haskell-style-guide/blob/master/haskell-style.md&quot;&gt;style guide&lt;/a&gt; helps smooth out minor syntax differences.&lt;/p&gt;

&lt;p&gt;Read our other &lt;a href=&quot;https://www.wagonhq.com/blog/haskell&quot;&gt;Haskell engineering blog posts&lt;/a&gt;, come to our community events, or better yet, &lt;a href=&quot;https://www.wagonhq.com/jobs&quot;&gt;join our team&lt;/a&gt;!&lt;/p&gt;
</description>
          
          <pubDate>Tue, 02 Feb 2016 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/haskell-for-industry</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/haskell-for-industry</guid>
        </item>
      
    
      
    
      
        <item>
          <title>Python UDFs in Amazon Redshift</title>
          
            <dc:creator>Elise Breda (Yhat)</dc:creator>
          
          
            <description>&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://twitter.com/elisebreda&quot;&gt;Elise Breda&lt;/a&gt; is the Growth Strategist at &lt;a href=&quot;https://www.yhat.com/&quot;&gt;Yhat&lt;/a&gt;, a tech company that builds software for deploying predictive analytics to mobile and web apps. When she isn’t at their awesome new office in DUMBO, she can be found exploring the bike paths and Thai restaurants of NYC.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;in-the-beginning&quot;&gt;In the Beginning…&lt;/h3&gt;

&lt;p&gt;In the early 2000s, the behemoth collection of cloud computing services we now know as &lt;a href=&quot;http://aws.amazon.com/&quot;&gt;Amazon Web Services&lt;/a&gt; (AWS) was little more than sparks firing in &lt;a href=&quot;https://twitter.com/ccpinkham&quot;&gt;Chris Pinkham&lt;/a&gt; and &lt;a href=&quot;https://twitter.com/b6n&quot;&gt;Benjamin Black&lt;/a&gt;’s neurons. In 2003, the two presented a paper (&lt;a href=&quot;http://blog.b3k.us/2009/01/25/ec2-origins.html&quot;&gt;blog post here&lt;/a&gt;) outlining a radical vision for a retail computing infrastructure that would be standardized, automated and built upon web services. The next year, &lt;a href=&quot;https://en.wikipedia.org/wiki/Amazon_Simple_Queue_Service&quot;&gt;Simple Queue Service&lt;/a&gt;, the first AWS service for public usage, was launched.&lt;/p&gt;

&lt;p&gt;Fast forward almost a decade, and AWS is now the most commonly used cloud platform among enterprise software developers. AWS products span the gamut of web services, from computation (e.g. EC2) to networking (e.g. VPC) and storage (e.g. S3). In this post we’ll explore a small fraction of a fraction of the AWS ecosystem–a database that’s generating all kinds of groundswell right now: &lt;a href=&quot;https://en.wikipedia.org/wiki/Amazon_Redshift&quot;&gt;Amazon Redshift&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The rest of this post will talk about Redshift at a high level and then dive into a mini overview of &lt;a href=&quot;https://en.wikipedia.org/wiki/User-defined_function&quot;&gt;User Defined Functions&lt;/a&gt; (UDFs), how they work, why they’re great, and how to start using them.&lt;/p&gt;

&lt;h3 id=&quot;amazon-redshift&quot;&gt;Amazon Redshift&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://aws.amazon.com/redshift/&quot;&gt;Amazon Redshift&lt;/a&gt; is a hosted data warehouse that’s easy to set up, built for speed, and suitable for a variety of data-combining, storage, and compute-heavy analytics tasks.&lt;/p&gt;

&lt;p&gt;Two things make Redshift particularly attractive. First, Redshift can handle insane amounts of data–it is a petabyte-scale warehouse. A petabyte is a &lt;em&gt;lot&lt;/em&gt; (10&lt;sup&gt;15&lt;/sup&gt; bytes) of data. As a point of reference, the entire master catalog of Netflix video in 2013 amounted to about 3.14 petabytes of storage space (interesting read &lt;a href=&quot;https://www.quora.com/What-things-in-the-world-have-a-pebibyte-of-storage-space-in-them&quot;&gt;on Quora&lt;/a&gt;). Second, unlike Amazon’s other hosted database product, Amazon RDS, Redshift stores data in a column-based structure. Column orientation is good for tables containing columns with lots of repeated values (e.g. credit card names, county/state, product type), as in &lt;a href=&quot;http://www.salesforce.com&quot;&gt;CRM&lt;/a&gt; data. The benefit of column data is that because it’s uniform, there are opportunities for storage size optimization via compression. You can read more about how to maximize compression &lt;a href=&quot;https://en.wikipedia.org/wiki/Column-oriented_DBMS#Compression&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p style=&quot;max-width: 400px; margin: auto;&quot;&gt;
	&lt;img src=&quot;https://kejserbi.files.wordpress.com/2012/07/image4.png&quot; alt=&quot;Column orientation compression&quot; /&gt;
&lt;/p&gt;
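
&lt;p&gt;To make the compression point concrete, here is a minimal Python sketch of run-length encoding, one simple scheme that benefits when repeated values sit next to each other in a column. The &lt;code&gt;run_length_encode&lt;/code&gt; name and the sample column are hypothetical, and this is an illustration of the idea rather than a description of how Redshift stores data internally.&lt;/p&gt;

&lt;pre&gt;
```python
from itertools import groupby


def run_length_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


# A hypothetical "state" column: six stored values shrink to three pairs.
state_column = ["CA", "CA", "CA", "NY", "NY", "CA"]
encoded = run_length_encode(state_column)
```
&lt;/pre&gt;

&lt;p&gt;The longer the runs of identical values, the fewer pairs are needed, which is why low-cardinality columns tend to compress well.&lt;/p&gt;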

&lt;p&gt;Redshift handles large scale column-oriented datasets using massive parallel processing, performing coordinated computations across a large number of processors in parallel, making it a fast and powerful data warehouse option.&lt;/p&gt;

&lt;h3 id=&quot;data-warehouse-setup-in-the-data-age&quot;&gt;Data Warehouse Setup in the Data Age&lt;/h3&gt;

&lt;p&gt;Even just a few years ago, getting a data warehouse and proper ETL processes in place was a long, painful and probably very expensive ordeal. But we’ve arrived in the data age where easy-to-use, affordable data solutions are bountiful.&lt;/p&gt;

&lt;p&gt;At &lt;a href=&quot;https://www.yhat.com/&quot;&gt;Yhat&lt;/a&gt;, we use Redshift to warehouse everything–CRM data (we use SFDC), product data, site metrics from Google Analytics, and data from a bunch of other sources. It took us about 20 mins to set up the database on AWS, and it took us…wait for it…another 20 mins or so to set up all of our ETL using &lt;a href=&quot;http://fivetran.com/&quot;&gt;Fivetran&lt;/a&gt;, which we couldn’t be more impressed with.&lt;/p&gt;

&lt;h3 id=&quot;sql-ide-done-right&quot;&gt;SQL IDE Done Right&lt;/h3&gt;

&lt;p&gt;Most SQL IDEs of yesteryear leave something to be desired in terms of UX. The majority are clunky and have super old school frankenstein UIs. Why they all focus on exploring the DB schema rather than on making it easy to write queries, view results and think critically about your data has always been a mystery.&lt;/p&gt;

&lt;p&gt;Well those days are also over. Wagon is the query-focused SQL app I’ve been looking for for years. Wagon boasts a clean UX designed for analysts. Features are carefully chosen with a keen eye for usability for people writing tens or hundreds of queries per day. Wagon gets it in spades.&lt;/p&gt;

&lt;h3 id=&quot;overview-of-python-udfs-in-redshift&quot;&gt;Overview of Python UDFs in Redshift&lt;/h3&gt;

&lt;p&gt;UDF stands for user-defined function, meaning that you can add functions to an environment (in this case, Redshift) in addition to those that come built in. Python UDFs allow you to combine the power of Redshift with what you know and love about the Python programming language without switching between IDEs or systems.&lt;/p&gt;

&lt;p&gt;The great thing about UDFs in Redshift is that Redshift will automatically execute them using its MPP architecture. One caveat to keep in mind is that your Python code still won’t execute as quickly as native SQL functions (&lt;code&gt;AVG&lt;/code&gt;, &lt;code&gt;MIN&lt;/code&gt;, &lt;code&gt;MAX&lt;/code&gt;, etc.) that are baked into the database.&lt;/p&gt;

&lt;h3 id=&quot;how-to-use-udfs&quot;&gt;How to Use UDFs&lt;/h3&gt;

&lt;p&gt;You can certainly work with text in pure SQL, but some tasks are just easier to do in a scripting language like Python instead. Here’s a toy example to illustrate how to use Python functionality within Redshift using a UDF.&lt;/p&gt;

&lt;p&gt;Suppose a column in one of our tables contains huge chunks of text or html, and we’re interested to find any email addresses within any one record. Let’s write a function that will take in raw text and return a pipe &lt;code&gt;|&lt;/code&gt; separated string containing any email addresses found within the input text document. Define the function like so:&lt;/p&gt;

&lt;script src=&quot;https://gist.github.com/elisebreda/e5ea2dcb43bc896c3ab0.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;Once defined, you can use it like this:&lt;/p&gt;

&lt;script src=&quot;https://gist.github.com/elisebreda/6286e4497a96bfa122b7.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;This is a scalar function, so it’ll return one record for each input row (i.e. not an aggregate function). One thing to remember is that your UDFs are per-database, meaning that if you have multiple databases in your Redshift cluster, you’ll need to define your functions in each database.&lt;/p&gt;
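
&lt;p&gt;The body of a UDF like this is ordinary Python, so the logic can be tried locally before it is wrapped in a function definition. Here is a minimal sketch, assuming a deliberately loose regular expression; the &lt;code&gt;extract_emails&lt;/code&gt; name and the pattern are illustrative and are not taken from the gist above.&lt;/p&gt;

&lt;pre&gt;
```python
import re

# Loose, illustrative pattern (an assumption); real-world email matching is messier.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return a pipe-separated string of the email addresses found in text."""
    if text is None:
        # Mirror SQL semantics: a NULL input yields a NULL output.
        return None
    return "|".join(EMAIL_PATTERN.findall(text))


sample = "Contact a@example.com or b.c@test.org. Thanks!"
found = extract_emails(sample)
```
&lt;/pre&gt;

&lt;p&gt;Input with no addresses gives an empty string, so downstream queries can tell "no emails" apart from a null input.&lt;/p&gt;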

&lt;h3 id=&quot;example&quot;&gt;Example&lt;/h3&gt;

&lt;p&gt;Redshift Python UDFs are based on Python 2.7 and come preloaded with a lot of our favorite libraries, including NumPy, SciPy and Pandas. You can also import custom modules from S3 and the web.&lt;/p&gt;

&lt;p&gt;Here’s the template published on the AWS blog that you can use to start creating your own scalar functions:&lt;/p&gt;

&lt;script src=&quot;https://gist.github.com/elisebreda/471b18eb6e87c4fa3b3f.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;The scalar UDFs that you create will return a single result value for each input value. Once you’ve defined a UDF, you can use it in any SQL statement.&lt;/p&gt;

&lt;h3 id=&quot;helpful-resources&quot;&gt;Helpful Resources&lt;/h3&gt;

&lt;p&gt;To learn more about Python UDFs in Redshift, check out Amazon’s &lt;a href=&quot;http://docs.aws.amazon.com/redshift/latest/dg/user-defined-functions.html&quot;&gt;documentation&lt;/a&gt;, which is super helpful and covers everything from constraints to security and python support. You can also check out the &lt;a href=&quot;https://aws.amazon.com/blogs/aws/user-defined-functions-for-amazon-redshift/&quot;&gt;initial release blogpost&lt;/a&gt; and a more &lt;a href=&quot;http://blogs.aws.amazon.com/bigdata/post/Tx1IHV1G67CY53T/Introduction-to-Python-UDFs-in-Amazon-Redshift&quot;&gt;extensive post&lt;/a&gt; that uses UDFs to analyze the CMS Open Payments data set.&lt;/p&gt;

&lt;h3 id=&quot;yhat&quot;&gt;Yhat&lt;/h3&gt;
&lt;p&gt;Yhat’s flagship product, &lt;a href=&quot;https://www.yhat.com/products/sciencecluster&quot;&gt;ScienceOps&lt;/a&gt;, empowers teams of data scientists to deploy their models into web and mobile applications. These models are embedded into production applications via REST APIs without any recoding from their native statistical language.&lt;/p&gt;
</description>
          
          <pubDate>Mon, 01 Feb 2016 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/Redshift-UDFs-in-Python</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/Redshift-UDFs-in-Python</guid>
        </item>
      
    
      
    
      
    
      
    
      
        <item>
          <title>FUNctional Programming with Haskell - Twitter Tech Talk</title>
          
            <dc:creator>Jeff Weinstein</dc:creator>
          
          
            <description>&lt;p&gt;Haskell is our primary backend language at Wagon: it helps us (safely) iterate faster, build for multiple environments, and attract great engineering talent. It is fun to meet engineers from startups and large companies at &lt;a href=&quot;http://www.meetup.com/Bay-Area-Haskell-Users-Group/&quot;&gt;Haskell meetups&lt;/a&gt;, &lt;a href=&quot;http://bayhac.org/&quot;&gt;BayHac&lt;/a&gt;, and in the &lt;a href=&quot;http://www.fpchat.com/&quot;&gt;functional programming Slack channel&lt;/a&gt;. Many engineers are curious how we use Haskell in production and have invited us to speak at their companies.&lt;/p&gt;

&lt;p style=&quot;max-width: 100px; margin: auto;&quot;&gt;
	&lt;img src=&quot;/images/posts/twitter_eng.png&quot; alt=&quot;Twitter Engineering.&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Wagon’s CTO Mike Craig spoke last month at Twitter (and &lt;a href=&quot;https://www.wagonhq.com/blog/square-tech-talk&quot;&gt;earlier this year at Square&lt;/a&gt;) on the pros and cons of using Haskell in production. Thanks to Twitter’s &lt;a href=&quot;https://twitter.com/peterseibel&quot;&gt;Peter Seibel&lt;/a&gt; for inviting us and to their team for posting the video.&lt;/p&gt;

&lt;p&gt;Here’s the talk:&lt;/p&gt;

&lt;div class=&quot;responsive-container&quot;&gt;
  &lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/Dbjt_u1kSo4&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;If you’re using Haskell at your company, &lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#104;&amp;#101;&amp;#108;&amp;#108;&amp;#111;&amp;#064;&amp;#119;&amp;#097;&amp;#103;&amp;#111;&amp;#110;&amp;#104;&amp;#113;&amp;#046;&amp;#099;&amp;#111;&amp;#109;&quot;&gt;let us know&lt;/a&gt;.&lt;/p&gt;
</description>
          
          <pubDate>Mon, 21 Dec 2015 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/twitter-tech-talk</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/twitter-tech-talk</guid>
        </item>
      
    
      
        <item>
          <title>Calculating Active Users in SQL</title>
          
            <dc:creator>Andy Granowitz</dc:creator>
          
          
            <description>&lt;p&gt;How engaged are your users? How frequently do they visit your website or app? Analytics services like Google Analytics and Mixpanel calculate basic counts of daily, weekly, and monthly active users, but it’s difficult to customize or join these results with other data. Writing this query in SQL gives you more control. Let’s do it!&lt;/p&gt;

&lt;p&gt;Here’s a table of user logins by day. How many users were active in the last week and month?&lt;/p&gt;

&lt;table class=&quot;table&quot;&gt;
  &lt;tr&gt;
    &lt;th&gt;date&lt;/th&gt;
    &lt;th&gt;user_id&lt;/th&gt;
    &lt;th&gt;num_logins&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/1/15&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/1/15&lt;/td&gt;
    &lt;td&gt;2&lt;/td&gt;
    &lt;td&gt;&lt;em&gt;null&lt;/em&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/1/15&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/2/15&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;&lt;em&gt;null&lt;/em&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/2/15&lt;/td&gt;
    &lt;td&gt;2&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/2/15&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Like &lt;a href=&quot;/blog/running-totals-sql&quot;&gt;calculating running totals&lt;/a&gt;, there are two approaches: &lt;code&gt;self join&lt;/code&gt; or &lt;code&gt;window function&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In either approach, it’s helpful to have a table of logins per user for each day, even if the user didn’t log in (&lt;em&gt;null&lt;/em&gt; in this example). If your data isn’t already organized like this, you can generate a table with a row per day, per user, with the following query. This is Postgres syntax; for other databases, modify the &lt;code&gt;generate_series&lt;/code&gt; function to generate a table of dates.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;select 
    d.date, 
    u.user_id, 
    c.num_logins
from (
    select * from
    -- fill in the minimum date in your dataset
    generate_series(&amp;#39;01-05-15&amp;#39;::timestamp, 
                    current_date::timestamp, &amp;#39;24 hours&amp;#39;) as date
) d
full outer join (select distinct(user_id) as user_id from userActivityTable) u on 1 = 1
full outer join userActivityTable c on u.user_id = c.user_id and c.date = d.date;&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/grano/215991d7fb785bf7685a.js?file=generate_full_table.sql&quot;&gt; &lt;/script&gt;

&lt;p&gt;To use this data, you can create a temporary table, use a common table expression, or include it as a subselect.&lt;/p&gt;
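
&lt;p&gt;As a rough sketch of the common table expression option, the generated table can be given a name and queried like any other table. The name &lt;code&gt;full_activity&lt;/code&gt; is a placeholder, and the &lt;code&gt;cross join&lt;/code&gt; and &lt;code&gt;left join&lt;/code&gt; are a variation on the full outer joins above; they should return the same rows as long as every activity row’s date matches one of the generated dates (Postgres syntax).&lt;/p&gt;

&lt;pre&gt;-- full_activity is a hypothetical name for the per-day, per-user table
with full_activity as (
    select
        d.date,
        u.user_id,
        c.num_logins
    from (
        select * from
        -- fill in the minimum date in your dataset
        generate_series(&amp;#39;01-05-15&amp;#39;::timestamp,
                        current_date::timestamp, &amp;#39;24 hours&amp;#39;) as date
    ) d
    -- every date paired with every user
    cross join (select distinct user_id from userActivityTable) u
    -- num_logins stays null on days without activity
    left join userActivityTable c on u.user_id = c.user_id and c.date = d.date
)
select date, user_id, num_logins
from full_activity
order by date, user_id;&lt;/pre&gt;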

&lt;h4 id=&quot;approach-1-self-join&quot;&gt;Approach 1: Self Join&lt;/h4&gt;

&lt;p&gt;A self join is when you join a table with itself.  How meta is that?  For each row, we ask how many logins that user had in the last week.  The join condition requires the user IDs to match and the date to fall within the last 7 days. In line 5, the query sums &lt;code&gt;num_logins&lt;/code&gt; for those dates. The case statement identifies the user as active on that day if she had any logins in the prior week.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;select 
    o.user_id, 
    o.date,
    case
        when sum(i.num_logins) &amp;gt;= 1 then TRUE
        else FALSE 
    end as active
from userActivityTable as o
join userActivityTable as i on 
    i.date &amp;lt;= o.date AND
    i.date &amp;gt;= (o.date :: date) - integer &amp;#39;7&amp;#39; AND
    i.user_id = o.user_id
group by o.user_id, o.date;
&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/grano/a668412b889cb172823f.js?file=active_users_self_join.sql&quot;&gt; &lt;/script&gt;

&lt;p&gt;This query generates a table that tells us which users are seven-day-active over time. This result can be aggregated further, filtered for specific dates, used to find inactive users, and joined with other data. In Wagon, we can create a &lt;a href=&quot;https://app.wagonhq.com/result/worlyovh53iwh4xr&quot;&gt;graph of the number of 7 day active users over time&lt;/a&gt;.&lt;/p&gt;
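
&lt;p&gt;One possible way to aggregate further, sketched with the self join query above used as a subselect. The alias &lt;code&gt;daily_status&lt;/code&gt; and the column name &lt;code&gt;seven_day_active_users&lt;/code&gt; are made up for illustration.&lt;/p&gt;

&lt;pre&gt;select
    date,
    count(*) as seven_day_active_users
from (
    -- the self join query from above, unchanged
    select
        o.user_id,
        o.date,
        case
            when sum(i.num_logins) &amp;gt;= 1 then TRUE
            else FALSE
        end as active
    from userActivityTable as o
    join userActivityTable as i on
        i.date &amp;lt;= o.date AND
        i.date &amp;gt;= (o.date :: date) - integer &amp;#39;7&amp;#39; AND
        i.user_id = o.user_id
    group by o.user_id, o.date
) as daily_status
-- keep only the users flagged active on each day
where active
group by date
order by date;&lt;/pre&gt;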

&lt;h4 id=&quot;approach-2-window-functions&quot;&gt;Approach 2: Window Functions&lt;/h4&gt;

&lt;p&gt;The self join works great, but modern databases have a more efficient way to get the same results. With window functions, we can explicitly aggregate only over rows that we care about with just a single pass through the data. If you have millions or billions of rows (lucky you), the self join will take a long time to compute. In line 5, the query sums &lt;code&gt;num_logins&lt;/code&gt; for the user’s previous 14 days. It first partitions the table by &lt;code&gt;user_id&lt;/code&gt;, then evaluates over a set of rows: in this case the current row and the 14 rows before it, which correspond to consecutive dates only because the table has one row per user per day.  The case statement classifies the user as active or not just as before.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;select
    user_id,
    date,
    case
        when sum(num_logins) over (partition by user_id order by date rows between 14 preceding and current row) &amp;gt;= 1 then TRUE
        else FALSE
    end as active
from userActivityTable;
&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/grano/a16e64d9164924bc6f3a.js?file=active_users_window_function.sql&quot;&gt; &lt;/script&gt;

&lt;p&gt;This query makes it easier to add additional metrics for 7 and 30 day active users. As expected, the &lt;a href=&quot;https://app.wagonhq.com/result/4ooragqzn3fv3xex&quot;&gt;wider your definition of active user&lt;/a&gt;, the more you’ll have. Use these new powers carefully!&lt;/p&gt;
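
&lt;p&gt;A minimal sketch of what those additional metrics might look like, assuming the table has exactly one row per user per day. The window names &lt;code&gt;seven_days&lt;/code&gt; and &lt;code&gt;thirty_days&lt;/code&gt; and the output column names are illustrative. Here 6 preceding rows plus the current row cover seven days, and the named &lt;code&gt;window&lt;/code&gt; clause is Postgres syntax that other databases may not support.&lt;/p&gt;

&lt;pre&gt;select
    user_id,
    date,
    -- coalesce turns an all-null window (no logins at all) into 0
    coalesce(sum(num_logins) over seven_days, 0) &amp;gt;= 1 as active_7d,
    coalesce(sum(num_logins) over thirty_days, 0) &amp;gt;= 1 as active_30d
from userActivityTable
window
    seven_days as (partition by user_id order by date
                   rows between 6 preceding and current row),
    thirty_days as (partition by user_id order by date
                    rows between 29 preceding and current row);&lt;/pre&gt;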

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;Want to learn more SQL? Join us on Monday, November 16 at the Wagon office in San Francisco for a free SQL workshop.  Please &lt;a href=&quot;https://www.eventbrite.com/e/an-evening-of-sql-and-cheese-tickets-19230173968&quot;&gt;RSVP&lt;/a&gt;!&lt;/em&gt;&lt;/p&gt;
</description>
          
          <pubDate>Fri, 13 Nov 2015 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/active-users-in-sql</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/active-users-in-sql</guid>
        </item>
      
    
      
        <item>
          <title>Electron meetup at Microsoft</title>
          
            <dc:creator>Jeff Weinstein</dc:creator>
          
          
            <description>&lt;p&gt;&lt;a href=&quot;http://electron.atom.io&quot;&gt;Electron&lt;/a&gt; has momentum. The open source project for building cross platform desktop apps with web technologies now has 6000+ commits from 200+ contributors, a 3500+ person Slack room, and now it’s 4th meetup with 90+ RSVPs (the 1st was a few of us at a bar).&lt;/p&gt;

&lt;p&gt;On Monday, the &lt;a href=&quot;http://www.meetup.com/Bay-Area-Electron-User-Group&quot;&gt;Bay Area Electron group&lt;/a&gt; met at Microsoft &lt;a href=&quot;https://twitter.com/MSFTReactor&quot;&gt;Reactor&lt;/a&gt;, an event space in San Francisco. A few members of Microsoft’s open source team were in town to hear how people are using Electron to build Windows apps. The team is dedicated to helping projects run well with Microsoft platforms. It’s also exciting to see that &lt;a href=&quot;https://code.visualstudio.com/&quot;&gt;Visual Studio Code&lt;/a&gt; is using Electron. Two of the five talks were Windows related:&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;a href=&quot;http://twitter.com/kevinsawicki&quot;&gt;Kevin Sawicki&lt;/a&gt; from GitHub spoke about how to test Electron apps using ChromeDriver and his project &lt;a href=&quot;https://github.com/kevinsawicki/spectron&quot;&gt;Spectron&lt;/a&gt;. Try this in your next project.&lt;/p&gt;

&lt;script async=&quot;&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;ecbcfd9419c845faa0e791d87ba1ce97&quot; data-ratio=&quot;1.77777777777778&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/felixrieseberg&quot;&gt;Felix Rieseberg&lt;/a&gt; from Microsoft’s open source team gave us a walkthrough of &lt;a href=&quot;http://try.buildwinjs.com/&quot;&gt;WinJS&lt;/a&gt;. It seems like a great way to make Electron apps look native on Windows 10. Here are the &lt;a href=&quot;https://onedrive.live.com/view.aspx?resid=4EA869C40F03DA47!238769&amp;amp;ithint=file%2cpptx&amp;amp;app=PowerPoint&amp;amp;authkey=!ANhVu0KbHyOHXOQ&quot;&gt;slides&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/johnhaley81&quot;&gt;John Haley&lt;/a&gt; from Axosoft’s GitKraken team talked about how they handle task scheduling in their Electron apps. This strategy seems like a new standard way to handle both UI and processing intensive applications.&lt;/p&gt;

&lt;iframe src=&quot;https://drive.google.com/file/d/0B8f-2LpBuV0pMUU1bC0wTFJZMjg/preview&quot; width=&quot;640&quot; height=&quot;480&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Wagon’s &lt;a href=&quot;http://twitter.com/deland&quot;&gt;Matt DeLand&lt;/a&gt; presented a few tips and tricks for developing, building, and deploying on Windows along with a demo of our latest version. &lt;a href=&quot;https://www.wagonhq.com/&quot;&gt;Sign up for early access to try Wagon on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;script async=&quot;&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;28549e1d1599428cb575dac98d94d766&quot; data-ratio=&quot;1.29456384323641&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/edgararout&quot;&gt;Edgar Aroutiounian&lt;/a&gt; gave a quick demo of how he used OCaml to build an Electron app. Check out the example project &lt;a href=&quot;https://github.com/fxfactorial/ocaml-electron&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Special thanks to Microsoft for hosting. We’re excited the Electron community is growing and that large companies are adopting and supporting this platform. See you at the next event!&lt;/p&gt;
</description>
          
          <pubDate>Fri, 06 Nov 2015 00:00:00 -0800</pubDate>
          <link>http://www.wagonhq.com/blog/electron-microsoft-meetup</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/electron-microsoft-meetup</guid>
        </item>
      
    
      
        <item>
          <title>Haskell in Production - Square Tech Talk</title>
          
            <dc:creator>Mike Craig</dc:creator>
          
          
            <description>&lt;p&gt;The &lt;a href=&quot;https://corner.squareup.com/&quot;&gt;Square engineering team&lt;/a&gt; invited Wagon to give a tech talk on how we use Haskell in production.  Their teams are interested in functional programming and we were honored to walkthrough our experience building a modern analytics tool using &lt;a href=&quot;/blog/engineering-at-wagon&quot;&gt;Haskell, React, and Electron&lt;/a&gt;. Thanks Square for welcoming us!&lt;/p&gt;

&lt;div class=&quot;responsive-container&quot;&gt;
  &lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/gX3rJkOjcz4&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;If you’re using Haskell at your company, &lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#104;&amp;#101;&amp;#108;&amp;#108;&amp;#111;&amp;#064;&amp;#119;&amp;#097;&amp;#103;&amp;#111;&amp;#110;&amp;#104;&amp;#113;&amp;#046;&amp;#099;&amp;#111;&amp;#109;&quot;&gt;let us know&lt;/a&gt;. We’d love to trade notes.&lt;/p&gt;
</description>
          
          <pubDate>Wed, 28 Oct 2015 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/square-tech-talk</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/square-tech-talk</guid>
        </item>
      
    
      
        <item>
          <title>Calculating Running Totals using SQL</title>
          
            <dc:creator>Andy Granowitz</dc:creator>
          
          
            <description>&lt;p&gt;How many users joined in the last 5 months? What were total sales in Q2? How much revenue came from the March sign up cohort?&lt;/p&gt;

&lt;p&gt;Although these questions can be answered with a single number, it can be useful to see a &lt;em&gt;running total&lt;/em&gt; over time: how many unique users joined, or how much cumulative revenue was received by day over some period.&lt;/p&gt;

&lt;p&gt;Usually, data is stored incrementally. For example, here’s a table of sales per day:&lt;/p&gt;

&lt;table class=&quot;table&quot;&gt;
  &lt;tr&gt;
    &lt;th&gt;Date&lt;/th&gt;
    &lt;th&gt;Sales&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td style=&quot;width: 216px;&quot;&gt;10/1/2015&lt;/td&gt;
    &lt;td&gt;5&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/2/2015&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/3/2015&lt;/td&gt;
    &lt;td&gt;7&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/4/2015&lt;/td&gt;
    &lt;td&gt;8&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/5/2015&lt;/td&gt;
    &lt;td&gt;2&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/6/2015&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/7/2015&lt;/td&gt;
    &lt;td&gt;6&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;How do we generate the following table of cumulative sales over time? In SQL, there are two typical approaches: a self join or a window function.&lt;/p&gt;

&lt;table class=&quot;table&quot;&gt;
  &lt;tr&gt;
    &lt;th&gt;Date&lt;/th&gt;
    &lt;th&gt;Running Total of Sales&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td style=&quot;width: 216px;&quot;&gt;10/1/2015&lt;/td&gt;
    &lt;td&gt;5&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/2/2015&lt;/td&gt;
    &lt;td&gt;8&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/3/2015&lt;/td&gt;
    &lt;td&gt;15&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/4/2015&lt;/td&gt;
    &lt;td&gt;23&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/5/2015&lt;/td&gt;
    &lt;td&gt;25&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/6/2015&lt;/td&gt;
    &lt;td&gt;28&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;10/7/2015&lt;/td&gt;
    &lt;td&gt;34&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;A self join is a query that compares a table to itself. In this case, we’re comparing each date to any date less than or equal to it in order to calculate the running total. Concretely, we take the sum of &lt;code&gt;sales&lt;/code&gt; in the second table over every row that has a date less than or equal to the date coming from the first table. This is Postgres/Redshift syntax, but other SQL dialects are very similar.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;select 
    a.date,
    sum(b.sales) as cumulative_sales
from sales_table a 
join sales_table b on a.date &amp;gt;= b.date
group by a.date
order by a.date;&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/grano/b705f532374c0ec02f03.js?file=sales_running_total.sql&quot;&gt; &lt;/script&gt;

&lt;p&gt;This is not a bad approach; it is a nice showcase of how extensible SQL can be using only &lt;code&gt;select&lt;/code&gt;, &lt;code&gt;from&lt;/code&gt;, &lt;code&gt;join&lt;/code&gt;, and &lt;code&gt;group by&lt;/code&gt; statements.&lt;/p&gt;

&lt;p&gt;But it is a lot of code for a simple task. Let’s try a window function. They are designed to calculate a metric over a set of rows. In our case, we want to sum every row where the date is less than or equal to the date in the current row.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;select
    date,
    sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table;&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/grano/88fcf67e5ff14ae9e1c2.js?file=cumulative_sales_window_function.sql&quot;&gt; &lt;/script&gt;

&lt;p&gt;The window function can filter and arrange the set of rows to run the function over. Here the &lt;code&gt;order by date rows unbounded preceding&lt;/code&gt; limits the sum function to only &lt;code&gt;sales&lt;/code&gt; on or before the date of the current row. Window functions are incredibly useful for time-based analytical queries; to learn more, the &lt;a href=&quot;http://www.postgresql.org/docs/9.4/static/tutorial-window.html&quot;&gt;Postgres docs&lt;/a&gt; are a great place to start.&lt;/p&gt;
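
&lt;p&gt;To illustrate the filtering and arranging, here is a sketch that writes the frame out in full and adds a &lt;code&gt;partition by&lt;/code&gt; so the total restarts each month. It assumes &lt;code&gt;date&lt;/code&gt; is a date or timestamp column; the column name &lt;code&gt;month_to_date_sales&lt;/code&gt; is invented for this example (Postgres syntax).&lt;/p&gt;

&lt;pre&gt;select
    date,
    sales,
    -- same running total as above, with the frame spelled out
    sum(sales) over (order by date
                     rows between unbounded preceding and current row) as cumulative_sales,
    -- partitioning by month restarts the running total on the 1st
    sum(sales) over (partition by date_trunc(&amp;#39;month&amp;#39;, date)
                     order by date
                     rows unbounded preceding) as month_to_date_sales
from sales_table
order by date;&lt;/pre&gt;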

&lt;p&gt;The final step of creating a chart and sharing it triumphantly with your teammates is easily accomplished using &lt;a href=&quot;https://app.wagonhq.com/result/dcyjd5ha7eiptaoa&quot;&gt;Wagon&lt;/a&gt;. Window functions for the win!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Wagon is a modern SQL editor for analysts and engineers: write queries, visualize data, and share charts with your team. Sign up for free:&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;form id=&quot;waitlist-form&quot;&gt;
  &lt;div id=&quot;signup-name&quot; class=&quot;form-group&quot;&gt;
    &lt;input class=&quot;form-control input-lg&quot; type=&quot;text&quot; name=&quot;name&quot; placeholder=&quot;Your name&quot; /&gt;
  &lt;/div&gt;
  &lt;div id=&quot;signup-email&quot; class=&quot;form-group&quot;&gt;
    &lt;input class=&quot;form-control input-lg&quot; type=&quot;text&quot; name=&quot;email&quot; placeholder=&quot;name@company.com&quot; /&gt;
  &lt;/div&gt;
  &lt;p id=&quot;signup-error-message&quot; class=&quot;form-group&quot; style=&quot;color: red;&quot;&gt;&lt;/p&gt;
  &lt;p id=&quot;signup-success-message&quot; class=&quot;form-group text-center&quot;&gt;&lt;/p&gt;
  &lt;div id=&quot;signup-submit-button&quot; class=&quot;form-group&quot;&gt;
    &lt;button class=&quot;btn btn-success btn-lg waitlist-submit-button&quot; type=&quot;submit&quot; name=&quot;submit&quot; id=&quot;signup-submit&quot;&gt;Sign up and download&lt;/button&gt;
  &lt;/div&gt;
&lt;/form&gt;
&lt;script src=&quot;/js/signup.js&quot;&gt;&lt;/script&gt;

</description>
          
          <pubDate>Tue, 20 Oct 2015 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/running-totals-sql</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/running-totals-sql</guid>
        </item>
      
    
      
        <item>
          <title>React at Wagon</title>
          
            <dc:creator>Mike Craig</dc:creator>
          
          
            <description>&lt;p&gt;We’re building a hybrid web/native application that runs both in the browser and as a downloadable desktop app. Analysts use Wagon to query, analyze, visualize, and share data: the app is highly interactive and data-heavy. It has to be fast, furious, and stable even when used for hours.&lt;/p&gt;

&lt;p&gt;It ain’t all gravy: it’s difficult to maintain a UI with asynchronous updates, large scale data manipulation, and many other cross-cutting concerns. How can we build a sane frontend codebase, without losing our ability to iterate and ship quickly? The answer is to separate concerns. We break our UI into small self-contained components, and we isolate state and manage it separately from the UI. Facebook’s React and Flux libraries make this practical.&lt;/p&gt;

&lt;p style=&quot;max-width: 330px; margin: auto;&quot;&gt;
  &lt;img src=&quot;/images/posts/react.png&quot; alt=&quot;Wagon loves React&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;The big idea behind React is this: a UI component is just a function from its inputs to its content. All a component needs is a &lt;code&gt;render()&lt;/code&gt; method that returns the elements we want the user to see.  As an example, here’s a component that takes a size and color and renders a div displaying a filled-in square. Notice that users of this component don’t need to know about how it is implemented.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;var ColoredSquare = React.createClass({
  render: function() {
    // the component&amp;#39;s inputs are available via this.props
    var squareStyle = {
      display: &amp;quot;inline-block&amp;quot;,
      width: this.props.width,
      height: this.props.width,
      backgroundColor: this.props.color,
      borderRadius: this.props.radius
    };

    // bind so the action runs on click, not on every render
    return &amp;lt;div style={squareStyle} onClick={saveUserFavColor.bind(null, this.props.color)} /&amp;gt;;
  }
});&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/jweinstein/e6cc2a29621a4508f9e4.js?file=ColoredSquare.jsx&quot;&gt; &lt;/script&gt;

&lt;p&gt;React components are simple to reuse because they nest like HTML elements. It’s easy to wrap an existing component to add additional styles or behavior—React favors composition over inheritance. We can forget about carefully maintaining the DOM to avoid excessive redraws and flicker: we declare what our components should look like, and React makes it so.&lt;/p&gt;

&lt;p&gt;React is great for organizing view elements, but an application is more than static UI. Users generate events and we need to capture them, update state, and direct how the app should respond. Flux manages this flow by clearly separating user action events from application responses.&lt;/p&gt;

&lt;p&gt;Actions encapsulate events. They’re the application logic that runs in response to users doing stuff. In our example, when a user clicks a colored square, we update the server and dispatch to let the rest of the app know what happened:&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;function saveUserFavColor(color) {
  var id = getCurrentUserId();  // move along, this is just a demo
  
  ajax.user.saveInfo(id, color)  // make an AJAX call,
    // then tell the rest of the application what happened
    .then(function() {
      AppDispatcher.handleAction(USER_INFO_SAVED, {id: id, favoriteColor: color});
    })
    .catch(function() {
      AppDispatcher.handleAction(USER_INFO_SAVE_ERROR, {id: id});
    });
}&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/jweinstein/e6cc2a29621a4508f9e4.js?file=UserInfoAction.js&quot;&gt; &lt;/script&gt;

&lt;p&gt;Stores encapsulate state. They listen to state changes dispatched from actions, and they update themselves to record the changes.&lt;/p&gt;

&lt;noscript&gt;&lt;pre&gt;class UserStore extends EventEmitter {
  // administrivia omitted!

  onDispatch(action) {
    switch (action.type) {
      case USER_INFO_SAVED:
        // save the updated state in an instance property
        this._users[action.payload.id].favoriteColor = action.payload.favoriteColor;
        // tell the rest of the application to come and get it
        this.emitChange();
        break;

      // etc
    }
  }
}&lt;/pre&gt;&lt;/noscript&gt;
&lt;script src=&quot;https://gist.github.com/jweinstein/e6cc2a29621a4508f9e4.js?file=UserStore.js&quot;&gt; &lt;/script&gt;

&lt;p&gt;UI components listen to stores and re-render when relevant state changes.&lt;/p&gt;

&lt;p&gt;Building a solid, maintainable frontend is still difficult despite these great libraries. Here are a few other strategies we’re using:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Build pure React components whenever possible. A pure component doesn’t make any external calls from &lt;code&gt;render()&lt;/code&gt;—it’s a pure function of the component’s properties and state. Components like this are much easier to test, debug, and reuse. It’s such a good idea it’s &lt;a href=&quot;https://facebook.github.io/react/docs/pure-render-mixin.html&quot;&gt;included in React itself&lt;/a&gt;!&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Separate React components that listen to Flux stores from those that render the UI. Wrapping UI components in container components is another win for reuse. Jason Bonta mentioned this in &lt;a href=&quot;https://www.youtube.com/watch?v=KYzlpRvWZ6c&quot;&gt;his great talk at React.js Conf 2015&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Take care when integrating non-React UI components. Mixing React’s declarative API with another library’s imperative API can be painful. Build wrapper components around external libraries, and use &lt;a href=&quot;https://facebook.github.io/react/docs/component-specs.html#lifecycle-methods&quot;&gt;React’s lifecycle methods&lt;/a&gt; to handle setup and teardown. When possible, avoid exposing direct-update methods like &lt;code&gt;drawChart()&lt;/code&gt; or &lt;code&gt;setCursorPosition()&lt;/code&gt;—manage state through component properties or Flux stores.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Split Flux actions into modules by UX concern. We separate navigation and authentication from running queries and making charts. Carve out submodules for cross-cutting concerns, like AJAX requests or logging.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Split Flux stores by domain. It’s helpful to separate persisted server-side state from ephemeral page state, for example. We hide the state of the URL bar behind a store, too!&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re tackling fun engineering challenges at Wagon.  If you want to learn more or work every day on these technologies, &lt;a href=&quot;/jobs&quot;&gt;check out our jobs page&lt;/a&gt; and get in touch!&lt;/p&gt;
</description>
          
          <pubDate>Thu, 01 Oct 2015 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/react-at-wagon</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/react-at-wagon</guid>
        </item>
      
    
      
        <item>
          <title>Migrating from SQL Workbench/J to Wagon</title>
          
            <dc:creator>Andy Granowitz</dc:creator>
          
          
            <description>&lt;p&gt;Many Wagon users previously used SQL Workbench/J to query Amazon Redshift. Older SQL tools are focused on DBA tasks like managing tables, updating schemas, and provisioning users. Analysts just want a simple way to query data, analyze it, visualize it, and collaborate with others. It’s no surprise that we’re frequently asked how to move from legacy tools like SQL Workbench/J to Wagon. It’s super easy.&lt;/p&gt;

&lt;p&gt;If you are currently using SQL Workbench/J and want to &lt;a href=&quot;/&quot;&gt;try Wagon&lt;/a&gt;, here are the quick steps to connect to Redshift in Wagon:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;In SQL Workbench/J, open the connection window&lt;/li&gt;
  &lt;li&gt;Grab the hostname, port, and database from the URL, along with the username and password (in the Redshift interface, the URL is called the JDBC URL)&lt;/li&gt;
  &lt;li&gt;Paste into Wagon (no need to install any drivers!)&lt;/li&gt;
&lt;/ol&gt;

&lt;p style=&quot;max-width: 600px; margin: auto;&quot;&gt;
	&lt;img src=&quot;/images/posts/workbench-config.png&quot; alt=&quot;SQL Workbench/J connect window&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Happy querying!&lt;/p&gt;
</description>
          
          <pubDate>Wed, 30 Sep 2015 00:00:00 -0700</pubDate>
          <link>http://www.wagonhq.com/blog/sql-workbench-to-wagon</link>
          <guid isPermaLink="true">http://www.wagonhq.com/blog/sql-workbench-to-wagon</guid>
        </item>
      
    
  </channel>
</rss>
