<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>steel</title>
  <id>urn:planet:steel</id>
  <updated>2026-05-06T15:14:00Z</updated>
  <generator>Pluto 1.6.3 on Ruby 3.3.11 (2026-03-26) [x86_64-linux]</generator>

  <entry>
    <title>💥 “They Would Never Use the Death Star on Us”: Alderaan Residents Reflect on Their Support for the Empire as a Large Imperial Installation Enters the System ↗</title>
    <link href="https://www.mcsweeneys.net/articles/they-would-never-use-the-death-star-on-us-alderaan-residents-reflect-on-their-support-for-the-empire-as-a-large-imperial-installation-enters-the-system"/>
    <id>tag:violetpixel.com,2026-05-05:they-would-never-use-the-death-star-on-us</id>
    <updated>2026-05-05T20:11:00Z</updated>


    <content type="html">&lt;p&gt;Jack Loftus over at McSweeney&#39;s:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Lesser of two evils. The Senate was ineffective, and the liberal Jedi were out of touch. The Emperor said he’d cut through all that. And he did—sometimes literally. You have to give him that. Things moved. Maybe a little too much moving right now, with the Death Star repositioning every few minutes to maintain a firing solution on our planet, but still.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Yeah.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/fun/&quot;&gt;Fun&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/politics/&quot;&gt;Politics&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-05-05-they-would-never-use-the-death-star-on-us/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>I Built My Own Hair Electrolysis Machine</title>
    <link href="https://www.scd31.com/posts/diy-hair-electrolysis-machine"/>
    <id>https://www.scd31.com/posts/diy-hair-electrolysis-machine</id>
    <updated>2026-04-27T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Analogical reasoning</title>
    <link href="https://blog.cjquines.com/post/analogical-reasoning/"/>
    <id>https://blog.cjquines.com/post/analogical-reasoning/</id>
    <updated>2026-04-26T04:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>Turns out, competition works</title>
    <link href="https://brandur.org/fragments/competition"/>
    <id>tag:brandur.org,2026-04-23:fragments/competition</id>
    <updated>2026-04-23T07:33:20Z</updated>


    <content type="html">&lt;p&gt;I&amp;rsquo;m settling into Austin here and preparing a move into my new apartment. As I was working through internet setup, I was impressed to find an all-you-can-eat menu of four easy fiber options for internet, with symmetrical gigabit starting at $60 and no data caps.&lt;/p&gt;

&lt;p&gt;Last year when I left San Francisco, I was paying over $100/month for a 100 Mbps / 20 Mbps cable connection, no better than what I&amp;rsquo;d first gotten in the late 90s, and with a strict 1.2 TB cap (overage cost: $10 every 50 GB).&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;d tried to find another ISP, &lt;em&gt;any&lt;/em&gt; other ISP, but there were none. Previously I&amp;rsquo;d had &lt;a href=&quot;/fragments/monkeybrains&quot;&gt;MonkeyBrains&lt;/a&gt;, but this landlord wouldn&amp;rsquo;t allow the invasive rooftop antenna installation, and the unit didn&amp;rsquo;t have good line of sight anyway. Not even the other dinosaurs like AT&amp;amp;T provided service.&lt;/p&gt;

&lt;p&gt;Why were there no alternatives? Because &amp;ldquo;progressive&amp;rdquo; San Francisco &lt;a href=&quot;https://forums.sonic.net/viewtopic.php?t=15499&quot;&gt;banned them&lt;/a&gt;, handing a de facto monopoly on a silver platter to Comcast, which was more than willing to take advantage of that by offering 1990s speeds, punitive data caps, and sky high pricing.&lt;/p&gt;

&lt;p&gt;In Austin, competition is legal. Comcast (a garbage company full of garbage people) would &lt;em&gt;love&lt;/em&gt; to charge $100 for a 100 Mbps connection with no upload, but they offer $70 fiber instead. Why? Because every resident in the building has 3+ other options to choose from. They could try to pull the same racket here, and wouldn&amp;rsquo;t get a single taker. Turns out, competition works. Who could possibly have known.&lt;/p&gt;

&lt;p&gt;On a related note, the apartment I&amp;rsquo;m renting is $200 cheaper than the San Francisco rent I paid 15 years ago, back in 2011. The unit&amp;rsquo;s in a better neighborhood (right on the park), is larger, is considerably better built (concrete between units so you aren&amp;rsquo;t sharing TV programming with your neighbors), and has full amenities. How? Because again, unlike &amp;ldquo;progressive&amp;rdquo; SF, &lt;a href=&quot;https://www.pew.org/en/research-and-analysis/articles/2026/03/18/austins-surge-of-new-housing-construction-drove-down-rents&quot;&gt;building is legal&lt;/a&gt;. It&amp;rsquo;s &lt;em&gt;so&lt;/em&gt; legal that despite booming population growth, rents from 2021 to 2025 are down 4%, or a whopping &lt;strong&gt;19%&lt;/strong&gt; inflation-adjusted.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>My Nix Config Is Intimate</title>
    <link href="https://www.scd31.com/posts/nix-files-are-intimate"/>
    <id>https://www.scd31.com/posts/nix-files-are-intimate</id>
    <updated>2026-04-22T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Caveman</title>
    <link href="https://brandur.org/fragments/caveman"/>
    <id>tag:brandur.org,2026-04-12:fragments/caveman</id>
    <updated>2026-04-12T16:41:03Z</updated>


    <content type="html">&lt;p&gt;An excerpt from Michael Crichton&amp;rsquo;s &lt;a href=&quot;https://en.wikipedia.org/wiki/Congo_(novel)&quot;&gt;Congo (1980)&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I don&amp;rsquo;t understand,&amp;rdquo; Elliot said. Ross explained that the &amp;ldquo;M&amp;rdquo; meant that there was more message, and he had to press the transmit button again. He pushed the button several times before he got the message, which in its entirety read:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;REVUWD ORGNL TAPE HUSTN NU FINDNG RE AURL SIGNL INFO-COMPUTR ANLYSS COMPLTE THNK ITS LNGWGE.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Elliot found he could read the compressed shortline language by speaking it aloud: &amp;ldquo;Reviewed original tape Houston, new finding regarding aural signal information, computer analysis complete think it&amp;rsquo;s language.&amp;rdquo; He frowned. &amp;ldquo;Language?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Crichton was a gear guy. The story&amp;rsquo;s protagonists took high-tech satellite uplinks into the field, allowing transmission back to HQ, but due to the extreme expense of satellite bandwidth, they had to read messages in shorthand like &amp;ldquo;REVUWD ORGNL TAPE HUSTN NU FINDNG&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;I always found it ridiculous. Although these words have had their vowels removed, they&amp;rsquo;re still uniquely intelligible in the English language. It&amp;rsquo;d be trivial to write a short algorithm that&amp;rsquo;d use a dictionary to expand the message back to uncompressed English on the receiving end. Or better yet, stop with the vowel thing and use a standard compression algorithm &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. You&amp;rsquo;d get better results.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Yesterday, I came across &lt;a href=&quot;https://github.com/JuliusBrussee/caveman&quot;&gt;Caveman&lt;/a&gt;. Its job is to save tokens in Claude by having the LLM speak like a caveman, removing filler words and other niceties that make up a more fluently legible human language.&lt;/p&gt;

&lt;p&gt;Before:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Sure! I&amp;rsquo;d be happy to help you with that. The issue you&amp;rsquo;re experiencing is most likely caused by your authentication middleware not properly validating the token expiry. Let me take a look and suggest a fix.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Bug in auth middleware. Token expiry check use &lt;code&gt;&amp;lt;&lt;/code&gt; not &lt;code&gt;&amp;lt;=&lt;/code&gt;. Fix:&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Crichton would&amp;rsquo;ve loved it. 45 years later we&amp;rsquo;ve come full circle: we&amp;rsquo;re back to speaking like cavemen again, and this time as an at-least-somewhat legitimate technical workaround. I don&amp;rsquo;t know what I thought I knew anymore.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>🔊 Hostile Volume ↗</title>
    <link href="https://hostilevolume.com/"/>
    <id>tag:violetpixel.com,2026-04-09:hostile-volume</id>
    <updated>2026-04-10T04:11:00Z</updated>


    <content type="html">&lt;p&gt;Just set the volume to 25%.  Just fold in the cheese.  UX is my passion.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/ux/&quot;&gt;UX&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/fun/&quot;&gt;Fun&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-09-hostile-volume/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>🏎️ International Chess Federation Adds Race Car Piece ↗</title>
    <link href="https://theonion.com/international-chess-federation-adds-race-car-piece/"/>
    <id>tag:violetpixel.com,2026-04-09:international-chess-federation-adds-race-car-piece</id>
    <updated>2026-04-10T03:46:00Z</updated>


    <content type="html">&lt;p&gt;The Onion:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In all officially sanctioned matches played from today forward, the pawn immediately in front of a player’s king will be replaced with a sick little hot rod that can move any number of squares horizontally, vertically, or in a circle like it’s doing donuts[.]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Delightful piece.  Do pay them a visit to learn all the new rules.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/chess/&quot;&gt;Chess&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/fun/&quot;&gt;Fun&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-09-international-chess-federation-adds-race-car-piece/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>🎲 Claude mixes up who said what, and that&#39;s not OK ↗</title>
    <link href="https://dwyer.co.za/static/claude-mixes-up-who-said-what-and-thats-not-ok.html"/>
    <id>tag:violetpixel.com,2026-04-09:claude-mixes-up-who-said-what-and-that-s-not-ok</id>
    <updated>2026-04-10T03:34:00Z</updated>


    <content type="html">&lt;p&gt;Gareth Dwyer, in a post about LLMs not keeping track of who said what:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Yes, of course AI has risks and can behave unpredictably, but after using it for months you get a ‘feel’ for what kind of mistakes it makes, when to watch it more closely, when to give it more permissions or a longer leash.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sure, you can get a “feel” for what kind of mistakes LLMs make, &lt;em&gt;but that feeling means absolutely nothing and cannot be trusted&lt;/em&gt;.  Randomness is literally built in.  &lt;a href=&quot;https://www.ibm.com/think/topics/llm-temperature&quot;&gt;The temperature setting of an LLM&lt;/a&gt; determines how random the token selection process will be, and every token selection is a fork in the road for the rest of the output.&lt;/p&gt;
&lt;p&gt;This kind of widespread misunderstanding about how LLMs work terrifies me.  To be clear, I’m not picking on Gareth specifically—this is just a good example of the misplaced trust in LLMs I see running rampant these days.  These systems are intentionally designed to be nondeterministic, with randomness as a key component of how they work.  Thinking you can get a “feel” for when you can trust them is like getting a “feel” for how the dice are rolling or a “feel” for how the cards are being shuffled.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/generative_ai/&quot;&gt;Generative AI&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/slop_generators/&quot;&gt;Slop Generators&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-09-claude-mixes-up-who-said-what-and-that-s-not-ok/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>🔴 FBI Extracts Suspect’s Deleted Signal Messages Saved in iPhone Notification Database ↗</title>
    <link href="https://www.404media.co/fbi-extracts-suspects-deleted-signal-messages-saved-in-iphone-notification-database-2/"/>
    <id>tag:violetpixel.com,2026-04-09:fbi-extracts-suspect-s-deleted-signal-messages-saved-in-iphone-notification-database</id>
    <updated>2026-04-10T03:00:00Z</updated>


    <content type="html">&lt;p&gt;Joseph Cox at 404 Media:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The FBI was able to forensically extract copies of incoming Signal messages from a defendant’s iPhone, even after the app was deleted, because copies of the content were saved in the device’s push notification database, multiple people present for FBI testimony in a recent trial told 404 Media.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;https://9to5mac.com/2026/04/09/fbi-used-iphone-notification-data-to-retrieve-deleted-signal-messages/&quot;&gt;9to5Mac has some additional details&lt;/a&gt; if you hit 404 Media’s paywall.&lt;/p&gt;
&lt;p&gt;There’s a setting in Signal to prevent actual message content from showing up in notifications, but this person didn’t have it enabled.&lt;/p&gt;
&lt;p&gt;Stay safe out there, kids.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/apple/&quot;&gt;Apple&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/iphone/&quot;&gt;iPhone&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/ios/&quot;&gt;iOS&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/security/&quot;&gt;Security&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-09-fbi-extracts-suspect-s-deleted-signal-messages-saved-in-iphone-notification-database/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>🎮 Porting Mac OS X to the Nintendo Wii ↗</title>
    <link href="https://bryankeller.github.io/2026/04/08/porting-mac-os-x-nintendo-wii.html"/>
    <id>tag:violetpixel.com,2026-04-08:porting-mac-os-x-to-the-nintendo-wii</id>
    <updated>2026-04-09T01:33:00Z</updated>


    <content type="html">&lt;p&gt;Bryan Keller got the first version of Mac OS X (10.0, Cheetah) running on a Nintendo Wii:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Before figuring out how to tackle this project, I needed to know whether it would even be possible. According to a 2021 &lt;a href=&quot;https://www.reddit.com/r/wii/comments/mm8i8w/is_it_possible_to_run_old_versions_of_os_x_on_the/gts4glp/&quot;&gt;Reddit comment&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There is a zero percent chance of this ever happening.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Feeling encouraged, I started with the basics […]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Hell yeah.  That’s a great attitude to have.  And it paid off!&lt;/p&gt;
&lt;p&gt;Something about seeing Mac OS X running on a Wii seems correct to me in a weird way.  Whenever I saw Mac OS X running on other things, like a hackintosh, it always felt a little off, but the Wii and Mac OS X seem like kindred spirits.&lt;/p&gt;
&lt;p&gt;If you want to learn a lot about how old versions of Mac OS X or the Wii work, read Bryan’s entire post.  If you want to skip all the technical details, I understand, but the last paragraph is a good reminder for everyone:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the end, I learned (and accomplished) far more than I ever expected - and perhaps more importantly, I was reminded that the projects that seem just out of reach are exactly the ones worth pursuing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/apple/&quot;&gt;Apple&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/mac_os_x/&quot;&gt;Mac OS X&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/nintendo/&quot;&gt;Nintendo&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/wii/&quot;&gt;Wii&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-08-porting-mac-os-x-to-the-nintendo-wii/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>💻 This Is Not The Computer For You ↗</title>
    <link href="https://samhenri.gold/blog/20260312-this-is-not-the-computer-for-you/"/>
    <id>tag:violetpixel.com,2026-04-07:this-is-not-the-computer-for-you</id>
    <updated>2026-04-08T02:17:00Z</updated>


    <content type="html">&lt;p&gt;Sam Henri Gold, in a wonderful review of the MacBook Neo:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Nobody starts in the right place. You don’t begin with the correct tool and work sensibly within its constraints until you organically graduate to a more capable one. That is not how obsession works. Obsession works by taking whatever is available and pressing on it until it either breaks or reveals something. The machine’s limits become a map of the territory. You learn what computing actually costs by paying too much of it on hardware that can barely afford it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;When &lt;a href=&quot;https://wowpedia.fandom.com/wiki/World_of_Warcraft_Standard_Edition&quot;&gt;the first version of World of Warcraft&lt;/a&gt; was released in 2004, the PC I had at the time didn&#39;t meet &lt;a href=&quot;https://wowpedia.fandom.com/wiki/System_requirements#World_of_Warcraft&quot;&gt;the minimum system requirements&lt;/a&gt;. Not by a long shot. My CPU was too slow, I didn&#39;t have enough RAM, and my graphics card was a generation or two out of date.&lt;/p&gt;
&lt;p&gt;I bought World of Warcraft and forced it to run anyway.&lt;/p&gt;
&lt;p&gt;Even with the resolution cranked way down, the frame rate was abysmal. Visual glitches were abundant. But it ran. &lt;em&gt;I was in Azeroth.&lt;/em&gt; It was blurry, fuzzy, and glitchy, but I was doing quests, finding loot, and chatting with other players.&lt;/p&gt;
&lt;p&gt;It wasn&#39;t good enough. Every update or change—even tiny ones—tended to break everything and require hours of adjustments, experimentation, and tweaks to get things working again. I installed so many different drivers. I made so many registry and INI file tweaks. So, so many. Even when the game was technically working, I had to avoid certain areas because they were too visually demanding or too crowded and would cause a crash.&lt;/p&gt;
&lt;p&gt;It was, in many ways, terrible.  But terrible got me in the door. Terrible taught me a lot about how games worked, about how graphics drivers worked, about how Windows worked, and about how computers in general worked. Terrible forced me to figure out how to wring every last bit of performance from outdated hardware that had every right to flip me the bird as it committed thermal suicide.&lt;/p&gt;
&lt;p&gt;The MacBook Neo is far from terrible.  If anything, it’s far more capable than most people think when they hear things like “iPhone chip” and “8 GB of RAM.”  It does have more limits than any other current-gen Mac, but it’s also scrappy, fun, extremely well-designed, and makes the Mac accessible to a much larger audience.&lt;/p&gt;
&lt;p&gt;When it comes to performance and specs, there’s no way anyone can argue it’s the best Mac.  When it comes to people and practicality, it’s the best Mac Apple has made in years.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/apple/&quot;&gt;Apple&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/mac/&quot;&gt;Mac&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/craft/&quot;&gt;Craft&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/constraints/&quot;&gt;Constraints&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-04-07-this-is-not-the-computer-for-you/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>&quot;Somewhere&quot; (2010) review</title>
    <link href="https://brandur.org/fragments/somewhere"/>
    <id>tag:brandur.org,2026-04-06:fragments/somewhere</id>
    <updated>2026-04-07T00:09:44Z</updated>


    <content type="html">&lt;p&gt;I&amp;rsquo;ve often cited Sofia Coppola&amp;rsquo;s &lt;em&gt;Lost in Translation&lt;/em&gt; (2003) as one of my favorite movies. I&amp;rsquo;d never dug much into Coppola&amp;rsquo;s other work, so imagine my delight to discover that she&amp;rsquo;s made another movie, &lt;em&gt;Somewhere&lt;/em&gt; (2010), with a similar premise.&lt;/p&gt;

&lt;p&gt;I excitedly got to watching it, but was ultimately disappointed. There&amp;rsquo;s room for two movies to have similar premises, but &lt;em&gt;Somewhere&lt;/em&gt; takes that to another level. It&amp;rsquo;s functionally the same film.&lt;/p&gt;

&lt;p&gt;The macro/themes are the same &amp;ndash; disengaged, burned-out actor stays long-term at a hotel. A young woman comes into his life with whom he feels a genuine human connection. She helps break his sad routine and rediscover joy. One is in Tokyo, one is in LA. In one the woman is a much younger stranger, in the other his daughter.&lt;/p&gt;

&lt;p&gt;But overarching story aside, even specific scenes are strongly derivative:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There&amp;rsquo;s an absurdist foreign interview of the lead in each.&lt;/li&gt;
&lt;li&gt;Both heavily feature scenes of characters lying in beds.&lt;/li&gt;
&lt;li&gt;Each has meta-scenes of leads watching TV.&lt;/li&gt;
&lt;li&gt;Both include a scene of another woman sleeping over with the lead, and the awkward morning-after interaction with the young woman about it.&lt;/li&gt;
&lt;li&gt;There are scenes of the characters swimming around in upscale hotel pools.&lt;/li&gt;
&lt;li&gt;Each has a scene of the lead watching strippers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I understand having a few callbacks in there to the filmmaker&amp;rsquo;s previous work, but this is something else.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Lost in Translation&lt;/em&gt; is clearly the far better movie. My takeaway is that although it had a good script, Bill Murray and the overwhelming chemistry between him and Scarlett Johansson carried that movie. Switch out those two leads, and it&amp;rsquo;s very possible that like &lt;em&gt;Somewhere&lt;/em&gt;, almost no one would have heard of it.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>at arm’s length</title>
    <link href="https://blog.cjquines.com/post/at-arms-length/"/>
    <id>https://blog.cjquines.com/post/at-arms-length/</id>
    <updated>2026-04-05T04:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>The Second Wave of the API-first Economy</title>
    <link href="https://brandur.org/second-wave-api-first"/>
    <id>tag:brandur.org,2026-03-27:second-wave-api-first</id>
    <updated>2026-03-27T15:11:08Z</updated>


    <content type="html">&lt;p&gt;Fifteen years ago, when some colleagues and I were building Heroku&amp;rsquo;s V3 API, we set an ambitious goal: the public API should be powerful enough to run our own dashboard. No private endpoints, no escape hatches.&lt;/p&gt;

&lt;p&gt;It was a stretch, but it worked. A new version of the company&amp;rsquo;s dashboard shipped on V3, and an unaffiliated developer who we&amp;rsquo;d never met before built Heroku&amp;rsquo;s first iOS app on it, without a single feature request sent our way.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;first-wave&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#first-wave&quot;&gt;The first wave&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Our dashboard-on-public-APIs-only goal seems needlessly idealistic nowadays, but it was an objective born of the time. The year was 2011, and the optimism around the power of APIs was palpable. A new world was opening up. One of openness, interconnectivity, unbounded possibility.&lt;/p&gt;

&lt;p&gt;And we weren&amp;rsquo;t the only ones thinking that way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Only a year before (2010) Facebook released its original Open Graph API, providing immensely powerful insights into its platform data.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Twitter&amp;rsquo;s API at the time was almost completely open. You didn&amp;rsquo;t even need an OAuth token &amp;mdash; just authenticate on API endpoints with your username/password and get access to just about anything.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;GitHub was doing really impressive API design work, providing an expansive, feature-complete API with access to anything developers could need, and playing with forward-thinking ideas like hypermedia APIs/HATEOAS.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can still find traces of this bygone era, standing like some cyclopean ruins from a previous age. Hit the root GitHub API and you&amp;rsquo;ll find an artifact over a decade old &amp;mdash; a list of links that were intended to be followed as &lt;a href=&quot;https://en.wikipedia.org/wiki/HATEOAS&quot;&gt;hypermedia&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ curl https://api.github.com | jq

{
  &amp;quot;current_user_url&amp;quot;: &amp;quot;https://api.github.com/user&amp;quot;,
  &amp;quot;current_user_authorizations_html_url&amp;quot;: &amp;quot;https://github.com/settings/connections/applications{/client_id}&amp;quot;,
  &amp;quot;authorizations_url&amp;quot;: &amp;quot;https://api.github.com/authorizations&amp;quot;,
  &amp;quot;code_search_url&amp;quot;: &amp;quot;https://api.github.com/search/code?q={query}{&amp;amp;page,per_page,sort,order}&amp;quot;,
  &amp;quot;commit_search_url&amp;quot;: &amp;quot;https://api.github.com/search/commits?q={query}{&amp;amp;page,per_page,sort,order}&amp;quot;,
  &amp;quot;emails_url&amp;quot;: &amp;quot;https://api.github.com/user/emails&amp;quot;,
  &amp;quot;emojis_url&amp;quot;: &amp;quot;https://api.github.com/emojis&amp;quot;,
  &amp;quot;events_url&amp;quot;: &amp;quot;https://api.github.com/events&amp;quot;,
  ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This wasn&amp;rsquo;t a pre-planned, stack-ranked feature that a product team spent half a year putting together. It was one or two early engineers who got really excited about an API idea, and shipped it, probably without even asking for permission.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Part of the push for open APIs was simple good will towards the rest of the world. The engineers building them were brought up in the earliest days of the internet, steeped in its original counterculture, and had an innate bias for radical openness.&lt;/p&gt;

&lt;p&gt;There was also a feeling from the companies involved that the APIs would be beneficial for their bottom lines. Users and third parties would use APIs to supplement the core product with add-ons and extensions that&amp;rsquo;d drive growth and increase product retention and satisfaction.&lt;/p&gt;

&lt;p&gt;Sites like the now defunct ProgrammableWeb popped up to discuss and catalog the newly appearing APIs, and the &amp;ldquo;programmable web&amp;rdquo; wasn&amp;rsquo;t only a website, it was a principle.&lt;/p&gt;

&lt;p&gt;In the near future, &lt;em&gt;all&lt;/em&gt; platforms would be API-first, providing full programmatic access and opening a new wave of interoperability across the web that&amp;rsquo;d let any service talk to any other service and massively accelerate the scope and reach of the internet. APIs would help expand everything from freedom to communication to commerce. An overwhelming force for good in the world.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;api-winter&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#api-winter&quot;&gt;API winter&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Of course, it didn&amp;rsquo;t last. The programmable web went through a phase of expansion, reached its maximum extent, and began to contract.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Twitter&amp;rsquo;s famous API, which used to be an API tinkerer&amp;rsquo;s dream, leveled off and began to dip as the company struggled to find ways to generate revenue. New features no longer got first-class API treatment. Access to the firehose was closed. Third-party Twitter clients were restricted and eventually locked out.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;The power of Facebook&amp;rsquo;s Graph API was hugely constricted after Cambridge Analytica, where a single rogue app was able to suck up data on millions of users and put it up for sale. Strict app review procedures were implemented. The API went from open access to a walled garden.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Even more extreme, Instagram&amp;rsquo;s previously public API was deprecated totally. Realizing they had a real money maker on their hands, they saw no reason to share ad revenue with anyone else. Use Instagram through the first-party app or not at all.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Even APIs like GitHub&amp;rsquo;s that stayed quite open had to crack down to a degree. Endpoints became authenticated by necessity and aggressive rate limiting was put in to curb abuse and reduce operational toil. And even when APIs were still largely accessible, using them to build a full-scale third-party app became more difficult as limiters flattened heavy (even if legitimate) use.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rationale for why APIs were being declawed or disappearing completely varied&amp;mdash;abuse, monetization pressure, competitive risk, privacy, etc.&amp;mdash;but the pattern was clear. Walls were going up across the world.&lt;/p&gt;

&lt;p&gt;APIs didn&amp;rsquo;t disappear, but it was a cold winter for them. The expectation of an API became more limited to developer-focused platforms whose users paid them &amp;mdash; Stripe, Twilio, Slack, etc. When new consumer products appeared on the market (e.g. TikTok), no one expected them to have much in the way of an API.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;second-wave&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#second-wave&quot;&gt;The coming second wave&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;For many years this was the status quo. If you were using Twitter, you&amp;rsquo;d use it from Twitter.com. Facebook, from Facebook.com. Instagram or TikTok, from their respective iOS/Android apps. Developer products like GitHub and Stripe continued strong, but elsewhere, APIs weren&amp;rsquo;t enough of a competitive advantage for anyone who didn&amp;rsquo;t have one to suffer.&lt;/p&gt;

&lt;p&gt;But around mid-2025, the world changed. The last half year especially has been distinguished by the rise of indescribably powerful LLMs, which now dominate discourse as the most useful new tool in a generation.&lt;/p&gt;

&lt;p&gt;They&amp;rsquo;re already useful enough as incredible trivia machines or code generators, but they really start to shine when they integrate with things. It&amp;rsquo;s pretty neat having one generate a valid Kubernetes configuration for your new app, but it&amp;rsquo;s &lt;em&gt;really&lt;/em&gt; neat watching it provision an &lt;acronym title=&quot;Amazon Elastic Kubernetes Service&quot;&gt;EKS&lt;/acronym&gt; cluster via &lt;code&gt;awscli&lt;/code&gt; and send out its first production deploy on your behalf.&lt;/p&gt;

&lt;p&gt;Suddenly, an API is no longer a liability, but a major saleable vector to give users what they want: a way into the services they use and pay for so that an agent can carry out work on their behalf. Especially given a field of relatively undifferentiated products, in the near future the availability of an API might just be the crucial deciding factor that leads to one choice winning the field.&lt;/p&gt;

&lt;h3 id=&quot;my-future-bank&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#my-future-bank&quot;&gt;Picking my future bank&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Let&amp;rsquo;s think about banks. I have a couple bank accounts, each offering a standard set of features largely unchanged since the 60s. If I call them, they&amp;rsquo;ll send me some checks. I can request a transfer between two internal accounts and they will transfer the money &amp;hellip; in 1-5 business days. Nowadays, they even offer ultra-modern features (from 2010) like &lt;em&gt;gasp&lt;/em&gt;, MFA, just as long as it&amp;rsquo;s through a provider that&amp;rsquo;s paid them off (Symantec VIP). Suffice it to say, they&amp;rsquo;re comfortable in the status quo. My banks do not have good APIs.&lt;/p&gt;

&lt;p&gt;So far this has worked out okay for them. People aren&amp;rsquo;t known to migrate banks often, and even if they did, regulatory moats make new entrants rare.&lt;/p&gt;

&lt;p&gt;But in the modern age, can it last? When I want to move $100 from one bank to another, my banks put me through a humiliating ritual of logging into both accounts and clearing multiple security checks and captchas before I can perform any operation. All this despite my having just logged into both accounts from this exact location and biometrically secured computer the day before.&lt;/p&gt;

&lt;p&gt;The world I &lt;em&gt;want&lt;/em&gt; is to instruct an LLM: &amp;ldquo;move $100 from Wells Fargo checking to Charles Schwab brokerage&amp;rdquo; and it will just &lt;em&gt;happen&lt;/em&gt;. And to be fair, LLMs are already so absurdly good at reverse engineering things that this might already work today. But you know what&amp;rsquo;d work better? If both banks shipped with APIs, LLM-friendly usage instructions (through MCP or the like), and a strong auth layer to give me confidence that the whole process is secure.&lt;/p&gt;

&lt;p&gt;If I were choosing a bank today, some considerations would be the same as they&amp;rsquo;ve always been&amp;mdash;competent security, free checking, no foreign transaction fees&amp;mdash;but I&amp;rsquo;d also futureproof the choice by picking one that&amp;rsquo;s established technical bona fides by providing an API. Even if I&amp;rsquo;m not ready to trust my banking credentials to an agent quite yet, I assume that this day is coming.&lt;/p&gt;

&lt;h3 id=&quot;ubiquitous-again&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ubiquitous-again&quot;&gt;Ubiquitous again&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Now apply the same principle to every service you use during the course of a week, or ever:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Online marketplaces:&lt;/strong&gt; Robot, schedule my normal Amazon Fresh order for the first available slot tomorrow morning.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Office co-working:&lt;/strong&gt; Robot, book me a desk at Embarcadero Center today.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ski resorts:&lt;/strong&gt; Robot, buy me a day pass for tomorrow and load it to my resort card. Confirm the price with me first.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Restaurants:&lt;/strong&gt; Robot, put in my usual lunch order at Musubi Kai. Get me the unadon!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Where &lt;em&gt;wouldn&amp;rsquo;t&lt;/em&gt; you want an API?&lt;/p&gt;

&lt;p&gt;Forecasting the future is infamously hazardous, but based on the adoption patterns of myself and the people around me, I expect the demand to interact with services through LLMs is going to be overwhelming, and services aiming to provide a good product experience or which face competitive pressure (i.e. someone else could provide that experience instead) will offer APIs.&lt;/p&gt;

&lt;p&gt;I used to wish that we&amp;rsquo;d gone down an alternative branch of web technology and adopted a protocol like &lt;a href=&quot;https://en.wikipedia.org/wiki/Gopher_(protocol)&quot;&gt;Gopher&lt;/a&gt; so we&amp;rsquo;d have a more standardized web experience instead of every product you use producing its own unique UX, most bad. I think we will see more standardization, just not in the form I expected. The convention of the future will be human language, fed into what looks a lot like a terminal, and fulfilled via API.&lt;/p&gt;

&lt;h3 id=&quot;on-behalf-of-people&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#on-behalf-of-people&quot;&gt;On behalf of people&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Notably, this is different than the first wave of APIs that I described above. Instead of APIs existing to offer infinitely flexible access for inter-service communication, scrape data, or build apps on top of someone else&amp;rsquo;s platform, their primary use will be to fulfill requests on behalf of a primary user. Exactly like what they&amp;rsquo;d be doing through a first-party app, but in a programmatic way.&lt;/p&gt;

&lt;figure&gt;
    &lt;img alt=&quot;During the first wave, APIs were largely aimed at third parties who&#39;d use them to extend and augment the underlying platform to provide additional features for users.&quot; class=&quot;overflowing&quot; loading=&quot;lazy&quot; src=&quot;/assets/images/second-wave-api-first/first-wave.svg&quot;&gt;
    &lt;figcaption&gt;During the first wave, APIs were largely aimed at third parties who&#39;d use them to extend and augment the underlying platform to provide additional features for users.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;img alt=&quot;In the second wave, APIs map cleanly to normal product capabilities. They provide programmatic access for agents that act on behalf of people.&quot; class=&quot;overflowing&quot; loading=&quot;lazy&quot; src=&quot;/assets/images/second-wave-api-first/second-wave.svg&quot;&gt;
    &lt;figcaption&gt;In the second wave, APIs map cleanly to normal product capabilities. They provide programmatic access for agents that act on behalf of people.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;It may seem like a subtle distinction, but there are considerable differences. The second model better incentivizes APIs to exist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;APIs aren&amp;rsquo;t for building a product that aims to displace the offerings of the underlying platform, but rather for giving users an alternative way to access it.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Security models are simplified because they&amp;rsquo;re the same ones used by the product itself. Users have the same visibility that they&amp;rsquo;d have through a first-party app, and no more.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Aiming to support access patterns for a single person, platforms can rate limit much more aggressively to curb expenses and operational problems associated with offering an API.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;APIs should aim to provide a little more leeway than they would for a human, but only nominally so. An agent acting on my behalf should be able to occasionally poll LinkedIn for old colleagues that I should be reconnecting with and send them connect requests, but if someone&amp;rsquo;s set up their ClawBot to scrape the entire social graph on their behalf, platforms should feel more than free to throttle the hell out of them and give them a strike towards a permanent ban.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.slack.dev/ai/slack-mcp-server/#rate-limits&quot;&gt;Slack&amp;rsquo;s rate limits&lt;/a&gt; are a good example of this, supporting numbers like 50 channel or 100 profile reads per minute. You can&amp;rsquo;t build a multi-user app with 50 channel reads per minute, but it&amp;rsquo;s plenty for a single user to access their own account.&lt;/p&gt;
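
&lt;p&gt;To make the single-user budget concrete, here is a minimal sketch of a per-user sliding-window limiter. It is an illustration only: the class name, its parameters, and the injectable clock are hypothetical, and nothing here reflects how Slack or any other platform actually implements its limits. The 50-per-minute figure is simply borrowed from the numbers cited above.&lt;/p&gt;

```python
import time
from collections import defaultdict, deque


class PerUserRateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds per user.

    Hypothetical helper for illustration; not any platform's real implementation.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # user id -> timestamps of that user's recently accepted calls
        self.calls = defaultdict(deque)

    def allow(self, user_id):
        now = self.clock()
        recent = self.calls[user_id]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over budget: the caller should back off and retry later
        recent.append(now)
        return True


# Assumed budget mirroring the 50 channel reads per minute mentioned above.
channel_reads = PerUserRateLimiter(limit=50, window=60.0)
```

&lt;p&gt;Because the budget is keyed by user, an agent working through one person&amp;rsquo;s account stays comfortably inside it, while anything trying to behave like a multi-user app or a scraper runs out almost immediately.&lt;/p&gt;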

&lt;h3 id=&quot;limits-of-the-model&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#limits-of-the-model&quot;&gt;Limits of the model&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;While we can expect many products and services to offer APIs for good agentic interoperability, it won&amp;rsquo;t be forthcoming everywhere.&lt;/p&gt;

&lt;p&gt;Don&amp;rsquo;t expect much out of Instagram, TikTok, or other platforms that power themselves with ads. Nor from monopolies that won&amp;rsquo;t feel any serious pressure to change &amp;mdash; you won&amp;rsquo;t be reliably paying your Xfinity bill via agent anytime soon.&lt;/p&gt;

&lt;h3 id=&quot;future-today&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#future-today&quot;&gt;Hints of the future, today&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;In this section I figured I&amp;rsquo;d call out a few services that are already pulling this future forward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;As I was in the middle of writing this essay, I got a &lt;a href=&quot;https://world.hey.com/dhh/basecamp-becomes-agent-accessible-3ae6b949&quot;&gt;note from Basecamp&lt;/a&gt; that they&amp;rsquo;d revamped themselves for LLM accessibility, including a &lt;a href=&quot;https://github.com/basecamp/bc3-api&quot;&gt;new API&lt;/a&gt;, a &lt;a href=&quot;https://github.com/basecamp/basecamp-cli&quot;&gt;new CLI&lt;/a&gt;, and a &lt;a href=&quot;https://github.com/basecamp/basecamp-cli/blob/main/skills/basecamp/SKILL.md&quot;&gt;bundled skill&lt;/a&gt; to instruct agents on their use.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Salesforce introduced their &lt;a href=&quot;https://www.salesforce.com/news/stories/salesforce-headless-360-announcement/&quot;&gt;&amp;ldquo;Headless 360&amp;rdquo; initiative&lt;/a&gt;, which purports to have made every Salesforce feature accessible by API, MCP, or CLI command.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;api-spring&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#api-spring&quot;&gt;API spring&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Fifteen years ago, we API maximalists thought that APIs were going to eat the world, ushering in a new paradigm of interoperability that would vastly expand our capabilities as users, and even change the world for the better.&lt;/p&gt;

&lt;p&gt;What we got instead was an API winter. As useful as APIs were in some situations, that usefulness was outweighed by concerns around revenue, privacy, and abuse.&lt;/p&gt;

&lt;p&gt;But as scary a thought as it was that this might be the end, it wasn&amp;rsquo;t. We&amp;rsquo;re at the beginning of a new spring of APIs that&amp;rsquo;ll appear to support use by agents acting on behalf of people. As this mode of operation gets more popular, expect the availability of an API to be a competitive edge that differentiates a service from its competitors. The result will be a global proliferation of APIs and expanding product capability like never seen before.&lt;/p&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>Video game March</title>
    <link href="https://blog.cjquines.com/post/video-game-march/"/>
    <id>https://blog.cjquines.com/post/video-game-march/</id>
    <updated>2026-03-22T04:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>What I&#39;ve Worked On at Stainless</title>
    <link href="https://tomeraberba.ch/what-ive-worked-on-at-stainless"/>
    <id>https://tomeraberba.ch/what-ive-worked-on-at-stainless</id>
    <updated>2026-03-21T00:00:00Z</updated>


    <content type="html">At Stainless, I build compiler-like generators that transform API specifications into idiomatic client libraries that feel as though they were hand-written by a language expert who had the time to get…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>In and of itself</title>
    <link href="https://blog.cjquines.com/post/in-and-of-itself/"/>
    <id>https://blog.cjquines.com/post/in-and-of-itself/</id>
    <updated>2026-03-18T04:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>📊 Nobody Gets Promoted for Simplicity ↗</title>
    <link href="https://terriblesoftware.org/2026/03/03/nobody-gets-promoted-for-simplicity/"/>
    <id>tag:violetpixel.com,2026-03-09:nobody-gets-promoted-for-simplicity</id>
    <updated>2026-03-09T15:24:00Z</updated>


    <content type="html">&lt;p&gt;Matheus Lima in a piece that had my head nodding up and down the whole way through:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If you’re an engineer&lt;/strong&gt;, learn that simplicity needs to be made visible. The work doesn’t speak for itself; not because it’s not good, but because most systems aren’t designed to hear it.&lt;/p&gt;
&lt;p&gt;Start with how you talk about your own work. “Implemented feature X” doesn’t mean much. But &lt;em&gt;“evaluated three approaches including an event-driven architecture and a custom abstraction layer, determined that a straightforward implementation met all current and projected requirements, and shipped in two days with zero incidents over six months”&lt;/em&gt;, that’s the same simple work, just described in a way that captures the judgment behind it. The decision &lt;em&gt;not&lt;/em&gt; to build something is a decision, an important one! Document it accordingly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&#39;ve met a lot of great engineers who do absolutely amazing work.  Many of them have no idea how to articulate and celebrate what they accomplish in a way that resonates with leadership.  This post should be required reading for every software developer.&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/engineering/&quot;&gt;Engineering&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/software_development/&quot;&gt;Software Development&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/code/&quot;&gt;Code&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/design/&quot;&gt;Design&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-03-09-nobody-gets-promoted-for-simplicity/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>🏛️ Archive of Archives ↗</title>
    <link href="https://archiveofarchives.com/?ref=simplebits.com#site-footer"/>
    <id>tag:violetpixel.com,2026-03-08:archive-of-archives</id>
    <updated>2026-03-08T15:39:00Z</updated>


    <content type="html">&lt;p&gt;One of the best things about the web is the digital collections various people curate and share.  Now we have an archive of those archives:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Archive of Archives is a curated collection of digital archives developed by &lt;a href=&quot;https://selfhelpartpublishingempire.com/&quot;&gt;SHAPE&lt;/a&gt;. All are welcome to contribute to the project. Archive submission guidelines can be found on &lt;a href=&quot;https://github.com/davidsizemore/archive-of-archives&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Superb!&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/web/&quot;&gt;Web&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/creativity/&quot;&gt;Creativity&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/inspiration/&quot;&gt;Inspiration&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/fun/&quot;&gt;Fun&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/resources/&quot;&gt;Resources&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-03-08-archive-of-archives/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>Minimum force</title>
    <link href="https://blog.cjquines.com/post/minimum-force/"/>
    <id>https://blog.cjquines.com/post/minimum-force/</id>
    <updated>2026-03-08T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>Professional vision</title>
    <link href="https://blog.cjquines.com/post/professional-vision/"/>
    <id>https://blog.cjquines.com/post/professional-vision/</id>
    <updated>2026-02-27T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>🏙️ Why is the sky blue? ↗</title>
    <link href="https://explainers.blog/posts/why-is-the-sky-blue/"/>
    <id>tag:violetpixel.com,2026-02-24:why-is-the-sky-blue</id>
    <updated>2026-02-24T21:16:00Z</updated>


    <content type="html">&lt;p&gt;Erik Kennedy goes into incredible detail to explain exactly why the sky appears blue.&lt;/p&gt;
&lt;p&gt;But...&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As you saw above, violet scatters &lt;em&gt;more&lt;/em&gt; than blue. So why isn’t the sky purple? The dumb but true answer is: &lt;em&gt;our eyes are just worse at seeing violet&lt;/em&gt;. It’s the very highest frequency of light we can see; it’s riiight on the edge of our perception.&lt;/p&gt;
&lt;p&gt;But! – if we could see violet as well as blue, the sky &lt;em&gt;would&lt;/em&gt; appear violet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Neat!&lt;/p&gt;
&lt;p class=&quot;tags&quot;&gt;&lt;span&gt;🏷️ &lt;a href=&quot;https://violetpixel.com/tags/&quot;&gt;Tags&lt;/a&gt;:&lt;/span&gt; &lt;a href=&quot;https://violetpixel.com/tags/science/&quot;&gt;Science&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/nature/&quot;&gt;Nature&lt;/a&gt;, &lt;a href=&quot;https://violetpixel.com/tags/links/&quot;&gt;Links&lt;/a&gt;&lt;/p&gt;
&lt;p class=&quot;permalink&quot;&gt;📄 &lt;a href=&quot;https://violetpixel.com/blog/2026-02-24-why-is-the-sky-blue/&quot;&gt;Permalink to this post&lt;/a&gt;&lt;/p&gt;</content>

    <author>
      <name>Justin (violetpixel)</name>

      <uri>https://violetpixel.com</uri>

    </author>
  </entry>

  <entry>
    <title>Notes on senses</title>
    <link href="https://blog.cjquines.com/post/senses/"/>
    <id>https://blog.cjquines.com/post/senses/</id>
    <updated>2026-02-15T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>Building a 24-bit Arcade CRT Display Adapter, From Scratch</title>
    <link href="https://www.scd31.com/posts/building-an-arcade-display-adapter"/>
    <id>https://www.scd31.com/posts/building-an-arcade-display-adapter</id>
    <updated>2026-02-04T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>On the MIT Mystery Hunt 2026</title>
    <link href="https://blog.cjquines.com/post/mystery-hunt-2026/"/>
    <id>https://blog.cjquines.com/post/mystery-hunt-2026/</id>
    <updated>2026-02-01T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>the cruelty of air</title>
    <link href="https://blog.cjquines.com/post/cruelty-of-air/"/>
    <id>https://blog.cjquines.com/post/cruelty-of-air/</id>
    <updated>2026-01-11T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>Some stories from developing puzlink.js</title>
    <link href="https://blog.cjquines.com/post/puzlink-js/"/>
    <id>https://blog.cjquines.com/post/puzlink-js/</id>
    <updated>2026-01-03T05:00:00Z</updated>


    <content type="html"></content>

    <author>
      <name>CJ Quines</name>

      <uri>https://cjquines.com</uri>

    </author>
  </entry>

  <entry>
    <title>Actual Lines (And GIFs) From Tech Recruiter Emails</title>
    <link href="https://tomeraberba.ch/actual-lines-from-tech-recruiter-emails"/>
    <id>https://tomeraberba.ch/actual-lines-from-tech-recruiter-emails</id>
    <updated>2025-12-13T00:00:00Z</updated>


    <content type="html">Inspired by CJ Quines&#39;s original blog post, which may or may not be a joke, hard to tell... Anyway, the following are real, I promise! &quot;Tomer?&quot; (this was the whole email) &quot;Transform an unsexy…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>dd is my fdisk</title>
    <link href="https://www.scd31.com/posts/dd-is-my-fdisk"/>
    <id>https://www.scd31.com/posts/dd-is-my-fdisk</id>
    <updated>2025-12-01T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>I Program on the Subway</title>
    <link href="https://www.scd31.com/posts/programming-on-the-subway"/>
    <id>https://www.scd31.com/posts/programming-on-the-subway</id>
    <updated>2025-11-16T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Introducing Glide, an extensible, keyboard-focused web browser</title>
    <link href="https://blog.craigie.dev/introducing-glide/"/>
    <id>https://blog.craigie.dev/introducing-glide/</id>
    <updated>2025-09-30T20:00:00Z</updated>


    <content type="html">&lt;style&gt;
#demo-video {
  border: 2px solid #7f7474;
  border-radius: 8px;
}
@media screen and (max-width: 768px) {
  #demo-video {
    max-width: 100%;
    height: auto;
  }
}
&lt;/style&gt;
&lt;p&gt;TL;DR: &lt;a href=&#39;https://glide-browser.app&#39;&gt;Glide&lt;/a&gt; is a Firefox fork with a TypeScript &lt;a href=&#39;https://glide-browser.app/config&#39;&gt;config&lt;/a&gt; that lets you build &lt;em&gt;anything&lt;/em&gt;.&lt;/p&gt;
&lt;h5 id=invisible-heading&gt;invisible-heading&lt;/h5&gt;&lt;p&gt;&lt;a href=&#39;https://glide-browser.app/#download&#39;&gt;Download&lt;/a&gt; - &lt;a href=&#39;https://glide-browser.app/&#39;&gt;Docs&lt;/a&gt; - &lt;a href=&#39;https://glide-browser.app/cookbook&#39;&gt;Cookbook&lt;/a&gt; - &lt;a href=&#39;https://github.com/glide-browser/glide&#39;&gt;Source&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;br&gt;
&lt;p&gt;Browsers should be hackable, just like your editor.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;keymaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;normal&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;gC&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// extract the owner and repo from a url like &amp;#39;https://github.com/glide-browser/glide&amp;#39;&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;owner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;pathname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;/&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;owner&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;ow&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;ne&quot;&gt;Error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;current URL is not a github repo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// * clone the current github repo to ~/github.com/$owner/$repo&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// * start kitty with neovim open at the cloned repo&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo_path&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;home_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;github.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;owner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;gh&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;repo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;clone&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]);&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;kitty&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;-d&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;repo_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;nvim&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;cwd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;repo_path&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;open the GitHub repo in the focused tab in Neovim&amp;quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;Glide keymapping to clone the GitHub repo in the current tab and open it in &lt;a href=&#39;https://sw.kovidgoyal.net/kitty/&#39;&gt;kitty&lt;/a&gt; + &lt;a href=&#39;https://neovim.io/&#39;&gt;neovim&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For my job, I have to clone many different repos. This keymapping saves me a couple seconds each time, multiple times a day.&lt;/p&gt;
&lt;p&gt;Although it probably doesn&#39;t save me much time, adding this mapping only took a couple of minutes and using it makes me happy.&lt;/p&gt;
&lt;h3 id=why-i-built-glide&gt;Why I built Glide&lt;/h3&gt;&lt;p&gt;I was using &lt;a href=&#39;https://addons.mozilla.org/en-US/firefox/addon/tridactyl-vim/&#39;&gt;tridactyl&lt;/a&gt; within Firefox and generally enjoying it, but occasionally I would run into frustrating issues due to security constraints imposed by Firefox on web extensions. For example, extensions are completely disabled on &lt;a href=&#39;https://addons.mozilla.org&#39;&gt;addons.mozilla.org&lt;/a&gt;, so all of my mappings could break depending on the website I had open. Additionally, tridactyl wouldn&#39;t work with a custom homepage.&lt;/p&gt;
&lt;p&gt;These security constraints imposed on tridactyl, and every other extension, are fundamental to how extensions operate. For example, it would be very bad if an extension could prevent itself from being uninstalled by modifying &lt;a href=&#39;https://addons.mozilla.org&#39;&gt;addons.mozilla.org&lt;/a&gt;, so browsers have to protect users from potentially malicious extension writers.&lt;/p&gt;
&lt;p&gt;At that point, I realised there was an opportunity to make a browser that&#39;s &lt;em&gt;truly&lt;/em&gt; customisable at its core, with no restrictions on what you can accomplish. From customising the browser UI itself, to calling out to other tools—anything should be possible.&lt;/p&gt;
&lt;h3 id=how-glide-is-different&gt;How Glide is different&lt;/h3&gt;&lt;p&gt;Glide holistically solves these usability issues by supporting a TypeScript &lt;a href=&#39;https://glide-browser.app/config&#39;&gt;config&lt;/a&gt; that lets you do &lt;em&gt;anything&lt;/em&gt;&lt;sup class=&quot;footnote-ref&quot; id=&quot;fnref-1&quot;&gt;&lt;a href=&quot;#fn-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Glide can support APIs and functionality that will never be supported in web extensions as the security model is fundamentally different. You—the end user—are responsible for the config, so there&#39;s no reason to restrict what you can do.&lt;/p&gt;
&lt;p&gt;In the Glide config you can define custom &lt;a href=&#39;https://glide-browser.app/keys&#39;&gt;key mappings&lt;/a&gt;, access the &lt;a href=&#39;https://glide-browser.app/extensions&#39;&gt;web extensions API&lt;/a&gt;, spawn arbitrary &lt;a href=&#39;https://glide-browser.app/api#glide.process&#39;&gt;processes&lt;/a&gt;, define &lt;a href=&#39;https://glide-browser.app/api#glide.keys&#39;&gt;macros&lt;/a&gt;, and more.&lt;/p&gt;
&lt;p&gt;Here&#39;s a small example that adds &lt;kbd&gt;g&lt;/kbd&gt;+&lt;kbd&gt;c&lt;/kbd&gt; as a key mapping to switch to the calendar tab:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// ~/.config/glide/glide.ts&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;keymaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;normal&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;gc&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tab&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;glide&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tabs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;get_first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;https://calendar.google.com/*&amp;quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tab&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;browser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tabs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[g]o to [c]alendar.google.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For more examples, see the &lt;a href=&#39;https://glide-browser.app/cookbook&#39;&gt;cookbook&lt;/a&gt; or my own &lt;a href=&#39;https://github.com/RobertCraigie/dotfiles/tree/main/glide&#39;&gt;dotfiles&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The cherry on top is that all of this is built on &lt;em&gt;top&lt;/em&gt; of Firefox. If you already use Firefox, then your existing extensions and workflows will still work in Glide.&lt;/p&gt;
&lt;h3 id=modes&gt;Modes&lt;/h3&gt;&lt;p&gt;Glide borrows the concept of modes from (neo)vim; every key mapping you define will be attached to a specific mode.&lt;/p&gt;
&lt;p&gt;Glide switches between modes automatically when you interact with the browser. For example, in the default mode, &lt;code&gt;normal&lt;/code&gt;, if you click on an &lt;code&gt;&amp;lt;input&amp;gt;&lt;/code&gt; element, Glide will switch to &lt;code&gt;insert&lt;/code&gt; mode, so that key mappings don&#39;t interfere with entering text.&lt;/p&gt;
&lt;p&gt;If a website doesn&#39;t play well with your key mappings, you can switch to &lt;code&gt;ignore&lt;/code&gt; mode by pressing &lt;kbd&gt;Shift&lt;/kbd&gt;+&lt;kbd&gt;Escape&lt;/kbd&gt;. In this mode, the only default key mapping is &lt;kbd&gt;Shift&lt;/kbd&gt;+&lt;kbd&gt;Escape&lt;/kbd&gt; to exit ignore mode.&lt;/p&gt;
&lt;h3 id=navigating&gt;Navigating&lt;/h3&gt;&lt;p&gt;Glide supports a &lt;code&gt;hint&lt;/code&gt; mode that lets you operate web pages entirely using the keyboard.&lt;/p&gt;
&lt;p&gt;Press &lt;code&gt;f&lt;/code&gt; to enter &lt;code&gt;hint&lt;/code&gt; mode and Glide will overlay text &lt;a href=&#39;https://glide-browser.app/hints#label-generation&#39;&gt;labels&lt;/a&gt; over every &lt;a href=&#39;https://glide-browser.app/hints#hintable-elements&#39;&gt;hintable&lt;/a&gt; element, e.g. links and buttons. Typing the &lt;a href=&#39;https://glide-browser.app/hints#label-generation&#39;&gt;label&lt;/a&gt; for a hint will then focus and click the element.&lt;/p&gt;
&lt;p&gt;&lt;video
id=&quot;demo-video&quot;
width=&quot;690&quot;
height=&quot;497&quot;
controls
autoplay
loop
title=&quot;Demo video showing Glide&#39;s support for hints&quot;&gt;
  &lt;source src=&quot;https://bear-images.sfo2.cdn.digitaloceanspaces.com/craigie/demo-hints.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;&lt;/p&gt;
&lt;h3 id=personal-favourite-features&gt;Personal favourite features&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;gI&lt;/code&gt; focuses the largest visible input element on the page and it feels like magic every time I use it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;&amp;lt;space&amp;gt;&amp;lt;space&amp;gt;&lt;/code&gt; opens a tab fuzzy finder, which is invaluable for finding that &lt;em&gt;one&lt;/em&gt; tab you always lose.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;&amp;lt;c-i&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;c-o&amp;gt;&lt;/code&gt; are also invaluable for navigating previously open tabs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;:repl&lt;/code&gt; for quickly testing out config changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#39;https://glide-browser.app/hints&#39;&gt;hints&lt;/a&gt; for when I don&#39;t want to reach for my mouse.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;a href=&#39;https://github.com/folke/which-key.nvim&#39;&gt;which-key&lt;/a&gt; inspired UI for reminding you of the many key mappings.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;br&gt;
&lt;p&gt;I&#39;ve been daily driving Glide for ~6 months now and while I&#39;m biased, I love it. It&#39;s still in a very early alpha stage but if you&#39;d like to try it out, you can &lt;a href=&#39;https://glide-browser.app/#download&#39;&gt;download&lt;/a&gt; it for macOS / Linux today. I&#39;d recommend checking out the tutorial with &lt;code&gt;:tutor&lt;/code&gt; to get your bearings, although it&#39;s far from complete yet.&lt;/p&gt;
&lt;p&gt;p.s. sorry Linux folks, Glide isn&#39;t in any package repositories yet, so you&#39;ll have to untar it and set up Glide manually for now.&lt;/p&gt;
&lt;br&gt;
&lt;hr /&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol&gt;
&lt;li id=&quot;fn-1&quot;&gt;&lt;p&gt;As Glide is in very early alpha there are missing APIs so you can&#39;t literally do &lt;em&gt;everything&lt;/em&gt; yet, but enabling full control is one of the main goals.&lt;a href=&quot;#fnref-1&quot; class=&quot;footnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;</content>

    <author>
      <name>Robert Craigie</name>

      <uri>https://craigie.dev</uri>

    </author>
  </entry>

  <entry>
    <title>How to make a rare birds hotline in 2025</title>
    <link href="https://beak-v2.onrender.com/blog/bird-calls"/>
    <id>https://beak-v2.onrender.com/blog/bird-calls</id>
    <updated>2025-09-06T00:00:00Z</updated>


    <content type="html">&lt;p&gt;I had an idea the other day when working on some AI voice stuff... what if you could re-create the old birding hotlines but have them automatically update upon recent sightings? What would it be like to be able to call a number and hear which birds are currently being seen in an area, knowing that the information was always up to date? How feasible is this with current tooling and would it even be fun to use? Let&#39;s see!&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;beginnings&quot;&gt;Beginnings&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#beginnings&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Beginnings&quot; title=&quot;Direct link to Beginnings&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The other day at work I was helping a client &lt;a href=&quot;https://openai.com/index/introducing-gpt-realtime/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;launch a new feature for a &quot;real time&quot; AI voice API&lt;/a&gt;. I was helping update their SDKs to support some new features and as part of that I needed to use some quick examples to test them out. I had never really used any &quot;audio&quot; forms of AI before so I honestly had no idea what was even possible here in mid-2025. I fired up one of our &quot;push to talk&quot; examples, and was prompted to record a prompt to the AI. As usual, my thoughts turned to birds, so I asked &quot;what are some good birds to see in New York right now?&quot; I was quite surprised when a very convincing voice replied back with some genuinely good advice about fall migrants.&lt;/p&gt;
&lt;p&gt;I quickly moved on to polishing the examples up and getting the release out the door, but the experience stuck with me. I, like so many others no doubt, am more than a bit uncertain about the strange new future we live in. At the same time, I&#39;ve also seen real, tangible ways that AI has helped speed up my work or just get rid of hassles or blockers (like setting up the frontend for half of this site). So I got to thinking, how could I use this voice stuff when it comes to birding?&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;backstory-on-birding-hotlines&quot;&gt;Backstory on birding hotlines&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#backstory-on-birding-hotlines&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Backstory on birding hotlines&quot; title=&quot;Direct link to Backstory on birding hotlines&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I&#39;m honestly not the best person to explain these. They predated my time in birding by probably two decades. However, I&#39;ve heard them detailed in more than a few birding stories or memoirs so I have a good idea of how they came about and how they worked. Back around the 1970s, when listing and Big Years really started taking off, one of the biggest enablers was the ability to get more and more up-to-date information about where rare birds might be. As birders started tearing across the country to try and see more and more birds, it became quite helpful to answer questions like &quot;What rare birds are in Florida today?&quot; or &quot;Is the Amur Stonechat continuing in Texas right now?&quot; I believe answering these first became possible as more birders came to know each other and would just call around to their friends or network to find out information. As the field grew though, I imagine it got more than a bit annoying (and expensive) to always be &quot;on call&quot; for a location or a rarity. Thus, the hotlines were born. Once a day or so, an area&#39;s birding club or society would record a message about the day&#39;s rarities:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Hudsonian Godwit found by Leo continues on Jamaica Bay&#39;s East Pond today. Additionally, a Wilson&#39;s Phalarope was reported just west of there at Floyd Bennett Field. The Red-footed Booby first found by Tim has not been seen since Tuesday morning.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Not only was this helpful for Big Year birders, who might be trying to make the call of diverting a route to pick up another species, but it was also useful for local birders to stay up to date with their local rarities.&lt;/p&gt;
&lt;p&gt;These hotlines continued for quite a while, up until the 2000s when they began to be replaced by quicker and more easily distributed communication methods. Email listservs required much less effort to update, and larger messaging platforms like WhatsApp, GroupMe and now Discord required even less once they entered the picture. eBird itself has also become instrumental in helping track and find rare birds. You can set up email alerts for rare birds across the world and with a few clicks check and see just when the last report of a rare species was.&lt;/p&gt;
&lt;p&gt;All in all, this progression has been a great one for basically everyone involved. If you&#39;re a lister or working on a Big Year it&#39;s never been quicker or easier to track the status of a rare bird whether it&#39;s in your local patch or halfway around the world. I imagine the only job that might&#39;ve gotten more difficult is that of the moderators of these communication mediums. There are countless more birders today and there are bound to be a few mistakes (just like a rare Cuckoo in my patch yesterday seemed to morph into a Blue Jay once I had better viewing conditions). I&#39;m sure the moderators of yore had to deal with this in their capacity too, though maybe it was over a panicked phone call rather than a spurious Discord post.&lt;/p&gt;
&lt;p&gt;So why go back down the path of a hotline? There&#39;s virtually no reason in practical terms. But there&#39;s just something quaint and nice to me about potentially being able to call in to my local patch to hear what&#39;s been going on. Additionally, I imagine a big reason these hotlines fell away is that it was just too much work to record a new update every day, much less every few minutes when a new observation comes in. Today, though, that work might be largely mitigated by the latest technology. And I&#39;m genuinely curious to see if an outdated form of technology might have a second chance when it&#39;s paired with the newest technology we have today.&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;making-this-work&quot;&gt;Making this work&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#making-this-work&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Making this work&quot; title=&quot;Direct link to Making this work&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Like always, I&#39;d like to start simple before going too far with this idea. I&#39;d like to only work with one place first, my local patch McGolrick Park. I&#39;d love to stick with this first because I&#39;m going to have a deep sense of what kind of updates and content would resonate the most with our birders here. I&#39;d like to run this at some sort of regular cadence. Maybe I&#39;ll try hourly to start before seeing if I should make that more frequent, or even experiment with having it update in real time? Fetching this data should be pretty straightforward. I can just ask the eBird API for the observations in the park from the last day. I can also ask for any rarities (though these are very rare in this park). From that, I think I can then send this over to an LLM to have it do two things: first, turn the raw observational data into a human readable (or really, hearable) form. Then, I can use its text-to-speech capabilities to convert that into audio. I&#39;m arguably thinking about this wrong, since both of those steps will probably just happen in one call. Last, I&#39;ll just need to get that recording into a place where Twilio or some other phone provider can pick it up and play it over a &quot;recorded line&quot; of sorts. This last part, the phone bits, is funnily enough the part I&#39;m least familiar with.&lt;/p&gt;
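&lt;p&gt;As a rough sketch of that flow, the four steps can be wired together as plain functions that get passed in. Everything here is an assumption for illustration: &lt;code&gt;update_hotline&lt;/code&gt; and its parameter names are made up, and the stand-in steps at the bottom do not call eBird, an LLM, or Twilio.&lt;/p&gt;

```python
from typing import Callable


def update_hotline(
    fetch: Callable[[], list[dict]],
    narrate: Callable[[list[dict]], str],
    synthesize: Callable[[str], bytes],
    publish: Callable[[bytes], None],
) -> str:
    """Run one refresh: data -> script -> audio -> phone line.

    Every step is passed in, so each one (eBird, an LLM, a
    text-to-speech call, a telephony provider) can be swapped or faked.
    """
    observations = fetch()            # e.g. recent sightings for the park
    script = narrate(observations)    # raw data -> text meant to be heard
    audio = synthesize(script)        # text -> audio bytes
    publish(audio)                    # hand the recording to the phone side
    return script


# Stand-in steps for a dry run; none of these talk to a real service.
published: list[bytes] = []
script = update_hotline(
    fetch=lambda: [{"species": "Hudsonian Godwit"}],
    narrate=lambda obs: "Seen today: " + ", ".join(o["species"] for o in obs),
    synthesize=lambda text: text.encode("utf-8"),
    publish=published.append,
)
```

&lt;p&gt;Passing the steps in keeps the open questions (one LLM call or two, which phone provider) out of the control flow, and an hourly scheduler would only need to call this one function.&lt;/p&gt;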
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;fetching-the-data&quot;&gt;Fetching the data.&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#fetching-the-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Fetching the data.&quot; title=&quot;Direct link to Fetching the data.&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Let&#39;s start with the data. Qualitatively, what I&#39;m looking for here is something like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;first, and most importantly, are there any rare birds in the park?&lt;/li&gt;
&lt;li&gt;overall, which bird species are being seen right now?&lt;/li&gt;
&lt;li&gt;if something interesting is here, did it just show up or has it been here?&lt;/li&gt;
&lt;li&gt;if we can know, where specifically is it in the park? (hello helpful comments!)&lt;/li&gt;
&lt;li&gt;also, who saw the bird? Can we give them credit?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I think roughly we&#39;ll want to make a few calls to the eBird API:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bird species over the last few days in the hotspot&lt;/li&gt;
&lt;li&gt;rare bird species in the last few days in the hotspot&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And maybe that&#39;s it? I might have to make a few follow up calls if I want to see the history of a bird over time (to say something like &quot;The Cerulean Warbler first found by P continues in the Magic Bushes&quot; [what a dream if it did]). But let&#39;s walk before we run.&lt;/p&gt;
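&lt;p&gt;The questions above can be sketched as a small grouping step over the fetched observations. This is a minimal sketch under stated assumptions: the &lt;code&gt;Sighting&lt;/code&gt; record and &lt;code&gt;hotline_lines&lt;/code&gt; helper are hypothetical, the fields are simplified rather than the real eBird schema, and the one-day cutoff for &quot;continues&quot; is an arbitrary choice.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sighting:
    """Hypothetical, simplified record; not the eBird API's real schema."""
    species: str
    observed_at: datetime
    observer: str
    is_rare: bool = False
    location_note: str = ""


def hotline_lines(sightings: list[Sighting], now: datetime) -> list[str]:
    """Turn raw sightings into one line per species, rarities first."""
    by_species: dict[str, list[Sighting]] = {}
    for s in sightings:
        by_species.setdefault(s.species, []).append(s)

    entries = []
    for species, group in by_species.items():
        group.sort(key=lambda s: s.observed_at)
        first, latest = group[0], group[-1]
        rare = any(s.is_rare for s in group)
        # "continues" if the first report is more than a day old,
        # otherwise treat the bird as a fresh arrival.
        continuing = now - first.observed_at > timedelta(days=1)
        verb = "continues" if continuing else "was just reported"
        where = f" at {latest.location_note}" if latest.location_note else ""
        line = f"{species}, first found by {first.observer}, {verb}{where}."
        entries.append((not rare, species, line))

    # Rare species sort first (False sorts before True), then by name.
    return [line for _, _, line in sorted(entries)]
```

&lt;p&gt;Crediting the first observer and reusing the most recent location note covers the &quot;who saw it&quot; and &quot;where is it&quot; questions without any follow-up API calls, as long as the look-back window is long enough to include the first report.&lt;/p&gt;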
&lt;p&gt;Given I already have the eBird API set up nicely as a Python SDK, getting this up and going is pretty easy:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;fetch_observations_for_regions_from_phoebe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; 
style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;PhoebeObservation&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; get_phoebe_client&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;observations&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;recent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        back&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        cat&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;species&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        hotspot&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        region_code&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        include_provisional&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;fetch_notable_observations_for_area_from_phoebe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;PhoebeObservation&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span 
class=&quot;token plain&quot;&gt; get_phoebe_client&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;observations&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;recent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;notable&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        back&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        hotspot&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot; 
style=&quot;color:#36acaa&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        region_code&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;run_bird_calls_job&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    species_observations &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; fetch_observations_for_regions_from_phoebe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;region_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Fetched &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span 
class=&quot;token string-interpolation interpolation builtin&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;species_observations&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt; observations&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    species_found &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;species_code &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; obs &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; species_observations&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Found these species codes: &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;species_found&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    notable_observations &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; fetch_notable_observations_for_area_from_phoebe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; 
style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        region_code&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Fetched &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation builtin&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;notable_observations&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt; notable observations&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span 
class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; notable_observations&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        notable_species_found &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;species_code &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; obs &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; notable_observations&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Found these notable species codes: &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;notable_species_found&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Running that produces some pretty useful results!&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;% uv run src/cloaca/api/bird_calls/main.py&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Fetched 30 observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Found these species codes: {&#39;grcfly&#39;, &#39;bawwar&#39;, &#39;amered&#39;, &#39;eursta&#39;, &#39;dowwoo&#39;, &#39;chswar&#39;, &#39;amerob&#39;, &#39;amecro&#39;, &#39;rthhum&#39;, &#39;laugul&#39;, &#39;veery&#39;, &#39;scatan&#39;, &#39;amhgul1&#39;, &#39;chiswi&#39;, &#39;norcar&#39;, &#39;eawpew&#39;, &#39;rethaw&#39;, &#39;houspa&#39;, &#39;camwar&#39;, &#39;yelwar&#39;, &#39;norwat&#39;, &#39;norpar&#39;, &#39;magwar&#39;, &#39;yebcuc&#39;, &#39;rocpig&#39;, &#39;comyel&#39;, &#39;blujay&#39;, &#39;olsfly&#39;, &#39;comgra&#39;, &#39;moudov&#39;}&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Fetched 0 notable observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Getting 0 notable observations is expected here, so let&#39;s swap over to Jamaica Bay&#39;s East Pond to make sure that part of the query is working:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;% uv run src/cloaca/api/bird_calls/main.py&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Fetched 101 observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Found these species codes: {&#39;comyel&#39;, &#39;merlin&#39;, &#39;cangoo&#39;, &#39;dowwoo&#39;, &#39;gbbgul&#39;, &#39;bkbplo&#39;, &#39;comter&#39;, &#39;snoegr&#39;, &#39;lessca&#39;, &#39;mallar3&#39;, &#39;eursta&#39;, &#39;mutswa&#39;, &#39;blkski&#39;, &#39;rewbla&#39;, &#39;wilpha&#39;, &#39;bubsan&#39;, &#39;leasan&#39;, &#39;fiscro&#39;, &#39;lobdow&#39;, &#39;pecsan&#39;, &#39;baleag&#39;, &#39;carwre&#39;, &#39;semplo&#39;, &#39;stisan&#39;, &#39;greegr&#39;, &#39;whrsan&#39;, &#39;bcnher&#39;, &#39;rudduc&#39;, &#39;norwat&#39;, &#39;sposan&#39;, &#39;rocpig&#39;, &#39;grycat&#39;, &#39;osprey&#39;, &#39;swaspa&#39;, &#39;buwtea&#39;, &#39;houspa&#39;, &#39;forter&#39;, &#39;willet1&#39;, &#39;buggna&#39;, &#39;amewig&#39;, &#39;amerob&#39;, &#39;marwre&#39;, &#39;amgplo&#39;, &#39;balori&#39;, &#39;barswa&#39;, &#39;margod&#39;, &#39;amecro&#39;, &#39;amhgul1&#39;, &#39;sonspa&#39;, &#39;amwpel&#39;, &#39;cedwax&#39;, &#39;laugul&#39;, &#39;hudgod&#39;, &#39;brnthr&#39;, &#39;norhar2&#39;, &#39;amered&#39;, &#39;gloibi&#39;, &#39;norcar&#39;, &#39;shbdow&#39;, 
&#39;treswa&#39;, &#39;houwre&#39;, &#39;ribgul&#39;, &#39;sora&#39;, &#39;amekes&#39;, &#39;renpha&#39;, &#39;normoc&#39;, &#39;norsho&#39;, &#39;pibgre&#39;, &#39;moudov&#39;, &#39;lesyel&#39;, &#39;gadwal&#39;, &#39;btbwar&#39;, &#39;whevir&#39;, &#39;gresca&#39;, &#39;caster1&#39;, &#39;rebwoo&#39;, &#39;botgra&#39;, &#39;wessan&#39;, &#39;amegfi&#39;, &#39;purmar&#39;, &#39;gnwtea&#39;, &#39;perfal&#39;, &#39;doccor&#39;, &#39;rethaw&#39;, &#39;royter1&#39;, &#39;grbher3&#39;, &#39;easpho&#39;, &#39;norfli&#39;, &#39;coohaw&#39;, &#39;greyel&#39;, &#39;wooduc&#39;, &#39;ambduc&#39;, &#39;yelwar&#39;, &#39;houfin&#39;, &#39;ycnher&#39;, &#39;redkno&#39;, &#39;ameoys&#39;, &#39;easkin&#39;, &#39;killde&#39;, &#39;comgra&#39;, &#39;semsan&#39;}&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Fetched 88 notable observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Found these notable species codes: {&#39;lessca&#39;, &#39;wessan&#39;, &#39;purmar&#39;, &#39;amwpel&#39;, &#39;margod&#39;, &#39;wilpha&#39;, &#39;caster1&#39;, &#39;hudgod&#39;, &#39;bubsan&#39;, &#39;sora&#39;}&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Looks like it is (and also I really need to get back over there!).&lt;/p&gt;
&lt;p&gt;Let&#39;s do some quick-and-dirty filtering on the McGolrick data, though, to remove our most common species (I don&#39;t think anyone&#39;s calling in to hear what the Starlings are up to). I can mostly figure these out manually, though it&#39;s funny when Copilot tries to help...&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;screenshot of code showing GitHub Copilot trying to add a Great Horned Owl to a list of &amp;quot;common&amp;quot; birds&quot; src=&quot;https://beak-v2.onrender.com/assets/images/bird_call_bad_copilot-14cc8c713d15135c9f5f4cdd778bb66e.png&quot; width=&quot;624&quot; height=&quot;358&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
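&lt;p&gt;As a rough sketch of that filtering, assuming a hand-curated set of codes: the &lt;code&gt;COMMON_SPECIES_CODES&lt;/code&gt; set and the &lt;code&gt;filter_common_species&lt;/code&gt; helper below are hypothetical names, and the codes in the set are only an illustrative pick from the McGolrick output above, not the real list.&lt;/p&gt;

```python
# Hypothetical hand-curated set of species codes treated as "common" for the patch.
# The codes are taken from the McGolrick output above; the selection is illustrative only.
COMMON_SPECIES_CODES = {"eursta", "houspa", "rocpig", "amerob", "moudov"}


def filter_common_species(species_codes: set[str]) -> set[str]:
    """Return only the species codes that are not in the common set."""
    return species_codes - COMMON_SPECIES_CODES


# A few codes from the McGolrick run, mixing common and less common birds.
found = {"eursta", "yebcuc", "olsfly", "houspa", "scatan"}
interesting = filter_common_species(found)
print(sorted(interesting))  # prints ['olsfly', 'scatan', 'yebcuc']
```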
&lt;p&gt;Something interesting I noticed right away... someone reported a Yellow-billed Cuckoo this morning! Maybe that darn Blue Jay turned back... For McGolrick, both this and the Olive-sided Flycatcher are genuine patch rarities. We get 1-2 of these per year, so I&#39;d like to flag them as rare. Unfortunately, I think I&#39;d have to use the EBD to do that... but fortunately, I already have that loaded up and working, so it shouldn&#39;t be too difficult to add?&lt;/p&gt;
&lt;p&gt;Before embarking on that (and probably ending up writing a whole new blog post about it), I&#39;d like to push a bit further on proving out the main idea.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;human-hearable-reports&quot;&gt;Human hearable reports&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#human-hearable-reports&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Human hearable reports&quot; title=&quot;Direct link to Human hearable reports&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For this next part we need to get this data into the LLM and have it produce something more appealing to humans, so let&#39;s turn it into a form that an LLM like ChatGPT can use. One thought I&#39;m toying with is assigning each bird a &quot;tier&quot;, like so:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;rarities: genuinely rare birds that would show up on eBird&#39;s rarity reports&lt;/li&gt;
&lt;li&gt;patch_rarities: birds that are rare for the park (like the Cuckoo or Flycatcher today)&lt;/li&gt;
&lt;li&gt;patch_favorites: birds that aren&#39;t as rare, but that everyone loves to see when they show up (warblers, tanagers, etc.). For now I&#39;ll just let this be anything that isn&#39;t in one of the prior tiers and isn&#39;t a common bird.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So maybe I can define that in a dataclass like so:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;import&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; enum&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; dataclasses &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;import&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; dataclass&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;BirdRarityTier&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;enum&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# genuinely rare birds that would show up on eBird&#39;s rarity reports&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    RARITY &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token 
string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;rarity&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# birds that are rare for the park (like the Cuckoo or Flycatcher today)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    PATCH_RARITY &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;patch_rarity&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# birds that aren&#39;t as rare, but everyone loves to see when they show up (warblers, tanagers, etc.)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    PATCH_FAVORITE &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;patch_favorite&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; 
style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# the rest&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    COMMON &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;common&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot; style=&quot;color:#393A34&quot;&gt;@dataclass&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;PatchObservation&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    common_name&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token 
builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    date_last_seen&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    taxonomic_order&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    scientific_name&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    species_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    rarity_tier&lt;/span&gt;&lt;span 
class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; BirdRarityTier&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Then we can just convert our two API responses into this form. I can&#39;t actually distinguish patch_rarities just yet, but that&#39;ll come shortly. Here&#39;s a sample of this data for Jamaica Bay:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Osprey                      | 2025-09-06 10:11 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Gray Catbird                | 2025-09-06 10:11 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Northern Waterthrush        | 2025-09-06 10:11 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; 
style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| rarity         | Marbled Godwit              | 2025-08-31 09:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| rarity         | Caspian Tern                | 2025-08-31 09:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| rarity         | American White Pelican      | 2025-08-31 09:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+-----------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| rarity         | Purple Martin               | 2025-08-31 09:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;...&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;(Of course, I haven&#39;t set up common birds for Jamaica Bay, which is why an Osprey is there.)&lt;/p&gt;
&lt;p&gt;This is just about there! All that&#39;s left is a prompt to tell the LLM how to report this. I&#39;m in no way talented at writing these, but let&#39;s take a first stab:&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;You are a rare bird alert hotline that Birders will call into to hear about rare, notable or interesting birds at a Hotspot. The hotspot name and the list of birds will be provided to you in this input. Your job is to turn that list of birds into a brief summary of the birds at the hotspot today. You always want to focus on reporting `rarity` level birds first, then `patch_rarity`&#39;s, followed by `patch_favorites`. Prioritize birds that have been seen the most recently too. 
Don&#39;t report anything that hasn&#39;t been seen in the last 3 days.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Here&#39;s an example input:&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&amp;lt;!-- cSpell:disable --&amp;gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;\`\`\`&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Hotspot: McGolrick Park&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| Rarity         | Common Name               | Last Seen        |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token 
plain&quot;&gt;+================+===========================+==================+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Baltimore Oriole          | 2025-08-30 18:25 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Ovenbird                  | 2025-08-30 18:25 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Black-and-white Warbler   | 2025-09-01 07:00 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Chestnut-sided Warbler    | 2025-09-01 07:00 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Scarlet Tanager       
    | 2025-09-01 07:00 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Great Crested Flycatcher  | 2025-09-02 06:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Common Yellowthroat       | 2025-09-02 06:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Magnolia Warbler          | 2025-09-02 06:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Ruby-throated Hummingbird | 2025-09-03 06:30 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token 
plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Yellow Warbler            | 2025-09-06 07:55 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Northern Parula           | 2025-09-06 09:52 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | American Redstart         | 2025-09-06 10:37 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Northern Waterthrush      | 2025-09-06 11:08 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_favorite | Cape May Warbler      
    | 2025-09-06 11:08 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_rarity   | Olive-sided Flycatcher    | 2025-09-05 18:26 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;| patch_rarity   | Yellow-billed Cuckoo      | 2025-09-06 11:08 |&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;+----------------+---------------------------+------------------+&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;\`\`\`&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&amp;lt;!-- cSpell:enable --&amp;gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;And here&#39;s an example of how you would report this&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; 
style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&amp;gt; Thanks for calling McGolrick Park rare bird alert. A Yellow-Billed cuckoo was seen starting this morning. Yesterday, an Olive-sided flycatcher was seen. For the warbler lovers, Yellow Warblers, Redstarts, Northern Waterthrushes, Cape Mays and Northern Parulas are all being seen. Happy birding!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
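&lt;p&gt;As an aside, the ordering and cutoff rules in this prompt could also be enforced in plain Python before the table is sent, rather than trusting the model to apply them. Here is a rough sketch of that idea. The names &lt;code&gt;Sighting&lt;/code&gt;, &lt;code&gt;TIER_PRIORITY&lt;/code&gt; and &lt;code&gt;select_reportable&lt;/code&gt; are made up for illustration and are not code from this project, and the tier is assumed to be a plain string rather than the &lt;code&gt;BirdRarityTier&lt;/code&gt; type used above:&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Lower number = reported earlier. Labels mirror the tier column in the tables above.
TIER_PRIORITY = {"rarity": 0, "patch_rarity": 1, "patch_favorite": 2}


@dataclass
class Sighting:
    # Hypothetical, trimmed-down stand-in for the bird record in this post.
    rarity_tier: str
    common_name: str
    date_last_seen: str  # "YYYY-MM-DD HH:MM", as in the sample tables


def select_reportable(sightings, now, max_age_days=3):
    """Drop stale sightings, then order by tier and, within a tier, newest first."""
    cutoff = now - timedelta(days=max_age_days)
    fresh = []
    for s in sightings:
        seen = datetime.strptime(s.date_last_seen, "%Y-%m-%d %H:%M")
        if seen >= cutoff:
            fresh.append((seen, s))
    # Two stable sorts: newest first, then by tier (recency order survives within a tier).
    fresh.sort(key=lambda pair: pair[0], reverse=True)
    fresh.sort(key=lambda pair: TIER_PRIORITY[pair[1].rarity_tier])
    return [s for _, s in fresh]


# A few rows taken from the McGolrick Park example input.
rows = [
    Sighting("patch_favorite", "Baltimore Oriole", "2025-08-30 18:25"),
    Sighting("patch_favorite", "Cape May Warbler", "2025-09-06 11:08"),
    Sighting("patch_rarity", "Olive-sided Flycatcher", "2025-09-05 18:26"),
    Sighting("patch_rarity", "Yellow-billed Cuckoo", "2025-09-06 11:08"),
]
now = datetime(2025, 9, 6, 12, 0)  # assumed "current time" for the example
names = [s.common_name for s in select_reportable(rows, now)]
print(names)
```

&lt;p&gt;With a filter like this in front, the prompt would only need to describe tone and phrasing, since anything older than three days never reaches the model.&lt;/p&gt;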
&lt;p&gt;That&#39;s probably woefully insufficient, but let&#39;s try it! It&#39;s funny how I spend all day working with OpenAI&#39;s SDKs, yet I still failed to get this working the first time...&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Hey there! Looks like McGolrick Park is buzzing with bird activity right now! We’ve got some real gems spotted recently. Just today, you’ve had sightings of the Northern Parula, American Redstart, Northern Waterthrush, Cape May Warbler, and even the Yellow Warbler.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;And check this out—there’s a rare visitor: the Yellow-billed Cuckoo was spotted today, too! Plus, yesterday you had an Olive-sided Flycatcher—another rare one in the area. If you head over to the park now, you might still catch those patch favorites like the Ruby-throated Hummingbird, Scarlet Tanager, and Magnolia Warbler, all seen just in the last few days.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;It’s prime time for birding out there. 
Grab your binoculars and happy birdwatching!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Wow! McGolrick Park is a birding hotspot right now! Let me give you the highlights in a fun rundown.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;First off, some of the favorite feathered friends that have shown up recently include the Yellow Warbler, Northern Parula, and American Redstart—all seen today. These warblers are in peak migration mode, so now’s a fantastic time to spot them. Plus, you’ve got the Northern Waterthrush and Cape May Warbler sighted today too—talk about a warbler bonanza!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Then, looking at yesterday, you had the Common Grackle pop up, and earlier this week, the Magnolia Warbler and Great Crested Flycatcher were around, along with that dazzling Scarlet Tanager.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;But here’s where it gets really exciting: we’ve got a couple of rarities on the list! 
Just yesterday, an Olive-sided Flycatcher was seen, and today—drumroll please—a Yellow-billed Cuckoo made an appearance. That’s a rare bird for the area and a real treat if you can catch a glimpse.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;So, if you’re heading out to McGolrick Park, bring those binoculars and keep your eyes peeled. You’ve got a great mix of patch favorites and some special visitors right in the heart of migration season. Happy birding!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Alright, let’s break it down! McGolrick Park is absolutely alive with birds right now. Here’s the quick scoop.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;We’ve got some super fresh sightings today—September 6th. The Yellow Warbler showed up early in the morning at 7:55. Then, the Northern Parula swooped in around 9:52. Not long after, the American Redstart popped up at 10:37. And get this—just a bit later, at 11:08, someone spotted both a Northern Waterthrush and a Cape May Warbler. 
That’s a warbler party!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;And for the rare birds? Today’s jackpot: the Yellow-billed Cuckoo was seen at 11:08 as well. That’s a big one. Plus, just yesterday—September 5th—an Olive-sided Flycatcher was spotted around 6:26 in the evening. Those are some rare treats for the park!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;And so on... it was just running in a loop. And that&#39;s entirely my fault, because I copied the example for &quot;push to talk&quot;, including the &lt;code&gt;while True&lt;/code&gt; infinite loop 🤦‍♂️:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;openai &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; AsyncOpenAI&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; openai&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;realtime&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span 
class=&quot;token plain&quot;&gt;    model&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;gpt-realtime&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; connection&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; connection&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        session&lt;/span&gt;&lt;span class=&quot;token operator&quot; 
style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;output_modalities&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;model&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;gpt-realtime&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;realtime&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; connection&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;conversation&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            item&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;role&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;user&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;input_text&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; user_prompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;await&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; connection&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; event &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; connection&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; event&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;response.output_text.delta&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; flush&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; end&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;elif&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; event&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;response.output_text.done&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;elif&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; event&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;response.done&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;                &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;break&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Also, that&#39;s not really how we want this hotline to sound... oh right, I forgot to provide a system prompt! Adding that in and...&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Thanks for calling the McGolrick Park rare bird alert. The standout sighting today is a Yellow-billed Cuckoo, seen this morning. Also notable, an Olive-sided Flycatcher was spotted yesterday. Warbler enthusiasts will be thrilled: Northern Parula, American Redstart, Northern Waterthrush, Cape May Warbler, and Yellow Warbler have all been seen today. Other favorites recently spotted include a Common Grackle, Ruby-throated Hummingbird, and Veery. Happy birding!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Hey, that&#39;s pretty good! I have this just outputting text for now, so let&#39;s add audio too. Adding the audio output was simple: I just needed to add &lt;code&gt;audio&lt;/code&gt; to &lt;code&gt;output_modalities&lt;/code&gt; and also process the &lt;code&gt;output_audio&lt;/code&gt; events. Getting this into an audio file was slightly trickier but still pretty straightforward:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;save_pcm_audio_chunks&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;pcm_bytes_array&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Received &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation builtin&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;token string-interpolation interpolation&quot;&gt;pcm_bytes_array&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt; audio chunks&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# Open WAV file for writing&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; wave&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;bird_report_audio.wav&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;wb&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; wav_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# Set WAV parameters&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        wav_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;setnchannels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# mono&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        wav_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;setsampwidth&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# 2 bytes per sample (16-bit)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        wav_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;setframerate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;24000&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# sample rate&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;# Write PCM data&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        
wav_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;writeframes&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;b&quot;&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;pcm_bytes_array&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
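&lt;p&gt;For the collection side, here&#39;s a minimal sketch of how chunks like these could be gathered, assuming each audio event carries a base64-encoded chunk of mono 16-bit PCM in a &lt;code&gt;delta&lt;/code&gt; field (worth checking against the event docs). The helper names &lt;code&gt;decode_audio_deltas&lt;/code&gt; and &lt;code&gt;pcm_chunks_to_wav_bytes&lt;/code&gt; are made up for illustration, and the WAV is built in memory rather than written to disk:&lt;/p&gt;

```python
import base64
import io
import wave


def decode_audio_deltas(deltas):
    """Turn base64-encoded audio deltas into a list of raw PCM byte chunks."""
    return [base64.b64decode(delta) for delta in deltas]


def pcm_chunks_to_wav_bytes(pcm_chunks, sample_rate=24000):
    """Wrap mono 16-bit PCM chunks in a WAV container, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample (16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"".join(pcm_chunks))
    return buffer.getvalue()


# Stand-in for the deltas a stream might deliver: two chunks of 100 silent samples.
fake_deltas = [base64.b64encode(b"\x00\x00" * 100).decode("ascii") for _ in range(2)]
wav_bytes = pcm_chunks_to_wav_bytes(decode_audio_deltas(fake_deltas))
```

&lt;p&gt;Keeping the decode step separate from the WAV step means the same chunks could later be handed to something other than a file.&lt;/p&gt;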
&lt;p&gt;And...&lt;/p&gt;
&lt;audio controls=&quot;&quot;&gt;&lt;source src=&quot;/assets/medias/bird_report_audio_v1-1e28742961654d3847908952add6455b.wav&quot; type=&quot;audio/wav&quot;&gt;&lt;p&gt;Your browser does not support the audio element.&lt;/p&gt;&lt;/audio&gt;
&lt;p&gt;It works! Though why is our fairly obnoxious response back? One thing I noticed when playing around with the various voice options, though, is that they&#39;re heavily scripting out things like tone, affect and emotion. Life goals, honestly:&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Affect: Deep, commanding, and slightly dramatic, with an archaic and reverent quality that reflects the grandeur of Olde English storytelling.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I added some extra input like so:&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    Affect: deep, informed, wise.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    Tone: informative, terse and to the point.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    Emotion: dry, direct, concise.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    Pronunciation: articulate with a slight twang of a Southern US accent.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Unfortunately, this seemed to have no effect. I eventually realized that this seems to be a bug when providing both text &amp;amp; audio as output modalities. I mistakenly thought both were necessary to get a transcript... but it turns out I just need audio. Once I requested audio only, we had something much better to work with!&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Thanks for calling McGolrick Park rare bird alert. A Yellow-billed Cuckoo was seen late this morning. Yesterday, an Olive-sided Flycatcher was spotted. Today’s patch favorites include Northern Waterthrush, Cape May Warbler, American Redstart, Northern Parula, and Yellow Warbler. Happy birding.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This is possibly too formulaic, but let&#39;s go with it for now.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;getting-this-in-the-phones&quot;&gt;Getting this in the phones&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#getting-this-in-the-phones&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Getting this in the phones&quot; title=&quot;Direct link to Getting this in the phones&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Okay, we&#39;ve now got an audio file we can work with. All that&#39;s needed is for us to shove this into a phone call! Initially, I thought I might explore a solution where I just upload the file to something periodically. However, after doing a bit more research (like 15 seconds more), I think I&#39;ll just have Twilio read the audio data from an API. This is maybe slightly more work, but it also opens the door to having this data updated in &quot;real time&quot; whenever someone calls. Poking around, I really think what I want is whatever this &lt;a href=&quot;https://www.twilio.com/docs/voice/twiml&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;TwiML&lt;/a&gt; thing is... maybe this will even let me eventually set up some sort of dynamic phone tree?&lt;/p&gt;
&lt;p&gt;Seems like the main thing I need to do now is to set up a new API endpoint that returns their custom XML stuff:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;get_bird_call&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;location_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token 
builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; location_code &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;is&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        location_code &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; MC_GOLRICK_PARK_HOTSPOT_ID&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    response &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; VoiceResponse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;say&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;Hello you&#39;re talking to Bird!&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;And... that works:&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&quot;&amp;lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;UTF-8\&quot;?&amp;gt;&amp;lt;Response&amp;gt;&amp;lt;Say&amp;gt;Hello you&#39;re talking to Bird!&amp;lt;/Say&amp;gt;&amp;lt;/Response&amp;gt;&quot;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
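&lt;p&gt;To demystify the XML a bit: it&#39;s just a &lt;code&gt;Response&lt;/code&gt; root holding one element per instruction, so it can be reproduced with nothing but the standard library. Here&#39;s a rough sketch (not how the Twilio helper builds it internally) that also adds a &lt;code&gt;Play&lt;/code&gt; element pointing at an audio file. The &lt;code&gt;build_twiml&lt;/code&gt; helper and the example URL are hypothetical:&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_twiml(message, audio_url=None):
    """Build a TwiML document by hand: a Response root holding verb elements."""
    root = ET.Element("Response")
    # Say asks Twilio to read the text aloud using its own text-to-speech.
    ET.SubElement(root, "Say").text = message
    if audio_url is not None:
        # Play asks Twilio to fetch the audio file at this URL and play it.
        ET.SubElement(root, "Play").text = audio_url
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Hypothetical URL, used only for illustration.
twiml_with_audio = build_twiml(
    "Hello you're talking to Bird!",
    "https://example.com/bird_report_audio.wav",
)
print(twiml_with_audio)
```

&lt;p&gt;In practice the helper library is the nicer option, since it handles the XML declaration and escaping for you.&lt;/p&gt;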
&lt;p&gt;Now we just need to map that to the audio file itself. I&#39;m not sure exactly what the &lt;em&gt;best&lt;/em&gt; way to host a file like this is, but for now I can just push the file up to the server and then read it from there. In some ideal world I guess this would live on a service like S3 that&#39;s better suited for serving data. But this will do for now:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot; style=&quot;color:#393A34&quot;&gt;@Cloaca_App&lt;/span&gt;&lt;span class=&quot;token decorator annotation punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token decorator annotation punctuation&quot; style=&quot;color:#393A34&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;/v1/bird_calls/audio_file&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;async&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;get_bird_call_audio_file&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;location_code&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        audio_files_path &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; os&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;environ&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;AUDIO_FILES_PATH&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;except&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; KeyError&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;raise&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; RuntimeError&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;AUDIO_FILES_PATH environment variable is required. 
&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;Please set it to the path of your audio files.&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    file_path &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;audio_files_path&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;location_code&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;.wav&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;f&quot;Looking for audio file at &lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation&quot;&gt;file_path&lt;/span&gt;&lt;span class=&quot;token string-interpolation interpolation punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token string-interpolation string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; os&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;file_path&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; FileResponse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;file_path&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; media_type&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;audio/wav&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;Audio file not found&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;404&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
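&lt;p&gt;One thing the endpoint above leaves open is that &lt;code&gt;location_code&lt;/code&gt; is interpolated straight into the file path. As a rough sketch (not what the linked pull request does), the lookup could be pulled into a small helper that refuses paths outside the audio directory. The helper name &lt;code&gt;resolve_audio_file&lt;/code&gt; is hypothetical, and it only uses the standard library:&lt;/p&gt;

```python
from pathlib import Path
from typing import Optional


def resolve_audio_file(audio_files_path: str, location_code: str) -> Optional[Path]:
    """Return the .wav path for location_code, or None if it is missing or unsafe."""
    base = Path(audio_files_path).resolve()
    candidate = (base / f"{location_code}.wav").resolve()
    # Reject anything that resolves outside the audio directory,
    # for example a location_code such as "../../secrets".
    if base not in candidate.parents:
        return None
    # Only hand back paths that point at an existing regular file.
    return candidate if candidate.is_file() else None
```

&lt;p&gt;The endpoint could then serve the file when the helper yields a path and send a not-found response otherwise.&lt;/p&gt;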
&lt;p&gt;From here, the main work was just going through some extra steps in Twilio to set up a phone number. Then, I simply routed that phone number to an ngrok-forwarded URL to my local machine and... 🥁🥁🥁&lt;/p&gt;
&lt;p&gt;It works! I can&#39;t really show a great demo of this given you can&#39;t record the audio of a phone call in a screen recording on iOS, but it really does work!&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;polishing-this-up&quot;&gt;Polishing this up&lt;a href=&quot;https://beak-v2.onrender.com/blog/bird-calls#polishing-this-up&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Polishing this up&quot; title=&quot;Direct link to Polishing this up&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Alright, now on to the final step of getting this working for real. There are a few things we need to finish up. First, we need the alert to run periodically. Then, we need to make sure the file is getting persisted in the right place. Last, we need to wire this all up correctly in Twilio so it no longer depends on ngrok.&lt;/p&gt;
&lt;p&gt;One forgotten API key, one mis-formatted URL, one forgotten deploy and one long session debugging why Twilio didn&#39;t like my Content-Type later... and it&#39;s live! You can now call 9148637717 and get some local bird intel!&lt;/p&gt;
&lt;p&gt;This was quite fun and quick. It could&#39;ve been quicker if it were simpler to just pass audio straight back to Twilio. All in all, this was pretty straightforward and painless though.&lt;/p&gt;
&lt;p&gt;If you want to see the full details of this, I merged in the Pull Request for it here:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/birds-eye-app/cloaca/pull/8&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://github.com/birds-eye-app/cloaca/pull/8&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I think it&#39;d be fun in the future to possibly extend this further:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Certainly adding more supported hotspots. It&#39;d be nice to add a feature to type in or even just say the hotspot you want to see information for.&lt;/li&gt;
&lt;li&gt;I also need to actually add in the DuckDB information to more accurately show patch rarities.&lt;/li&gt;
&lt;li&gt;I&#39;d love to figure out a way to access who submitted the observation. These hotlines love giving credit to who found the bird and it&#39;d be cool to replicate that. I get why eBird might be a bit stingy on handing user details out though.&lt;/li&gt;
&lt;li&gt;Similarly, I wish I could access the species comments on these too. It&#39;d be very nice to include that so people could know where to look: &quot;A Scarlet Tanager is being seen above the playground&quot;.&lt;/li&gt;
&lt;li&gt;It might be cool to eventually support some more true real time voice stuff, enabling someone to talk over the phone?&lt;/li&gt;
&lt;/ul&gt;</content>

    <author>
      <name>David Meadows</name>

      <uri>https://dtmeadows.me</uri>

    </author>
  </entry>

  <entry>
    <title>Typing select queries in Python</title>
    <link href="https://blog.craigie.dev/typing-select/"/>
    <id>https://blog.craigie.dev/typing-select/</id>
    <updated>2022-02-24T08:30:00Z</updated>


    <content type="html">&lt;p&gt;The recently accepted PEP 646 introduced variadic generics! While the main motivation for this feature was to improve typing for numerical libraries, it also means that it is now possible to accurately type the results of an SQL select statement in Python!&lt;/p&gt;
&lt;p&gt;There are some limitations, however: we cannot return named properties and must use a tuple form instead.&lt;/p&gt;
&lt;p&gt;This post will walk through the process of typing and implementing the equivalent of an SQL select function. If you just want to see the final results, &lt;a href=&#39;#final-implementation&#39;&gt;click here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=getting-started&gt;Getting Started&lt;/h2&gt;&lt;p&gt;This post assumes that you are already familiar with generics in Python. If you are not, I suggest you read through this &lt;a href=&#39;https://decorator-factory.github.io/typing-tips/tutorials/generics/&#39;&gt;tutorial&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You will also need a recent version of &lt;a href=&#39;https://pypi.org/project/typing-extensions/&#39;&gt;typing-extensions&lt;/a&gt; and &lt;a href=&#39;https://github.com/RobertCraigie/pyright-python&#39;&gt;pyright&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=introduction-to-pep-646&gt;Introduction to PEP 646&lt;/h3&gt;&lt;p&gt;The acceptance of &lt;a href=&#39;https://www.python.org/dev/peps/pep-0646/&#39;&gt;PEP 646&lt;/a&gt; means we can now create generics with an arbitrary number of type variables instead of being limited to a pre-defined number!&lt;/p&gt;
&lt;p&gt;The new features that this introduces are &lt;code&gt;TypeVarTuple&lt;/code&gt; and &lt;code&gt;Unpack&lt;/code&gt;, which must be used in combination with each other to represent variadic generics.&lt;/p&gt;
&lt;p&gt;A not particularly useful example of this is inserting an integer at the start of a tuple (yes, tuples are immutable, but let&#39;s assume for this example that we would return a new tuple):&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing_extensions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;T&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert_int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;insert_int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# return type: Tuple[int, str, int]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
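&lt;p&gt;For illustration, here is a minimal runnable sketch of the same idea that actually builds the new tuple. The extra &lt;code&gt;value&lt;/code&gt; parameter and the &lt;code&gt;Ts&lt;/code&gt; name are assumptions added for this example, and the import fallback assumes either Python 3.11+ or an installed typing-extensions:&lt;/p&gt;

```python
from typing import Tuple

try:  # Python 3.11+ ships these in the standard library
    from typing import TypeVarTuple, Unpack
except ImportError:  # older versions rely on the typing-extensions backport
    from typing_extensions import TypeVarTuple, Unpack

Ts = TypeVarTuple("Ts")


def insert_int(seq: Tuple[Unpack[Ts]], value: int = 0) -> Tuple[int, Unpack[Ts]]:
    # Tuples are immutable, so build a new one with the integer in front.
    return (value, *seq)


result = insert_int(("a", 1), 42)
print(result)  # (42, 'a', 1)
```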
&lt;h3 id=minimal-example&gt;Minimal example&lt;/h3&gt;&lt;p&gt;The minimum code required to represent a select statement statically is actually fairly small.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing_extensions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;FieldT&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hashed_password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;@classmethod&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;cls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now running this code through &lt;a href=&#39;https://github.com/microsoft/pyright&#39;&gt;pyright&lt;/a&gt; will give us the following output:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ pyright typed_select.py
typed_select.py:20:13 - information: Type of &amp;quot;user&amp;quot; is &amp;quot;Tuple[int, str]&amp;quot;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Wow, this is exactly what we were looking for!&lt;/p&gt;
&lt;p&gt;Now let&#39;s try actually running this code and see what happens:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ python typed_select.py
Traceback (most recent call last):
  File &amp;quot;/Users/robert/code/craigie.dev/sources/typing-select/001.py&amp;quot;, line 19, in &amp;lt;module&amp;gt;
    user = User.select(User.id, User.name)
AttributeError: type object &amp;#39;User&amp;#39; has no attribute &amp;#39;id&amp;#39;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Hmmmm that is annoying.&lt;/p&gt;
&lt;p&gt;This is happening because we&#39;ve given the &lt;code&gt;User&lt;/code&gt; class type hints but haven&#39;t actually given these fields any values.&lt;/p&gt;
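&lt;p&gt;A small sketch makes the difference visible. The trimmed-down &lt;code&gt;User&lt;/code&gt; class here is only for illustration: a bare annotation is recorded in &lt;code&gt;__annotations__&lt;/code&gt; but does not create a class attribute, while an annotation with a value does:&lt;/p&gt;

```python
class User:
    id: int          # annotation only: no class attribute is created
    name: str = ""   # annotation plus value: a real class attribute


# Both names are recorded as annotations...
print(sorted(User.__annotations__))  # ['id', 'name']

# ...but only the one with a value exists on the class at runtime.
print(hasattr(User, "id"))    # False
print(hasattr(User, "name"))  # True
```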
&lt;p&gt;Let&#39;s assign some default values for the fields we are selecting:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing_extensions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;FieldT&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hashed_password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;@classmethod&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;cls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can now successfully run this code:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ python typed_select.py
Runtime type is &amp;#39;NoneType&amp;#39;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=implementation&gt;Implementation&lt;/h2&gt;&lt;p&gt;Now let&#39;s implement the &lt;code&gt;select()&lt;/code&gt; function so that we can actually use it at runtime.&lt;/p&gt;
&lt;p&gt;Let&#39;s start by defining a &lt;code&gt;Field&lt;/code&gt; class so that we don&#39;t have to assign weird defaults for each field and so we can store a reference to the field&#39;s name in the database.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;fm&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We now have to define a wrapper function over the &lt;code&gt;Field&lt;/code&gt; class so that type checkers don&#39;t complain that we&#39;re assigning a &lt;code&gt;Field&lt;/code&gt; instance to an attribute annotated as, for example, &lt;code&gt;str&lt;/code&gt;.
This works because we tell the type checker that we are returning &lt;code&gt;Any&lt;/code&gt;, which effectively disables type checking for the assignment.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
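&lt;p&gt;To make the effect of the &lt;code&gt;Any&lt;/code&gt; return type concrete, here is a small self-contained sketch. The &lt;code&gt;Example&lt;/code&gt; class and its &lt;code&gt;title&lt;/code&gt; attribute are hypothetical and only used for illustration; the point is that the attribute is declared as &lt;code&gt;str&lt;/code&gt; while the value stored on the class at runtime is a &lt;code&gt;Field&lt;/code&gt;.&lt;/p&gt;

```python
from typing import Any


class Field:
    name: str

    def __init__(self, name: str) -> None:
        self.name = name


def field(*, name: str) -> Any:
    # Any is compatible with every declared type, so the assignment below
    # is accepted even though a Field is not a str.
    return Field(name=name)


class Example:  # hypothetical model, for illustration only
    title: str = field(name='title')


attr = Example.title
print(type(attr).__name__)  # Field

# An isinstance() check recovers the Field at runtime.
if isinstance(attr, Field):
    print(attr.name)  # title
```

&lt;p&gt;This runtime/static mismatch is what should let a function receive &lt;code&gt;Example.title&lt;/code&gt; as a &lt;code&gt;str&lt;/code&gt; for typing purposes and still read the stored field name.&lt;/p&gt;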
&lt;p&gt;Now let&#39;s update our &lt;code&gt;User&lt;/code&gt; model to use the new &lt;code&gt;Field&lt;/code&gt; class through the &lt;code&gt;field()&lt;/code&gt; wrapper:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hashed_password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To make this easier to implement we&#39;ll define a list of records as our database instead of using a real database driver.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fake_db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Robert&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;robert@craigie.dev&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Tegan&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;tegan@craigie.dev&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;bar&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now all that we need to do is implement the &lt;code&gt;select()&lt;/code&gt; class method on &lt;code&gt;User&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;    &lt;span class=&quot;nd&quot;&gt;@classmethod&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;cls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;ne&quot;&gt;TypeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;s1&quot;&gt;&amp;#39;Expected all select arguments to be an instance of Field&amp;#39;&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# for simplicity&amp;#39;s sake we&amp;#39;ll just take the first record&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fake_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# we have to add a type: ignore comment here as the type checker&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# cannot understand the relationship between this expression&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# and the input.&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# type: ignore&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let&#39;s type check it and make sure we haven&#39;t made any mistakes.&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ pyright typed_select.py
typed_select.py:60:13 - information: Type of &amp;quot;user&amp;quot; is &amp;quot;Tuple[str, int]&amp;quot;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;All good! Now let&#39;s run it and see what we get!&lt;/p&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ python typed_select.py
Runtime type is &amp;#39;tuple&amp;#39;
(&amp;#39;Robert&amp;#39;, 1)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Amazing! We&#39;ve just managed to statically type and implement a select query!&lt;/p&gt;
&lt;p&gt;There are some potential improvements that could be made to this, for example, returning a custom object that provides some helper methods instead of a raw tuple.&lt;/p&gt;
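&lt;p&gt;As one possible shape for that improvement, here is a minimal sketch of a wrapper object. The &lt;code&gt;Row&lt;/code&gt; class and its &lt;code&gt;as_tuple()&lt;/code&gt; and &lt;code&gt;as_dict()&lt;/code&gt; helpers are hypothetical names, not part of the implementation in this post, and the import assumes either Python 3.11+ or an installed &lt;code&gt;typing_extensions&lt;/code&gt;.&lt;/p&gt;

```python
from typing import Any, Dict, Generic, Tuple

try:  # available in the standard library from Python 3.11
    from typing import TypeVarTuple, Unpack
except ImportError:  # fall back to the backport on older versions
    from typing_extensions import TypeVarTuple, Unpack

Ts = TypeVarTuple('Ts')


class Row(Generic[Unpack[Ts]]):
    """Hypothetical result object that keeps the per-field types."""

    def __init__(self, names: Tuple[str, ...], values: Tuple[Unpack[Ts]]) -> None:
        self._names = names
        self._values = values

    def as_tuple(self) -> Tuple[Unpack[Ts]]:
        # Same shape as the raw tuple returned by select().
        return self._values

    def as_dict(self) -> Dict[str, Any]:
        # Pairs each selected field name with its value.
        return dict(zip(self._names, self._values))


row = Row(('name', 'id'), ('Robert', 1))
print(row.as_tuple())  # ('Robert', 1)
print(row.as_dict())   # {'name': 'Robert', 'id': 1}
```

&lt;p&gt;Under this sketch, &lt;code&gt;select()&lt;/code&gt; would build a &lt;code&gt;Row&lt;/code&gt; from the field names and the selected values instead of returning the tuple directly.&lt;/p&gt;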
&lt;h2 id=final-implementation&gt;Final Implementation&lt;/h2&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typing_extensions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVarTuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;FieldT&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;fm&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;fake_db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Robert&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;robert@craigie.dev&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Tegan&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;tegan@craigie.dev&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;bar&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;name&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;email&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hashed_password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;hashed_password&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;@classmethod&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;cls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FieldT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;ne&quot;&gt;TypeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;s1&quot;&gt;&amp;#39;Expected all select arguments to be an instance of Field&amp;#39;&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# for simplicity&amp;#39;s sake we&amp;#39;ll just take the first record&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fake_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# we have to add a type: ignore comment here as the type checker&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# cannot understand the relationship between this expression&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# and the input.&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# type: ignore&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;User&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;reveal_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Revealed type is Tuple[str, int]&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# (&amp;#39;Robert&amp;#39;, 1)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;</content>

    <author>
      <name>Robert Craigie</name>

      <uri>https://craigie.dev</uri>

    </author>
  </entry>

  <entry>
    <title>Digging into species likelihood with eBird and DuckDB</title>
    <link href="https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3"/>
    <id>https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3</id>
    <updated>2025-08-27T00:00:00Z</updated>


    <content type="html">&lt;p&gt;In part 2 of this exploration we dug deep into how to answer the question &quot;where do all the birders go?&quot; We found many ways to make queries run very slowly on DuckDB but ended up hitting our initial goal of starting to use the EBD in production. Now, we&#39;re going to extend that further to see if we can show people how many birds might be likely in a given place and time.&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;best-practices-when-working-with-ebird-data&quot;&gt;Best practices when working with eBird data&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#best-practices-when-working-with-ebird-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Best practices when working with eBird data&quot; title=&quot;Direct link to Best practices when working with eBird data&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I&#39;ll admit from the start that this is far more of a daunting exercise than the last one. The analysis part of the hotspot popularity was pretty straightforward: count the checklists and group them by the hotspot and month (though, that didn&#39;t stop me from making plenty of mistakes). The challenge there was more on the technical side of getting queries to run quickly and efficiently. This part will be much more challenging on both fronts. For one, answering the question of &quot;which bird can be seen where&quot; with any sort of correctness is quite hard. Second, doing this in a way that will work in production is not going to be easy.&lt;/p&gt;
&lt;p&gt;One advantage that I do have this time around is the fact that the eBird team have spent a lot of time discussing and documenting how to go about this exact sort of analysis. I&#39;ll be borrowing heavily from their fantastic resource here on &lt;a href=&quot;https://ebird.github.io/ebird-best-practices/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;&lt;strong&gt;Best Practices for Using eBird Data&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A particular highlight I&#39;d recommend going over is the various biases or imbalances that creep in when using citizen science data like eBird. I&#39;ll summarize briefly here, but it&#39;s very readable if you want to &lt;a href=&quot;https://ebird.github.io/ebird-best-practices/ebird.html#sec-ebird-challenges&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;check it out directly&lt;/a&gt;. The underlying thing we&#39;re trying to measure is when and where a species of bird will be at any given point. We can&#39;t just know this, so we&#39;re forced to estimate this in some way. The data source we&#39;re using here is humans going out and writing down which birds they saw and how many. Many of them do this with a scientific mindset, but they&#39;re also humans with preferences and needs that influence when and where they do this. So we end up with lots of trends in the observations that are more human than avian. We humans prefer to bird on the weekends, during migration when there are plenty of birds, when the weather isn&#39;t horrendous and so on. And we tend to notice, well, more noticeable birds. A loud or beautiful bird song will get noticed better than an owl sleeping nearly invisibly at the top of a tree.&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;writing-a-query-for-species-likelihood&quot;&gt;Writing a query for species likelihood&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#writing-a-query-for-species-likelihood&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Writing a query for species likelihood&quot; title=&quot;Direct link to Writing a query for species likelihood&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Bringing this to the task at hand, let&#39;s start with a naive analysis of species likelihood. We&#39;ll count the number of checklists in a hotspot for a certain month. Then we&#39;ll count how many times a species appeared on a checklist in that place. Dividing those we should get a rough idea of &quot;what are the most likely birds to see then and there?&quot;&lt;/p&gt;
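&lt;p&gt;Before running anything against the real dataset, the arithmetic can be sketched on toy data. The checklist IDs and observations below are made up for illustration and stand in for a single hotspot and month; only the two counts and the division mirror the approach described above.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical (checklist id, species) observations for one hotspot and month.
observations = [
    ('S1', 'American Robin'),
    ('S1', 'Blue Jay'),
    ('S2', 'American Robin'),
    ('S3', 'American Robin'),
    ('S3', 'Barred Owl'),
    ('S4', 'Blue Jay'),
]

# Denominator: number of distinct checklists at the hotspot.
total_checklists = len({checklist for checklist, _ in observations})

# Numerator: number of distinct checklists each species appears on.
# set() drops duplicate (checklist, species) pairs before counting.
species_checklists = Counter(species for _, species in set(observations))

likelihood = {
    species: count / total_checklists
    for species, count in species_checklists.items()
}

for species, share in sorted(likelihood.items(), key=lambda item: -item[1]):
    print(f'{species}: {share:.2f}')
```

&lt;p&gt;With four checklists in total, the robin appears on three of them and gets a naive likelihood of 0.75. The real data needs the same two counts, computed per hotspot and month.&lt;/p&gt;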
&lt;p&gt;Here&#39;s a quick way I got to this data, with the regular caveat that I&#39;ve certainly made mistakes:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  total_checklists &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      extract &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span 
class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;          &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token function&quot; 
style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;SAMPLING EVENT IDENTIFIER&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; total_checklists&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      ebird_ny&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;L165143&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  species_checklists &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      extract &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;          &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION 
DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;COMMON NAME&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;SAMPLING EVENT IDENTIFIER&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; species_checklists&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      ebird_ny&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; 
style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; 
style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;L165143&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;COMMON NAME&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  species_checklists&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  total_checklists&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  species_checklists &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; total_checklists&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token 
keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  total_checklists&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; species_checklists &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;using&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  species_checklists &lt;/span&gt;&lt;span 
class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;desc&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;10&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There&#39;s probably a quicker way to write that too, but something like window functions gets a little tricky, since I don&#39;t want to just add up the checklist counts for all species. Instead I want the count of unique checklists. Anyway, here&#39;s what that returns:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;COMMON NAME&lt;/th&gt;&lt;th&gt;species_checklists&lt;/th&gt;&lt;th&gt;total_checklists&lt;/th&gt;&lt;th&gt;likelihood&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Laughing Gull&lt;/td&gt;&lt;td&gt;593&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.8863976083707026&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Mute Swan&lt;/td&gt;&lt;td&gt;548&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.8191330343796711&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Great Egret&lt;/td&gt;&lt;td&gt;539&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.8056801195814649&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Osprey&lt;/td&gt;&lt;td&gt;533&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.796711509715994&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Double-crested Cormorant&lt;/td&gt;&lt;td&gt;532&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7952167414050823&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Gray Catbird&lt;/td&gt;&lt;td&gt;527&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7877428998505231&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Canada Goose&lt;/td&gt;&lt;td&gt;519&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7757847533632287&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Mallard&lt;/td&gt;&lt;td&gt;516&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7713004484304933&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Northern Mockingbird&lt;/td&gt;&lt;td&gt;509&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7608370702541106&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Snowy Egret&lt;/td&gt;&lt;td&gt;506&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.7563527653213752&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;(If the numbers seem high, that&#39;s because the totals span roughly the past 5 years of data.)&lt;/p&gt;
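&lt;p&gt;One portability note on the likelihood column: the output above shows that the engine used here divides two integer counts as floats. On engines where &lt;code&gt;/&lt;/code&gt; between integers truncates (PostgreSQL and SQLite, for example), the same expression would come back as 0. A minimal sketch of a more portable form, using literal values copied from the Laughing Gull row; the column aliases are made up for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- literal counts copied from the Laughing Gull row above
select
  593 / 669 as engine_default,   -- truncates to 0 where / is integer division
  593 * 1.0 / 669 as likelihood  -- multiplying by 1.0 forces non-integer division
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the full query that would mean writing &lt;code&gt;species_checklists * 1.0 / total_checklists&lt;/code&gt;.&lt;/p&gt;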
&lt;p&gt;Sorted the other way (least popular first) we have these:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;COMMON NAME&lt;/th&gt;&lt;th&gt;species_checklists&lt;/th&gt;&lt;th&gt;total_checklists&lt;/th&gt;&lt;th&gt;likelihood&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Whimbrel&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Monk Parakeet&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Calidris sp.&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Wilson&#39;s Warbler&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Piping Plover&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Hooded Warbler&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;White-faced Ibis&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Black-throated Green Warbler&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dickcissel&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Tricolored Heron&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;669&lt;/td&gt;&lt;td&gt;0.0014947683109118087&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;This matches my gut-level instincts of what might show up at Jamaica Bay in August. But there are some additional steps we should consider.&lt;/p&gt;
&lt;p&gt;The main one is that we can&#39;t naively assume that most people were trying to do complete checklists. Put another way, if someone birded for 2 minutes in Jamaica Bay and saw 3 species, a Laughing Gull, a House Sparrow and an Osprey, we should not assume that a Mallard wasn&#39;t in Jamaica Bay that day. The eBird docs go much further into what to think about here, but being able to turn checklist counts into a likelihood, in the right way, is one of the most valuable aspects of eBird and why complete checklists are so valuable to science!&lt;/p&gt;
&lt;p&gt;Oh and also we probably shouldn&#39;t factor in &lt;code&gt;Calidris sp.&lt;/code&gt; type reports either. So let&#39;s apply a few recommended filters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;we&#39;ll only include rows whose category is &lt;code&gt;species&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;no incidental checklists (only the Stationary and Traveling protocols)&lt;/li&gt;
&lt;li&gt;keep the effort distance under 10 km&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Last, I&#39;ll throw in a &lt;code&gt;distinct&lt;/code&gt; at the top for 2 reasons. First, this should condense &quot;shared&quot; checklists down to only 1 report per species. Second, I &lt;em&gt;think&lt;/em&gt; this will also collapse sub-species into the regular species name. Both of these are definitely areas to follow up on, though I hope it&#39;ll be good enough for now.&lt;/p&gt;
&lt;p&gt;I&#39;ll just define those in a new &lt;code&gt;input_data&lt;/code&gt; expression to make it easier to keep things consistent:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- distinct is here to exclude duplicate (shared) checklists&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    extract &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&quot;SAMPLING EVENT IDENTIFIER&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; checklist_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;COMMON NAME&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; common_name&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ebird_ny&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; CATEGORY &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;species&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;PROTOCOL TYPE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; 
&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;Stationary&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;Traveling&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;EFFORT DISTANCE KM&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;10&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Once we have that, we can now run this across all the hotspots for August (I&#39;m using a narrow New York only dataset to make this easier). This way we can see something like &quot;which hotspot has the highest number of species likely to be found at it?&quot; And we get something surprising:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;month&lt;/th&gt;&lt;th&gt;LOCALITY&lt;/th&gt;&lt;th&gt;locality_id&lt;/th&gt;&lt;th&gt;common_species&lt;/th&gt;&lt;th&gt;uncommon_species&lt;/th&gt;&lt;th&gt;total_checklists&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;St. Lawrence River Dike Trail at Richard&#39;s Landing&lt;/td&gt;&lt;td&gt;L1810696&lt;/td&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Freshkills Park--North Park (Phase l)&lt;/td&gt;&lt;td&gt;L27548440&lt;/td&gt;&lt;td&gt;96&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Ulster Park&lt;/td&gt;&lt;td&gt;L16301078&lt;/td&gt;&lt;td&gt;95&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;34&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Jamaica Bay Wildlife Refuge&lt;/td&gt;&lt;td&gt;L165143&lt;/td&gt;&lt;td&gt;94&lt;/td&gt;&lt;td&gt;40&lt;/td&gt;&lt;td&gt;619&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Montezuma NWR--Towpath Rd.&lt;/td&gt;&lt;td&gt;L266832&lt;/td&gt;&lt;td&gt;93&lt;/td&gt;&lt;td&gt;54&lt;/td&gt;&lt;td&gt;70&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Montezuma NWR--Knox-Marsellus and Puddler Marshes&lt;/td&gt;&lt;td&gt;L679571&lt;/td&gt;&lt;td&gt;92&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;&lt;td&gt;44&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Lower La Chute River &amp;amp; Ticonderoga Marsh&lt;/td&gt;&lt;td&gt;L4910422&lt;/td&gt;&lt;td&gt;91&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;11&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Spring Farm Nature Sanctuary&lt;/td&gt;&lt;td&gt;L6624443&lt;/td&gt;&lt;td&gt;89&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;River Rd. 
marshes and RR tracks&lt;/td&gt;&lt;td&gt;L2457005&lt;/td&gt;&lt;td&gt;89&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;Columbia University, Nevis Laboratories Environs&lt;/td&gt;&lt;td&gt;L27674356&lt;/td&gt;&lt;td&gt;89&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The top hotspot has only 12 checklists. I haven&#39;t checked those checklists myself to see if they&#39;re valid, but nothing says there&#39;s anything wrong with a result like this. There are plenty of very rich hotspots out there that don&#39;t get visited that often. Still, it seems less than ideal to rank this spot as richer than something like Jamaica Bay, which not only has a similar count of species but also has 619 checklists behind it, so we&#39;re far more certain of the numbers you can see there. We&#39;re gonna need... statistics for this.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;error-rate--confidence&quot;&gt;Error rate / confidence&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#error-rate--confidence&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Error rate / confidence&quot; title=&quot;Direct link to Error rate / confidence&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;What I think I really want here is not necessarily a way to alter the number of species we show for lower-frequency hotspots. Instead, we really just want to return additional data alongside it that shows how confident we are in the data. It&#39;s been a long while since I was in a stats class and unfortunately I&#39;m drawing a bit of a blank on how best to model data like this. What&#39;s confusing to me is that we&#39;re not really dealing with &quot;random&quot; data per se. A birder seeing 90 species at a location over 3 hours is a good sign that there&#39;s going to be a lot of species at that location. But it&#39;s not obvious to me how to rate that finding if 100 other birders hit that level, or if no one else does. It would feel like a mistake to throw that observation out, or even to water it down with 10 other quicker trips if they only spent a few minutes and recorded 10 species.&lt;/p&gt;
&lt;p&gt;I think what I&#39;ll do for now is do a very rough &quot;error&quot; calculation (rough cause I&#39;m truly not sure if it&#39;s right), and return that to my map alongside the unmodified count of &quot;common&quot; species. Here&#39;s how I calculate that:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;common_species&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; total_checklists&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
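&lt;p&gt;To get a feel for what this rough metric does, here&#39;s a small sketch that plugs two rows from the August table above into the same expression. The &lt;code&gt;rough_error&lt;/code&gt; helper name is made up for illustration, and &lt;code&gt;awk&lt;/code&gt; is only standing in for the SQL; this is the calculation as written, not a proper statistical error estimate.&lt;/p&gt;

```shell
# Hypothetical helper: evaluates sqrt(common_species) / total_checklists,
# the same rough "error" expression used in the SQL above.
rough_error() {
  awk -v species="$1" -v checklists="$2" \
    'BEGIN { printf "%.4f\n", sqrt(species) / checklists }'
}

# St. Lawrence River Dike Trail: 100 common species over 12 checklists
rough_error 100 12    # prints 0.8333
# Jamaica Bay Wildlife Refuge: 94 common species over 619 checklists
rough_error 94 619    # prints 0.0157
```

&lt;p&gt;The sparsely visited hotspot ends up with a value roughly 50 times larger than Jamaica Bay&#39;s, which at least matches the intuition that its count deserves less trust.&lt;/p&gt;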
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;getting-this-to-the-front-end&quot;&gt;Getting this to the front end&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#getting-this-to-the-front-end&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Getting this to the front end&quot; title=&quot;Direct link to Getting this to the front end&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Now, I had planned originally on making another table for this data, but I think with the learnings from last time, it might make sense to just throw this data into our existing popular hotspots table. It will definitely increase the table size, but I won&#39;t have to, hopefully, worry too much about performance with a new &lt;code&gt;JOIN&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But, let&#39;s just see how this looks now. I haven&#39;t talked much about the frontend side of displaying things in this series, and that&#39;s mostly cause I haven&#39;t thought too much about it. It&#39;s mostly been me (Claude) copy and pasting the existing map code I have here. I don&#39;t want to do this forever, but it&#39;s been &quot;good enough&quot; for now. So let&#39;s keep doing that.&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;image showing a pin on Floyd Bennett Field with the number 103 on it and a tooltip on that showing &amp;quot;53 species&amp;quot;&quot; src=&quot;https://beak-v2.onrender.com/assets/images/floyd_bennet_bug-92893f94f1c52b76c96d23e0f84496ae.png&quot; width=&quot;742&quot; height=&quot;580&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
&lt;p&gt;So with a decent prompt, we&#39;re up and running... except these numbers are a bit off. Why is Floyd Bennett Field showing 109 on the map but 53 in the tooltip?&lt;/p&gt;
&lt;p&gt;Looking at the DB for that location and month:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┬─────────────────────┬───────────────────────────────────┬────────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ locality_id │    locality_name    │ avg_weekly_number_of_observations │ common_species │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│   varchar   │       varchar       │              double               │     int64      │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;├─────────────┼─────────────────────┼───────────────────────────────────┼────────────────┤&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             53 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             93 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ 
Floyd Bennett Field │                 7.869565217391305 │             70 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             55 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             63 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             92 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             71 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             85 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             62 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │            109 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             
67 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L152773     │ Floyd Bennett Field │                 7.869565217391305 │             59 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;├─────────────┴─────────────────────┴───────────────────────────────────┴────────────────┤&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ 12 rows                                                                      4 columns │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└────────────────────────────────────────────────────────────────────────────────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;We should not have 12 rows for this. Oh, the ole accidental 1:many &lt;code&gt;JOIN&lt;/code&gt; issue. I&#39;m creating my final table like so:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;SELECT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  localities l&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;JOIN&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity hp &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;USING&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id_int&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;LEFT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;JOIN&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspots_richness hr &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;using&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id_int&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The join on the ID is fine, but I forgot I need to make sure the same month is used for popularity and richness. This ought to fix it:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;LEFT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;JOIN&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspots_richness hr &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;using&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id_int&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Zooming out a bit, things are looking pretty good overall. It&#39;s hard to say how accurate this is... I&#39;m learning from just looking around on the map which hotspots really shine at which time of year. I&#39;m also struggling a bit to figure out how best to show an error bar or confidence interval at this level. Should I fade out lower-confidence hotspots? Should I make the edges blurrier? Should I de-saturate them?&lt;/p&gt;
&lt;p&gt;For now I think I&#39;m going to focus on the more core problem of making sure this data is accurate. If I quickly compare some hotspots, it seems my methodology is vastly under-counting compared to eBird. It&#39;s been fairly hard for me to figure out how eBird and Merlin calculate the numbers here. I&#39;ve seen that for a species to not be a yellow dot, it has to appear on 6% or more of checklists. And that&#39;s actually what I used initially in my data. However, if I look at the species listed under &quot;Likely&quot; for a hotspot in eBird mobile, they include these yellow dots. So maybe I should include them in mine? Or really, what I want to do is include both?&lt;/p&gt;
&lt;p&gt;Adding those numbers in brings my numbers much closer to eBird&#39;s. I think it also makes sense to display both in the tooltip. Maybe I can even add an option shortly to toggle the map between displaying just common birds and displaying both common and uncommon?&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;image showing multiple pins around South Brooklyn with numbers on them&quot; src=&quot;https://beak-v2.onrender.com/assets/images/richness_working-b3729913a60dd672cbad4786841e6e21.png&quot; width=&quot;2854&quot; height=&quot;1126&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;scaling-this-up&quot;&gt;Scaling this up&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#scaling-this-up&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Scaling this up&quot; title=&quot;Direct link to Scaling this up&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Before going too much further though I want to stop and check in on the performance of this. I&#39;m a little wary of adding too much more data if I can&#39;t actually deploy it. I do most of my development off of a more limited NY-only dataset. This is plenty for testing and poking around, but I still need to run the full thing to make sure I&#39;m not gonna drain the bank on storage fees.&lt;/p&gt;
&lt;p&gt;The first run of this... wasn&#39;t promising. I very quickly drained my computer&#39;s memory and the spillover disk storage. I hadn&#39;t taken a cleanup or refactoring pass through the likelihood query, so I did a quick look through to remove some extraneous columns and CTEs. One quick snack break later... and no luck. Still running out of space. Off to the DuckDB docs again!&lt;/p&gt;
&lt;p&gt;The docs were helpful as always, but there weren&#39;t too many tricks I hadn&#39;t already tried. The job is already getting as much memory and disk space as I can provide, and I&#39;m not doing anything woefully incorrect with my setup from what I can tell. A few things did stick out to me: when I initially set up the full dataset, I didn&#39;t do anything to help optimize it by using the correct types, sorting by timestamps, or filtering out data or columns I&#39;m not going to use. That&#39;s probably long overdue here, so I set about trying to define a more intermediate table to work off of. Unfortunately, since I was stuck with the already too-large dataset on my hard drive, I didn&#39;t have enough room to spin off a new one, pruned though it might be. Thankfully, I already needed to regenerate the underlying dataset with the latest data from July, so we might as well take that on and clean up and format the data a bit more as we go.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;a-long-side-quest-on-better-data-import&quot;&gt;A long side quest on better data import&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#a-long-side-quest-on-better-data-import&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to A long side quest on better data import&quot; title=&quot;Direct link to A long side quest on better data import&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Back in Part 1, I was just trying to get to the point of being able to query the EBD in a DuckDB dataset. It was a slow but fun journey, but I&#39;d only done this once. Thus, my methods were sloppy at best, and I knew that the next time I did this I should take a few passes at cleaning up what I was doing and making it a bit more efficient. This latter part really piqued my interest this time around, almost certainly to a negative degree. I had the external disk space to just do this in a slow but sure way, converting the compressed form into an uncompressed TSV and then into a DuckDB database. Still, that felt quite wasteful and slow (in reality it would just mean firing off two scripts before going to bed one night and then waking up to a fresh dataset ready for use). So, like many times before, I set off to spend a bunch of mostly unnecessary time speeding this up and making it more efficient.&lt;/p&gt;
&lt;p&gt;The core of what I didn&#39;t like about the old flow was the necessity of reading through the data 2 or even 3 times to get to a working DuckDB:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Once to un-tar the files (I&#39;m not entirely positive if this requires reading all of the underlying bytes)&lt;/li&gt;
&lt;li&gt;To decompress the compressed TSV file inside of that TAR&lt;/li&gt;
&lt;li&gt;To load that TSV file into DuckDB&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;What would be great is if we could go from the .tar file to the full DB in one step. Theoretically I knew this was possible but I wasn&#39;t sure how feasible it would be with the existing tools I had. Something that gave me early confidence was the fact that DuckDB can read directly from &lt;code&gt;stdin&lt;/code&gt; when invoked from the command line like so:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;cat ebird.tsv | duckdb -c &quot;SELECT * FROM read_csv(&#39;/dev/stdin&#39;)&quot;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Neat! All that&#39;s really needed then is to selectively read and decompress the .gz file inside the .tar and pass that into DuckDB. Overall this was pretty straightforward and it wasn&#39;t long before I had a simple working solution. One aside here is that I did all of this myself in bash as a first pass, but then opted to let Claude Code take a pass at polishing the script to add more logging, a progress bar using &lt;code&gt;pv&lt;/code&gt;, and good timing tracking. Claude Code really seems at its best in moments like this, where I have a very clear overall direction for what I want to do but can rely on it to introduce extra details like args or logging that would be finicky or frustrating for me to get just right with my intermediate bash knowledge. Here&#39;s how it looked once I had it up and running for real:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt; % ./queries/parse_ebd.sh ebd_relJul-2025 ./dbs/ ./dbs/&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Starting eBird data processing at Sat Aug 30 09:28:23 EDT 2025&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Processing: ebd_relJul-2025&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Output database: ./dbs/ebd_relJul-2025.db&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[09:28:23] Removing existing database file...&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[09:28:23] Database file removed&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[09:28:23] Starting data extraction and processing...&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  - Extracting ebd_relJul-2025.txt.gz from tar 
archive&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  - Archive size: 198.9G&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  - Compressed .gz file size: 198.9G&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;23.1GiB 0:11:20 [34.5MiB/s] [==&amp;gt;                 ]  11% ETA 1:25:42&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I&#39;m pretty happy with how this script came together! It&#39;s much more efficient with disk space and quicker than doing this in 3 steps. It takes about 60-90 minutes to complete which is the difference between leaving it for a night vs leaving it for a run or a long walk. &lt;a href=&quot;https://github.com/birds-eye-app/cloaca/blob/0ef1cd87a34da83de53bf5c52dda8120e1fbf53d/src/cloaca/swan_lake/scripts/parse_ebd.sh&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Here&#39;s the full version if you care to look.&lt;/a&gt;&lt;/p&gt;
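&lt;p&gt;For anyone who doesn&#39;t want to open the full script, the core of the single-pass idea can be sketched as one pipeline. This is a minimal sketch rather than the actual &lt;code&gt;parse_ebd.sh&lt;/code&gt;: the &lt;code&gt;stream_tar_member&lt;/code&gt; function name and the file names in the usage comment are placeholders, and it assumes a &lt;code&gt;tar&lt;/code&gt; that supports &lt;code&gt;-O&lt;/code&gt; (write the extracted member to stdout), which GNU tar and bsdtar both do.&lt;/p&gt;

```shell
# Hypothetical helper: stream one gzipped member out of a tar archive,
# decompress it on the fly, and hand the rows to a consumer on stdin.
# Nothing is written to disk between the archive and the consumer.
#   $1 = tar archive, $2 = member name inside it, rest = consumer command
stream_tar_member() {
  archive="$1"
  member="$2"
  shift 2
  tar -xOf "$archive" "$member" | gzip -dc | "$@"
}

# Example (placeholder names; assumes the duckdb CLI is on PATH):
#   stream_tar_member ebd.tar ebd.txt.gz \
#     duckdb ebd.db -c "CREATE TABLE ebd AS SELECT * FROM read_csv('/dev/stdin')"
```

&lt;p&gt;Passing the consumer as trailing arguments keeps the sketch easy to try with something cheap like &lt;code&gt;wc -l&lt;/code&gt; before pointing it at DuckDB.&lt;/p&gt;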
&lt;p&gt;It wasn&#39;t all success, though. One important goal I had for this refactor was to introduce better ordering at this stage. DuckDB really emphasizes in their docs and blogs the importance of table ordering for good performance, especially in this great article: &lt;a href=&quot;https://duckdb.org/2025/05/14/sorting-for-fast-selective-queries.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://duckdb.org/2025/05/14/sorting-for-fast-selective-queries.html&lt;/a&gt;. Naively, I thought I could just throw in an &lt;code&gt;order by observation_date&lt;/code&gt; at the end of my new script and get this for free. However, this just resulted in all of the free disk space on my machine getting eaten up at an alarmingly fast pace. I think what was going on here is that DuckDB had to basically read in the entire contents of the DB into either memory or spillover disk space to determine the right order before it could begin to start committing anything into the more space-efficient DB.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;fresh-data-but-no-progress&quot;&gt;Fresh data but no progress&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#fresh-data-but-no-progress&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Fresh data but no progress&quot; title=&quot;Direct link to Fresh data but no progress&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;So, while it was good to have a much better import script and the latest EBD data, I still didn&#39;t feel like I was getting any closer to being able to generate the full data I wanted here if I couldn&#39;t actually set up ordering. Now the challenge was entirely about how to get this data ordered in a reasonable way. I searched around for ideas about how to order this in place, but didn&#39;t find anything mentioning this as a feature of DuckDB. I couldn&#39;t write a new query to &quot;create a new table with everything in the old table but order it&quot; since I&#39;d just run out of disk space again.&lt;/p&gt;
&lt;p&gt;Thankfully, while on a walk a rather simple solution occurred to me. Just do this in batches. I can easily read a &quot;chunk&quot; of the old dataset, push it into a new DB and order it there. Since I&#39;m only working with a portion of the data at a time then I shouldn&#39;t eat up all the available disk space, just what&#39;s necessary to sort the portion of data I chose. And, if disk space gets precious, I can actually just delete those rows from the old DB once I insert them so I&#39;m not actually net adding more data to my machine.&lt;/p&gt;
&lt;p&gt;I thought for a few minutes on the right elegant solution to reliably and repeatedly split my data up before opting to just do it by the month of the observation. In hindsight I think this might&#39;ve been really silly though... I want to be sorting by observation date but if I&#39;m doing this across months then I think I&#39;m probably having to read across wildly different chunks or pages of the data? In any case, the logic of this was pretty straightforward.&lt;/p&gt;
&lt;p&gt;First, create the new DB copying the schema over from the old DB:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;CREATE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ebd_sorted &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;AS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  ebd_unsorted&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;ebd_full&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;WITH&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token 
plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;NO&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;DATA&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Then, just iterate through the months of the year and push them into the new DB:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;months_of_year=(1 2 3 4 5 6 7 8 9 10 11 12)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;for MONTH in &quot;${months_of_year[@]}&quot;; do&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  duckdb path_to_output_db &amp;lt;&amp;lt;EOF&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    INSERT INTO&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      ebd_sorted&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    FROM&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      ebd_unsorted.ebd_full&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    where&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      extract(month from observation_date) = $MONTH&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ORDER BY&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      observation_date;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;EOF&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;done&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Thankfully, when used like this, DuckDB will just compute the progress percentage for you:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[07:58:42] Processing month: 1&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;100% ▕████████████████████████████████████████████████████████████▏&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[08:05:58] Processing month: 1 completed&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Total execution time: 7m 16s&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[08:05:58] Processing month: 2&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;10% ▕██████&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Un-thankfully, it&#39;s entirely inaccurate when you&#39;re adding sorting at the end. The first time I ran this I crashed my machine since I had very little storage left. Deleting a couple dozen old or broken DuckDB files fixed this, and the second time this ran to completion in about 90 mins. That&#39;s honestly a bit surprisingly slow, though a big reason for that could be my quite silly chunking strategy. Another confusing thing is that the insertions get faster rather than slower in later batches. You can see above that the first insertion took 7+ minutes, while the last month took less than 5. I&#39;m not entirely sure why this might be. I would have assumed DuckDB would have to do more work at the end to figure out the right insertion order?&lt;/p&gt;
&lt;p&gt;In any case this worked, but leaves a lot to be improved upon in future iterations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;picking a better chunking strategy&lt;/li&gt;
&lt;li&gt;figuring out how the raw TSV is sorted if at all. If it is it&#39;d be great to use that for my chunking strategy&lt;/li&gt;
&lt;li&gt;the dream: could I figure out a way to merge this step in with the prior one so that I&#39;m iteratively sorting the data as I read it in from the TSV?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/birds-eye-app/cloaca/blob/0ef1cd87a34da83de53bf5c52dda8120e1fbf53d/src/cloaca/swan_lake/scripts/sort_edb_in_place_sort_of.sh&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Here&#39;s the script if you care to take a look!&lt;/a&gt;&lt;/p&gt;
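&lt;p&gt;For the first bullet in the list above, one possible direction, sketched under assumptions and not run against the EBD: batch by calendar year instead of month-of-year. Each batch then covers a contiguous date range, so batches appended oldest-first should leave &lt;code&gt;ebd_sorted&lt;/code&gt; ordered by &lt;code&gt;observation_date&lt;/code&gt; across the whole table rather than only within each batch. The year boundaries below are illustrative.&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- one batch per calendar year; the year boundaries here are illustrative&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;INSERT INTO ebd_sorted&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;SELECT *&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;FROM ebd_unsorted.ebd_full&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;WHERE observation_date &amp;gt;= DATE &#39;2020-01-01&#39;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  AND observation_date &amp;lt; DATE &#39;2021-01-01&#39;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;ORDER BY observation_date;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Two caveats with this sketch: batches would be uneven, since recent years presumably hold far more observations than early ones, and any row with a NULL &lt;code&gt;observation_date&lt;/code&gt; would match no batch and need separate handling.&lt;/p&gt;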
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;back-to-the-real-purpose&quot;&gt;Back to the real purpose&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#back-to-the-real-purpose&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Back to the real purpose&quot; title=&quot;Direct link to Back to the real purpose&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The prior few sections covered a few days of intermittent work, so I was a little fearful that all of that might have been a pointless adventure into premature optimization that went nowhere in terms of fixing the actual problem: being able to show species likelihood across the full data set.&lt;/p&gt;
&lt;p&gt;Thankfully, the sorting did appear to pay off and the first run with the full data set was successful! This species richness table creation is still painfully slow, but it finishes:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Creating localities table with spatial columns... ✅ Created in 13.1s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Creating taxonomy table... ✅ Created in 0.1s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Creating hotspot popularity table... ✅ Created in 61.0s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Creating hotspots richness table... ✅ Created in 525.9s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Creating localities_hotspots optimization table... ✅ Created in 7.2s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  Dropping hotspots richness table... ✅ Dropped in 0.2s!&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Nearly 10 minutes to create that one table, but that&#39;s a small price to pay to be able to see this many hotspots on a map at once!&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;image with hundreds if not thousands of pins on the Eastern US with numbers on them&quot; src=&quot;https://beak-v2.onrender.com/assets/images/all_the_richness_pins-4cd07560d3633b2058d7df7a61812685.png&quot; width=&quot;2870&quot; height=&quot;1916&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;conclusion&quot;&gt;Conclusion&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-3#conclusion&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Conclusion&quot; title=&quot;Direct link to Conclusion&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I really need to take a pass now at cleaning things up on the display side of this. I also should absolutely be limiting this query... each one of these runs returns 20MB of data. But, I think this is a nice place to wrap up this post. I&#39;m not entirely certain where I&#39;ll go from here next. There are more than a few places to explore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I&#39;d love to be able to show more information to users about not just the number of species but characteristics about them.&lt;/li&gt;
&lt;li&gt;I&#39;d love to be able to use this data to now compute &quot;where can I see the most lifers&quot;? I&#39;m already showing that on another page, but it&#39;s using the eBird API and is woefully slow to run. This is going to be hard though because I can&#39;t just aggregate the species count for efficiency&#39;s sake. I&#39;ll need to report back the actual list of species for the hotspot, which would add a whole lot of data.&lt;/li&gt;
&lt;li&gt;My hope here isn&#39;t to have a bunch of different maps with disparate information on them. Instead, I&#39;d like to support a single view for &quot;Where should I go?&quot; that takes into account richness, popularity, lifers, and various other data points on a hotspot.&lt;/li&gt;
&lt;li&gt;Most of all, I really need to take a pass on polishing things end to end so I can actually share this with some birding friends and have them use it!&lt;/li&gt;
&lt;/ul&gt;</content>

    <author>
      <name>David Meadows</name>

      <uri>https://dtmeadows.me</uri>

    </author>
  </entry>

  <entry>
    <title>Continuing with DuckDB and eBird</title>
    <link href="https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2"/>
    <id>https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2</id>
    <updated>2025-08-25T00:00:00Z</updated>


    <content type="html">&lt;p&gt;In the last part we looked at the &quot;basics&quot; of getting DuckDB set up with the full eBird dataset. It was slow, it used a lot of data, but we finally got to answer the real questions, like &quot;Who&#39;s seen the most ducks?&quot; Now, it&#39;s on to slightly more practical explorations. The goal in this post is to go from a large, still relatively unwieldy dataset, to something we can actually deploy and host on the personal site.&lt;/p&gt;
&lt;p&gt;I knew from the get-go we were going to have to get very creative in how we approached pruning down the dataset here. There are two primary reasons I knew we&#39;d have to do this. One: there was no way I was going to be able to affordably deploy 200GB of data for my personal site. If this were for a for-profit tech company, deploying something like this would be an afterthought. I could get this up and running on S3 or MotherDuck in a matter of minutes. However, doing so would incur major costs, or at least costs that sound major when it&#39;s your blog and not a venture-backed startup. I set myself a goal of getting this under 10GB from the start given that was the threshold where you could meet the MotherDuck &quot;free tier&quot;. As with many things in my programming journey, this starting principle turned out to be woefully incorrect. While they do offer up to 10GB of storage, this only comes with a finite limit of 10 compute hours. Hopefully I wouldn&#39;t blow through that too soon, but still I&#39;d rather not deploy this to something with a known expiration date. Deploying it to Render, the existing place where I host my site, would still be fairly easy, but there I&#39;d be paying a fixed cost per GB, so I have a real interest in getting this data down to the smallest size possible.&lt;/p&gt;
&lt;p&gt;The other big limitation I had to think through was performance. DuckDB and tools like it are great for, and explicitly built for, offline analysis. In other words, they&#39;re perfect for ad-hoc questions, but they&#39;re not as purpose-built for repeatable, online analysis like when you&#39;re offering a tool to the internet that might need to answer the same questions over and over again in, ideally, a quick manner. Thankfully, these two ends can go hand-in-hand. Reducing the size of the data will very often result in speeding up the queries themselves too.&lt;/p&gt;
&lt;p&gt;My initial plan for getting started was to use the DuckDB data to actually produce a different database: Postgres. Postgres is extremely popular, well-documented and suited to the exact thing I was doing here. It&#39;s also trivially easy to dump data out of DuckDB into Postgres with something like this:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- attach to a postgres DB&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;ATTACH &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;postgresql://localhost:5432/cloaca&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;AS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; postgres_db &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;TYPE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; postgres&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot; style=&quot;display:inline-block&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- dump a table into it&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;into&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;public&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;ebird&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ebd_full&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I think if I was focusing on doing this the &quot;right&quot; way I would have kept going this way. But something dawned on me as I was fiddling around with DuckDB. The queries I was running on it were already seemingly fast enough for &quot;production&quot; use. If so, why waste all the effort to get it into Postgres? Furthermore... I use Postgres all the time and while I wouldn&#39;t mind spinning it up again, getting my tables neatly defined and honing my indexes to perfection, it just sounded more fun to see if I could get DuckDB up and running all on its own. I haven&#39;t checked if this is remotely a good idea. It might be. It might be a standard part of the tool? The main pro I could think of was a vague notion of cost savings. DuckDB (I think) compresses data a bit better than Postgres so I&#39;d be having to spend less on the disk usage if I went this way. There are far more cons that came to mind: it&#39;s probably slower, harder to tune, harder to run myself and won&#39;t be taking advantage of the super rich ecosystem of Postgres monitoring and maintenance tools... All in all, sounds like a bad decision future-me will regret.&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;the-two-initial-datasets&quot;&gt;The two initial datasets&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#the-two-initial-datasets&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to The two initial datasets&quot; title=&quot;Direct link to The two initial datasets&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I knew there were two overall types of data sets I wanted to produce for the app out of the gate:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Hotspot popularity&lt;/li&gt;
&lt;li&gt;Species likelihood&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;hotspot-popularity&quot;&gt;Hotspot popularity:&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#hotspot-popularity&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Hotspot popularity:&quot; title=&quot;Direct link to Hotspot popularity:&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Something I really like to see when going to a new area is simply which hotspots are more popular. Sure, it&#39;s nice to go where the popular kids are, but this is really more of a valid and super easy signal for finding a good place to bird. The most popular birding spots, at least in my area, are truly really good. They&#39;re where I like to go and where I&#39;d recommend others go too. It&#39;s also a nice signal that a place is safe and accessible. To state the obvious: this isn&#39;t a guarantee of any of these. Some spots are just popular &#39;cause, well, they&#39;re popular. Central Park is a great example of both. It truly is a really good spot to see birds, but it&#39;s also really, really popular.&lt;/p&gt;
&lt;p&gt;Here&#39;s what I wrote to calculate this:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        extract&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION 
DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- I think i&#39;m doing this right here?&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; 
date_trunc&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;week&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; avg_weekly_number_of_observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        ebd_full&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; 
style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A few quick explanatory notes: this is &quot;grouping&quot; all observations in the last 5 years by their Locality ID (the unique ID in eBird for a hotspot) and the month of the observation. This is basically a way of saying &quot;how many checklists were submitted in Central Park in October for the last 5 years?&quot;. I don&#39;t exactly want that though. I think a more intuitive metric is &quot;how many checklists typically get submitted in a week there?&quot; A hotspot with limited popularity might only have 1 or 2 per week, something with moderate popularity between 10-50, and then something that&#39;s very popular should be 100 or above. Some quick benchmarking against the &quot;real&quot; data on eBird supports this. In May of this year, Green-Wood Cemetery saw 536 checklists. That&#39;s roughly 125 per week. McGolrick Park, my local patch, is far less popular but still sees steady traffic, so it&#39;d be moderate with 114 total checklists, or 29 or so per week. Then, the more peripheral hotspots would see somewhere between 2-20 checklists per month.&lt;/p&gt;
&lt;p&gt;A few things already stick out to me as issues here and point to the fact that you can spend a LOT of time refining metrics and queries like this. This is an exercise I absolutely love. We start these analyses out with such a rich internal model of what feels like the &quot;true&quot; metric. But then getting that metric written out and working requires so much refining. One thing the McGolrick Park numbers show is the under-valuing of group checklists. The McGolrick Park bird club can have 200+ people in attendance in a given week. This should arguably be reflected in the popularity, but it won&#39;t be. Simply multiplying the checklist count by the number of observers would solve for this. Another thing is the conversion to weekly numbers. In hindsight, I probably should have gone with monthly data purely as a convenient way to test my results against eBird. However, to me the &quot;weekly number of checklists&quot; just seems more intuitive. Other questions: should we count checklists or number of unique observers? When I see a semi-popular hotspot, would it be confusing to realize it&#39;s just 1-2 birders&#39; local patch? I could spend days more digging into this, but for now I should just mostly move on...&lt;/p&gt;
&lt;p&gt;Except there are at least two glaring bugs I wrote here. In my defense I was a bit doubtful in my own comment on that line.&lt;/p&gt;
&lt;p&gt;1: &lt;code&gt;count(*)&lt;/code&gt;: this isn&#39;t counting checklists, this is counting observations! So my numbers were wayyy too high.&lt;/p&gt;
&lt;p&gt;Oh yeah and they are. We need to count distinct checklists here instead: &lt;code&gt;count(distinct &quot;SAMPLING EVENT IDENTIFIER&quot;)&lt;/code&gt; should do the trick.&lt;/p&gt;
&lt;p&gt;2: This is a more subtle bug, but one I should know better than to write, given I&#39;ve done this a million times before when writing this kind of time-based average: &lt;code&gt;/ count(distinct date_trunc(&#39;week&#39;, &quot;OBSERVATION DATE&quot;))&lt;/code&gt;. Naively, this seems right: the average is number of checklists / number of weeks. However, we&#39;re grouping by the hotspot and the month. Thus, if we&#39;re using the hotspot&#39;s own data to determine the number of weeks for that month, that number might not be the actual number of weeks in that month, because only weeks in which the hotspot had at least one checklist get counted. If that hotspot only had 1 checklist that month, then we could just be dividing 1/1. What we actually need to do is just use a constant number of weeks per month for every hotspot.&lt;/p&gt;
&lt;p&gt;Here&#39;s the &quot;final&quot; query here that I ended up with. I&#39;m gonna hedge that still since I know I&#39;ll probably find some more bugs:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; number_of_weeks_in_each_month &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        extract&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; date_trunc&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;week&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; num_weeks&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token 
plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        ebd_full&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    extract&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- I think i&#39;m doing this right here?&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;SAMPLING EVENT IDENTIFIER&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;num_weeks&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; avg_weekly_number_of_observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ebd_full&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span 
class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; number_of_weeks_in_each_month &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; extract&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; 
style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;5 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
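&lt;p&gt;&lt;code&gt;number_of_weeks_in_each_month&lt;/code&gt; is still not logically perfect, since it only counts weeks that show up somewhere in the data. With a dataset this large, though, every week should be represented, so in practice the numbers should be right. I should just figure out how to generate a series here in DuckDB... but I can just throw that on the future list of things to do.&lt;/p&gt;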
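&lt;p&gt;A rough, untested sketch of what generating that series could look like, assuming DuckDB&#39;s &lt;code&gt;generate_series&lt;/code&gt; table function accepts timestamp bounds with an interval step. The &lt;code&gt;days&lt;/code&gt; CTE and the &lt;code&gt;calendar_day&lt;/code&gt; column are hypothetical names used only for illustration. The idea is to count weeks from the calendar itself rather than from whichever weeks happen to appear in the observations:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Sketch only: a possible drop-in for number_of_weeks_in_each_month.
-- Builds one row per calendar day in the last 5 years, independent of the eBird data.
with days as (
    select
        calendar_day
    from
        generate_series(
            cast(current_date - interval &#39;5 year&#39; as timestamp),
            cast(current_date as timestamp),
            interval &#39;1 day&#39;
        ) as t(calendar_day)
)
select
    extract(month from calendar_day) as month,
    -- same definition of a week as the original CTE, just fed by the calendar
    count(distinct date_trunc(&#39;week&#39;, calendar_day)) as num_weeks
from
    days
group by
    1
order by
    1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This should produce 12 rows, one per calendar month, and the result no longer depends on every week having at least one observation. The start boundary here is inclusive while the original filter uses &lt;code&gt;&amp;gt;&lt;/code&gt;, so the two may differ by a day at the edge.&lt;/p&gt;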
&lt;p&gt;Having this, we could now &quot;materialize&quot; this into a more condensed dataset. By only storing the actual data we need, we&#39;d be greatly reducing the size of the data and also the time it takes to analyze it. I used the first query here to create a table for &quot;hotspot popularity&quot; and then, at run time, I joined it against a separate table that tracks the actual details for a hotspot. Here&#39;s how that second table is defined:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; LOCALITY&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY ID&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY 
TYPE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_type&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    LATITUDE&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    LONGITUDE&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ST_Point&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;LONGITUDE&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; LATITUDE&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;geometry&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ebd_full&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;full&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;current_date&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; 
style=&quot;color:#e3116c&quot;&gt;&#39;2 year&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;LOCALITY TYPE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;H&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;AND&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; LATITUDE &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;IS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;NOT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;AND&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; LONGITUDE &lt;/span&gt;&lt;span class=&quot;token 
operator&quot; style=&quot;color:#393A34&quot;&gt;IS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;NOT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    locality_id&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There&#39;s nothing too much to comment on here. I&#39;m only looking at the last 2 years of data to determine if a hotspot should be included (one of many steps to reduce the vast number of hotspots with little to no data in them). I&#39;m also only considering public hotspots, not personal locations. The most interesting thing is the last column I&#39;m selecting: &lt;code&gt;ST_Point(LATITUDE, LONGITUDE)&lt;/code&gt;. As I started playing around with this data I realized one extension of DuckDB I almost certainly want to use is its spatial tooling. I&#39;ve never really worked with Geo data like this before, but I knew that a very important point of optimization would be quickly determining if a hotspot was within the area I wanted to search. You can mathematically figure this out using the raw decimal data of latitude and longitude but that would be quite error-prone for me to work out myself and also probably not nearly as performant.&lt;/p&gt;
&lt;p&gt;Turns out doing this the way I did still resulted in both issues. Starting out, I had the distance filter entirely wrong. So even when I was looking for hotspots within 5km of a point in New York, I was returning basically every hotspot on earth. This was a pain to debug but eventually turned out to be me just not realizing that I was providing 1,000 times the correct value to &lt;code&gt;ST_Distance_Sphere&lt;/code&gt;. A far more subtle bug was when my query resulted in a weird, cat&#39;s-eye-shaped ellipse instead of a perfect circle. I honestly never figured out what caused this, and I tried numerous fixes. Claude really went out of its way to suggest incorrect answers here, telling me I needed to make custom Mercator projections to fix it. What ended up working was to swap the latitude and longitude when I created the table up above in &lt;code&gt;ST_Point&lt;/code&gt;. It feels like when faced with most Geo data bugs, the first trick should always be to try swapping the latitude and longitude.&lt;/p&gt;
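&lt;p&gt;The root cause of the squashed circle was never pinned down, but a small pure-Python sketch can at least show why an axis swap distorts distances. It uses a plain haversine formula with an assumed mean Earth radius; &lt;code&gt;haversine_m&lt;/code&gt; is a hypothetical helper, not DuckDB code, and it makes no claim about which axis order &lt;code&gt;ST_Point&lt;/code&gt; or &lt;code&gt;ST_Distance_Sphere&lt;/code&gt; expects.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (an assumption of this sketch)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Center taken from the Brooklyn request logged in this post.
lat, lon = 40.65808990435005, -73.96840797919282
step = 0.05  # degrees

# Correct axis order: one step north and one step east of the center.
north_ok = haversine_m(lat, lon, lat + step, lon)
east_ok = haversine_m(lat, lon, lat, lon + step)

# Swapped axis order: the same two steps with latitude and longitude exchanged.
north_swapped = haversine_m(lon, lat, lon, lat + step)
east_swapped = haversine_m(lon, lat, lon + step, lat)

print(f"north step: {north_ok:.0f} m correct, {north_swapped:.0f} m swapped")
print(f"east step:  {east_ok:.0f} m correct, {east_swapped:.0f} m swapped")
```

&lt;p&gt;With the axes swapped, the two directions are scaled by very different amounts, so a fixed distance cutoff no longer traces a circle on the map. That is consistent with the stretched shape described above, though it is only one plausible explanation.&lt;/p&gt;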
&lt;p&gt;In any case, here&#39;s what I ended up with for the run-time query to analyze this data:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;SELECT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    localities&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    LOCALITY &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_name&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    LATITUDE &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; latitude&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    LONGITUDE &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; longitude&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    avg_weekly_number_of_observations&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; localities&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;JOIN&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;ON&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; localities&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt;locality_id&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;WHERE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_type &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;H&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- Only hotspots&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;AND&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ST_Distance_Sphere&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;geometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ST_Point&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; 
style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ?  &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- Great circle distance in meters&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;AND&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ?&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;avg_weekly_number_of_observations &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;We&#39;re just joining my localities table against the hotspot popularity one, providing parameters for latitude, longitude, the max distance in meters and lastly the month to use. I&#39;m also filtering out any hotspots averaging fewer than 1 weekly observation since it&#39;s just kinda weird to show 0 on the map. Note: this is something to revisit (probably really soon), or else sparsely visited areas might not have much useful to display. Bigger note: it feels like sparsely visited places are one of the hardest challenges for displaying useful information for birding.&lt;/p&gt;
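&lt;p&gt;A minimal sketch of how the bind values for those four &lt;code&gt;?&lt;/code&gt; placeholders might be assembled. &lt;code&gt;build_hotspot_params&lt;/code&gt; is a hypothetical helper; the argument names mirror the &lt;code&gt;latitude&lt;/code&gt;, &lt;code&gt;longitude&lt;/code&gt;, &lt;code&gt;radius_km&lt;/code&gt; and &lt;code&gt;month&lt;/code&gt; query parameters of the endpoint, and the output order follows the prose above. The order of the two point values is an assumption and has to match whatever axis order the stored &lt;code&gt;geometry&lt;/code&gt; column uses.&lt;/p&gt;

```python
def build_hotspot_params(latitude, longitude, radius_km, month):
    """Validate the request values and return them in placeholder order.

    Hypothetical helper: converts the radius from kilometers to meters
    exactly once, so the SQL only ever sees meters.
    """
    if abs(latitude) > 90:
        raise ValueError(f"latitude out of range: {latitude}")
    if abs(longitude) > 180:
        raise ValueError(f"longitude out of range: {longitude}")
    if not radius_km > 0:
        raise ValueError(f"radius_km must be positive: {radius_km}")
    if month not in range(1, 13):
        raise ValueError(f"month must be 1-12: {month}")

    radius_m = radius_km * 1000.0  # the distance filter compares against meters
    # Assumed order: point values, then max distance, then month.
    return [latitude, longitude, radius_m, month]


params = build_hotspot_params(40.65808990435005, -73.96840797919282, 10, 5)
print(params)
```

&lt;p&gt;Doing the unit conversion in one place, next to the validation, makes a factor-of-1,000 mistake in the distance filter easier to spot.&lt;/p&gt;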
&lt;p&gt;One last note on the backend implementation: I&#39;m also defining an &lt;code&gt;RTREE&lt;/code&gt; index on these geometric points too, using the following:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;CREATE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;INDEX&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; idx_localities_spatial &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;ON&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; localities &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;USING&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;RTREE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;geometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I experimented very briefly with defining other indexes but found that their extra size (nearly doubling the DB size) bought virtually no benefit. This was not at all scientific, though, and is another area for potential investigation, especially if these queries end up bogging down and I have some &lt;code&gt;EXPLAIN&lt;/code&gt;&#39;ing to do.&lt;/p&gt;
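&lt;p&gt;One thing worth checking during that investigation: as far as the DuckDB spatial documentation describes it, &lt;code&gt;RTREE&lt;/code&gt; index scans apply to a specific set of spatial predicates compared against a constant geometry, so it is not a given that a &lt;code&gt;ST_Distance_Sphere&lt;/code&gt; filter touches the index at all. If &lt;code&gt;EXPLAIN&lt;/code&gt; shows that it does not, a common technique is to prefilter with a bounding box around the search circle and keep the distance check as the exact test. The sketch below only computes that box in pure Python; &lt;code&gt;bounding_box&lt;/code&gt; and the Earth radius constant are assumptions of this example, not part of the project.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def bounding_box(lat, lon, radius_m):
    """Return (min_lat, min_lon, max_lat, max_lon) in degrees enclosing the search circle.

    Hypothetical helper. It does not handle boxes that cross the
    antimeridian (longitude +/-180); those would need to be split in two.
    """
    delta = radius_m / EARTH_RADIUS_M  # angular radius in radians
    min_lat = lat - math.degrees(delta)
    max_lat = lat + math.degrees(delta)
    if max_lat >= 90.0 or -90.0 >= min_lat:
        # The circle reaches a pole, so every longitude is in range.
        return (max(min_lat, -90.0), -180.0, min(max_lat, 90.0), 180.0)
    # Widest longitude offset reached by a circle on a sphere at this latitude.
    dlon = math.degrees(math.asin(math.sin(delta) / math.cos(math.radians(lat))))
    return (min_lat, lon - dlon, max_lat, lon + dlon)


box = bounding_box(40.65808990435005, -73.96840797919282, 10_000.0)
print(box)
```

&lt;p&gt;The four numbers could then feed a rectangle predicate on the &lt;code&gt;geometry&lt;/code&gt; column, using the same axis order as the stored points. Whether that actually speeds anything up here is untested.&lt;/p&gt;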
&lt;p&gt;But... that&#39;s a lot of text and not a lot of showing the results. So with a quick 🥁, here&#39;s how it looks:&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;image showing a map of Brooklyn with circular indicators for hotspots. Green-wood Cemetery shows the number 101 on it&quot; src=&quot;https://beak-v2.onrender.com/assets/images/popularity_map_working_brooklyn-81d3e73aff9cd8d5367dd6429ef71b05.png&quot; width=&quot;1604&quot; height=&quot;916&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
&lt;p&gt;That looks roughly right for typical May activity in Brooklyn. Performance-wise it&#39;s quite snappy too:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Executing get_popular_hotspots with lat: 40.65808990435005, lon: -73.96840797919282, radius: 10.0km, month: 5&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[DuckDB Spatial] Query execution took 0.057s, returned 74 rows&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[DuckDB Spatial] Result conversion took 0.000s&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Time took to process the request and return response is 0.10329604148864746 sec&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;INFO:     127.0.0.1:51295 - &quot;GET /v1/popular_hotspots?latitude=40.65808990435005&amp;amp;longitude=-73.96840797919282&amp;amp;radius_km=10&amp;amp;month=5 HTTP/1.1&quot; 200 OK&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I can then zoom out to the entire eastern seaboard too with similar perf:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Executing get_popular_hotspots with lat: 41.92017945145008, lon: -71.16751310883416, radius: 1000.0km, month: 5&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[DuckDB Spatial] Query execution took 0.067s, returned 10138 rows&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;[DuckDB Spatial] Result conversion took 0.003s&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Time took to process the request and return response is 0.1633131504058838 sec&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;INFO:     127.0.0.1:51327 - &quot;GET /v1/popular_hotspots?latitude=41.92017945145008&amp;amp;longitude=-71.16751310883416&amp;amp;radius_km=1000&amp;amp;month=5 HTTP/1.1&quot; 200 OK&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Rendering this in an explainable way is a bit more challenging:&lt;/p&gt;
&lt;p&gt;&lt;img decoding=&quot;async&quot; loading=&quot;lazy&quot; alt=&quot;image showing a map of the eastern United States with a very dense cluster of hotspots on it. Too dense to really read&quot; src=&quot;https://beak-v2.onrender.com/assets/images/eastern_seaboard_popularity_map-949c7990ff9c27c500a9c53e28558946.png&quot; width=&quot;2820&quot; height=&quot;1636&quot; class=&quot;img_ev3q&quot;&gt;&lt;/p&gt;
&lt;p&gt;I&#39;m just using my existing map plotting tools and I think they could use some adjustments here to show less data at once, especially when zoomed all the way out. Still, it&#39;s rather cool to very quickly get a ranking of the popularity of spots at such a broad range as this!&lt;/p&gt;
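&lt;p&gt;One simple way to show less at once, sketched under the assumption that each result row is a dict keyed by the column names from the query above: keep only the busiest N hotspots before plotting. &lt;code&gt;top_hotspots&lt;/code&gt;, the default limit and the sample rows are made up for illustration.&lt;/p&gt;

```python
import heapq


def top_hotspots(rows, limit=50):
    """Return the `limit` rows with the highest average weekly observations, busiest first."""
    return heapq.nlargest(
        limit, rows, key=lambda row: row["avg_weekly_number_of_observations"]
    )


# Made-up placeholder rows, not real query results.
rows = [
    {"locality_id": "L1", "locality_name": "Example Park", "avg_weekly_number_of_observations": 12.0},
    {"locality_id": "L2", "locality_name": "Example Marsh", "avg_weekly_number_of_observations": 48.5},
    {"locality_id": "L3", "locality_name": "Example Pier", "avg_weekly_number_of_observations": 3.0},
]

for row in top_hotspots(rows, limit=2):
    print(row["locality_name"], row["avg_weekly_number_of_observations"])
```

&lt;p&gt;The same cap could instead live in the SQL as an &lt;code&gt;ORDER BY ... DESC LIMIT&lt;/code&gt;, which would also cut the rows sent over the wire; the limit could scale with zoom level.&lt;/p&gt;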
&lt;p&gt;Oh, and one last note: how much space does this take? Well, right now with just these tables, for the entire world, it&#39;s only a measly 466MB. That&#39;s far lower than I expected, though I really am not throwing much into it at all yet. I&#39;m more than a bit worried what will happen when I need a row for each species, for each month, for each hotspot...&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;getting-this-live&quot;&gt;Getting this live&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#getting-this-live&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Getting this live&quot; title=&quot;Direct link to Getting this live&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Now, you might notice, past-writer me and present me are in a bit of conflict here. See, now I&#39;m supposed to go ahead and move on to the species likelihood mapping. But I do have one big concern. I&#39;ve sunk a few hours into this exercise thus far and while I have promising results locally, I really have no idea how this might perform on a hosted site. So before we spend more time adding data, I think it might be worth pausing here to see if we can even deploy what we currently have. My rough plan is to just send the &quot;parsed&quot; DuckDB up to Render as an attached volume and then read it from there... but I am also starting from square 1 in this arena.&lt;/p&gt;
&lt;p&gt;Render makes this quite easy though and I can get 1GB of storage attached at &amp;lt; $0.25/month. Creating this took about 5 seconds and hopefully loading it with data should be just as easy since I can SCP data in from my local machine. I didn&#39;t have an SSH key added to my account yet but this also was quite easy. Honestly I have no interest in shilling for Render, but kudos to them on their docs in this whole process. This can always be a pain to get right and their docs were the right level of &quot;here&#39;s an example that&#39;s probably what you want&quot; along with great deep links from their docs to the right place in their product.&lt;/p&gt;
&lt;p&gt;Within 1 minute I was SCP&#39;ing the file up to the disk:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;% scp -s ~/parsed_full.db srv@ssh.oregon.render.com:/var/data&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;parsed_full.db    46%  207MB   4.2MB/s   00:56 ETA&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;It took longer (though not even that long) just to upload the parsed dataset than it did to get the drive working. Now I just need to tell Cloaca where to look for this data and we should be good! Thankfully I was already just reading this from an ENV variable &lt;code&gt;DUCK_DB_PATH&lt;/code&gt; so as long as I provide that to the server in Render this should work... In a real deployment, we&#39;d want to have some sort of separate environment to test something like this. But as of this writing this service has a 1:1 ratio of developers to users so it&#39;s gonna be fine if we break a few things.&lt;/p&gt;
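&lt;p&gt;A minimal sketch of reading that variable with a fail-fast check, so a missing or mistyped path shows up at startup instead of on the first request. Only the &lt;code&gt;DUCK_DB_PATH&lt;/code&gt; name comes from the project; &lt;code&gt;resolve_db_path&lt;/code&gt; and its error messages are hypothetical.&lt;/p&gt;

```python
import os
from pathlib import Path


def resolve_db_path(environ=os.environ):
    """Return the database path from DUCK_DB_PATH, or raise if it is unusable."""
    raw = environ.get("DUCK_DB_PATH")
    if not raw:
        raise RuntimeError("DUCK_DB_PATH is not set")
    path = Path(raw)
    if not path.is_file():
        raise RuntimeError(f"DUCK_DB_PATH points at a missing file: {path}")
    return path
```

&lt;p&gt;On the Render side the variable would presumably point somewhere under the disk mount used in the &lt;code&gt;scp&lt;/code&gt; command above, such as a file inside &lt;code&gt;/var/data&lt;/code&gt;.&lt;/p&gt;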
&lt;p&gt;Once this was deployed it was simple enough to test with a quick request to the new endpoint... and, as expected, there&#39;s a bit of trouble:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;IO Error: Extension &quot;/root/.duckdb/extensions/v1.3.2/linux_amd64/spatial.duckdb_extension&quot; not found.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Extension &quot;spatial&quot; is an existing extension.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This is somewhat expected. While I&#39;ve delivered the DB file to the server, and that server does have DuckDB installed on it via the Python wrapper, I can imagine there are some extra steps to actually getting it running there. With a quick google, and boy do I ever have to give the Docs teams of the world credit today, we found the exact solution: &lt;a href=&quot;https://duckdb.org/docs/stable/clients/python/overview.html#loading-and-installing-extensions&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://duckdb.org/docs/stable/clients/python/overview.html#loading-and-installing-extensions&lt;/a&gt;. We just need to add these two lines and we should be good:&lt;/p&gt;
&lt;div class=&quot;language-python codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-python codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;install_extension&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;spatial&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;load_extension&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;spatial&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
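&lt;p&gt;A sketch of where those two calls might live, assuming the server opens one connection per process and reuses it. &lt;code&gt;get_connection&lt;/code&gt; is a hypothetical helper and the caching is a design choice of this example, not something the project is known to do; the &lt;code&gt;duckdb&lt;/code&gt; package is imported lazily, so it is only needed when the function is first called.&lt;/p&gt;

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_connection():
    """Open the database once per process and prepare the spatial extension."""
    import duckdb  # lazy import: requires the duckdb package at call time

    con = duckdb.connect(os.environ["DUCK_DB_PATH"])
    con.install_extension("spatial")  # the two calls quoted above
    con.load_extension("spatial")
    return con
```

&lt;p&gt;Caching the connection means the install and load steps run once rather than on every request. With a threaded server, each request would likely want its own cursor from the shared connection, which is worth confirming against the DuckDB Python docs.&lt;/p&gt;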
&lt;p&gt;And trying again... it works! But it took 5 seconds. This is certainly less than ideal, but I still think it&#39;s worth getting out the door so we can learn a bit more from more production-like usage. Speaking of which...&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;using-it-live&quot;&gt;Using it live&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#using-it-live&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Using it live&quot; title=&quot;Direct link to Using it live&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As of that writing, I took an ice cream break with E. She asked for a quick demo and I started showing her the NYC area. However, we both were curious to apply this to a real-life use case. We&#39;re planning a trip in a few months to New Zealand, so I spun the globe over there to see how it looked. It was honestly so great to very easily have the entire country in the view and then to see the top 10 or so hotspots. What was even cooler was to see that at least half of them, and all the top 5 locations, were places we were planning on going! This hopefully is a good validation that the data this is showing is actually relevant for trip planning.&lt;/p&gt;
&lt;p&gt;That said, though, this is far too slow. So let&#39;s try and make this faster.&lt;/p&gt;
&lt;p&gt;To look into this, I&#39;ll run these queries locally and also on the Render VM. Thankfully it&#39;s pretty easy to just run DuckDB in both and compare the results from explain / analyze. Here&#39;s the initial result from the VM:&lt;/p&gt;
&lt;div class=&quot;language-text codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-text codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌────────────────────────────────────────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│┌──────────────────────────────────────────────┐│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;││               Total Time: 2.78s              ││&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│└──────────────────────────────────────────────┘│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└────────────────────────────────────────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌───────────────────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           QUERY           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│      EXPLAIN_ANALYZE      │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           0 Rows          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.07s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         PROJECTION        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│__internal_decompress_strin│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           g(#0)           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #1            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #2            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #3            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #4            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          60 Rows          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.00s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          ORDER_BY         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│      parsed_full.main     │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    .hotspot_popularity    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│.avg_weekly_number_of_obser│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        vations DESC       │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          60 Rows          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.00s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         PROJECTION        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│__internal_compress_string_│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        uhugeint(#0)       │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #1            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #2            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #3            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│             #4            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          60 Rows          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.00s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         PROJECTION        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        locality_id        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│       locality_name       │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          latitude         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         longitude         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│avg_weekly_number_of_observ│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           ations          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          60 Rows          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.00s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         HASH_JOIN         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│      Join Type: INNER     │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        Conditions:        ├──────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ locality_id = locality_id │              │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │              │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          60 Rows          │              │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.02s)          │     
         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┬─────────────┘              │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         TABLE_SCAN        ││         TABLE_SCAN        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   ││    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           Table:          ││     Table: localities     │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│     hotspot_popularity    ││   Type: Sequential Scan   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│   Type: Sequential Scan   ││        Projections:       │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││        locality_id        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span 
class=&quot;token plain&quot;&gt;│        Projections:       ││          LOCALITY         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        locality_id        ││          LATITUDE         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│avg_weekly_number_of_observ││         LONGITUDE         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           ations          ││                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││          Filters:         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          Filters:         ││     locality_type=&#39;H&#39;     │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          month=5          ││    (ST_Distance_Sphere    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│avg_weekly_number_of_observ││  (geometry, &#39;\x00\x00\x00 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        ations&amp;gt;=1.0        ││  \x00\x00\x00\x00\x00\x00 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││  \x00\x00\x00\x01\x00\x00 
│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││  \x00\xF46\xB5p\xC8\xF5D@ │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││ \x04\x06\xE7\x88\xB8\xCAQ │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││ \xC0&#39;::GEOMETRY) &amp;lt;= 30000 │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││            .0)            │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           ││                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         52280 Rows        ││          362 Rows         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (2.45s)          ││          (0.19s)          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└───────────────────────────┘└───────────────────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Something I noticed right away is that my index isn&#39;t being used! I glossed over it above, but I initially started off using &lt;code&gt;ST_DWithin&lt;/code&gt; and stopped since it wasn&#39;t really what I wanted. Someone halfway proficient in geo data could probably spot the issue and the solution right away... but I&#39;m only seeing the first half. Basically, if I want to use an index I should use &lt;code&gt;ST_DWithin&lt;/code&gt;, but if I want accuracy I need to use &lt;code&gt;ST_Distance_Sphere&lt;/code&gt;. I wonder if I could get away with using both? Use the former for an initial efficient narrowing of the points and then &lt;code&gt;ST_Distance_Sphere&lt;/code&gt; to narrow that down to an actually accurate picture? I played around with this, and while the query planner sometimes used the index, most of the time DuckDB would just skip it and default to a sequential scan. Claude seems to think I just need to define an index on the &lt;code&gt;month&lt;/code&gt; of the observation and everything will be fine. I suspect it assumes it&#39;s operating on a Postgres DB and is wrong. It&#39;s probably actually right and I&#39;m ignoring it incorrectly, but let&#39;s try and John Henry this a bit longer.&lt;/p&gt;
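&lt;p&gt;The two-stage idea (a cheap coarse filter first, then the exact spherical distance on whatever survives) can be sketched outside the database. This is a minimal Python illustration under stated assumptions, not DuckDB&#39;s implementation and not the query used in this post: the helper names, the sample points, and the Earth radius constant are all made up for the example, and the bounding box ignores the poles and the antimeridian.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_m):
    """Lat/lon rectangle that contains the whole circle (no pole or antimeridian handling)."""
    r = radius_m / EARTH_RADIUS_M  # angular radius in radians
    dlat = math.degrees(r)
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def nearby(points, lat, lon, radius_m):
    """Return the (name, lat, lon) tuples within radius_m of the centre."""
    lat_min, lat_max, lon_min, lon_max = bounding_box(lat, lon, radius_m)
    # Stage 1: cheap rectangle test, the kind of predicate a spatial index can answer.
    candidates = [
        p for p in points
        if lat_max >= p[1] >= lat_min and lon_max >= p[2] >= lon_min
    ]
    # Stage 2: exact spherical distance, computed only for the survivors.
    return [p for p in candidates if radius_m >= haversine_m(lat, lon, p[1], p[2])]


# Hypothetical points, not taken from the eBird data.
points = [
    ("A", 42.00, -71.00),  # the centre itself
    ("B", 42.10, -71.00),  # roughly 11 km north
    ("C", 42.00, -70.50),  # outside the rectangle entirely
    ("D", 42.25, -71.30),  # inside the rectangle, outside the circle
]
result = nearby(points, 42.0, -71.0, 30_000)
print([name for name, _, _ in result])
```

&lt;p&gt;Point D sits in a corner of the rectangle but is farther than 30 km from the centre, so only the second stage removes it. That is the job the exact-distance check would do after a coarser, index-friendly filter.&lt;/p&gt;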
&lt;p&gt;DuckDB does have excellent docs across the board, so I took a glance at their docs on performance. Two things stuck out to me: 1) they really highlight the &lt;a href=&quot;https://duckdb.org/docs/stable/guides/performance/indexing#the-effect-of-ordering-on-zonemaps&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;importance of ordering here when possible&lt;/a&gt;. 2) They talk about the &lt;a href=&quot;https://duckdb.org/docs/stable/guides/performance/schema#microbenchmark-joining-on-strings&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;inefficiency of joining on strings here&lt;/a&gt;. I know we&#39;re not explicitly ordering these tables and I also know we&#39;re using strings as our IDs in most cases (Locality IDs). 1) is because I didn&#39;t think about it and 2) is because the key is a string... or does it have to be? So let&#39;s try both: order both tables when parsing them, and then try converting the hotspot ID to an integer.&lt;/p&gt;
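&lt;p&gt;To get a feel for why ordering matters to min/max metadata like zonemaps, here is a toy Python model. It is only a sketch of the general idea: the function name, the 1,000-row group size, and the synthetic month column are assumptions for illustration and say nothing about DuckDB&#39;s actual row group size or internals.&lt;/p&gt;

```python
def row_groups_to_scan(values, group_size, target):
    """Count the row groups whose [min, max] range could contain target."""
    scanned = 0
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        # A min/max summary can only rule a group out if target falls outside its range.
        if max(group) >= target >= min(group):
            scanned += 1
    return scanned


# Synthetic month column: 12,000 rows, 1,000 per month, first interleaved and then sorted.
interleaved = [i % 12 + 1 for i in range(12_000)]
ordered = sorted(interleaved)

unordered_scans = row_groups_to_scan(interleaved, 1_000, 5)
ordered_scans = row_groups_to_scan(ordered, 1_000, 5)
print(unordered_scans, ordered_scans)
```

&lt;p&gt;With the interleaved column every group spans months 1 through 12, so a filter on month 5 cannot skip anything. Once the column is sorted, each group covers a single month and all but one can be ruled out from the min/max values alone.&lt;/p&gt;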
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;ordering-data&quot;&gt;Ordering Data&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#ordering-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Ordering Data&quot; title=&quot;Direct link to Ordering Data&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For ordering... well it turns out I already was ordering the localities table but not the hotspots one. Let&#39;s see what happens there. Just saying this again, these databases are fascinating, truly. Ordering the hotspot popularity table made the initial DB creation vastly slower (going from a few seconds to over 2 minutes), but the DB is now a bit over 1/3 of the original size (250 MB vs 650 MB before).&lt;/p&gt;
&lt;p&gt;Let&#39;s just replicate that on the VM to test things out by creating a quick new version of that table:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;or&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; hotspot_popularity_v2 &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token 
plain&quot;&gt; hotspot_popularity&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;month&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Connection &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; 
ssh&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;oregon&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;com closed &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; remote host&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;Connection &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ssh&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;oregon&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;com closed&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Oh no. &lt;code&gt;Ran out of memory (used over 512MB) while running your code.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Might have to just rebuild the DB and push it back up. Thankfully it&#39;s a lot smaller now!&lt;/p&gt;
&lt;p&gt;Unfortunately, if there was a speedup here it was a modest one. On to the IDs!&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;using-integer-ids&quot;&gt;Using integer IDs&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#using-integer-ids&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Using integer IDs&quot; title=&quot;Direct link to Using integer IDs&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Let&#39;s peek into the locality IDs:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ locality_id │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│   varchar   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;├─────────────┤&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L1          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10000087   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L1000010    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10000105   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span 
class=&quot;token plain&quot;&gt;│ L1000013    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ·        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ·        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10003217   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10003228   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10003279   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L10003410   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│ L1000354    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└─────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;So these should be pretty easy to just turn into integers with something like:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;CAST&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;SUBSTRING&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;locality_id&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;AS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;INTEGER&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; locality_id_int&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I &lt;em&gt;think&lt;/em&gt; it might be nice to only store these as integers and then be able to convert back to the string version. However, I&#39;m not seeing an obvious pattern in how the string version is being created. I&#39;m sure I could finesse some sort of left-padding... or I could store both for now. We&#39;ll store both in the &lt;code&gt;localities&lt;/code&gt; table and then only use the integer ID in hotspots. Somehow that further reduced the DB size, now to just 165 MB? I should state at this point that I am in no way confident in my method of sizing the database...&lt;/p&gt;
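&lt;p&gt;One possible reading of the sample above: the IDs look irregular only because a &lt;code&gt;varchar&lt;/code&gt; column sorts character by character, which puts &lt;code&gt;L10000087&lt;/code&gt; ahead of &lt;code&gt;L1000010&lt;/code&gt;. If every ID really is the letter L followed by a plain decimal number with no leading zeros (true of the rows shown, not verified against the full dataset), the round trip needs no padding at all. A small Python sketch of that assumption, with hypothetical helper names:&lt;/p&gt;

```python
def locality_id_to_int(locality_id: str) -> int:
    """Strip the leading 'L' and parse the rest; mirrors CAST(SUBSTRING(locality_id, 2) AS INTEGER)."""
    if not (locality_id.startswith("L") and locality_id[1:].isdigit()):
        raise ValueError(f"unexpected locality ID format: {locality_id!r}")
    return int(locality_id[1:])


def int_to_locality_id(n: int) -> str:
    """Rebuild the string form, assuming IDs never carry leading zeros."""
    return f"L{n}"


# First rows of the sample output above, in the order DuckDB printed them.
sample = ["L1", "L10", "L10000087", "L1000010", "L10000105", "L1000013"]

round_trips = all(int_to_locality_id(locality_id_to_int(s)) == s for s in sample)
numeric_order = sorted(sample, key=locality_id_to_int)
print(round_trips, numeric_order)
```

&lt;p&gt;If an ID with leading zeros or a different prefix ever shows up, the round trip would silently change it, so keeping the original string column in &lt;code&gt;localities&lt;/code&gt; remains the safer choice until the whole dataset has been checked.&lt;/p&gt;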
&lt;p&gt;Nor am I that confident in how I&#39;m measuring performance. It feels like I halved the query time here again, but these queries are still taking 1-2 seconds. It&#39;s at this point that I&#39;m starting to feel truly silly for spending more and more time trying to get an analytic database to perform like this.&lt;/p&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;trying-other-tactics&quot;&gt;Trying other tactics&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb-part-2#trying-other-tactics&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Trying other tactics&quot; title=&quot;Direct link to Trying other tactics&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Stepping back and looking at that explain again, I&#39;m thinking that maybe the join isn&#39;t really the expensive part. What looks more problematic to me is the sequential scan on the localities table:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌─────────────┴─────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         TABLE_SCAN        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│    ────────────────────   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           Table:          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│     hotspot_popularity    │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│   Type: Sequential Scan   │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        Projections:  
     │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│      locality_id_int      │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│avg_weekly_number_of_observ│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│           ations          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          Filters:         │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          month=5          │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│avg_weekly_number_of_observ│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│        ations&amp;gt;=1.0        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│                           │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│         52421 Rows        │&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│          (0.26s)          
│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└───────────────────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I would love to somehow force DuckDB to use the index on the lat/long geometry, which I defined with:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;CREATE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;INDEX&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; idx_localities_spatial &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;ON&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; localities &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;USING&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;RTREE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;geometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;However, I&#39;m not seeing an easy way to force the query planner to use this index like you can in other DBs. Digging through the docs, I noticed one concern right away:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One of the arguments to the spatial predicate function must be a “constant” (i.e., an expression whose result is known at query planning time). This is because the query planner needs to know the bounding box of the query region before the query itself is executed in order to use the R-tree index scan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Oof. I don&#39;t think this would work in my query case since I&#39;m using parameters like so:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;AND&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ST_DWithin&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;geometry&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ST_Point&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span 
class=&quot;token plain&quot;&gt; ?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Maybe I could just do the query templating myself beforehand to make this a more static query?&lt;/p&gt;
&lt;p&gt;Playing around a bit with some simplified queries, though, I realized that even a simple, static &lt;code&gt;ST_DWithin&lt;/code&gt; with an &lt;code&gt;ST_Point&lt;/code&gt; wasn&#39;t using the RTREE index on either my local DB or the slower VM. However, if I just threw in an ST_Envelope instead, suddenly the index would be used. This was surprising and confusing. It turns out that &lt;code&gt;ST_DWithin&lt;/code&gt; isn&#39;t supported by the index, but &lt;code&gt;ST_Within&lt;/code&gt; is... oof... More oofs ensued after this as I tried many different query patterns to get an index to be used when joining against the hotspot data. I found a lot of different ways to get an index to apply to a query just against &lt;code&gt;localities&lt;/code&gt; (the table with the lat/long geometry on it), but nothing worked once a join was applied to it. I very well could have missed something, but I, and subsequently Claude, tried many different combinations of query definitions with no dice.&lt;/p&gt;
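&lt;p&gt;The &quot;do the templating myself&quot; idea can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not the query the app actually runs: the helper name &lt;code&gt;bbox_prefilter&lt;/code&gt; is made up, the column is assumed to be called &lt;code&gt;geometry&lt;/code&gt;, and the distance is assumed to be in the same units as the stored coordinates. It inlines a constant &lt;code&gt;ST_MakeEnvelope&lt;/code&gt; box for &lt;code&gt;ST_Within&lt;/code&gt; (a predicate the R-tree can serve) and keeps &lt;code&gt;ST_DWithin&lt;/code&gt; as the exact check afterwards:&lt;/p&gt;

```python
def bbox_prefilter(x, y, distance):
    """Build a WHERE fragment with a constant bounding box around (x, y).

    Assumptions for this sketch: `distance` is in the same units as the
    coordinates, and the geometry column is named `geometry`.
    """
    # Coerce to float so only plain numbers are ever inlined into the SQL text.
    x, y, distance = float(x), float(y), float(distance)
    min_x, min_y = x - distance, y - distance
    max_x, max_y = x + distance, y + distance
    # The box is a coarse filter with constant arguments; ST_DWithin then trims
    # the corners. Points lying exactly on the box edge are an edge case that
    # this sketch does not handle specially.
    return (
        f"ST_Within(geometry, ST_MakeEnvelope({min_x}, {min_y}, {max_x}, {max_y})) "
        f"AND ST_DWithin(geometry, ST_Point({x}, {y}), {distance})"
    )


print(bbox_prefilter(-73.97, 40.66, 0.25))
```

&lt;p&gt;Whether the planner actually picks the index for a fragment like this still has to be confirmed with &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt;, particularly once a join is involved.&lt;/p&gt;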
&lt;p&gt;I did have another thought when trying out various ideas: what if I could just add the geometry data to the hotspot data? Or better yet... why even have two tables if I really only need one? I had started off with separate tables because of separation of concerns and whatnot, but if I continue down the path of &quot;only return data for a hotspot&quot; then I can realistically get away with querying only a single table. If I did that, then the join condition might be entirely obviated and I might get better index performance?&lt;/p&gt;
&lt;p&gt;So I created a simple, unified table... and presto:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;┌────────────────────────────────────────────────┐&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│┌──────────────────────────────────────────────┐│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;││              Total &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;: &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;0.0530&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;s             ││&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;│└──────────────────────────────────────────────┘│&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;└────────────────────────────────────────────────┘&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;That&#39;s on the VM. That plan didn&#39;t even use the index, nor does it seem to need it!&lt;/p&gt;
&lt;h1&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;At this point, it&#39;s probably time to wrap up this rambling post. I sure did learn a lot doing this, though it&#39;s a bit difficult to sum up neatly. Here are some sundry bullet points firing around at the end:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;DuckDB is still very cool. It&#39;s super easy to get up and running, the docs are great, and I was able to try things out, deploy them, and adjust them live, all with minimal effort.&lt;/li&gt;
&lt;li&gt;Using it as a production style DB is fairly silly. I&#39;m sure I could have avoided half the mistakes and rabbit holes here if I had just started using Postgres sooner.&lt;/li&gt;
&lt;li&gt;On using Claude / AI for things like this: I don&#39;t think Claude has a great idea of how to work with DuckDB. Most of its strategies were built for traditional row-oriented DBs, like its tendency to just say &quot;you need an index for this&quot; even though DuckDB strongly recommends against using indexes for most things. Even giving it direct access to the DuckDB docs really didn&#39;t help. Where it really shines, though, is as a test bench of sorts for various strategies. It can write, run and analyze queries so much faster than I can. Giving it a few ideas and test strategies and then watching it go worked really nicely. Asking it to solve my problems left a bit more to be desired.&lt;/li&gt;
&lt;li&gt;As with a lot of criticisms of AI (valid or not), these also apply to humans. My instincts from the get-go were &quot;oh this query isn&#39;t using an index, let&#39;s jump through hoops to get it to&quot;. DuckDB doesn&#39;t really tend to want to work that way, and when you&#39;re working with an analytics DB for the first time it might make sense to read the docs more.&lt;/li&gt;
&lt;/ul&gt;</content>

    <author>
      <name>David Meadows</name>

      <uri>https://dtmeadows.me</uri>

    </author>
  </entry>

  <entry>
    <title>My Review of the PATH</title>
    <link href="https://www.scd31.com/posts/path-review"/>
    <id>https://www.scd31.com/posts/path-review</id>
    <updated>2025-08-25T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Cursed Knowledge</title>
    <link href="https://tomeraberba.ch/cursed-knowledge"/>
    <id>https://tomeraberba.ch/cursed-knowledge</id>
    <updated>2025-08-25T00:00:00Z</updated>


    <content type="html">Cursed knowledge I have learned over time that I wish I never knew. Inspired by Immich&#39;s Cursed Knowledge. The knowledge is ordered from most to least recently learned. Ajv multipleOf validation is…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Analyzing the full eBird Dataset with DuckDB</title>
    <link href="https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb"/>
    <id>https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb</id>
    <updated>2025-08-16T00:00:00Z</updated>


    <content type="html">&lt;p&gt;Ever since I started birding, I’ve had an ever growing list of questions. Questions about the species I’ve seen, the ones I haven’t, the habitats they’re found in, the activity itself and its participants. When do Eastern Phoebes typically show up in New York? Where do they go when they leave? Where do they like to nest? How frequently are they seen in New York? Why does that frequency dip in the summertime?&lt;/p&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;the-journey&quot;&gt;The Journey&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#the-journey&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to The Journey&quot; title=&quot;Direct link to The Journey&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I love having this mounting list of questions as it’s one of the key drivers of a “mid life renaissance” of sorts in my curiosity. I’d love to write about that spark even more soon, but for now I’d like to talk about one of the avenues I’ve turned to in order to both quench and deepen this curiosity. Soon after getting into birding, I started getting quite curious about the underlying data that birding both depends upon and also adds to. Birding, more so than any other hobby I’ve encountered, focuses on both the reading and contribution of data. Birders study weather forecasts, migration timelines and previous years’ records to answer questions like: “when are the first migrants of the year going to show up?”, “are the winds out of the South this weekend?” (a good sign here in New York in the spring that tropical migrants will be coming in, borne along by the favorable winds) or “I just saw an owl species in Brooklyn. Which species might be most likely that matches the brief glimpse I got?” Just as interesting, and particularly compelling to many of us, is the fact that by recording our checklists, taking photos and recording audio, we’re also adding to the ever growing data set of birding observations (most commonly here in the US via Cornell’s eBird).&lt;/p&gt;
&lt;p&gt;As I got started in birding I was participating in both of these, but I was also starting to get curious about what the underlying data might look like. Soon into my birding journey I found out that eBird had an API you could use to access this data. I’ve already made more than a few fun projects off of that, but one of the limitations I ran into there is that an API like this is best suited for answering questions served by targeted, smaller data sets. It was great to see a list of the most recent species in an area or to get more information on hotspots around me. But, for questions that stretched across larger areas or across long time periods, using an API would take quite a long time (more than likely hitting some rate limits along the way) and not be that efficient. APIs like this are great at serving small-to-medium amounts of data very quickly based on specific filters or conditions. They’re not so great at providing deeper or richer datasets spanning thousands or millions of entries. And while my questions weren’t of the “millions” (yet), I was running into the limitations of the API when it came to many of my questions. Thankfully, eBird offers another route to accessing its data, via pre-packaged downloads of large swaths of data. This wouldn’t have the immediate, quick accessibility of the API, but would more than make up for it in its comprehensive and rich dataset.&lt;/p&gt;
&lt;p&gt;What’s funny to me is that it took me nearly a year before I actually downloaded my first segment of the dataset. I can’t say the exact reason why, but one of the reasons surely had to do with my unfamiliarity around digging through data like this. This isn’t to say I wasn’t comfortable or even happy around data analysis. While working at Stripe I had the distinction (happy or not) of being one of the top users of our shared analytics system (being summoned on more than one occasion to explain why I insisted on running so many SQL queries every day). But it’s that last part, SQL, that maybe best explains my hesitancy with this form of data querying. See, the main recommendation for working with the eBird dataset was with R. No shade meant to the language, it’s just not something I had really ever worked with before. So, while I had a little fun poking around at the dataset before, I very quickly got frustrated by the slowness of asking relatively simple questions via R (and by slowness I mean my slowness in figuring out just how to get anything working with it). I put this aside and promised myself I’d return to it another day when maybe my curiosity was willing to drive me a bit further.&lt;/p&gt;
&lt;p&gt;Sometime later, however, my curiosity actually led me in a different direction. If I was having such a hard time (and honestly not a fun time) trying to look at this data in this way… why not just try and do it the way that I was used to? To be fair to me, I think I assumed that using what I used at Stripe would be just too cost prohibitive and difficult for a small personal project (I mean it’s called a data warehouse for starters). On a whim, I asked if what I wanted was even possible on my own humble computer. Not only was I not discouraged, but right away I was pointed to a good solution.&lt;/p&gt;
&lt;p&gt;Enter DuckDB. Okay, right from the get-go, it has to be perfect, right? It has Duck in the name. But, before we go further, let’s take a brief aside to talk about what’s going on here in more detail.&lt;/p&gt;
&lt;p&gt;The problem is basically this: in its raw form the eBird dataset is just lines and lines of text separated by tabs (TSV, the sister file format to the very commonly used CSV). In fact, you can probably read this here and get a sense of what the data includes!&lt;/p&gt;
&lt;div class=&quot;language-tsv codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-tsv codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS2919749158	2025-03-01 23:25:39.781016	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				2					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			Lakefield Drive, Tuscaloosa	L10510092	H	33.1559565	-87.6336990	2025-03-01	06:21:00	obsr1073731		S215984764	Traveling	Traveling	P22			24	1.092		2	1	G14142220	0	1	0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS2933256785	2025-03-05 07:43:54.669907	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				1					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			23rd Street Area, Tuscaloosa	L12228602	P	33.1882745	-87.5396724	2025-03-05	06:27:00	obsr451424		S216708215	Stationary	Stationary	P21			15			1	1		0	1	0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS3009418896	2025-03-30 17:15:29.683739	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				3					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			River Bend Turf	L5039140	H	33.1335213	-87.6534175	2025-03-30	15:11:00	obsr451424		S221934152	Traveling	Traveling	P22			60	13.532		2	1	G14355008	0	1	
0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;That’s nice for general readability and it also makes working with this across a broad audience and technical background quite feasible. The downside here is that, in this form, this file is prohibitively large and also quite inefficient to use. I’m not going to go too deep into the technical reasons why, but suffice it to say that, by default, most any analysis of this file is going to require reading each and every line: all 1,809,934,873 of them for the full file. There are lots of different ways we could make this easier! Possibly the most commonly used method would be to throw it into a general purpose database like Postgres. This would certainly speed things up and reduce the size of the file. However, a database like this isn’t really suited for what I want to do here. I want to be able to ask near arbitrary types of questions without spending time beforehand defining things like indexes, which basically tell the database which axes I want to analyze data along. I could define an index for the date of the observation and then looking only for observations on a certain date would be blazingly fast! But this comes with 2 downsides: first, I’d need to wait a lengthy amount of time for the database to create that index (it needs to sort through all the data to see where it lives on that axis) and, moreover, I’m interested in so many more axes than that!&lt;/p&gt;
&lt;p&gt;The thing I really wanted was a columnar database! Not only would a columnar database not really require indexes, but it was perfectly suited for this kind of “Ask me anything” style approach. The thing that I didn’t know was that columnar databases can be used in a variety of use cases: from the large corporate behemoths to me poking around to see where the best place to see an Eastern Phoebe in March was. The exact reason why columnar databases are so good is honestly still beyond me. But I do know one of their biggest advantages is how they treat the data they store. See, in a traditional database, all elements in one of the rows in that TSV are going to be stored together. That’s quite useful when you’re trying to do something like look at one specific observation or a few to see all the relevant details about it: the species, the date, the observer, the location etc. Columnar databases flip this idea on its head though and treat everything by column and not by row. In an analytics use case, I don’t care as much about all the details of one observation; instead, I’m far more interested in a few details about a large list of observations. By storing data by column, they make it far more efficient to analyze along these. So, instead of thinking of these entries as one row (“I saw an Eastern Phoebe, on May 1st at Prospect Park on this checklist”), it flips this to instead group things by the species, or date, or location. (How do they do this so quickly? I still have no idea… I believe it has something to do with the fact that they’re storing all this data along an index by default?)&lt;/p&gt;
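&lt;p&gt;A toy Python sketch may help make the row-versus-column distinction concrete. The three observations below are made up, and real engines do a lot more (compression and batch processing, for instance), so treat this as an illustration of the layout only. The point is that a question about one field only has to touch that one column, which is closer to the real source of the speed than an index in the usual sense:&lt;/p&gt;

```python
# Three made-up observations, stored row by row (like lines in the TSV).
rows = [
    {"species": "Eastern Phoebe", "date": "2025-03-14", "locality": "Prospect Park"},
    {"species": "American Crow", "date": "2025-03-14", "locality": "Prospect Park"},
    {"species": "Eastern Phoebe", "date": "2025-03-20", "locality": "Central Park"},
]

# The same data pivoted into one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def count_species_by_row(rows, species):
    # Row layout: every full record is visited, even though one field matters.
    return sum(1 for row in rows if row["species"] == species)


def count_species_by_column(columns, species):
    # Column layout: only the "species" values are read; other columns stay untouched.
    return sum(1 for value in columns["species"] if value == species)


print(count_species_by_row(rows, "Eastern Phoebe"))
print(count_species_by_column(columns, "Eastern Phoebe"))
```

&lt;p&gt;Both functions give the same answer; the difference is how much data has to be read to get there, which matters a great deal once there are billions of rows and dozens of columns.&lt;/p&gt;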
&lt;p&gt;In any case, we’ve got our data principles set: now we can talk about ducks again. DuckDB caught my interest right away based on the simple install instructions (geared toward a local user), the ease with which I could work with CSV files, the fact they had just shipped a UI for running queries (I’m all for running my queries in a terminal, but in 2025 I don’t really wanna be interpreting tab and pipe characters in one) and, I mean, come on, they have Duck in the name! I loaded up the very small starter dataset I had and, glancing at the instructions, I had it working in less than 30 seconds. Here’s where I should start to really emphasize my unfamiliarity with what I was doing (if I haven’t laid it on thick already). I started trying to figure out if I needed to put this in S3 (a cloud-based storage system that probably backs nearly everything on the Internet in some way) or some other remote file storage system. I started playing around with just the right way to store this in Parquet (a newer file storage format far more suited for this problem than a TSV). I was game-planning the right way to stream the data from eBird’s server, through my machine as a TSV, up into the cloud as Parquet… when it dawned on me. This TSV was large, but not prohibitively so: 137GB. And while I didn’t have a spare cloud server sitting around… I did have quite a number of unused hard drives from my color editing work with my brother. Why not just throw the file on that?&lt;/p&gt;
&lt;p&gt;I thought I’d go from that step to querying data in an hour or two… but boy was that over-confident. For one, the download would end up taking far longer than that. Not only because I didn’t have the best internet speed, but my drive also kept going to sleep halfway through it… In any case, it was actually for the best, since it meant I spent more time testing and playing around with a more modest 25GB data set instead. This really paid off in the long run since I actually figured out a good plan with a quicker dataset rather than just trying and failing repeatedly on the full one. Here’s the first thing that threw me off: I had assumed that I needed to figure out my own specific method of data storage. It turned out that, while DuckDB will happily spit out Parquet (and other forms of data), that’s not really the “database” it’s working off of! Instead, the database is its own file… In any case, it was definitely fun to see how Parquet can drastically reduce the size of the download! With the 25GB NY data set it reduced it down to just 3GB… and I’m not even sure I was doing any compression with Parquet either! (See the questions below for areas I still need to explore) [and an editor’s note; he certainly was doing compression].&lt;/p&gt;
&lt;p&gt;So I didn’t need Parquet to store the data… but, as usual when I’m doing anything technical (or honestly, just anything) I was far ahead of myself. I was already thinking of how to store the data when I first needed to not only finish downloading it… but then figure out how I’d even get that huge TSV file uncompressed and into DuckDB. The first part, decompression, on any modern computer is typically just as easy as double clicking the file, but that’s not as straightforward when you’re dealing with output that will end up exceeding 600gb (the native decompressor on my Mac never worked on this and was painfully slow). One obvious problem here was that I didn’t exactly have 600gb to spare on my local machine. However, if I moved this over to one of those aforementioned external drives then I ran into another issue: that these drives get real bogged down if you’re trying to read from them while writing to them. Put another way, if I’m trying to decompress the data into another file I need to tell the drive “Hey take these chunks of data and tell me what they are. Oh also, while you’re doing that, also remember these other chunks.” For, again, complicated reasons, a lot of drives don’t love doing this. They’d prefer one operation or another, but not both. A straightforward workaround would be then to read the file from my local drive and write to the external one. An even better find, however, was when I started looking for a tool to do this and realized that there were far quicker de-compressors available than the default one that lives on my machine (they use multiple cores to accomplish this speedup). Enter pigz (continuing with the animal-named tools on this one I guess):&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;@����n+Y�%���@&amp;gt;��Xdƺ�E�b�&quot;���[���U�h$Bd���`ɭT���yn��6�l����…&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Uhh that’s not quite right… oh yeah, you need to tell this tool to &lt;code&gt;decompress&lt;/code&gt;; by default it will just recompress things. Okay, adding the right arg to pigz, and we’ve got data! Loading this into DuckDB was a bit more involved as I had to fiddle a bit more with the loading arguments to get everything right. The main issue was around how restricted characters in the TSV are quoted (isn’t that always the case when dealing with these?). But… we’ve got the data in! Again, this is with the more limited NY only data set, yet still, it only took 30s to create the DuckDB table for this. That’s &lt;em&gt;way&lt;/em&gt; faster than any operation using R ever took! We can ask our first question… When was the first Eastern Phoebe observation in NY?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Mounted at Wards [$1.25]; destroyed February 1907”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Always a nice reminder about how ornithology used to be just a bit more grisly. We’re close to realizing the dream! Writing that question was quick and the result was even quicker: 58ms. Now, we just need to work on the final goal, loading up the full data set.&lt;/p&gt;
&lt;p&gt;The key challenge here is still around file size. The uncompressed full dataset is around 600 gb. While I do have drives that can handle that, it’s a bit more difficult to figure out how best to decompress that file. What I ended up doing to avoid filling up my hard drive (which I mistakenly had already done twice) and to have this finish in a relatively speedy fashion was to actually keep the compressed file on my computer’s hard drive and then decompress the file directly to the external hard drive. This ended up taking about 2 hours which is far from great, but at least we have something that works! Something I’d like to figure out in the future is a better story around downloading -&amp;gt; decompression -&amp;gt; loading into DuckDB. Ideally one that’s quicker and involves less need for intermediate storage. You should be able to stream or pipe more of these processes to each other so that we don’t have to wait for the entire 600gb TSV file to be ready before loading it into DuckDB. The main reason I didn’t spend much time on this yet is that, a) it’s easier to reason about what I did b) premature optimization is the death of so many of my projects and c) dealing with issues is a bit harder in the streaming case than it is when I have more discrete check points along the way.&lt;/p&gt;
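&lt;p&gt;The read-from-one-drive, write-to-another step can also be sketched in Python. This is a rough, single-threaded sketch (so slower than pigz), assuming the download is a plain gzip file; the function name is made up for illustration. It decompresses in fixed-size chunks so the full output never has to fit in memory:&lt;/p&gt;

```python
import gzip
import shutil


def stream_decompress(src_path, dst_path, chunk_size=1024 * 1024):
    """Decompress a .gz file chunk by chunk, reading from one path and writing to another."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj moves chunk_size bytes at a time, so memory use stays flat
        # no matter how large the decompressed output ends up being.
        shutil.copyfileobj(src, dst, chunk_size)
```

&lt;p&gt;Pointing &lt;code&gt;src_path&lt;/code&gt; at the internal drive and &lt;code&gt;dst_path&lt;/code&gt; at the external one keeps each drive doing only reads or only writes. It does not remove the need for the intermediate TSV; that would take piping the decompressed stream straight into the loader.&lt;/p&gt;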
&lt;p&gt;Setting that aside, I’m now one step away from the final dream. I need to get this data into DuckDB. One downside I discovered here with DuckDB is that it’s pretty hard to track the timing of larger tasks like this. There’s not a way to get a progress bar, that I was able to find. A strategy I used for this, and a few other tasks here, was to use times from the smaller NY dataset to then estimate the time for the full dataset. This isn’t a perfect strategy, given the non-linear impact of compression and other things like disk caches that will have more of an impact on the smaller file sizes. Still, it was actually pretty good at predicting things like the final row count of the DB (off by about 5%), but less good at predicting the disk size. It ended up being around 214GB which is way bigger than I was planning for. It turns out DuckDB isn’t going to magically solve my disk size needs. Still, that’s a problem for future David (a fun problem no doubt) and not the focus of the day.&lt;/p&gt;
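&lt;p&gt;The estimation strategy amounts to simple linear scaling, sketched below in Python. The function name is made up, and pairing the 30-second NY load with 25GB and 600GB file sizes is an assumption for illustration only, not a measured result for the full load:&lt;/p&gt;

```python
def estimate_from_sample(sample_value, sample_size_gb, full_size_gb):
    """Scale a measurement from a small dataset linearly by file size."""
    return sample_value * (full_size_gb / sample_size_gb)


# Illustrative pairing of figures from the post: a 30 s load for a 25 GB
# sample, scaled up to a roughly 600 GB full file.
estimated_seconds = estimate_from_sample(30, 25, 600)
print(f"about {estimated_seconds / 60:.0f} minutes")
```

&lt;p&gt;The same function works for row counts or disk size by swapping in a different sample measurement, with the caveat already noted: compression and caching do not scale linearly, so the estimate is only a starting point.&lt;/p&gt;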
&lt;p&gt;The focus of the day is answering some burning questions, like who has seen the most ducks in a day? It surprisingly isn’t a tie… it’s this checklist here with 11 ducks: &lt;a href=&quot;https://ebird.org/checklist/S16191277&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://ebird.org/checklist/S16191277&lt;/a&gt;. Wood Duck, Mottled Duck, Ruddy Duck… Great work here. Plumed Whistling-Duck, Mandarin Duck, Ruddy Shelduck, what on earth is going on here? Oh all of these are escapees… that doesn’t feel right in the spirit of “most ducks.” I think that should mean wild ducks. If we filter those pesky escapees out… we’re down to 10 ducks reported. Still, surprisingly, there’s no tie! Only 1 leader emerges and it’s from 1996 in Ethiopia: &lt;a href=&quot;https://ebird.org/checklist/S77656343&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://ebird.org/checklist/S77656343&lt;/a&gt;. Here’s the query:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;SAMPLING EVENT IDENTIFIER&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    country&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    state&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token function&quot; style=&quot;color:#d73a49&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;distinct&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;SCIENTIFIC NAME&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; no_of_ducks_seen&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    array_agg&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;COMMON NAME&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    array_agg&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ebd&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;full&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;where&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;COMMON NAME&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;ilike&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;%duck%&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; category &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;not&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;slash&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;spuh&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; APPROVED &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token comment&quot; style=&quot;color:#999988;font-style:italic&quot;&gt;-- no escapees here!&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;EXOTIC CODE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;is&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;or&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;EXOTIC CODE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;N&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; 
style=&quot;color:#36acaa&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    no_of_ducks_seen &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;desc&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;100&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;But another problem you say! That’s not the true definition of duck! You need to filter to the species in Anatidae… true that! But definitely another challenge for another day.&lt;/p&gt;
&lt;p&gt;Speaking of which, boy is there more work to do here. I’ve loved being able to scratch data questions easily and quickly. But it’s quite impractical to have a 200+ GB database sitting around on your hard drive. It’s also unwieldy to have to leave things running for hours to refresh the dataset each month. Beyond that, I’d love to be able to use this in more broadly reachable tools. DuckDB offers a hosted version of their service and they have a free tier you can use if your dataset is below 10 GB. Shrinking the database to about 5% of its current size sounds really difficult, but I haven’t engaged much at all in reducing the dataset. For one, there are numerous columns here I don’t really need (comments, redundant taxonomic information, checklist IDs). In fact, it’d be quite a fun activity to see just which columns or portions of the DB actually contribute to its size. I also probably don’t need all the data to answer most of the questions I have.&lt;/p&gt;
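&lt;p&gt;As a starting point for that size investigation, DuckDB has a &lt;code&gt;storage_info&lt;/code&gt; pragma that lists the storage segments behind each column. A rough sketch, assuming the table lives at &lt;code&gt;ebd.full&lt;/code&gt; as in the query above and that the pragma accepts a schema-qualified name. Counting distinct blocks is only a proxy for bytes on disk, since blocks can be shared between segments.&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- which columns occupy the most storage blocks, and how are they compressed?&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;select&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    column_name,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    compression,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    count(distinct block_id) as blocks_used -- rough proxy for on-disk size&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;from&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    pragma_storage_info(&#39;ebd.full&#39;)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;group by&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    column_name,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    compression&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;order by&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    blocks_used desc;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;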
&lt;p&gt;Here are some other outstanding tasks I’d love to look into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;All the questions around performance here. Could I use ordering for better indexing? &lt;a href=&quot;https://duckdb.org/docs/stable/guides/performance/indexing#the-effect-of-ordering-on-zonemaps&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://duckdb.org/docs/stable/guides/performance/indexing#the-effect-of-ordering-on-zonemaps&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Why is the first eBird checklist from Thailand on 1481-03-24?&lt;/li&gt;
&lt;li&gt;Is there a form of Parquet compression that would reduce the size even more?&lt;/li&gt;
&lt;li&gt;I should clean up data types more (move from integer to binary). This will reduce size more! Probably shifting things to enums could be massive too?&lt;/li&gt;
&lt;li&gt;Convert column names to underscore to make them easier to auto_complete.&lt;/li&gt;
&lt;li&gt;Figure out how to get x/y coords into data to perform distance analysis more quickly!&lt;/li&gt;
&lt;/ul&gt;
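&lt;p&gt;Two of these look approachable with options DuckDB already has. A sketch, not something run against the full dataset: &lt;code&gt;read_csv&lt;/code&gt; has a &lt;code&gt;normalize_names&lt;/code&gt; option for cleaning up column names at load time, and rewriting the table with an &lt;code&gt;order by&lt;/code&gt; is one way to cluster rows so that zonemaps can skip more data. The table names &lt;code&gt;ebd.full_normalized&lt;/code&gt; and &lt;code&gt;ebd.full_sorted&lt;/code&gt; and the choice of sort column are assumptions for illustration.&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- option 1: clean up column names while loading (hypothetical table name)&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;create table ebd.full_normalized as&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;select&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    *&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;from&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    read_csv(&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &#39;/path/to/decompressed/ebd.tsv&#39;,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        normalize_names = true, -- strips non-alphanumeric characters from column names&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        quote = &#39;&#39;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    );&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- option 2: rewrite the existing table sorted by date so date filters can skip row groups&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;create table ebd.full_sorted as&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;select&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    *&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;from&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    ebd.full&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;order by&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &quot;OBSERVATION DATE&quot;;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Sorting a table of this size needs roughly a second copy’s worth of disk space while it runs, so it’s probably worth trying on the NY subset first.&lt;/p&gt;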
&lt;div class=&quot;theme-admonition theme-admonition-info admonition_xJq3 alert alert--info&quot;&gt;&lt;div class=&quot;admonitionHeading_Gvgb&quot;&gt;&lt;span class=&quot;admonitionIcon_Rf37&quot;&gt;&lt;svg viewBox=&quot;0 0 14 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M7 2.3c3.14 0 5.7 2.56 5.7 5.7s-2.56 5.7-5.7 5.7A5.71 5.71 0 0 1 1.3 8c0-3.14 2.56-5.7 5.7-5.7zM7 1C3.14 1 0 4.14 0 8s3.14 7 7 7 7-3.14 7-7-3.14-7-7-7zm1 3H6v5h2V4zm0 6H6v2h2v-2z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;info&lt;/div&gt;&lt;div class=&quot;admonitionContent_BuS1&quot;&gt;&lt;p&gt;One post-writing note that I figured out a few days later: at a few points I mention that I needed to change the file format or data types to take advantage of DuckDB’s compression. This isn’t necessarily true. I found out subsequently that DuckDB is already quite smart at inferring the right data type for a column all on its own. So there’s less tuning needed at the onset (great!), but it also means there aren’t as many quick wins as I imagined.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
&lt;h2 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;step-by-step-guide-on-how-to-load-the-ebd-into-duckdb&quot;&gt;Step-by-step guide on how to load the EBD into DuckDB&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#step-by-step-guide-on-how-to-load-the-ebd-into-duckdb&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Step-by-step guide on how to load the EBD into DuckDB&quot; title=&quot;Direct link to Step-by-step guide on how to load the EBD into DuckDB&quot;&gt;​&lt;/a&gt;&lt;/h2&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;ingredients&quot;&gt;Ingredients&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#ingredients&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Ingredients&quot; title=&quot;Direct link to Ingredients&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;pigz&lt;/code&gt; or another tool for decompressing the EBD into a TSV. If you&#39;re not using a huge data set, you might be able to just use your OS&#39;s native decompression tool.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;duckdb&lt;/code&gt;: &lt;a href=&quot;https://duckdb.org/docs/installation/?version=stable&amp;amp;environment=cli&amp;amp;platform=macos&amp;amp;download_method=direct&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;download instructions here&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;the EBD! &lt;a href=&quot;https://ebird.org/data/download?_gl=1*1l247ec*_gcl_au*MTgxMjQ2NTMyNy4xNzUyMDkzNzAy*_ga*MTcwOTAzNDM4Ny4xNzQxOTgzOTIx*_ga_QR4NVXZ8BM*czE3NTUzNzgxNzEkbzQwJGcxJHQxNzU1Mzc4MTc1JGo1NiRsMCRoMA..&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Sign up and request data here&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;steps&quot;&gt;Steps&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#steps&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Steps&quot; title=&quot;Direct link to Steps&quot;&gt;​&lt;/a&gt;&lt;/h3&gt;
&lt;h4 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;step-1-download-your-data&quot;&gt;Step 1: Download your Data&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#step-1-download-your-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Step 1: Download your Data&quot; title=&quot;Direct link to Step 1: Download your Data&quot;&gt;​&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;This can take a few minutes or a few hours. Note that Cornell&#39;s download speeds seem to be capped at around 8-10 MB/s, so plan accordingly!&lt;/p&gt;
&lt;h4 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;step-2-decompress-your-data&quot;&gt;Step 2: Decompress your data&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#step-2-decompress-your-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Step 2: Decompress your data&quot; title=&quot;Direct link to Step 2: Decompress your data&quot;&gt;​&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;You can do this with whatever tool you like. I liked using &lt;code&gt;pigz&lt;/code&gt; for performance. Here&#39;s a handy snippet I used to run &lt;code&gt;pigz&lt;/code&gt; with progress tracking:&lt;/p&gt;
&lt;div class=&quot;language-bash codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-bash codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;pv ebd_relJan-2025.tsv.gz | pigz -d &amp;gt; ebd_relJan-2025.tsv&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;pv&lt;/code&gt; is used to monitor the progress of data coming through the pipe.&lt;/p&gt;
&lt;p&gt;At the end of this step you should have the &quot;raw&quot; TSV, looking something like this:&lt;/p&gt;
&lt;div class=&quot;language-tsv codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-tsv codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS2919749158	2025-03-01 23:25:39.781016	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				2					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			Lakefield Drive, Tuscaloosa	L10510092	H	33.1559565	-87.6336990	2025-03-01	06:21:00	obsr1073731		S215984764	Traveling	Traveling	P22			24	1.092		2	1	G14142220	0	1	0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS2933256785	2025-03-05 07:43:54.669907	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				1					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			23rd Street Area, Tuscaloosa	L12228602	P	33.1882745	-87.5396724	2025-03-05	06:27:00	obsr451424		S216708215	Stationary	Stationary	P21			15			1	1		0	1	0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;URN:CornellLabOfOrnithology:EBIRD:OBS3009418896	2025-03-30 17:15:29.683739	21333	species	avibase-69544B59	American Crow	Corvus brachyrhynchos				3					United States	US	Alabama	US-AL	Tuscaloosa	US-AL-125		27			River Bend Turf	L5039140	H	33.1335213	-87.6534175	2025-03-30	15:11:00	obsr451424		S221934152	Traveling	Traveling	P22			60	13.532		2	1	G14355008	0	1	
0&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;h4 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;step-3-load-the-data-into-duckdb&quot;&gt;Step 3: Load the data into DuckDB&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#step-3-load-the-data-into-duckdb&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Step 3: Load the data into DuckDB&quot; title=&quot;Direct link to Step 3: Load the data into DuckDB&quot;&gt;​&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;There&#39;s a wide variety of ways to do this. I&#39;m including the simplest version I found here:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ebd&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;full&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;AS&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;SELECT&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; 
style=&quot;color:#00009f&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    read_csv&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;/path/to/decompressed/ebd.tsv&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        store_rejects &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token boolean&quot; style=&quot;color:#36acaa&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;        quote &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&#39;&#39;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span 
class=&quot;token plain&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The three things I&#39;d call out there are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We&#39;re using &lt;code&gt;read_csv&lt;/code&gt; to, well, read the TSV. DuckDB is smart enough to detect the delimiter.&lt;/li&gt;
&lt;li&gt;We&#39;re storing the rejected lines. It&#39;s always good to review these to make sure everything went alright!&lt;/li&gt;
&lt;li&gt;The last, most important part is &lt;code&gt;quote = &#39;&#39;&lt;/code&gt;. If you don&#39;t do this, most if not all lines will get rejected for some reason...&lt;/li&gt;
&lt;/ol&gt;
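&lt;p&gt;To review the rejected lines from point 2, a minimal sketch: with &lt;code&gt;store_rejects = true&lt;/code&gt;, DuckDB records rejections in temporary tables, by default named &lt;code&gt;reject_scans&lt;/code&gt; and &lt;code&gt;reject_errors&lt;/code&gt;. These queries assume those default names and need to run in the same session as the load, since the tables are temporary.&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- how many errors were recorded, grouped by the kind of error&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;select&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    error_type,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    count(*) as rejections&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;from&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    reject_errors&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;group by&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    error_type&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;order by&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    rejections desc;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;-- look at a handful of the offending lines and what went wrong&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;select&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    line,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    column_name,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    error_message,&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    csv_line&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;from&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;    reject_errors&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;limit 20;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;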
&lt;h4 class=&quot;anchor anchorWithStickyNavbar_LWe7&quot; id=&quot;step-4-write-a-query-against-your-data&quot;&gt;Step 4: Write a query against your data!&lt;a href=&quot;https://beak-v2.onrender.com/blog/analyzing-ebd-with-duckdb#step-4-write-a-query-against-your-data&quot; class=&quot;hash-link&quot; aria-label=&quot;Direct link to Step 4: Write a query against your data!&quot; title=&quot;Direct link to Step 4: Write a query against your data!&quot;&gt;​&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;The most fun part. Time to learn from the data!&lt;/p&gt;
&lt;p&gt;Here&#39;s a query to see the first observation in the EBD:&lt;/p&gt;
&lt;div class=&quot;language-sql codeBlockContainer_Ckt0 theme-code-block&quot; style=&quot;--prism-color:#393A34;--prism-background-color:#f6f8fa&quot;&gt;&lt;div class=&quot;codeBlockContent_QJqH&quot;&gt;&lt;pre tabindex=&quot;0&quot; class=&quot;prism-code language-sql codeBlock_bY9V thin-scrollbar&quot; style=&quot;color:#393A34;background-color:#f6f8fa&quot;&gt;&lt;code class=&quot;codeBlockLines_e6Vv&quot;&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token operator&quot; style=&quot;color:#393A34&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; ebd&lt;/span&gt;&lt;span class=&quot;token punctuation&quot; style=&quot;color:#393A34&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;full&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token string&quot; style=&quot;color:#e3116c&quot;&gt;&quot;OBSERVATION DATE&quot;&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;span 
class=&quot;token-line&quot; style=&quot;color:#393A34&quot;&gt;&lt;span class=&quot;token plain&quot;&gt;&lt;/span&gt;&lt;span class=&quot;token keyword&quot; style=&quot;color:#00009f&quot;&gt;limit&lt;/span&gt;&lt;span class=&quot;token plain&quot;&gt; &lt;/span&gt;&lt;span class=&quot;token number&quot; style=&quot;color:#36acaa&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;hr&gt;</content>

    <author>
      <name>David Meadows</name>

      <uri>https://dtmeadows.me</uri>

    </author>
  </entry>

  <entry>
    <title>The special hell of Bolt, Europe&#39;s Uber clone</title>
    <link href="https://brandur.org/fragments/special-hell-of-bolt-app"/>
    <id>tag:brandur.org,2025-07-15:fragments/special-hell-of-bolt-app</id>
    <updated>2025-07-15T14:50:08Z</updated>


    <content type="html">&lt;p&gt;I was in Latvia a few weeks ago. Riga&amp;rsquo;s one of the European cities without a good transit link from the airport into the city. Snooping around online, I found that the recommended way to get a ride was to use an app called Bolt, a European clone of Uber. I realize now that I didn&amp;rsquo;t actually check that Uber wasn&amp;rsquo;t available in Latvia, but I&amp;rsquo;m not against experimenting with a new app here and there.&lt;/p&gt;

&lt;p&gt;I used it twice to get to and from the city center, and it worked perfectly. Neither of my drivers spoke English and I didn&amp;rsquo;t speak a word of Latvian, but that&amp;rsquo;s what technology&amp;rsquo;s for. The rides went off without a hitch and I got exactly where I was supposed to be both times.&lt;/p&gt;

&lt;p&gt;I arrived in Lyon recently and figured, hey, this is Europe, why not try the European app again, and used Bolt.&lt;/p&gt;

&lt;h2 id=&quot;ride-attempt-1&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ride-attempt-1&quot;&gt;Ride attempt no. 1&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Car pulls into airport, drives to the waiting spot, stops up ahead of me, I walk over to it, driver pulls away, and leaves the airport. Mystified, I photograph the guy&amp;rsquo;s license plate as he drives off figuring I might need it for dispute evidence.&lt;/p&gt;

&lt;p&gt;The driver doesn&amp;rsquo;t cancel the ride as he rides off into the distance, leaving me to do it, presumably so it falls to me to pay the app&amp;rsquo;s €7 cancellation fee.&lt;/p&gt;

&lt;h2 id=&quot;ride-attempt-2&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ride-attempt-2&quot;&gt;Ride attempt no. 2&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;I cancel and try again. I get a ride parked not far off, but with a message: &amp;ldquo;This is an automated acceptance. This car is set to charge for another 45 minutes.&amp;rdquo; Sure enough, it&amp;rsquo;s unmoving and unresponsive, and eventually the ride times out (thankfully, avoiding another €7 charge).&lt;/p&gt;

&lt;h2 id=&quot;ride-attempt-3&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ride-attempt-3&quot;&gt;Ride attempt no. 3&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;No message this time, but another car that appears to be charging and/or long term parked (it&amp;rsquo;s a Tesla, so I suspect charging again). I leave the app, waiting for the pick up to time out.&lt;/p&gt;

&lt;h2 id=&quot;ride-attempt-4&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ride-attempt-4&quot;&gt;Ride attempt no. 4&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;I give up on Bolt, and switch to Uber. I match a driver right away. It&amp;rsquo;s almost suspicious how quickly I matched him. But this is good! Progress. He drives over and I walk up to meet him. I get in the car and we start moving. Finally, this fiasco is over.&lt;/p&gt;

&lt;p&gt;But then a guy runs up to the driver&amp;rsquo;s window. Hey, he shouts, you&amp;rsquo;re our ride! We booked you on Bolt. We just talked about it on the phone a few minutes ago, remember?&lt;/p&gt;

&lt;p&gt;Knowing that his license plate and photo match what&amp;rsquo;s on their screen, the driver doesn&amp;rsquo;t bother denying it, and instead just points to his phone&amp;rsquo;s screen and says, I pick up Brandur. See?&lt;/p&gt;

&lt;p&gt;Even as the car&amp;rsquo;s &amp;ldquo;winner&amp;rdquo; (I&amp;rsquo;m not sure if this was because I got to the car first or the Uber fare was more favorable for the driver), I have principles, and of course don&amp;rsquo;t love this situation either, but my only alternative would be to get out and cancel the ride, for which I&amp;rsquo;d surely get hit with another fee. Unfortunately my best option is to stay quiet about it, let the Bolt user get another ride, and give the driver a low rating later. Naturally, the driver didn&amp;rsquo;t cancel the other guy&amp;rsquo;s Bolt ride (at least as far as I observed from the back seat), which would&amp;rsquo;ve left the user to eat the €7 fee.&lt;/p&gt;

&lt;p&gt;As we drove away from the airport, I suddenly realized: wait! this must be what happened to &lt;em&gt;me&lt;/em&gt; during my first ride.&lt;/p&gt;

&lt;h2 id=&quot;ride-attempt-1-part-2&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ride-attempt-1-part-2&quot;&gt;Ride attempt no. 1, part 2&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;I go back into the Bolt app and open a support conversation. This option is purposely hidden deep inside submenus of submenus of submenus, so it took me five minutes to find it. I explain what happened and include the photographic evidence. From the first response it&amp;rsquo;s obvious they have me talking to an AI. I drop all formality, and type only the minimum viable number of characters to get the next response. The AI promises me a refund for my €7 cancellation fee, then proceeds to provide no refund.&lt;/p&gt;

&lt;p&gt;Eventually I&amp;rsquo;m escalated to a human operator, who somehow manages to be worse than the AI. After explaining the situation again, I&amp;rsquo;m told that fine, in this extremely rare, never-before-seen, once-in-a-cosmic-era situation, they&amp;rsquo;ll refund the €7 fee. But don&amp;rsquo;t fuck up again!&lt;/p&gt;

&lt;p&gt;Don&amp;rsquo;t worry Bolt, I won&amp;rsquo;t. My days of using you scam peddlers are over.&lt;/p&gt;

&lt;p&gt;When something works well enough, it&amp;rsquo;s easy to take it for granted. As much flak as Uber and Lyft take, my experience with Bolt made me stop and think that even given 10+ years and hundreds of rides on both apps, my bad experiences have numbered like maybe, two? That sort of quality bar isn&amp;rsquo;t an easy thing to maintain.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Non-Euclidean Monitor Layout With Cursor Wormholes</title>
    <link href="https://www.scd31.com/posts/cursor-wormholes"/>
    <id>https://www.scd31.com/posts/cursor-wormholes</id>
    <updated>2025-07-12T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Occasionally injected clocks in Postgres</title>
    <link href="https://brandur.org/fragments/postgres-clocks"/>
    <id>tag:brandur.org,2025-06-29:fragments/postgres-clocks</id>
    <updated>2025-06-29T17:48:16Z</updated>


    <content type="html">&lt;p&gt;In a standard app deployment that&amp;rsquo;s scaled horizontally across many nodes, we can expect the clocks to be a little askew across the fleet. It&amp;rsquo;s generally not a huge problem these days because our &lt;a href=&quot;https://en.wikipedia.org/wiki/Network_Time_Protocol&quot;&gt;use of NTP&lt;/a&gt; is so good and so widespread, but minor drift is still present.&lt;/p&gt;

&lt;p&gt;Where a single source of time authority is desired, a nice trick is to use the database. A single database is shared across all deployed nodes, so by using the database&amp;rsquo;s &lt;code&gt;now()&lt;/code&gt; function instead of &lt;code&gt;time.Now()&lt;/code&gt; in code, we can expect perfect consistency across all created records.&lt;/p&gt;

&lt;p&gt;But a downside of this approach is that it makes time hard to stub because Postgres&amp;rsquo; time is hard to stub. Stubbing time is often a necessity in tests and not being able to do so is a deal breaker.&lt;/p&gt;

&lt;p&gt;We&amp;rsquo;ve been using a hybrid approach with some success. A call to &lt;code&gt;coalesce&lt;/code&gt; prefers an injected timestamp if there is one, but falls back on &lt;code&gt;now()&lt;/code&gt; most of the time (including in production) to share a clock.&lt;/p&gt;

&lt;h2 id=&quot;sql-sqlc&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#sql-sqlc&quot;&gt;Step 1: SQL + sqlc&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Here&amp;rsquo;s a sample query showing the &lt;code&gt;coalesce&lt;/code&gt; in action. &lt;code&gt;sqlc.narg&lt;/code&gt; defines a parameter as nullable.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- name: QueuePause :execrows
UPDATE queue
SET paused_at = CASE
                WHEN paused_at IS NULL THEN coalesce(
                    sqlc.narg(&#39;now&#39;)::timestamptz,
                    now()
                )
                ELSE paused_at
                END
WHERE name = @name;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In &lt;code&gt;sqlc.yaml&lt;/code&gt;, tell sqlc to emit nullable timestamps as &lt;code&gt;*time.Time&lt;/code&gt; pointers:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;version: &amp;quot;2&amp;quot;
sql:
  - engine: &amp;quot;postgresql&amp;quot;
    queries: ...
    schema: ...
    gen:
      go:
        overrides:
          - db_type: &amp;quot;timestamptz&amp;quot;
            go_type:
              type: &amp;quot;time.Time&amp;quot;
              pointer: true
            nullable: true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Which generates this code:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;const queuePause = `-- name: QueuePause :execrows
UPDATE queue
SET
    paused_at = CASE WHEN paused_at IS NULL THEN coalesce($1::timestamptz, now()) ELSE paused_at END
WHERE name = $2
`

type QueuePauseParams struct {
    Now  *time.Time
    Name string
}

func (q *Queries) QueuePause(ctx context.Context, db DBTX, arg *QueuePauseParams) (int64, error) {
    result, err := db.Exec(ctx, queuePause, arg.Now, arg.Name)
    if err != nil {
        return 0, err
    }
    return result.RowsAffected(), nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;stubbable-time-generator&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#stubbable-time-generator&quot;&gt;Step 2: Stubbable time generator&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Working in Go, define a &lt;code&gt;TimeGenerator&lt;/code&gt; interface:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When unstubbed, it returns the current time from &lt;code&gt;NowUTC()&lt;/code&gt; or &lt;code&gt;nil&lt;/code&gt; from &lt;code&gt;NowUTCOrNil()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;When stubbed, it returns the stubbed time from &lt;code&gt;NowUTC()&lt;/code&gt; or a pointer version of the same from &lt;code&gt;NowUTCOrNil()&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// TimeGenerator generates a current time in UTC. In test
// environments it&#39;s implemented by TimeStub which lets the
// current time be stubbed. Otherwise, it&#39;s implemented as
// UnstubbableTimeGenerator which doesn&#39;t allow stubbing.
type TimeGenerator interface {
    // NowUTC returns the current time. This may be a stubbed
    // time if the time has been actively stubbed in a test.
    NowUTC() time.Time

    // NowUTCOrNil returns the currently stubbed time _if_
    // the current time is stubbed, and returns nil otherwise.
    // This is generally useful in cases where a component may
    // want to use a stubbed time if the time is stubbed, but
    // to fall back to a database time default otherwise.
    NowUTCOrNil() *time.Time
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A stubbable implementation for tests:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type TimeStub struct {
    nowUTC *time.Time
}

func (t *TimeStub) NowUTC() time.Time {
    if t.nowUTC == nil {
        return time.Now().UTC()
    }

    return *t.nowUTC
}

func (t *TimeStub) NowUTCOrNil() *time.Time {
    return t.nowUTC
}

func (t *TimeStub) StubNowUTC(nowUTC time.Time) time.Time {
    t.nowUTC = &amp;amp;nowUTC
    return nowUTC
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;An unstubbable time generator for production:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type UnstubbableTimeGenerator struct{}

func (g *UnstubbableTimeGenerator) NowUTC() time.Time       { return time.Now().UTC() }
func (g *UnstubbableTimeGenerator) NowUTCOrNil() *time.Time { return nil }

func (g *UnstubbableTimeGenerator) StubNowUTC(nowUTC time.Time) time.Time {
    panic(&amp;quot;time not stubbable outside tests&amp;quot;)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;shared-time-generator&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#shared-time-generator&quot;&gt;Step 3: Distributing a shared time generator&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;The next key aspect is that all code needs to share a single instance of &lt;code&gt;TimeGenerator&lt;/code&gt; so that when it&amp;rsquo;s stubbed from a test, all services and subservices get the same stubbed value.&lt;/p&gt;

&lt;p&gt;We put a &lt;code&gt;TimeGenerator&lt;/code&gt; on a base service archetype that&amp;rsquo;s automatically injected from top-level services to subservices:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func (c *Client[TTx]) QueuePauseTx(ctx context.Context, tx TTx, name string, opts *QueuePauseOpts) error {
    executorTx := c.driver.UnwrapExecutor(tx)

    if err := executorTx.QueuePause(ctx, &amp;amp;QueuePauseParams{
        Name:   name,
        Now:    c.baseService.Time.NowUTCOrNil(), // &amp;lt;-- accessed here
        Schema: c.config.Schema,
    }); err != nil {
        return err
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;By default, it&amp;rsquo;s instantiated as &lt;code&gt;UnstubbableTimeGenerator&lt;/code&gt;. From tests, it&amp;rsquo;s a &lt;code&gt;TimeStub&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func BaseServiceArchetype(tb testing.TB) *baseservice.Archetype {
    tb.Helper()

    return &amp;amp;baseservice.Archetype{
        Logger: Logger(tb),
        Time:   &amp;amp;TimeStub{},
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In a test, time is stubbed like:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;stubbedNow := client.baseService.Time.StubNowUTC(time.Now().UTC())
&lt;/code&gt;&lt;/pre&gt;
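
&lt;p&gt;To see how the pieces fit together, here is a minimal, self-contained sketch. The &lt;code&gt;coalesceNow&lt;/code&gt; helper and the fixed &lt;code&gt;databaseNow&lt;/code&gt; value are hypothetical stand-ins for the SQL &lt;code&gt;coalesce&lt;/code&gt; and for Postgres&amp;rsquo; &lt;code&gt;now()&lt;/code&gt;. Neither is part of the real code, and the stub is trimmed to the two methods the sketch needs:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "fmt"
    "time"
)

// TimeStub is a trimmed copy of the test stub shown earlier.
type TimeStub struct {
    nowUTC *time.Time
}

func (t *TimeStub) NowUTCOrNil() *time.Time { return t.nowUTC }

func (t *TimeStub) StubNowUTC(nowUTC time.Time) time.Time {
    stubbed := new(time.Time) // same effect as taking the address of nowUTC
    *stubbed = nowUTC
    t.nowUTC = stubbed
    return nowUTC
}

// coalesceNow is a hypothetical Go mirror of coalesce(injected, now()):
// prefer the injected timestamp, otherwise fall back to the database clock.
func coalesceNow(injected *time.Time, databaseNow time.Time) time.Time {
    if injected != nil {
        return *injected
    }
    return databaseNow
}

func main() {
    // Stand-in for what Postgres would return from now().
    databaseNow := time.Date(2025, 6, 29, 17, 48, 16, 0, time.UTC)

    var stub TimeStub

    // Unstubbed: NowUTCOrNil is nil, so the database clock wins.
    fmt.Println(coalesceNow(stub.NowUTCOrNil(), databaseNow).Equal(databaseNow))

    // Stubbed: the injected time wins over the database clock.
    stubbedNow := stub.StubNowUTC(time.Date(2030, 1, 1, 0, 0, 0, 0, time.UTC))
    fmt.Println(coalesceNow(stub.NowUTCOrNil(), databaseNow).Equal(stubbedNow))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both lines should print &lt;code&gt;true&lt;/code&gt;: production code passes &lt;code&gt;nil&lt;/code&gt; and shares the database clock, while a test that stubs time gets exactly the value it injected.&lt;/p&gt;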

&lt;h2 id=&quot;loose-conviction&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#loose-conviction&quot;&gt;Loose conviction&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Consider this one a loose recommendation. It&amp;rsquo;s useful in some situations where timestamp consistency is critically important, but not in others where it isn&amp;rsquo;t. Server clocks tend to be pretty good nowadays, and it&amp;rsquo;s a lot of code to avoid a few tens of microseconds worth of drift.&lt;/p&gt;

&lt;p&gt;Also, consider that there might be a downside to using the database clock. In Postgres, &lt;code&gt;CURRENT_TIMESTAMP&lt;/code&gt; and &lt;code&gt;now()&lt;/code&gt; both represent the time &lt;em&gt;at the start of the current transaction&lt;/em&gt; rather than the actual wall-clock time (&lt;code&gt;clock_timestamp()&lt;/code&gt; returns the latter). This might be a benefit as all records created during a transaction are assigned the same created time, but it&amp;rsquo;s just as often undesirable because depending on the duration of the transaction, timestamps can be wildly unrepresentative of when things actually happened.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Schoenberg: The MIDI Esoteric Programming Language</title>
    <link href="https://tomeraberba.ch/schoenberg"/>
    <id>https://tomeraberba.ch/schoenberg</id>
    <updated>2025-06-23T00:00:00Z</updated>


    <content type="html">Schoenberg is an esoteric programming language where programs are written as MIDI files. A MIDI file is basically digital sheet music that tells a computer which notes to play when and how loudly. The…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Testing the graceful handling of request cancellation in Go, 499s</title>
    <link href="https://brandur.org/fragments/testing-request-cancellation"/>
    <id>tag:brandur.org,2025-06-20:fragments/testing-request-cancellation</id>
    <updated>2025-06-19T22:16:09Z</updated>


    <content type="html">&lt;p&gt;We had a situation a few days ago where a lazy loading problem in our Ruby code led to long running requests that our Dashboard, with an optimistic five second deadline on backend requests, was timing out. This raised a question in Slack: if our frontend does time out a backend request, does the request keep running? Or does the API know how to save resources by abandoning it midway through?&lt;/p&gt;

&lt;p&gt;If the API stack&amp;rsquo;s being bombarded by expensive requests that are largely being canceled early, it&amp;rsquo;s a huge optimization to make sure that they only use the resources that they absolutely need to. Requests discarded early stop executing immediately and no further effort is put toward servicing them.&lt;/p&gt;

&lt;p&gt;In most code I&amp;rsquo;ve ever worked in, I could quite confidently answer the question above with a definitive and resounding &amp;ldquo;no&amp;rdquo;. Doing a good job of request cancellation requires it be baked quite deeply into language and low level libraries, which isn&amp;rsquo;t common. And even when those handle it well, userland code usually doesn&amp;rsquo;t. Also, cancelling a request midway in services that don&amp;rsquo;t use transactions would be unacceptably dangerous &amp;ndash; &lt;a href=&quot;/acid#atomicity&quot;&gt;mutated state would be left mutated&lt;/a&gt;, and that&amp;rsquo;d cause untold trouble later on.&lt;/p&gt;

&lt;h2 id=&quot;go-cancellation&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#go-cancellation&quot;&gt;Cancellation in Go&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;But in a Go stack, the built-in HTTP server &lt;a href=&quot;https://pkg.go.dev/net/http#Request.Context&quot;&gt;should handle cancellations using context&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For incoming server requests, the context is canceled when the client&amp;rsquo;s connection closes, the request is canceled (with HTTP/2), or when the ServeHTTP method returns.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And with our code being widely safeguarded by transactions, the feature should even be safe to use!&lt;/p&gt;

&lt;h2 id=&quot;prove-it&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#prove-it&quot;&gt;Now prove it&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Theory is one thing, but reality is another. If request cancellations indeed work, we should be able to prove it, so I set up a little bootstrap in pursuit of that. To make testing easy, add an artificial API endpoint waiting on sleep or context finished:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;select {
case &amp;lt;-time.After(5 * time.Second):
case &amp;lt;-ctx.Done():
        return nil, ctx.Err()
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Start the API server. Then from another terminal, run cURL and interrupt it after a few seconds:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ curl -i http://localhost:5222/sleep
^C
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I found that we were handling canceled requests reasonably well, but that the error we were logging wasn&amp;rsquo;t right. The code was checking context cancellation, but confusing a context canceled by the HTTP server with one canceled by our built-in timeout middleware, improperly sending a &lt;code&gt;408 Request timeout&lt;/code&gt; to logs.&lt;/p&gt;

&lt;h2 id=&quot;local-vs-request&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#local-vs-request&quot;&gt;Local vs. request context&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;After a little refactoring, I ended up with this code:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func (e *APIEndpoint[TReq, TResp]) Execute(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()

    // Add a default timeout for all API requests to ensure there&#39;s
    // always a backstop in case of degenerate behavior. Rescued
    // below and turned into a more user-friendly error.
    ctx, cancel := context.WithTimeout(ctx, RequestTimeout)
    defer cancel()
    
    ...
    
    ret, err := e.serviceHandler(ctx, req)
    if err != nil {
        if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
            // Distinct error message when the request itself was
            // canceled above the API stack versus we had a
            // cancellation/timeout occur within the API endpoint.
            if r.Context().Err() != nil {
                // This is a non-standard status code (499), but
                // fairly widespread because Nginx defined it.
                err = apierror.NewClientClosedRequestError(ctx, errMessageRequestCanceled).WithSpecifics(ctx, err)
            } else {
                err = apierror.NewRequestTimeoutError(ctx, errMessageRequestTimeout).WithSpecifics(ctx, err)
            }
        }

        WriteError(ctx, w, err)
        return
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Should a context error occur, we return a &lt;code&gt;408 Request timeout&lt;/code&gt; in case of a timeout on local &lt;code&gt;ctx&lt;/code&gt;, but a &lt;code&gt;499 Client closed request&lt;/code&gt; if context was canceled upstream by the HTTP server canceling &lt;code&gt;r.Context()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;499&lt;/code&gt; isn&amp;rsquo;t a real status code, but rather one invented by Nginx which happens to be useful here. It doesn&amp;rsquo;t really matter what status code we use because the end user (who canceled the request before the status code returned) will never see it. It&amp;rsquo;s purely for our own logging and telemetry.&lt;/p&gt;

&lt;p&gt;Looking at local logs running the sleep/cancel routine, I now see this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;canonical_api_line GET /sleep -&amp;gt; 499 (4.162702459s)
    api_error_cause=&amp;quot;context canceled&amp;quot;
    api_error_internal_code=client_closed_request
    api_error_message=&amp;quot;Context of incoming request canceled; API endpoint stopped executing.&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
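
&lt;p&gt;The 408 versus 499 decision can be isolated into a small runnable sketch. &lt;code&gt;statusForContextError&lt;/code&gt; is a hypothetical helper written for illustration rather than part of our API framework, and it returns bare status codes instead of our &lt;code&gt;apierror&lt;/code&gt; types:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "context"
    "errors"
    "fmt"
    "net/http"
    "time"
)

// statusClientClosedRequest is the non-standard 499 code defined by Nginx.
const statusClientClosedRequest = 499

// statusForContextError is a hypothetical helper: given the original request
// context and an error from the handler, it picks 499 when the request
// context itself is done and 408 when only the local timeout fired.
func statusForContextError(requestCtx context.Context, err error) int {
    isContextErr := errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
    if !isContextErr {
        return http.StatusInternalServerError
    }
    if requestCtx.Err() != nil {
        return statusClientClosedRequest
    }
    return http.StatusRequestTimeout
}

func main() {
    // Case 1: the client went away, so the request context is canceled and
    // the cancellation propagates to the derived local context.
    requestCtx, cancelRequest := context.WithCancel(context.Background())
    localCtx, cancelLocal := context.WithTimeout(requestCtx, time.Minute)
    defer cancelLocal()
    cancelRequest()
    fmt.Println(statusForContextError(requestCtx, localCtx.Err()))

    // Case 2: the request context is still alive, but the local deadline
    // has already passed.
    liveCtx := context.Background()
    expiredCtx, cancelExpired := context.WithDeadline(liveCtx, time.Now().Add(-time.Second))
    defer cancelExpired()
    fmt.Println(statusForContextError(liveCtx, expiredCtx.Err()))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first case should print &lt;code&gt;499&lt;/code&gt; and the second &lt;code&gt;408&lt;/code&gt;. The key is checking the original request context rather than the derived one, since the derived context reports an error in both cases.&lt;/p&gt;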

&lt;h3 id=&quot;generalizing-cancellation&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#generalizing-cancellation&quot;&gt;Generalizing cancellation handling&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Although our demo uses an artificial sleep statement, importantly this still works for any normal requests. Our code isn&amp;rsquo;t littered with &lt;code&gt;&amp;lt;-ctx.Done()&lt;/code&gt; checks all over the place, but it does have a great many database operations like this one:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;account, err := dbsqlc.New().AccountTouchLastSeenAt(ctx, e, apiKey.AccountID)
if err != nil {
    return nil, xerrors.Errorf(&amp;quot;error looking up account: %w&amp;quot;, err)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These call into Sqlc, which calls into Pgx, and Pgx detects a canceled context and sends back an error. In the event of a canceled request, the first database operation will come back with an error that&amp;rsquo;ll bubble back up the stack to our API endpoint infrastructure. There it&amp;rsquo;ll be turned into a &lt;code&gt;499&lt;/code&gt;. Subsequent database operations won&amp;rsquo;t run, saving time and resources.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// API service handler error handling. Repeated from above.
ret, err := e.serviceHandler(ctx, req)
if err != nil {
    if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
        // Distinct error message when the request itself was
        // canceled above the API stack versus we had a
        // cancellation/timeout occur within the API endpoint.
        if r.Context().Err() != nil {
            // This is a non-standard status code (499), but
            // fairly widespread because Nginx defined it.
            err = apierror.NewClientClosedRequestError(ctx, errMessageRequestCanceled).WithSpecifics(ctx, err)
        } else {
            err = apierror.NewRequestTimeoutError(ctx, errMessageRequestTimeout).WithSpecifics(ctx, err)
        }
    }

    WriteError(ctx, w, err)
    return
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Pgx is one example of a library that&amp;rsquo;ll check context cancellation, but the same check generally occurs in any low level library that&amp;rsquo;s doing I/O. As another example, SDKs like AWS or Stripe will usually go through &lt;code&gt;net/http&lt;/code&gt;, which will catch cancellations.&lt;/p&gt;
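
&lt;p&gt;A quick way to see the &lt;code&gt;net/http&lt;/code&gt; behavior is to hand its client an already canceled context. This is only a sketch: &lt;code&gt;fetch&lt;/code&gt; is a hypothetical helper, and the URL is borrowed from the cURL example above. As far as I can tell nothing needs to be listening there, because the transport checks the context before it dials:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "context"
    "errors"
    "fmt"
    "net/http"
)

// fetch is a hypothetical helper that issues a GET bound to ctx and returns
// any error produced by the HTTP client.
func fetch(ctx context.Context, url string) error {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return err
    }
    return resp.Body.Close()
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    cancel() // simulate the client disconnecting before the outbound call

    err := fetch(ctx, "http://localhost:5222/sleep")

    // The client wraps the error, so use errors.Is to find the cause.
    fmt.Println(errors.Is(err, context.Canceled))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This should print &lt;code&gt;true&lt;/code&gt;, which is the same signal our endpoint infrastructure looks for when deciding to emit a &lt;code&gt;499&lt;/code&gt;.&lt;/p&gt;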

&lt;p&gt;With code exercised (and adequate new testing in place), I was confident returning to Slack and declaring that &amp;ldquo;yes&amp;rdquo;, request cancellation is handled smoothly. I can&amp;rsquo;t say the same about our Ruby code, but that&amp;rsquo;s an adventure for another day.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Be careful with Dropbox</title>
    <link href="https://brandur.org/fragments/careful-with-dropbox"/>
    <id>tag:brandur.org,2025-05-26:fragments/careful-with-dropbox</id>
    <updated>2025-05-26T06:28:04Z</updated>


    <content type="html">&lt;p&gt;I&amp;rsquo;ve been a Dropbox user going on fifteen years now. It&amp;rsquo;s one of the most frustrating products in my arsenal because fifteen years ago it was &lt;em&gt;perfect&lt;/em&gt;, but every new release just makes it a little bit worse than it was before. It&amp;rsquo;s still fine to use, but you can see the writing on the wall as the long term trend is all in the wrong direction.&lt;/p&gt;

&lt;p&gt;Despite that, I previously would&amp;rsquo;ve lavished it with praise in that I&amp;rsquo;ve never once had trouble with data loss or data integrity. Despite increasing feature bloat, it did what it was supposed to, syncing files to the right places, and doing so &lt;em&gt;reliably&lt;/em&gt;, which is pretty much all I need out of it.&lt;/p&gt;

&lt;p&gt;That ended Friday, when I was installing Dropbox on a new laptop. My Dropbox size runs ~500 GB, so when credentialing a new machine, I copy it from another computer on the network for speed, and to conserve precious bandwidth &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rsync &lt;code&gt;~/Dropbox&lt;/code&gt; from an existing computer to the new one.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;brew install dropbox&lt;/code&gt;. Open it, log in, close it.&lt;/li&gt;
&lt;li&gt;Replace the contents of &lt;code&gt;~/Dropbox&lt;/code&gt; with the &lt;code&gt;rsync&lt;/code&gt;ed copy.&lt;/li&gt;
&lt;li&gt;Open Dropbox, let it sync against the new data. It should find everything it needs already there.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dropbox made a change in the last couple years wherein they moved the standard &lt;code&gt;~/Dropbox&lt;/code&gt; on Mac to a new &lt;code&gt;~/Library/CloudStorage/Dropbox&lt;/code&gt; location. I now know that folders in this directory are meant for use with Apple&amp;rsquo;s &lt;a href=&quot;https://developer.apple.com/documentation/fileprovider/&quot;&gt;File Provider API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Apparently the change had been introduced for macOS Ventura (two major versions ago), but there must&amp;rsquo;ve been an incremental roll out because I set up a computer last year and didn&amp;rsquo;t run into it then. Once you&amp;rsquo;ve been opted into the feature, you cannot opt out. Changing back to &lt;code&gt;~/Dropbox&lt;/code&gt; is not an option.&lt;/p&gt;

&lt;p&gt;Normally I do a wholesale swap of &lt;code&gt;~/Dropbox&lt;/code&gt; with my locally copied version, but seeing this new magic folder in &lt;code&gt;~/Library&lt;/code&gt;, I worried there&amp;rsquo;d be some irreversible effect if I did it the normal way. Instead, I closed Dropbox, &lt;code&gt;cd&lt;/code&gt;ed into the folder to &lt;code&gt;rm&lt;/code&gt; all the files acting as cloud &amp;ldquo;stubs&amp;rdquo;, intending to replace them with materialized versions from my local copy.&lt;/p&gt;

&lt;p&gt;What a mistake. I dumbly assumed that with Dropbox closed, any changes I made to the folder would be safe, just like they were in every previous version of Dropbox. Not so. At all.&lt;/p&gt;

&lt;p&gt;I got suspicious after about ten seconds. Normally an &lt;code&gt;rm&lt;/code&gt; even on gigantic directories is near instant, but this one was running long. I &lt;code&gt;SIGINT&lt;/code&gt;ed it, but the damage was done.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m sure you guessed what happened already. &lt;code&gt;~/Library/CloudStorage&lt;/code&gt; is a magic location, and folders in it use macOS extension voodoo to make arbitrary changes in a cloud storage API. Despite Dropbox not being open, it&amp;rsquo;d used a Mac API to intercept the &lt;code&gt;rm&lt;/code&gt; and started to remove everything. My other computers had already synced the deletions. 100s of GBs gone in seconds.&lt;/p&gt;

&lt;p&gt;Dropbox has a good &amp;ldquo;undelete&amp;rdquo; function, so I was able to log into their web UI and recover all the deleted files, but I was left with the problem of all my other computers having purged their local contents, with potentially 100s of GBs on each needing to be synced back down (and I thought I was &lt;em&gt;saving&lt;/em&gt; bandwidth when I started doing this). Worse yet, Dropbox puts any files it deletes into a &lt;code&gt;~/Dropbox/.dropbox.cache&lt;/code&gt; directory, but can&amp;rsquo;t reuse any of that data when files are recovered, so it just makes a copy. Dropbox doesn&amp;rsquo;t purge its cache often, even if disk space gets critically low, so every computer potentially needed 2 * 500 GB =~ 1 TB of free space for the full recovery, which they didn&amp;rsquo;t have.&lt;/p&gt;

&lt;p&gt;Two days later, I got everything back to where it should be, but all I could think afterwards was what a stupid, unforced error this all was. A mandatory move to &lt;code&gt;~/Library/CloudStorage/Dropbox&lt;/code&gt;/File Provider API has no marginal utility for the user &lt;sup id=&quot;footnote-2-source&quot;&gt;&lt;a href=&quot;#footnote-2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, even if it makes product managers at the company feel good about themselves.&lt;/p&gt;

&lt;p&gt;Being particularly incensed at this moment, I started looking into alternatives immediately. There are dozens out there, but my approximate evaluation is that there isn&amp;rsquo;t one that&amp;rsquo;s a crystal clear, unambiguous win that I&amp;rsquo;d be excited about doing the work to switch over to, which is too bad.&lt;/p&gt;

&lt;p&gt;What I&amp;rsquo;m really looking for is Dropbox circa 2011. The one without the gratuitous/dangerous product changes, without an Electron app, and without the nags to upgrade my account which I already pay $120/year for.&lt;/p&gt;

&lt;p&gt;Anyway, I doubt most users will run into this one as it was a confluence of stupid things that led me down this path, but I&amp;rsquo;d just caution like the title says: be careful with Dropbox. Don&amp;rsquo;t &lt;code&gt;rm&lt;/code&gt; too much. Don&amp;rsquo;t assume intuitive cause and effect. Don&amp;rsquo;t assume operations are safe even if the app is closed.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Migrating My Servers (and me!) to NYC</title>
    <link href="https://www.scd31.com/posts/migrating-servers-to-nyc"/>
    <id>https://www.scd31.com/posts/migrating-servers-to-nyc</id>
    <updated>2025-05-19T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>My Second M68k SBC</title>
    <link href="https://www.scd31.com/posts/another-m68k-sbc"/>
    <id>https://www.scd31.com/posts/another-m68k-sbc</id>
    <updated>2025-04-12T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Toasting QR Codes onto Tortillas</title>
    <link href="https://www.scd31.com/posts/qr-tortillas"/>
    <id>https://www.scd31.com/posts/qr-tortillas</id>
    <updated>2025-04-06T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>My Time at The Recurse Center</title>
    <link href="https://www.scd31.com/posts/my-time-at-rc"/>
    <id>https://www.scd31.com/posts/my-time-at-rc</id>
    <updated>2025-04-01T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Optimizing JPEGs with MozJPEG for local archival</title>
    <link href="https://brandur.org/fragments/optimizing-jpegs-for-archival"/>
    <id>tag:brandur.org,2025-03-29:fragments/optimizing-jpegs-for-archival</id>
    <updated>2025-03-29T19:35:10Z</updated>


    <content type="html">&lt;p&gt;Call me old fashioned, but I like to keep my photo collection as local files on disk rather than symbolic pointers in the cloud, or sent off to deep storage on large archival drives, neither of which I&amp;rsquo;m likely to ever look at again. It&amp;rsquo;s nice having quick access to them that still works over a bad internet link or on an airplane.&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s a great system, but it&amp;rsquo;s been getting more difficult as time goes by. My photo collection grows year by year, but Apple&amp;rsquo;s hard drive sizes stay frozen circa 2012. I&amp;rsquo;m running the same 1 TB drive that I was five years ago, which is only incrementally larger than five years before that (and even the miserly 1 TB is still a $200 upcharge over the default &lt;em&gt;512 GB&lt;/em&gt; that&amp;rsquo;s somehow a thing that Apple sells in 2025).&lt;/p&gt;

&lt;p&gt;Realistically, I know that I&amp;rsquo;ll never look at the majority of these photos again, so I already prune the collections aggressively to keep only the highlights, but was looking for storage opportunities beyond that. Years ago I wrote about &lt;a href=&quot;/fragments/libjpeg-mozjpeg&quot;&gt;optimizing JPEGs for this site using MozJPEG&lt;/a&gt;, and knowing that a lot of cameras produce suboptimally compressed JPEGs, realized there was a similar opportunity for archival.&lt;/p&gt;

&lt;p&gt;I ended up writing a wrapper around MozJPEG that saves about 80% of space compared to what comes out of my camera. Here&amp;rsquo;s a sample run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
    $ optimize 001-ana-nuevo/*
    created: 001-ana-nuevo/2W4A6210.jpg (9.02MB -&amp;gt; 2.11MB / saved 77%)
    created: 001-ana-nuevo/2W4A6212.jpg (8.21MB -&amp;gt; 1.79MB / saved 78%)
    created: 001-ana-nuevo/2W4A6216.jpg (11.0MB -&amp;gt; 2.68MB / saved 76%)
    created: 001-ana-nuevo/2W4A6218.jpg (6.36MB -&amp;gt; 1.29MB / saved 80%)
    created: 001-ana-nuevo/2W4A6219.jpg (12.11MB -&amp;gt; 3.01MB / saved 75%)
    created: 001-ana-nuevo/2W4A6224.jpg (7.3MB -&amp;gt; 1.69MB / saved 77%)
    created: 001-ana-nuevo/2W4A6228.jpg (7.75MB -&amp;gt; 1.72MB / saved 78%)
    created: 001-ana-nuevo/2W4A6230.jpg (8.62MB -&amp;gt; 1.99MB / saved 77%)
    created: 001-ana-nuevo/2W4A6236.jpg (8.14MB -&amp;gt; 1.87MB / saved 77%)
    created: 001-ana-nuevo/2W4A6237.jpg (6.65MB -&amp;gt; 1.48MB / saved 78%)
    created: 001-ana-nuevo/2W4A6238.jpg (7.59MB -&amp;gt; 1.69MB / saved 78%)
    created: 001-ana-nuevo/2W4A6240.jpg (9.38MB -&amp;gt; 2.21MB / saved 76%)
    created: 001-ana-nuevo/2W4A6242.jpg (9.26MB -&amp;gt; 2.22MB / saved 76%)
    created: 001-ana-nuevo/2W4A6243.jpg (10.17MB -&amp;gt; 2.44MB / saved 76%)
    created: 001-ana-nuevo/2W4A6247.jpg (10.49MB -&amp;gt; 2.56MB / saved 76%)
    created: 001-ana-nuevo/2W4A6251.jpg (7.92MB -&amp;gt; 1.84MB / saved 77%)
    created: 001-ana-nuevo/2W4A6252.jpg (8.97MB -&amp;gt; 2.12MB / saved 76%)
    created: 001-ana-nuevo/2W4A6253.jpg (7.74MB -&amp;gt; 1.75MB / saved 77%)
    created: 001-ana-nuevo/2W4A6254.jpg (9.43MB -&amp;gt; 2.3MB / saved 76%)
    created: 001-ana-nuevo/2W4A6255.jpg (10.78MB -&amp;gt; 2.65MB / saved 75%)
    created: 001-ana-nuevo/2W4A6258-pups.jpg (9.13MB -&amp;gt; 2.22MB / saved 76%)
    created: 001-ana-nuevo/2W4A6259.jpg (10.46MB -&amp;gt; 2.55MB / saved 76%)
    created: 001-ana-nuevo/2W4A6260.jpg (8.54MB -&amp;gt; 2.04MB / saved 76%)
    created: 001-ana-nuevo/2W4A6262.jpg (10.3MB -&amp;gt; 2.59MB / saved 75%)
    created: 001-ana-nuevo/2W4A6266.jpg (8.81MB -&amp;gt; 2.19MB / saved 75%)
    created: 001-ana-nuevo/2W4A6267.jpg (9.64MB -&amp;gt; 2.31MB / saved 76%)
    created: 001-ana-nuevo/2W4A6268.jpg (9.83MB -&amp;gt; 2.33MB / saved 76%)
    created: 001-ana-nuevo/2W4A6269.jpg (8.93MB -&amp;gt; 2.14MB / saved 76%)
    created: 001-ana-nuevo/2W4A6271.jpg (7.38MB -&amp;gt; 1.74MB / saved 76%)
    created: 001-ana-nuevo/2W4A6272.jpg (7.19MB -&amp;gt; 1.68MB / saved 77%)
    created: 001-ana-nuevo/2W4A6283-water-fight.jpg (7.65MB -&amp;gt; 1.73MB / saved 77%)
    created: 001-ana-nuevo/2W4A6284.jpg (8.02MB -&amp;gt; 1.77MB / saved 78%)
    created: 001-ana-nuevo/2W4A6286.jpg (5.82MB -&amp;gt; 1.11MB / saved 81%)
    created: 001-ana-nuevo/2W4A6287.jpg (6.03MB -&amp;gt; 1.14MB / saved 81%)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I&amp;rsquo;m sure there&amp;rsquo;s some subtle downside to the extra compression, but I&amp;rsquo;ve tried zooming all the way in on a couple samples before and after, and I can see differences right at the pixel level, but the optimized version isn&amp;rsquo;t clearly worse to my eye.&lt;/p&gt;

&lt;p&gt;My script&amp;rsquo;s use-at-your-own-risk me-ware that I&amp;rsquo;m not publishing in any official sense, but &lt;a href=&quot;https://gist.github.com/brandur/8a7a7c7870fce52bcf1ac0c34d66af30&quot;&gt;here it is for reference&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some gotchas I ran into and which might save someone else time/trouble:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The MozJPEG binary to compress JPEGs is called &lt;code&gt;cjpeg&lt;/code&gt;. This is an old Linux style project, and naming the binary after the project would make things too easy and too obvious for users. Under the strict edicts of 1970s Unix philosophy, that&amp;rsquo;s completely unacceptable.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;You might have multiple packages on your system providing &lt;code&gt;cjpeg&lt;/code&gt;. Make sure you&amp;rsquo;re using MozJPEG&amp;rsquo;s because it offers much better compression than libjpeg or libjpeg-turbo. You can see here that my default &lt;code&gt;cjpeg&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; MozJPEG&amp;rsquo;s:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ which cjpeg
/opt/homebrew/bin/cjpeg

$ ls -l /opt/homebrew/bin/cjpeg
lrwxr-xr-x@ 1 brandur  admin    36B Feb 10 11:45 /opt/homebrew/bin/cjpeg -&amp;gt; ../Cellar/jpeg-turbo/3.1.0/bin/cjpeg
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The original libjpeg &lt;code&gt;cjpeg&lt;/code&gt; didn&amp;rsquo;t support &lt;em&gt;reading&lt;/em&gt; JPEGs, only writing them, and would encourage you to read JPEGs with another binary called &lt;code&gt;djpeg&lt;/code&gt; and pipe that into &lt;code&gt;cjpeg&lt;/code&gt; (again, the wonders of Unix philosophy). You can do that with MozJPEG too, but DO NOT DO THAT! Piping will strip EXIF data, which &lt;a href=&quot;/fragments/stop-stripping-exif&quot;&gt;you shouldn&amp;rsquo;t do&lt;/a&gt;. Unlike libjpeg&amp;rsquo;s version, MozJPEG&amp;rsquo;s &lt;code&gt;cjpeg&lt;/code&gt; does read JPEGs, so piping is not necessary.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;If you&amp;rsquo;re writing to a new file and then replacing the original after (which you probably should for safety), make sure to copy the original create/modify timestamps to the new file. The easiest way to do this is with &lt;code&gt;touch -r &amp;lt;original&amp;gt; &amp;lt;new&amp;gt;&lt;/code&gt;:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;TOUCH(1)						      General Commands Manual							  TOUCH(1)

NAME
     touch – change file access and modification times

SYNOPSIS
     touch [-A [-][[hh]mm]SS] [-achm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]] [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] file ...

DESCRIPTION
     The touch utility sets the modification and access times of
     files. If any file does not exist, it is created with default
     permissions.

     By default, touch changes both modification and access times.
     The -a and -m flags may be used to select the access time or
     the modification time individually.  Selecting both is
     equivalent to the default.  By default, the timestamps are set
     to the current time. The -d and -t flags explicitly specify a
     different time, and the -r flag specifies to set the times
     those of the specified file.  The -A flag adjusts the values
     by a specified amount.

     The following options are available:

     ...

     -r      Use the access and modifications times from the
             specified file instead of the current time of day.
&lt;/code&gt;&lt;/pre&gt;
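
&lt;p&gt;If the wrapper happens to be written in Go and shelling out to &lt;code&gt;touch&lt;/code&gt; isn&amp;rsquo;t appealing, a rough equivalent might look like the following. It&amp;rsquo;s only an illustrative sketch: &lt;code&gt;copyModTime&lt;/code&gt; is a made-up name, and because Go&amp;rsquo;s &lt;code&gt;os.FileInfo&lt;/code&gt; only exposes modification time portably, it assumes it&amp;rsquo;s acceptable to set the access time to the same value:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &quot;fmt&quot;
    &quot;os&quot;
)

// copyModTime is a hypothetical helper that stamps dst with src&#39;s
// modification time. os.FileInfo has no portable access time, so the
// modification time is used for both arguments to os.Chtimes.
func copyModTime(src, dst string) error {
    info, err := os.Stat(src)
    if err != nil {
        return fmt.Errorf(&quot;error statting %q: %w&quot;, src, err)
    }

    return os.Chtimes(dst, info.ModTime(), info.ModTime())
}

func main() {
    if len(os.Args) != 3 {
        fmt.Fprintln(os.Stderr, &quot;usage: copymodtime ORIGINAL NEW&quot;)
        os.Exit(1)
    }

    if err := copyModTime(os.Args[1], os.Args[2]); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}
&lt;/code&gt;&lt;/pre&gt;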

&lt;p&gt;Another approach would be to do away with JPEG completely and go to HEIC or WebP, but I&amp;rsquo;m still finding support for those a little spotty, and navigating them in a file browser feels slow because the heavier compression takes longer to decode and render. I&amp;rsquo;ll check in on that again in a year or two.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>The right way to do data fixtures in Go</title>
    <link href="https://brandur.org/fragments/go-data-fixtures"/>
    <id>tag:brandur.org,2025-03-20:fragments/go-data-fixtures</id>
    <updated>2025-03-20T15:56:52Z</updated>


    <content type="html">&lt;p&gt;Every test suite should start early in building a strong convention to generate data fixtures. If it doesn&amp;rsquo;t, data fixtures will still emerge (they&amp;rsquo;re that necessary), but in a way that&amp;rsquo;s poorly designed, with no API (or a poorly designed one), and not standardized.&lt;/p&gt;

&lt;p&gt;Other languages tend to have common libraries for fixture generation. As it often does, Go goes its own way and doesn&amp;rsquo;t have a ubiquitous fixtures package, but especially when combining sqlc and &lt;a href=&quot;https://github.com/go-playground/validator&quot;&gt;validator&lt;/a&gt;, it does well without one.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s one of our project&amp;rsquo;s 130 fixtures:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package dbfactory

type MultiFactorOpts struct {
    ID          *uuid.UUID              `validate:&amp;quot;-&amp;quot;`
    AccountID   uuid.UUID               `validate:&amp;quot;required&amp;quot;`
    ActivatedAt *time.Time              `validate:&amp;quot;-&amp;quot;`
    ExpiresAt   *time.Time              `validate:&amp;quot;-&amp;quot;`
    Kind        *dbsqlc.MultiFactorKind `validate:&amp;quot;-&amp;quot;`
}

func MultiFactor(ctx context.Context, t *testing.T, e db.Executor, opts *MultiFactorOpts) *dbsqlc.MultiFactor {
    t.Helper()

    validateOpts(t, opts)

    var (
        num          = nextNumSeq()
        numFormatted = formatNumSeq(num)
    )

    multiFactor, err := dbsqlc.New().MultiFactorInsert(ctx, e, dbsqlc.MultiFactorInsertParams{
        ID:          ptrutil.ValOrDefaultFunc(opts.ID, func() uuid.UUID { return ptesting.ULID(ctx).New() }),
        AccountID:   opts.AccountID,
        ActivatedAt: ptrutil.TimeSQLNull(opts.ActivatedAt),
        ExpiresAt:   ptrutil.TimeSQLNull(opts.ExpiresAt),
        Kind:        string(ptrutil.ValOrDefault(opts.Kind, dbsqlc.MultiFactorKindTOTP)),
        Name:        fmt.Sprintf(&amp;quot;%s no. %s&amp;quot;, ptrutil.ValOrDefault(opts.Kind, dbsqlc.MultiFactorKindTOTP), numFormatted),
    })
    require.NoError(t, err)

    return multiFactor
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The minimum viable use of the fixture needs only &lt;code&gt;AccountID&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;mf := dbfactory.MultiFactor(ctx, t, tx, &amp;amp;dbfactory.MultiFactorOpts{
    AccountID: account.ID,
})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But all salient properties are settable, so a more elaborate use just involves sending more overrides:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;expiredMF := dbfactory.MultiFactor(ctx, t, bundle.tx, &amp;amp;dbfactory.MultiFactorOpts{
    AccountID: account.ID,
    ExpiresAt: ptrutil.Ptr(time.Now().Add(-5 * time.Minute)),
    Kind:      ptrutil.Ptr(dbsqlc.MultiFactorKindWebAuthn),
})
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;observations&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#observations&quot;&gt;Observations&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;A few aspects worth calling out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Under the principle of not mocking the database, fixtures are real live data records. They&amp;rsquo;re queryable using the full expressiveness of SQL, are valid according to the schema&amp;rsquo;s data types/checks/triggers, and satisfy foreign keys.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Fixtures never return an error, instead failing their input &lt;code&gt;t&lt;/code&gt; so that generating a fixture is a one liner for the caller and doesn&amp;rsquo;t need an &lt;code&gt;if err != nil { ... }&lt;/code&gt; check.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Inputs are annotated with &lt;a href=&quot;https://github.com/go-playground/validator&quot;&gt;the Go validate framework&lt;/a&gt; to demarcate required versus non-required or more complex validations as needed. This is a godsend because it keeps validations short (zero additional lines instead of a minimum of three for an &lt;code&gt;if&lt;/code&gt; statement) and fast/easy to write.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;As few properties are made &lt;code&gt;validate:&amp;quot;required&amp;quot;&lt;/code&gt; as possible, with non-nullable fields given defaults instead of being marked mandatory for the caller to fill. This makes fixtures easier to use and reduces boilerplate at call sites. For example, &lt;code&gt;name&lt;/code&gt; is a required property on &lt;code&gt;multi_factor&lt;/code&gt; above, but the fixture generates a sane default.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Insert statements are generated with &lt;a href=&quot;/sqlc&quot;&gt;sqlc&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- name: MultiFactorInsert :one
INSERT INTO multi_factor (
    id,
    account_id,
    activated_at,
    expires_at,
    kind,
    name
) VALUES (
    @id,
    @account_id,
    @activated_at,
    @expires_at,
    @kind,
    @name
) RETURNING *;
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We use a lot of custom pointer helpers like &lt;code&gt;ptrutil.TimeSQLNull&lt;/code&gt; (changes a pointer to a &lt;code&gt;sql.NullTime&lt;/code&gt;) and &lt;code&gt;ptrutil.ValOrDefault&lt;/code&gt;. Each one of these changes a ~4 line local variable declaration and &lt;code&gt;if&lt;/code&gt; block to one line that&amp;rsquo;s inlined into the insert. True Go dogmatists won&amp;rsquo;t like this, but it saves dozens of lines per test fixture, and given hundreds of test fixtures, this adds up to thousands of lines saved overall.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Each test case gets its own lazily marshaled monotonic ULID generator based on &lt;code&gt;t&lt;/code&gt;. Separate generators guarantee monotonicity even if some test cases rewind their generators to generate ULIDs at particular times.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
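
&lt;p&gt;For a sense of scale, pointer helpers like these are only a few lines each. The following is a sketch of how they might be written with generics, not our actual &lt;code&gt;ptrutil&lt;/code&gt; package: the signatures are inferred from the call sites above, and the bodies are assumptions.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &quot;database/sql&quot;
    &quot;fmt&quot;
    &quot;time&quot;
)

// Ptr returns a pointer to a copy of v.
func Ptr[T any](v T) *T {
    return &amp;v
}

// ValOrDefault returns the pointed-to value, or defaultVal if ptr is
// nil.
func ValOrDefault[T any](ptr *T, defaultVal T) T {
    if ptr != nil {
        return *ptr
    }
    return defaultVal
}

// ValOrDefaultFunc is like ValOrDefault, but only computes the default
// when it is needed.
func ValOrDefaultFunc[T any](ptr *T, defaultFunc func() T) T {
    if ptr != nil {
        return *ptr
    }
    return defaultFunc()
}

// TimeSQLNull converts a possibly nil time pointer to a sql.NullTime.
func TimeSQLNull(t *time.Time) sql.NullTime {
    if t == nil {
        return sql.NullTime{}
    }
    return sql.NullTime{Time: *t, Valid: true}
}

func main() {
    var kind *string // an option the caller left unset

    fmt.Println(ValOrDefault(kind, &quot;totp&quot;))            // totp
    fmt.Println(ValOrDefault(Ptr(&quot;webauthn&quot;), &quot;totp&quot;)) // webauthn
    fmt.Println(TimeSQLNull(nil).Valid)                // false
}
&lt;/code&gt;&lt;/pre&gt;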

&lt;h2 id=&quot;var-blocks&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#var-blocks&quot;&gt;Organizing with var blocks&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Typically, fixtures are generated together in a &lt;code&gt;var ( ... )&lt;/code&gt; block, keeping tests looking nice and tidy:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;t.Run(&amp;quot;SetNameSSOJoinSCIMError&amp;quot;, func(t *testing.T) {
    t.Parallel()

    bundle, ctx := setup(t)

    var (
        org  = dbfactory.Organization(ctx, t, bundle.tx, &amp;amp;dbfactory.OrganizationOpts{SCIMEnabled: true})
        team = dbfactory.Team(ctx, t, bundle.tx, &amp;amp;dbfactory.TeamOpts{OrganizationID: &amp;amp;org.ID})
        _    = dbfactory.AccessGroupAccount_Admin(ctx, t, bundle.tx, team.ID, bundle.account.ID)
    )

    _, err := pservicetest.InvokeHandler(bundle.svc.Update, ctx, &amp;amp;TeamUpdateRequest{
        Name:   ptrutil.Ptr(&amp;quot;new name&amp;quot;),
        TeamID: eid.EID(team.ID),
    })
    prequire.APIErrorWithMessage(t, &amp;amp;apierror.BadRequestError{}, fmt.Sprintf(errMessageTeamUpdateSCIM, &amp;quot;name&amp;quot;), err)
})
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;standardize-conventions&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#standardize-conventions&quot;&gt;Standardize conventions, even the small ones&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;We have a few helpers that are used in almost every test fixture. These are so trivial that they almost don&amp;rsquo;t need to be extracted into their own functions, but we&amp;rsquo;ve done so to prevent implementations from drifting and keep code maximally succinct.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Formats a number like &amp;quot;000007&amp;quot;. Typically used in conjunction
// with nextNumSeq to make identifiers prettier and so they align
// better.
func formatNumSeq(num int64) string {
    return fmt.Sprintf(&amp;quot;%06d&amp;quot;, num)
}

var numSeq int64

// Gets a unique number that can be used in names, etc. and which
// is more friendly to look at than a UUID.
func nextNumSeq() int64 {
    return atomic.AddInt64(&amp;amp;numSeq, 1)
}

func validateOpts(t *testing.T, opts any) {
    t.Helper()

    err := validate.Struct(opts)
    require.NoError(t, err)
}
&lt;/code&gt;&lt;/pre&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>EXTREME SERVER SIDE RENDERING</title>
    <link href="https://www.scd31.com/posts/extreme-server-side-rendering"/>
    <id>https://www.scd31.com/posts/extreme-server-side-rendering</id>
    <updated>2025-02-16T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>An LLM, a Formal Grammar, and 40 Sacred Words</title>
    <link href="https://www.scd31.com/posts/person-do-thing-llm"/>
    <id>https://www.scd31.com/posts/person-do-thing-llm</id>
    <updated>2025-02-07T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Profiling production for memory overruns + canonical log stats</title>
    <link href="https://brandur.org/fragments/profiling-production"/>
    <id>tag:brandur.org,2025-02-02:fragments/profiling-production</id>
    <updated>2025-02-03T01:02:27Z</updated>


    <content type="html">&lt;p&gt;You&amp;rsquo;re only lucky for so long. After four years of running our Go API in production with no memory trouble whatsoever, last week we started seeing instantaneous bursts of ~1.5 GB suddenly allocated, enough to cause Heroku to kill the dyno for being &amp;ldquo;vastly over quota&amp;rdquo; (our steady state memory use sits around ~50 MB, so we run on 512 MB dynos).&lt;/p&gt;

&lt;p&gt;This was, of course, concerning. We were only experiencing a few of these a day, but with no idea what was causing them, and having appeared very suddenly, we had to assume that they might get more frequent. Not only is having the API taken offline at any moment a bad place to be UX-wise, but even with our careful use of transactions, it makes resource leaks between components possible.&lt;/p&gt;

&lt;h2 id=&quot;alloc-delta&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#alloc-delta&quot;&gt;Alloc delta&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;To localize the problem, I used Go&amp;rsquo;s &lt;a href=&quot;https://pkg.go.dev/runtime#MemStats&quot;&gt;&lt;code&gt;runtime.MemStats&lt;/code&gt;&lt;/a&gt; in conjunction with our &lt;a href=&quot;/nanoglyphs/025-logs&quot;&gt;canonical API lines&lt;/a&gt;, making a new &lt;code&gt;total_alloc_delta&lt;/code&gt; property available to see how many bytes were allocated during the period of an API request:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func (m *CanonicalAPILineMiddleware) Wrapper(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        var (
            memStats      runtime.MemStats
            memStatsBegin = m.TimeNow()
        )
        runtime.ReadMemStats(&amp;amp;memStats)
        var (
            memStatsBeginDuration = m.TimeNow().Sub(memStatsBegin)

            // TotalAlloc doesn&#39;t decrement on heap frees, so it gives
            // us useful info even if the GC runs during the request.
            totalAllocBegin = memStats.TotalAlloc
        )

        // API request served here
        next.ServeHTTP(w, r)
    
        // Middleware continues ...
        memStatsEnd := m.TimeNow()
        // Since we&#39;re only interested in one field, reuse the same
        // struct so we don&#39;t need to allocate a second one.
        runtime.ReadMemStats(&amp;amp;memStats)
        var (
            memStatsEndDuration = m.TimeNow().Sub(memStatsEnd)
            totalAllocDelta     = memStats.TotalAlloc - totalAllocBegin
        )
        
        logData := &amp;amp;CanonicalAPILineData{
            ID:                   m.ULID.New(),
            HTTPMethod:           r.Method,
            HTTPPath:             r.URL.Path,
            ...
            ReadMemStatsDuration: timeutil.PrettyDuration(memStatsBeginDuration + memStatsEndDuration),
            TotalAllocDelta:      totalAllocDelta,
        }

        plog.Logger(ctx).WithFields(structToFields(logData)).
            Infof(
                &amp;quot;canonical_api_line %s %s -&amp;gt; %v %s(%s)&amp;quot;,
                r.Method,
                routeOrPath,
                logData.Status,
                idempotencyReplayStr,
                duration,
            )
    })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;MemStats&lt;/code&gt; provides a large bucket of properties to pick from, but &lt;code&gt;TotalAlloc&lt;/code&gt;&amp;rsquo;s a useful one because it represents bytes allocated to the heap, but unlike similar stats like &lt;code&gt;HeapAlloc&lt;/code&gt;, it&amp;rsquo;s monotonically increasing. It&amp;rsquo;s not decremented as objects are freed:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// TotalAlloc is cumulative bytes allocated for heap objects.
//
// TotalAlloc increases as heap objects are allocated, but
// unlike Alloc and HeapAlloc, it does not decrease when
// objects are freed.
TotalAlloc uint64
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is good because it means that all API requests will end up with the same memory heuristic, and are roughly comparable. Garbage collection may or may not occur during a request. Using &lt;code&gt;TotalAlloc&lt;/code&gt; makes it irrelevant whether it did or not.&lt;/p&gt;
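
&lt;p&gt;The property is easy to check in isolation. Here&amp;rsquo;s a minimal, standalone sketch (&lt;code&gt;totalAllocDelta&lt;/code&gt; and &lt;code&gt;sink&lt;/code&gt; are hypothetical names, not part of the middleware) that allocates a buffer, drops it, forces a GC, and should still see the buffer counted in the delta:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &quot;fmt&quot;
    &quot;runtime&quot;
)

// sink keeps allocations observable so the compiler can&#39;t elide them.
var sink []byte

// totalAllocDelta is a hypothetical helper that reports how many bytes
// were allocated on the heap, process-wide, while fn ran.
func totalAllocDelta(fn func()) uint64 {
    var memStats runtime.MemStats

    runtime.ReadMemStats(&amp;memStats)
    begin := memStats.TotalAlloc

    fn()

    runtime.ReadMemStats(&amp;memStats)
    return memStats.TotalAlloc - begin
}

func main() {
    const size = 10 * 1024 * 1024

    delta := totalAllocDelta(func() {
        sink = make([]byte, size)
        sink = nil   // make the buffer garbage ...
        runtime.GC() // ... and collect it before the second reading
    })

    // The delta still covers the freed buffer.
    fmt.Println(delta &gt;= size) // true
}
&lt;/code&gt;&lt;/pre&gt;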

&lt;p&gt;With that deployed, I can search logs for outliers (&lt;code&gt;:&amp;gt;500000000&lt;/code&gt; means greater than 500 MB):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;source:platform app:app[web] canonical_api_line (-http_route:/health-checks/{name})
    total_alloc_delta:&amp;gt;500000000
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And voila, we turn up the bad ones. Here, an API request that spiked memory a full 5 GB!&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;Jan 29 10:18:33 platform app[web] info canonical_api_line
    POST /queries -&amp;gt; 503 (2.53252138s)
total_alloc_delta=5008335944
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;parallel-allocations&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#parallel-allocations&quot;&gt;Parallel allocations&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;The use of &lt;code&gt;TotalAlloc&lt;/code&gt; is imperfect because it not only tracks allocations of the current API request, but allocations across the current API request &lt;em&gt;and&lt;/em&gt; all parallel requests.&lt;/p&gt;

&lt;p&gt;We can see this effect through false positives:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;Feb 1 23:07:18 platform app[web] info canonical_api_line
    GET /clusters/{cluster_id}/databases -&amp;gt; 504 (2m57.322010348s)
total_alloc_delta=743772480
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It looks like this API request allocated 744 MB, but what actually happened is that it was a bad timeout that executed for a full three minutes &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. During that time, other API requests served in the interim allocated the majority of that memory. It &lt;em&gt;didn&amp;rsquo;t&lt;/em&gt; crash our 512 MB dyno because multiple GCs also occurred during that time.&lt;/p&gt;
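
&lt;p&gt;The misattribution can be reproduced in miniature. In this sketch (again standalone, with the hypothetical names &lt;code&gt;deltaWhileWaiting&lt;/code&gt; and &lt;code&gt;sink&lt;/code&gt;), the measured code path allocates almost nothing and only waits, while a neighboring goroutine standing in for a parallel request does the allocating. The delta gets charged to the waiter anyway:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &quot;fmt&quot;
    &quot;runtime&quot;
)

// sink keeps the allocation observable so it can&#39;t be optimized away.
var sink []byte

// deltaWhileWaiting is a hypothetical stand-in for a slow request. It
// allocates nothing sizable itself, but a neighbor goroutine allocates
// size bytes while it waits.
func deltaWhileWaiting(size int) uint64 {
    var memStats runtime.MemStats

    runtime.ReadMemStats(&amp;memStats)
    begin := memStats.TotalAlloc

    done := make(chan struct{})
    go func() {
        defer close(done)
        sink = make([]byte, size)
    }()
    &lt;-done

    runtime.ReadMemStats(&amp;memStats)
    return memStats.TotalAlloc - begin
}

func main() {
    const size = 10 * 1024 * 1024

    // TotalAlloc is process-wide, so the neighbor&#39;s 10 MiB shows up
    // in the waiter&#39;s delta.
    fmt.Println(deltaWhileWaiting(size) &gt;= size) // true
}
&lt;/code&gt;&lt;/pre&gt;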

&lt;h2 id=&quot;pprof-to-s3&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#pprof-to-s3&quot;&gt;Pprof to S3&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Getting our memory overruns localized to a particular endpoint was good, but even having done that, I&amp;rsquo;d need a little more help to figure out where the rogue memory was going. To that end, I put in one more clause in the middleware so that in case of a huge overrun, the process dumps a &lt;a href=&quot;https://github.com/google/pprof&quot;&gt;pprof&lt;/a&gt; heap profile to S3:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;    ...

    // If we used a particularly huge amount of memory during the
    // request, upload a profile to S3 for analysis. Buckets have a
    // configured life cycle so objects will expire out after some
    // time.
    if err := m.maybeUploadPprof(ctx, logData.RequestID, totalAllocDelta); err != nil {
        plog.Logger(ctx).Errorf(m.Name+&amp;quot;: Error uploading pprof profile: %s&amp;quot;, err)
    }

...

const pprofTotalAllocDeltaThreshold = 1_000_000_000

func (m *CanonicalAPILineMiddleware) maybeUploadPprof(ctx context.Context, requestID uuid.UUID, totalAllocDelta uint64) error {
    if !m.pprofEnable || totalAllocDelta &amp;lt; m.pprofTotalAllocDeltaThreshold {
        return nil
    }

    profKey := fmt.Sprintf(&amp;quot;%s/pprof/%s.prof&amp;quot;, m.EnvName, requestID)

    var buf bytes.Buffer
    if err := pprof.WriteHeapProfile(&amp;amp;buf); err != nil {
        return xerrors.Errorf(&amp;quot;error writing heap profile: %w&amp;quot;, err)
    }

    if _, err := m.aws.S3_PutObject(ctx, &amp;amp;s3.PutObjectInput{
        Body:   &amp;amp;buf,
        Bucket: ptrutil.Ptr(awsclient.S3Bucket),
        Key:    &amp;amp;profKey,
    }); err != nil {
        return xerrors.Errorf(&amp;quot;error putting heap profile to S3 at path %q: %w&amp;quot;, profKey, err)
    }

    plog.Logger(ctx).Infof(m.Name+&amp;quot;: pprof_profile_generated_line: TotalAlloc delta %d exceeded %d; generated pprof profile to S3 key %q&amp;quot;,
        totalAllocDelta, m.pprofTotalAllocDeltaThreshold, profKey)

    return nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Our memory problem ended up being a queries endpoint that was overly willing to read giant result sets into memory, then serialize the whole thing into a big JSON buffer for response, which was also pretty indented (and in Go&amp;rsquo;s &lt;code&gt;encoding/json&lt;/code&gt;, indenting a JSON response requires a &lt;em&gt;second&lt;/em&gt; giant buffer 2x the size of the first one). I fixed it by reducing the maximum number of rows we were willing to read into the response.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m not expecting to run into new memory overruns or leaks anytime soon, but I left the pprof code in place for the time being. It only does work in case of huge memory increases so there&amp;rsquo;s no performance penalty most of the time, and it might come in handy again.&lt;/p&gt;

&lt;h2 id=&quot;stop-the-world&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#stop-the-world&quot;&gt;Stop the world&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;A cursory glance at the implementation of &lt;code&gt;runtime.ReadMemStats&lt;/code&gt; looks a little concerning:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// ReadMemStats populates m with memory allocator statistics.
//
// The returned memory allocator statistics are up to date as of the
// call to ReadMemStats. This is in contrast with a heap profile,
// which is a snapshot as of the most recently completed garbage
// collection cycle.
func ReadMemStats(m *MemStats) {
    _ = m.Alloc // nil check test before we switch stacks, see issue 61158
    stw := stopTheWorld(stwReadMemStats)

    systemstack(func() {
        readmemstats_m(m)
    })

    startTheWorld(stw)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To produce accurate stats, the runtime needs to &amp;ldquo;stop the world&amp;rdquo;, meaning that all active goroutines are paused, a sample is taken, and then the goroutines are resumed.&lt;/p&gt;

&lt;p&gt;Intuitively, that seems like it could be pretty slow, and some initial googling seemed to confirm that. However, I later found a &lt;a href=&quot;https://go-review.googlesource.com/c/go/+/34937&quot;&gt;patch from 2017&lt;/a&gt; that&amp;rsquo;d improved the situation considerably by doing cumulative tracking of relevant stats so only a very brief stop the world was required. It indicated a reduction in timing down to 25µs, even at 100 concurrent goroutines.&lt;/p&gt;

&lt;p&gt;I added a separate log stat to see how long my two &lt;code&gt;ReadMemStats&lt;/code&gt; calls were taking, and found they were averaging ~100µs for both:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-txt&quot;&gt;read_mem_stats_duration=0.000098s
read_mem_stats_duration=0.000110s
read_mem_stats_duration=0.000113s
read_mem_stats_duration=0.000126s
read_mem_stats_duration=0.000123s
read_mem_stats_duration=0.000084s
read_mem_stats_duration=0.000091s
read_mem_stats_duration=0.000092s
read_mem_stats_duration=0.000090s
read_mem_stats_duration=0.000083s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That&amp;rsquo;s 50µs per invocation instead of 25µs, but given that a single DB query takes an order of magnitude or two longer at 1-10ms, a little delay to get memory stats is acceptable. If our stack was hyper performance sensitive or saturated with huge request volume, I&amp;rsquo;d take it out.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>The Taylorator - All Your Frequencies Are Belong to Us</title>
    <link href="https://www.scd31.com/posts/taylorator"/>
    <id>https://www.scd31.com/posts/taylorator</id>
    <updated>2025-01-27T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Sierpinski Triangles on a Pen Plotter</title>
    <link href="https://www.scd31.com/posts/sierpinski-triangle"/>
    <id>https://www.scd31.com/posts/sierpinski-triangle</id>
    <updated>2025-01-10T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Rubic - a BASIC-inspired Ruby REPL</title>
    <link href="https://www.scd31.com/posts/basic-repl"/>
    <id>https://www.scd31.com/posts/basic-repl</id>
    <updated>2025-01-09T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Go&#39;s bytes.Buffer vs. strings.Builder</title>
    <link href="https://brandur.org/fragments/bytes-buffer-vs-strings-builder"/>
    <id>tag:brandur.org,2025-01-02:fragments/bytes-buffer-vs-strings-builder</id>
    <updated>2025-01-03T05:34:32Z</updated>


    <content type="html">&lt;p&gt;I was writing some Go code today that generated other Go code. Writing it line by line, mostly in a loop, but with pre- and post-matter.&lt;/p&gt;

&lt;p&gt;My usual go-to for this type of thing is &lt;a href=&quot;https://pkg.go.dev/bytes#Buffer&quot;&gt;&lt;code&gt;bytes.Buffer&lt;/code&gt;&lt;/a&gt;, but after I&amp;rsquo;d finished the implementation, given that I was working entirely with strings, I started to wonder if I should&amp;rsquo;ve used &lt;a href=&quot;https://pkg.go.dev/strings#Builder&quot;&gt;&lt;code&gt;strings.Builder&lt;/code&gt;&lt;/a&gt; instead.&lt;/p&gt;

&lt;p&gt;I realized that I had no idea whether one was faster than the other, so I wrote a quick benchmark to check:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &amp;quot;bytes&amp;quot;
    &amp;quot;strings&amp;quot;
    &amp;quot;testing&amp;quot;
)

var fragments = []string{
    &amp;quot;This&amp;quot;,
    &amp;quot;is a series of&amp;quot;,
    &amp;quot;string fragments&amp;quot;,
    &amp;quot;that will be concatenated together&amp;quot;,
    &amp;quot;into a single larger string&amp;quot;,
    &amp;quot;so that we can&amp;quot;,
    &amp;quot;determine which of Go&#39;s various&amp;quot;,
    &amp;quot;tools for doing this&amp;quot;,
    &amp;quot;is most efficient.&amp;quot;,
    &amp;quot;I found a few articles&amp;quot;,
    &amp;quot;online&amp;quot;,
    &amp;quot;but most were poorly cited&amp;quot;,
    &amp;quot;or&amp;quot;,
    &amp;quot;behind a Medium login wall&amp;quot;,
    &amp;quot;or otherwise&amp;quot;,
    &amp;quot;not of admirable quality.&amp;quot;,
}

func BenchmarkBytesBuffer(b *testing.B) {
    for range b.N {
        var buf bytes.Buffer

        for _, fragment := range fragments {
            buf.WriteString(fragment)
            buf.WriteString(&amp;quot; &amp;quot;)
        }

        _ = buf.String()
    }
}

func BenchmarkConcatenateStrings(b *testing.B) {
    for range b.N {
        var str string

        for _, fragment := range fragments {
            str += fragment
            str += &amp;quot; &amp;quot;
        }
    }
}

func BenchmarkStringBuilder(b *testing.B) {
    for range b.N {
        var sb strings.Builder

        for _, fragment := range fragments {
            sb.WriteString(fragment)
            sb.WriteString(&amp;quot; &amp;quot;)
        }

        _ = sb.String()
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go test -bench=. -benchmem
goos: darwin
goarch: arm64
pkg: github.com/brandur/go-builder-vs-buffer
cpu: Apple M4
BenchmarkBytesBuffer-10           5013081    217.3 ns/op    1280 B/op    5 allocs/op
BenchmarkConcatenateStrings-10    1603748    753.5 ns/op    5557 B/op    31 allocs/op
BenchmarkStringBuilder-10         6916813    146.9 ns/op    752 B/op     6 allocs/op
PASS
ok      github.com/brandur/go-builder-vs-buffer 4.724s

&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So there you have it. At least when it comes to concatenating only strings at relatively modest sizes, &lt;code&gt;strings.Builder&lt;/code&gt; is about 33% faster than &lt;code&gt;bytes.Buffer&lt;/code&gt;, and 80% faster &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; than concatenating strings. Given that the DX is identical between the two, I&amp;rsquo;ll make it my new default go-to.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Postgres UUIDv7 + per-backend monotonicity</title>
    <link href="https://brandur.org/fragments/uuid-v7-monotonicity"/>
    <id>tag:brandur.org,2024-12-31:fragments/uuid-v7-monotonicity</id>
    <updated>2024-12-31T22:32:43Z</updated>


    <content type="html">&lt;p&gt;An implementation for &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=78c5e141e9c139fc2ff36a220334e4aa25e1b0eb&quot;&gt;UUIDv7 was committed to Postgres&lt;/a&gt; earlier this month. These have all the benefits of a v4 (random) UUID, but are generated with a more deterministic order using the current time, and perform considerably better on inserts using ordered structures like B-trees.&lt;/p&gt;

&lt;p&gt;A nice surprise is that the random portion of the UUIDs will be monotonic within each Postgres backend:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In our implementation, the 12-bit sub-millisecond timestamp fraction
is stored immediately after the timestamp, in the space referred to as
&amp;ldquo;rand_a&amp;rdquo; in the RFC. This ensures additional monotonicity within a
millisecond. The rand_a bits also function as a counter. We select a
sub-millisecond timestamp so that it monotonically increases for
generated UUIDs within the same backend, even when the system clock
goes backward or when generating UUIDs at very high
frequency. Therefore, the monotonicity of generated UUIDs is ensured
within the same backend.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a hugely valuable feature in practice, especially in testing. Say you want to generate five objects for testing an API list endpoint. It&amp;rsquo;s possible they&amp;rsquo;re generated in-order by virtue of being across different milliseconds or by getting lucky, but probability is against you, and the likelihood is that some will be out of order. A test case has to generate the five objects, then do an initial sort before making use of them. That&amp;rsquo;s not the end of the world, but it&amp;rsquo;s more test code and adds noise.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;test_accounts = 5.times.map { TestFactory.account }

# maybe IDs were in order, but maybe not, so do an initial sort
test_accounts.sort_by! { |a| a.id }

# API endpoint will return accounts ordered by ID
resp = make_api_request :get, &amp;quot;/accounts&amp;quot;
expect(resp.map { _1[&amp;quot;id&amp;quot;] }).to eq(test_accounts.map(&amp;amp;:id))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With Postgres ensuring monotonicity for UUIDv7s, the five generated objects get five in-order IDs, making the test safer &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and faster to write. Monotonicity isn&amp;rsquo;t guaranteed across backends, but that&amp;rsquo;s okay in well-written test suites. Patterns like &lt;a href=&quot;/fragments/go-test-tx-using-t-cleanup&quot;&gt;test transactions&lt;/a&gt; will guarantee that each test case speaks to exactly one backend.&lt;/p&gt;

&lt;h2 id=&quot;12-bits-more-clock&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#12-bits-more-clock&quot;&gt;12 bits more clock&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;My grasp on monotonicity has always been tenuous at best, so I was curious how it was implemented here. I looked at the patch, and its approach was more obvious than I expected:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/*
 * Generate UUID version 7 per RFC 9562, with the given timestamp.
 *
 * UUID version 7 consists of a Unix timestamp in milliseconds (48
 * bits) and 74 random bits, excluding the required version and
 * variant bits. To ensure monotonicity in scenarios of high-
 * frequency UUID generation, we employ the method &amp;quot;Replace
 * Leftmost Random Bits with Increased Clock Precision (Method 3)&amp;quot;,
 * described in the RFC. This method utilizes 12 bits from the
 * &amp;quot;rand_a&amp;quot; bits to store a 1/4096 (or 2^12) fraction of sub-
 * millisecond precision.
 *
 * ns is a number of nanoseconds since start of the UNIX epoch.
 * This value is used for time-dependent bits of UUID.
 */
static pg_uuid_t* generate_uuidv7(int64 ns) {

...

/*
 * sub-millisecond timestamp fraction (SUBMS_BITS bits, not
 * SUBMS_MINIMAL_STEP_BITS)
 */
increased_clock_precision = ((ns % NS_PER_MS) * (1 &amp;lt;&amp;lt; SUBMS_BITS)) / NS_PER_MS;

/* Fill the increased clock precision to &amp;quot;rand_a&amp;quot; bits */
uuid-&amp;gt;data[6] = (unsigned char) (increased_clock_precision &amp;gt;&amp;gt; 8);
uuid-&amp;gt;data[7] = (unsigned char) (increased_clock_precision);

/* fill everything after the increased clock precision with random bytes */
if (!pg_strong_random(&amp;amp;uuid-&amp;gt;data[8], UUID_LEN - 8))
    ereport(ERROR,
            (errcode(ERRCODE_INTERNAL_ERROR),
            errmsg(&amp;quot;could not generate random values&amp;quot;)));
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;UUIDv7 dictates an initial 48 bits that encode a timestamp down to millisecond precision. A millisecond is a short amount of time for a human, but quite long for a computer, and many UUIDs could easily be generated within the space of a single ms.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt; 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      48 bits unix_ts_ms                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   48 bits unix_ts_ms (cont)   |  ver  |    12 bits rand_a     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|                    62 bits rand_b                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     62 bits rand_b (cont)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The Postgres patch solves the problem by repurposing 12 bits of the UUID&amp;rsquo;s random component (filling &lt;code&gt;rand_a&lt;/code&gt; above) to extend the timestamp&amp;rsquo;s precision below a millisecond, in steps of 1/4096 ms (roughly 244 ns), which in practice is too precise to contain two UUIDv7s generated in the same process. It makes a repeated UUID &lt;em&gt;between&lt;/em&gt; processes more likely, but there&amp;rsquo;s still 62 bits of randomness left to make use of, so collisions remain vastly unlikely.&lt;/p&gt;

&lt;h2 id=&quot;wait&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#wait&quot;&gt;The wait is on&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;UUIDv7s are going to make a great core addition to Postgres, and I can&amp;rsquo;t wait to start using them. Quite unfortunately, their commit was delayed past the freeze for Postgres 17, so they won&amp;rsquo;t make it into an official version until Postgres 18 is cut in late 2025. So now, we wait.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Stripe V2</title>
    <link href="https://brandur.org/fragments/stripe-v2"/>
    <id>tag:brandur.org,2024-12-28:fragments/stripe-v2</id>
    <updated>2024-12-29T06:56:24Z</updated>


    <content type="html">&lt;p&gt;I happened to notice by way of a Slack bot today that Stripe released a &lt;a href=&quot;https://docs.stripe.com/api-v2-overview&quot;&gt;V2 version of their API&lt;/a&gt;. I thought this must&amp;rsquo;ve been a soft launch right before the holidays, surely to be followed up by a more formal blog post, but the Way Back Machine clocked the page in &lt;a href=&quot;https://web.archive.org/web/20241004013621/https://docs.stripe.com/api-v2-overview&quot;&gt;early October&lt;/a&gt;, making it three months old. It&amp;rsquo;s been there all along, I just hadn&amp;rsquo;t seen it before.&lt;/p&gt;

&lt;p&gt;The V1 and V2 APIs are separate namespaces and what&amp;rsquo;s available in V2 is currently very minimal (only events and event destinations), so integrations will still use V1 for almost everything, but the overview page tells us about its aspirational design intentions.&lt;/p&gt;

&lt;h2 id=&quot;json-hateoas&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#json-hateoas&quot;&gt;JSON, with a sprinkling of HATEOAS&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;A few highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;By far the best and biggest change is that request bodies are sent as JSON instead of &lt;code&gt;application/x-www-form-urlencoded&lt;/code&gt;. Form encoding isn&amp;rsquo;t the worst thing in the world, but it falls flat on its face when encoding complex data types like arrays and maps (or worse, &lt;em&gt;nested&lt;/em&gt; arrays and maps). It&amp;rsquo;s also just weird and out of place in 2024. This change should&amp;rsquo;ve happened ten years ago.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Pagination has picked up a hypermedia-esque veneer (see &lt;a href=&quot;https://en.wikipedia.org/wiki/HATEOAS&quot;&gt;HATEOAS&lt;/a&gt;), returning a &lt;code&gt;next_page_url&lt;/code&gt; that&amp;rsquo;s requested directly, instead of returning a cursor and having the caller build the next URL themselves.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;The new API is trying to move away from a model where sub-objects in an API resource are expanded by default, to one where they need to be requested with an &lt;code&gt;include&lt;/code&gt; parameter. We had plenty of discussions about this before I left. The purpose of the change is to make API requests faster (Stripe&amp;rsquo;s API is quite slow) by rendering less for most requests. I counted only two places where this is actually used so far though, so time will tell whether the gambit actually succeeds or not.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Endpoints will try for &amp;ldquo;real&amp;rdquo; idempotency where callers can converge failed operations to either success or definitive failure by calling them again:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;When you provide the same idempotency key for two requests:

&lt;ul&gt;
&lt;li&gt;API v1 always returns the previously-saved response of the first API request, even if it was an error.&lt;/li&gt;
&lt;li&gt;API v2 attempts to retry any failed requests without producing side effects (any extraneous change or observable behavior that occurs as a result of an API call) and provide an updated response.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Previously (and still for most endpoints), failures from an intermittent blip or bug were a big problem. The idempotency layer dumbly returned whatever canned response had been recorded on the initial go around (including internal server errors), so users wouldn&amp;rsquo;t get closure on what exactly happened. Their best hope would be that a Stripe engineer would eventually repair their charge manually at some later time, and send a webhook about it.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rest-ish&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#rest-ish&quot;&gt;REST-ish v4-ever&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Lots of positive progress there, but a new API version also presents an opportunity to clear out blemishes, and I expected to see more of that. A few points that are less good:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I was hoping they&amp;rsquo;d fix their verbs to play more nicely with modern REST conventions. Instead of using &lt;code&gt;POST&lt;/code&gt; everywhere, use &lt;code&gt;POST&lt;/code&gt; for endpoints that are knowingly not idempotent (without an idempotency key), &lt;code&gt;PUT&lt;/code&gt; for mutation endpoints that are, and &lt;code&gt;PATCH&lt;/code&gt; for mutation endpoints that aren&amp;rsquo;t. I admit it&amp;rsquo;s pedantic, but it&amp;rsquo;s so absolutely trivial to implement, and the use of a good verb signals more information than a reader would otherwise have with a cursory glance at API structure.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;They&amp;rsquo;re still doing the RPC-style calls like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;POST /v2/core/event_destinations/:id/enable
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Also pedantic, but &lt;code&gt;enable&lt;/code&gt; here should theoretically be reserved for a nested resource. I think it&amp;rsquo;s cleaner to model actions as IDs under a shared &amp;ldquo;actions&amp;rdquo; subresource:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;POST /v2/core/event_destinations/:id/actions/enable
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;nouveau-dx&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#nouveau-dx&quot;&gt;Nouveau DX&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Frankly, I was a bit shocked by how little attention this got. There was a time not too long ago when Stripe cutting a new API version would&amp;rsquo;ve been a major event in the tech world, but in three months I didn&amp;rsquo;t come across a single person who mentioned it.&lt;/p&gt;

&lt;p&gt;A major part of this is that Stripe is no longer a great technical leader in the same sense that it used to be. But also, as &lt;a href=&quot;https://x.com/tweetsbycolin/status/1873241754784411656&quot;&gt;Colin points out&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is an undeniable sign that &amp;ldquo;a great REST API&amp;rdquo; is no longer the benchmark for great DX&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That&amp;rsquo;s got to be true too. Few of us want to be making manual HTTP calls out to APIs anymore. These days a great SDK, not a great API, is a hallmark, and maybe even a necessity, of a world class development experience.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Go&#39;s maximum time.Duration</title>
    <link href="https://brandur.org/fragments/go-max-time-duration"/>
    <id>tag:brandur.org,2024-12-21:fragments/go-max-time-duration</id>
    <updated>2024-12-21T17:30:01Z</updated>


    <content type="html">&lt;p&gt;While working on a River bug related to retry policy, I came across a case where it was actually plausible to overflow Go&amp;rsquo;s built-in &lt;code&gt;time.Duration&lt;/code&gt; and wrap back around to negative number.&lt;/p&gt;

&lt;p&gt;A duration has a much simpler representation than a timestamp. It&amp;rsquo;s an &lt;code&gt;int64&lt;/code&gt; counted in nanoseconds:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// A Duration represents the elapsed time between two instants
// as an int64 nanosecond count. The representation limits the
// largest representable duration to approximately 290 years.
type Duration int64
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As the comment states, the maximum duration is about 290 years. More precisely, 292 (non-leap) years, 171 days, and 23 hours:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func main() {
    const (
        maxDuration time.Duration = 1&amp;lt;&amp;lt;63 - 1

        day  = 24 * time.Hour
        year = 365 * day
    )

    var (
        years        = maxDuration / year
        withoutYears = maxDuration % year

        days        = withoutYears / day
        withoutDays = withoutYears % day
    )

    fmt.Printf(&amp;quot;max duration: %dy%dd%s\n&amp;quot;, years, days, withoutDays)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go run main.go
max duration: 292y171d23h47m16.854775807s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;292 years is a long time, and it&amp;rsquo;s not likely most programs will need more than that, but our retry algorithm is exponential, and crosses that threshold after 310 retries.&lt;/p&gt;

&lt;h2 id=&quot;compile-v-runtime-overflow&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#compile-v-runtime-overflow&quot;&gt;Compile v. runtime overflow&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;When performing a direct calculation on a constant, the compiler will detect the overflow:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func main() {
    const maxDuration time.Duration = 1&amp;lt;&amp;lt;63 - 1
    var maxDurationSeconds = float64(maxDuration / time.Second)

    notOverflowed := time.Duration(maxDurationSeconds) * time.Second
    fmt.Printf(&amp;quot;not overflowed: %+v\n&amp;quot;, notOverflowed)

    overflowed := time.Duration(int64(maxDuration)+1) * time.Second
    fmt.Printf(&amp;quot;overflowed: %+v\n&amp;quot;, overflowed)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go run main.go
./main.go:15:30: int64(maxDuration) + 1 (constant 9223372036854775808 of type int64) overflows int64
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But performing the same operation on a variable will happily wrap around:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;overflowed := time.Duration(maxDurationSeconds+1) * time.Second
fmt.Printf(&amp;quot;overflowed: %+v\n&amp;quot;, overflowed)
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go run main.go
not overflowed: 2562047h47m16s
overflowed: -2562047h47m16.709551616s
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;well-defined&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#well-defined&quot;&gt;Little practical use, but well defined&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;I &lt;a href=&quot;https://github.com/riverqueue/river/pull/698&quot;&gt;fixed River&amp;rsquo;s back offs at large attempt counts&lt;/a&gt; by using Go 1.21&amp;rsquo;s &lt;code&gt;min&lt;/code&gt; function combined with the maximum known number of seconds that&amp;rsquo;ll fit in a &lt;code&gt;time.Duration&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// The maximum value of a duration before it overflows. About 292 years.
const maxDuration time.Duration = 1&amp;lt;&amp;lt;63 - 1

// Same as the above, but changed to a float represented in seconds.
var maxDurationSeconds = maxDuration.Seconds()

func (p *DefaultClientRetryPolicy) NextRetry(job *rivertype.JobRow) time.Time {
    return time.Now().Add(timeutil.SecondsAsDuration(
        p.retrySeconds(len(job.Errors) + 1),
    ))
}

func (p *DefaultClientRetryPolicy) retrySeconds(attempt int) float64 {
    retrySeconds := math.Pow(float64(attempt), 4)
    return min(retrySeconds, maxDurationSeconds)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After hitting retry attempt 310, the algorithm backs off 292 years at a time. This behavior will never be of any real use to anybody, but I changed it to be &lt;em&gt;well defined&lt;/em&gt; behavior of no real use to anybody, with no risk of odd bugs that might otherwise result from an overflow.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>ERROR: invalid byte sequence for encoding UTF8: 0x00 (and what to do about it)</title>
    <link href="https://brandur.org/fragments/invalid-byte-sequence"/>
    <id>tag:brandur.org,2024-12-19:fragments/invalid-byte-sequence</id>
    <updated>2024-12-19T21:58:05Z</updated>


    <content type="html">&lt;p&gt;One of the oldest errors I ever remember seeing in an error tracker:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ERROR: invalid byte sequence for encoding &amp;ldquo;UTF8&amp;rdquo;: &lt;code&gt;0x00&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Through my time at Heroku it was like a distant friend. Not one that you&amp;rsquo;d see every day, but one who&amp;rsquo;d appear to surprise you a few dozen times a year. Since it didn&amp;rsquo;t seem to be causing any major fallout and I never heard a user complain about it, I&amp;rsquo;m somewhat embarrassed to say that in four years neither I nor anyone else ever bothered to look into it.&lt;/p&gt;

&lt;p&gt;These days, on a Go stack and with much better control and insight into any changes we make, we&amp;rsquo;re pretty aggressive about trying to prune Sentry errors down to zero. Over a few months I&amp;rsquo;d see the &lt;code&gt;0x00&lt;/code&gt; error come and go, and finally decided to look into it.&lt;/p&gt;

&lt;p&gt;The problem comes from Postgres raising an error when a caller tries to insert a text/varchar value containing a value of &lt;code&gt;0x00&lt;/code&gt;, or zero byte. The same value that&amp;rsquo;s used to terminate a string in plain old C. Postgres &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE&quot;&gt;explicitly disallows it&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The character with the code zero cannot be in a string constant.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The tricky part is that although Postgres won&amp;rsquo;t take a zero byte, almost every programming language ever created &lt;em&gt;will&lt;/em&gt;, thereby creating a natural asymmetry between database and language stack.&lt;/p&gt;

&lt;p&gt;As far as I know, there aren&amp;rsquo;t any legitimate uses for sending a zero byte to an API or web app. Looking back through our logs, the main places I&amp;rsquo;ve seen it are from bots out on the internet, presumably using common attack patterns to probe for weaknesses, or from pentest teams that we paid to do the same.&lt;/p&gt;

&lt;h2 id=&quot;edges&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#edges&quot;&gt;Validating at the edges&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;We&amp;rsquo;re using the &lt;a href=&quot;https://github.com/go-playground/validator&quot;&gt;validate framework for Go&lt;/a&gt; to check that API inputs are sound, like that they&amp;rsquo;re present, below a max length, or within bounds. In a language known for its verbosity, validate annotations are succinct and quick to write.&lt;/p&gt;

&lt;p&gt;The custom validations &lt;code&gt;apistring200&lt;/code&gt;, &lt;code&gt;apistring2000&lt;/code&gt;, &lt;code&gt;apistring20000&lt;/code&gt;, etc. are assigned to API string parameters in &lt;a href=&quot;/text#varchars&quot;&gt;order of magnitude tiers&lt;/a&gt;. Their implementation denies &lt;code&gt;\x00&lt;/code&gt;s that come in with request payloads:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// API strings are meant to provide a reasonable default validation
// for strings that come in via the API that aren&#39;t already
// validated more strictly. The main idea is to make sure that
// we&#39;re not getting long, unbounded input that&#39;ll either store a
// very invalid value to the database or be rejected by a DB-level
// constraint (which would bubble up as a 500 with little context).
//
// They also validate that strings contain no invalid unicode
// sequences, and that no `\x00` zero bytes are present, both of
// which Postgres will reject.
must(registerAPIString(&amp;quot;apistring200&amp;quot;, 200))
must(registerAPIString(&amp;quot;apistring2000&amp;quot;, 2_000))
must(registerAPIString(&amp;quot;apistring20000&amp;quot;, 20_000))
must(registerAPIString(&amp;quot;apistring200000&amp;quot;, 200_000))

const (
    apiStringErrorMessage = &amp;quot;`{0}` should be a non-empty string with a maximum length of %d characters, and contain no invalid unicode sequences or zero bytes&amp;quot;
)

func registerAPIString(tag string, maxLength int) error {
    if err := validate.RegisterValidation(tag, func(fl validator.FieldLevel) bool {
        val := fl.Field().String()

        if len(val) == 0 || len(val) &amp;gt; maxLength {
            return false
        }

        if !utf8.ValidString(val) {
            return false
        }

        // A zero (0x00) rune is valid UTF-8 and won&#39;t be caught
        // by the unicode check above, but Postgres will refuse
        // to insert it.
        if strings.Contains(val, &amp;quot;\x00&amp;quot;) {
            return false
        }

        return true
    }); err != nil {
        return err
    }

    return registerTranslation(tag, fmt.Sprintf(apiStringErrorMessage, maxLength))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notably, it also denies invalid UTF-8 byte sequences (&lt;code&gt;\x00&lt;/code&gt; is not desirable, but it is valid UTF-8), another common malformed input that internet bots like to send, and which will cause its own Postgres error.&lt;/p&gt;
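
&lt;p&gt;The distinction between the two checks is easy to verify in isolation. A standalone sketch using only the standard library; &lt;code&gt;inspect&lt;/code&gt; is a throwaway helper invented for this example and is not part of the validation code above.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// inspect reports whether s is valid UTF-8 and whether it contains a zero
// byte. Hypothetical helper for illustration.
func inspect(s string) (validUTF8, hasZeroByte bool) {
	return utf8.ValidString(s), strings.Contains(s, "\x00")
}

func main() {
	// A zero byte passes the UTF-8 check, so it needs its own test, while a
	// stray 0xff byte fails the UTF-8 check without containing a zero byte.
	for _, s := range []string{"plain", "zero\x00byte", "broken\xffbyte"} {
		valid, zero := inspect(s)
		fmt.Printf("%q: valid UTF-8=%t, zero byte=%t\n", s, valid, zero)
	}
}
```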

&lt;p&gt;Struct fields are tagged with validations, making use easy and concise:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Request for creating a new account.
type AccountCreateRequest struct {
    // Full name for the new account.
    Name *string `json:&amp;quot;name&amp;quot; validate:&amp;quot;apistring200&amp;quot;`
    
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;raw-request-properties&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#raw-request-properties&quot;&gt;Storing raw request properties&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;That takes care of input forms, but another place we&amp;rsquo;d see the problem is when trying to insert &lt;a href=&quot;/canonical-log-lines&quot;&gt;canonical API lines&lt;/a&gt; to the database for operational visibility. Even where we denied a request with invalid input with a 400, we record a canonical line for it, invalid input and all.&lt;/p&gt;

&lt;p&gt;For this case, we take anything invalid in the input and replace it with a placeholder token that&amp;rsquo;s safely storable to Postgres:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// TrimInvalidUTF8 replaces any invalid UTF-8 or \x00 bytes with
// symbolic stand-in tokens. This lets strings that contain invalid
// UTF-8 be stored to Postgres, which normally won&#39;t tolerate
// invalid UTF-8 in string-like fields.
func TrimInvalidUTF8(s string) string {
    if !utf8.ValidString(s) {
        s = strings.ToValidUTF8(s, &amp;quot;[invalid UTF-8]&amp;quot;)
    }

    // A zero (0x00) rune is valid UTF-8 and won&#39;t be caught by the
    // check above, but Postgres will refuse to insert it. Replace
    // all instances with a marker that Postgres can tolerate and
    // which is indicative of what happened. This should only ever
    // happen because of random probing from malicious internet
    // actors sending garbage into HTTP paths and what not.
    if strings.Contains(s, &amp;quot;\x00&amp;quot;) {
        s = strings.ReplaceAll(s, &amp;quot;\x00&amp;quot;, &amp;quot;[0x00 UTF-8 rune]&amp;quot;)
    }

    return s
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is combined with another helper that samples inputs longer than we&amp;rsquo;re willing to store:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Returns a string that&#39;s been truncated to the given max length and
// stripped of any invalid UTF-8 that Postgres might balk at.
// Returns an empty string on `nil` because the batch insert
// will treat empty strings as NULL.
validTruncatedStringOrEmpty := func(sPtr *string, maxLength int) string {
    if sPtr == nil {
        return &amp;quot;&amp;quot;
    }

    return stringutil.SampleLongN(stringutil.TrimInvalidUTF8(*sPtr), maxLength)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When inserting a canonical line for a request, inputs are sanitized and truncated. This happens for obvious fields where an invalid input can be sent like a query string or form body, but for less obvious ones as well. Invalid input can come in almost anywhere, including headers like &lt;code&gt;Content-Type&lt;/code&gt; or &lt;code&gt;User-Agent&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;insertParams.ContentType[i] =
    validTruncatedStringOrEmpty(logData.ContentType, 200)
insertParams.HTTPPath[i] =
    validTruncatedStringOrEmpty(&amp;amp;logData.HTTPPath, 200)
insertParams.QueryString[i] =
    validTruncatedStringOrEmpty(logData.QueryString, 2000)
insertParams.UserAgent[i] =
    validTruncatedStringOrEmpty(logData.UserAgent, 200)
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;one-down&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#one-down&quot;&gt;0x01 down&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;This is one of those little housekeeping tasks that may not be that important, but is quite gratifying. With the steps above we&amp;rsquo;ve eradicated &amp;ldquo;invalid byte sequence&amp;rdquo; errors, taking us a step closer to our target steady state of zero Sentry issues.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Building an Over-engineered Basement Monitor</title>
    <link href="https://www.scd31.com/posts/overengineered-basement-monitor"/>
    <id>https://www.scd31.com/posts/overengineered-basement-monitor</id>
    <updated>2024-11-24T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>The parallel test bundle, a convention for Go testing</title>
    <link href="https://brandur.org/fragments/parallel-test-bundle"/>
    <id>tag:brandur.org,2024-10-27:fragments/parallel-test-bundle</id>
    <updated>2024-10-27T23:06:21Z</updated>


    <content type="html">&lt;p&gt;A year ago we went through the process of getting every test case in our project tagged with &lt;a href=&quot;/t-parallel&quot;&gt;&lt;code&gt;t.Parallel&lt;/code&gt; and ratcheted with &lt;code&gt;paralleltest&lt;/code&gt;&lt;/a&gt;. I was initially skeptical about this being worth the effort because testing across Go packages was already happening in parallel, but it turned out to be a major boon for running large packages individually where we reduced test time by 30%+. We did one more step from there to tag every &lt;em&gt;subtest&lt;/em&gt; with &lt;code&gt;t.Parallel&lt;/code&gt; too. The gains from that weren&amp;rsquo;t as big, but it helps when running tests with many subtests one off, and isn&amp;rsquo;t much effort to sustain now that it&amp;rsquo;s in place.&lt;/p&gt;

&lt;p&gt;We&amp;rsquo;re running close to 5,000 tests at this point. Large scale code refactoring tools aren&amp;rsquo;t widespread in Go, so I did most of the refactoring with some &lt;em&gt;very&lt;/em&gt; gnarly multi-line regexes, and even with those, the only reason that it was possible was that we&amp;rsquo;re obsessive with keeping strong code convention. Most test cases were structured with an identical layout, which might&amp;rsquo;ve seemed like unnecessary pedantry when it was first going in, but later paid off in reams as I refactored thousands of tests in hours instead of weeks.&lt;/p&gt;

&lt;p&gt;Let me showcase a test convention that we&amp;rsquo;ve found to be useful for making subtests parallel-safe, keeping them DRY (unlike many languages, Go doesn&amp;rsquo;t have built-in facilities for setup/teardown blocks in tests), and keeping code readable. I try to be honest in the assessment of programming conventions and am not always certain about new ones, but we&amp;rsquo;ve been using the parallel test bundle for months and I&amp;rsquo;d rate it a &lt;sup&gt;10&lt;/sup&gt;&amp;frasl;&lt;sub&gt;10&lt;/sub&gt; strong recommendation. Better yet, it&amp;rsquo;s all just plain Go code and doesn&amp;rsquo;t require the adoption of anything weird/novel.&lt;/p&gt;

&lt;h2 id=&quot;bundle-struct&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#bundle-struct&quot;&gt;The test bundle struct&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;The test bundle itself is a simple struct containing the object under test and useful fixtures to have available across subtests:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type testBundle struct {
    account *dbsqlc.Account
    svc     *playgroundTutorialService
    team    *dbsqlc.Team
    tx      db.Tx
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;setup-function&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#setup-function&quot;&gt;The setup function&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;It&amp;rsquo;s paired with a &lt;code&gt;setup&lt;/code&gt; helper function that returns a bundle:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;setup := func(t *testing.T) (*testBundle, context.Context) {
    t.Helper()

    // These two vars are standard across almost every test case.
    var (
        ctx = ptesting.Context(t)
        tx  = ptesting.TestTx(ctx, t)
    )

    // Group of data fixtures.
    var (
        team    = dbfactory.Team(ctx, t, tx, &amp;amp;dbfactory.TeamOpts{})
        account = dbfactory.Account(ctx, t, tx, &amp;amp;dbfactory.AccountOpts{})
        _       = dbfactory.AccessGroupAccount_Admin(ctx, t, tx, team.ID, account.ID)
    )
    ctx = authntest.Account(account).Context(ctx)

    return &amp;amp;testBundle{
        account: account,
        svc:     pservicetest.InitAndStart(ctx, t, NewPlaygroundTutorialService(), tx.Begin, nil),
        team:    team,
        tx:      tx,
    }, ctx
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Along with a test bundle, the function also returns a context &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, which is useful for seeding context with a context logger that makes sure all &lt;a href=&quot;/t-parallel#logging&quot;&gt;logging output is collated with the test&lt;/a&gt; being run instead of &lt;code&gt;stdout&lt;/code&gt; where its output would be interleaved with that of other tests running in parallel. Tests that don&amp;rsquo;t need a context omit the second return value.&lt;/p&gt;

&lt;h2 id=&quot;subtests&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#subtests&quot;&gt;Subtest invocations&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Each subtest marks itself as parallel, and calls &lt;code&gt;setup&lt;/code&gt; to procure a test bundle:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;t.Run(&amp;quot;AllProperties&amp;quot;, func(t *testing.T) {
    t.Parallel()

    bundle, ctx := setup(t)
    
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each instance of a test bundle is fully insulated from every other instance, ensuring that no side effects from a test can leak into any other. Every test case uses a test transaction so that it&amp;rsquo;s got its own private snapshot into the database for purposes of raising fixtures or querying.&lt;/p&gt;

&lt;p&gt;We tend to put test bundles in every test case, even where the bundle contains only a single field. This is a courtesy to a future developer who might need to augment the test and where a preexisting test bundle makes that faster to do. It also keeps convention strong in case we need to do another broad refactor down the line.&lt;/p&gt;

&lt;h2 id=&quot;complete-example&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#complete-example&quot;&gt;Complete example&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Here&amp;rsquo;s a full code sample with all the steps together:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func TestPlaygroundTutorialServiceCreate(t *testing.T) {
   t.Parallel()

   type testBundle struct {
      account *dbsqlc.Account
      svc     *playgroundTutorialService
      team    *dbsqlc.Team
      tx      db.Txer
   }

   setup := func(t *testing.T) (*testBundle, context.Context) {
      t.Helper()

      var (
         ctx = ptesting.Context(t)
         tx  = ptesting.TestTx(ctx, t)
      )

      var (
         team    = dbfactory.Team(ctx, t, tx, &amp;amp;dbfactory.TeamOpts{})
         account = dbfactory.Account(ctx, t, tx, &amp;amp;dbfactory.AccountOpts{})
         _       = dbfactory.AccessGroupAccount_Admin(ctx, t, tx, team.ID, account.ID)
      )
      ctx = authntest.Account(account).Context(ctx)

      return &amp;amp;testBundle{
         account: account,
         svc:     pservicetest.InitAndStart(ctx, t, NewPlaygroundTutorialService(), tx.Begin, nil),
         team:    team,
         tx:      tx,
      }, ctx
   }

   t.Run(&amp;quot;AllProperties&amp;quot;, func(t *testing.T) {
      t.Parallel()

      bundle, ctx := setup(t)

      resp, err := pservicetest.InvokeHandler(bundle.svc.Create, ctx, &amp;amp;PlaygroundTutorialCreateRequest{
         BootstrapSQL: ptrutil.Ptr(`SELECT unnest(array[1,2,3]);`),
         Name:         &amp;quot;My playground tutorial&amp;quot;,
         Content:      &amp;quot;# My tutorial\n\nThis is my SQL tutorial, created by **me**.&amp;quot;,
         IsPinned:     true,
         IsPublic:     true,
         TeamID:       eid.EID(bundle.team.ID),
         Weight:       ptrutil.Ptr(int32(100)),
      })
      require.NoError(t, err)
      prequire.PartialEqual(t, &amp;amp;apiresourcekind.PlaygroundTutorial{
         BootstrapSQL: ptrutil.Ptr(`SELECT unnest(array[1,2,3]);`),
         Content:      &amp;quot;# My tutorial\n\nThis is my SQL tutorial, created by **me**.&amp;quot;,
         IsPinned:     true,
         IsPublic:     true,
         Name:         &amp;quot;My playground tutorial&amp;quot;,
         TeamID:       eid.EID(bundle.team.ID),
         Weight:       ptrutil.Ptr(int32(100)),
      }, resp)

      _, err = dbsqlc.New().PlaygroundTutorialGetByID(ctx, bundle.tx, uuid.UUID(resp.ID))
      require.NoError(t, err)

      prequire.EventForActor(ctx, t, bundle.tx, &amp;quot;playground_tutorial.created&amp;quot;, bundle.account.ID)
   })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;See also the &lt;a href=&quot;/fragments/partial-equal&quot;&gt;&lt;code&gt;PartialEqual&lt;/code&gt; helper&lt;/a&gt; which I wasn&amp;rsquo;t completely sure about when I first put it in, but am now fully bought into because it&amp;rsquo;s shown itself to be so effective at keeping many consecutive assertions very tidy.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Mutability Isn’t Variability</title>
    <link href="https://blog.bruce-hill.com/mutability-isnt-variability"/>
    <id>https://blog.bruce-hill.com/mutability-isnt-variability</id>
    <updated>2024-10-24T08:00:00Z</updated>


    <content type="html">How programmers confuse ideas about things changing over
time.</content>

    <author>
      <name>Bruce Hill</name>

      <uri>https://bruce-hill.com</uri>

    </author>
  </entry>

  <entry>
    <title>Rails World 2024</title>
    <link href="https://brandur.org/fragments/rails-world-2024"/>
    <id>tag:brandur.org,2024-10-06:fragments/rails-world-2024</id>
    <updated>2024-10-06T20:17:03Z</updated>


    <content type="html">&lt;p&gt;I attended Rails World again this year, this time in Toronto. A quick recap while it&amp;rsquo;s still fresh.&lt;/p&gt;

&lt;p&gt;What a great event. Both this year and last the organizers went out of their way to pick some of the most incredible venues I&amp;rsquo;ve ever seen. Many places are adequate to the task of containing a conference for a few days, but few make your mouth go wide with a &amp;ldquo;wow&amp;rdquo; as you walk into the place.&lt;/p&gt;

&lt;p&gt;This year&amp;rsquo;s was held at Evergreen Brick Works, an old factory that lapsed into a state of disrepair for many years, and was later converted to an event venue. Its renovators decided to keep some aspects of the previous abandoned wreck. Its roof that&amp;rsquo;d fallen in wasn&amp;rsquo;t replaced, leaving the evergreens that&amp;rsquo;d grown in the interim stretching through up into the sky (unclear what would&amp;rsquo;ve happened if it&amp;rsquo;d rained). Derelict machinery and the more tasteful graffiti had been left in place to add to the character. Meanwhile, ultra-modern acoustics and AV equipment made for excellent talks, and clashed nicely with the exposed brick.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/evergreen.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/evergreen@2x.jpg 2x, /photographs/fragments/rails-world-2024/evergreen.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/graffiti.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/graffiti@2x.jpg 2x, /photographs/fragments/rails-world-2024/graffiti.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;p&gt;Attention was paid to every detail. Quality drinks and delicious snacks were always on offer between sessions, and three food trucks operated all day outside (and good choices too: pizza served out of a decommissioned fire truck, beaver tails, and poutine, only Canada&amp;rsquo;s best! &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;). One of my favorite details that was a holdover from the conference&amp;rsquo;s first year is that all breakfast and lunch food is edible standing up, and served out of the same area that made up the convention floor. With few tables available, people mingle organically while eating, preventing a common conference lunch problem of groups self-siloing at tables where they stay immobile for 30+ minutes and meet few new people, if any. Organizers responded dynamically to fix problems as they arose. For example, lunch lines were too long on the first day, so by day two there were double the number of food stations. Pair programming sessions were available all day through Test Double.&lt;/p&gt;

&lt;p&gt;This was all a nice change after attending RailsConf a few years back. There you couldn&amp;rsquo;t even get coffee outside a tight 30 minute availability window in the morning. This was understandable because money was tight. Ruby Central was spending it on more important things, like paying out $500k cancellation penalties to send a political &amp;ldquo;fuck you&amp;rdquo; to the entire state of Texas, which happily took their money and proceeded to not notice at all. (It may not be a big surprise to hear that 2025 will be the last year of RailsConf.)&lt;/p&gt;

&lt;p&gt;DHH is &lt;a href=&quot;https://world.hey.com/dhh/wonderful-rails-world-vibes-7a6141d2&quot;&gt;pretty transparent on numbers&lt;/a&gt;, and was up front that Rails World operates at a loss that&amp;rsquo;s backstopped by the large companies that form Rails Foundation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Rails Foundation, the founding core members listed above, as well as the contributing members [&amp;hellip;], were willing to happily underwrite a loss of over $100,000 on the conference itself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I love it. This is one of the best ways for companies getting good leverage out of Ruby/Rails to give back to the community. We&amp;rsquo;re not contributing anywhere near what a colossus like Shopify is, but it felt great to have Crunchy sponsoring the event.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/rails-8.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/rails-8@2x.jpg 2x, /photographs/fragments/rails-world-2024/rails-8.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;h2 id=&quot;tech-highlights&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#tech-highlights&quot;&gt;Tech highlights&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;I spent most of the conference at our booth, so I mostly only got a chance to catch &lt;a href=&quot;https://www.youtube.com/watch?v=-cEn_83zRFw&quot;&gt;the keynotes&lt;/a&gt;, but that was enough to catch the broad themes. A few notable highlights.&lt;/p&gt;

&lt;h3 id=&quot;solid-cache&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#solid-cache&quot;&gt;Solid Cache&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Like last year, David touched upon Solid Cache. This is such a great concept: caches traditionally always needed to be memory bound using a component like memcached or Redis because memory was fast and disks were slow. Now, memory is still fast, but with modern SSDs, disk is &lt;em&gt;also&lt;/em&gt; fast, and available in much larger denominations. 37 Signals&amp;rsquo; products like Hey put their cache in MySQL, where they run it on a 30 TB disk with 60 days retention, and which has a 96% cache hit rate. This especially improves cache hits for the long tail of older keys that would&amp;rsquo;ve been long since evicted given a less spacious in-memory data set.&lt;/p&gt;

&lt;p&gt;Solid Cache also dovetails well with the &lt;a href=&quot;/fragments/single-dependency-stacks&quot;&gt;single dependency stack&lt;/a&gt;. Three years later we still run one and exactly one persistence component: Postgres. It&amp;rsquo;s amazing just how plausible this is even for a mature stack, and it makes you realize that even the most fundamental belief systems of the programming world should be reevaluated every once in a while.&lt;/p&gt;

&lt;p&gt;37 Signals stubbornly cargo cults Oracle products, but as Andrew covers, &lt;a href=&quot;https://andyatkinson.com/solid-cache-rails-postgresql&quot;&gt;Solid Cache can be made workable on Postgres too&lt;/a&gt;. Although let me caveat that to say I&amp;rsquo;ve never done it, and suspect that there might be issues with long-lived deletion expiration queries at the scale of 30 TB of data since Postgres isn&amp;rsquo;t particularly good at efficiently deleting rows (a big reason that recent partitioning improvements are so important).&lt;/p&gt;

&lt;h3 id=&quot;server-phobia&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#server-phobia&quot;&gt;Server-phobia&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;For the last few months David&amp;rsquo;s been on an anti-cloud mission. One of the keynote slides highlights the size, capacity and cost of a Performance M dyno (1 core/2 threads w/ 2.5GB for $250/mo.), with the next showing a rough equivalent on Hetzner (48 cores/96 threads w/ 256GB for $220/mo.), the clear message being that the Hetzner box is 50-100x more capable, and also cheaper. A big new piece of Rails is &lt;a href=&quot;https://kamal-deploy.org/&quot;&gt;Kamal&lt;/a&gt;, a system that&amp;rsquo;s meant to make deployment to raw metal as simple as it is on Heroku. Kamal bundles the new &lt;a href=&quot;https://github.com/basecamp/kamal-proxy&quot;&gt;Kamal Proxy&lt;/a&gt;, a reverse proxy that coordinates deploys, terminates TLS, and handles graceful restarts.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/performance-m.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/performance-m@2x.jpg 2x, /photographs/fragments/rails-world-2024/performance-m.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/hetzner.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/hetzner@2x.jpg 2x, /photographs/fragments/rails-world-2024/hetzner.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;p&gt;He&amp;rsquo;s got a point with this one. For a long time servers represented a huge capital investment and distraction from building an actual product, and in that context AWS and its ancillaries are an attractive idea. But as anyone who&amp;rsquo;s used a lot of AWS could tell you, it may be cheap in the beginning, but it&amp;rsquo;s only a matter of time until that inverts, and AWS bills become a recurring nightmare.&lt;/p&gt;

&lt;p&gt;That said, if I were trying to send this message I&amp;rsquo;d be careful to make it clear that this is a trade off. You&amp;rsquo;re unquestionably going to save money on hardware, but you&amp;rsquo;ll spend more time on management. Someone&amp;rsquo;s also going to be the one carrying the pager for all these boxes, and presumably that&amp;rsquo;s not the 37 Signals CEO or any of its executive team.&lt;/p&gt;

&lt;h3 id=&quot;rails-8-1&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#rails-8-1&quot;&gt;Rails 8.1: Et tu search?&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Rails 8 was released that day, and he closed the keynote by touching on some expected features for its next release, 8.1. Next in its sights is the beast that no sane person wants to run: Elasticsearch, with the promise of bringing a sophisticated search engine into Rails itself. Also up for inclusion is &amp;ldquo;House (MD)&amp;rdquo;, which would make Markdown a more native piece of the Rails stack.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# search on any field
Post.search &amp;quot;announcement&amp;quot;

# by specific fields
Post.search title: &amp;quot;announcement&amp;quot;, content: &amp;quot;solid search&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;img src=&quot;/photographs/fragments/rails-world-2024/conference-hall.jpg&quot; srcset=&quot;/photographs/fragments/rails-world-2024/conference-hall@2x.jpg 2x, /photographs/fragments/rails-world-2024/conference-hall.jpg 1x&quot; loading=&quot;lazy&quot; class=&quot;rounded-md&quot;&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;20-min&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#20-min&quot;&gt;Twenty min&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Rails World was bigger this year than last, but it&amp;rsquo;s far from a huge conference, as shown by the competitive ticketing process, where tickets were gone 20 minutes after going on sale.&lt;/p&gt;

&lt;p&gt;Hard-to-get tickets are bad, but a positive side effect is that everyone at Rails World &lt;em&gt;really wanted&lt;/em&gt; to be at Rails World. You don&amp;rsquo;t get there by accident. The result is that every single person you spoke to had something interesting to say. In one case I&amp;rsquo;d randomly started talking to a couple Dutch guys staying at the same hotel I was, and 15 minutes later we were talking about the trade offs of Aurora versus vanilla Postgres. This will sound self-serving, but I met quite a few people that were already familiar with this website, and they&amp;rsquo;d ask &lt;em&gt;me&lt;/em&gt; about topics I&amp;rsquo;d written about recently like &lt;a href=&quot;https://www.crunchydata.com/blog/real-world-performance-gains-with-postgres-17-btree-bulk-scans&quot;&gt;Postgres 17 bulk B-tree lookups&lt;/a&gt; or &lt;a href=&quot;/fragments/secure-bytes-without-pgcrypto&quot;&gt;generating a couple secure bytes with &lt;code&gt;gen_random_uuid()&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I love it. The passion and expertise is the closest I&amp;rsquo;ve experienced at any event to what we used to get in the halcyon days of the early 2010s, before tech was so obviously the most important industry in the world, and became ludicrously financialized as every venture firm and Stanford graduate jumped to get a piece of it.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;toronto&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#toronto&quot;&gt;Unpopular opinion: Toronto&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Going on its second year now, there&amp;rsquo;s a traditional announcement in the closing keynote of where the next Rails World will be held. In 2025, it&amp;rsquo;ll be back in Amsterdam, and I admit to breathing a sigh of relief (assuming I can even get in).&lt;/p&gt;

&lt;p&gt;The Evergreen Brick Works venue is gorgeous, Shopify&amp;rsquo;s Toronto office is fabulous, and I had a good time visiting the city. But. Toronto&amp;rsquo;s downtown is enormous, and it&amp;rsquo;s the kind of place where every street, at every hour day or night, is characterized by the constant roar of total, all-encompassing, gridlock traffic. And like anywhere, when traffic is bad and tempers are heated, roads are never enough space for the pinnacle of human innovation, the automobile, and cars spill over onto every crosswalk and bike lane. With the bike lanes full, bike traffic moves onto the sidewalks, 90%+ of which is also motorized, with few riders even bothering to give lip service to those little foot rest doodads on the bottom of the bike that, before the advent of the lithium battery and lightweight motor, used to be for pedaling. Stop signs, red lights, and traffic priority all become the loosest of possible suggestions.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;d be exploring an inner city suburb, with leafy canopy and the most gorgeous, stately houses that positively &lt;em&gt;ooze&lt;/em&gt; history in all directions. Amazing! Beautiful! Except, these otherwise quiet streets are filled to the brim with hundreds of bumper-to-bumper SUVs (no self-respecting Canadian drives anything smaller than an SUV, and a family of two or more should ideally upgrade to something a little more size appropriate, like an F-350) inching their way onward at a pace only marginally faster than a brisk walk. I&amp;rsquo;d cross a bridge over a deep, forested ravine. Look over the edge, expecting to see a peaceful, bubbling brook far below. What do I see instead? A highway of course, which Torontonians have seen fit to plough through each of the city&amp;rsquo;s precious few parks.&lt;/p&gt;

&lt;p&gt;After one of the evening parties I found myself talking to a guy who was professing his undying love for the city of Toronto. Me: what exactly do you like about it? Him: the &lt;em&gt;diversityyyyyy&lt;/em&gt; man. Me: &amp;hellip; okay, &amp;hellip; anything else?&lt;/p&gt;

&lt;p&gt;Sorry, I can&amp;rsquo;t help myself. But also, Amsterdam is the correct answer.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;To recap, great event, great people. I hope to see many of you there next year.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>TIL: Variables in custom VSCode snippets</title>
    <link href="https://brandur.org/fragments/vscode-snippets"/>
    <id>tag:brandur.org,2024-10-04:fragments/vscode-snippets</id>
    <updated>2024-10-04T18:18:21Z</updated>


    <content type="html">&lt;p&gt;This blog is entirely driven by Markdown, TOML, and Git. Publishing an &lt;a href=&quot;/atoms&quot;&gt;atom&lt;/a&gt; or &lt;a href=&quot;/sequences&quot;&gt;sequence&lt;/a&gt; involves popping open a TOML file, adding a new item to the top, committing to Git, and pushing to origin to trigger a CI action that deploys the site:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-toml&quot;&gt;[[atoms]]
  published_at = 2024-10-04T10:24:22-07:00
  description = &amp;quot;&amp;quot;&amp;quot;\
Hello, world!
&amp;quot;&amp;quot;&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This generally works quite well, and is, in this developer&amp;rsquo;s humble opinion, far preferable to something involving a web UI with a little text box, but when I&amp;rsquo;m being honest with myself, I have to admit that the friction to editing is a little too high, and prevents me from publishing posts that I would&amp;rsquo;ve done if I was on a platform &lt;em&gt;with&lt;/em&gt; a web UI and a little text box, like Twitter.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;d been using &lt;a href=&quot;https://code.visualstudio.com/docs/editor/userdefinedsnippets&quot;&gt;VSCode snippets&lt;/a&gt; to speed up inserting a new TOML item, but the &lt;code&gt;published_at&lt;/code&gt; date wasn&amp;rsquo;t automated, so I&amp;rsquo;d have to jump to a terminal, get a timestamp with &lt;code&gt;date&lt;/code&gt;, then jump back and paste it. Not a big deal, but a little slow and mildly annoying.&lt;/p&gt;

&lt;p&gt;I went back and RTFMed. It turns out that custom snippets support a number of built-in variables like &lt;code&gt;$TM_FILENAME&lt;/code&gt;, &lt;code&gt;$CURRENT_SECONDS_UNIX&lt;/code&gt;, or even &lt;code&gt;$UUID&lt;/code&gt; for a random V4 UUID.&lt;/p&gt;

&lt;p&gt;With a few more variables I got it to insert RFC3339 dates exactly like the ones I&amp;rsquo;d been grabbing from my terminal:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
	&amp;quot;New atom&amp;quot;: {
		&amp;quot;prefix&amp;quot;: &amp;quot;at&amp;quot;,
		&amp;quot;body&amp;quot;: [
			&amp;quot;&amp;quot;,
			&amp;quot;[[atoms]]&amp;quot;,
			&amp;quot;  published_at = $CURRENT_YEAR-$CURRENT_MONTH-${CURRENT_DATE}T$CURRENT_HOUR:$CURRENT_MINUTE:$CURRENT_SECOND$CURRENT_TIMEZONE_OFFSET&amp;quot;,
			&amp;quot;  description = \&amp;quot;\&amp;quot;\&amp;quot;\\&amp;quot;,
			&amp;quot;$1&amp;quot;,
			&amp;quot;\&amp;quot;\&amp;quot;\&amp;quot;&amp;quot;,
			&amp;quot;&amp;quot;
		],
		&amp;quot;description&amp;quot;: &amp;quot;New atom&amp;quot;
	}
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There&amp;rsquo;s quite a few other useful built-ins (e.g. currently selected text, contents of clipboard, start comment), and &lt;a href=&quot;https://code.visualstudio.com/docs/editor/userdefinedsnippets#_transform-examples&quot;&gt;transformations with regex&lt;/a&gt; are supported.&lt;/p&gt;

&lt;p&gt;I also took the time to get the whitespace around the inserted block exactly right, so no extra time is needed to correct it after insertion. All in all I probably saved myself about ten seconds for each snippet use, but it&amp;rsquo;s enough of a gain to make myself marginally more likely to do it.&lt;/p&gt;

&lt;p&gt;Next up (hopefully): a mobile publishing workflow, something that&amp;rsquo;s been sorely missing for years.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Automatic Transcription and Accident Detection from Radio Chatter</title>
    <link href="https://www.scd31.com/posts/automated-police-scanner"/>
    <id>https://www.scd31.com/posts/automated-police-scanner</id>
    <updated>2024-10-02T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>A few secure, random bytes without `pgcrypto`</title>
    <link href="https://brandur.org/fragments/secure-bytes-without-pgcrypto"/>
    <id>tag:brandur.org,2024-09-24:fragments/secure-bytes-without-pgcrypto</id>
    <updated>2024-09-24T18:38:37Z</updated>


    <content type="html">&lt;p&gt;In Postgres it&amp;rsquo;s common to see the SQL &lt;code&gt;random()&lt;/code&gt; function used to generate a random number, but it&amp;rsquo;s a pseudo-random number generator, and not suitable for cases where real randomness is critical. Postgres provides a way of getting secure random numbers as well, but only through the use of the &lt;code&gt;pgcrypto&lt;/code&gt; extension, which makes &lt;code&gt;gen_random_bytes&lt;/code&gt; available.&lt;/p&gt;

&lt;p&gt;Pulling &lt;code&gt;pgcrypto&lt;/code&gt; into your database is probably fine&amp;mdash;at least it&amp;rsquo;s a core extension that&amp;rsquo;s distributed with Postgres itself&amp;mdash;but while testing the RC version of &lt;a href=&quot;https://www.crunchydata.com/blog/real-world-performance-gains-with-postgres-17-btree-bulk-scans&quot;&gt;Postgres 17&lt;/a&gt; last week, I found that it was surprisingly difficult to build Postgres against OpenSSL, which is required to build &lt;code&gt;pgcrypto&lt;/code&gt;, thereby making &lt;code&gt;pgcrypto&lt;/code&gt; itself hard to build.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m broadly against the use of Postgres extensions because they make upgrades harder and projects less portable &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, so we have a minimal posture when it comes to them, depending only on &lt;code&gt;btree_gist&lt;/code&gt; and &lt;code&gt;pgcrypto&lt;/code&gt;. Like &lt;code&gt;pgcrypto&lt;/code&gt;, &lt;code&gt;btree_gist&lt;/code&gt; is also distributed with Postgres, but unlike &lt;code&gt;pgcrypto&lt;/code&gt;, doesn&amp;rsquo;t have an OpenSSL dependency, making it trivial to build.&lt;/p&gt;

&lt;p&gt;Rather than wasting more time trying to get OpenSSL configured, I did a quick code audit to find out where we were using &lt;code&gt;pgcrypto&lt;/code&gt;, and found that we were using it in exactly one place to generate random bytes for use in &lt;a href=&quot;/nanoglyphs/026-ids&quot;&gt;a ULID&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- 10 entropy bytes
ulid = timestamp || gen_random_bytes(10);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Needing a whole extension for generating a few random bytes seems like a waste, but unfortunately Postgres doesn&amp;rsquo;t offer a built-in way to get cryptographically secure random bytes in any other way &amp;hellip; or does it?&lt;/p&gt;

&lt;h2 id=&quot;secure-bytes&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#secure-bytes&quot;&gt;Secure bytes, just not for you&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Internally, Postgres has a module called &lt;code&gt;pg_strong_random.c&lt;/code&gt; that exports a &lt;code&gt;pg_strong_random()&lt;/code&gt; function that will use OpenSSL if available, but can fall back to &lt;code&gt;/dev/urandom&lt;/code&gt; in case it&amp;rsquo;s not, which is perfectly fine for our purposes:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;/*
 * pg_strong_random &amp;amp; pg_strong_random_init
 *
 * Generate requested number of random bytes. The returned bytes are
 * cryptographically secure, suitable for use e.g. in authentication.
 *
 * Before pg_strong_random is called in any process, the generator must first
 * be initialized by calling pg_strong_random_init().
 *
 * We rely on system facilities for actually generating the numbers.
 * We support a number of sources:
 *
 * 1. OpenSSL&#39;s RAND_bytes()
 * 2. Windows&#39; CryptGenRandom() function
 * 3. /dev/urandom
 *
 * Returns true on success, and false if none of the sources
 * were available. NB: It is important to check the return value!
 * Proceeding with key generation when no random data was available
 * would lead to predictable keys and security issues.
 */
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So secure randomness is available without needing to dip into OpenSSL or &lt;code&gt;pgcrypto&lt;/code&gt;. Postgres just doesn&amp;rsquo;t make it available to you.&lt;/p&gt;

&lt;h2 id=&quot;roundabout-randomness&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#roundabout-randomness&quot;&gt;Roundabout randomness&lt;/a&gt;&lt;/h2&gt;
 

&lt;p&gt;Luckily, there&amp;rsquo;s a workaround. &lt;code&gt;pg_strong_random()&lt;/code&gt; is called through another function that&amp;rsquo;s exported to userspace, Postgres 13&amp;rsquo;s &lt;code&gt;gen_random_uuid()&lt;/code&gt; which generates a V4 UUID that&amp;rsquo;s secure, random data with the exception of six variant/version bits in the middle:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;Datum
gen_random_uuid(PG_FUNCTION_ARGS)
{
    pg_uuid_t  *uuid = palloc(UUID_LEN);

    if (!pg_strong_random(uuid, UUID_LEN))
        ereport(ERROR,
                (errcode(ERRCODE_INTERNAL_ERROR),
                 errmsg(&amp;quot;could not generate random values&amp;quot;)));

    /*
     * Set magic numbers for a &amp;quot;version 4&amp;quot; (pseudorandom) UUID, see
     * http://tools.ietf.org/html/rfc4122#section-4.4
     */
    uuid-&amp;gt;data[6] = (uuid-&amp;gt;data[6] &amp;amp; 0x0f) | 0x40;    /* time_hi_and_version */
    uuid-&amp;gt;data[8] = (uuid-&amp;gt;data[8] &amp;amp; 0x3f) | 0x80;    /* clock_seq_hi_and_reserved */

    PG_RETURN_UUID_P(uuid);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Given our use of &lt;code&gt;pgcrypto&lt;/code&gt; is so limited, and we only need ten random bytes at a time for a ULID, I changed our &lt;code&gt;gen_ulid()&lt;/code&gt; implementation to find ten bytes of randomness by pulling five bytes off the front and back of a V4 UUID:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- 10 entropy bytes
--
-- We extract these by generating a random UUID and extracting
-- the first five bytes and last five bytes out of it (thus avoiding
-- versioning bits in the middle). This is a roundabout way of
-- doing this, but is done to avoid a dependency on the pgcrypto
-- extension just to get `gen_random_bytes()`.
--
-- `uuid_send()` changes `uuid` to `bytea`.
random_uuid = uuid_send(gen_random_uuid());
ulid = timestamp ||
    substring(random_uuid FROM 1 FOR 5) ||
    substring(random_uuid FROM 12 FOR 5);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Which then lets us rid ourselves of &lt;code&gt;pgcrypto&lt;/code&gt;, along with OpenSSL:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;DROP EXTENSION pgcrypto;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Making tests against a locally built version of Postgres considerably easier.&lt;/p&gt;
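
&lt;p&gt;The byte offsets in &lt;code&gt;gen_ulid()&lt;/code&gt; are easier to double check outside of SQL. Here&amp;rsquo;s a minimal Go sketch, not part of the actual change and with made-up function names (&lt;code&gt;randomV4&lt;/code&gt;, &lt;code&gt;entropy10&lt;/code&gt;), that mirrors what &lt;code&gt;gen_random_uuid()&lt;/code&gt; does to its 16 bytes and then takes the same two slices. SQL&amp;rsquo;s &lt;code&gt;substring&lt;/code&gt; is 1-indexed, so &lt;code&gt;FROM 1 FOR 5&lt;/code&gt; is bytes 0 to 4 and &lt;code&gt;FROM 12 FOR 5&lt;/code&gt; is bytes 11 to 15, which stays clear of the version nibble in byte 6 and the variant bits in byte 8:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    &amp;quot;crypto/rand&amp;quot;
    &amp;quot;fmt&amp;quot;
)

// randomV4 mimics gen_random_uuid(): 16 secure random bytes with the
// version and variant bits overwritten. (Hypothetical helper.)
func randomV4() ([16]byte, error) {
    var u [16]byte
    if _, err := rand.Read(u[:]); err != nil {
        return u, err
    }
    u[6] = (u[6] &amp;amp; 0x0f) | 0x40 // version 4 in the high nibble of byte 6
    u[8] = (u[8] &amp;amp; 0x3f) | 0x80 // variant bits at the top of byte 8
    return u, nil
}

// entropy10 takes bytes 0-4 and 11-15, the same ranges as
// substring(... FROM 1 FOR 5) and substring(... FROM 12 FOR 5) in SQL.
// (Hypothetical helper.)
func entropy10(u [16]byte) []byte {
    out := make([]byte, 0, 10)
    out = append(out, u[0:5]...)
    out = append(out, u[11:16]...)
    return out
}

func main() {
    u, err := randomV4()
    if err != nil {
        panic(err)
    }
    fmt.Printf(&amp;quot;uuid bytes:    %x\n&amp;quot;, u[:])
    fmt.Printf(&amp;quot;entropy bytes: %x\n&amp;quot;, entropy10(u))
}
&lt;/code&gt;&lt;/pre&gt;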

&lt;p&gt;I&amp;rsquo;m hoping we can ditch this hack as soon as V7 UUIDs land in core (they didn&amp;rsquo;t make Postgres 17, which is very sad), but in the meantime, this trick might be useful to someone else.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Direnv&#39;s `source_env`, and how to manage project configuration</title>
    <link href="https://brandur.org/fragments/direnv-source-env"/>
    <id>tag:brandur.org,2024-09-20:fragments/direnv-source-env</id>
    <updated>2024-09-20T11:53:58Z</updated>


    <content type="html">&lt;p&gt;For years I&amp;rsquo;ve been using &lt;a href=&quot;https://direnv.net/&quot;&gt;Direnv&lt;/a&gt; to manage configuration in projects. It&amp;rsquo;s a small program that loads env vars out of an &lt;code&gt;.envrc&lt;/code&gt; file on a directory by directory basis, using a shell hook to load vars as you enter a folder, and unload them as you leave.&lt;/p&gt;

&lt;p&gt;A typical &lt;code&gt;.envrc&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;export API_URL=&amp;quot;http://localhost:5222&amp;quot;
export DATABASE_URL=&amp;quot;postgres://localhost:5432/project-db&amp;quot;
export ENV_NAME=dev
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The beauty of Direnv is not only that it&amp;rsquo;s 12-factor friendly, but that it&amp;rsquo;s language agnostic, and unlike its language-specific alternatives that hook into program code in various creative ways, Direnv makes configuration available to your main program &lt;em&gt;and&lt;/em&gt; anything else you need to run with it.&lt;/p&gt;

&lt;p&gt;So configuration is available for your project&amp;rsquo;s core programs:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# gets DATABASE_URL from env
make build/api &amp;amp;&amp;amp; build/api
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And for all adjacent utilities, including ones that you didn&amp;rsquo;t write, and would otherwise have no way of hooking into a bespoke configuration system:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# still works fine!
goose -dir ./migrations/main postgres $DATABASE_URL
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;uneven-distribution&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#uneven-distribution&quot;&gt;Uneven distribution&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;For years I&amp;rsquo;ve recommended in project READMEs to get started by copying an &lt;code&gt;.envrc&lt;/code&gt; template and running the program:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;cp .envrc.sample .envrc
direnv allow
go test ./...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;.envrc.sample&lt;/code&gt; is committed to Git while &lt;code&gt;.envrc&lt;/code&gt; is not due to the presumption that it may eventually be edited to include user-specific secrets.&lt;/p&gt;

&lt;p&gt;That works fine, but has always had the downside that if configuration changes and &lt;code&gt;.envrc.sample&lt;/code&gt; is updated, other developers don&amp;rsquo;t get those changes unless they copy a fresh &lt;code&gt;.envrc.sample&lt;/code&gt;, and they almost certainly won&amp;rsquo;t think to do that. This is an advantage that I&amp;rsquo;d thought language-specific configuration systems like &lt;a href=&quot;https://www.npmjs.com/package/dotenv0&quot;&gt;Dotenv&lt;/a&gt; had over Direnv: they can often read multiple env files, some of which may contain shared configuration that&amp;rsquo;s versioned with the repo.&lt;/p&gt;

&lt;h2 id=&quot;section-1&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#section-1&quot;&gt;The missing piece of the puzzle: &lt;code&gt;source_env&lt;/code&gt;&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Well, after being a Direnv user for &lt;em&gt;ten years&lt;/em&gt;, yesterday I learnt of the existence of &lt;a href=&quot;https://direnv.net/man/direnv-stdlib.1.html&quot;&gt;&lt;code&gt;source_env&lt;/code&gt;&lt;/a&gt;, a special directive that can go in an &lt;code&gt;.envrc&lt;/code&gt; and which will read config out of another envrc file.&lt;/p&gt;

&lt;p&gt;This simplifies the configuration of my projects &lt;em&gt;dramatically&lt;/em&gt;. They have an &lt;code&gt;.envrc.sample&lt;/code&gt;, but it&amp;rsquo;s stripped down to almost nothing, containing only a &lt;code&gt;source_env&lt;/code&gt; statement and room to add customization.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# Common configuration for all developers, committed to Git.
source_env .envrc.local

# Custom env values go here.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Meanwhile, all default configuration migrates to a &lt;code&gt;.envrc.local&lt;/code&gt; (the &lt;code&gt;.local&lt;/code&gt; suffix not having any special meaning, but rather just a convention to use):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;#
# .envrc.local
#
# Shared env vars committed to Git and made available to all
# developers. As much configuration should go here as possible
# so that new env vars don&#39;t break anyone and everyone gets to
# benefit from improvements, but don&#39;t add anything too secret or
# too custom.
#

export API_URL=&amp;quot;http://localhost:5222&amp;quot;
export DATABASE_URL=&amp;quot;postgres://localhost:5432/project-db&amp;quot;
export ENV_NAME=dev
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;.envrc.local&lt;/code&gt; is committed to Git, and when anyone changes configuration, all other developers get the updates the next time they pull from master.&lt;/p&gt;

&lt;p&gt;This doesn&amp;rsquo;t account for truly sensitive configuration that shouldn&amp;rsquo;t be stored in a Git repository, but my advice on that: projects should always be able to gracefully degrade so they can run (at least in development mode) with no sensitive secrets at all. And &lt;em&gt;certainly&lt;/em&gt; the test suite should be able to. If your project can&amp;rsquo;t do that, something is wrong.&lt;/p&gt;

&lt;p&gt;For my money, Direnv + &lt;code&gt;source_env&lt;/code&gt; is a perfect dev configuration system, and one that works cleanly in any language ecosystem.&lt;/p&gt;</content>

    <author>
      <name>brandur (fragments)</name>

      <uri>https://brandur.org/fragments</uri>

    </author>
  </entry>

  <entry>
    <title>Engineering Mindset</title>
    <link href="https://tomeraberba.ch/engineering-mindset"/>
    <id>https://tomeraberba.ch/engineering-mindset</id>
    <updated>2024-09-20T00:00:00Z</updated>


    <content type="html">I recently left Google and during my last two weeks several people asked me the same question: How did you get to where you are? What they were asking is how I developed my software engineering skills…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>The 1 Hour per Year Bug (But Only in Pacific Time!)</title>
    <link href="https://tomeraberba.ch/the-1-hour-per-year-bug"/>
    <id>https://tomeraberba.ch/the-1-hour-per-year-bug</id>
    <updated>2024-08-15T00:00:00Z</updated>


    <content type="html">:::note This article was updated in response to some helpful comments on Hacker News. I originally incorrectly formatted the Pacific Time Zone daylight saving times relative to the Eastern Time Zone!…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>lonely cosmopolite</title>
    <link href="https://tomeraberba.ch/lonely-cosmopolite"/>
    <id>https://tomeraberba.ch/lonely-cosmopolite</id>
    <updated>2024-08-05T00:00:00Z</updated>


    <content type="html">I composed, produced, and released an instrumental lofi track! https://www.youtube.com/watch?v=g54wSssQCTA You can also find the track on Spotify, Apple Music, YouTube Music, and other music streaming…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>HP 1651B Boot Disk Creation</title>
    <link href="https://www.scd31.com/posts/1651b-boot-disk"/>
    <id>https://www.scd31.com/posts/1651b-boot-disk</id>
    <updated>2024-07-10T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>To Dedupe Then Sort or Sort Then Dedupe?</title>
    <link href="https://tomeraberba.ch/to-dedupe-then-sort-or-sort-then-dedupe"/>
    <id>https://tomeraberba.ch/to-dedupe-then-sort-or-sort-then-dedupe</id>
    <updated>2024-06-30T00:00:00Z</updated>


    <content type="html">I recently came across a deceptively simple problem. I wanted to dedupe and sort a list of integers in-place. My initial instinct was to first remove duplicates and then sort using the programming…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Eradicating N+1s: The Two-phase Data Load and Render Pattern in Go</title>
    <link href="https://brandur.org/two-phase-render"/>
    <id>tag:brandur.org,2024-05-28:two-phase-render</id>
    <updated>2024-05-28T18:50:47Z</updated>


    <content type="html">&lt;p&gt;&lt;em&gt;Author’s note:&lt;/em&gt; This is a longer piece that starts off with exposition into the nature of the N+1 query problem. If you&amp;rsquo;re already well familiar with it, you may want to skip my description of N+1 to a story involving a creative use of &lt;a href=&quot;#fibers-and-intents&quot;&gt;Ruby fibers at Stripe&lt;/a&gt; to try and plug this hole, or the &lt;a href=&quot;#two-phase&quot;&gt;two-phase load and render&lt;/a&gt; that I&amp;rsquo;ve put in my current company&amp;rsquo;s Go codebase, a pattern we&amp;rsquo;ve been using for two years now that&amp;rsquo;s rid of us N+1s, and for which I&amp;rsquo;d have trouble citing any deficiency (aside from Go&amp;rsquo;s normal trouble with verbosity). It works.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;n-plus-one&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#n-plus-one&quot;&gt;N+1 in a nutshell&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Let&amp;rsquo;s say we have a model &lt;code&gt;Product&lt;/code&gt; that can render a public-facing API resource for itself by implementing &lt;code&gt;#render&lt;/code&gt;. I&amp;rsquo;ll be talking about API resources a lot because that&amp;rsquo;s what I&amp;rsquo;m used to, but keep in mind that this could also be an object that&amp;rsquo;s used to render an HTML view and all the same concepts apply.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Product &amp;lt; ApplicationRecord
  belongs_to :owner # needs to lazy load an owner

  def render
    {
      id:          self.id,
      name:        self.name,
      owner_id:    self.owner_id,
      owner_email: self.owner.email,
    }
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Some of the properties in &lt;code&gt;#render&lt;/code&gt; like &lt;code&gt;id&lt;/code&gt; or &lt;code&gt;name&lt;/code&gt; come directly from the model itself, and nothing beyond the initial model needs to be loaded from the database. But some, like &lt;code&gt;owner_email&lt;/code&gt; must be accessed through an associated record (&lt;code&gt;product.owner&lt;/code&gt;), which the data framework (ActiveRecord in this case) will happily lazy load.&lt;/p&gt;

&lt;p&gt;Now, say ten products are rendered in a loop:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;Product.limit(10).map do |product|
  product.render
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this naive loop, the number of database queries issued to render all products is one (&lt;code&gt;Product.limit(10)&lt;/code&gt;) plus ten as &lt;code&gt;owner&lt;/code&gt; is lazily loaded on each product. That&amp;rsquo;s where we get &amp;ldquo;N+1&amp;rdquo; &amp;ndash; one initial fetch, and N as its objects are iterated and do their own loading.&lt;/p&gt;

&lt;p&gt;This practically invisible problem is probably second only to forgotten indexes as the most common cause of poor web app performance. It&amp;rsquo;s an easy mistake to make, and there&amp;rsquo;s a broad lack of guard rails to protect against it.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/n_plus_one.svg&quot; alt=&quot;N+1.&quot;&gt;&lt;/p&gt;

&lt;h3 id=&quot;n-m-plus-one&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#n-m-plus-one&quot;&gt;N*M+1 and more&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;11 queries doesn&amp;rsquo;t sound like much, but in the real world it never stops there. Let&amp;rsquo;s look at a more complicated example where &lt;code&gt;Product&lt;/code&gt; now has multiple associated resources along with a &lt;code&gt;Widget&lt;/code&gt; subresource that has its own associations.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Product &amp;lt; ApplicationRecord
  belongs_to :owner
  belongs_to :team
  has_many :widgets # has many widgets

  def render
    {
      id:          self.id,
      name:        self.name,
      owner_id:    self.owner_id,
      owner_email: self.owner.email,
      team_id:     self.team_id,
      team_name:   self.team.name,
      widget:      self.widgets.map { |w| w.render },
    }
  end
end

class Widget &amp;lt; ApplicationRecord
  belongs_to :factory # needs to lazy load a factory

  def render
    {
      id:           self.id,
      factory_id:   self.factory_id,
      factory_name: self.factory.name,
      name:         self.name,
    }
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We&amp;rsquo;re now at more like N*M+1. This is the more realistic example, and in real life it just keeps snowballing from there. Models have dozens of associations, and their subresources have subresources which have subresources. Rendering a single API resource/web page might take hundreds, or even thousands, of database queries.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/n_times_m_plus_one.svg&quot; alt=&quot;N*M+1.&quot;&gt;&lt;/p&gt;

&lt;p&gt;Luckily for all of us, databases are pretty fast, and even when abused in this fashion still tend to get the job done in a timely manner. ORMs like ActiveRecord also have features like &lt;a href=&quot;https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations&quot;&gt;eager loading&lt;/a&gt; that can be used to prefetch what otherwise would&amp;rsquo;ve been loaded lazily.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;Product.includes(owner: [], team: [], widgets: [:factory]).limit(10)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But even these sophisticated strategies have their own problems. In a large application with lots of layers, it&amp;rsquo;s not obvious from any particular query if the right prefetching is happening, and it&amp;rsquo;s easy to forget eager loads or put them in the wrong place.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;fibers-and-intents&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#fibers-and-intents&quot;&gt;A digression: Fibers and intents&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Sometimes you have to get creative to solve N+1s.&lt;/p&gt;

&lt;p&gt;A story from Stripe: due to an architecture built around Mongo, records were almost always point loaded by nothing more complex than a point index lookup (i.e. no fancy joins, eager loading, or anything else, just the equivalent of &lt;code&gt;WHERE id = @id&lt;/code&gt;). N+1s were the rule, not the exception, but with fast hardware and modest performance expectations, it’s amazing how far you can get with this brute force approach. An API request could easily run thousands of database ops.&lt;/p&gt;

&lt;p&gt;It’s a good example of how pernicious N+1s can be. Databases are fast, and especially in the beginning, you can have the least sophisticated internal practices imaginable and they’ll still be viable. A request might be making 50 database calls, 45 of which would be unnecessary in a better-designed system, but with each taking only 1-2 ms, everything’s still done in 50-100 ms.&lt;/p&gt;

&lt;p&gt;But over the years 50 calls becomes 1,000, and users start to notice that things are slow. And once things are this far gone, there’s no obvious fix. The latency isn’t due to only one factor, it’s a confluence of years’ worth of haphazardly written code, and now there&amp;rsquo;s millions of lines of it.&lt;/p&gt;

&lt;p&gt;With no easy solutions in sight, one of my colleagues came up with what to this day is still the most novel and effective hack I&amp;rsquo;ve ever seen work in production.&lt;/p&gt;

&lt;p&gt;API endpoints mapped to an API resource that they render. API resources were backed by a database model. Sometimes properties on the API resource mapped directly 1:1 to properties on the model, but especially over time, these representations tended to diverge, and custom overrides were required to map internal schema to public representation.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Charge &amp;lt; APIResource
  prop :amount_total                               # maps to model directly
  prop :refund_total, render: :render_refund_total # renders with custom function
  prop :user_email, render: :render_user_email     # renders with custom function
  
  def render_refund_total
    @model.refunds.sum { |r| r.amount_total }
  end
  
  def render_user_email
    @model.user.email
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It was these custom overrides where N+1s were most pervasive. Models used an ORM similar to ActiveRecord or Sequel that lazily loaded related records, and rendering would more often than not require loading relations. Custom overrides often rendered subresources of their own, each of which might have its own N+1s, amplifying expense to unbounded proportions.&lt;/p&gt;

&lt;h3 id=&quot;dynamic-aggregates&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#dynamic-aggregates&quot;&gt;Dynamic aggregates&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This is where the innovation came in. Ruby has a construct called &lt;a href=&quot;https://docs.ruby-lang.org/en/master/Fiber.html&quot;&gt;fibers&lt;/a&gt; which are coroutines with a smaller memory footprint than a thread (using only small 4 kB stacks), and which can be paused and started again. The devised scheme:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every custom &lt;code&gt;#render_*&lt;/code&gt; override would be wrapped in a fiber during invocation.&lt;/li&gt;
&lt;li&gt;If the fiber called into the database layer, it&amp;rsquo;d be paused. Its &amp;ldquo;intent&amp;rdquo; to query was recorded, and the next fiber started.&lt;/li&gt;
&lt;li&gt;After every fiber was either paused or completed, paused fibers were examined and their database intents aggregated into batch operations.&lt;/li&gt;
&lt;li&gt;Batch operations were invoked. Their results were disaggregated, and the appropriate data distributed back to each parked fiber.&lt;/li&gt;
&lt;li&gt;Paused fibers were continued. If new database calls were made, the sequence would start over again.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So from the example above, if 10 charges were rendered that mapped to 10 separate users, the users were bulk loaded with &lt;code&gt;user_id IN (?, ?, ?, ...)&lt;/code&gt; instead of a single &lt;code&gt;user_id = ?&lt;/code&gt;, but each fiber would get back a single user as if it&amp;rsquo;d performed a point load.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Charge &amp;lt; APIResource
  ...
  
  def render_user_email
    #
    # fiber paused, N charge renders become `user_id IN (?, ?, ?)`, results
    # disaggregated and handed to fibers, which are then continued
    #
    @model.user.email   
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/fibers.svg&quot; alt=&quot;Loading data via fibers.&quot;&gt;&lt;/p&gt;

&lt;p&gt;The system had broad limitations (e.g. only point loads could be aggregated; no complex queries were supported), but despite some gnarly code, it worked, and helped knock considerable latency off API calls.&lt;/p&gt;

&lt;p&gt;Importantly, options were limited and this was one of the few ways to have a large effect across millions of lines of code. The time where the situation could&amp;rsquo;ve been rescued with a prettier/more optimal abstraction was long since past.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;rails-strict-loading&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#rails-strict-loading&quot;&gt;Rails strict loading&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;N+1s are a constant threat in frameworks like ActiveRecord where lazy loading is common. Lazy loading is preventable with eager loading like &lt;code&gt;#includes&lt;/code&gt; / &lt;code&gt;#eager_load&lt;/code&gt; / &lt;code&gt;#preload&lt;/code&gt;, but is difficult to guarantee because even if all relations were eager loaded initially, it’s easy to accidentally regress as a new lazy load is introduced.&lt;/p&gt;

&lt;p&gt;To help ratchet down on the problem, &lt;a href=&quot;https://rubyonrails.org/2020/12/9/Rails-6-1-0-release#strict-loading-associations&quot;&gt;Rails 6.1 introduced &lt;strong&gt;strict loading&lt;/strong&gt;&lt;/a&gt;, wherein lazy loading becomes an error. The idea is that tests will exercise code which will fail if it performs a lazy load, allowing all instances of it to be banished before deployment.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;config.active_record.strict_loading_by_default = true
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Article &amp;lt; ApplicationRecord
  self.strict_loading_by_default = true

  has_many :comments
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Strict loading is an important feature and &lt;em&gt;major&lt;/em&gt; innovation in this area, but not a panacea. Test coverage needs to be substantial to make sure problems are caught before hitting production.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;go-verbosity&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#go-verbosity&quot;&gt;Loading data in Go, exceptional verbosity&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;This brings us to Go, where loading data is hard even without considering N+1s.&lt;/p&gt;

&lt;p&gt;Go can aptly be described as a newer, safer C, but with even less flexibility. You couldn’t write a good ORM for the language if you wanted to (they do exist, but rely on a lot of untyped &lt;code&gt;any&lt;/code&gt; shenanigans, which defeats the type advantages of Go in the first place since problems are only caught at runtime), and in the absence of one, the Go philosophy is to avoid abstraction. If you need something like an API resource, piece it together query-by-query, with requisite &lt;code&gt;if err != nil { ... }&lt;/code&gt; blocks after every statement.&lt;/p&gt;

&lt;p&gt;For larger applications with dozens or hundreds of associations, the default result is a breathtaking amount of boilerplate to accomplish what would be a modest amount of code in a language with more succinct syntax and a dynamic ORM.&lt;/p&gt;

&lt;p&gt;The increased verbosity does nothing to make N+1s less likely, which are still easy to introduce in a loop, especially with layers of indirection. It also makes them harder to fix because there might be a lot of refactoring involved. One of the first bugs I ever fixed coming onto the job was an N+1:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-git-commit&quot;&gt;commit de58e3552eaef78c9b3d7779ddf9c646d5009985
Author: Brandur &amp;lt;brandur@brandur.org&amp;gt;
Date:   Thu Jun 3 13:06:56 2021 -0700

    Fix N+1 query getting replicas on cluster list

    We currently have an N+1 situation when listing clusters wherein we query
    replicas for every cluster picked up in the original list. This leads to
    poor performance where a user has many clusters.

    Here we fix the problem by introducing a new query that&#39;s able to select
    replicas based on a set of input IDs, and after fetching them, we assign
    them to cluster objects appropriately.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It was about as classic of a mistake as is possible. A query in a loop:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;for _, cluster := range clusters {
    replicas, err := svc.getReplicasByClusterID(ctx, svc.executor(), cluster.ID)

    if err != nil {
        plog.Logger(ctx).Errorf(&amp;quot;could not retrieve replicas for cluster id=[%s]: %s&amp;quot;,
            cluster.ID, err.Error())
        continue
    }

    cluster.Replicas = replicas
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This one&amp;rsquo;s easy to spot, but once queries are folded into functions and other abstractions, they get less visible and harder to address.&lt;/p&gt;

&lt;p&gt;The fix was to query replicas for all clusters at once before the loop, and piece them back together afterwards, requiring an impressive amount of code for quite a commonplace operation. (This was before generics arrived in 1.18, so even basic tasks like mapping a slice to a keyed map weren&amp;rsquo;t possible with fewer than four lines of code.)&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Code in this block retrieves any replicas for these clusters and assigns
// them appropriately. All replicas are selected in one query to avoid an N+1
// problem. It would be nice to generalize this pattern because it&#39;s not pretty.
{
    clusterIDs := make([]pgtype.UUID, len(clusters))
    for i, cluster := range clusters {
        clusterIDs[i] = db.MakeUUID(cluster.ID).UUID
    }

    replicas, err := svc.getReplicasByClusterIDs(ctx, svc.executor(), clusterIDs)
    if err != nil {
        return nil, err
    }

    clusterMap := make(map[string]*dbops.Cluster)
    for _, cluster := range clusters {
        clusterMap[cluster.ID] = cluster
    }

    for _, replica := range replicas {
        cluster := clusterMap[replica.ClusterID]
        cluster.Replicas = append(cluster.Replicas, replica)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Beyond the eyesore, this case-by-case approach doesn&amp;rsquo;t scale well code-wise either. Even this example for a single API resource with one sub-list is already messy. What would happen for one with dozens of subresources, each of which might have dozens of subresources of their own? Then add a half dozen different developers into the equation, none of whom will have perfect insight into or understanding of code that anyone else wrote.&lt;/p&gt;

&lt;p&gt;Despite Go&amp;rsquo;s ad nauseam verbosity, it&amp;rsquo;s no less susceptible to N+1s than a language heavy in metaprogramming like Ruby.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;two-phase&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#two-phase&quot;&gt;Two-phase load and render&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;This is where our generalized data loading pattern comes in. It doesn&amp;rsquo;t make N+1s impossible, but it forces developers to break convention to introduce them, making adding a new one harder than not doing so.&lt;/p&gt;

&lt;p&gt;As the name suggests, it&amp;rsquo;s broken down into two distinct phases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load phase:&lt;/strong&gt; Generates a &lt;strong&gt;load bundle&lt;/strong&gt; from the database containing everything needed to render an &lt;strong&gt;arbitrary number&lt;/strong&gt; of resources. Load phases always load data for N resources, even if only a single one is being rendered.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Render phase:&lt;/strong&gt; Using a load bundle, renders a single resource. No database access is allowed.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight is that the load phase knows how to load data to a bundle that&amp;rsquo;s sufficient to render N resources. For a list endpoint, render may then be called using that bundle for N resources in the list. For a point retrieval endpoint, it&amp;rsquo;ll render only one resource. Either way, the process is the same.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/render_load_bundle.svg&quot; alt=&quot;Rendering a load bundle.&quot;&gt;&lt;/p&gt;
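
&lt;p&gt;Stripped of everything project-specific, the shape looks something like the following self-contained sketch. It&amp;rsquo;s an illustration only: &lt;code&gt;fakeDB&lt;/code&gt;, &lt;code&gt;load&lt;/code&gt;, &lt;code&gt;render&lt;/code&gt;, and the in-memory data are hypothetical stand-ins rather than anything from a real codebase. The point to notice is that &lt;code&gt;render&lt;/code&gt; never receives a database handle, so it has no way to issue a query:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import &amp;quot;fmt&amp;quot;

type Owner struct {
    ID    int
    Email string
}

type Product struct {
    ID      int
    Name    string
    OwnerID int
}

// fakeDB stands in for a real database and counts the queries it serves.
type fakeDB struct {
    owners  map[int]Owner
    queries int
}

// OwnersByIDs is the batched lookup: one query for any number of IDs.
func (db *fakeDB) OwnersByIDs(ids []int) []Owner {
    db.queries++
    owners := make([]Owner, 0, len(ids))
    for _, id := range ids {
        if o, ok := db.owners[id]; ok {
            owners = append(owners, o)
        }
    }
    return owners
}

// bundle holds everything render needs for any number of products.
type bundle struct {
    owners map[int]Owner
}

// load is phase one, and the only place allowed to touch the database.
func load(db *fakeDB, products []Product) bundle {
    ids := make([]int, len(products))
    for i, p := range products {
        ids[i] = p.OwnerID
    }
    b := bundle{owners: make(map[int]Owner)}
    for _, o := range db.OwnersByIDs(ids) {
        b.owners[o.ID] = o
    }
    return b
}

// render is phase two. It reads from the bundle only.
func render(b bundle, p Product) string {
    return fmt.Sprintf(&amp;quot;%s (owner: %s)&amp;quot;, p.Name, b.owners[p.OwnerID].Email)
}

func main() {
    db := &amp;amp;fakeDB{owners: map[int]Owner{
        1: {ID: 1, Email: &amp;quot;a@example.com&amp;quot;},
        2: {ID: 2, Email: &amp;quot;b@example.com&amp;quot;},
    }}
    products := []Product{
        {ID: 10, Name: &amp;quot;Anvil&amp;quot;, OwnerID: 1},
        {ID: 11, Name: &amp;quot;Bolt&amp;quot;, OwnerID: 2},
        {ID: 12, Name: &amp;quot;Crank&amp;quot;, OwnerID: 1},
    }

    b := load(db, products)
    for _, p := range products {
        fmt.Println(render(b, p))
    }
    fmt.Println(&amp;quot;queries:&amp;quot;, db.queries)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Rendering three products costs one owner query here, and rendering three hundred would cost the same.&lt;/p&gt;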

&lt;p&gt;Let&amp;rsquo;s look at a basic example. A product API resource, each of which has one owner and belongs to a team:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package apiresourcekind

type Product struct {
    apiresource.APIResourceBase

    ID         uuid.UUID `json:&amp;quot;id&amp;quot;`
    Name       string    `json:&amp;quot;name&amp;quot;`
    OwnerID    uuid.UUID `json:&amp;quot;owner_id&amp;quot;`
    OwnerEmail string    `json:&amp;quot;owner_email&amp;quot;`
    TeamID     uuid.UUID `json:&amp;quot;team_id&amp;quot;`
    TeamName   string    `json:&amp;quot;team_name&amp;quot;`
}
&lt;/code&gt;&lt;/pre&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;//
// Phase 1: Load data into a bundle
//

type ProductLoadBundle struct {
    accounts map[uuid.UUID]*dbsqlc.Account // account ID -&amp;gt; account
    teams    map[uuid.UUID]*dbsqlc.Team    // team ID -&amp;gt; team
}

func (_ *Product) LoadBundle(
    ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, products []*dbsqlc.Product,
) (*ProductLoadBundle, error) {
    var (
        bundle  = &amp;amp;ProductLoadBundle{}
        queries = dbsqlc.New(e)
    )

    // Load owners for all products, map them in bundle by ID.
    {
        accounts, err := queries.AccountGetByIDMany(ctx,
            sliceutil.Map(products, func(p *dbsqlc.Product) uuid.UUID { return p.OwnerID }))
        if err != nil {
            return nil, xerrors.Errorf(&amp;quot;error getting accounts: %w&amp;quot;, err)
        }
        bundle.accounts = sliceutil.KeyBy(accounts, func(a *dbsqlc.Account) uuid.UUID { return a.ID })
    }

    // Load teams for all products, map them in bundle by ID.
    {
        teams, err := queries.TeamGetByIDMany(ctx,
            sliceutil.Map(products, func(p *dbsqlc.Product) uuid.UUID { return p.TeamID }))
        if err != nil {
            return nil, xerrors.Errorf(&amp;quot;error getting teams: %w&amp;quot;, err)
        }
        bundle.teams = sliceutil.KeyBy(teams, func(t *dbsqlc.Team) uuid.UUID { return t.ID })
    }

    return bundle, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(Once again, please forgive the verbosity &amp;ndash; there is literally no way to make this code more succinct in Go. It&amp;rsquo;s already boiled down as far as possible.)&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/product_load_bundle.svg&quot; alt=&quot;Product load bundle.&quot;&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;//
// Phase 2: Use a bundle to render a single resource
//

func (_ *Product) Render(
    ctx context.Context, baseParams *pbaseparam.BaseParams, bundle *ProductLoadBundle, product *dbsqlc.Product,
) (*Product, error) {
    return &amp;amp;Product{
        ID:         product.ID,
        Name:       product.Name,
        OwnerID:    product.OwnerID,
        OwnerEmail: bundle.accounts[product.OwnerID].Email,
        TeamID:     product.TeamID,
        TeamName:   bundle.teams[product.TeamID].Name,
    }, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A &lt;code&gt;Product&lt;/code&gt; is rendered from a &lt;code&gt;ProductLoadBundle&lt;/code&gt; bundle and &lt;code&gt;dbsqlc.Product&lt;/code&gt; database model. Some properties like &lt;code&gt;ID&lt;/code&gt; and &lt;code&gt;Name&lt;/code&gt; are inherent to the product itself and are reflected directly into the API resource, but others like &lt;code&gt;OwnerEmail&lt;/code&gt; and &lt;code&gt;TeamName&lt;/code&gt; are only accessible by loading other database records and accessing their properties.&lt;/p&gt;

&lt;p&gt;So, the full render process is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;LoadBundle&lt;/code&gt; is invoked once (regardless of the number of products being rendered).

&lt;ul&gt;
&lt;li&gt;Owner and team records are loaded in bulk for every product (e.g. &lt;code&gt;queries.AccountGetByIDMany&lt;/code&gt; is generated by &lt;a href=&quot;/sqlc&quot;&gt;sqlc&lt;/a&gt;, and maps to roughly &lt;code&gt;SELECT * FROM account WHERE id = any(@id::uuid[])&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Owners and teams are placed into maps on &lt;code&gt;ProductLoadBundle&lt;/code&gt;, keyed by their IDs.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Render&lt;/code&gt; is invoked for each product individually, but reusing the same load bundle from (1).

&lt;ul&gt;
&lt;li&gt;Properties like &lt;code&gt;ID&lt;/code&gt; and &lt;code&gt;Name&lt;/code&gt; map directly from model to API resource.&lt;/li&gt;
&lt;li&gt;Indirect properties like &lt;code&gt;OwnerEmail&lt;/code&gt; and &lt;code&gt;TeamName&lt;/code&gt; are pulled off the records added to the load bundle in (1).&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ol&gt;
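
&lt;p&gt;To make the constant query count concrete, here&amp;rsquo;s a minimal, self-contained sketch of the same flow. It swaps the real database for a hypothetical in-memory &lt;code&gt;fakeDB&lt;/code&gt; that counts queries, and every name in it (&lt;code&gt;fakeDB&lt;/code&gt;, &lt;code&gt;accountsByIDs&lt;/code&gt;, &lt;code&gt;renderAll&lt;/code&gt;, and so on) is an assumption made up for illustration rather than part of the code above:&lt;/p&gt;

```go
package main

import "fmt"

// Hypothetical stand-ins for database models (not the dbsqlc types above).
type account struct {
	ID    int
	Email string
}

type product struct {
	ID      int
	Name    string
	OwnerID int
}

// fakeDB is an in-memory stand-in for a database that counts queries.
type fakeDB struct {
	accounts map[int]account
	queries  int
}

// accountsByIDs simulates a single bulk query, no matter how many IDs
// are passed in.
func (db *fakeDB) accountsByIDs(ids []int) []account {
	db.queries++
	out := make([]account, 0, len(ids))
	for _, id := range ids {
		if a, ok := db.accounts[id]; ok {
			out = append(out, a)
		}
	}
	return out
}

type bundle struct {
	accounts map[int]account // account ID -> account
}

// loadBundle is the load phase: one bulk query covering all products.
func loadBundle(db *fakeDB, products []product) bundle {
	ids := make([]int, len(products))
	for i, p := range products {
		ids[i] = p.OwnerID
	}
	b := bundle{accounts: map[int]account{}}
	for _, a := range db.accountsByIDs(ids) {
		b.accounts[a.ID] = a
	}
	return b
}

// render is the render phase: it reads from the bundle and has no access
// to the database at all.
func render(b bundle, p product) string {
	return fmt.Sprintf("%s (owner: %s)", p.Name, b.accounts[p.OwnerID].Email)
}

// renderAll loads a bundle once, then renders each product from it.
func renderAll(db *fakeDB, products []product) []string {
	b := loadBundle(db, products)
	out := make([]string, len(products))
	for i, p := range products {
		out[i] = render(b, p)
	}
	return out
}

func main() {
	db := new(fakeDB)
	db.accounts = map[int]account{
		1: {ID: 1, Email: "ada@example.com"},
		2: {ID: 2, Email: "bob@example.com"},
	}
	products := []product{
		{ID: 10, Name: "Anvil", OwnerID: 1},
		{ID: 11, Name: "Bellows", OwnerID: 2},
		{ID: 12, Name: "Chisel", OwnerID: 1},
	}

	for _, line := range renderAll(db, products) {
		fmt.Println(line)
	}
	fmt.Println("queries:", db.queries)
}
```

&lt;p&gt;Whether three products are rendered or three thousand, &lt;code&gt;queries&lt;/code&gt; ends up at one, because only the load phase ever touches the database.&lt;/p&gt;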

&lt;h3 id=&quot;renderable&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#renderable&quot;&gt;Renderable&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Implementing a full two-phase render involves a fair bit of code (again, it&amp;rsquo;s Go), but once it&amp;rsquo;s done, that type of API resource can easily be rendered from anywhere else:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;resource, err := apiresource.Render[*apiresourcekind.Product](
    ctx, tx, svc.BaseParams, product,
)
if err != nil {
    return nil, err
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And rendering many API resources at once (like on a list endpoint) looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;resources, err := apiresource.RenderMany[*apiresourcekind.Product](
    ctx, tx, svc.BaseParams, products,
)
if err != nil {
    return nil, err
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Returned API resources implement &lt;code&gt;Renderable&lt;/code&gt;, which holds types for bundle, model, and API resource:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package apiresource

// Renderable is an API resource that can be rendered by Render or RenderMany.
type Renderable[TLoadBundle any, TModel any, TResource any] interface {
    // LoadBundle loads a load bundle for the given models, usually from a
    // database, which can then be used along with a model to render a full API
    // resource.
    //
    // It may seem odd that this takes a slice of models instead of a model, but
    // this is for a good reason: it lets us batch load all data dependencies
    // at once instead of loading them one-by-one, which would cause an N+1
    // problem.
    LoadBundle(ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, models []TModel) (TLoadBundle, error)

    // Render renders an API resource using a load bundle and model as input.
    Render(ctx context.Context, baseParams *pbaseparam.BaseParams, bundle TLoadBundle, model TModel) (TResource, error)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;From there, implementations for &lt;code&gt;Render&lt;/code&gt; and &lt;code&gt;RenderMany&lt;/code&gt; are trivial, each loading a bundle once, and then rendering either a single API resource or a slice of them:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package apiresource

// Render renders an API resource.
//
// The type parameters may appear to be in a weird order as you might expect
// TModel before TRenderable, but it&#39;s like this for a good reason. Type
// parameters that can be inferred can be omitted, and in general use of Render
// only TRenderable needs to be included. Both TLoadBundle and TModel are
// inferred and should be omitted.
func Render[TRenderable Renderable[TLoadBundle, TModel, TRenderable], TLoadBundle any, TModel any](
    ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, model TModel,
) (TRenderable, error) {
    var renderable TRenderable

    bundle, err := renderable.LoadBundle(ctx, e, baseParams, []TModel{model})
    if err != nil {
        return renderable, xerrors.Errorf(&amp;quot;error loading bundle: %w&amp;quot;, err)
    }

    resource, err := renderable.Render(ctx, baseParams, bundle, model)
    if err != nil {
        return renderable, xerrors.Errorf(&amp;quot;error rendering resource: %w&amp;quot;, err)
    }

    return resource, nil
}

// RenderMany is similar to Render, but renders many API resources at once.
func RenderMany[TRenderable Renderable[TLoadBundle, TModel, TRenderable], TLoadBundle any, TModel any](
    ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, models []TModel,
) ([]TRenderable, error) {
    var renderable TRenderable

    bundle, err := renderable.LoadBundle(ctx, e, baseParams, models)
    if err != nil {
        return nil, xerrors.Errorf(&amp;quot;error loading bundle: %w&amp;quot;, err)
    }

    resources := make([]TRenderable, len(models))

    for i := range resources {
        resources[i], err = renderable.Render(ctx, baseParams, bundle, models[i])
        if err != nil {
            return nil, xerrors.Errorf(&amp;quot;error rendering resource: %w&amp;quot;, err)
        }
    }

    return resources, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Edit (2024/06/14):&lt;/strong&gt; This section was updated after &lt;a href=&quot;https://github.com/roman-vanesyan&quot;&gt;Roman&lt;/a&gt; &lt;a href=&quot;https://github.com/brandur/sorg/issues/368&quot;&gt;pointed out&lt;/a&gt; that by swapping the positions of two generic parameters, most of them can be inferred by the compiler, and &lt;code&gt;Render&lt;/code&gt; can be called with only a single generic parameter.&lt;/p&gt;

&lt;h3 id=&quot;nested-resources&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#nested-resources&quot;&gt;Nested resources&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;But what about subresources? If we need to call &lt;code&gt;apiresource.Render&lt;/code&gt; inside the &lt;code&gt;Render&lt;/code&gt; implementation of another resource, N+1s boomerang right back.&lt;/p&gt;

&lt;p&gt;This is where the pattern shines. N+1s are avoided by composing load bundles onto &lt;em&gt;other load bundles&lt;/em&gt; so the &lt;code&gt;LoadBundle&lt;/code&gt; implementation of a resource invokes &lt;code&gt;LoadBundle&lt;/code&gt; for its subresources as well, always ensuring that there is never more than one &lt;code&gt;LoadBundle&lt;/code&gt; call per resource type.&lt;/p&gt;

&lt;p&gt;This is best demonstrated by example. Let&amp;rsquo;s augment &lt;code&gt;Product&lt;/code&gt; above so that it renders a list of &lt;code&gt;Widget&lt;/code&gt; subresources. Widgets need to do some data loading of their own, to get the location of the factory they&amp;rsquo;re produced at. &lt;code&gt;Widget&lt;/code&gt;&amp;rsquo;s &lt;code&gt;Renderable&lt;/code&gt; implementation (widget is a leaf resource so there&amp;rsquo;s nothing exotic here):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package apiresourcekind

type Widget struct {
	apiresource.APIResourceBase

	ID              uuid.UUID `json:&amp;quot;id&amp;quot;`
	FactoryID       uuid.UUID `json:&amp;quot;factory_id&amp;quot;`
	FactoryLocation string    `json:&amp;quot;factory_location&amp;quot;`
	Name            string    `json:&amp;quot;name&amp;quot;`
}

//
// Renderable implementation
//

type WidgetLoadBundle struct {
	factories map[uuid.UUID]*dbsqlc.Factory // factory ID -&amp;gt; factory
}

func (_ *Widget) LoadBundle(ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, widgets []*dbsqlc.Widget) (*WidgetLoadBundle, error) {
	var (
		bundle  = &amp;amp;WidgetLoadBundle{}
		queries = dbsqlc.New(e)
	)

	// Load factories for all widgets, map them in bundle by ID.
	{
		factories, err := queries.FactoryGetByIDMany(ctx,
			sliceutil.Map(widgets, func(w *dbsqlc.Widget) uuid.UUID { return w.FactoryID }))
		if err != nil {
			return nil, xerrors.Errorf(&amp;quot;error getting factories: %w&amp;quot;, err)
		}
		bundle.factories = sliceutil.KeyBy(factories, func(f *dbsqlc.Factory) uuid.UUID { return f.ID })
	}

	return bundle, nil
}

func (_ *Widget) Render(ctx context.Context, baseParams *pbaseparam.BaseParams, bundle *WidgetLoadBundle, widget *dbsqlc.Widget) (*Widget, error) {
	return &amp;amp;Widget{
		ID:              widget.ID,
		FactoryID:       widget.FactoryID,
		FactoryLocation: bundle.factories[widget.FactoryID].Location,
		Name:            widget.Name,
	}, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/two-phase-render/product_load_bundle_with_widget.svg&quot; alt=&quot;Product load bundle with internalized widget load bundle.&quot;&gt;&lt;/p&gt;

&lt;p&gt;Back to the &lt;code&gt;Renderable&lt;/code&gt; implementation of product (the parent resource), now modified to include widgets. &lt;code&gt;WidgetLoadBundle&lt;/code&gt; is embedded on &lt;code&gt;ProductLoadBundle&lt;/code&gt; and populated in &lt;code&gt;LoadBundle&lt;/code&gt;. Product&amp;rsquo;s &lt;code&gt;Render&lt;/code&gt; invokes &lt;code&gt;Render&lt;/code&gt; for each of its embedded widgets, passing through the common load bundle:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package apiresourcekind

type Product struct {
	apiresource.APIResourceBase

	ID         uuid.UUID `json:&amp;quot;id&amp;quot;`
	Name       string    `json:&amp;quot;name&amp;quot;`
	OwnerID    uuid.UUID `json:&amp;quot;owner_id&amp;quot;`
	OwnerEmail string    `json:&amp;quot;owner_email&amp;quot;`
	TeamID     uuid.UUID `json:&amp;quot;team_id&amp;quot;`
	TeamName   string    `json:&amp;quot;team_name&amp;quot;`
	Widgets    []*Widget `json:&amp;quot;widgets&amp;quot;`    // NEW!!
}

//
// Renderable implementation
//

type ProductLoadBundle struct {
	accounts     map[uuid.UUID]*dbsqlc.Account  // account ID -&amp;gt; account
	teams        map[uuid.UUID]*dbsqlc.Team     // team ID -&amp;gt; team
	widgetBundle *WidgetLoadBundle              // &amp;lt;-- the product load bundle has a widget load bundle!
	widgets      map[uuid.UUID][]*dbsqlc.Widget // product ID -&amp;gt; widgets
}

func (_ *Product) LoadBundle(ctx context.Context, e db.Executor, baseParams *pbaseparam.BaseParams, products []*dbsqlc.Product) (*ProductLoadBundle, error) {
	var (
		bundle  = &amp;amp;ProductLoadBundle{}
		queries = dbsqlc.New(e)
	)

    ...

	// Load widgets for all products, group them in bundle by product ID, and load widget bundle.
	{
		widgets, err := queries.WidgetGetByProductIDMany(ctx,
			sliceutil.Map(products, func(p *dbsqlc.Product) uuid.UUID { return p.ID }))
		if err != nil {
			return nil, xerrors.Errorf(&amp;quot;error getting widgets: %w&amp;quot;, err)
		}
		bundle.widgets = sliceutil.GroupBy(widgets, func(w *dbsqlc.Widget) uuid.UUID { return w.ProductID })

		bundle.widgetBundle, err = (&amp;amp;Widget{}).LoadBundle(ctx, e, baseParams, widgets)
		if err != nil {
			return nil, err
		}
	}

	return bundle, nil
}

func (_ *Product) Render(ctx context.Context, baseParams *pbaseparam.BaseParams, bundle *ProductLoadBundle, product *dbsqlc.Product) (*Product, error) {
	// Render widget subresources.
	var widgetResources []*Widget
	if widgets, ok := bundle.widgets[product.ID]; ok {
		widgetResources = make([]*Widget, len(widgets))
		for i, widget := range widgets {
			var err error
			widgetResources[i], err = (&amp;amp;Widget{}).Render(ctx, baseParams, bundle.widgetBundle, widget)
			if err != nil {
				return nil, err
			}
		}
	}

	return &amp;amp;Product{
		ID:         product.ID,
		Name:       product.Name,
		OwnerID:    product.OwnerID,
		OwnerEmail: bundle.accounts[product.OwnerID].Email,
		TeamID:     product.TeamID,
		TeamName:   bundle.teams[product.TeamID].Name,
		Widgets:    widgetResources,
	}, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The beauty of this approach is that even if your resources have subresources &lt;em&gt;which have subresources&lt;/em&gt;, it&amp;rsquo;s still okay. All load bundles map 1:1:1, and regardless of how many resources are rendered, the number of database operations stays constant for a given resource hierarchy. Predictable performance is always maintained.&lt;/p&gt;

&lt;h3 id=&quot;beyond-go&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#beyond-go&quot;&gt;Beyond Go&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Go is special because of its overwhelming verbosity and total lack of dynamic features. Even if we hadn&amp;rsquo;t designed a framework to avoid N+1s, we would&amp;rsquo;ve had to build one to help with basic data loading, so with the two-phase load and render approach we kill two birds with one stone.&lt;/p&gt;

&lt;p&gt;With that said, Rails&amp;rsquo; strict loading feature is a bit of an aberration. Many ORMs offer similar dynamic APIs that perform lazy loading, but without safety rails, which practically makes N+1s the default. Common practice is to live with them, and if a particular hot spot becomes a performance problem, to go in and whack-a-mole N+1s one at a time.&lt;/p&gt;

&lt;p&gt;The two-phase approach could be extended to other languages to help make N+1s less common and more easily addressable. The syntax above looks intimidating, but once again that&amp;rsquo;s mostly a Go verbosity problem. In most languages, you could do something similar with half the lines of code.&lt;/p&gt;

&lt;p&gt;The specific code above is meant more for inspiration than anything else, and I&amp;rsquo;m not providing any particular package prescriptions. But it involves only a few plain Go structs, one interface, and two functions, so it&amp;rsquo;s easy to reproduce.&lt;/p&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>The Notifier Pattern for Applications That Use Postgres</title>
    <link href="https://brandur.org/notifier"/>
    <id>tag:brandur.org,2024-05-06:notifier</id>
    <updated>2024-05-06T05:54:07Z</updated>


    <content type="html">&lt;p&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/sql-listen.html&quot;&gt;Listen/notify in Postgres&lt;/a&gt; is an incredible feature that makes itself useful in all kinds of situations. I&amp;rsquo;ve been using it a long time, started taking it for granted long ago, and was somewhat shocked recently looking into MySQL and SQLite to learn that even in 2024, no equivalent exists.&lt;/p&gt;

&lt;p&gt;In a basic sense, listen/notify is such a simple concept that it needs little explanation. Clients subscribe on topics and other clients can send on topics, passing a message to each subscribed client. The idea takes only three seconds to demonstrate using nothing more than a psql shell:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;=# LISTEN test_topic;
LISTEN
Time: 2.828 ms

=# SELECT pg_notify(&#39;test_topic&#39;, &#39;test_message&#39;);
 pg_notify
-----------

(1 row)

Time: 17.892 ms
Asynchronous notification &amp;quot;test_topic&amp;quot; with payload &amp;quot;test_message&amp;quot; received from server process with PID 98481.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But despite listen/notify&amp;rsquo;s relative simplicity, when it comes to applications built on top of Postgres, it&amp;rsquo;s common to use it less than optimally, eating through scarce Postgres connections and with little regard to failure cases.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Here&amp;rsquo;s where the &lt;strong&gt;notifier pattern for Postgres&lt;/strong&gt; comes in. It&amp;rsquo;s an extremely simple idea, but in my experience, one that&amp;rsquo;s rarely seen in practice. Let&amp;rsquo;s start with these axioms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;LISTEN&lt;/code&gt;s are affixed to specific connections. After listening, the original connection must still be available somewhere to successfully receive messages.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;There may be many components within an application that&amp;rsquo;d like to listen on topics for completely orthogonal uses.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Despite optimizations over the years, connections in Postgres are still somewhat of a precious, limited resource, and should be conserved. We&amp;rsquo;d like to minimize the number of them required for listen/notify use.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;A single connection can listen on any number of topics.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With those stated, we can explain the role of the notifier. Its job is to &lt;strong&gt;hold a single Postgres connection per process, allow other components in the same program to use it to subscribe to any number of topics, wait for notifications, and distribute them to listening components as they&amp;rsquo;re received&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &amp;ldquo;single Postgres connection per process&amp;rdquo; piece is key. Use of a notifier keeps the number of Postgres connections dedicated to use with listen/notify down to &lt;strong&gt;one per program&lt;/strong&gt;, a major advantage compared to the naive version, which is &lt;em&gt;one connection per topic per program&lt;/em&gt;. Especially for languages like Go that make in-process concurrency easy and cheap, the notifier reduces listen/notify connection overhead to practically nil.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/notifier/notifier.svg&quot; alt=&quot;Notifier distributing notifications to program components&quot;&gt;&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#implementation&quot;&gt;A few implementation details&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;From a conceptual standpoint, the notifier&amp;rsquo;s not difficult to understand, and with only this high level description, most readers would be able to implement it themselves. I&amp;rsquo;m not going to go through an implementation in full detail, but let&amp;rsquo;s look at a few important aspects of one. (For a complete reference, you can take a look &lt;a href=&quot;https://github.com/riverqueue/river/tree/master/internal/notifier&quot;&gt;at River&amp;rsquo;s notifier&lt;/a&gt;, which is quite well vetted.)&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s a listen function to establish a new subscription:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Listen returns a subscription that lets a caller receive values from a
// notification channel.
func (l *Notifier) Listen(channel string) *Subscription {
    l.mu.Lock()
    defer l.mu.Unlock()

    existingSubs := l.subscriptions[channel]

    sub := &amp;amp;Subscription{
        channel:        channel,
        listenChan:     make(chan string, 100),
        notifyListener: l,
    }
    l.subscriptions[channel] = append(existingSubs, sub)

    if len(existingSubs) &amp;gt; 0 {
        // If there&#39;s already another subscription for this channel, reuse its
        // established channel. It may already be closed (to indicate that the
        // connection is established), but that&#39;s okay.
        sub.establishedChan = existingSubs[0].establishedChan
        sub.establishedChanClose = func() {} // no op since not channel owner

        return sub
    }

    // The notifier will close this channel after it&#39;s successfully established
    // `LISTEN` for the given channel. Gives subscribers a way to confirm a
    // listen before moving on, which is especially useful in tests.
    sub.establishedChan = make(chan struct{})
    sub.establishedChanClose = sync.OnceFunc(func() { close(sub.establishedChan) })

    l.channelChanges = append(l.channelChanges,
        channelChange{channel, sub.establishedChanClose, channelChangeOperationListen})

    // Cancel out of blocking on WaitForNotification so changes can be processed
    // immediately.
    l.waitForNotificationCancel()

    return sub
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A few key details to notice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Subscriptions use a &lt;strong&gt;buffered channel&lt;/strong&gt; like &lt;code&gt;make(chan string, 100)&lt;/code&gt; and &lt;strong&gt;non-blocking sends&lt;/strong&gt; (using &lt;code&gt;select&lt;/code&gt; with &lt;code&gt;default&lt;/code&gt;). A notifier may receive a high volume of notifications, and if it were to block on every component successfully receiving and processing each one, it could easily fall behind. Instead, a received notification is sent into the channel using a non-blocking send. The non-blocking send means that the send operation will never block: instead the notification is discarded if the channel is full. The buffer provides a tunable amount of slack to make sure this won&amp;rsquo;t happen too easily. It&amp;rsquo;s each component&amp;rsquo;s job to make sure it&amp;rsquo;s processing its inbox in a timely manner. This is important because even in the event of one component falling behind, the system as a whole stays healthy.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Multiple components may want to subscribe to the same topic. Since only one connection is in use, the notifier only needs to issue one &lt;code&gt;LISTEN&lt;/code&gt; per topic. Internally, it organizes subscriptions by topic, and if it notices that a topic already exists, a new subscription is added without issuing &lt;code&gt;LISTEN&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Subscriptions provide an &lt;strong&gt;established channel&lt;/strong&gt; that&amp;rsquo;s closed when a &lt;code&gt;LISTEN&lt;/code&gt; has been successfully issued and the notifier is up and listening. This isn&amp;rsquo;t strictly necessary for most production uses, but it&amp;rsquo;s invaluable for use in testing. If a test case issues &lt;code&gt;pg_notify&lt;/code&gt; before the notifier has started listening, that notification is lost &amp;ndash; a problem that can lead to tortuous test intermittency &lt;sup id=&quot;footnote-1-source&quot;&gt;&lt;a href=&quot;#footnote-1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. Instead, a test case tells the notifier to listen, &lt;em&gt;waits for the listen to succeed&lt;/em&gt;, then moves on to send &lt;code&gt;pg_notify&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// EstablishedC is a channel that&#39;s closed after the notifier&#39;s successfully
// established a connection. This is especially useful in test cases, where it
// can be used to wait for confirmation not only that the listener is
// started, but that it&#39;s successfully started listening on a
// channel before continuing. For a new subscription on an already established
// channel, EstablishedC is already closed, so it&#39;s always safe to wait on it.
//
// There&#39;s no full guarantee that the notifier can ever successfully establish a
// listen, so callers will usually want to `select` on it combined with a
// context done, a stop channel, and/or a timeout.
//
// The channel is always closed as a notifier is stopping.
func (s *Subscription) EstablishedC() &amp;lt;-chan struct{} { return s.establishedChan }
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;interruptible-receives&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#interruptible-receives&quot;&gt;Interruptible receives&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;There&amp;rsquo;s no standard SQL for waiting for a notification. Typically, it&amp;rsquo;s accomplished using a special driver-level function like &lt;a href=&quot;https://pkg.go.dev/github.com/jackc/pgx/v5#Conn.WaitForNotification&quot;&gt;Pgx&amp;rsquo;s &lt;code&gt;WaitForNotification&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These commonly block until receiving a notification, which can be a problem since we&amp;rsquo;re only using a single connection. What if the notifier is in a blocking receive loop, but another component wants to add a new subscription that requires &lt;code&gt;LISTEN&lt;/code&gt; be issued?&lt;/p&gt;

&lt;p&gt;You&amp;rsquo;ll want to handle this case by making sure that the wait loop is interruptible. Here&amp;rsquo;s one way to accomplish that in Go:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func (l *Notifier) runOnce(ctx context.Context) error {
    if err := l.processChannelChanges(ctx); err != nil {
        return err
    }

    // WaitForNotification is a blocking function, but since we want to wake
    // occasionally to process new `LISTEN`/`UNLISTEN` operations, we put a
    // context deadline on the listen, and as it expires don&#39;t treat it as an
    // error unless it&#39;s unrelated to context expiration.
    notification, err := func() (*pgconn.Notification, error) {
        const listenTimeout = 30 * time.Second

        ctx, cancel := context.WithTimeout(ctx, listenTimeout)
        defer cancel()

        // Provides a way for the blocking wait to be cancelled in case a new
        // subscription change comes in.
        l.mu.Lock()
        l.waitForNotificationCancel = cancel
        l.mu.Unlock()

        notification, err := l.conn.WaitForNotification(ctx)
        if err != nil {
            return nil, xerrors.Errorf(&amp;quot;error waiting for notification: %w&amp;quot;, err)
        }

        return notification, nil
    }()
    if err != nil {
        // If the error was a cancellation or the deadline being exceeded but
        // there&#39;s no error in the parent context, return no error.
        if (errors.Is(err, context.Canceled) ||
            errors.Is(err, context.DeadlineExceeded)) &amp;amp;&amp;amp; ctx.Err() == nil {
            return nil
        }

        return err
    }

    l.mu.RLock()
    defer l.mu.RUnlock()

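    // Notify subscribers (this is a no-op if no subs/empty slice). Sends are
    // non-blocking: if a subscriber&#39;s buffer is full, the notification is
    // discarded rather than stalling the notifier.
    for _, sub := range l.subscriptions[notification.Channel] {
        select {
        case sub.listenChan &lt;- notification.Payload:
        default:
        }
    }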

    return nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The inner closure calls into &lt;code&gt;WaitForNotification&lt;/code&gt;, but has a default context timeout of 30 seconds that automatically cycles the function periodically. It also stores the special context cancellation function &lt;code&gt;l.waitForNotificationCancel&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;Listen&lt;/code&gt; is invoked and a new subscription needs to be added, &lt;code&gt;l.waitForNotificationCancel&lt;/code&gt; is called. The wait is cancelled immediately, new subscriptions are processed, and the closure is reentered to wait anew.&lt;/p&gt;

&lt;h3 id=&quot;let-it-crash&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#let-it-crash&quot;&gt;Let it crash&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Given there&amp;rsquo;s now a single master connection that&amp;rsquo;s handling all notifications for a program, it&amp;rsquo;s fairly critical that its health be monitored, and the notifier reacts appropriately. If not, all uses of listen/notify would degrade simultaneously.&lt;/p&gt;

&lt;p&gt;The obvious way to react would be to close the connection, use a connection pool to procure a new connection, reissue &lt;code&gt;LISTEN&lt;/code&gt;s for each active subscription, then reenter the wait loop.&lt;/p&gt;

&lt;p&gt;It can be a little tricky sometimes to guarantee that state is reset cleanly, so another possibility is to adhere to the &amp;ldquo;let it crash&amp;rdquo; school of thought. If the connection becomes irreconcilably unhealthy, stop the program, and have it come back to a healthy state by virtue of its normal start up.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// If the notifier gets unhealthy, restart the worker. This will generally
// never happen as the notifier has a built-in retry loop that tries its best
// to stay established before giving up.
notifier.AddUnhealthyCallback(closeShutdown)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We&amp;rsquo;ve found this sort of edge case to be so rare (I&amp;rsquo;ve only seen it happen once in a year+ of use) that letting the program crash when it does happen hasn&amp;rsquo;t produced any undue disruption.&lt;/p&gt;

&lt;h2 id=&quot;pgbouncer&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#pgbouncer&quot;&gt;PgBouncer&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Using &lt;a href=&quot;https://www.pgbouncer.org/features.html&quot;&gt;PgBouncer&lt;/a&gt;, &lt;code&gt;LISTEN&lt;/code&gt; is only supported using session pooling (as opposed to transaction pooling) because notifications are only sent to the original session that issued a &lt;code&gt;LISTEN&lt;/code&gt; for them.&lt;/p&gt;

&lt;p&gt;Use of a notifier requires an app to dedicate a single connection per program for listen/notify, but every other part of the application is free to use PgBouncer in transaction pooling or statement pooling mode, thereby maximizing the efficiency of connection use.&lt;/p&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>Web APIs: Enriched DX By Disallowing Unknown Fields</title>
    <link href="https://brandur.org/disallow-unknown-fields"/>
    <id>tag:brandur.org,2024-05-05:disallow-unknown-fields</id>
    <updated>2024-05-05T06:21:27Z</updated>


    <content type="html">&lt;p&gt;Go&amp;rsquo;s JSON library provides the &lt;a href=&quot;https://pkg.go.dev/encoding/json#Decoder.DisallowUnknownFields&quot;&gt;decoder option &lt;code&gt;DisallowUnknownFields&lt;/code&gt;&lt;/a&gt; which, even if not intuitively obvious, is a handy option for adding a layer of improved DX to web APIs. As the name would suggest, it causes a decoder to error when encountering a property in a JSON object being decoded that&amp;rsquo;s not present in the struct being decoded to.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type Request struct {
    Message string `json:&amp;quot;message&amp;quot;`
}

data := `{&amp;quot;message&amp;quot;:&amp;quot;Hello.&amp;quot;,&amp;quot;unknown&amp;quot;:&amp;quot;Not a field on the struct.&amp;quot;}`

decoder := json.NewDecoder(bytes.NewReader([]byte(data)))
decoder.DisallowUnknownFields()

var req Request
if err := decoder.Decode(&amp;amp;req); err != nil {
    log.Fatal(err) // json: unknown field &amp;quot;unknown&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;hr /&gt;

&lt;p&gt;When a user is integrating a web API, especially in the beginning, it&amp;rsquo;s common for the initial prototype to be written by a human, and humans are prone to making mistakes. Say you&amp;rsquo;re trying to programmatically procure an access token against &lt;code&gt;POST /access-tokens&lt;/code&gt;. The endpoint takes an optional parameter called &lt;code&gt;expires_in&lt;/code&gt; which is a number of seconds after which the new access token will expire automatically. By virtue of reading the documentation slightly wrong, you&amp;rsquo;re accidentally sending &lt;code&gt;expires: 3600&lt;/code&gt; instead of &lt;code&gt;expires_in: 3600&lt;/code&gt;. The result is that your requested expiry time is silently ignored, not only producing the wrong result, but possibly even a security leak as your account accidentally amasses access tokens that never expire.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;DisallowUnknownFields&lt;/code&gt; fixes this class of mistake for all of an API&amp;rsquo;s users. Some code extracted from our API:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;decoder := json.NewDecoder(bytes.NewReader(rawPayload))

// Balks if finding fields in the request payload that don&#39;t map to anything
// in the target request structure. Acts as a small DX aid for users who may
// have accidentally misnamed a field.
//
// Specific API endpoints can invert this behavior through an option while
// defining the endpoint.
if !allowUnknownJSONFields {
    decoder.DisallowUnknownFields()
}

if err := decoder.Decode(v); err != nil {
    apierror.NewBadRequestError(
        r.Context(),
        fmt.Sprintf(&amp;quot;Invalid JSON in request body: %s.&amp;quot;, err),
    ).Write(r.Context(), w)
    return nil, false
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now, sending &lt;code&gt;expires&lt;/code&gt; instead of &lt;code&gt;expires_in&lt;/code&gt; is an error that tells the user exactly what&amp;rsquo;s wrong:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ curl -i -H &amp;quot;Authorization: Bearer $CRUNCHY_API_KEY&amp;quot; \
    -H &amp;quot;Content-Type: application/json&amp;quot; \
    -X POST $CRUNCHY_API_URL/access-tokens -d &#39;{&amp;quot;expires&amp;quot;:3600}&#39;

HTTP/2 400
{
    &amp;quot;message&amp;quot;:&amp;quot;Invalid JSON in request body: json: unknown field \&amp;quot;expires\&amp;quot;.&amp;quot;,
    &amp;quot;request_id&amp;quot;:&amp;quot;5d2078fe-6ea5-4f41-816e-4717cf6c22b7&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&amp;rsquo;s a feature that&amp;rsquo;s not needed every day, but it&amp;rsquo;s easy to implement, and the day it is, it&amp;rsquo;ll save hours&amp;rsquo; worth of time and frustration.&lt;/p&gt;

&lt;h2 id=&quot;caveats-and-edges&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#caveats-and-edges&quot;&gt;Caveats and edges&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;There are a few sharp edges to be aware of. They&amp;rsquo;re easy to avoid once you know about them, but aren&amp;rsquo;t totally apparent for those integrating the pattern for the first time.&lt;/p&gt;

&lt;h3 id=&quot;safely-on&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#safely-on&quot;&gt;Turning it on safely&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;If you have an existing API with existing users, &lt;code&gt;DisallowUnknownFields&lt;/code&gt; isn&amp;rsquo;t universally safe to turn on because there may be integrations out there that have been sending unknown JSON fields for years without it ever being a problem. Those previously happy users become unhappy when disallowing unknown fields suddenly breaks all their requests.&lt;/p&gt;

&lt;p&gt;You can still turn it on, but doing so takes a few more steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Start by organizing the API by pre and post &lt;code&gt;DisallowUnknownFields&lt;/code&gt;. New API endpoints get the check automatically while existing ones default to it off.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Add logging probes to existing endpoints that fire when they encounter an unknown parameter. Search your logs for these later to see what unknown parameters are present, if any, and how many.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;if err := decoder.Decode(v); err != nil {
    if strings.Contains(err.Error(), &amp;quot;unknown field&amp;quot;) {
        plog.Logger(ctx).WithFields(logrus.Fields{
            &amp;quot;api_endpoint_method&amp;quot;: r.Method,
            &amp;quot;api_endpoint_path&amp;quot;:   r.URL.Path,
        }).Warnf(&amp;quot;Unknown field error: %s.&amp;quot;, err)

        decoderAllowingUnknown := json.NewDecoder(bytes.NewReader(rawPayload))
        err = decoderAllowingUnknown.Decode(v)
    }

    if err != nil {
        apierror.NewBadRequestError(
            r.Context(),
            fmt.Sprintf(&amp;quot;Invalid JSON in request body: %s.&amp;quot;, err),
        ).Write(r.Context(), w)
        return nil, false
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;Reaching out to individual users and asking them to correct bad parameters is possible, but probably more trouble than it&amp;rsquo;s worth. A cheaper solution is to grandfather in existing errors by adding hidden fields to JSON structs that&amp;rsquo;ll let &lt;code&gt;DisallowUnknownFields&lt;/code&gt; be enabled for the endpoint, but keep existing integrations compatible.&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Request parameters for creating a new access token.
type AccessTokenCreateRequest struct {
    ...

    // When activating strict JSON parameter validation we found that Customer X
    // was accidentally sending `expires` instead of `expires_in`. We&#39;ve asked
    // them to stop, but in the meantime we allow this parameter so we don&#39;t
    // break them.
    Expires int `json:&amp;quot;expires&amp;quot; openapi:&amp;quot;hide&amp;quot; validate:&amp;quot;-&amp;quot;`
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There&amp;rsquo;s a point where doing this for too many unknown fields becomes impractical, but for all but the largest APIs, unknown fields will be an edge that, with a little luck, isn&amp;rsquo;t that common.&lt;/p&gt;

&lt;h3 id=&quot;deprecating-fields&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#deprecating-fields&quot;&gt;Deprecating fields carefully&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;When removing an old field from the API it might be tempting to strip it out of request structs completely. It just makes sense, right? If it&amp;rsquo;s ignored anyway and not used anywhere, then why should it be in there?&lt;/p&gt;

&lt;p&gt;&lt;code&gt;DisallowUnknownFields&lt;/code&gt; will require more care in deprecating fields. Even if the parameter hasn&amp;rsquo;t been doing anything useful in years, it may still be sent by users, and if it&amp;rsquo;s removed, those existing integrations break.&lt;/p&gt;

&lt;p&gt;The workaround is to keep deprecated parameters past their expiration date, but mark them as such in a way that bubbles up to public documentation and generated bindings, making it clear that they&amp;rsquo;re not useful and should no longer be used.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Request parameters for creating a new access token.
type AccessTokenCreateRequest struct {
    ...

    // Client ID is the unique identifier of the API key that the new access
    // token should be associated with.
    //
    // Deprecated: This field used to be required, but an associated access
    // token is now inferred automatically using the secret included as part of
    // the `Authorization` header. This parameter is now ignored.
    ClientID *eid.EID `json:&amp;quot;client_id&amp;quot; validate:&amp;quot;-&amp;quot;`
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Once again, logging probes come in handy here. Add a unique string like &lt;code&gt;access_token_client_id_received&lt;/code&gt; that&amp;rsquo;s easily searchable in logs, and some time later once it hasn&amp;rsquo;t been seen in a long time, do a clean up pass and strip the old parameter out.&lt;/p&gt;
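
&lt;p&gt;A minimal sketch of what such a probe might look like. The trimmed-down struct, the &lt;code&gt;deprecatedMarkers&lt;/code&gt; helper, and the use of the standard &lt;code&gt;log&lt;/code&gt; package are assumptions for illustration, and &lt;code&gt;ClientID&lt;/code&gt; is a plain &lt;code&gt;*string&lt;/code&gt; here rather than an EID:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"log"
	"strings"
)

// AccessTokenCreateRequest is a trimmed-down, hypothetical version of the
// request struct above.
type AccessTokenCreateRequest struct {
	// Deprecated: ignored, but still accepted so old integrations keep working.
	ClientID *string `json:"client_id"`

	ExpiresIn int `json:"expires_in"`
}

// deprecatedMarkers returns one searchable marker string for each deprecated
// parameter that was present in the request.
func deprecatedMarkers(req *AccessTokenCreateRequest) []string {
	var markers []string
	if req.ClientID != nil {
		markers = append(markers, "access_token_client_id_received")
	}
	return markers
}

func main() {
	payload := `{"client_id":"abc","expires_in":3600}`

	decoder := json.NewDecoder(strings.NewReader(payload))
	decoder.DisallowUnknownFields()

	req := new(AccessTokenCreateRequest)
	if err := decoder.Decode(req); err != nil {
		log.Fatal(err)
	}

	// Each marker becomes a log line that can be searched for later.
	for _, marker := range deprecatedMarkers(req) {
		log.Printf("deprecated parameter probe: %s", marker)
	}
}
```

&lt;p&gt;Using a pointer type for the deprecated field is what makes the probe possible: a &lt;code&gt;nil&lt;/code&gt; pointer means the parameter was absent (or an explicit JSON &lt;code&gt;null&lt;/code&gt;), while a non-nil pointer means the client sent a value.&lt;/p&gt;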

&lt;h3 id=&quot;escape-hatch&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#escape-hatch&quot;&gt;Prepare an escape hatch&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Use of &lt;code&gt;DisallowUnknownFields&lt;/code&gt; is suitable for most API endpoints, but an escape hatch &lt;em&gt;will&lt;/em&gt; be required, so prepare for it.&lt;/p&gt;

&lt;p&gt;A common place where &lt;code&gt;DisallowUnknownFields&lt;/code&gt; should not be applied is webhook receive endpoints. Although in a fashion they&amp;rsquo;re technically part of your API&amp;rsquo;s surface area, they&amp;rsquo;re really more like the &lt;em&gt;push&lt;/em&gt; API of another vendor, and because adding a new field to an API is widely considered to not be a breaking change, that vendor may add new parameters to their webhook pushes anytime.&lt;/p&gt;

&lt;p&gt;The problem can be especially insidious because the webhook APIs of many large vendors are quite stable, so your receiver will be working fine with &lt;code&gt;DisallowUnknownFields&lt;/code&gt; for many months or years, before suddenly every request starts failing overnight as a new parameter is added.&lt;/p&gt;

&lt;p&gt;Our in-house API endpoint framework takes the option &lt;code&gt;AllowUnknownJSONFields&lt;/code&gt; to indicate that JSON requests should not ban unknown fields:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// Webhook endpoint where Stripe broadcasts asynchronous messages about customer
// payment information.
type StripeWebhookEndpoint struct{}

func (e *StripeWebhookEndpoint) Materialize() apiendpoint.APIEndpointer {
    return &amp;amp;apiendpoint.APIEndpoint[StripeWebhookRequest, StripeWebhookResponse]{
        Extras: apiendpoint.APIEndpointExtras{
            AllowUnknownJSONFields: true, // &amp;lt;-- unknown fields allowed
        },
        Method: http.MethodPost,
        Route:  &amp;quot;/webhook&amp;quot;,
        ServiceHandler: func(svc any) func(ctx context.Context, req *StripeWebhookRequest) (*StripeWebhookResponse, error) {
            return svc.(StripeService).Webhook
        },
        SuccessStatusCode: http.StatusOK,
        Title:             &amp;quot;Stripe webhook receiver&amp;quot;,
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;outside-go&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#outside-go&quot;&gt;Use outside Go&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;DisallowUnknownFields&lt;/code&gt; is obviously an option specific to Go, but this pattern is widely reusable in other languages, and easy to implement yourself if it&amp;rsquo;s not built into the ecosystem&amp;rsquo;s dominant JSON package.&lt;/p&gt;
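
&lt;p&gt;As a rough sketch of the do-it-yourself version, the check boils down to parsing the payload into a generic map and comparing its keys against an allow list. It&amp;rsquo;s written in Go here for consistency with the rest of the article, but relies on nothing Go-specific. The &lt;code&gt;unknownFields&lt;/code&gt; helper and the allow list are hypothetical, and this version only inspects top-level keys and matches names case-sensitively:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// unknownFields returns the top-level keys of a JSON object that are not in
// the allowed set, sorted for stable output.
func unknownFields(data []byte, allowed map[string]bool) ([]string, error) {
	fields := new(map[string]json.RawMessage)
	if err := json.Unmarshal(data, fields); err != nil {
		return nil, err
	}

	var unknown []string
	for name := range *fields {
		if !allowed[name] {
			unknown = append(unknown, name)
		}
	}
	sort.Strings(unknown)
	return unknown, nil
}

func main() {
	allowed := map[string]bool{"expires_in": true}

	unknown, err := unknownFields([]byte(`{"expires":3600}`), allowed)
	if err != nil {
		panic(err)
	}
	for _, name := range unknown {
		fmt.Printf("unknown field %q\n", name)
	}
}
```

&lt;p&gt;Most languages can express the same thing with their generic JSON parser and a set difference, run before (or instead of) binding the payload to a typed request object.&lt;/p&gt;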

&lt;h2 id=&quot;levenshtein&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#levenshtein&quot;&gt;Augmentation with Levenshtein distance&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;An obvious next augmentation is not only to indicate that a parameter name doesn&amp;rsquo;t exist, but to use the &lt;a href=&quot;https://en.wikipedia.org/wiki/Levenshtein_distance&quot;&gt;Levenshtein distance&lt;/a&gt; to known parameter names to suggest one. So a user who sends &lt;code&gt;expires&lt;/code&gt; is told that they probably meant &lt;code&gt;expires_in&lt;/code&gt;, giving them a path to resolution that takes seconds instead of minutes.&lt;/p&gt;
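
&lt;p&gt;A minimal sketch of how such a suggestion could be computed, assuming Go 1.21+ for the built-in &lt;code&gt;min&lt;/code&gt; and the &lt;code&gt;cmp&lt;/code&gt; package. The &lt;code&gt;levenshtein&lt;/code&gt; and &lt;code&gt;suggestField&lt;/code&gt; functions, the list of known fields, and the cutoff of three edits are all assumptions for illustration rather than anything from our API:&lt;/p&gt;

```go
package main

import (
	"cmp"
	"fmt"
)

// levenshtein returns the edit distance between a and b using the classic
// two-row dynamic programming approach.
func levenshtein(a, b string) int {
	ar, br := []rune(a), []rune(b)

	// prev[j] is the distance between the empty string and the first j
	// runes of b.
	prev := make([]int, len(br)+1)
	for j := range prev {
		prev[j] = j
	}

	for i := range ar {
		curr := make([]int, len(br)+1)
		curr[0] = i + 1
		for j := range br {
			cost := 1
			if ar[i] == br[j] {
				cost = 0
			}
			// Cheapest of deletion, insertion, and substitution.
			curr[j+1] = min(prev[j+1]+1, curr[j]+1, prev[j]+cost)
		}
		prev = curr
	}
	return prev[len(br)]
}

// suggestField returns the known field name closest to unknown, or an empty
// string if nothing is within maxDistance edits.
func suggestField(unknown string, known []string, maxDistance int) string {
	best, bestDist := "", maxDistance+1
	for _, name := range known {
		if d := levenshtein(unknown, name); cmp.Less(d, bestDist) {
			best, bestDist = name, d
		}
	}
	return best
}

func main() {
	known := []string{"description", "expires_in", "name"}
	if s := suggestField("expires", known, 3); s != "" {
		fmt.Printf("unknown field %q. Did you mean %q?\n", "expires", s)
	}
}
```

&lt;p&gt;The cutoff matters: without one, every unknown field gets a suggestion no matter how unrelated, which is more confusing than no suggestion at all.&lt;/p&gt;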

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;Invalid JSON in request body: unknown field &amp;quot;expires&amp;quot;. Did you mean &amp;quot;expires_in&amp;quot;?
&lt;/code&gt;&lt;/pre&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>MIT Flea Market Survival Guide</title>
    <link href="https://www.scd31.com/posts/mit-survival"/>
    <id>https://www.scd31.com/posts/mit-survival</id>
    <updated>2024-04-25T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>The 29 Days per Year Bug (30 Days for Leap Years!)</title>
    <link href="https://tomeraberba.ch/the-29-days-per-year-bug"/>
    <id>https://tomeraberba.ch/the-29-days-per-year-bug</id>
    <updated>2024-04-23T00:00:00Z</updated>


    <content type="html">A few years ago my coworker was working on a Google Docs feature when they came across a bewildering bug. The feature 📅 For one component of this feature, users were able to pick any date, save it,…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Validating Markdoc attribute values with Zod</title>
    <link href="/blog/2024/04/validating-markdoc-attribute-values-with-zod/"/>
    <id>/blog/2024/04/validating-markdoc-attribute-values-with-zod/</id>
    <updated>2024-04-19T00:00:00Z</updated>


    <content type="html">In a React application written in TypeScript, the language&amp;rsquo;s type system is the first line of defense against type errors in component props. In Markdoc, where content is decoupled from rendering logic and can&amp;rsquo;t take advantage of TypeScript types, the schema-based validation system plays an important role in bridging the gap between declarative tag definitions and typed React components.
The Markdoc validator ensures that tag attribute values in a document match the types declared in the corresponding schema.</content>

    <author>
      <name>Ryan (phault)</name>

      <uri>https://seg.phault.net</uri>

    </author>
  </entry>

  <entry>
    <title>Accessing parent nodes in a Markdoc validation function</title>
    <link href="/blog/2024/04/accessing-parent-nodes-in-a-markdoc-validation-function/"/>
    <id>/blog/2024/04/accessing-parent-nodes-in-a-markdoc-validation-function/</id>
    <updated>2024-04-18T00:00:00Z</updated>


    <content type="html">A recent Markdoc release introduced a new feature that makes it possible for validation functions to access the parents of the current node. This feature is useful for writing custom validation logic that analyzes document structure or enforces rules based on a node&amp;rsquo;s position in the hierarchy.
For example, you might want to impose restrictions on where an image can be nested in a document, prohibiting the use of images in specific tags like callouts and asides.</content>

    <author>
      <name>Ryan (phault)</name>

      <uri>https://seg.phault.net</uri>

    </author>
  </entry>

  <entry>
    <title>My Total Solar Eclipse Balloon Launch</title>
    <link href="https://www.scd31.com/posts/eclipse-balloon"/>
    <id>https://www.scd31.com/posts/eclipse-balloon</id>
    <updated>2024-04-09T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>ASMR and vulnerability</title>
    <link href="https://youngjin.io/2024/asmr-and-vulnerability/"/>
    <id>https://youngjin.io/2024/asmr-and-vulnerability/</id>
    <updated>2024-03-15T19:12:47Z</updated>


    <content type="html">I&amp;rsquo;ve been really into ASMR videos recently, which if you&amp;rsquo;ve been living under a rock are videos that contain sounds that promote the tingly sensation in the back of your head. Not everyone experiences this and it&amp;rsquo;s still a little unknown (from my brief review of literature). This is my pseudo-scientific attempt to explain it.
First, if you&amp;rsquo;ve never experienced it, I&amp;rsquo;d say that the feeling is oddly similar to the tingly feeling you get if you point a sharp object (or even someone else&amp;rsquo;s finger) between your eyes.</content>

    <author>
      <name>youngjin</name>

      <uri>https://youngjin.io</uri>

    </author>
  </entry>

  <entry>
    <title>The end of Airplane.dev</title>
    <link href="https://yolken.net/blog/end-of-airplanedev"/>
    <id>https://yolken.net/blog/end-of-airplanedev</id>
    <updated>2024-03-03T20:42:00Z</updated>


    <content type="html">&lt;p&gt;I worked at &lt;a href=&quot;https://airplane.dev&quot;&gt;Airplane&lt;/a&gt;, an internal tooling startup, for nearly two years.
Earlier this year, &lt;a href=&quot;https://news.ycombinator.com/item?id=38861271&quot;&gt;it was announced&lt;/a&gt; that the company was being acquired by
&lt;a href=&quot;https://airtable.com&quot;&gt;Airtable&lt;/a&gt; and that the product would be shut down. In this blog post, I want to
explain what happened from my perspective as a former employee.&lt;/p&gt;

&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;

&lt;p&gt;I joined Airplane in March of 2022 because I was excited by the internal
tooling space and, after some time at Twilio, wanted to work at a smaller
company.&lt;/p&gt;

&lt;p&gt;For the first 16 months or so, it felt like the company was doing well. We
were onboarding lots of new customers, we expanded the team from 12 people to
more than 20, and there was a general sense of excitement about our prospects
both internally and in conversations with users.&lt;/p&gt;

&lt;p&gt;Personally, I was also pretty happy. Although I had had some negative
interactions with the CTO, I really liked all of my other coworkers, the
mission of the company, and the technologies that I got to use day-to-day. I
felt like I was &lt;a href=&quot;https://www.airplane.dev/blog/category/hash-benjamin&quot;&gt;learning a lot as an engineer&lt;/a&gt;,
and I was hopeful that the company would continue to expand and provide
growth opportunities for me in the future.&lt;/p&gt;

&lt;h2 id=&quot;initial-headwinds&quot;&gt;Initial headwinds&lt;/h2&gt;

&lt;p&gt;In the summer of 2023, the company hit some headwinds. First, although we were
still adding customers, our revenue growth rate had noticeably slowed down.
From what I understand, this was a trend across the entire SaaS tooling space
that wasn’t unique to Airplane, but it still hurt morale and made us less
optimistic about the company’s near-term future.&lt;/p&gt;

&lt;p&gt;In addition, people on the team started quitting. Up until that summer, we had
had zero employee attrition. Then, within a three month period, we lost 4 engineers
and the company’s head of growth. These colleagues had all been at the company
for a while and all had legitimate reasons for resigning unrelated to the revenue
slowdown (several, for instance, wanted to start their own companies), but the
latter didn’t soften the blow. They were all well-respected and great at their
jobs, so we were sad to see them go.&lt;/p&gt;

&lt;p&gt;To top it all off, the CEO (who was also a co-founder) announced that he was
leaving so that he could work on a personal, AI-related passion project.
The CTO (the other co-founder) would be taking his place. His departure wasn’t a
huge surprise; he had been very clearly burnt out and disengaged for months. But,
it’s still never a good sign to lose the leader of your organization.&lt;/p&gt;

&lt;p&gt;My colleagues and I were obviously upset by these developments, but we figured
that the new CEO knew what he was doing and would successfully navigate us through.&lt;/p&gt;

&lt;h2 id=&quot;stabilization&quot;&gt;Stabilization&lt;/h2&gt;

&lt;p&gt;By the fall of 2023, things had stabilized a bit. The employee attrition wave
died down, and we added several new engineers and a new growth lead to the team.
We also had some substantial product launches and signed two big, new enterprise
customer deals. Our Q3 revenue numbers were strong, and we had new customers
excited to sign with us in Q4.&lt;/p&gt;

&lt;p&gt;In early November, we did a company retreat in Napa Valley to hang out together
in-person and hack on various new product features. The CEO was a little less
engaged than usual, but otherwise morale was high and there were no signs that
anything was wrong.&lt;/p&gt;

&lt;h2 id=&quot;brewing-trouble&quot;&gt;Brewing trouble&lt;/h2&gt;

&lt;p&gt;After we got back from the retreat, we got an abrupt message from the CEO in our
team’s Slack workspace. Effective immediately, we were winding down our hiring
pipeline. We were also rescinding two engineering offers that we had just extended.&lt;/p&gt;

&lt;p&gt;My colleagues and I knew that this was a bad sign. Rescinding offers is something
that causes a lot of reputational harm and isn’t done casually. Companies generally
only do this if they’re about to do layoffs or go through some sort of big
transformation like an acquisition. The former didn’t make any sense, though,
because we had lots of money in the bank (enough to last for many years at our
current burn rate), had already slimmed the team down via voluntary attrition, and
had an overflowing backlog of new features to work on. That left an acquisition as
the most likely explanation.&lt;/p&gt;

&lt;p&gt;Several people on the team confronted the CEO, but he denied that there was an
acquisition brewing. Instead, he made vague statements about “figuring out the
direction of the company” and potentially doing some sort of product pivot to
reignite growth. He claimed that any update would be good news and that he’d make
sure we were taken care of.&lt;/p&gt;

&lt;h2 id=&quot;the-announcement&quot;&gt;The announcement&lt;/h2&gt;

&lt;p&gt;On the morning of December 4, around 3 weeks after the new hire offers were
rescinded, the CEO scheduled an all-hands for 1PM that day. He also sent out a Slack
message that this was a very important meeting and that we should cancel other
things if needed to be there. We knew that we would finally be getting answers
about what was going on.&lt;/p&gt;

&lt;p&gt;The CEO, who was presenting from his home in SF, started with his usual statement
of Airplane’s mission, “We’re here because we believe that software is a force
multiplier…”, but ended with the clause that we’d be continuing that mission as part
of Airtable.&lt;/p&gt;

&lt;p&gt;He then explained that the Airplane product would be shut down, but that most of us
would be getting “extremely strong” offers from Airtable. However, we’d have to do
interviews for leveling purposes, and the financial details wouldn’t be made
available to us until later. This was a great outcome for us, he said, much better
than the alternatives he had considered, so we should be happy. The goal was to wrap
everything up before the holidays.&lt;/p&gt;

&lt;p&gt;After the Zoom was turned off, our room in the NYC office was silent. While we were
optimistic at the beginning when we heard about Airtable, the mood had clearly
soured as we got more information. Basically, our product was being shut down, we’d
have to interview for our new jobs, it was unclear what we’d be doing at the new
company, and our stock was potentially worthless. It was also unclear why, exactly,
we were doing this given that Airplane had many happy customers and tens of millions
of dollars in the bank.&lt;/p&gt;

&lt;p&gt;For the remainder of the day, we met 1:1 with the CEO to talk more about our
personal situations. For me and the other engineers, the CEO told us that he enjoyed
working with us and hoped that we’d continue at Airtable together. Our common stock
was worth $0, but we’d be getting extra cash bonuses from Airtable to compensate.
The exact details of our levels and roles were not known and would depend on the
interviews to be done later in the week.&lt;/p&gt;

&lt;p&gt;Based on these discussions, we put the pieces together and realized that this was an
&lt;a href=&quot;https://en.wikipedia.org/wiki/Acqui-hiring&quot;&gt;acqui-hire&lt;/a&gt;. Airtable wasn’t interested in our product, our technology, or our
customers. They wanted our CEO to lead their new AI effort, and the rest of the
company was baggage that would be accommodated to the minimum degree required. It
was depressing for all of us.&lt;/p&gt;

&lt;h2 id=&quot;figuring-out-next-steps&quot;&gt;Figuring out next steps&lt;/h2&gt;

&lt;p&gt;From this point forward, all regular work stopped. We continued supporting our
existing customers (who didn’t yet know anything was happening), but there would be
no more feature development and all conversations with prospective customers were
cut off.&lt;/p&gt;

&lt;p&gt;For the next week, we sat in the office doing interview prep and chatting with each
other about our options.&lt;/p&gt;

&lt;p&gt;I was personally not very excited about Airtable. I had interviewed there and gotten
an offer back in 2019 but had decided at the time that it wasn’t a fit for me. Since
then, the company had expanded greatly but then contracted abruptly via two large
layoffs. Airtable’s headcount was about half of what it had been a year before,
which seemed kind of depressing to me.&lt;/p&gt;

&lt;p&gt;The interviews took place a week later and, thankfully, were fairly low stress. We
were mostly asked about our previous experiences and projects. In my case, it was
clear they were probing how high they would move me in the IC chain. My interviewers
were interested in hearing about how I managed cross-company projects at the large
companies on my resume as opposed to any of my technical work at Airplane.&lt;/p&gt;

&lt;p&gt;Later that week, we got our Airtable offers. They were strong but definitely not as
out-of-this-world as the CEO had promised. In addition to a base salary and equity,
we were each offered a $50-75k signing bonus that would have to be paid back in full
if we quit in the first year. We were given three business days to make a decision,
and there would be no severance if we declined.&lt;/p&gt;

&lt;p&gt;I said no, without much hesitation. As mentioned previously, I wasn’t excited about
Airtable to begin with, and the whole acquisition process had left a sour taste in
my mouth. In addition to shutting down the product and abandoning our customers,
Airtable had never given us a product demo, detailed financial information about the
company, or precise details on what we’d be working on aside from “AI-related
features”. The whole process seemed like a rushed, disorganized mess, and I knew
that there were better things out there.&lt;/p&gt;

&lt;p&gt;Several of my colleagues came to a similar conclusion and also declined.&lt;/p&gt;

&lt;h2 id=&quot;winding-down&quot;&gt;Winding down&lt;/h2&gt;

&lt;p&gt;On January 3, the acquisition and product shutdown were &lt;a href=&quot;https://www.airplane.dev/blog/airtable&quot;&gt;publicly announced&lt;/a&gt;. As
expected, customers were shocked and upset. Many had been using Airplane for
critical workflows within their organizations, and they now had to replace huge
chunks of internal tooling before the final shutoff on March 1. The CEO tried to
steer people to alternatives like &lt;a href=&quot;https://windmill.dev&quot;&gt;windmill.dev&lt;/a&gt; and &lt;a href=&quot;https://retool.com&quot;&gt;Retool&lt;/a&gt;, but neither of these is a
drop-in replacement for Airplane. Even if users found replacement tools, it would
take a lot of time and effort for them to do the migration.&lt;/p&gt;

&lt;p&gt;Two days later was our last day of payroll. Our email and Slack access was shut off,
and we were no longer Airplane employees. We were never given any termination
paperwork to sign, and there were no formal goodbyes. The company had disintegrated
with a sad whisper.&lt;/p&gt;

&lt;h2 id=&quot;why&quot;&gt;Why?&lt;/h2&gt;

&lt;p&gt;We were never told directly why the company was acquired under such unfavorable
terms. Contrary to what one would logically think in these circumstances, we didn’t
run out of money or even come close to that. In fact, we had tens of millions of
dollars in the bank (allowing for years of runway) in addition to a great team,
relatively happy customers, low churn, and solid revenue growth.&lt;/p&gt;

&lt;p&gt;When pressed in 1:1 conversations, the CEO said that he didn’t see a path to
significantly higher revenue. He claimed we would have to do a significant product
pivot and/or change our sales strategy to grow beyond a particular point, and that
this transformation would be risky and hard. It was easier to quit while we were
ahead, shut everything down, and work on something new.&lt;/p&gt;

&lt;p&gt;To me and others, it seemed like the CEO was tired of the startup grind and simply
giving up. Ever since his fellow co-founder quit, he’d been running both the
technical and go-to-market sides of the company, and it was clearly very draining.
We felt that he wanted a less stressful position with a lower risk payday, and
this kind of deal was the best way to achieve that.&lt;/p&gt;

&lt;h2 id=&quot;final-thoughts&quot;&gt;Final thoughts&lt;/h2&gt;

&lt;p&gt;Overall, I had a good experience at Airplane and up until the last month I was
pretty happy. In looking back, the thing that upsets me the most is not the loss of
money (I never put much value on my stock options anyways) but rather the multitude
of customer relationships that were flushed down the drain by abandoning the
product. Many of these customers had personally stuck their necks out within their
organizations to drive adoption of Airplane, and had spent hundreds of hours
migrating critical, internal workflows to our product. Now, they were being thrown
under the bus.&lt;/p&gt;

&lt;p&gt;Were there any alternatives to what the CEO did? I obviously don’t know all the
details of the offers that the CEO considered, but I feel like he could have either:
(1) held out for an acquisition offer that involved keeping the product running, (2)
separately sold the Airplane technology to another entity that would continue
developing it, or (3) open-sourced the product. In the end, he unilaterally chose an
outcome that I believe was not the best for our product or our customers, and it was
sad that we all had to go along with it.&lt;/p&gt;

&lt;p&gt;Building Airplane was a great experience, and I hope to work at a small startup
again at some point in my career. For now, though, I’m going to decompress at a
bigger company.&lt;/p&gt;</content>

    <author>
      <name>yolken</name>

      <uri>https://yolken.net</uri>

    </author>
  </entry>

  <entry>
    <title>Radiosonde Hunting - Success!</title>
    <link href="https://www.scd31.com/posts/radiosonde-success"/>
    <id>https://www.scd31.com/posts/radiosonde-success</id>
    <updated>2024-03-03T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Leverage</title>
    <link href="https://tomeraberba.ch/leverage"/>
    <id>https://tomeraberba.ch/leverage</id>
    <updated>2024-02-26T00:00:00Z</updated>


    <content type="html">I composed, produced, and released an instrumental jazz track! https://www.youtube.com/watch?v=mOxsT5-jnc8 You can also find the track on Spotify, Apple Music, YouTube Music, and other music streaming…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Radiosonde Hunting - A Post-Mortem</title>
    <link href="https://www.scd31.com/posts/radiosonde-post-mortem"/>
    <id>https://www.scd31.com/posts/radiosonde-post-mortem</id>
    <updated>2024-02-24T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>River: a Fast, Robust Job Queue for Go + Postgres</title>
    <link href="https://brandur.org/river"/>
    <id>tag:brandur.org,2023-11-20:river</id>
    <updated>2023-11-20T14:18:48Z</updated>


    <content type="html">&lt;p&gt;Years ago I wrote about &lt;a href=&quot;/postgres-queues&quot;&gt;my trouble with a job queue in Postgres&lt;/a&gt;, in which table bloat caused by long-running queries slowed down the workers&amp;rsquo; capacity to lock jobs as they hunted across millions of dead tuples trying to find a live one.&lt;/p&gt;

&lt;p&gt;A job queue in a database can have sharp edges, but I&amp;rsquo;d understated in that writeup the benefits that came with it. When used well, transactions and background jobs are a match made in heaven and completely sidestep a whole host of distributed systems problems that otherwise don&amp;rsquo;t have easy remediations.&lt;/p&gt;

&lt;p&gt;Consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In a transaction, a job is emitted to a Redis-based queue and picked up for work, but the transaction that emitted it isn&amp;rsquo;t yet committed, so none of the data it needs is available. The job fails and will need to be retried later.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;ml-auto w-10/12&quot;&gt;&lt;img src=&quot;/assets/images/river/data-not-visible.svg&quot; alt=&quot;Job failure because data is not yet visible&quot;&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;ul&gt;
&lt;li&gt;A job is emitted from a transaction which then rolls back. The job fails and will also fail every subsequent retry, pointlessly eating resources despite never being able to succeed, eventually landing in the dead letter queue.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;ml-auto w-10/12&quot;&gt;&lt;img src=&quot;/assets/images/river/data-roll-back.svg&quot; alt=&quot;Job failure because data rolled back&quot;&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;ul&gt;
&lt;li&gt;In an attempt to work around the data visibility problem, a job is emitted to Redis &lt;em&gt;after&lt;/em&gt; the transaction commits. But there&amp;rsquo;s a brief moment between the commit and job emit where if the process crashes or there&amp;rsquo;s a bug, the job is gone, requiring manual intervention to resolve (if it&amp;rsquo;s even noticed).&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;ml-auto w-10/12&quot;&gt;&lt;img src=&quot;/assets/images/river/job-emit-failure.svg&quot; alt=&quot;Job post-transaction emit failure&quot;&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;ul&gt;
&lt;li&gt;If both queue and store are non-transactional, all of the above and more. Instead of data not being visible, it may be that it&amp;rsquo;s in a partially ready state. If a job runs in the interim, all bets are off.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;ml-auto w-10/12&quot;&gt;&lt;img src=&quot;/assets/images/river/data-not-complete.svg&quot; alt=&quot;Job failure because data is not complete&quot;&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;p&gt;Work in a transaction has other benefits too. Postgres&amp;rsquo; &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-notify.html&quot;&gt;&lt;code&gt;NOTIFY&lt;/code&gt;&lt;/a&gt; respects transactions, so the moment a job is ready to work, the queue can wake a worker to pick it up, bringing the mean delay before work happens down to the sub-millisecond level.&lt;/p&gt;

&lt;p&gt;Despite our operational trouble, we never did replace our database job queue at Heroku. The price of switching would&amp;rsquo;ve been high, and despite blemishes, the benefits still outweighed the costs. I then spent the next six years staring into a maelstrom of pure chaos as I worked on a non-transactional data store. No standard for data consistency was too low. Code was a morass of conditional statements to protect against a million possible (and probable) edges where actual state didn&amp;rsquo;t line up with expected state. Job queues &amp;ldquo;worked&amp;rdquo; by brute force, bludgeoning jobs through until they could reach a point that could be tacitly called &amp;ldquo;successful&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;I also picked up a Go habit, to the point where it&amp;rsquo;s been my language of choice for years now. Working with it professionally during that time, there have been more than a few moments where I wished I had a good framework for transactional background jobs, but I didn&amp;rsquo;t find any that I particularly loved to use.&lt;/p&gt;

&lt;h2 id=&quot;river-is-born&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#river-is-born&quot;&gt;River is born&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;So a few months ago, &lt;a href=&quot;https://github.com/bgentry&quot;&gt;Blake&lt;/a&gt; and I did what one should generally never do, and started writing a new job queue project built specifically around Postgres, Go, and our favorite Go driver, &lt;a href=&quot;https://github.com/jackc/pgx&quot;&gt;pgx&lt;/a&gt;. And finally, after long discussions and much consternation around API shapes and implementation approaches, it&amp;rsquo;s ready for beta use.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;d like to introduce River (&lt;a href=&quot;https://github.com/riverqueue/river&quot;&gt;GitHub link&lt;/a&gt;), a job queue for building fast, airtight applications.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://riverqueue.com&quot;&gt;&lt;img src=&quot;/assets/images/river/river-home.png&quot; srcset=&quot;/assets/images/river/river-home@2x.png 2x, /assets/images/river/river-home.png 1x&quot; alt=&quot;Screen shot of River home page&quot; class=&quot;rounded-3xl&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;generics&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#generics&quot;&gt;Designed for generics&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;One of the relatively new features in Go (since 1.18) that we really wanted to take full advantage of was generics. A River worker takes a &lt;code&gt;river.Job[JobArgs]&lt;/code&gt; parameter that provides strongly typed access to the arguments within:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type SortWorker struct {
    river.WorkerDefaults[SortArgs]
}

func (w *SortWorker) Work(ctx context.Context, job *river.Job[SortArgs]) error {
    sort.Strings(job.Args.Strings)
    fmt.Printf(&amp;quot;Sorted strings: %+v\n&amp;quot;, job.Args.Strings)
    return nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;No raw JSON blobs. No &lt;code&gt;json.Unmarshal&lt;/code&gt; boilerplate in every job. No type conversions. 100% reflect-free.&lt;/p&gt;

&lt;p&gt;Jobs are raw Go structs with no embeds, magic, or shenanigans. Only a &lt;code&gt;Kind&lt;/code&gt; implementation that provides a unique, stable string to identify the job as it round trips to and from the database:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;type SortArgs struct {
    // Strings is a slice of strings to sort.
    Strings []string `json:&amp;quot;strings&amp;quot;`
}

func (SortArgs) Kind() string { return &amp;quot;sort&amp;quot; }
&lt;/code&gt;&lt;/pre&gt;
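
&lt;p&gt;To make the round trip concrete, here is a minimal, self-contained sketch of how a &lt;code&gt;Kind&lt;/code&gt; string and a JSON payload could be turned back into typed arguments using generics. It is an illustration under assumptions, not River&amp;rsquo;s actual implementation: &lt;code&gt;JobArgs&lt;/code&gt;, &lt;code&gt;Job&lt;/code&gt;, and &lt;code&gt;decodeJob&lt;/code&gt; are hypothetical stand-ins defined locally rather than imported from the library.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "encoding/json"
    "fmt"
    "sort"
)

// JobArgs is a hypothetical stand-in: any args type that can name its kind.
type JobArgs interface {
    Kind() string
}

// Job is a hypothetical typed wrapper, loosely modeled on the shape shown above.
type Job[T JobArgs] struct {
    Kind string
    Args T
}

type SortArgs struct {
    Strings []string `json:"strings"`
}

func (SortArgs) Kind() string { return "sort" }

// decodeJob unmarshals the stored payload once, so a worker only ever sees
// typed arguments instead of raw JSON.
func decodeJob[T JobArgs](encodedArgs []byte) (Job[T], error) {
    args := new(T)
    if err := json.Unmarshal(encodedArgs, args); err != nil {
        return Job[T]{}, err
    }
    return Job[T]{Kind: (*args).Kind(), Args: *args}, nil
}

func main() {
    job, err := decodeJob[SortArgs]([]byte(`{"strings": ["whale", "tiger", "bear"]}`))
    if err != nil {
        panic(err)
    }
    sort.Strings(job.Args.Strings)
    fmt.Println(job.Kind, job.Args.Strings) // prints: sort [bear tiger whale]
}
&lt;/code&gt;&lt;/pre&gt;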

&lt;p&gt;Beyond the basics, River supports batch insertion, error and panic handlers, periodic jobs, subscription hooks for telemetry, unique jobs, and a host of other features.&lt;/p&gt;

&lt;p&gt;Job queues are never really done, but we&amp;rsquo;re pretty proud of the API design and initial feature set. Check out &lt;a href=&quot;https://github.com/riverqueue/river&quot;&gt;the project&amp;rsquo;s README&lt;/a&gt; and &lt;a href=&quot;https://riverqueue.com/docs&quot;&gt;getting started guide&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;performance&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#performance&quot;&gt;With performance in mind&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;One of the reasons we like to write things in Go is that it&amp;rsquo;s fast. We wanted River to be a good citizen of the ecosystem and designed it to use fast techniques where we could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It takes advantage of pgx&amp;rsquo;s implementation of Postgres&amp;rsquo; binary protocol, avoiding a lot of marshaling to and parsing from strings.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;It minimizes round trips to the database, performing batch selects and updates to amalgamate work.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Operations like bulk job insertions make use of &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-copy.html&quot;&gt;&lt;code&gt;COPY FROM&lt;/code&gt;&lt;/a&gt; for efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We haven&amp;rsquo;t even begun to optimize it so I won&amp;rsquo;t be showing any benchmarks (which tend to be misleading anyway), but on my commodity MacBook Air it works ~10k trivial jobs a second. It&amp;rsquo;s not slow.&lt;/p&gt;

&lt;h2 id=&quot;whats-different&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#whats-different&quot;&gt;What&#39;s different now?&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;You might be thinking: Brandur, you&amp;rsquo;ve had trouble with job queues in databases before. Now you&amp;rsquo;re promoting one. Why?&lt;/p&gt;

&lt;p&gt;A few reasons. The first is, as described above, transactions are really &lt;em&gt;just a really good idea&lt;/em&gt;. Maybe &lt;em&gt;the best&lt;/em&gt; idea in robust service design. For the last few years I&amp;rsquo;ve been putting my money where my mouth is and building a service modeled entirely around transactions and strong data constraints. Data inconsistencies are still possible, but especially in a relative sense, they functionally don&amp;rsquo;t exist. The amount of time this saves operators from having to manually mess around in consoles fixing things cannot be overstated. It&amp;rsquo;s the difference between night and day.&lt;/p&gt;

&lt;h3 id=&quot;single-dependency-stacks&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#single-dependency-stacks&quot;&gt;Single dependency stacks&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Another reason is that dependency minimization is great. I&amp;rsquo;ve written previously about how at work &lt;a href=&quot;/fragments/single-dependency-stacks&quot;&gt;we run a single dependency stack&lt;/a&gt;. No ElastiCache, no Redis, no bespoke queueing components, just Postgres. If there&amp;rsquo;s a problem with Postgres, we can fix it. No need to develop expertise in how to operate rarely used, black box systems.&lt;/p&gt;

&lt;p&gt;This idea isn&amp;rsquo;t unique. An interesting development in Ruby on Rails 7.1 is the addition of &lt;a href=&quot;https://github.com/rails/solid_cache&quot;&gt;Solid Cache&lt;/a&gt;, which 37signals uses to cache in the same database that they use for the rest of their data (same database, but different instances of it of course). Ten years ago this would&amp;rsquo;ve made little sense because you&amp;rsquo;d want a hot cache that&amp;rsquo;d serve content from memory only, but advancements in disks (SSDs) have been so great that they measured a real-world difference in the double digits (25-50%) moving their cache from Redis to MySQL, but with a huge increase in cache hits because a disk-based system allows cache space to widen expansively.&lt;/p&gt;

&lt;h3 id=&quot;ruby-non-parallelism&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#ruby-non-parallelism&quot;&gt;Ruby non-parallelism&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;A big part of our queue problem at Heroku was the design of the specific job system we were using, and Ruby deployment. Because Ruby doesn&amp;rsquo;t support real parallelism, it&amp;rsquo;s commonly deployed with a &lt;a href=&quot;/nanoglyphs/027-15-minutes&quot;&gt;process forking model&lt;/a&gt; to maximize performance, and this was the case for us. Every worker was its own Ruby process operating independently.&lt;/p&gt;

&lt;p&gt;This produced a lot of contention and unnecessary work. Running independently, every worker was separately competing to lock every new job. So for &lt;em&gt;every&lt;/em&gt; new job to work, &lt;em&gt;every&lt;/em&gt; worker contended with &lt;em&gt;every other&lt;/em&gt; worker and iterated millions of dead job rows &lt;em&gt;every&lt;/em&gt; time. That&amp;rsquo;s a lot of inefficiency.&lt;/p&gt;

&lt;p&gt;A River cluster may run with many processes, but there&amp;rsquo;s orders of magnitude more parallel capacity within each as individual jobs are run on goroutines. A producer inside each process consolidates work and locks jobs for all its internal executors, saving a lot of grief. Separate Go processes may still contend with each other, but many fewer of them are needed thanks to superior intra-process concurrency.&lt;/p&gt;

&lt;h3 id=&quot;postgres-improvements&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#postgres-improvements&quot;&gt;Improvements in Postgres&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;During my last queue problems we would&amp;rsquo;ve been using Postgres 9.4. We have the benefits of nine new major versions since then, which have brought a lot of optimizations around performance and indexes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The most important for a queue was the addition of &lt;a href=&quot;https://www.2ndquadrant.com/en/blog/what-is-select-skip-locked-for-in-postgresql-9-5/&quot;&gt;&lt;code&gt;SKIP LOCKED&lt;/code&gt;&lt;/a&gt; in 9.5, which lets transactions find rows to lock with less effort by skipping rows that are already locked. This feature is old (although no less useful) now, but we didn&amp;rsquo;t have it at the time.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Postgres 12 brought in &lt;code&gt;REINDEX CONCURRENTLY&lt;/code&gt;, allowing queue indexes to be rebuilt periodically to remove detritus and bloat.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Postgres 13 added &lt;a href=&quot;https://www.postgresql.org/docs/13/btree-implementation.html#BTREE-DEDUPLICATION&quot;&gt;B-tree deduplication&lt;/a&gt;, letting indexes with low cardinality (of which a job queue has several) be stored much more efficiently.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Postgres 14 brought in an optimization to &lt;a href=&quot;https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION&quot;&gt;skip B-tree splits&lt;/a&gt; by removing expired entries as new ones are added. Very helpful for indexes with a lot of churn like a job queue&amp;rsquo;s.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
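
&lt;p&gt;As a rough illustration of the first item, a worker can claim a batch of jobs with &lt;code&gt;SKIP LOCKED&lt;/code&gt; along the lines of the query below. This is a sketch under assumptions: the &lt;code&gt;jobs&lt;/code&gt; table and its &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;state&lt;/code&gt; columns are hypothetical and are not River&amp;rsquo;s schema or its actual locking query.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "fmt"
    "strings"
)

// lockJobsSQL claims up to $1 available jobs in a single statement.
// The table and column names are hypothetical.
const lockJobsSQL = `
UPDATE jobs
SET state = 'running'
WHERE id IN (
    SELECT id
    FROM jobs
    WHERE state = 'available'
    ORDER BY id
    LIMIT $1
    FOR UPDATE SKIP LOCKED -- pass over rows another worker already holds
)
RETURNING id`

func main() {
    // A real worker would pass this to its database driver along with a
    // batch size; here it is only printed.
    fmt.Println(strings.TrimSpace(lockJobsSQL))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Without &lt;code&gt;SKIP LOCKED&lt;/code&gt;, a second worker running the same statement would block on the rows the first one had locked instead of moving on to the next available jobs.&lt;/p&gt;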

&lt;p&gt;And I&amp;rsquo;m sure there are many I&amp;rsquo;ve forgotten. Every new Postgres release brings dozens of small improvements and optimizations, and they add up.&lt;/p&gt;

&lt;p&gt;Also exciting is the &lt;a href=&quot;https://www.postgresql.org/message-id/CAAhFRxiQsRs2Eq5kCo9nXE3HTugsAAJdSQSmxncivebAxdmBjQ@mail.gmail.com&quot;&gt;potential addition of a transaction timeout setting&lt;/a&gt;. Postgres has timeouts for individual statements and for being idle in a transaction, but not for the total duration of a transaction. As with many OLTP operations, long-lived transactions are hazardous for job queues, and it&amp;rsquo;ll be a big improvement to be able to put an upper bound on them.&lt;/p&gt;

&lt;h2 id=&quot;try-it&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#try-it&quot;&gt;Try it&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;Anyway, &lt;a href=&quot;https://riverqueue.com/&quot;&gt;check out River&lt;/a&gt; (see also the &lt;a href=&quot;https://github.com/riverqueue/river&quot;&gt;GitHub repo&lt;/a&gt; and &lt;a href=&quot;https://riverqueue.com/docs&quot;&gt;docs&lt;/a&gt;) and we&amp;rsquo;d appreciate it if you helped kick the tires a bit. We prioritized getting the API as polished as we could (we&amp;rsquo;re &lt;em&gt;really&lt;/em&gt; trying to avoid a &lt;code&gt;/v2&lt;/code&gt;), but are still doing a lot of active development as we refactor internals, optimize, and generally nicen things up.&lt;/p&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>When Two Bugs Cancel Out</title>
    <link href="https://tomeraberba.ch/when-two-bugs-cancel-out"/>
    <id>https://tomeraberba.ch/when-two-bugs-cancel-out</id>
    <updated>2023-09-18T00:00:00Z</updated>


    <content type="html">I encountered a perplexing bug while implementing table cell splitting in Google Docs. The bug report 🐛 The QA team found a bug where some of a table&#39;s borders would be missing after a table cell…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Real Time Video From a High Altitude Balloon</title>
    <link href="https://www.scd31.com/posts/real-time-balloon-video"/>
    <id>https://www.scd31.com/posts/real-time-balloon-video</id>
    <updated>2023-09-10T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>On Using Go&#39;s `t.Parallel()`</title>
    <link href="https://brandur.org/t-parallel"/>
    <id>tag:brandur.org,2023-08-26:t-parallel</id>
    <updated>2023-08-26T20:48:45Z</updated>


    <content type="html">&lt;p&gt;One of Go&amp;rsquo;s best features is not only that it does parallelism well, but that it&amp;rsquo;s deeply baked in. It&amp;rsquo;s best exemplified by primitives like goroutines and their dead simple ease of use, but extends all the way up the chain to the built-in tooling. When running tests for many packages with &lt;code&gt;go test ./...&lt;/code&gt;, packages automatically run in parallel up to a maximum equal to the number of CPUs on the machine. Between that and the language&amp;rsquo;s famously fast compilation, test suites are fast &lt;em&gt;by default&lt;/em&gt; instead of something that needs to be painstakingly optimized later on.&lt;/p&gt;

&lt;p&gt;Within any specific package, tests run sequentially, and as long as packages aren&amp;rsquo;t too mismatched in test suite size, that&amp;rsquo;s generally good enough.&lt;/p&gt;

&lt;p&gt;But having uniformly sized package test suites isn&amp;rsquo;t always a given, and some packages can grow to be quite large. We have a &lt;code&gt;./server/api&lt;/code&gt; package that contains the majority of our product&amp;rsquo;s API and ~200 tests to exercise it, and it&amp;rsquo;s measurably slower than most packages in the project.&lt;/p&gt;

&lt;p&gt;For cases like this, Go has another useful parallel facility: &lt;a href=&quot;https://pkg.go.dev/testing#T.Parallel&quot;&gt;&lt;code&gt;t.Parallel()&lt;/code&gt;&lt;/a&gt;, which lets specific tests &lt;em&gt;within a package&lt;/em&gt; be flagged to run in parallel with each other. When applied to our large package, it reduced the time needed for a single run by 30-40% or by 2-3x for ten consecutive runs.&lt;/p&gt;

&lt;p&gt;Before &lt;code&gt;t.Parallel()&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go test ./server/api -count=1
ok      github.com/crunchydata/priv-all-platform/server/api     1.486s
$ go test ./server/api -count=10
ok      github.com/crunchydata/priv-all-platform/server/api     11.786s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After &lt;code&gt;t.Parallel()&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;$ go test ./server/api -count=1
ok      github.com/crunchydata/priv-all-platform/server/api     0.966s
$ go test ./server/api -count=10
ok      github.com/crunchydata/priv-all-platform/server/api     3.959s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These tests were already pretty fast (to beat a dead horse again: running &lt;em&gt;every&lt;/em&gt; API test for this project is 3-5x+ faster than it took to run &lt;em&gt;a single test case&lt;/em&gt; during my time at Stripe; language choice and infrastructure design make a big difference), but this is one of the packages that we run tests on most frequently, so a 30-40% speed up makes a noticeable difference in DX when iterating.&lt;/p&gt;

&lt;p&gt;After adding &lt;code&gt;t.Parallel()&lt;/code&gt; to this one package, we then went through and added it to every test in every package, and then put in a ratchet with &lt;a href=&quot;https://golangci-lint.run/usage/linters/#paralleltest&quot;&gt;the &lt;code&gt;paralleltest&lt;/code&gt; linter&lt;/a&gt; to mandate it for future additions.&lt;/p&gt;

&lt;p&gt;Should you bother adding &lt;code&gt;t.Parallel()&lt;/code&gt; like we did? Maybe. It&amp;rsquo;s a pretty easy standard to adhere to when starting from scratch, and for existing projects it&amp;rsquo;ll be easier to add it today than at any point later on, so it&amp;rsquo;s worth considering.&lt;/p&gt;
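
&lt;p&gt;For reference, opting in looks something like the following minimal sketch. The function under test, &lt;code&gt;shout&lt;/code&gt;, and the package name are hypothetical and exist only to keep the example self-contained; the pattern is a call to &lt;code&gt;t.Parallel()&lt;/code&gt; at the top of the test and again in each subtest.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package example

import (
    "strings"
    "testing"
)

// shout is a hypothetical function under test.
func shout(s string) string { return strings.ToUpper(s) + "!" }

func TestShout(t *testing.T) {
    t.Parallel() // run alongside other parallel tests in this package

    testCases := []struct{ name, input, want string }{
        {"Lowercase", "river", "RIVER!"},
        {"AlreadyUpper", "GO", "GO!"},
    }
    for _, tc := range testCases {
        tc := tc // copy the loop variable; required before Go 1.22
        t.Run(tc.name, func(t *testing.T) {
            t.Parallel() // subtests have to opt in separately
            if got := shout(tc.input); got != tc.want {
                t.Errorf("shout(%q) = %q, want %q", tc.input, got, tc.want)
            }
        })
    }
}
&lt;/code&gt;&lt;/pre&gt;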

&lt;h2 id=&quot;section-0&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#section-0&quot;&gt;Is `t.Parallel()` broadly recommended practice?&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;As far as I can tell, no.&lt;/p&gt;

&lt;p&gt;I like to use the Go language&amp;rsquo;s own source code to glean convention, and by my rough measurement only about 1/10th of its test suite uses &lt;code&gt;t.Parallel()&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sh&quot;&gt;# total number of tests
$ ag --no-filename --nobreak &#39;func Test&#39; | wc -l
    7786
    
# total number of uses of `t.Parallel()`
$ ag --no-filename --nobreak &#39;t\.Parallel\(\)&#39; | wc -l
     620
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This isn&amp;rsquo;t too surprising. As discussed above, parallelism across packages is usually good enough, and when iterating tests in one specific package, Go&amp;rsquo;s already pretty fast. For smaller packages adding parallelism is probably a wash, and for very small ones the extra overhead probably makes them slower (although trivially so).&lt;/p&gt;

&lt;p&gt;Still, it might not be a bad idea. As some packages grow to be large, parallel testing will keep them fast, and annotating tests with &lt;code&gt;t.Parallel()&lt;/code&gt; from the beginning is a lot easier than going back to add it to every test case and fix parallelism problems later on.&lt;/p&gt;

&lt;h2 id=&quot;sharp-edges&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#sharp-edges&quot;&gt;Sharp edges&lt;/a&gt;&lt;/h2&gt;

&lt;h3 id=&quot;test-tx&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#test-tx&quot;&gt;Sharing a database with test transactions&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;The biggest difficulty for many projects will be to have a strategy for the test database that can support parallelism. It&amp;rsquo;s easy to build a system where multiple tests target the same test database and insert data that conflicts with each other.&lt;/p&gt;

&lt;p&gt;We use &lt;a href=&quot;/fragments/go-test-tx-using-t-cleanup&quot;&gt;test transactions&lt;/a&gt; to avoid this. Each test opens a transaction, runs everything inside it, and rolls the transaction back as it finishes up. A simplified test helper looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func TestTx(ctx context.Context, t *testing.T) pgx.Tx {
    tx, err := getPool().Begin(ctx)
    require.NoError(t, err)

    t.Cleanup(func() {
        err := tx.Rollback(ctx)
        if !errors.Is(err, pgx.ErrTxClosed) {
            require.NoError(t, err)
        }
    })

    return tx
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Invocations of the helper share a package-level pgx pool that&amp;rsquo;s automatically parallel-safe (but still has a mutex to make sure that only one test case initializes it):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;var (
    dbPool   *pgxpool.Pool
    dbPoolMu sync.RWMutex
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Usage is succinct and idiot-proof thanks to Go&amp;rsquo;s test &lt;code&gt;Cleanup&lt;/code&gt; hook:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;tx := TestTx(ctx, t)
&lt;/code&gt;&lt;/pre&gt;
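
&lt;p&gt;The lazily initialized, mutex-guarded pool mentioned above could be sketched roughly as follows. This is an assumption about the shape of such a helper, not our exact code: the &lt;code&gt;pool&lt;/code&gt; type is a stand-in for &lt;code&gt;*pgxpool.Pool&lt;/code&gt; so the example runs without a database, and &lt;code&gt;poolOpens&lt;/code&gt; exists only to show that initialization happens once.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;package main

import (
    "fmt"
    "sync"
)

// pool is a stand-in for *pgxpool.Pool so the sketch needs no database.
type pool struct{}

var (
    dbPool    *pool
    dbPoolMu  sync.RWMutex
    poolOpens int // counts initializations, for demonstration only
)

func getPool() *pool {
    // Fast path: most callers find the pool already initialized.
    dbPoolMu.RLock()
    p := dbPool
    dbPoolMu.RUnlock()
    if p != nil {
        return p
    }

    // Slow path: take the write lock and check again, because another
    // test may have initialized the pool while this one was waiting.
    dbPoolMu.Lock()
    defer dbPoolMu.Unlock()
    if dbPool == nil {
        dbPool = new(pool) // a real helper would open the pgx pool here
        poolOpens++
    }
    return dbPool
}

func main() {
    var wg sync.WaitGroup
    for i := 0; i != 8; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            getPool()
        }()
    }
    wg.Wait()
    fmt.Println("pool initialized", poolOpens, "time(s)") // prints: pool initialized 1 time(s)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A &lt;code&gt;sync.Once&lt;/code&gt; would serve the same purpose; the read/write lock variant is shown here only because it matches the variables declared above.&lt;/p&gt;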

&lt;h3 id=&quot;deadlocks&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#deadlocks&quot;&gt;Deadlocks across transactions&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;The trickiest problem I had to fix while enabling &lt;code&gt;t.Parallel()&lt;/code&gt; involved Postgres upsert. We have a number of places where we seed data with an upsert to guarantee that it&amp;rsquo;s always in the database regardless of whether the program has run before or is starting for the first time. In the test suite, individual test cases would upsert a &amp;ldquo;known&amp;rdquo; resource:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;plan := dbfactory.Plan_AWS_Hobby2(ctx, t, tx)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Implemented as:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func Plan(ctx context.Context, t *testing.T, e db.Executor, opts *PlanOpts) *dbsqlc.Plan {
    validateOpts(t, opts)

    configPlan := providers.Default.MustGet(opts.ProviderID).MustGetPlan(opts.PlanID, true)

    plan, err := dbsqlc.New(e).PlanUpsert(ctx, dbsqlc.PlanUpsertParams{
        CPU:         int32(configPlan.CPU),
        Disabled:    configPlan.Disabled,
        DisplayName: configPlan.DisplayName,
        Instance:    configPlan.Instance,
        Memory:      configPlan.Memory,
        ProviderID:  opts.ProviderID,
        PlanID:      configPlan.ID,
        Rate:        int32(configPlan.Rate),
    })
    require.NoError(t, err)
    return &amp;amp;plan
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To my surprise, adding &lt;code&gt;t.Parallel()&lt;/code&gt; would fail many tests at these invocations. Despite every test case running in its own transaction, it&amp;rsquo;s still possible for them to deadlock against each other as they try to upsert exactly the same data.&lt;/p&gt;

&lt;p&gt;We resolved the problem by moving to a fixture seeding model, so when the test database is being created, in addition to loading a schema and running migrations, we also load a common set of test data in it that all tests will share (test transactions ensure that any changes to it are rolled back):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-make&quot;&gt;.PHONY: db/test
db/test:
    psql --echo-errors --quiet -c &#39;\timing off&#39; -c &amp;quot;DROP DATABASE IF EXISTS platform_main_test WITH (FORCE);&amp;quot;
    psql --echo-errors --quiet -c &#39;\timing off&#39; -c &amp;quot;CREATE DATABASE platform_main_test;&amp;quot;
    psql --echo-errors --quiet -c &#39;\timing off&#39; -f sql/main_schema.sql
    go run ./apps/pmigrate
    go run ./tools/src/seed-test-database/main.go
            
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So the implementation becomes a lookup instead:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func Plan(ctx context.Context, t *testing.T, e db.Executor, opts *PlanOpts) *dbsqlc.Plan {
    validateOpts(t, opts)

    _ = providers.Default.MustGet(opts.ProviderID).MustGetPlan(opts.PlanID, true)

    // Requires that test data has already been seeded.
    plan, err := dbsqlc.New(e).PlanGetByID(ctx, dbsqlc.PlanGetByIDParams{
        PlanID:     opts.PlanID,
        ProviderID: opts.ProviderID,
    })
    require.NoError(t, err)

    return &amp;amp;plan
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;logging&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#logging&quot;&gt;Logging and `t.Log`&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;We make fairly extensive use of logging, and previously we&amp;rsquo;d just log everything in tests to stdout. This was fine because Go automatically suppresses output to stdout without an additional &lt;code&gt;-test.v&lt;/code&gt; verbose flag, and because tests ran sequentially, the output looked fine even when testing verbosely, with logs for each test case correctly appearing within their begin/end banners.&lt;/p&gt;

&lt;p&gt;But with &lt;code&gt;t.Parallel()&lt;/code&gt;, everything became mixed together into a big log soup:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;=== RUN   TestClusterCreateRequest/StorageTooSmall
--- PASS: TestClusterCreateRequest (0.00s)
    --- PASS: TestClusterCreateRequest/StorageTooSmall (0.00s)
=== CONT  TestMultiFactorServiceList
=== RUN   TestMultiFactorServiceList/Success
=== RUN   TestMultiFactorServiceUpdate/SuccessWebAuthn
time=&amp;quot;2023-08-20T22:26:28Z&amp;quot; level=info msg=&amp;quot;password_hash_line: Match result: success [account: eee5c815-b7c6-4f19-8e1d-92428eed32ab] [hash time: 0.000496s]&amp;quot; account_id=eee5c815-b7c6-4f19-8e1d-92428eed32ab hash_duration=0.000496s hash_match=true
=== RUN   TestClusterServiceDelete/Owl410Gone
=== RUN   TestMultiFactorServiceList/Pagination
time=&amp;quot;2023-08-20T22:26:28Z&amp;quot; level=info msg=&amp;quot;sessionService: password_hash_upgrade_line: Upgraded password from \&amp;quot;argon2id\&amp;quot; to \&amp;quot;argon2id\&amp;quot; [account: eee5c815-b7c6-4f19-8e1d-92428eed32ab] [hash time: 0.000435s]&amp;quot; account_id=eee5c815-b7c6-4f19-8e1d-92428eed32ab new_algorithm=argon2id new_argon2id_memory=1024 new_argon2id_parallelism=4 new_argon2id_time=1 new_hash_duration=0.000435s old_algorithm=argon2id old_hash_iterations=0
=== RUN   TestClusterUpgradeServiceCreate/HobbyMaximum100GB
=== RUN   TestClusterServiceCreate/WithPostgresVersionID
=== RUN   TestMultiFactorServiceUpdate/WrongAccountNotFoundError
=== RUN   TestClusterServiceForkCreate/WithTargetTime
--- PASS: TestMultiFactorServiceList (0.01s)
    --- PASS: TestMultiFactorServiceList/Success (0.00s)
    --- PASS: TestMultiFactorServiceList/Pagination (0.00s)
=== CONT  TestClusterServiceActionTailscaleDisconnect
=== RUN   TestClusterServiceActionTailscaleDisconnect/Success
time=&amp;quot;2023-08-20T22:26:28Z&amp;quot; level=info msg=&amp;quot;password_hash_line: Match result: success [account: eee5c815-b7c6-4f19-8e1d-92428eed32ab] [hash time: 0.000828s]&amp;quot; account_id=eee5c815-b7c6-4f19-8e1d-92428eed32ab hash_duration=0.000828s hash_match=true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This isn&amp;rsquo;t usually a problem because you&amp;rsquo;re not reading the logs anyway, but quickly becomes one if you get a test failure, and only have senseless noise around it to help you debug.&lt;/p&gt;

&lt;p&gt;The fix for this is &lt;a href=&quot;https://pkg.go.dev/testing?#T.Logf&quot;&gt;&lt;code&gt;t.Logf&lt;/code&gt;&lt;/a&gt;, which collates log output with the particular test case that emitted it. Using it with a logging library generally requires a shim like:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;// tlogWriter is an adapter between Logrus and Go&#39;s testing package,
// which lets us send all output to `t.Log` so that it&#39;s correctly
// collated with the test that emitted it. This helps especially when
// using parallel testing where output would otherwise be interleaved
// and make debugging extremely difficult.
type tlogWriter struct {
    tb testing.TB
}

func (lw *tlogWriter) Write(p []byte) (n int, err error) {
    // Unfortunately, even with this call to `t.Helper()` there&#39;s no
    // way to correctly attribute the log location to where it&#39;s
    // actually emitted in our code (everything shows up under
    // `entry.go`). A good explanation of this problem and possible
    // future solutions here:
    //
    // https://github.com/neilotoole/slogt#deficiency
    lw.tb.Helper()

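    // Use Log rather than Logf so that a literal % in the log line is
    // never interpreted as a format verb.
    lw.tb.Log(string(p))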
    return len(p), nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then with Logrus for example:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func Logger(tb testing.TB) *logrus.Entry {
    logger := logrus.New()
    logger.SetOutput(&amp;amp;tlogWriter{tb})
    return logrus.NewEntry(logger)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now when a test fails, any logs it produced are grouped correctly:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;--- FAIL: TestSessionServiceCreate (0.05s)
    --- FAIL: TestSessionServiceCreate/PasswordHashAlgorithmUpgrade (0.05s)
        entry.go:294: time=&amp;quot;2023-08-20T22:34:15Z&amp;quot; level=info msg=&amp;quot;password_hash_line: Match result: success [account: 81b967f7-4f5c-4ab4-b1d7-3c455db35767] [hash time: 0.000694s]&amp;quot; account_id=81b967f7-4f5c-4ab4-b1d7-3c455db35767 hash_duration=0.000694s hash_match=true
        entry.go:294: time=&amp;quot;2023-08-20T22:34:15Z&amp;quot; level=info msg=&amp;quot;sessionService: password_hash_upgrade_line: Upgraded password from \&amp;quot;argon2id\&amp;quot; to \&amp;quot;argon2id\&amp;quot; [account: 81b967f7-4f5c-4ab4-b1d7-3c455db35767] [hash time: 0.011716s]&amp;quot; account_id=81b967f7-4f5c-4ab4-b1d7-3c455db35767 new_algorithm=argon2id new_argon2id_memory=19456 new_argon2id_parallelism=4 new_argon2id_time=2 new_hash_duration=0.011716s old_algorithm=argon2id old_hash_iterations=0
        session_service_test.go:197:
                Error Trace:    /Users/brandur/Documents/crunchy/platform/server/api/session_service_test.go:197
                                                        /Users/brandur/Documents/crunchy/platform/server/api/session_service_test.go:158
                Error:          artificial failure
                Test:           TestSessionServiceCreate/PasswordHashAlgorithmUpgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Bridges for common loggers like slog are usually available as public packages. &lt;a href=&quot;https://github.com/neilotoole/slogt&quot;&gt;Slogt&lt;/a&gt;, for example.&lt;/p&gt;

&lt;h3 id=&quot;goleak&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#goleak&quot;&gt;goleak&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Our tests use &lt;a href=&quot;https://github.com/uber-go/goleak&quot;&gt;goleak&lt;/a&gt; to detect any accidentally leaked goroutines, a practice that I&amp;rsquo;d recommend since leaking goroutines without realizing it is easily one of Go&amp;rsquo;s top footguns.&lt;/p&gt;

&lt;p&gt;Previously, we had a pattern in which every test case would check itself for goroutine leaks, but adding &lt;code&gt;t.Parallel()&lt;/code&gt; broke the pattern because test cases running in parallel would detect each other&amp;rsquo;s goroutines as leaks.&lt;/p&gt;

&lt;p&gt;The fix was to use goleak&amp;rsquo;s built-in &lt;code&gt;TestMain&lt;/code&gt; wrapper:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-go&quot;&gt;func TestMain(m *testing.M) {
    goleak.VerifyTestMain(m)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Leaked goroutines are only detected at package-level granularity, but as long as you&amp;rsquo;re starting off from a baseline of no leaks, that&amp;rsquo;s good enough to detect regressions.&lt;/p&gt;

&lt;h2 id=&quot;other-notes&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#other-notes&quot;&gt;Other notes&lt;/a&gt;&lt;/h2&gt;

&lt;h3 id=&quot;tests-not-subtests&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#tests-not-subtests&quot;&gt;Requiring &lt;code&gt;t.Parallel()&lt;/code&gt; in tests, but not subtests&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;By default the &lt;code&gt;paralleltest&lt;/code&gt; lint will not only require that every test case define &lt;code&gt;t.Parallel()&lt;/code&gt;, but that every subtest (i.e. &lt;code&gt;t.Run(&amp;quot;Subtest&amp;quot;, func(t *testing.T) { ... })&lt;/code&gt;) define it as well. This is generally the right thing to do because it gives parallelism finer granularity, which is more likely to improve throughput and lower the total runtime.&lt;/p&gt;

&lt;p&gt;Due to a tech decision made long ago, we were ubiquitously using a testing convention in which test cases had plenty of subtests, but the subtests were not parallel safe because they all shared a single &lt;code&gt;var&lt;/code&gt; block.&lt;/p&gt;

&lt;p&gt;Refactoring to total parallel-safety would&amp;rsquo;ve taken dozens of hours and wasn&amp;rsquo;t a good use of time, so we declared &lt;code&gt;t.Parallel()&lt;/code&gt; at the granularity of test cases but &lt;em&gt;not&lt;/em&gt; subtests to be &amp;ldquo;good enough&amp;rdquo;. I added an &lt;a href=&quot;https://github.com/kunwardeep/paralleltest/pull/32&quot;&gt;&lt;code&gt;ignoremissingsubtests&lt;/code&gt; option to &lt;code&gt;paralleltest&lt;/code&gt;&lt;/a&gt; to support that, and if your setup is anything like ours, maybe that&amp;rsquo;ll help you:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;linters-settings:
  paralleltest:
    # Ignore missing calls to `t.Parallel()` in subtests. Top-level
    # tests are still required to have `t.Parallel`, but subtests are
    # allowed to skip it.
    #
    # Default: false
    ignore-missing-subtests: true
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;takeaways&quot; class=&quot;link&quot;&gt;&lt;a href=&quot;#takeaways&quot;&gt;Takeaways&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;As noted above, it&amp;rsquo;s not exactly Go convention to make ubiquitous use of &lt;code&gt;t.Parallel()&lt;/code&gt;. That said, it&amp;rsquo;s reduced our test iteration time for large packages by 30-40%, and that&amp;rsquo;s enough of a development win that I personally intend to use it for future Go projects.&lt;/p&gt;

&lt;p&gt;And although increased test speed is its main benefit, when combined with &lt;code&gt;go test . -race&lt;/code&gt; it&amp;rsquo;s actually managed to help suss out some tricky parallel safety bugs that weren&amp;rsquo;t being caught with sequential-only test runs. That&amp;rsquo;s a big advantage because that whole class of bug is &lt;em&gt;very&lt;/em&gt; difficult to debug in production.&lt;/p&gt;

&lt;p&gt;Activating &lt;code&gt;t.Parallel()&lt;/code&gt; everywhere for an existing project could be a big deal, but integrating it from the beginning has very little ongoing cost, and might yield substantial benefits later on.&lt;/p&gt;</content>

    <author>
      <name>brandur</name>

      <uri>https://brandur.org</uri>

    </author>
  </entry>

  <entry>
    <title>Driving Poles Into the Ground</title>
    <link href="https://www.scd31.com/posts/driving-a-post"/>
    <id>https://www.scd31.com/posts/driving-a-post</id>
    <updated>2023-07-23T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>The Making of Keyalesce</title>
    <link href="https://tomeraberba.ch/the-making-of-keyalesce"/>
    <id>https://tomeraberba.ch/the-making-of-keyalesce</id>
    <updated>2023-07-02T00:00:00Z</updated>


    <content type="html">Have you ever wanted to use tuples or objects for the keys of a Map or the values of a Set? It&#39;s a very common question because the following code doesn&#39;t do what you might expect: The code behaves…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

  <entry>
    <title>Radio Mischief in the Montreal Metro</title>
    <link href="https://www.scd31.com/posts/radio-metro"/>
    <id>https://www.scd31.com/posts/radio-metro</id>
    <updated>2023-06-21T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Goodbye, San Francisco</title>
    <link href="https://yolken.net/blog/goodbye-san-francisco"/>
    <id>https://yolken.net/blog/goodbye-san-francisco</id>
    <updated>2023-06-18T20:30:00Z</updated>


    <content type="html">&lt;p&gt;After nearly 10 years of living in San Francisco, I decided to
leave at the end of last year. In this post, I explain what
motivated my departure and how the city can improve in the future.&lt;/p&gt;

&lt;figure&gt;
  &lt;img src=&quot;/assets/sf_skyline.jpeg&quot; alt=&quot;San Francisco skyline&quot; /&gt;
  &lt;figcaption&gt;
    View of my former neighborhood from my apartment.
  &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h2 id=&quot;problems&quot;&gt;Problems&lt;/h2&gt;

&lt;p&gt;For the majority of my time in SF, I was pretty happy with life there.
I had a great apartment in a great location. Although I had a car,
I rarely used it because I could easily walk to the train, the city’s
main shopping district (Union Square), and the offices of my various
employers over the years.&lt;/p&gt;

&lt;p&gt;Starting 4-5 years ago, however, things started changing for the
worse, and then they accelerated downwards during the pandemic. It’s hard
to pin down exactly what happened and when, but I think the core problem
for me was that my sense of safety significantly degraded.&lt;/p&gt;

&lt;h3 id=&quot;physical-safety&quot;&gt;Physical safety&lt;/h3&gt;

&lt;p&gt;My biggest problem with SF was simply not feeling physically safe while
walking down the street or sitting in a train. SF has a pretty
low violent crime rate, at least by US standards, so I wasn’t worried
about getting shot or robbed or anything like that.&lt;/p&gt;

&lt;p&gt;Instead, the problem was that SF had (and still has today) thousands
of people wandering around who are suffering from untreated substance abuse
and/or severe mental illness. The vast majority of these folks were completely
harmless, but a small percentage were hostile, threatening, and, in some
cases, violent.&lt;/p&gt;

&lt;p&gt;In multiple cases over the last few years, I was followed, screamed at, and
threatened in broad daylight. Thankfully, nothing physically happened
to me, but it’s a really jarring experience to have this happen. Each time
I reported these incidents to the police, they never responded. I got the
sense that as long as no one was physically harmed, they didn’t really care.
Thus, there were no mechanisms in place to control, contain, or treat this
behavior.&lt;/p&gt;

&lt;p&gt;Over time, I realized that you needed to be hyper-vigilant every time you
left the house, constantly evaluating everyone around you and being ready
to cross the street or reverse direction at a moment’s notice. Is that
person screaming and throwing trash at the side of a bus shelter a
danger, or are they going to keep to themselves if I walk quickly past?
What about that group that’s selling drugs in the middle of the sidewalk?&lt;/p&gt;

&lt;p&gt;It just got really stressful and draining dealing with this after a while.
And, the threats seemed to get worse over time.&lt;/p&gt;

&lt;h3 id=&quot;drugs&quot;&gt;Drugs&lt;/h3&gt;

&lt;p&gt;Closely related to the issue of physical safety is drugs. When I moved to my
neighborhood
(&lt;a href=&quot;https://www.google.com/maps/place/1160+Mission+St,+San+Francisco,+CA+94103&quot;&gt;Mission Street between 7th and 8th&lt;/a&gt;),
it wasn’t fancy but at least it was pretty clean and quiet. Then, a few years
ago, gangs of drug dealers moved in. They first took over the 7th street
corner and then, during the pandemic, also expanded to the 8th street side of
my block. By the end, I couldn’t leave the house without walking through them.&lt;/p&gt;

&lt;p&gt;Most of the time, this activity was peaceful. However, it brought more people
to the neighborhood who were loud, destructive, and potentially threatening.
The sidewalks became blocked in places, and I often had to walk in the street
to get around the dealers, their customers, and the piles of trash they left
behind.&lt;/p&gt;

&lt;p&gt;Occasionally, violence did flare up. There were two drug-related murders
on my block in the last year, and I heard the gunshots from one of them. And, people
were constantly overdosing, and often dying. In my final month, I walked by a
dead body on the sidewalk being attended to by the coroner’s office.&lt;/p&gt;

&lt;h3 id=&quot;traffic-safety&quot;&gt;Traffic safety&lt;/h3&gt;

&lt;p&gt;The final safety dimension that affected me was my interactions with
drivers in the street. As I mentioned above, I had a car but tried to
walk or take transit whenever possible. I also biked a lot, particularly
if I was going to places that were more than a few miles away.&lt;/p&gt;

&lt;p&gt;San Francisco was never the most pedestrian or bike friendly city in the
world, but at least you could be reasonably confident that you wouldn’t
be hit by a car when going outside. Then, a few years ago, the city made
a conscious decision to stop nearly all traffic enforcement &lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. As a
result, the behavior of some drivers got terrible, to the point of being
dangerous. I would routinely see people running red lights, going the
wrong way down one way streets, driving on roads closed to private
vehicles, and speeding at 60+ mph in 25 mph zones.&lt;/p&gt;

&lt;p&gt;In my last year, after nearly two decades of incident-free biking, a driver
making an illegal right turn hit me while I was in a bike lane on Market Street.
Thankfully, I escaped with only scrapes and bruises, but that was really upsetting
for me, and I felt that it was only a matter of time before something
worse happened.&lt;/p&gt;

&lt;h2 id=&quot;why-its-frustrating&quot;&gt;Why it’s frustrating&lt;/h2&gt;

&lt;p&gt;Many cities in the US and in other places around the world have problems
with crime, drugs, safety, homelessness, and other issues. But, there are a
couple of aspects of San Francisco’s situation that make its problems
especially frustrating.&lt;/p&gt;

&lt;p&gt;First, the city is incredibly wealthy and has a massive budget at its
disposal: nearly $14 billion &lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; for a population of only slightly more
than 800,000 people, which, on a per-capita basis, is one of the highest
in the country &lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;. Despite all this money and despite a budget that has grown
much faster than the city’s population over the last few years, SF can’t
keep its central areas safe and clean, and doesn’t seem to be providing
much help to the thousands of people suffering on its streets.&lt;/p&gt;

&lt;p&gt;Second, and even worse, the people in charge, including the mayor,
the police leadership, and a majority of the Board of Supervisors (the
legislative body for SF), really just don’t seem to care one bit about
what’s happening. Sure, they will occasionally rant in public and promise
to fix things &lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, but after a quick surge of activity, the streets just
go back to their previous state or get worse. I got the sense that they just
enjoyed the prestige and money of their jobs (SF officials are among the highest paid in
the country &lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;), and were coasting until they could find something longer-term.&lt;/p&gt;

&lt;h2 id=&quot;why-it-matters&quot;&gt;Why it matters&lt;/h2&gt;

&lt;p&gt;In the pre-pandemic days, most companies required employees to be in the
office at least 4 days a week. Many Bay Area companies, including most
of the big tech employers (Google, Facebook, etc.), had large offices
in downtown San Francisco, and as a result, many of their employees lived
in the city to avoid long commutes.&lt;/p&gt;

&lt;p&gt;With the pandemic, much of this disappeared. Companies that previously
had office-centric cultures started allowing fully remote work or at
least allowed people to come in less frequently than before.&lt;/p&gt;

&lt;p&gt;The result is that many tech employees, myself included, are no
longer tied to locations within easy commuting distance of an SF
office. They can live further away (e.g., in the suburbs) or
just totally leave the area, as I did. There is less foot traffic in
the central areas of the city, which has hurt retail businesses, who
are now shutting down or moving away as well.&lt;/p&gt;

&lt;p&gt;People now have the power to vote with their feet, and many have
taken advantage of and will continue to take advantage of this
power as conditions deteriorate. The city can no longer rest on its laurels
and spend from an infinite pile of money as it did in the good old days;
it now has to control spending and actively fight to retain businesses
and residents. Unfortunately, it hasn’t figured out how to do that yet.&lt;/p&gt;

&lt;h2 id=&quot;moving-away&quot;&gt;Moving away&lt;/h2&gt;

&lt;p&gt;After 10 years, I decided that I had had enough of feeling unsafe, of seeing
people suffering and dying in the streets, of watching businesses in
my neighborhood reduce their hours or shut down completely, and of seeing
no significant response from the city leadership. I packed up my stuff,
cancelled my lease, and moved back to the East Coast.&lt;/p&gt;

&lt;p&gt;I have a lot of great memories of my time in SF, and I feel really sad about
what’s happened to the city. Will it ever recover? Maybe with completely new
leadership and drastic policy changes it can. But, until that happens, I’m not
holding my breath.&lt;/p&gt;

&lt;h2 id=&quot;faq&quot;&gt;FAQ&lt;/h2&gt;

&lt;h4 id=&quot;why-didnt-you-just-move-to-a-different-part-of-sf&quot;&gt;Why didn’t you just move to a different part of SF?&lt;/h4&gt;

&lt;p&gt;I wanted to live in a dense, central area and not in a single family house
that’s 4 miles from downtown. Also, my neighborhood was perfectly nice when
I moved in, and then got bad. What’s to prevent that from happening to other
places in the city as well?&lt;/p&gt;

&lt;h4 id=&quot;what-about-homelessness-you-didnt-discuss-it-at-all-above-despite-the-attention-that-issue-has-gotten&quot;&gt;What about homelessness? You didn’t discuss it at all above despite the attention that issue has gotten.&lt;/h4&gt;

&lt;p&gt;Yes, homelessness is a very serious problem in San Francisco. However, it wasn’t
super prevalent in my immediate neighborhood. I also felt that the safety and drug
issues were more critical, and that homelessness was more of a symptom
than a cause of the city’s problems.&lt;/p&gt;

&lt;h4 id=&quot;what-about-property-crime&quot;&gt;What about property crime?&lt;/h4&gt;

&lt;p&gt;Property crime in San Francisco is terrible, but it didn’t affect me personally
because I lived in a building with multiple security people on-site 24/7. I also
kept my car in a secure garage and not on the street. If you’re moving to SF or
visiting, I would suggest you do the same.&lt;/p&gt;

&lt;p&gt;The sad thing about the city is that many people, particularly those with lower incomes,
don’t have this luxury, and it’s a big financial burden when their possessions
are stolen from their cars or homes. The city leadership, however, doesn’t really
care because burglaries and car break-ins are considered “victimless” crimes.&lt;/p&gt;

&lt;h4 id=&quot;i-drove-near-insert-some-place-mentioned-above-and-it-was-perfectly-fine-are-you-exaggerating-about-how-bad-it-is&quot;&gt;I drove near (insert some place mentioned above) and it was perfectly fine. Are you exaggerating about how bad it is?&lt;/h4&gt;

&lt;p&gt;No, I’m not. If anything, I’m understating how terrible things were in my neighborhood
and others nearby.&lt;/p&gt;

&lt;p&gt;One quirky thing to note about SF is that there can be significant variation in street
conditions by both location and time. One block could be perfectly fine, and a block
one street over could be a disaster zone. Or, a corner could be quiet now but then
get taken over by drug dealing next week.&lt;/p&gt;

&lt;p&gt;The city (when it does anything) typically just shuffles people around, does
a quick cleanup, and then forgets about the area for a while. The conditions have
little to do with the buildings, and more to do with whatever the city is allowing or
not allowing on the sidewalks at any particular time and place.&lt;/p&gt;

&lt;h4 id=&quot;the-stats-show-that-sf-is-doing-well-assaults-burglaries-substitute-some-other-metric-here-are-down&quot;&gt;The stats show that SF is doing well. Assaults, burglaries (substitute some other metric here) are down!&lt;/h4&gt;

&lt;p&gt;First of all, I care more about what I see with my eyes than any
statistics. Secondly, when the police take hours to respond to even serious crimes
like commercial burglaries &lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; or shootings &lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;, the incentive to report things
goes way down.&lt;/p&gt;

&lt;h4 id=&quot;other-cities-in-the-us-are-just-as-bad-why-are-you-picking-on-sf&quot;&gt;Other cities in the US are just as bad. Why are you picking on SF?&lt;/h4&gt;

&lt;p&gt;It’s simply false that SF is “just as bad as everywhere else”. I’ve spent a good
amount of time over the last year visiting many other places in the US, including a
bunch that the media love to hate on like Seattle, Baltimore, and New York, and I
felt significantly safer in all of those places. Also, they don’t have giant,
open-air drug markets in their city centers.&lt;/p&gt;

&lt;p&gt;Yes, lots of cities in the US have problems. But SF’s are genuine
outliers in multiple dimensions.&lt;/p&gt;

&lt;h4 id=&quot;sf-was-much-worse-in-the-90s-substitute-some-other-long-past-decade-here-shouldnt-we-be-grateful-about-the-progress-made-over-the-last-few-decades&quot;&gt;SF was much worse in the 90’s (substitute some other, long-past decade here). Shouldn’t we be grateful about the progress made over the last few decades?&lt;/h4&gt;

&lt;p&gt;I don’t think that’s relevant. The SF of today is a very different place from the
one of 30+ years ago, with many more resources, technologies, and policy learnings
at its disposal. By this same logic, we shouldn’t care about people dying from
infectious diseases today because many more died from them in some past epoch
(e.g., the early 20th century), which most would argue is a ridiculous assertion.&lt;/p&gt;

&lt;h4 id=&quot;wow-this-is-so-depressing-is-san-francisco-going-to-become-the-next-detroit&quot;&gt;Wow, this is so depressing. Is San Francisco going to become the next Detroit?&lt;/h4&gt;

&lt;p&gt;It’s hard to predict the future, but I think the chance of this happening is
low. SF’s geography and weather are pretty unique, and the broader metropolitan area
is extremely prosperous and continues to grow. However, I have no idea how low it will
go before it stabilizes and then, maybe, gets better again.&lt;/p&gt;

&lt;h4 id=&quot;why-are-you-complaining-without-providing-solutions&quot;&gt;Why are you complaining without providing solutions?&lt;/h4&gt;

&lt;p&gt;With all due respect, that’s not my job. I’m not an expert on criminology or
substance abuse or urban planning or whatever, nor am I in a position where I
control policy. When I lived in SF, I voted and paid taxes with the expectation
that the people in charge would try to figure these things out or hire smart
people who could. And, they failed miserably.&lt;/p&gt;

&lt;h4 id=&quot;youre-a-terrible-negative-person--good-riddance&quot;&gt;You’re a terrible, negative person. Good riddance!&lt;/h4&gt;

&lt;p&gt;This kind of sentiment just drives more people and businesses away. Each departure
reduces the city’s revenue (personally, I paid tens of thousands of dollars
a year in state and local taxes) and activity, which makes the situation there
even worse.&lt;/p&gt;

&lt;h2 id=&quot;notes&quot;&gt;Notes&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://transpomaps.org/san-francisco/ca/sfpd-traffic-enforcement/analysis&quot;&gt;https://transpomaps.org/san-francisco/ca/sfpd-traffic-enforcement/analysis&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.sfchronicle.com/projects/2022/san-francisco-budget/&quot;&gt;https://www.sfchronicle.com/projects/2022/san-francisco-budget/&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://ballotpedia.org/Analysis_of_spending_in_America%27s_largest_cities&quot;&gt;https://ballotpedia.org/Analysis_of_spending_in_America%27s_largest_cities&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.thedailybeast.com/san-francisco-mayor-london-breed-orders-police-to-tenderloin-to-fight-bullshit-that-has-destroyed-our-city&quot;&gt;https://www.thedailybeast.com/san-francisco-mayor-london-breed-orders-police-to-tenderloin-to-fight-bullshit-that-has-destroyed-our-city&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://sfstandard.com/politics/city-hall/san-francisco-highest-paid-mayor-city-hall-employees/&quot;&gt;https://sfstandard.com/politics/city-hall/san-francisco-highest-paid-mayor-city-hall-employees/&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.sfchronicle.com/sf/bayarea/heatherknight/article/san-francisco-police-crime-17755470.php&quot;&gt;https://www.sfchronicle.com/sf/bayarea/heatherknight/article/san-francisco-police-crime-17755470.php&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.kron4.com/news/bay-area/thieves-break-into-cars-along-sfs-embarcadero-shoot-at-witness/&quot;&gt;https://www.kron4.com/news/bay-area/thieves-break-into-cars-along-sfs-embarcadero-shoot-at-witness/&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;</content>

    <author>
      <name>yolken</name>

      <uri>https://yolken.net</uri>

    </author>
  </entry>

  <entry>
    <title>500 Kbps From a High Altitude Balloon</title>
    <link href="https://www.scd31.com/posts/high-speed-balloon-comms"/>
    <id>https://www.scd31.com/posts/high-speed-balloon-comms</id>
    <updated>2023-06-13T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>A basic 70cm antenna</title>
    <link href="https://www.scd31.com/posts/70cm-antenna"/>
    <id>https://www.scd31.com/posts/70cm-antenna</id>
    <updated>2023-06-01T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>A tiny QRP low-pass filter</title>
    <link href="https://www.scd31.com/posts/low-pass-filter"/>
    <id>https://www.scd31.com/posts/low-pass-filter</id>
    <updated>2023-05-28T00:00:00Z</updated>


    <author>
      <name>Stephen Downward</name>

      <uri>https://scd31.com</uri>

    </author>
  </entry>

  <entry>
    <title>Avoid Layout Shifts Caused by Web Fonts With PostCSS Fontpie</title>
    <link href="https://tomeraberba.ch/avoid-layout-shifts-caused-by-web-fonts-with-postcss-fontpie"/>
    <id>https://tomeraberba.ch/avoid-layout-shifts-caused-by-web-fonts-with-postcss-fontpie</id>
    <updated>2023-05-24T00:00:00Z</updated>


    <content type="html">When your CSS references a web font before it finishes downloading, the browser renders text using a fallback system font instead, causing a layout shift if the text container&#39;s height changes once…</content>

    <author>
      <name>Tomer Aberbach</name>

      <uri>https://tomeraberba.ch</uri>

    </author>
  </entry>

</feed>
