NYC Data #376: Cutting a BigQuery Bill, Plaid, Department of Aging, Databricks and Sigma Computing Trivia Night, Imprint, Personalization vs Experimentation, How Expensive is Raising Kids in NYC?
Plus, Ctrl-F for NYC
Hi friends, it’s been fun to imagine the Sunnyside Yard project actually happening; I used to live nearby in Woodside. It would be massive (literally and figuratively), but I’m not holding my breath: people have been talking about it for almost a decade.
As always, help me keep this space up-to-date: please send me posts, events, and job openings. If you know someone who might enjoy or benefit from this newsletter, please share it with them. [image credit: Chang W. Lee/The New York Times]
Good Local Posts
Very cool project from Cornell Tech’s Sean Hardesty Lewis: he ran a Vision Language Model across 20 million street-view images, covering every NYC block. He had the model describe what it saw in each image, then made those descriptions searchable, plotting the matches for any search term. Really fun to play around with!
Wild post from the Manhattan Institute on how rich you need to be to afford children in NYC. It does not match my personal experience (most of us don’t send kids to ‘exclusive independent schools’), but the dataset is pretty cool.
Michael Petro of Reddit wrote about The Algorithm That Saved Reddit 21% on BigQuery Slots. They run dynamic baseline slot allocation hourly! This is timely for me (don’t ask about my BigQuery bill) so I’m hoping they update the post with some of the next steps they are considering!
Upcoming In-Person Events (new listings in bold)
3/18: Databricks and Sigma Computing Trivia Night
3/19: ClickHouse New York Meetup @ The Loft
3/22 - 3/29: Open Data Week
3/24: Beyond the Prompt: Enterprise RAG
3/28: MeasureCamp
Open Roles
Headway is looking for a Director of Engineering - Data Platform.
The New York City Department for the Aging is hiring a Data Analyst.
Garner Health is seeking a Senior Data Product Manager.
Plaid is looking for a Data Scientist - Network Value.
Imprint is hiring a Data Scientist.
Miscellany
Interesting post from Spotify on using separate tech stacks for personalization and experimentation. Largely they do it so they can test personalization algorithms, but there are some interesting insights into short-term vs long-term trade-offs and why they don’t use bandits.
Ben Evans: The seasonality of US versus EU in GitHub is hilarious 😢
Thanks so much for being a subscriber. To see previous job listings (many of which are still open!) and blogs, check out the archive, which has emails from the tinyletter days. Feel free to forward this to anyone - they can subscribe here:



