NYC Data #356: Scenario Planning for Experiments, Fund for Public Health, Faire, Baseball Swing Analytics, Ampersand, Known, Modal's Audio Transcription Service
Plus, is AI CapEx propping up the US Economy?
Hi friends, I hope your summer is going well. I got to take advantage of Summer Streets in Manhattan last weekend: always one of my favorite events. There are still a few more weeks to participate!
As always, help me keep this space up-to-date: please send me posts, events, and job openings. If you know someone who might enjoy or benefit from this newsletter, please share it with them. [image credit: The City of New York]
Good Local Posts
The big tech news this week is the release of GPT-5. Ethan Mollick has been covering it and has lots of interesting posts, but his main points seem to be 1) this is a big improvement but on the same curve as the other top labs, and 2) GPT-5 is actually multiple models with opaque selection, which will lead to varied results. Also important to me: somebody else has called out OpenAI’s awful chart crimes (the company has been apologetic about it, at least).
Spotify Research just published a paper on an offline experimentation scenario analysis approach they call ForTune (full paper here). I think the core idea of using scenario planning to estimate the tradeoff between different metrics is a valuable one: I can already think of a few experiments I’ve run where I would have benefited from really thinking through the impact on secondary & tertiary numbers.
Modal Labs has a technical post up about their transcription service, which they claim can transcribe “one week of audio… in just one minute… for just $1”. Pretty cool, with references to time-constrained packing problems and Hadoop jokes :-).
Upcoming In-Person Events (new listings in bold)
8/11: Arthur x AWS: Scaling AI w/ Confidence Workshop
8/13: What’s New in Tidymodel?
8/20: NYC Data Exploration with Plotly Python
8/25 - 8/27: Data Science & AI Conference
8/28: Python and Data: Project Night with PyData
9/8: Building Scalable Systems with ClickHouse & Docker
9/9: Got Data, Now What? Storytelling Through Accessible Design
9/18: Data Management Summit
9/19: Cornell University Artificial Intelligence Investing Conference
9/29 - 10/3: MLCon
10/22: Introduction to Analysis of Public Survey Data
Open Roles
Squarespace is hiring a Data Governance Lead and Data Platform Engineers.
Faire is looking for an Applied AI/ML Scientist.
Known is seeking a Data Scientist, Media Consultant.
Ampersand is looking for a Senior Data Engineer.
The Fund for Public Health in New York City is hiring a Data Equity Analyst.
The Center for Health Equity and Community Wellness is looking for a Data Research Scientist to eliminate racial and other inequities resulting in premature mortality, focusing on Brooklyn.
Miscellany
Economist Paul Kedrosky says current AI capital expenditures are so big that they’re affecting economic statistics, boosting the economy, and starting to be comparable in size to the railroad boom of the 1880s! Wild. I did some fact-checking of my own for comparison: it looks like the Apollo program and Manhattan Project both peaked at about 0.4% of GDP, or about one-third of AI CapEx.
Like many people, I found I had an interest in statistics and analytics from following baseball as a kid. I’m still regularly amazed at the depth of data and analysis around the sport. After my White Sox traded for Curtis Mead, I found myself reading this 3,400-word opus on his swing, including metrics like bat speed, exit velocity, squared-up %, attack angle (and direction), swing path tilt, pull air %, average intercept point and Z-swing %. I am in awe of how this all gets collected and analyzed.
From Reddit: “I asked ChatGPT to explain my job (Senior Data Analytics Consultant) to a 5-year-old and now I'm questioning my entire career.”
Thanks so much for being a subscriber. To see previous job listings (many of which are still open!) and blogs, check out the archive (which has emails from the tinyletter days!). Feel free to forward this to anyone: they can subscribe here: