It’s no secret that ML/AI models are growing more powerful and sophisticated by the day. With high-profile projects from AlphaGo to Alexa, and business successes from Facebook to Netflix to Waymo, most companies now believe that ML/AI is crucial to their success. In fact, more than 50% of enterprises in a recent survey said they spend more than $20 million a year on ML projects. Yet success remains elusive, and only a handful of big tech companies with deep pockets and equally deep expertise have been able to take advantage of the recent advances in ML/AI…
Intro to music theory…in Python!
“Music is the universal language of mankind” — Henry Wadsworth Longfellow
“Mathematics is the language in which God has written the universe” — Galileo Galilei
The connection between music and math is a perennially popular topic in textbooks and articles. However, I always find being hands-on really helps me understand ideas at a deeper level. So, I decided to make a Jupyter notebook to play some music. You can read the rest of this post without the notebook, but it’s more fun if you download it and follow along.
We can’t all make the next…
Unified End-to-End Data Workflows on JupyterHub
At Tubi, we’ve standardized on Jupyter notebooks as the unified platform for doing analytics and data science work. Whether we’re exploring user behavior, doing time series forecasting for company KPIs, or building deep learning object detection models, we’re doing it on a customized JupyterHub installation we call Tubi Data Runtime (TDR).
We mentioned TDR in a previous post, and in this post we go into further detail. We will describe optimized data connectors, custom Jupyter cell magic and extensions, and domain-specific visualization tools. …
I’m now entering week three of fully remote work due to the pandemic. By osmosis, my son has reawakened my childhood love for tanks. This time though, I decided to do some more historical research and came across some really interesting stories and lessons from WWII.
Suppose I were to present two different strategies that were employed in the design and production of tanks in WWII:
In one strategy, the war planners made sure to meticulously collect battle statistics and use the most up-to-date data to continually improve the design of their tanks and give their troops an edge. They…
Burgers, TubiTV, and decision theory
It was a typical chilly San Francisco summer day when I met up with Marios for lunch at the now-closed Dirty Water restaurant inside the Twitter building. I don’t remember what we had, but knowing Marios, it was probably burgers.
It was one of the most fortuitous burgers I’ve had. At the time, my career was at a crossroads. Certainly, there were some highlights I could point to — I was the second major contributor to pandas; I co-founded (and sold) my own company; the GDP of small countries traded through the algorithms I wrote…
Part I: Introduction to machine learning, data science, and data engineering at Tubi TV
Tubi is the market leader in free TV. It is our singular mission to make high-quality entertainment available to everyone. So what’s the secret sauce that’s enabled an upstart like Tubi to be successful without the infinitely deep pockets of industry giants like Disney, Apple, or Netflix? We believe it is our model-driven approach to making decisions throughout the company, focused around the three data disciplines of machine learning, data science, and data engineering.
This blog post is the first in a series of related…
Privilege, race, and admissions at elite universities
This week a judge ruled in favor of Harvard in the now-famous SFFA v. Harvard case, which I wrote about in a previous post. The judge found there was “no persuasive documentary evidence of any racial animus or conscious prejudice against Asian Americans.” Whether you agree with the ruling is beside the point; what’s interesting is the admissions data made public as a result of the legal proceedings.
In a very recent paper, Arcidiacono, Kinsler, and Ransom analyzed Harvard’s admissions data, focusing on athletes, legacies, dean’s interest lists, and children of faculty/staff (ALDC)…
Ever since I was little, I’ve been geeking out over NASA and space missions. About a year after I moved to the US, before I could even speak English fluently, I went door-to-door in my little town of Muncie, Indiana to sell raffle tickets and chocolate to raise $200 to fund my school trip to Space Camp in Huntsville, AL. In high school, I was riveted by the NASA documentaries and movies that Mr. Engel showed in science class. The moon landing was always one of the most inspirational stories I recall.
or, Let the Punching Begin
Feature engineering is a critical component in any machine learning system. As the basic input into a predictive model, the quality of features can make or break the overall performance of the model.
Feature engineering also takes a tremendous amount of work. If a new machine learning application requires one new feature, it probably means that the ML engineers discarded ten other features and tried ten variations of each candidate feature. Features have to be computed, versioned, backfilled, and shared. …
Hey tech bros, remember the good old days when Klout plastered the Stanford campus with recruiting posters calling on you to “bro down and crush some code”?
Please meet your new overlord, the Machine Learning (ML) bro. The ML bro represents a new evolutionary stage of the tech bro. This new subspecies of Silicon Valley Übermenschen is marked by several distinguishing characteristics: