Thursday, June 7, 2012

True Strength: The Goals for True 2.0

As the 2011-12 season developed, I sadly sorta fell off of doing the posts I used to call "Notes from the Vault," in which I evaluated how my True Strength project was working out in practice. The upside is that it leaves me with a lot of new things to say about the tweaks in the number for this summer! So here's a quick preview of what's coming up later this summer. While it looks like a lot, most of these changes aren't really game-changers on their own, but in total I hope they'll produce a much better number for 2012-13!

Ratings/Viewing Correlation. I think what was essentially a modified version of "share" was a decent starting point. The metric in general seemed to work pretty well across heavy viewing-specific changes (daylight saving time, shows like The Finder and Extreme Makeover: Home Edition moving to Friday).

Early Fall Hype. The idea here was that early season ratings are inflated beyond anything that can be explained by overall viewing levels. There's certainly something to that, and the Early Fall Hype adjustment was definitely a part of why the True Strength numbers hold up better across the season than the raw numbers do. But this is worth re-evaluating, and perhaps something that should be thrown to the end of the process rather than added at the beginning. Worth noting that the 2011-12 season had a great fall and a terrible (by comparison) spring, so an expectation for Early Fall Hype might be even greater based on this season's ratings.
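
If I do end up moving Early Fall Hype to the end of the process, it might look something like this quick Python sketch. To be clear, the 10% peak premium and the six-week window here are made-up placeholder values, not anything I've actually estimated:

```python
# Hypothetical Early Fall Hype deflator: ratings in the first few weeks of
# the season get shaved by a premium that decays to zero as the season
# settles in. Both constants are illustrative placeholders.

def early_fall_adjustment(raw_rating, week_of_season, hype_weeks=6, peak_premium=0.10):
    """Deflate an early-season rating by a hype premium that decays
    linearly to zero by `hype_weeks` into the season."""
    if week_of_season >= hype_weeks:
        return raw_rating  # past the hype window: no adjustment
    premium = peak_premium * (1 - week_of_season / hype_weeks)
    return raw_rating / (1 + premium)
```

Under those placeholder values, a 2.2 premiere-week rating would deflate to a 2.0, and anything from midseason on would pass through untouched.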

Methodology Adjustment. This was probably both 1) one of the best things I did all of last summer and 2) the one thing that is very difficult to ever check again, because there are no more Old Methodology viewing levels available. That is of some concern over the long term as viewing and DVRing habits change, but seems OK for now. I can take a look at the year-to-year viewing trends by hour to try to get some sense of changing DVR use, but it will all require several leaps of guesswork, so I don't particularly want to change this unless there is a very compelling reason.

Definition of Competition. I wrestled with whether to define a show's competition as "every broadcast show in the slot" or "every OTHER broadcast show in the slot." The second one intuitively seems more like what it should be, since a show doesn't really "face itself," but the first one works better as its own statistic. In practice, there's a drawback to both. With "every show," the show gets full credit for "facing itself," while with "every OTHER show," a big show gets zero credit for the fact that it reduces the ratings of the shows it faces. "Every show" (ultimately the choice for version 1.0) wasn't awful, but I think generally it made the rich a little richer than they should've been. My goal is to come up with something in the middle: start with every OTHER show, but then adjust those shows' ratings slightly based on what they would get if the show itself were not in the slot.
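
To make that middle-ground idea concrete, here's a rough Python sketch. The 25% "redistribution" number (the share of a show's own audience that would scatter to its rivals if it weren't in the slot) is purely a placeholder for illustration:

```python
# Middle-ground competition definition: start with every OTHER show in the
# slot, then credit those competitors with the slice of the show's own
# audience they'd plausibly pick up if the show weren't there. The 25%
# redistribution assumption is hypothetical.

def competition_level(show_rating, other_ratings, redistribution=0.25):
    """Competition = sum of the other shows' ratings, adjusted upward for
    what they would gain with `show_rating`'s audience freed up."""
    total_others = sum(other_ratings)
    return total_others + show_rating * redistribution
```

So a big 4.0 show facing a 1.0 and a 2.0 would see competition of 4.0 rather than just the 3.0 an "every OTHER show" definition would give it, but well short of the 7.0 that "every show" (counting itself) would.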

Ratings/Viewing/Competition Correlation. I basically came up with two separate competition baselines: 30% of the PUT-level for weeknights and 23% for weekends. I think that general idea was pretty good, but I definitely need to go from those two hard numbers to a consistent formula that can incorporate every night. Why? 1) The 30% expectation is too high on viewing-deflated weekdays (namely holidays) and 2) This will also help close the fall/spring gap even more, as there will be less expected competition in the lower-viewed spring.
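
As a sketch of what that formula might look like, here's one way to let the expected competition share scale with the night's viewing instead of sitting at a hard 30%/23%. The reference PUT level and the square-root dampening are placeholder choices, not fitted values:

```python
# Replace the hard-coded competition baselines with a formula that scales
# with the night's viewing level, so a viewing-deflated holiday weekday
# carries a lower expectation than a normal one. The 35.0 reference PUT
# and the 0.5 exponent are illustrative assumptions.

def expected_competition(put_level, normal_put=35.0, normal_share=0.30):
    """Expected broadcast competition for a night with the given PUT level.
    The share itself shrinks on low-viewed nights (dampened scaling)."""
    share = normal_share * (put_level / normal_put) ** 0.5
    return put_level * share
```

At the reference PUT of 35.0 this reproduces the old 30% baseline exactly, but a holiday night with a PUT of 20.0 would be expected to face noticeably less than 30% of its viewing as competition.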

10:00 Shows as Competition. The idea here was that since there are fewer networks with known ratings at 10:00, we need to make up for Fox/CW (which are still getting reasonable ratings with local news/etc.) plus the increased presence of cable. I actually think this adjustment generally seemed to work pretty well in competitive times of year, but it started to stick out like a sore thumb during holiday times, when there was so much relative competition added by this flat number (mostly derived from more "competitive" parts of the year) that the 10:00 shows had a huge advantage. So this is another one where I may need to convert the constant into a formula that's based on viewing levels.
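
Converting that constant into a viewing-based formula could be as simple as the sketch below. The 2.5-point "unseen competition" constant and the 35.0 reference PUT are hypothetical numbers for illustration, not the values actually used in True 1.0:

```python
# Scale the assumed non-broadcast 10:00 competition (Fox/CW affiliates'
# local news plus cable) with the night's viewing level, instead of adding
# a flat constant derived from the competitive parts of the year. Both
# constants here are placeholders.

def unseen_ten_pm_competition(put_level, normal_put=35.0, normal_unseen=2.5):
    """Estimated 10:00 competition from sources without known ratings,
    scaled proportionally to overall viewing."""
    return normal_unseen * (put_level / normal_put)
```

On a holiday week where PUT is half its usual level, the unseen-competition credit would be halved too, rather than handing 10:00 shows the same big flat bonus as in November.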

Sports as Competition. I noticed that, after making the viewing adjustments and entering the initial competition formula, it seemed like Sunday shows had a big advantage in the fall compared to the spring. I assumed that this was because they got such a massive competition adjustment from going up against Sunday Night Football, even though that sports property tends not to cross audiences that much with other primetime stuff. In practice, the fall numbers always made it look like Sunday had a huge disadvantage compared to the other days of the week. But by the spring, with no football in play, the Sunday scores of the fall ended up looking pretty reasonable compared to the spring. Still, I may drop this. I think the formula for viewing/competition (see Ratings/Viewing/Competition Correlation) will create higher expectations on the super-high-viewed fall Sundays, so the Sunday Night Football presence won't stick out so much. I do worry that this new formula will create an even bigger strain on Sunday ratings in general, though (as it will punish the night even further for its high viewing). We'll see.

Normal Lead-in. The other two facets of this whole process last year (viewing and competition) essentially went like this: 1) figure out what "normal" viewing/competition is; 2) figure out the expected change in a rating from a change in viewing/competition; and 3) apply the expected change based on the difference between the "normal" value and the actual one. With lead-ins, though, I got cute. I decided to make the lead-in portion not about a "normal" lead-in but about "the gap" between a show and its lead-in. In other words, a 4.4 show airing out of a 4.0 lead-in got the same lead-in adjustment as a 1.4 show airing out of a 1.0 lead-in. I'd get more into what I was thinking at the time, but the bottom line is this ended up being pretty problematic on several different levels. For True 2.0, I'm gonna simplify and go back to figuring out a "normal" lead-in.
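
That three-step pattern (find "normal," find the sensitivity, apply the difference) can be sketched generically in Python. The example numbers below are placeholders, except for the 0.1-per-0.6 hour-long sensitivity, which is the figure from last year's lead-in work:

```python
# Generic version of the three-step adjustment: shift the rating by the
# expected change implied by the gap between a factor's "normal" value and
# its actual value. The example lead-in values are hypothetical.

def adjust_for_factor(raw_rating, actual_value, normal_value, sensitivity):
    """`sensitivity` = expected ratings change per unit change in the
    factor. A factor above its normal level deflates the rating."""
    return raw_rating - sensitivity * (actual_value - normal_value)

# e.g., a 2.5 show with a 3.0 lead-in when "normal" is 2.0, at the
# hour-long rate of 0.1 ratings points per 0.6 of lead-in:
adjusted = adjust_for_factor(2.5, actual_value=3.0, normal_value=2.0,
                             sensitivity=0.1 / 0.6)
```

Applying lead-ins the same way as viewing and competition would retire "the gap" approach entirely; the 4.4-out-of-a-4.0 show and the 1.4-out-of-a-1.0 show would now be judged against the same "normal" lead-in rather than getting identical adjustments.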

The Local Programming Lead-in. I came up with an estimate for what the 7:30 local programming provides based on a very limited set of shows that aired both at 8:00 and elsewhere. I noticed throughout this season that among the shows that aired both at 8:00 and somewhere else, almost all of them did Truly better at 8:00. (Whitney, Up All Night and Raising Hope were the best regularly scheduled examples.) Now it's certainly possible that all these shows were just very resilient and can pull about the same ratings in any slot, and I guessed as much with Raising Hope. But overall, it seems more likely that I simply didn't give the local programming lead-in enough credit, especially when considering that the 8:00 shows already have a significant advantage in this metric from the low overall viewing. Also, this was (like the 10:00 Competition adjustment) a significant problem during holiday times, as it really disadvantaged weak 8:00 shows, though I think changing from "the gap" (see Normal Lead-in) may help.
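
One way to re-estimate that credit is to average how much Truly better the double-scheduled shows did at 8:00 and fold that gap back into the local-programming estimate. A quick sketch, with placeholder scores rather than actual True Strength numbers:

```python
# Estimate how much the 7:30 local-programming lead-in is being
# undercredited: for shows that aired both at 8:00 and in a later slot,
# average their True advantage at 8:00. The sample pairs below are
# placeholders, loosely in the spirit of Whitney / Up All Night /
# Raising Hope, not real scores.

def implied_local_leadin_credit(paired_scores):
    """paired_scores: list of (true_score_at_8, true_score_elsewhere).
    Returns the average 8:00 advantage, i.e. roughly the extra credit the
    local lead-in estimate should be absorbing."""
    gaps = [at_8 - elsewhere for at_8, elsewhere in paired_scores]
    return sum(gaps) / len(gaps)

sample = [(2.1, 1.9), (1.8, 1.7), (2.0, 1.8)]
extra_credit = implied_local_leadin_credit(sample)
```

If that average gap comes out meaningfully positive across the sample, bumping up the assumed local-programming lead-in by about that amount would pull the 8:00 scores back in line.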

Changes in Lead-ins. Here, I concluded that an hour-long show is expected to increase by 0.1 for each 0.6 increase in lead-in, while a half-hour show increases by 0.1 for each 0.5 increase. For the most part, those seemed pretty close to reality, but I'll continue to re-evaluate since (in theory) technologies should be eroding away at the value of a lead-in. Since I now have the data available, I may be able to shift to looking at the last half hour of the lead-in rather than the full program.
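
Those rules of thumb translate directly into code; this is just last year's stated numbers, not a new estimate:

```python
# The lead-in rules of thumb as stated: an hour-long show gains 0.1 for
# each 0.6 increase in lead-in, a half-hour show gains 0.1 for each 0.5.

def expected_leadin_bump(leadin_change, half_hour=False):
    """Expected ratings change from a given change in lead-in rating."""
    per_point = 0.1 / (0.5 if half_hour else 0.6)
    return leadin_change * per_point
```

So a 1.2-point lead-in boost is worth about 0.2 to an hour-long show, while a half-hour show would get that same 0.2 from just a 1.0-point boost.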

Special airing adjustment? As I have mentioned several times since I started this blog, shows in special out-of-timeslot airings are almost always disappointments, and I feel it's time to stop expecting these shows to remain as "strong" in those circumstances as in their regular slots, only to be crushed every single time when it doesn't happen. So I may come up with some adjustment that tries to normalize a show's rating in those situations.

Implementation. Last summer, the goal was to create a number. This summer, the goal is to create a number in which I actually have some confidence and can use more practically. Something like "The True Top 25" is a fun thing to be able to do, but it's not really analysis of anything. The next logical step is to be able to get better at predicting future ratings. I would like to try to come up with a more formal way of predicting. Perhaps the endpoint of that effort would be to create an actual model for each show for the whole season, but that might be too much to ask for this one summer.

