The Data is in the Details When it Comes to Testing Ad Creative. Live from Mobile Growth Summit in San Francisco, our host Peggy Anne Salz checks in with Phil Shpilberg, President and Founder of GameChangerSF. Phil was at the conference to present a session where he translated theory into action by making and launching a large batch of ads in 15 minutes. He stopped by the podcast booth to discuss how he uses data and multivariate techniques to build winning creative to power user acquisition strategies for his clients.
Phil is finding that minor details—like the color of CTA buttons—are actually driving a lot of the variation that he sees in campaigns (and how his team built out a reporting system that could handle looking at so much detail). Phil also weighs in on what not to do when testing creative and shares why he looks for the outliers when he’s diving into data.
I know all about the company but, Phil, maybe you just want to kick it off for our audience, tell us a little bit about what GameChangerSF does.
PS Sure, thank you, thanks for having me back. So, GameChangerSF is a company that started in 2012 and we’re really a data science company focused on mobile apps. We do user acquisition and, more and more, creative development right now, which is what we’re here talking about – how we’re using data and multivariate techniques to get winning creative for our clients. We’re just really excited to be here talking creative.
PAS Absolutely, and it’s an exciting topic because, as I said, a lot of people are saying this is maybe the year – it actually started before, but you’re feeling the groundswell now of saying, if I want to engage an audience, what they see is important. If I can fine-tune that, if I can figure out how to make it more exciting, more interesting for my audience, then I’m going to keep them interested, I’m going to acquire the users I want because they find it very interesting – that’s pretty much it. That’s me as a layman talking about it...
PS No, no, I think you got it.
PAS But what’s sort of like the pitch you’re giving when you’re talking to marketers? I mean, is that their mindset or do they see other benefits in this?
PS I mean, there are some really great discussions at the conferences about the right way to do creative right now and I think there are a lot of ways to get to the same place. But ultimately, with programmatic buying becoming bigger and targeting being done based on lookalike audiences, creative is becoming a bigger lever than it ever used to be in determining winners and losers. So it’s really important for us to have a perspective on how to get the best creative, and we’ve come up with our solution. Obviously, we knew many years ago that you needed a lot of creative concepts to test with audiences to see what resonated the most. But what we’ve found now is that as we vary the components, the variables inside a particular creative, we’re seeing so much variation in the way that creative performs. It’s almost been surprising as we built the tools to do on-the-fly multivariate testing for every creative we put out there: so much of the variation in performance actually comes from the variables within a creative idea.
So there’s even more difference within a particular creative idea – across the variables that we’re changing – than between creative ideas. And I’m talking about things like, in the gameplay if you’re a game, maybe the character, the spokesperson, colors of buttons, calls to action. These things that we thought were minor executional details are actually driving a lot of the variation.
So our pitch is we’ve created a system for ourselves. It actually – I think I’ve told you this – started as a production system because we needed to make a lot of localisations and a lot of custom sizes for different networks, so we needed to come up with a way not to create a thousand videos manually. Then we realized that same system could be used for multivariate testing. So we needed to make those videos, then figure out how to upload them – that was the next step, because you’re not uploading a thousand videos manually – and then the reporting: we realized the current reporting systems couldn’t handle variables, they weren’t built for it, so we built a reporting system.
And so, we called that multivid and that’s our answer, that’s our particular take on how you get the best performing creative but one of the big topics almost every session here is about how do you get the best creative? So, I think it’s just a really exciting time to be working on this.
PAS There’s so much noise in the industry too, as you said, every session’s about it. I mean, there’s no one way but I’d love to hear your thoughts on maybe not the dos because that will depend on your app and your category and your audience, but maybe some definite dont’s.
PS Yes, I can give you – that’s actually easier, yes. So much of what we’re seeing right now tells me that when you have a creative concept for an ad and you have just one execution of it, the way that execution performs is almost noise compared to what’s possible, because we’re seeing a hundred versions of an ad. So I would say don’t do that, don’t have one version – find a way to do what I’m talking about, to have at least two or three variables that you’re testing, and find the best particular execution of your creative. A lot of what we’d been calling good performance or bad performance turned out to be noise once you zoomed out and saw what was actually happening.
So, to me, it’s like figuring out a way, a process that fits with your – either seek out or somebody who does this or find a way to do it internally – but don’t just put one thing out there and think that you have data, right? That’s the big don’t.
PAS An interesting point is you do have some data, we won’t go into all of it because I’m a bit of a data nerd and I was like going to do this offline at some point...
PS We will, we still will.
PAS Absolutely, absolutely we will but I was going to ask a little bit of the top line results because it’s exciting, you know, when marketers are thinking should I buy into this, should I do this – what’s the difference, what’s the real uplift, what can I expect? You’ve got some great data that actually says more than you would imagine actually.
PS Yes, I mean, this is the thing that shocked me: take like 3, 4, 5 creative concepts for an app, right, and do a bunch of these tests. What you’ll see is that the results are normally distributed, so statistically speaking, a lot of these creatives perform roughly the same way – but the execution is what has all the variance. If you’re trying to predict when a train will arrive, you want a predictable distribution. When you want the best performing creative, you want the outliers – finding outliers is basically the business we’re in. So what I’m seeing is the way to find the outliers is to take a lot of shots on goal.
And so within a concept, we’re saying a 30, 40, 50% improvement in like sort of what we’ve seen before because like whether a concept performed great or not so great really does now look like noise so we want to see a bunch of executions of the same thing to find the outlier.
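Phil’s “shots on goal” logic can be illustrated with a quick simulation. The numbers here are purely illustrative assumptions, not real campaign data: if execution performance within a concept is roughly normally distributed, a single execution tells you only the expected value, while testing many executions lets you capture the upper tail.

```python
import random
import statistics

random.seed(42)

def best_of(n, trials=2000):
    """Average best performance (e.g. a CTR-like score) when you test
    n executions of the same concept, drawn from an illustrative
    normal distribution with mean 1.0 and std dev 0.3."""
    return statistics.mean(
        max(random.gauss(1.0, 0.3) for _ in range(n)) for _ in range(trials)
    )

one = best_of(1)    # one execution: hovers around the mean
many = best_of(27)  # 27 executions: reliably lands well above the mean
print(round(one, 2), round(many, 2))
```

The gap between `one` and `many` is the outlier hunting Phil describes: the single-execution result is just a draw from the distribution, while the best-of-27 result systematically finds the tail.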
PAS To find the real one, so it’s not like this really is interesting, no, it might not be that at all so you have to do like what, several tests to find this really is outstanding, this change really did make a difference – is that how it goes?
PS Absolutely, and some of the smallest changes we made – changing the hair color of an animated character – doubled the performance of the ad. These are not things I would have predicted, you know. When we looked at creatives that performed well in the past and said, okay, this performed well, we should make some changes to it and see what that does – some of those changes were too small for us to even consider in the past. We just wouldn’t have thought of some of these really minor changes. People just seem to have a visceral reaction to certain things, right? They’re making this decision in a split second when they see the ad. So the things that catch their eye are not always predictable.
Like, we stand a better chance of predicting the category of things that might influence the decision, not the thing itself, right.
PAS How granular should a marketer go? I mean, you talked about the color of hair, okay, but you could even go like emoji on this, you could say, well the style of the hair and all sorts of things. How do you sort of like find that golden middle where you’re like improving and optimizing but you’re not getting so far down in the weeds that you’re like making minor changes that don’t move the needle very much...
PS And I think the answer depends on what your technology and process is. Like, we can get in the weeds because we set up a system to get in the weeds, and sometimes the stuff in the weeds works and sometimes it doesn’t move the needle at all. So you start with assumptions that are reasonable – like, hey, gameplay, different characters. We did G.I. Joe; there’s a bunch of characters, so the first test was, hey, we’ve got these 5 iconic characters, which of them is going to perform? That one’s easy.
But then it was like what if you use a different helmet, like what if the whole – throw a big giant American flag behind this. Like, you know, you start with some hypotheses and when you get down the road, you get smaller, right, you see the big stuff and you’re like, okay, what else? And when you get to the what else, sometimes you find small ones. But again, this is within a concept, like, you’ll have completely different concepts that you’re also messing with.
So there are like essentially two dimensions to this, there’s the sort of lateral concepts and there’s the vertical executions, that’s the way I think of it.
PAS Well, I love this topic and we do have to go to a break right now but we are going to come back and we’re going to talk about, I think, a little bit about the data but some of those more exciting outliers as well. I mean, marketers, you will be surprised to know the changes you can make and the uplift you can get so we’re going to go to a break so don’t go away, we’ll be right back.
And we are back to Mobile Presence at the amazing Mobile Growth Summit here in San Francisco and if you can make it down, that’s awesome, there are also shows in New York and other cities, definitely a great place to talk to people in the industry which is what we’re doing right now because I’ve got Phil Shpilberg, as I said before from GameChangerSF.
You got me interested, outliers, outlier hunting because if you can nail the outlier, then you can really move the needle on your app. So, some hints on outlier hunting – how do I get prepared for this one?
PS So, I mean, I could just tell you some of the stuff, I’ll tell you the process where we found some. So, the S’more app that I did as a case study – this one was where first it’s an app that doesn’t have like – it’s not a game, it’s a reward app that shows you ads. So, we created a campaign for it, we made a virtual spokesperson so we experimented with male, female characters, different kind of features and we came up with this woman we called Sadie, right, she just kind of – she has a certain look. Then we said okay, that’s pretty good, we were getting 10, 20% improvements and then we said, let’s just mess with her appearance – I wonder what – her hair is brown, what if we made it blue, what if we made it rainbow, what if we put it up, what if we put it down and that started to really move the needle, like blue hair down – doubled performance.
PAS You would not imagine that.
PS You wouldn’t and I don’t know that you could repeat that on a different app but we just kind of like followed that road and then when we run out of road we try to find the side road and sometimes that’s where it is. And if that didn’t work, we would just start on some other path, we would maybe mess with the background.
The other thing we tested for example, the way you get rewards on that app are gift cards so we had like fifty gift cards available to us. We tested those cards, Amazon did the best, right? So, these things aren’t hard though because you identify the major variables in your ad and then you vary those and once you run out of those, you start nitpicking at the small stuff and sometimes you hit the jackpot with the small stuff.
PAS So, I should be looking at the variables and I should be doing different versions. Is there any rule of thumb to say okay, do ten, twenty or forty – fifty, a hundred?
PS You get to some ridiculous numbers pretty fast, so like we try to do like four by four is probably the biggest because you get a lot of variation, if you think about a couple of audiences, a couple of localisations because you might get different answers for different audiences. So, it depends like if you’re doing one big lookalike audience and it’s one country, one platform as S'more is for us, it’s US and Android, so we could do pretty deep stuff.
So sometimes you have to make compromises because you could, like if you do the maths, the permutations get out of control fast, you can come up with 100,000 variations...
PAS And you can come up with one heck of a budget for that.
PS That’s the problem, it would take you like three weeks and you’ll have some ads with like two impressions so you do over time start to hone in on it. I like things between 25 and 100, like those kind of results because you could run that in a reasonable amount of time, but if you have millions of dollars, then you can go crazy because you’ll get results.
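Phil’s point that “the permutations get out of control fast” is easy to verify with a few lines of Python. The variable names and counts below are illustrative assumptions, not GameChangerSF’s actual test matrix: a modest 4 x 4 setup, multiplied by audiences and localisations, already produces thousands of test cells.

```python
from math import prod

# Hypothetical creative variables and how many values are tested for each.
variables = {
    "character": 4,
    "hair_color": 4,
    "cta_color": 4,    # call-to-action button color
    "background": 4,
}
audiences = 3  # e.g., lookalike segments
locales = 5    # localisations

creatives = prod(variables.values())         # 4*4*4*4 = 256 videos
total_cells = creatives * audiences * locales  # 256 * 3 * 5 = 3,840 test cells
print(creatives, total_cells)
```

This is why Phil narrows to one audience and one platform for deep tests: each extra dimension multiplies the cell count, and thinly spread budget leaves “some ads with like two impressions.”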
PAS You can do that anyway, actually, if you have a couple of million dollars.
PS Yes, if you have a couple of million, then go for it...
PAS It’s interesting how emotional this topic is because I was actually in the ladies restroom and there was this debate breaking out in there between marketers, they were in there and they were saying, you know, I heard in that session we’re supposed to change it up in videos, he must be crazy, it costs $10,000 to do a video – why would I do a couple of them and test it? So, I’m going to throw that devil’s advocate question out because I was in there and I thought this is interesting, there’s going to be a debate going on in the bathrooms about this because they’re very emotional about it, you know, you tell a newbie marketer they’ve got to put out $10k for a video and then do it four or five times...
PS But that’s the thing – not to pitch the multivid product too much, but it actually doesn’t cost more to do videos. Remember, it’s a technology, so we script stuff on the front end. There’s a bit more work, you pay a little more than you would for one video, but then you get 250 videos. They’re thinking we’re sitting there producing each one, so I think they kind of miss the point. The point is we’ve automated the rendering – we set it up, it renders overnight, and you come in the next morning and you have your 250 videos. So I’m not telling them to spend more, in fact, they would spend...
PAS You should have been in there talking to them.
PS I should have been in the ladies, not a problem...
PAS But seriously, I think that that’s a misconception.
PS It is a misconception... well, that’s what we were after because it costs more than one video but it costs less than five, know what I mean, so it certainly doesn’t cost – like, people come with that perception and I think one of my challenges as a marketer is there’s an anchor price over there, right, like people have an idea of what a creative should cost and I’m trying to introduce a whole new way of doing it so I’m trying to get people to accept the different kind of price structure which is you’re not going to pay $100,000 for creative but you shouldn’t expect to pay what you pay for one video either.
PAS Right, it’s somewhere in the middle because you’re taking – you’re rendering it, so you’re sort of like mixing it up in your own system and, you know, I guess what happens, I'm just speaking as a layman but what you’re able to do is just make these minute changes with the one piece of collateral, marketing collateral, the one asset that you have, right?
PS Well, so the way to think about it is just the way Photoshop has layers, we create layers so let’s just say there are three pieces, you know, like we tell a story in ten seconds, right, so opening, middle, closing. So let’s just say we have three pieces and for each one of those, we have three variations – so we have three layers on each one of those, right, three by three by three. It’s going to render, you know, start one, start two, start three, start one, start two – sorry, it’s going to basically run through all the permutations, right, I didn’t do that well.
So start one, middle one, end one, then it’s going to do, let’s say, start two, middle one, end one...
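The layered rendering Phil is describing maps naturally onto a Cartesian product. A minimal sketch, where the segment names are illustrative stand-ins for the rendered layers, enumerates all 27 combinations of his 3 x 3 x 3 example:

```python
from itertools import product

# Three story segments – opening, middle, closing – with three
# variations each, as in Phil's 3 x 3 x 3 example.
starts = ["start1", "start2", "start3"]
middles = ["middle1", "middle2", "middle3"]
ends = ["end1", "end2", "end3"]

# Every permutation of one start, one middle, one end: 3 * 3 * 3 = 27 videos.
renders = [" + ".join(combo) for combo in product(starts, middles, ends)]
print(len(renders))  # 27
print(renders[0])    # start1 + middle1 + end1
```

The automated render queue simply walks this list overnight instead of a human assembling each combination by hand.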
PAS I get it, I get it.
PS So there’s not a human sitting there doing this...
PAS I get it, it’s matching it... who makes the decision or what is it in the system that says, hey, let’s throw in that blue hair now? That’s a human factor, is it?
PS That’s the human factor, that’s with our clients.
PAS So you work with your client, it’s like we’ve got all these different permutations, all these sort of like parallels of the original creative and now let’s mix it up a little bit.
PS Right, and just to be clear, those are the human factors and I don’t want to make it sound like every one of these things is a success – sometimes you pick things and you don’t get variation and you just go back and pick other things. So it does work better when it’s iterative. If somebody comes to us for one video and they don’t get variation, I always feel a little guilty – I’ll say, okay, let’s just throw in another one and we’ll get the variation.
So, you know, garbage in, garbage out – we can design bad creative if we wanted to but we tend to, like, build up learning over time. So, the human intuition, right, the things you know about your products still play, right, because people will come and they’ll say I can go with an emotional appeal or I could be very functional. Like, this is the way my product makes you feel, this is what you do with my product. Like, these are the things we know we should test, right, and a lot of times we can do that within the context of one creative.
PAS We talk a lot about the video but do you do other creative formats?
PS We do, it’s 80% of our creative these days is video so I talk a lot about video...
PAS It’s the most engaging...
PS Yes, and it’s really like the way you show things off the best but there are categories, like kids apps, branded kids apps do very well on static, display ads. The process is much easier because then you literally could just use script and photoshop, so it’s an easier problem to solve but you still need the upload and the reporting.
PAS So what about the human element within your company? I mean, I come to you with all my assets and I might say I’m dead set on a video when actually it should be interstitial – how does that work, do you like sit and consult with me or... just trying to understand how I would engage with Gamechangers.
PS We try to understand your goals before we get into like what the execution should be, we really try to understand what you’re trying to achieve. You know, we’ve had seven years of history, we try to see what we have that might inform us and then we work through it. If you come and you’re like I absolutely need this form of creative, we won’t argue with you, we’ll just try to...
PAS Customer always knows best, sometimes they screw it up but still...
PS We won’t consciously help people fail. If it’s awful, we will try not to make it – we probably won’t make it – but if it’s reasonable, we’ll tell you what we think and we’ll work with you. We don’t tend to have that many disagreements with our clients, not meaningful ones. We’ll have some friendly ones, like, hey, I think this is going to work or I think that is, but we can settle that in the context of our system. We have far fewer discussions about what will work and more about how we set it up to test, right? I was always bad at picking creative out of a line-up...
PAS I’m always terrible at that, they always tell you at the conferences was it A or was it B and I’m like that one, and I was wrong. And the uplift is astounding.
PS I’m color blind so it’s even harder for me, I really rely on data.
PAS Okay, well, we’re going to talk about some of the data, we do have to go to a break right now, Phil, I almost hate to do it but we will come back, we’ve got one last segment full of really cool stuff to discuss so don’t go away, we’ll be right back after the break.
And we are back at Mobile Presence. I’m Peggy Anne Salz, this is Mobile Growth Summit, it is a fun event and I’m talking to some great people including our guest. Phil, before the break, we were talking about GameChangerSF, we were talking about outlier hunting, figuring out the creative that’s really going to nail it but ultimately what you need to tell marketers is about the data, you know, you have to convince them this is worth it – I’m convinced it’s worth it because I’ve seen some of the data. I mean, we can’t do this, this is a podcast but imagine for a moment – we’re going to take our audience into some of that cool data, I’ll probably be writing about it in an article pretty soon, but you’ve shown me one side that was just mind blowing because you thought there wasn’t much of an uplift but if you look at it differently, you see this gradient that’s really revealing. So, tell us, you know, audio, bring us through that because that is a pretty shocking – it’s an eye opener, really.
PS I’m looking forward to like putting the data out there so people can see it but in this particular case, we presented the slide today where I had three different creative concepts and I had results, right? There’s a KPI, just imagine it’s a KPI and one of the creatives outperforms the other two and it’s pretty clear, right, there’s one data point and then in the next slide, I show that it was actually a handpicked data point out of three creatives where there were thirty other data points and the data point I showed and the hypothesis you came to was completely the wrong one because I picked them in a way that they were just sort of noise, right? And that actually represents the way I think we’ve been doing creative since the beginning.
PAS So we haven’t been looking at the right KPIs, we’ve just been limited to CTR only or what is it?
PS It’s like we... it’s almost like we didn’t have the microscope to see the full view of what was going on. We were focused on one data point when it was actually part of a much bigger picture, and now we have the technology and the means to really explore what a particular creative concept can yield. So I think we’ve been doing it wrong, you know, because of the tools we had. It’s not that the NES was bad because it was 8-bit – that was the technology we had. So this kind of advertising that I think most of us – even us, until recently – were doing is sort of 8-bit advertising in an HD world.
PAS It’s kind of interesting because I'm thinking about, like, the thinking I’ve been hearing about CPIs, for example, there was this school of thought, it’s like too expensive, not going to touch it and now you’ve got people who say, no, you look at it and you see that they’re engaging, it’s a highly valuable audience so, yes, I will go up beyond $11 CPI because I’m getting the return. So is it the same thing in creatives where we’re looking at maybe CTR and we should have been looking at, you know, number of sessions or something like that?
PS I mean, possibly but what I’m saying isn’t so much that we’re looking at the wrong KPI, I'm saying we were looking at one data point that was part of a bigger story.
PAS So maybe like day three instead of day six...
PS Maybe but my point is that it was one execution of a creative idea and there were thirty other possibilities and what we thought was signal was noise, you know? So whatever KPI you’re looking at, if you’re looking at just one execution of a creative and one result, like that’s like a dart throw.
PAS Exactly. So what should it be ideally, what would be the ideal approach to arriving at a conclusive decision, outcome?
PS The most – the interesting answer is I think we’re just scratching the surface of it, because we’re doing 3 x 3 x 3, 4 x 4 x 4. We used to look at one or two data points; now we’re looking at thirty, a hundred. I haven’t looked at a thousand or a hundred thousand – maybe we could do that in a bit – but I think certainly you’d want a 3 x 3 x 3: if you had three variables that you executed three different ways, that’s twenty-seven ads for every creative idea. That would be my recommendation.
PAS That’s a good start.
PS Yes, I think it’s a great start.
PAS And I mean to be fair, we’re talking about human behavior and it makes me think – it’s almost like incrementality, you know, I love that discussion, it’s like what really drove me to download app X and it could be that someone just told me the street, it was no ad I saw, it was not even like an outdoor, out of home ad, it was just me. I mean, to be fair, there’s a lot of variables there.
PS Right, and in some ways we’re hacking the why and getting to the what, because the why gets us into a lot of human psychology. Like, we’ll look at results and go, why do people react to blue hair? And we go, well, probably because we adapt to what’s expected, right? We didn’t expect blue hair on this ad, and we saw it and it engaged us. But that’s just me trying to work a story into it.
PAS Yes, exactly, the data won’t tell you that but it is interesting fine tweaks make a big difference. You talked about the blue hair, in closing, what is the biggest surprise you’ve had? You’ve worked with some major brands, maybe you can share an anecdote where it’s like – it even blew your mind and you thought, wow, we’re onto something here.
PS Well, I think I keep coming back to it, but I really thought that most of creative performance was in the creative idea, not the execution – and just the amount of variance we can get by doing multivariate testing, I really thought we were talking about 10, 20%, I didn’t think we were talking about these hundreds-of-percent differences. So the scale of what we’re seeing is really what’s surprising me. I think the first time we talked, I said I’ve been looking at the data and I just have to come to the conclusion that we’ve been doing it wrong – for fifty years, or however long we’ve been doing modern advertising. And that’s not anything I would say lightly, so it’s been pretty shocking. I was looking for something small and got something big.
PAS I mean these are, in some cases, 2x and 3x is what we’re talking about.
PS It’s between not being profitable and being very nicely profitable, you know.
PAS That’s something to take home, that’s food for thought. We’re coming to the close, Phil, it won’t be the last time we speak, for sure.
PS I hope not.
PAS In the meantime, I’m sure that people are saying, hey, I want to find out more about GameChangerSF, I might want to even dig into that data if you ever do some amazing blog posts – how could people keep up to date with you?
PS We actually – I think we have the case study up on gamechangersf.com so there’s a case study section where we share a lot of the data. I’ll publish parts of today’s presentation there but people can follow us on Twitter @gamechangersf and email me at phil@gamechangersf, I love talking about this stuff so email me, tell me I’m right, tell me I’m wrong, tell me you have data that says something else – I love those kind of discussions.
PAS And it’s going to be an ongoing one for sure, it’s a hot topic in 2020. I am delighted to have you on the show, thanks again, Phil.
PS Thanks, Peggy-Anne, it’s always a pleasure.
PAS And everybody listening in, hey, we’re doing this live so this is not scripted, so I’ll just end it the way I think I usually do, which is: first of all, you can find us on iTunes, Stitcher, Spreaker, Spotify and iHeartRadio just by searching Mobile Presence – there, I got that right. And don’t forget you can also check out earlier episodes on webmasterradio.fm. You can also check out snippets and articles around the podcast over at mobilegroove.com, which is also where you can find my portfolio of app marketing and content marketing services.
And it’s a wrap, friends, have a great day and remember - every minute is mobile, so make every minute count. We’ll see you soon.