The Rule Of Thumb: GDPR, Plugins And Understanding Data

The Rule of Thumb: GDPR, Plugins, and Understanding Your Data – Interview with Helmut Januschka

Throughout the episode, Maciej and Helmut explore various aspects of GDPR compliance, providing valuable insights and advice. Helmut emphasizes the importance of understanding and only installing plugins that are necessary, while also recommending against saving data that you don’t know you need. He shares his personal experience of navigating GDPR compliance, including his decision to not use Google Analytics for his captcha website.

➡️ Helmut Januschka, Head of Software Engineering at @krone_at, Austria’s biggest private news publisher, is an expert in website protection and security. He understands the risks that businesses face when their websites are targeted by bots, leading to potential loss of business.

You can also listen on Spotify and Apple Podcasts!

If you like this episode you might also like Discovering the secrets of the SEO world with Thomas Kloos

Maciej Nowak [00:00:00]:

Hello everyone. My name is Maciej Nowak, and welcome to the Osom to Know podcast, where we discuss all things WordPress. My guest today is Helmut Januschka, who is the CEO of Captcha. It is a product that was created within one of the biggest publishers in Austria. And this is a very interesting example of a tool born out of a changing regulatory environment, namely the introduction of the GDPR rules. We are also discussing how initial project assumptions change across the whole, let’s say, project journey. If you are watching this on YouTube, please give us a thumbs up. It means the world to us. And if you want to keep learning more about WordPress, please subscribe to our newsletter at osomstudio.com/newsletter. This is osomstudio.com/newsletter. Without further ado, please enjoy my conversation with Helmut Januschka.

Lector [00:00:59]:

Hey, everyone, it’s good to have you here. We’re glad you decided to tune in for this episode of the Osom to Know podcast.

Maciej Nowak [00:01:08]:

Hello, Helmut. How are you?

Helmut Januschka [00:01:11]:

Hello. Thank you. I’m fine. How are you?

Maciej Nowak [00:01:14]:

I’m very good. I’m very happy we can have this conversation today. And my first question to you would be: what is a captcha? In your understanding, what is a captcha?

Helmut Januschka [00:01:29]:

In my understanding, it’s a service that helps you protect your website and your business from being attacked by bots. A captcha tries to separate real humans from automated users and stops those automated tasks and bots from, let’s say, signing up for a newsletter, registering an account, or making a fake purchase, a fake vote, and stuff like that.

Maciej Nowak [00:01:59]:

All right. And why is this important? I mean, this can be automated and exploited, let’s say, by bots. But what is the extent of those actions? Is it just filling out forms? What’s the downside if you are not protected?

Helmut Januschka [00:02:18]:

If you’re not protected, you risk, first of all, going out of business because your website is just getting hammered by bots. The second thing is, take the typical case that everyone knows: contact forms. Usually those contact forms produce an email or something like that, which goes into some back office. If you’re a company that gets a lot of those emails, then you need to pick out the real ones, and you might miss a real user support case while you’re being flooded by spammers. So a captcha reduces the human effort needed to work through those contact forms.

Maciej Nowak [00:02:55]:

And Helmut, the name Captcha is also the name of the project that you recently came up with. And I would love to know what’s the reasoning behind your project and how the idea originated.

Helmut Januschka [00:03:09]:

The project itself is called Captcha.eu. Our mother company is Krone.at, which is Austria’s biggest news publisher. With the GDPR, we had the problem that we needed to remove all the other captcha services that we previously used, not to name and shame anyone. We removed the US services and then we immediately saw increases in ghost signups, in faked voting results, and stuff like that. We didn’t really have a solution at that time. We looked around the market and didn’t find any perfect fit for us at our scale. They were either too expensive or they were also US based. So we decided to give this thing a try. We worked through a lot of white papers from MIT and Stanford, and over the weekend we actually pumped out a quick POC and rolled it out in our most crucial parts, like subscriber management, login, and stuff like that. And over the weekend we saw immediate success. At first we were shocked that it worked so well. In the weeks that followed, we asked around inside the corporate group whether they could also give it a try, because they had basically the same GDPR thing going on and they were already talking to other vendors. So they gave our solution a try, and they also reported that it’s a huge success and it works super great. We did that for a year. It was running under the hood without the users noticing it, because our approach to captchas and detecting bots is not to let people search for traffic lights or buses or anything else you might know when you think about captchas. After a year we came to the conclusion that it might be a good product to sell, because it solves our problem so well inside the whole corporate group. So we decided to go the last mile, like building dashboards and building reporting, because we did not have that. Internally we did not need it. We just saw that the bots were stopped and we were happy.
So we invested about half a year into polishing the product, into making it an end-user product, and also into getting all the lawyer agreements and all the stuff we need to get the green light when we say it’s GDPR compliant. And yeah, recently, in April, we launched the final product. Since we are using WordPress ourselves, we came up with a WordPress plugin for the launch, so it’s ready to run if you have a WordPress site. With all the major form plugins, it works out of the box. A couple of weeks later, we now already have Craft CMS, we also have Humbler and various other platforms that we support, and we are always adding platforms because we see them as a multiplier. For the first months in the business, it looks quite good. We have a pretty diverse customer base, across all industries and different sizes, and it’s really interesting to see how it goes there. It’s a tough market because we are competing with a free product. So we need to somehow tell the customer: please don’t use Google reCAPTCHA, even though it’s free. Please give us a little bit of money. It’s really not a huge amount. But at the end of the day we see customers being happy because our solution is invisible. You don’t have to mess around with your UX or UI.

Maciej Nowak [00:06:58]:

Yeah, this is super interesting, because when I think of a captcha, and when most people think of a captcha, the first thing is either picking out the parts of buses or boats or typing those sketchy letters into some field. So why are those elements not visible in your solution?

Helmut Januschka [00:07:20]:

Basically, we decided to go with the invisible approach. Google, for example, also offers that. We took that road because in our mother company, the Kronen Zeitung, we have such a big audience, varying from teenagers to pensioners, and we absolutely wanted to avoid any unhappy customers. So we needed to be GDPR compliant and we needed to keep customers happy. The most important thing is: we accept one bot getting through, but we do not accept one dropped real user. This is the approach that we followed. With the invisible way, we took the data that we collected in the initial phase, which is more than a year, and we built a strong machine learning model, which ultimately helps us separate bots from non-bots quite well, even though we don’t have any visible user interaction.

Maciej Nowak [00:08:17]:

You should be calling it an AI approach, saying that there is an AI.

Helmut Januschka [00:08:21]:

I don’t call it AI because AI is too much magic. It’s just machine learning.

Maciej Nowak [00:08:26]:

Yeah, but maybe it would help you with the sales.

Helmut Januschka [00:08:32]:

We should probably call it Captcha GPT or whatever.

Maciej Nowak [00:08:36]:

Captcha AI?

Helmut Januschka [00:08:37]:

Yeah. No, explicitly it’s not AI. It’s based on real data. In the mother company we have dozens of data sets and records, and we use those to build a model that helps us predict whether a device is a bot or not. And the machine learning is only one piece. There are multiple signals that we take into consideration when deciding if you’re a bot or not.

Maciej Nowak [00:09:04]:

And they come from those papers you read over that very weekend?

Helmut Januschka [00:09:11]:

Yeah, it’s a mixture. It’s heavily based on those white papers, which are most likely pre-2000. So pretty old tech. But old tech is not always bad tech. And we also added our own experience, since we have been a web publisher since 1998. So almost all the generations of bots that are out there, we have already had them. As a publisher we are also a pretty political target. So every bot from China, Russia and Co has already been on our platform, and we take that data to help our customers at the other company, Captcha, stay protected.

Maciej Nowak [00:09:54]:

Yeah, this is very interesting, that you can create something out of white papers over the weekend that works. Or maybe I’m wrong, so correct me if I’m wrong, but it sounds a little bit like it wasn’t a herculean task at the very beginning to create a working solution overnight, over the weekend, that would save, let’s say, your life, because you could no longer use a captcha to protect your website. Right, or not? What I’m trying to ask is: you had to remove all of the captcha solutions and you had to create something GDPR compliant of your own. And you mentioned that the POC was quick to set up, right?

Helmut Januschka [00:10:45]:

Yeah, but the POC was one and a half years ago. The POC was really wood and nails, like trying different white papers and just surviving the weekend. And then we invested more than a year in fine-tuning it. The first version worked quite well, but two weeks later bots from a different country came and they ran through it again. So it was more wood, nail, wood, nail, wood, nail.

Maciej Nowak [00:11:13]:

I see.

Helmut Januschka [00:11:14]:

Okay. The basic idea was there from the white papers, but making it a diamond took quite a while.

Maciej Nowak [00:11:23]:

Yeah, obviously. But it is amazing that you were able to introduce something that would protect you right at the very beginning and then grow. So the data you trained your model on was collected over those months, starting one and a half years ago. But did you have data from before that you could use and analyze, or were you collecting only after it started with the new solution?

Helmut Januschka [00:11:49]:

Actually, we have so much traffic that whenever we deploy anything new, it takes just an hour, for example, and we have multiple thousand data records. So we could easily say, okay, that machine learning model failed, and then we just rebooted: we dropped the whole database, let it run for half a day, and then we had enough to work on from there, since we have a lot of logins and stuff like that.

Maciej Nowak [00:12:16]:

Okay, I see. The company created a product, a digital product. For the listeners who don’t know, Krone is a big Austrian publisher, one of the biggest, I guess. Right. And my perspective is that it’s a big organization, a corporation in the publishing sector. And this is a very specific, very niche product solving a very unique problem. On the one hand, I would imagine this is not a place for such a product to originate from, because it’s a big organization creating something outside of its, let’s say, space, which is publishing. At the same time, there is such a huge amount of data you can use that this is the best place for such a product to originate. Right. So there are two different ways of thinking about this. It’s a little bit surprising for me. But at the same time, who else but a big organization with a lot of data can create such a thing? Yeah, it is interesting that this kind of organization created such a product.

Helmut Januschka [00:13:46]:

Yeah. So I think the funny part is that we did not sit in a room and say, what can we create to sell afterwards? It was more like, okay, how can we come up with a loose solution to survive in that GDPR space today? And then we sold that. It was not easy to get the approvals to sell a product, because we have been selling newspapers for around 100 years or something like that. And then we came up with the idea of selling a digital something that has nothing to do with content itself. But yeah, actually we are doing well so far and it’s quite interesting. And these days, economically, it totally makes sense to give that thing a try. Even if sales are not successful, we still need the product. So every euro that comes in is great for us.

Maciej Nowak [00:14:35]:

It’s just a bonus, right? It is not necessary. It’s nice to have, right?

Helmut Januschka [00:14:40]:

We need it anyway, because we are eating our own dog food in that sense, and if we can help others, we like it.

Maciej Nowak [00:14:51]:

And you referred to what I was thinking of as a problem, because for a company that is selling newspapers, even one entering the digital era with online publishing, this is something totally outside the area of expertise of the salespeople, of the lawyers, because this is not what they are worrying about day to day, right? This is something new, creating a product and selling a product instead of a newspaper. So I’m very curious, how did it go in terms of winning the management decision? Like selling this idea to the management: let’s sell it. I’m very curious because this is not a usual situation. My thinking is that those kinds of products are created by a freelancer after hours, or by two guys out of university who had a fresh idea to revolutionize something, and so on. But here it’s a very established business with a long history.

Helmut Januschka [00:16:00]:

The thing is, we have a pretty agile team and we pretty much have the principle of doing the core business ourselves. So we do not hire external developers for core business. Everything that makes money should be built in house so that we do not have our income controlled by others. This is a mentality among the engineers. We are not afraid of doing things like that. But as you said, it was not that easy to convince the upper management to really sell it. They were happy that we built a solution for ourselves, that we were good to go, and that we did not need to pay for any external services. But telling them, okay, come on, let’s create a company, let’s sell that product, was not the easiest thing. So there were a lot of things we needed to do in the first place. We needed to hire lawyers who looked into whether the whole product can be legal when used by others and how it needs to be formulated. Pretty much the last mile was almost exclusively talking to lawyers and getting all the papers done. The second thing is we needed to do source code audits. We needed to send the source code to an external auditor, and they needed to confirm to the management that what we did here is a feasible thing. Because one of the first criticisms was, okay, so how can we do that? Facebook, Google, why can’t they do this? And the thing is, they can do it too, but they are still American companies. That is the crucial difference. Since the problem is European-wide, we really tried to explain that this is not a local Austrian thing. It could most likely spread to other countries, because every publisher and every other website vendor has the same problem.

Maciej Nowak [00:18:06]:

Let’s talk about exactly that problem. So just to go back a little bit: how did it all start? Why did you have to create a different solution? What changed in the regulation that forced you to abandon them?

Helmut Januschka [00:18:29]:

When GDPR came into effect, we went through all the US-based services and removed almost everything from Google, for example, except fonts and the captcha, because we argued they are system relevant and critical to protect ourselves. And the GDPR department said, okay, that’s a super gray zone. It’s like, why is it technically required, when you read an article, that we have a captcha on the contact form? But it was a gray zone, and generally most people still see it as a gray zone nowadays. Then, like half a year ago, I think, there were these Google Fonts lawyer letters that got sent out. And we as a publisher are a high-prestige target. Every GDPR lawyer looks at our platform, because if we do something wrong, in terms of publicity they gain more than if they go to a dentist, for example, and say, hey, you’re using an American service. So since GDPR, we always had the “be the best in class” call from the management. And with the removal of the Google Fonts, meaning hosting them on our own, the GDPR lawyer said, okay, it’s time, we need to remove that. We cannot gray-zone that anymore, it’s too critical, because they will definitely kick it out anyway and we will have stupid arguments and lawyer costs. So we removed it. And to be honest, in that moment we thought, okay, bots are not a real thing anymore. We do not have bots, so just remove it, everything will be fine. Most likely the Internet is a good place nowadays. Yeah, funny. And you were surprised, right? 20 minutes after the rollout of the removal of Google reCAPTCHA in all our core stack, it was immediately auto scaling and emails, and everything went in the wrong direction. At first we were really shocked because we didn’t know what to do. We wanted to reactivate Google reCAPTCHA.
We could not do that because we had already told the lawyers that we removed it, and it would not make a good impression if you reactivate it five minutes later. So we did the traditional things, like blocking countries and things like that, to survive the first few days. Then over the weekend we built this POC and fine-tuned it over the next couple of days, or sometimes weeks, actually.

Maciej Nowak [00:21:04]:

All right, so this is the reason this was so stressful and had to be done over the weekend because I was thinking maybe the regulation kicked in.

Helmut Januschka [00:21:12]:

No, the thing is, actually we only needed it for logins, but a year ago we started a premium subscription model. So login is one of the most important things if you pay multiple euros a month for a subscription and commit to a two-year contract, and then you cannot log in because bots are flooding the system. Up to a certain point we can scale, but at some point scaling is not the solution. It’s not only about paying extra, it just doesn’t work, because the bots also scale and they scale cheaper.

Maciej Nowak [00:21:53]:

Yeah, there is leverage, right? There is even cheap ping that kicks out of systems running. All right. And going even earlier, when GDPR was coming into effect, let’s say, there were many different worries about how to interpret the GDPR rules and so on. It’s been a couple of years already, and there are still precedent rulings, of course, or PR stunt actions like the one with those lawyers and Google Fonts and so on. So my understanding is that even though it’s been a couple of years, this is still not very settled. A lot of companies are, not regulated, compliant with what GDPR says right now, and some are not and are neglecting this. But there are many myths around the whole GDPR idea, for example that you cannot store cookies, and others. And what’s your opinion? Because you are the practitioner here, right? You and your team created the solution that, let’s say, saved the company. So I wonder what your opinion is about the stuff that people think they cannot do but actually can, as long as it is compliant. What does that mean?

Helmut Januschka [00:23:36]:

I think that the world is divided into two tiers. From the publisher side, we are on the first tier, where we have to comply with every regulation that is possibly floating around, because otherwise we really risk paying fines that are super high. In Austria we have multiple cases of companies that paid, some of them already twice, for stupid things. And there’s a second tier: you’re a blogger, you have a website, and you don’t even know what you are doing there. You install a theme and click Next, Next. And then your company, if you’re a dentist, for example, might possibly be sued for not being GDPR compliant. Things are getting better, because those ready-to-run setups and themes come with cookie banners and stuff like that. But in general, the way we see it is that when we start a service with some external entity, if it’s US based, it really needs to be argued why we should use it. So at some point it’s probably stopping the European Union from using innovative things. But at the end of the day, if you are really into that topic, it’s about protecting the user. You should not set a cookie if you cannot argue why you set it. And if you can argue why you set it, for example you have a button that saves the state of dark mode or not dark mode, you can set that cookie. You can argue why you set it and it makes sense. But at some point on your website you need to list that cookie, say that you set it and why. And I think that’s the good part of the GDPR thing. Publishers like us, when it took effect, first of all needed to figure out what we are actually doing, because it grows over time, and then you have third-party advertisers and stuff like that. And I think the whole GDPR wave really helped to clean it up a little bit, to be honest. The current solution, with clicking the button and accepting it, is not the real solution, because at the end you end up with most of the cookies anyway.
But GDPR goes further.
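Editor’s note: the “only set a cookie you can argue for, and list it” rule Helmut describes can be sketched in a few lines of Python. This is a minimal illustration, not how Krone or Captcha.eu implement it. The names `CookiePurpose`, `REGISTRY`, and `set_cookie_header`, as well as the one-year lifetime, are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CookiePurpose:
    """One entry of the cookie list a site would publish (hypothetical shape)."""
    name: str
    purpose: str
    lifetime_days: int


# Hypothetical registry: every cookie the site may set, with its stated reason.
REGISTRY = {
    entry.name: entry
    for entry in [
        CookiePurpose("theme", "Remembers the dark/light mode choice", 365),
    ]
}


def set_cookie_header(name: str, value: str) -> str:
    """Build a Set-Cookie header, refusing cookies without a documented purpose."""
    entry = REGISTRY.get(name)
    if entry is None:
        raise ValueError(f"cookie {name!r} has no documented purpose")
    max_age = entry.lifetime_days * 24 * 60 * 60  # lifetime in seconds
    return f"Set-Cookie: {name}={value}; Max-Age={max_age}; Path=/; SameSite=Lax"


print(set_cookie_header("theme", "dark"))
```

The same registry could be rendered into the cookie list on the website, so the documented purposes and the cookies actually set cannot drift apart.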

Maciej Nowak [00:25:58]:

Yeah, because when you browse, everyone shows that cookie banner, which does nothing apart from saving someone’s seat: there is that cookie banner, so we are compliant because there is that cookie banner. But to be compliant it actually has to work if you are not accepting, if you are opting out.

Helmut Januschka [00:26:18]:

But the thing is, GDPR goes a little bit further. We as a vendor ensure that we do not store IP addresses. We ensure that we eliminate log files after a certain number of days. We have to commit to only storing data that we really use for the business. Previously, before GDPR, we always just said, get the data and then we figure out what we do with the data. Nowadays it’s a little bit different. You need to have the use case before collecting the data, which sometimes makes it hard, especially when you want to look at historical data. But it forces companies to think about what they are doing. First of all, it’s your personal information. But also, what happens if the company gets breached and they have millions of records about you from a year ago that they never did anything with? The guy who bought the zip file on the darknet now has the data. So I think GDPR overall is a good way. The current implementations on the websites are, like I say, improvable with the cookie banners. I think that’s not the perfect solution, because most users just click on accept and continue anyway. It does not really change a lot. A little bit, for sure, because the website owner needs to, for example, eliminate log files. So a little bit of improvement, anyway.
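Editor’s note: the two habits mentioned here, not keeping IP addresses and deleting logs after a fixed number of days, can be sketched as follows. This is an illustration under stated assumptions, not Captcha.eu’s implementation: Helmut says they do not store IP addresses at all, while this sketch shows the weaker, common technique of truncating them. The 14-day window, the prefix lengths, and the function names are hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 14  # hypothetical retention window, not a figure from the interview


def anonymize_ip(ip: str) -> str:
    """Zero the host part of an address: keep /24 for IPv4 and /48 for IPv6."""
    version = ipaddress.ip_address(ip).version
    prefix = 24 if version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def scrub_and_prune(entries, now=None):
    """Drop entries older than the retention window and truncate IPs in the rest."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS)
    return [
        {**entry, "ip": anonymize_ip(entry["ip"])}
        for entry in entries
        if entry["timestamp"] >= cutoff
    ]
```

A job like this would typically run on a schedule, so that nothing older than the committed retention period survives, whether or not anyone remembers to clean up.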

Maciej Nowak [00:27:49]:

You know, you mentioned that action from half a year ago regarding Google Fonts. Can you elaborate on this for our listeners a little bit more? What happened?

Helmut Januschka [00:28:02]:

What happened? We actually got a letter that said we need to argue why we are using it. And then the whole lawyer thing kicked off and it was a huge mess. Then we decided, in the reply to the lawyers, to just tell them: okay, you’re right, that might be not okay, or okay, please make the decision out of our room. But to show you our goodwill, we just remove it. And the funny thing is we now host the Google Fonts on our own, resulting in like 50 terabytes of font traffic just for nothing. So it’s basically also a cost driver. Previously, Google paid for those 50 terabytes of traffic. Now we have to pay for this on our own. And personally I think the user is not in a better world right now just because the font is coming from our server. But at the end of the day, maybe it’s the sum of the small things, and it forces the big ones like Google, Facebook and Co to think a bit more about what they are doing before going into the European market.

Maciej Nowak [00:29:11]:

There was a case, for example, very recently, like two weeks ago, three, maybe four weeks ago: ChatGPT was banned in Italy because of the.

Helmut Januschka [00:29:23]:

I think they already lifted the ban. I think banning is not the right way, to be honest, because especially in the whole AI space, I think it would not be great for the European Union to completely isolate itself, because right now we don’t have the equivalent products. It opens a huge space for AI out of Europe, but right now we don’t have it. So we get a really big disadvantage in terms of innovation and in terms of the whole thing. Personally, I would not like that. I would like it if they, for example, required me to click and accept something, so that people know. For example, if you enter your password in that ChatGPT box, most likely someone else will get your password as a possible predicted solution in six months. So take care what you are doing. But blocking the service completely, first of all, doesn’t make sense, because people will find workarounds, proxies, VPN, whatever. And it’s not data protection for me. Who decides then what is blocked and what is not blocked?

Maciej Nowak [00:30:31]:

Exactly.

Helmut Januschka [00:30:32]:

Today we are blocking AI, next week we are blocking publishers, and the week after we are blocking bloggers with a different mindset. So I think that this is not great.

Maciej Nowak [00:30:43]:

Yeah, it’s like zero-one, binary: working or not working. Someone got scared and banned the thing, and maybe that’s not the best approach. But also in the German-speaking countries, I think the rules for data protection are stricter than in other countries. And for non-German-speaking listeners who didn’t follow this, there was a lawyer or a small office asking for information under GDPR on a mass scale, right? Visiting the website and then asking for information, checking if the company is prepared to give that information, which is required by GDPR. You know about this case, right?

Helmut Januschka [00:31:39]:

I know about this case, and we as a publisher have that on a daily basis. There are people who sign up, delete their account, and then send an email: please give me everything you know about me. We have that process completely automated. It’s just a click: we enter the email address and we get a zip file with everything. But integrating and implementing that was a huge amount of effort.
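Editor’s note: the “enter the email address, get a zip file” process could look roughly like the sketch below. It is a simplified illustration, not Krone’s system. The function name `build_export`, the in-memory `stores` dictionary, and the assumption that every record carries an `email` field are all invented for this example; a real export would query each backend separately.

```python
import io
import json
import zipfile


def build_export(email: str, stores: dict) -> bytes:
    """Collect every record matching the email and return them as a zip archive.

    `stores` maps a data-store name (hypothetical, e.g. "accounts") to a list
    of dict records. One JSON file per store is written, even when it is empty,
    so the recipient can see that the store was checked.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for store_name, records in stores.items():
            matching = [r for r in records if r.get("email") == email]
            archive.writestr(f"{store_name}.json", json.dumps(matching, indent=2))
    return buffer.getvalue()
```

Keeping the list of stores in one place is the part that tends to cost the effort Helmut mentions: every new system that holds personal data has to be added to the export.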

Maciej Nowak [00:32:02]:

All right. And when you think about this, how do you feel about it? Because again, this can be a double-edged sword. You know what you are doing with the data, so it means that you are not losing the data, or you can say, I’m managing the data. On the other hand, it is a huge pain to create a solution that would automate this. Otherwise it would be impossible with manual work, right? So it’s like protection, enforced protection, pushed to the limit of reasonable effort, I would say.

Helmut Januschka [00:32:41]:

Yeah. So we have a process in place that creates that zip file. But we definitely have a person checking it before sending it out, because usually the people who are writing and requesting that thing are nitpickers. If you make a tiny mistake, they will use that tiny mistake for whatever. We don’t even know what kind of people those are, because from my point of view it doesn’t make sense. We don’t have any crucial data. But it’s their right and we are complying with it, and we are sending you the data if you request it. For Captcha, for example, what we definitely don’t do, unlike the competitors: you need a JavaScript on your website, but you only need it in the area where, for example, your contact form is. So you don’t need to place us all over your page. So we do not know your surf flow, where you were previously, and stuff like that. Because this would be super gray-zoned, and we don’t want to be gray-zoned.

Maciej Nowak [00:33:38]:

And all of those regulations are changing the way things can work. It sounds like, first of all, the new regulations created a whole lot of new lawyers.

Helmut Januschka [00:33:53]:

Absolutely. It’s all a lawyer’s business.

Maciej Nowak [00:33:55]:

Exactly. It’s like regulation-driven development. Development, yeah. But also for the tools. This Captcha wouldn’t exist if not for the change in the regulations.

Helmut Januschka [00:34:07]:

We definitely would not have created it, because we would have stuck with reCAPTCHA. The need would not have been high enough to come up with a solution. It’s like the cookie banner companies: there are gazillions of cookie banner companies that charge you multiple hundred dollars just to have a compliant cookie banner.

Maciej Nowak [00:34:27]:

Exactly. And I wonder what else is on the radar for disruption, let’s say, because of the GDPR.

Helmut Januschka [00:34:36]:

I don’t know. I think the next thing is the third-party cookies, so that you’re not allowed to send cookies to other servers. But this is something the browsers are going to enforce, and therefore I think it’s not so hard, because it hits all the publishers and all the websites, and therefore it’s fair, most likely fair.

Maciej Nowak [00:34:59]:

What about the fonts? You have to host your own fonts. So is there a space for revolution here?

Helmut Januschka [00:35:07]:

Yeah, maybe someone could build a European-hosted Google Fonts thing, but technically that’s super simple. You take a web server, put the fonts there, and you’re done. The problem is the bandwidth you will need: it’s 50 terabytes in like three months or something, which is super a lot, only for us. So I think the business model behind that would not scale.

Maciej Nowak [00:35:36]:

So what is the reason this is taking so much bandwidth?

Helmut Januschka [00:35:42]:

First of all, we’re using a lot of Google Fonts. We have designers who pick a lot of fonts, and then we have multiple variations and font faces, like bold and non-bold. And in our case we have a huge number of users who only visit us once. They are googling for something, say there was a car accident, and then they find an article and click on it. They have never been on our site, so they don’t have anything in the cache. Then we see a huge increase in incognito browsers, which also eliminate caches. So the cache effectiveness is so low that you just need that much bandwidth for it.

Maciej Nowak [00:36:25]:

All right, because you have so much traffic and so many fonts hosted on your own server, it takes a lot of bandwidth. But the rest of the page is also heavy, so the rest of the traffic is probably many times more, right?

Helmut Januschka [00:36:46]:

Okay, yeah, the rest of the page also needs a lot of traffic. But when you look at an article, for example, it’s mostly text, which is super easy to compress with gzip, and it does not take that much. And if I have more reads on an article, I have more ads and I get more money, but if I have more font calls, I don’t get more money. So it’s like, yeah, we do it, but basically we would love not to do it.
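
To illustrate the compression point, here is a minimal Python sketch comparing how well gzip shrinks text versus data that is already compressed. The sample text and the random bytes are stand-ins chosen for this example: a repeated sentence compresses far better than a real article would, and random bytes only approximate a font file (WOFF2 fonts are already compressed, so gzip gains little on them).

```python
import gzip
import os


def gzip_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical article body: repetitive natural-language text.
article = ("There was a car accident on the motorway this morning. " * 400).encode("utf-8")

# Stand-in for an already-compressed font file: random bytes do not compress.
font_like = os.urandom(len(article))

print(f"article:   {gzip_ratio(article):.3f}")
print(f"font-like: {gzip_ratio(font_like):.3f}")
```

The text payload shrinks to a small fraction of its size, while the font-like payload stays at roughly its original size.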

Maciej Nowak [00:37:15]:

All right, yeah. This is not the right market for you, right?

Helmut Januschka [00:37:19]:

We’re not selling fonts or something like that.

Maciej Nowak [00:37:21]:

Yeah, I’m thinking about this from an engineering perspective: what’s the reason behind such big traffic for font hosting? But you explained this. Thanks for that. I’m also thinking, what are the good rules if you have that legendary dentist business? When you are the owner of a small dentist business, what’s important in terms of GDPR? Because again, you are a practitioner at a very different scale than a dentist business. So I wonder, what is the rule of thumb when following GDPR rules?

Helmut Januschka [00:38:08]:

I think the rule of thumb goes way beyond GDPR. It’s like: only install plugins that you know make sense. Don’t go into it like a candy shop and activate 15,000 plugins. And the same goes for GDPR: don’t save data that you do not know you need. If you stick to that, it’s quite easy. There are alternatives. For example, for the Captcha website itself, we decided not to use Google Analytics, because it would smell bad to say we are GDPR compliant but use Google for our own website analytics. So we went for Simple Analytics, a great vendor out of, I think, the Netherlands, and they provide a great analytics solution that is equivalent to Google. There are solutions already out there. But just don’t store data that you don’t need. Don’t add tracking scripts if you don’t do anything with the data. In my career previously, when I started in the 90s, we just added tracking for the purpose of tracking. We didn’t know what we would do with it. In most cases we just dropped the data; at the end of the year, we just truncated the table. But nowadays it’s easy to just think about what you are going to do with the data. And if you host everything on your own, you don’t need a cookie consent. On our captcha.eu website, for example, we don’t even have a cookie banner, because everything is self-hosted, and it’s quite limited in terms of analytics. You can do way more with Google and other products. But the reason why you can’t do it is because it’s a gray zone or illegal.

Maciej Nowak [00:39:44]:

All right, is there anything else? Because I’m looking for good practices for the people when they will be thinking about building their website. So the default approach is, okay, Google Analytics comes with the bundle, right? So you order the website and this is default part of the execution to integrate with Google Analytics.

Helmut Januschka [00:40:12]:

Yeah, the problem is it begins with Google Analytics. Even if you set the IP anonymization or use GA4, the newest one, technically you would need consent for it. So this is the point where it begins that you need a cookie banner.

Maciej Nowak [00:40:27]:

Yeah, exactly. So you so called legalize this with stating that okay, I’m fine with those cookies to kick in.

Helmut Januschka [00:40:38]:

And most of the users do it. So the tools make sense anyway. And those 10% to 15%, probably, yeah, they just opt out, but they are not a statistically relevant share.

Maciej Nowak [00:40:50]:

And with European hosted, let’s say, solution for analytics, you wouldn’t have to do this.

Helmut Januschka [00:40:56]:

No, you don’t need to do this.

Maciej Nowak [00:40:58]:

You don’t need to say that you are tracking users, just because they...

Helmut Januschka [00:41:05]:

...have to comply with European law, and they usually don’t need any consent. They might have, I’m not really sure, a plan or a feature that requires that. But the stock plan from Simple Analytics, for example, doesn’t need anything. You sign up, you get this, it’s called a DPA, a data processing agreement, and you both sign that and then you’re fine. Usually you get that pre-signed as a download.

Maciej Nowak [00:41:35]:

Right. So imagine removing all of the external US-based tracking codes. I mean, most businesses, if those are consumer businesses, will use Facebook Pixel and Google Analytics. And if they can still consume that data, they are more than fine if they drop Facebook and replace Google Analytics.

Helmut Januschka [00:42:00]:

The thing is, with Facebook Pixel or even Google ad campaigns: if you plan to buy a campaign at Google and you want to see how well it performs, you’d actually need the Google pixel, and yeah, you need consent. The same goes for Facebook Pixel. But for example, we are doing a heavy volume in terms of Google Ads for Captcha, and we just accept the fact that we don’t know how well it converts. So we just measure it in our own analytics and then we try to make sense of it. Okay, we got so many users in this particular time window and most likely they are from Google, but we do not know exactly.

Maciej Nowak [00:42:38]:

Right, so I understand now.

Helmut Januschka [00:42:41]:

All right, but since we are heavily marketing with GDPR, we wanted to be the cleanest of the clean.

Maciej Nowak [00:42:50]:

Yeah, you have to lead the way.

Helmut Januschka [00:42:52]:

Yeah. I cannot say please use our service because we are better, but please don’t look at our website.

Maciej Nowak [00:43:00]:

Because I’m thinking it’s a hard battle to fight. You are making a product competing with Google, but you still need to use Google, though not to the full extent, because you decided not to track the conversions. Right, so this is very interesting, that you sacrifice part of understanding how the campaigns...

Helmut Januschka [00:43:27]:

Actually the campaigns would optimize themselves better if you had the Google pixel inside, because then the Google machine learning would know better where to place the ads. But in that particular case we just go by volume. So we activate a campaign, and this is just a sample: we start on Monday and we end on Wednesday, and then afterwards we look in our Simple Analytics at what the peak was, what the entry page was, and stuff like that. Sometimes we create landing pages where we can see in the path of the URL, okay, this is coming from Google. It makes it harder, but it’s possible. And at the end of the day, since we are a SaaS service, all that matters is how many coins we insert at the one end and how many coins we get at the other end. And whether you are coming from Google, Facebook or wherever, actually it doesn’t matter.
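
The "go by volume" approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the landing-page path `/lp/google-spring`, and the dates are hypothetical, and it does not use any real analytics API. It simply counts entry pages inside the campaign window, without cookies or visitor IPs.

```python
from collections import Counter
from datetime import datetime


def entry_pages_in_window(views, start, end):
    """Count entry paths for page views with start <= timestamp < end."""
    return Counter(path for ts, path in views if start <= ts < end)


# Hypothetical page-view records: (timestamp, entry path).
views = [
    (datetime(2023, 5, 1, 9, 0), "/lp/google-spring"),
    (datetime(2023, 5, 2, 14, 30), "/lp/google-spring"),
    (datetime(2023, 5, 2, 15, 0), "/pricing"),
    (datetime(2023, 5, 5, 10, 0), "/lp/google-spring"),  # after the campaign ended
]

# Campaign window: Monday 2023-05-01 up to (not including) Thursday 2023-05-04.
start = datetime(2023, 5, 1)
end = datetime(2023, 5, 4)

counts = entry_pages_in_window(views, start, end)
print(counts.most_common())
```

A dedicated landing-page path makes the attribution more reliable than the time window alone, which is why both signals are combined here.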

Maciej Nowak [00:44:19]:

Yeah. And who is your ideal customer? Because I’m thinking we talked a little bit during quarter about this, that this was the time when you started marketing this. And I’m also curious about that product development journey. Your assumption, initial assumptions, who would be like perfect user and who is it now? Any lessons learned here?

Helmut Januschka [00:44:46]:

Yeah, many, many lessons learned, because we didn’t do that before. It was none of our business. None of the involved people had done that before. We’re not startup guys, we’re super corporate guys. The thing is, in the development cycle, we used the dentist or the lawyer as a persona: the one guy who sets up his own website, downloads the plugin, activates it and buys the cheapest plan. So this was the persona that we built everything around, including the business plan we calculated. Right at the WordCamp itself, we recognized that there are actually more agencies interested in that. Just answering the first question, who is the customer that we want? We actually want everyone. We want a single person who buys the tiny plan, but we also want the agency. With the beginning of the real-world launch, we realized we needed to step up and integrate a couple of features in the dashboard that make it possible for agencies to use it at a larger scale, like managing multiple clients and updating multiple clients. So right now, we have heavily optimized the dashboard and everything to be agency compatible. But we still have random tiny websites from some vacation destination that sign up for a plan and integrate it into WordPress, and we don’t have to do anything with them. So in terms of return on investment, those might be the best customers, because they don’t require anything. When it comes to agencies, we call them, we talk to them, we help them integrate into custom solutions, but at the end of the day, they buy bigger packages.

Maciej Nowak [00:46:32]:

And you also mentioned that initially this was meant for WordPress, and now a couple of other frameworks are supported. What about bigger websites? Bigger websites with bigger traffic, custom built on Laravel or the other Djangos of this world?

Helmut Januschka [00:46:49]:

First of all, it was not built or designed for WordPress; it was designed as a standalone product. But since we wanted to go into that self-service area, we wanted to pick one platform where we can drop it and people can buy it and don’t need support or anything. And since we are WordPress nerds and had already built a lot of plugins, we decided, let’s take this platform because we know how it works, and then go from there. And through the talks with the agencies and the customers, we realized, okay, there is this Craft CMS that I didn’t even know. So we built a plugin in a week. But the thing is, I didn’t want to start with a platform that we do not know by heart, because I wanted to feel safe when starting to sell a product.

Maciej Nowak [00:47:40]:

Yeah, it’s only natural that you started with what you know. I’m thinking maybe in terms of those lessons, WordPress is huge, right? Roughly speaking, 40% of the internet out there. Obviously Europe is smaller than the world itself, but I’m thinking maybe of those unexpected customers. Like a platform that is huge in terms of volume, but is built totally from scratch. Custom built.

Helmut Januschka [00:48:14]:

Yeah, we have one, actually, I think I’m not allowed to name the names.

Maciej Nowak [00:48:19]:

You don’t have to.

Helmut Januschka [00:48:22]:

But we have a news agency, a bigger one from Europe, and they have a fully custom integration. So we did a Teams call and helped them integrate it into their super tech stack that they came up with probably a couple of years ago. And, for example, Craft CMS. We had one customer who said, okay, I gave it a try on my personal website, it worked great, but I cannot use it because we have Craft CMS. And then typically we jump in, we take two engineers and try to build a plugin for that, for free in that case. It’s not really free for us, because it’s a multiplier. If we have that plugin, the next customer will most likely join more easily.

Maciej Nowak [00:49:04]:

Yeah, you build on top of what you have already built.

Helmut Januschka [00:49:07]:

The only thing that we always put on the scale is that we need to maintain that plugin afterwards. So it doesn’t make sense to rush out the plugin in 2 hours and then get this one single customer. What do we do if that platform rolls out an update in three months that somehow changes the APIs? Thankfully, WordPress never changes anything, so WordPress is the most maintenance-free plugin. But other platforms like TYPO3 or stuff like that, they are quite used to breaking things.

Maciej Nowak [00:49:42]:

Okay, and what’s on your radar right now with the product development path for next quarter?

Helmut Januschka [00:49:49]:

The major plugin we’re currently working on is TYPO3, because it is just a headache with all the different versions and supporting them backwards. We’re trying to figure out how far back we need to support it. Usually we do current minus one, but then we had a customer who said he needs current minus three, and so we are weighing how we can support that. And then I think we have pretty much all of those big platforms. We already shipped the Keycloak plugin in a beta version. That is like an enterprise authenticator. Keycloak itself is open source, but it’s based on Java and it’s super heavily used in big companies. We hope that we have a door opener if we come into the sales pitch and tell them, okay, if you want to give it a try, just download the plugin.

Maciej Nowak [00:50:40]:

And you also mentioned that you are super corporate people, and this is, as you call it, the startup approach. What was most surprising for you with this product? What surprised you most?

Helmut Januschka [00:50:58]:

I think that everything we thought would happen in the first few days did not happen. But good things happened anyway, so it happened differently. We reached our goal for the first few days, but on a totally different road, even though we were heavily thinking about what would go on at WordCamp and how many people would buy it there. And then we quickly realized people are not buying it immediately, but four weeks later they still use the voucher code. And that’s super interesting to see, like a tiny little puppy that you cared for over one and a half years, and then it’s the first time it is flying or running around and getting free. Yeah, and then customer requests. I really enjoy helping customers. It gives a lot of joy, because typically they don’t have those heavy technical questions. They most likely have something like, okay, I want to host the JavaScript. For example, one of the things we realized quite quickly was that there are people who wanted to host the JavaScript on their own, and we didn’t think about that, because we thought, okay, we take the bandwidth on our bill, take the script and be happy. But there are other companies who say, we need to host it on our own. And then in about three days, we modified the plugin to be able to run the JavaScript through WordPress and stuff like that. Super interesting.

Maciej Nowak [00:52:22]:

And what is the reasoning behind, I have to do this on my own server?

Helmut Januschka [00:52:28]:

Yeah, different reasons. There are companies who are not allowed to have external services integrated, and so they proxy the JavaScript through WordPress. And now we also have multiple options for the verification endpoint. There is the one that runs through the CDN, where it’s possible that you get a non-European IP address. We also have a dedicated Austrian IP so that you can guarantee that everything goes through Austria. But we also now have NGINX and Caddy configurations so that you can deploy a proxy in your own data center, and that proxy goes to our service. In the configuration of that proxy server, it’s guaranteed that the IP address is already eliminated on your side, because we had two customers who needed to make sure that the IP address doesn’t leave their data center.
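
The conversation does not show the actual NGINX or Caddy configurations, so the following is only a hypothetical Python sketch of the idea behind such a proxy: before a request is forwarded upstream, any header that could carry the visitor’s IP address is dropped. The header list and the function name are assumptions made for this example.

```python
# Header names that commonly carry the visitor's IP address (assumed list).
CLIENT_IP_HEADERS = {
    "x-forwarded-for",
    "x-real-ip",
    "forwarded",
    "true-client-ip",
    "cf-connecting-ip",
}


def strip_client_ip(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers without any that reveal the visitor's IP."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in CLIENT_IP_HEADERS
    }


incoming = {
    "Content-Type": "application/json",
    "X-Forwarded-For": "203.0.113.77",
    "X-Real-IP": "203.0.113.77",
}

forwarded = strip_client_ip(incoming)
print(forwarded)
```

Because the proxy opens its own connection to the upstream service, the upstream then only sees the proxy’s address as the source, not the visitor’s.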

Maciej Nowak [00:53:22]:

Okay, so the IP address of the visitor.

Helmut Januschka [00:53:25]:

Yeah, exactly.

Maciej Nowak [00:53:26]:

All right, so this is precaution as to not to leak the visitor addresses outside of the company.

Helmut Januschka [00:53:34]:

Exactly. For our machine learning, we don’t need it, because we constructed it around GDPR itself. When we created the thing, we already removed the last two blocks of the address at our end. But yeah, we say that and people believe it, but there are companies who would rather make sure. So we officially support what we call the proxy mode, and you get a tiny bit of configuration that you can set up on your side. And in that configuration, you see that the IP address is removed.
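
Removing "the last two blocks" of an address can be sketched as follows. This is a minimal Python illustration, not the product’s implementation: it assumes "blocks" means the last two octets of an IPv4 address, uses a documentation-range example address, and leaves IPv6 out of scope.

```python
import ipaddress


def anonymize_ipv4(address: str) -> str:
    """Zero the last two octets of an IPv4 address (a.b.c.d becomes a.b.0.0)."""
    ip = ipaddress.IPv4Address(address)  # raises ValueError on invalid input
    network = ipaddress.ip_network(f"{ip}/16", strict=False)
    return str(network.network_address)


print(anonymize_ipv4("203.0.113.77"))  # prints 203.0.0.0
```

After truncation, the value only identifies a block of 65,536 addresses rather than a single visitor.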

Maciej Nowak [00:54:04]:

This is super interesting because an approach to building captcha is now very fine tuned to what your customers are saying. And this is interesting from my perspective, how much effort you want to put into custom configuration that okay, can be rolled out to other users. Maybe they will require this. But this kind of custom configuration you’re saying now, like proxy and everything, sounds like a lot of work.

Helmut Januschka [00:54:35]:

Yeah, I think we always have to say we’re an Austrian company, and in Austria it’s most likely that if we talk, we find a solution. And yeah, you have a fair point: to what extent are you customizing a thing for only a single customer? First of all, every customer counts, to be honest. And those custom solutions were for bigger ones. They said, okay, we would like to sign a bigger contract, but we cannot because we don’t get this through legal. And then we decided to make it, but make it in a way that others can use it too, because now we can even go to the smaller customers and say there is a solution you can already use, because we built it anyway. So we do not build secret features that you only get on a certain plan.

Maciej Nowak [00:55:26]:

Super interesting stuff, building a product from within a bigger corporation. And what are the lessons learned?

Helmut Januschka [00:55:37]:

But yeah, you’re totally right. We always put it on a scale: does it make sense to fulfill that customer requirement? But I think everyone has the same question when talking to customers: how many customizations do I give him before I need to raise a bill, or sometimes before I destroy my... yeah, yeah.

Maciej Nowak [00:56:04]:

No point in supporting the product anymore. Okay, Helmut, thank you very much for this conversation. For our listeners, where can they find your product?

Helmut Januschka [00:56:16]:

They can find it on captcha.eu. On the website, you can sign up and you get 100 validations for free. So you can sign up and then go to the plugin store of your choice, either the WordPress directory, the Joomla directory, or any of the other directories that we support, install the plugin, copy-paste your keys, and you can try it on your platform. If you have any questions, you can use the on-site chat, and you can contact us through all the contact options that you find on the website. And funny side note, even the chat is GDPR compliant, because we host this on our own servers too.

Maciej Nowak [00:56:57]:

Great. Yeah. One of the few fully GDPR-compliant pages. All right, thank you very much, Helmut. It was a pleasure to have a chat today. Take care.

Helmut Januschka [00:57:10]:

Take care.

Lector [00:57:11]:

If you like what you’ve just heard, don’t forget to subscribe for more episodes. On the other hand, if you’ve got a question we haven’t answered yet, feel free to reach out to us directly. Just go to osomstudio.com/contact. Thanks for listening, and see you in the next episode of The Osom to Know podcast.
