🇬🇧🇺🇸 SpamLabs: Back to the drawing board

The initial outreach to gather feedback on an API that validates email addresses using several kinds of analysis (DNS analysis being the main one) didn't turn out the way I hoped. I sent about 40 cold emails to gauge interest in such a service. The sample is pretty small, but I ran out of ideas for finding more prospects. And that was while targeting only the "ideal clients" I thought had the best chance of needing such a service: newsletters, communities and SaaS businesses that offer a free tier.

The responses were mostly negative, such as:

  • "not a problem for us"
  • "our platform already handles it pretty well"
  • "we are not big enough to care"
  • "we handle it manually pretty well, the volume is not that high to justify a dedicated solution"

I asked the same questions and pushed the same pitch in some other dedicated communities, such as Shopify shop owners and Discord server admins. The responses were negative again, but a big part of them pointed to a need I had no idea existed: getting blasted by cold emails trying to sell all kinds of unsolicited services, mostly SEO.

Then a light bulb popped over my head: what if, instead of targeting the spam senders, we target the spam/unsolicited content itself, mainly cold email outreach?

I spent some time reading about it, and according to some generic numbers, out of 100 cold emails sent, about 10 are opened and 1 gets a positive response. That means at least 90 of those cold emails will annoy 90 different people, especially those with a work email.

The slight pivot was to change the direction of SpamLabs from an email address validation API to an email filter that identifies cold emails and marks them as spam.

I integrated ChatGPT with some algorithmic checks, put together about 50 spam/not-spam examples from my own inboxes, and tested the service. I got about a 60% success rate, which is too low for me to feel comfortable releasing it.
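For anyone curious what "success rate" means here, this is a minimal sketch of how it could be measured over a labeled set. It is not the actual SpamLabs code: `success_rate`, `keyword_classifier` and the five sample emails are hypothetical stand-ins, and the real classifier (ChatGPT plus the algorithmic checks) would be plugged in where the keyword function is.

```python
from typing import Callable, Iterable, Tuple


def success_rate(
    classify: Callable[[str], bool],
    examples: Iterable[Tuple[str, bool]],
) -> float:
    """Return the fraction of labeled examples the classifier gets right.

    Each example is (email_text, is_cold_email).
    """
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, is_cold in examples if classify(text) == is_cold)
    return correct / len(examples)


def keyword_classifier(text: str) -> bool:
    # Hypothetical placeholder for the real classifier: flags an email
    # as cold outreach if it contains one of a few telltale phrases.
    lowered = text.lower()
    return any(phrase in lowered for phrase in ("seo", "quick call", "boost your"))


# Made-up examples, standing in for the ~50 hand-labeled emails.
LABELED = [
    ("We can boost your SEO rankings in 30 days", True),
    ("Do you have time for a quick call this week?", True),
    ("These games from your wishlist are on sale", False),
    ("Your invoice for March is attached", False),
    ("I noticed your store and would love to help you grow", True),
]

print(f"success rate: {success_rate(keyword_classifier, LABELED):.0%}")
# prints "success rate: 80%"
```

With only around 50 examples, a single email is worth about 2 percentage points, so the number will move a lot between runs of a classifier that isn't deterministic.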

So right now I'm going back to the drawing board to try to get the success rate to at least 90% before I release it. It is surprisingly hard to tell cold emails apart from promotions that you subscribed to. For example, it's hard to differentiate Steam's "These games from your wishlist are on sale" from a cold email: it tries to sell you stuff, and it's hard to pinpoint whether it is legit without whitelisting everything that comes from @steampowered.com.
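The whitelisting fallback mentioned above could look roughly like the sketch below. The names `ALLOWED_DOMAINS`, `sender_domain` and `is_allowlisted` are hypothetical, and it only looks at the `From` header, which a sender can spoof, so treat it as an illustration of the idea and not as a safe filter on its own.

```python
from email.utils import parseaddr

# Hypothetical allowlist; in practice it would be built per user.
ALLOWED_DOMAINS = {"steampowered.com"}


def sender_domain(from_header: str) -> str:
    """Extract the lowercase domain from a From header value."""
    _, address = parseaddr(from_header)
    # rpartition returns the part after the last "@" (or "" if empty).
    return address.rpartition("@")[2].lower()


def is_allowlisted(from_header: str, allowed=ALLOWED_DOMAINS) -> bool:
    """True if the sender's domain is an allowed domain or a subdomain of one."""
    domain = sender_domain(from_header)
    return any(domain == d or domain.endswith("." + d) for d in allowed)


print(is_allowlisted("Steam <noreply@steampowered.com>"))      # True
print(is_allowlisted("SEO Experts <sales@seo-deals.example>"))  # False
```

The subdomain check matches on a leading dot, so `steampowered.com.evil.example` does not slip through. The downside is the one already described: every legit sender has to be listed by hand.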

Gotta keep grinding on these algorithms, because ChatGPT doesn't seem reliable enough to say accurately whether an email is a cold email or not (being a statistical language model, it can change its answer for the same input). I will explore its fine-tuning capabilities to get a model that performs well enough for this task, and then iterate from there.
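One cheap way to see how much the answer flips is to ask the same question several times and look at the agreement. The sketch below is not something I've shipped: `majority_vote` and `make_flaky_classifier` are hypothetical helpers, and the flaky classifier just replays a fixed list of answers to imitate a model that changes its mind.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def majority_vote(
    classify: Callable[[str], bool], text: str, runs: int = 5
) -> Tuple[bool, float]:
    """Classify the same text `runs` times.

    Returns the most common answer and the share of runs that agreed with it.
    Use an odd number of runs to avoid ties.
    """
    votes = Counter(classify(text) for _ in range(runs))
    label, count = votes.most_common(1)[0]
    return label, count / runs


def make_flaky_classifier(answers: Iterable[bool]) -> Callable[[str], bool]:
    # Hypothetical stand-in for a model call: ignores the text and
    # replays the given answers one by one.
    it = iter(answers)
    return lambda text: next(it)


flaky = make_flaky_classifier([True, False, True, True, False])
label, agreement = majority_vote(flaky, "We can boost your SEO rankings")
print(label, agreement)  # prints "True 0.6"
```

Low agreement on an email is a signal in itself: those are the borderline cases worth adding to the labeled set. The cost is that every email needs several model calls instead of one.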

Cheers! Hit me with any question/curiosity you have, let's discuss!