
Using Thompson Sampling to Optimize Social Media Content: A Practical Guide

Nemo · 9 min read

When you're running an AI agent that posts content for 50+ users, you face a classic optimization problem: for each user, which type of content should you post? Technical tips? Behind-the-scenes? Product announcements? Industry commentary? The answer is different for every user, and it changes over time.

Most social media tools either use fixed schedules or simple A/B testing. We use Thompson Sampling — a Bayesian approach to the multi-armed bandit problem that balances exploration and exploitation naturally. Here's how it works in practice.

The Problem: Content Mix Optimization

Imagine a user sells a developer tool. Their AI agent can generate 6 types of content:

  1. Technical tips and tutorials
  2. Product feature highlights
  3. Behind-the-scenes / building in public
  4. Industry news commentary
  5. Community engagement (replies and threads)
  6. Promotional / launch announcements

Each type has an unknown "true" engagement rate for this specific audience. We want to figure out which types work best and allocate more budget to them — but we also don't want to stop exploring other types entirely, because audience preferences change.

This is exactly the multi-armed bandit problem: each content type is an "arm" of a slot machine with an unknown payout rate.

Why Not Just A/B Test?

Traditional A/B testing has two problems for this use case:

  1. It's wasteful: A/B testing allocates 50/50 traffic to both variants, even when one is clearly better. If "technical tips" get 4x the engagement of "industry news," you're wasting half your posts on the inferior option during the test period.
  2. It doesn't scale: With 6 content types and parameters like posting time, tone, and format, you'd need hundreds of simultaneous A/B tests. The combinatorial explosion makes it impractical.

Thompson Sampling solves both problems elegantly.

How Thompson Sampling Works

The core idea is simple: maintain a probability distribution over each content type's engagement rate, and sample from those distributions to decide what to post next.

Step 1: Initialize with Beta Distributions

For each content type, we maintain a Beta distribution parameterized by (alpha, beta):

  • alpha = number of "successes" (posts that achieved above-median engagement)
  • beta = number of "failures" (posts that achieved below-median engagement)

We start with alpha=1, beta=1 for all types (uniform prior — we know nothing).

Step 2: Sample and Select

When it's time to decide what to post:

  1. For each content type, draw a random sample from its Beta(alpha, beta) distribution
  2. Select the content type with the highest sampled value
  3. Generate and publish content of that type

Step 3: Update After Observation

After the post has been live for 24 hours, check engagement:

  • If engagement was above median: alpha += 1 (record a success)
  • If engagement was below median: beta += 1 (record a failure)
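The update step is a single increment. A sketch (the `update_arm` helper and the above/below-median flag are illustrative names):

```python
def update_arm(arms, name, above_median):
    """After a post's 24-hour window, record the engagement outcome."""
    key = "alpha" if above_median else "beta"
    arms[name][key] += 1.0

arms = {"technical_tips": {"alpha": 1.0, "beta": 1.0}}
update_arm(arms, "technical_tips", above_median=True)   # one success
update_arm(arms, "technical_tips", above_median=False)  # one failure
```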

Why This Works

The beauty of Thompson Sampling is in the sampling step. Content types with high engagement rates will have distributions shifted toward 1.0, so they'll be sampled as the winner more often. But content types with fewer observations will have wider distributions — meaning they'll occasionally be sampled with high values, ensuring they get explored.

As data accumulates, the distributions narrow. The algorithm naturally transitions from exploration (trying everything) to exploitation (focusing on winners) without any manual tuning.

Real Results: Before vs. After Thompson Sampling

Here's data from one user's account. The first 2 weeks used uniform random content selection; weeks 3-6 used Thompson Sampling.

Period     | Content Selection              | Avg Engagement Rate | Follower Growth/Week
-----------|--------------------------------|---------------------|---------------------
Weeks 1-2  | Uniform random                 | 2.1%                | +15
Week 3     | Thompson Sampling (exploring)  | 2.8%                | +22
Week 4     | Thompson Sampling (converging) | 3.4%                | +29
Weeks 5-6  | Thompson Sampling (exploiting) | 4.1%                | +36

The algorithm discovered that this audience responds best to technical tips (45% of posts) and behind-the-scenes content (30%), with small allocations to community engagement (15%) and occasional product mentions (10%). Industry commentary was nearly eliminated — it consistently scored lowest.

Beyond Content Type: Multi-Dimensional Optimization

In practice, we optimize more than just content type. The Strategy Agent uses Thompson Sampling across multiple dimensions:

  • Content type: As described above
  • Posting time: 6 time slots per day, each treated as an arm
  • Tone: Casual, professional, humorous, inspirational — 4 arms
  • Format: Short post, thread, question, link share — 4 arms

Each dimension is optimized independently with its own set of Beta distributions. The full content generation prompt combines the winning samples from each dimension: "Generate a [casual] [short post] about [technical tips] to be published at [11 AM]."
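The per-dimension combination can be sketched like this. The (alpha, beta) counts, time slots, and the prompt template are made-up illustrations; only the four dimensions come from the description above.

```python
import random

# One independent bandit per dimension; (alpha, beta) values are illustrative.
dimensions = {
    "tone": {"casual": (8, 3), "professional": (5, 6),
             "humorous": (2, 4), "inspirational": (1, 5)},
    "format": {"short post": (7, 2), "thread": (4, 5),
               "question": (2, 3), "link share": (1, 6)},
    "content_type": {"technical tips": (12, 3), "behind-the-scenes": (9, 4)},
    "time_slot": {"8 AM": (3, 4), "11 AM": (6, 2), "5 PM": (4, 4)},
}

def sample_winner(arms):
    """Thompson-sample one dimension: highest Beta draw wins."""
    draws = {value: random.betavariate(a, b) for value, (a, b) in arms.items()}
    return max(draws, key=draws.get)

choices = {dim: sample_winner(arms) for dim, arms in dimensions.items()}
prompt = (f"Generate a [{choices['tone']}] [{choices['format']}] about "
          f"[{choices['content_type']}] to be published at [{choices['time_slot']}].")
```

Treating dimensions independently keeps the state small (6 + 6 + 4 + 4 = 20 arms instead of 6 x 6 x 4 x 4 = 576 combinations), at the cost of ignoring interactions between dimensions.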

Practical Considerations

Handling Non-Stationarity

Audience preferences change over time. A topic that was hot last month might be stale now. To handle this, we apply a decay factor: every 7 days, we multiply both alpha and beta by 0.9. This gradually "forgets" old data and keeps the algorithm responsive to recent trends. The concept is similar to exponential smoothing in time series analysis.
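The decay is a two-line operation. A sketch, using the 0.9 weekly factor stated above:

```python
DECAY = 0.9  # applied once every 7 days

def decay_arms(arms, factor=DECAY):
    """Shrink all counts so old evidence gradually fades."""
    for params in arms.values():
        params["alpha"] *= factor
        params["beta"] *= factor
```

Note that scaling alpha and beta by the same factor leaves each arm's estimated mean (alpha / (alpha + beta)) unchanged while widening the distribution, so decay re-opens exploration without discarding what the algorithm has learned about relative rankings.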

Cold Start Problem

For new users, we initialize the Beta distributions using aggregate data from similar users (same industry, similar audience size). This gives the algorithm a head start instead of starting from pure exploration. After 10-15 posts, the user's own data dominates.
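One way to sketch this warm start: convert the aggregate success rate of similar users into a small number of pseudo-observations. The `prior_strength` knob and the formula are assumptions for illustration; the article doesn't specify how aggregate data is weighted.

```python
def warm_start(aggregate_rate, prior_strength=5.0):
    """Initialize Beta params from an aggregate success rate of similar users.

    prior_strength is how many pseudo-observations the prior is worth.
    Kept small, the user's own data dominates after ~10-15 real posts.
    """
    return {
        "alpha": 1.0 + aggregate_rate * prior_strength,
        "beta": 1.0 + (1.0 - aggregate_rate) * prior_strength,
    }
```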

Minimum Exploration Rate

Even after convergence, we enforce a 10% minimum exploration rate — at least 1 in 10 posts will be drawn from an underexplored content type. This prevents the algorithm from getting stuck in a local optimum.
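One way to implement the floor is an epsilon-style override on top of the Thompson draw. This sketch picks the least-observed arm on forced-exploration rounds; the article doesn't specify the exact mechanism, so treat this as one plausible variant.

```python
import random

def select_with_floor(arms, epsilon=0.1):
    """Thompson selection with a 10% floor of forced exploration.

    With probability epsilon, return the arm with the fewest observations
    instead of the Thompson winner, so converged bandits keep sampling the tail.
    """
    if random.random() < epsilon:
        return min(arms, key=lambda n: arms[n]["alpha"] + arms[n]["beta"])
    draws = {n: random.betavariate(p["alpha"], p["beta"]) for n, p in arms.items()}
    return max(draws, key=draws.get)
```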

When Thompson Sampling Isn't Enough

Thompson Sampling optimizes within a fixed set of options. It can't invent new content types or recognize that the entire strategy needs rethinking. For that, we layer a separate "Strategy Agent" on top — a Gemini-powered meta-agent that reviews the full picture (engagement trends, follower growth trajectory, competitor activity) and can make structural changes to the strategy.

The combination of algorithmic optimization (Thompson Sampling for tactical decisions) and LLM reasoning (Strategy Agent for strategic decisions) is more robust than either approach alone.

If you're building something similar or want to see this optimization in action on your own social media, try BlogBurst free. The Thompson Sampling algorithm starts working from your first post.
