Anti-Detection Playbook

How to avoid CAPTCHAs and bot detection in LumiTeh Sessions

Resolving Bot Detection Issues

Bot detection is a common challenge for web agents. This guide outlines strategies to bypass these mechanisms using LumiTeh’s stealth configuration options.

Common Bot Detection Challenges:

  • Accessing e-commerce sites with anti-bot measures

  • Scraping content from news or social media platforms

  • Interacting with banking or financial websites

  • Accessing sites with geographic restrictions

Stealth Configuration Strategies

1. Proxy Configuration

Proxies are one of the most effective ways to bypass bot detection. Different proxy configurations can help your agents appear as legitimate traffic from various locations.

Using Default Proxies

Enable LumiTeh’s built-in residential proxies for better anonymity:

from lumiteh_sdk import LumiTehClient

lumiteh = LumiTehClient()

# Start a session with built-in proxies
with lumiteh.Session(proxies=True) as session:
    _ = session.observe(url="https://www.lumiteh.io/")

Country-Specific Proxies

For sites with geographic restrictions, use proxies from specific countries:

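from lumiteh_sdk.types import LumiTehProxy

# Proxy that exits in France; pass it to a session as a list,
# e.g. lumiteh.Session(proxies=proxies)
proxies = [LumiTehProxy.from_country("fr")]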

2. Browser Type Selection

Different browsers have varying levels of detection resistance. Experiment with different browser types for your specific use case:

from lumiteh_sdk import LumiTehClient

lumiteh = LumiTehClient()

# Try different browser types
browsers = ["chromium", "chrome", "firefox"]
for browser in browsers:
    with lumiteh.Session(
        browser_type=browser,
        proxies=True,
        solve_captchas=True
    ) as session:
        result = session.observe(url="https://example.com")
        print(f"Success with {browser}")

Note: chromium is the default browser type but is the most easily detected.
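
The loop above stops at the first exception, so a blocked browser type prevents the remaining ones from being tried. A minimal sketch of a fallback loop follows. It assumes you supply `try_browser`, a hypothetical callable that would open a `lumiteh.Session(browser_type=...)` and raise when the page is blocked. Catching the broad `Exception` is also an assumption, since this guide does not name the SDK's error types. A stand-in function replaces the real session call so the control flow can be run on its own.

```python
from typing import Callable, Iterable, Optional


def first_working_browser(
    try_browser: Callable[[str], object],
    browsers: Iterable[str] = ("chrome", "firefox", "chromium"),
) -> Optional[str]:
    """Return the first browser type for which try_browser() does not raise.

    try_browser is a hypothetical callable supplied by the caller; it should
    raise an exception when the page is blocked or the session fails.
    """
    for browser in browsers:
        try:
            try_browser(browser)
        except Exception as exc:  # assumption: SDK error types are not specified here
            print(f"{browser} failed: {exc}")
            continue
        print(f"Success with {browser}")
        return browser
    return None


# Stand-in for a real session call, used only to demonstrate the control flow.
def fake_attempt(browser: str) -> None:
    if browser != "firefox":
        raise RuntimeError("blocked")


print(first_working_browser(fake_attempt))
```

The default order tries chromium last because, per the note above, it is the most easily detected.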


3. CAPTCHA Solving

Enable automatic CAPTCHA solving for sites that use these challenges:

from lumiteh_sdk import LumiTehClient

lumiteh = LumiTehClient()
with lumiteh.Session(
    solve_captchas=True,
    browser_type="firefox",
    headless=False,
) as session:
    # Navigate to a page with a CAPTCHA
    agent = lumiteh.Agent(session=session, max_steps=5)
    resp = agent.run(
        task=(
            "Try to solve the CAPTCHA using internal tools. "
            "If you fail, try to solve it manually."
        ),
        url="https://www.google.com/recaptcha/api2/demo"
    )
    print(resp.answer)

Complete Stealth Configuration Example

Here’s a comprehensive example combining all stealth techniques:

from lumiteh_sdk import LumiTehClient
from lumiteh_sdk.types import LumiTehProxy

lumiteh = LumiTehClient()

# Example stealth configuration
# This is just one possible setup; rotating these values can improve success rates
stealth_config = {
    "solve_captchas": True,
    "proxies": [LumiTehProxy.from_country("us")],
    "browser_type": "chrome",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "viewport_width": 1920,
    "viewport_height": 1080,
}

# Try the stealth configuration
with lumiteh.Session(**stealth_config) as session:
    result = session.observe(url="https://example.com")
    print("Success with stealth configuration")
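
The comment in the example mentions rotating these values. One possible way to do that is to enumerate combinations from small pools and pick one per session, as sketched below. The pools and the `proxy_country` key are illustrative assumptions: `proxy_country` is a placeholder that you would convert to `"proxies": [LumiTehProxy.from_country(country)]` before passing the dict to `lumiteh.Session`. The sketch leaves out `user_agent`, because a custom user agent should match the chosen browser type.

```python
import itertools
import random

# Illustrative pools; the values are assumptions, not recommendations.
BROWSER_TYPES = ["chrome", "firefox"]
VIEWPORTS = [(1920, 1080), (1536, 864), (1366, 768)]
PROXY_COUNTRIES = ["us", "fr"]


def build_configs():
    """Return every combination of the pools as plain dicts."""
    configs = []
    for browser, (width, height), country in itertools.product(
        BROWSER_TYPES, VIEWPORTS, PROXY_COUNTRIES
    ):
        configs.append({
            "solve_captchas": True,
            "browser_type": browser,
            "viewport_width": width,
            "viewport_height": height,
            # Placeholder key: replace with
            # "proxies": [LumiTehProxy.from_country(country)]
            # before calling lumiteh.Session(**config).
            "proxy_country": country,
        })
    return configs


configs = build_configs()
rng = random.Random(0)  # seeded only so the demo is repeatable
choice = rng.choice(configs)
print(len(configs), "configurations available; picked:", choice)
```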

Troubleshooting Tips

  1. Start Simple. Begin with basic configurations and gradually add complexity:

    • Try proxies=True first

    • Add solve_captchas=True if needed

    • Experiment with different browser_type values

    • Add custom user_agent if still detected

  2. Test Incrementally. Test each configuration change individually to identify what works.

  3. Monitor for Patterns. Keep track of which configurations work for different types of sites:

    • E-commerce sites often respond well to residential proxies

    • Social media sites may require specific user agents

    • Banking sites may need country-specific proxies
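
The first two tips can be sketched as a "ladder" in which each rung adds exactly one option to the previous one, so the first rung that works is also the simplest. This is a sketch under assumptions: `attempt` is a hypothetical callable that you would implement by opening a session with the given options and returning `True` when the page loads without being blocked. The `firefox` value and the user agent string are placeholders.

```python
from typing import Callable, Dict, List, Optional

# Each step adds one option, in the order suggested under "Start Simple".
STEPS: List[Dict[str, object]] = [
    {"proxies": True},
    {"solve_captchas": True},
    {"browser_type": "firefox"},  # example value; try others as well
    {"user_agent": "<your custom user agent>"},  # placeholder, fill in your own
]


def config_ladder(steps: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Build cumulative configurations: rung N contains steps 0..N."""
    ladder: List[Dict[str, object]] = []
    current: Dict[str, object] = {}
    for step in steps:
        current = {**current, **step}  # new dict each time, rungs stay independent
        ladder.append(current)
    return ladder


def simplest_working_config(
    attempt: Callable[[Dict[str, object]], bool],
    steps: List[Dict[str, object]] = STEPS,
) -> Optional[Dict[str, object]]:
    """Return the first (simplest) rung for which attempt() returns True."""
    for config in config_ladder(steps):
        if attempt(config):
            return config
    return None


# Demo with a stand-in check: pretend only CAPTCHA-solving sessions get through.
found = simplest_working_config(lambda cfg: bool(cfg.get("solve_captchas", False)))
print(found)
```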

Best Practices

  1. Rotate Configurations: Don’t rely on a single configuration; an unchanging setup makes your traffic easier to track

  2. Monitor Success Rates: Verify which configurations work best for different site types

  3. Respect Rate Limits: Implement delays between requests to avoid triggering rate limiting

  4. Keep Configurations Updated: Bot detection methods evolve, so regularly test and update your configurations
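
Practices 2 and 3 can be sketched with a small in-memory tracker and a randomized delay. All names here (`SuccessTracker`, `polite_delay`, the site-type and configuration labels) are hypothetical, and the delay bounds are arbitrary defaults rather than values recommended by LumiTeh.

```python
import random
import time
from collections import defaultdict


class SuccessTracker:
    """Count attempts and successes per (site type, configuration label)."""

    def __init__(self):
        self._stats = defaultdict(lambda: [0, 0])  # key -> [successes, attempts]

    def record(self, site_type: str, config_label: str, ok: bool) -> None:
        entry = self._stats[(site_type, config_label)]
        entry[0] += int(ok)
        entry[1] += 1

    def rate(self, site_type: str, config_label: str) -> float:
        successes, attempts = self._stats.get((site_type, config_label), (0, 0))
        return successes / attempts if attempts else 0.0

    def best(self, site_type: str):
        """Return the label with the highest success rate for site_type, or None."""
        labels = [label for (site, label) in self._stats if site == site_type]
        return max(labels, key=lambda label: self.rate(site_type, label), default=None)


def polite_delay(min_seconds: float = 1.0, max_seconds: float = 3.0) -> float:
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


tracker = SuccessTracker()
tracker.record("e-commerce", "residential-proxy", True)
tracker.record("e-commerce", "residential-proxy", True)
tracker.record("e-commerce", "no-proxy", False)
print(tracker.best("e-commerce"), tracker.rate("e-commerce", "residential-proxy"))
```

Calling `polite_delay()` between requests spaces them out at irregular intervals, and `tracker.best(...)` gives a starting configuration the next time you target the same kind of site.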
