Research Methodology
Last updated: April 2026
Transparency is core to our work. This page explains exactly how we collect, verify, and publish data. If our methodology has a weakness, we want you to see it — and we want to hear about it.
How We Collect Pricing Data
Distributed Collection Network
Our pricing data is collected by a network of automated research workers running across multiple server locations. Each worker uses diverse IP ranges (residential proxies across multiple geographic regions) to avoid sampling bias from any single network vantage point.
Workers query provider websites, industry pricing databases, and verified quote aggregators. Each data point records the source, collection timestamp, and geographic scope.
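To make this concrete, the sketch below shows roughly what a single collected price data point might look like. The field names and values are illustrative only, not our production schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative only: hypothetical field names, not our production schema.
@dataclass
class PricePoint:
    service: str             # e.g. "water heater replacement"
    city: str                # geographic scope of the observation
    price_low: float         # bottom of the observed range (USD)
    price_high: float        # top of the observed range (USD)
    source: str              # provider site, pricing database, or quote aggregator
    collected_at: datetime   # collection timestamp (UTC)
    vantage_region: str      # proxy region the worker queried from

# Example record (values are made up for illustration):
point = PricePoint(
    service="water heater replacement",
    city="Denver, CO",
    price_low=1200.0,
    price_high=2400.0,
    source="provider-website",
    collected_at=datetime.now(timezone.utc),
    vantage_region="us-mountain",
)
```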
Historical Price Archives
We extract historical pricing data from archived web pages dating back to 2013 using the Wayback Machine and other archival services. This allows us to track how service costs have changed over more than a decade — not just report current prices.
Historical data is clearly labeled with its archive date and source URL. Confidence scores for historical data reflect the age and completeness of the archived page.
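For illustration, the sketch below looks up the archived snapshot closest to a target date using the Wayback Machine's public availability API. Our internal pipeline also queries other archival services and applies the age and completeness checks described above, which are not shown here:

```python
import requests

def closest_snapshot(page_url: str, timestamp: str) -> dict | None:
    """Return the Wayback Machine snapshot closest to `timestamp` (YYYYMMDD),
    or None if the page was never archived."""
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": page_url, "timestamp": timestamp},
        timeout=10,
    )
    resp.raise_for_status()
    snap = resp.json().get("archived_snapshots", {}).get("closest")
    return snap if snap and snap.get("available") else None

# Hypothetical example: find a pricing page as it appeared around January 2013.
snapshot = closest_snapshot("example.com/pricing", "20130101")
if snapshot:
    # The data point is labeled with the archive date and source URL.
    print(snapshot["timestamp"], snapshot["url"])
```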
How We Analyze Consumer Sentiment
Multi-Platform Review Analysis
We collect consumer discussions and reviews from three primary platforms:
- Reddit — Subreddit discussions where consumers share unfiltered experiences (r/homeimprovement, r/legaladvice, r/personalfinance, and industry-specific subreddits)
- Yelp — Business reviews with star ratings and detailed experience descriptions
- Google Maps — Local business reviews with geographic specificity
Reviews are analyzed using large language models to extract recurring themes — both positive and negative — with frequency percentages. Results are presented as "X% of reviews mention [theme]" based on the actual sample analyzed.
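The frequency percentages are computed directly from the tagged sample. The sketch below shows only that counting step; the LLM tagging itself is not shown, and the example reviews are invented:

```python
from collections import Counter

def theme_frequencies(tagged_reviews: list[list[str]]) -> dict[str, float]:
    """Given the themes extracted from each review, return the share of
    reviews in the sample that mention each theme, as a percentage."""
    total = len(tagged_reviews)
    if total == 0:
        return {}
    # Count each theme at most once per review, preserving first-seen order.
    counts = Counter(t for themes in tagged_reviews for t in dict.fromkeys(themes))
    return {theme: round(100 * n / total, 1) for theme, n in counts.items()}

# Hypothetical tagging output for a three-review sample:
sample = [
    ["hidden fees", "slow response"],
    ["hidden fees"],
    ["friendly staff"],
]
print(theme_frequencies(sample))
# {'hidden fees': 66.7, 'slow response': 33.3, 'friendly staff': 33.3}
```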
How We Conduct Consumer Surveys
Ongoing On-Page Surveys
Visitors to our network of sites are invited to participate in brief, anonymous surveys about their service experiences and expectations. Surveys cover:
- Satisfaction ratings (emoji scale)
- Specific issues encountered
- Approximate amount paid
- Industry standards they believe should be mandatory
- Willingness to pay premiums for specific guarantees
Participation is voluntary. Responses are anonymous — we cannot connect a survey response to any individual. Results are aggregated by industry and city.
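As an illustration of the aggregation step, the sketch below groups hypothetical anonymous responses by industry and city. The response data is invented, and no identifying fields exist to begin with:

```python
from collections import defaultdict
from statistics import median

# Hypothetical anonymous responses: only the answers themselves are stored.
responses = [
    {"industry": "plumbing", "city": "Austin, TX", "satisfaction": 4, "amount_paid": 350},
    {"industry": "plumbing", "city": "Austin, TX", "satisfaction": 2, "amount_paid": 520},
    {"industry": "roofing",  "city": "Austin, TX", "satisfaction": 5, "amount_paid": 9800},
]

# Group by (industry, city) and report simple aggregates.
groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
for r in responses:
    groups[(r["industry"], r["city"])].append(r)

for (industry, city), rows in groups.items():
    print(
        f"{industry} / {city}: {len(rows)} responses, "
        f"median paid ${median(r['amount_paid'] for r in rows)}"
    )
```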
Confidence Scoring
Every price data point includes a confidence score from 0% to 100% based on:
- Sample size — How many data sources corroborate the price range
- Recency — How recently the data was collected or verified
- Geographic specificity — City-level data scores higher than national averages
- Source consistency — Agreement across independent sources
Data with confidence below 60% is flagged for additional verification before publication.
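As a rough illustration of how these factors could combine into a single score, here is a simplified heuristic. The weights are invented for this sketch; our production formula is more involved:

```python
def confidence_score(
    corroborating_sources: int,   # sample size
    days_since_verified: int,     # recency
    city_level: bool,             # geographic specificity
    source_agreement: float,      # 0.0-1.0 agreement across independent sources
) -> int:
    """Toy scoring heuristic: four factors combine into a 0-100 score.
    Weights are illustrative only."""
    size_part = min(corroborating_sources / 10, 1.0) * 30    # caps at 10 sources
    recency_part = max(0.0, 1 - days_since_verified / 365) * 30
    geo_part = 20 if city_level else 10
    agreement_part = source_agreement * 20
    return round(size_part + recency_part + geo_part + agreement_part)

score = confidence_score(corroborating_sources=6, days_since_verified=45,
                         city_level=True, source_agreement=0.8)
flag_for_review = score < 60   # below 60%: flagged for additional verification
print(score, flag_for_review)  # 80 False
```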
Audit Trail
Every data point in our database maintains a complete audit trail:
- When it was first collected
- From which source and by which collection method
- Every subsequent update — with old value, new value, who reported it, and why
Each trail is accessible via the "Audit Trail" link on every price table.
This ensures complete transparency and verifiability. If a price changed, you can see exactly when and why.
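Conceptually, a data point's audit trail is an ordered list of entries shaped roughly like the structure below. The field names are hypothetical, not our actual schema:

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative shape of a single audit-trail entry (hypothetical field names).
@dataclass
class AuditEntry:
    recorded_at: datetime    # when this change was recorded
    old_value: str | None    # None for the initial collection
    new_value: str
    source: str              # which source the value came from
    method: str              # collection method (worker, archive, manual report)
    reported_by: str         # who reported the change
    reason: str              # why the value changed

# The trail itself is the ordered list of entries, starting with the
# first collection and followed by every subsequent update.
AuditTrail = list[AuditEntry]
```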
Quality Gates for Published Research
Before any research article is published on our network, it must pass four independent quality checks:
- Gate 1 — Self-assessment: Does this article teach something specific? Does it cite real data? Would a real person find it valuable? (Minimum score: 70/100)
- Gate 2 — Skeptic review: Are there unsupported claims? Filler content? Generic advice that applies to everything? (Automatic rejection for slop phrases)
- Gate 3 — Data verification: Does it cite specific dollar amounts from our database? Does it reference Price-Quotes Research Lab data at least twice? (Minimum 2 data citations)
- Gate 4 — Uniqueness check: Do we already have substantially similar content? (Rejected if more than 4 keyword overlaps with existing articles)
Articles that fail any gate are rejected and logged. The rejection rate is typically 60-70% — most generated content doesn't meet our standards.
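The gates behave as an all-or-nothing check: failing any one of them rejects the article. The sketch below mirrors the thresholds listed above, with each check reduced to a precomputed field for illustration:

```python
def passes_quality_gates(article: dict) -> bool:
    """Illustrative all-or-nothing gate check; thresholds match the list above,
    but the underlying checks are simplified stand-ins."""
    gates = [
        article["self_assessment_score"] >= 70,   # Gate 1: self-assessment
        not article["has_slop_phrases"],          # Gate 2: skeptic review
        article["data_citations"] >= 2,           # Gate 3: data verification
        article["keyword_overlaps"] <= 4,         # Gate 4: uniqueness check
    ]
    return all(gates)

# Hypothetical draft that clears every gate:
draft = {
    "self_assessment_score": 82,
    "has_slop_phrases": False,
    "data_citations": 3,
    "keyword_overlaps": 2,
}
print(passes_quality_gates(draft))  # True
```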
Limitations
We believe in stating our limitations openly:
- Our pricing data represents observed ranges, not guarantees. Actual costs vary.
- Sentiment analysis reflects the experiences shared publicly — people with extreme experiences (very good or very bad) are more likely to post reviews.
- Historical data from archives may not capture every price change between snapshots.
- Our survey sample consists of visitors to our sites, which may not be perfectly representative of all consumers.
- AI-assisted analysis, while cross-referenced against multiple sources, may occasionally misinterpret context.
We continuously work to address these limitations. If you spot an error or have methodological feedback, please contact research@price-quotes.com.
Data Licensing
All published research and data are licensed under CC-BY-4.0. You may freely cite, share, and build upon our work with attribution to "Price-Quotes Research Lab" and a link to the source page.