Research Methodology
Last updated: April 2026
Transparency is core to our work. This page explains exactly how we collect, verify, and publish data. If our methodology has a weakness, we want you to see it — and we want to hear about it.
How We Collect Pricing Data
Distributed Collection Network
Our pricing data is collected by a network of automated research workers running across multiple server locations. Each worker uses diverse IP ranges (residential proxies across multiple geographic regions) to avoid sampling bias from any single network vantage point.
Workers query provider websites, industry pricing databases, and verified quote aggregators. Each data point records the source, collection timestamp, and geographic scope.
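To make this concrete, the sketch below shows roughly what a single collected price data point might look like. The field names and values are illustrative only, not our production schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative only: hypothetical field names, not our production schema.
@dataclass
class PricePoint:
    service: str             # e.g. "water heater replacement"
    city: str                # geographic scope of the observation
    price_low: float         # bottom of the observed range (USD)
    price_high: float        # top of the observed range (USD)
    source: str              # provider site, pricing database, or quote aggregator
    collected_at: datetime   # collection timestamp (UTC)
    vantage_region: str      # proxy region the worker queried from

# Example record (values are made up for illustration):
point = PricePoint(
    service="water heater replacement",
    city="Denver, CO",
    price_low=1200.0,
    price_high=2400.0,
    source="provider-website",
    collected_at=datetime.now(timezone.utc),
    vantage_region="us-mountain",
)
```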
Historical Price Archives
We extract historical pricing data from archived web pages dating back to 2013 using the Wayback Machine and other archival services. This allows us to track how service costs have changed over more than a decade — not just report current prices.
Historical data is clearly labeled with its archive date and source URL. Confidence scores for historical data reflect the age and completeness of the archived page.
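For illustration, the sketch below looks up the archived snapshot closest to a target date using the Wayback Machine's public availability API. Our internal pipeline also queries other archival services and applies the age and completeness checks described above, which are not shown here:

```python
import requests

def closest_snapshot(page_url: str, timestamp: str) -> dict | None:
    """Return the Wayback Machine snapshot closest to `timestamp` (YYYYMMDD),
    or None if the page was never archived."""
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": page_url, "timestamp": timestamp},
        timeout=10,
    )
    resp.raise_for_status()
    snap = resp.json().get("archived_snapshots", {}).get("closest")
    return snap if snap and snap.get("available") else None

# Hypothetical example: find a pricing page as it appeared around January 2013.
snapshot = closest_snapshot("example.com/pricing", "20130101")
if snapshot:
    # The data point is labeled with the archive date and source URL.
    print(snapshot["timestamp"], snapshot["url"])
```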
How We Analyze Consumer Sentiment
Multi-Platform Review Analysis
We collect consumer discussions and reviews from three primary platforms:
- Reddit — Subreddit discussions where consumers share unfiltered experiences (r/homeimprovement, r/legaladvice, r/personalfinance, and industry-specific subreddits)
- Yelp — Business reviews with star ratings and detailed experience descriptions
- Google Maps — Local business reviews with geographic specificity
Reviews are analyzed using large language models to extract recurring themes — both positive and negative — with frequency percentages. Results are presented as "X% of reviews mention [theme]" based on the actual sample analyzed.
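The frequency percentages are computed directly from the tagged sample. The sketch below shows only that counting step; the LLM tagging itself is not shown, and the example reviews are invented:

```python
from collections import Counter

def theme_frequencies(tagged_reviews: list[list[str]]) -> dict[str, float]:
    """Given the themes extracted from each review, return the share of
    reviews in the sample that mention each theme, as a percentage."""
    total = len(tagged_reviews)
    if total == 0:
        return {}
    # Count each theme at most once per review, preserving first-seen order.
    counts = Counter(t for themes in tagged_reviews for t in dict.fromkeys(themes))
    return {theme: round(100 * n / total, 1) for theme, n in counts.items()}

# Hypothetical tagging output for a three-review sample:
sample = [
    ["hidden fees", "slow response"],
    ["hidden fees"],
    ["friendly staff"],
]
print(theme_frequencies(sample))
# {'hidden fees': 66.7, 'slow response': 33.3, 'friendly staff': 33.3}
```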
How We Conduct Consumer Surveys
Ongoing On-Page Surveys
Visitors to our network of sites are invited to participate in brief, anonymous surveys about their service experiences and expectations. Surveys cover:
- Satisfaction ratings (emoji scale)
- Specific issues encountered
- Approximate amount paid
- Industry standards they believe should be mandatory
- Willingness to pay premiums for specific guarantees
Participation is voluntary. Responses are anonymous — we cannot connect a survey response to any individual. Results are aggregated by industry and city.
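As an illustration of the aggregation step, the sketch below groups hypothetical anonymous responses by industry and city. The response data is invented, and no identifying fields exist to begin with:

```python
from collections import defaultdict
from statistics import median

# Hypothetical anonymous responses: only the answers themselves are stored.
responses = [
    {"industry": "plumbing", "city": "Austin, TX", "satisfaction": 4, "amount_paid": 350},
    {"industry": "plumbing", "city": "Austin, TX", "satisfaction": 2, "amount_paid": 520},
    {"industry": "roofing",  "city": "Austin, TX", "satisfaction": 5, "amount_paid": 9800},
]

# Group by (industry, city) and report simple aggregates.
groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
for r in responses:
    groups[(r["industry"], r["city"])].append(r)

for (industry, city), rows in groups.items():
    print(
        f"{industry} / {city}: {len(rows)} responses, "
        f"median paid ${median(r['amount_paid'] for r in rows)}"
    )
```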
Confidence Scoring
Every price data point includes a confidence score from 0% to 100% based on:
- Sample size — How many data sources corroborate the price range
- Recency — How recently the data was collected or verified
- Geographic specificity — City-level data scores higher than national averages
- Source consistency — Agreement across independent sources
Data with confidence below 60% is flagged for additional verification before publication.
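As a rough illustration of how these factors could combine into a single score, here is a simplified heuristic. The weights are invented for this sketch; our production formula is more involved:

```python
def confidence_score(
    corroborating_sources: int,   # sample size
    days_since_verified: int,     # recency
    city_level: bool,             # geographic specificity
    source_agreement: float,      # 0.0-1.0 agreement across independent sources
) -> int:
    """Toy scoring heuristic: four factors combine into a 0-100 score.
    Weights are illustrative only."""
    size_part = min(corroborating_sources / 10, 1.0) * 30    # caps at 10 sources
    recency_part = max(0.0, 1 - days_since_verified / 365) * 30
    geo_part = 20 if city_level else 10
    agreement_part = source_agreement * 20
    return round(size_part + recency_part + geo_part + agreement_part)

score = confidence_score(corroborating_sources=6, days_since_verified=45,
                         city_level=True, source_agreement=0.8)
flag_for_review = score < 60   # below 60%: flagged for additional verification
print(score, flag_for_review)  # 80 False
```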
Audit Trail
Every data point in our database maintains a complete audit trail:
- When it was first collected
- From which source and by which collection method
- Every subsequent update — with old value, new value, who reported it, and why
Each trail is accessible via the "Audit Trail" link on every price table.
This ensures complete transparency and verifiability. If a price changed, you can see exactly when and why.
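Conceptually, a data point's audit trail is an ordered list of entries shaped roughly like the structure below. The field names are hypothetical, not our actual schema:

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative shape of a single audit-trail entry (hypothetical field names).
@dataclass
class AuditEntry:
    recorded_at: datetime    # when this change was recorded
    old_value: str | None    # None for the initial collection
    new_value: str
    source: str              # which source the value came from
    method: str              # collection method (worker, archive, manual report)
    reported_by: str         # who reported the change
    reason: str              # why the value changed

# The trail itself is the ordered list of entries, starting with the
# first collection and followed by every subsequent update.
AuditTrail = list[AuditEntry]
```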
Quality Gates for Published Research
Before any research article is published on our network, it must pass four independent quality checks:
- Gate 1 — Self-assessment: Does this article teach something specific? Does it cite real data? Would a real person find it valuable? (Minimum score: 70/100)
- Gate 2 — Skeptic review: Are there unsupported claims? Filler content? Generic advice that applies to everything? (Automatic rejection for slop phrases)
- Gate 3 — Data verification: Does it cite specific dollar amounts from our database? Does it reference Price-Quotes Research Lab data at least twice? (Minimum 2 data citations)
- Gate 4 — Uniqueness check: Do we already have substantially similar content? (Rejected if more than 4 keyword overlaps with existing articles)
Articles that fail any gate are rejected and logged. The rejection rate is typically 60-70% — most generated content doesn't meet our standards.
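The gates behave as an all-or-nothing check: failing any one of them rejects the article. The sketch below mirrors the thresholds listed above, with each check reduced to a precomputed field for illustration:

```python
def passes_quality_gates(article: dict) -> bool:
    """Illustrative all-or-nothing gate check; thresholds match the list above,
    but the underlying checks are simplified stand-ins."""
    gates = [
        article["self_assessment_score"] >= 70,   # Gate 1: self-assessment
        not article["has_slop_phrases"],          # Gate 2: skeptic review
        article["data_citations"] >= 2,           # Gate 3: data verification
        article["keyword_overlaps"] <= 4,         # Gate 4: uniqueness check
    ]
    return all(gates)

# Hypothetical draft that clears every gate:
draft = {
    "self_assessment_score": 82,
    "has_slop_phrases": False,
    "data_citations": 3,
    "keyword_overlaps": 2,
}
print(passes_quality_gates(draft))  # True
```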
Limitations
We believe in stating our limitations openly:
- Our pricing data represents observed ranges, not guarantees. Actual costs vary.
- Sentiment analysis reflects the experiences shared publicly — people with extreme experiences (very good or very bad) are more likely to post reviews.
- Historical data from archives may not capture every price change between snapshots.
- Our survey sample consists of visitors to our sites, which may not be perfectly representative of all consumers.
- AI-assisted analysis, while cross-referenced against multiple sources, may occasionally misinterpret context.
We continuously work to address these limitations. If you spot an error or have methodological feedback, please contact research@price-quotes.com.
Data Licensing
All published research and data are licensed under CC-BY-4.0. You may freely cite, share, and build upon our work with attribution to "Price-Quotes Research Lab" and a link to the source page.