Does gender impact the price quoted by auto mechanics?
Abstract:
This study investigated whether perceived gender, based solely on voice, influences the price quoted for a standard automotive service. Using two AI agents with distinct male-presenting and female-presenting voices but identical scripts, we collected 771 valid oil change price quotes for a 2011 Honda Accord from mechanics in more than 20 US cities. Callers intentionally did not specify an oil type, allowing mechanics to propose a service and price. Nationally, the median price quote was identical ($84.99) for both caller types. However, quotes for the female-presenting voice were higher at the 25th (+7.7%) and 75th (+5.6%) percentiles, and the minimum non-outlier price was substantially higher (+26.7%). Despite these differences, a Mann-Whitney U test found no statistically significant difference in the overall national price distributions (p=0.1205). Regional analysis, by contrast, showed considerable variation: female-presenting callers received higher median quotes in the North East (+6.3%), South East (+6.2%), and South West (+6.2%), but lower median quotes in the North Central (-8.1%) and South Central (-5.1%). These findings suggest that, although no statistically significant national bias in the overall price distribution was detected for this specific service and methodology, regional differences are substantial, and disparities may exist in specific segments of the price distribution or in the type of service implicitly recommended. The study highlights the complexity of measuring service pricing bias and the importance of considering regional context.
1. Introduction
This study aimed to isolate the effect of perceived gender, conveyed solely through voice, on initial price quotes for a common automotive maintenance task. We employed conversational AI agents to standardize the interaction, ensuring that the only significant variable distinguishing caller groups was the vocal presentation (male-presenting vs. female-presenting).
The central research question addressed is: Does the perceived gender of a caller, based on voice alone, significantly impact the price quoted by auto mechanics for a standard oil change service in the United States?
To answer this, we conducted a large-scale calling campaign targeting auto repair shops across diverse US metropolitan areas. The study utilized a controlled scenario: requesting a price quote for an oil change on a specific vehicle (2011 Honda Accord). By instructing the AI agents not to specify a preferred oil type, we allowed mechanics the opportunity to suggest a service level and price, potentially revealing biases in upselling or service recommendations based on perceived caller gender. The study hypothesized that female-presenting voices might receive higher average quotes or be implicitly steered towards more expensive service options.
2. Methodology
2.1 Data Collection Strategy
Two distinct AI conversational agents were deployed, one configured with a standard male-presenting voice and the other with a standard female-presenting voice. Both agents used identical scripts designed to elicit a price quote for an oil change for a 2011 Honda Accord. The agents obtained valid quotes from 771 unique auto repair garages across more than 20 major US metropolitan areas. The list of garages was divided quasi-randomly between the two agents, and evenly within regions, so that no shop was called by both agent personas.
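The split described above can be sketched as a shuffled, alternating assignment within each region. This is a minimal illustration, not the procedure actually used in the study: the function name, the seed, and the shop list are all hypothetical.

```python
import random
from collections import defaultdict

def split_shops_by_region(shops, seed=0):
    """Assign each shop to exactly one persona, balanced within each region.

    `shops` is a list of (shop_id, region) pairs (hypothetical format).
    Returns a dict mapping shop_id -> "male" or "female".
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_region = defaultdict(list)
    for shop_id, region in shops:
        by_region[region].append(shop_id)

    assignment = {}
    for region in sorted(by_region):
        ids = sorted(by_region[region])
        rng.shuffle(ids)
        # Randomise which persona goes first so that odd-sized regions
        # do not always give the extra shop to the same persona.
        first = rng.choice(["male", "female"])
        second = "female" if first == "male" else "male"
        for i, shop_id in enumerate(ids):
            assignment[shop_id] = first if i % 2 == 0 else second
    return assignment

# Hypothetical shop list, for illustration only.
regions = ["North East"] * 7 + ["South East"] * 10 + ["South West"] * 5
shops = [(f"shop-{i:03d}", region) for i, region in enumerate(regions)]
assignment = split_shops_by_region(shops, seed=42)
```

Because every shop ID appears exactly once in the returned mapping, no shop can be called by both personas, and the persona counts within a region differ by at most one.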
2.2 Interaction Protocol
The AI agents initiated calls and requested the price for the specified service and vehicle. A key element of the protocol was the response when asked about oil type preference: the agents were instructed to state that they had no preference and would defer to the mechanic’s recommendation. This was designed to test whether different service tiers (e.g. conventional vs. synthetic oil, potentially at different price points) would be implicitly suggested based on the caller’s perceived gender. Calls were conducted during standard business hours (Monday to Friday) over a period of six business days. Agents attempted up to three calls to each shop if unable to connect with a human representative on initial attempts. Each repeat attempt was made at least twenty-four hours after the previous attempt.
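The retry rule above (at most three attempts per shop, at least twenty-four hours apart, weekdays only) can be sketched as a small scheduling helper. This is an assumption-laden illustration: the function and constant names are hypothetical, and business hours are not modelled beyond keeping the same time of day as the previous call.

```python
from datetime import datetime, timedelta

MAX_ATTEMPTS = 3                 # up to three calls per shop
MIN_GAP = timedelta(hours=24)    # at least 24 hours between attempts

def next_attempt_time(last_attempt, attempts_made):
    """Return the earliest allowed time for the next call, or None.

    `last_attempt` is the datetime of the most recent unanswered call and
    `attempts_made` is how many calls have been placed so far.
    """
    if attempts_made >= MAX_ATTEMPTS:
        return None  # the shop has used up its attempts
    candidate = last_attempt + MIN_GAP
    # Roll forward past Saturday (5) and Sunday (6), keeping the time of day.
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate

# A call placed on a Friday morning is retried the following Monday.
friday_call = datetime(2025, 1, 3, 10, 0)
print(next_attempt_time(friday_call, attempts_made=1))
```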
3. Results
3.1 Sample Characteristics
The study successfully obtained and analyzed 771 valid oil change price quotes. These were sourced from AI agent calls using a male-presenting voice (n=412) and a female-presenting voice (n=359). The distribution of collected data points was relatively balanced across the targeted cities and US regions for both gender personas, as depicted in Figures 1 and 2.
3.2 National Price Comparison
Nationally, the median price quoted for the oil change was identical for both groups: $84.99. However, variations emerged across other distributional metrics (Figure 3):
- Median: $84.99 (Female) vs. $84.99 (Male) [Delta: $0.00, 0.0%]
- 25th Percentile (Q1): $69.95 (Female) vs. $64.96 (Male) [Delta: +$4.99, +7.7%]
- 75th Percentile (Q3): $94.99 (Female) vs. $89.99 (Male) [Delta: +$5.00, +5.6%]
- Min (Lower Whisker): $38.00 (Female) vs. $29.99 (Male) [Delta: +$8.01, +26.7%]
- Max (Upper Whisker): $129.99 (Female) vs. $124.99 (Male) [Delta: +$5.00, +4.0%]
Visually, the distribution for female callers appeared sparser at the lower end of the price range. A common pattern observed was the tendency for quotes to fall on $5 increments.
Caption: Figure 3. National distribution of oil change price quotes by perceived caller gender. Includes median, quartiles, non-outlier range, and individual data points.
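The deltas listed above can be reproduced from the reported summary statistics. The sketch below assumes the percentage is computed relative to the male-presenting quote, which is consistent with the published figures; the dictionary and function names are illustrative.

```python
# Reported national summary statistics: (female-presenting, male-presenting).
national = {
    "Median": (84.99, 84.99),
    "25th percentile (Q1)": (69.95, 64.96),
    "75th percentile (Q3)": (94.99, 89.99),
    "Min (lower whisker)": (38.00, 29.99),
    "Max (upper whisker)": (129.99, 124.99),
}

def delta(female, male):
    """Return (dollar difference, percent difference relative to male quote)."""
    diff = female - male
    return round(diff, 2), round(100 * diff / male, 1)

for name, (female, male) in national.items():
    dollars, percent = delta(female, male)
    print(f"{name}: {dollars:+.2f} ({percent:+.1f}%)")
```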
3.3 National Statistical Significance
A Mann-Whitney U test compared the overall distributions of prices quoted to male- and female-presenting callers nationally. The test yielded U = 78740.00 and p = 0.1205. As p > 0.05, the difference between the national price distributions for the two groups was not statistically significant.
Caption: Figure 4. Mann-Whitney U test p-value shown.
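For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using the tie-corrected normal approximation with a continuity correction. The study does not state which implementation or options were used, so this should not be expected to reproduce the reported U and p exactly, and the quotes in the example are made up for illustration.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2     # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        tie_term += t**3 - t
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0                # all values identical: no evidence either way
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p = math.erfc(z / math.sqrt(2))    # two-sided tail probability
    return u_x, p

# Hypothetical quotes, for illustration only (not study data).
male = [59.99, 64.99, 79.99, 84.99, 84.99, 89.99, 99.99]
female = [69.99, 74.99, 84.99, 84.99, 94.99, 99.99, 109.99]
u_f, p_f = mann_whitney_u(female, male)
u_m, p_m = mann_whitney_u(male, female)
print(u_f, u_m, round(p_f, 4))
```

With the study's group sizes (412 and 359), U can range from 0 to 412 × 359 = 147,908, and its expected value when the two distributions are identical is 73,954.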
3.4 Regional Price Variations
Analysis stratified by US region revealed considerable heterogeneity in pricing patterns (Figure 5):
- North Central: Median quotes were lower for female callers (-$6.16, -8.1%). Q1 (-6.5%) and Q3 (-6.9%) were also lower. (n=61 Female, n=72 Male)
- North East: Median quotes were higher for female callers (+$5.01, +6.3%). Q1 (+5.4%) and Q3 (+7.8%) were also higher. (n=68 Female, n=77 Male)
- South Central: Median quotes were lower for female callers (-$4.00, -5.1%). Q1 (-0.4%) and Q3 (-0.7%) showed minimal difference. (n=36 Female, n=37 Male)
- South East: Median quotes were higher for female callers (+$5.00, +6.2%). Q1 (+16.7%) and Q3 (+9.4%) were also notably higher. (n=113 Female, n=116 Male)
- South West: Median quotes were higher for female callers (+$4.99, +6.2%). Q1 (+1.4%) and Q3 (+1.5%) showed smaller positive differences. (n=81 Female, n=110 Male)
Caption: Figure 5. Distribution of oil change price quotes by perceived caller gender across different US regions.
These results indicate that any potential gender-based pricing disparity is not uniform across the country and appears to depend on regional market factors.
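The regional breakdown can be sketched as a group-by on region and persona followed by a comparison of medians. The per-region sample sizes below are the ones reported above; the record format, the function name, and the toy rows are hypothetical and are not study data.

```python
from collections import defaultdict
from statistics import median

# Per-region sample sizes as reported above: (female, male).
REGION_N = {
    "North Central": (61, 72),
    "North East": (68, 77),
    "South Central": (36, 37),
    "South East": (113, 116),
    "South West": (81, 110),
}
total_female = sum(f for f, _ in REGION_N.values())
total_male = sum(m for _, m in REGION_N.values())
print(total_female, total_male, total_female + total_male)

def regional_median_deltas(records):
    """Compare female vs. male median quotes within each region.

    `records` is an iterable of (region, persona, price) tuples.
    Returns {region: (dollar delta, percent delta relative to male median)}.
    """
    prices = defaultdict(lambda: {"female": [], "male": []})
    for region, persona, price in records:
        prices[region][persona].append(price)

    deltas = {}
    for region, groups in prices.items():
        if not groups["female"] or not groups["male"]:
            continue  # a region needs quotes from both personas to compare
        f_med, m_med = median(groups["female"]), median(groups["male"])
        diff = f_med - m_med
        deltas[region] = (round(diff, 2), round(100 * diff / m_med, 1))
    return deltas

# Made-up rows for two placeholder regions, for illustration only.
toy = [
    ("Region A", "female", 70.0), ("Region A", "female", 80.0), ("Region A", "female", 90.0),
    ("Region A", "male", 60.0), ("Region A", "male", 75.0), ("Region A", "male", 100.0),
    ("Region B", "female", 50.0), ("Region B", "female", 55.0),
    ("Region B", "male", 60.0), ("Region B", "male", 70.0),
]
deltas = regional_median_deltas(toy)
print(deltas)
```

The regional sample sizes sum to the national group sizes (359 female, 412 male, 771 in total), which confirms that the five regions account for every valid quote.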
4. Discussion
This study provides empirical data on potential gender-based differences in automotive service quoting, using AI to isolate the effect of voice. The findings present a nuanced picture. Nationally, the identical median quote suggests parity at the central tendency of pricing. However, the higher quotes for female-presenting callers at both the 25th and 75th percentiles, along with a substantially higher minimum non-outlier price, indicate that female callers might be at a disadvantage in accessing the lowest available prices and might land at the upper end of the typical price range more often than male callers.
Despite these observed differences in distribution shape, the Mann-Whitney U test did not find the overall national distributions to be statistically significantly different. This could imply that while subtle differences exist, they are not pronounced enough across the entire distribution to reach statistical significance with the current sample size nationally, or that the variations effectively cancel out when aggregated nationwide.
The most striking finding is the substantial regional variation. The opposing trends observed (e.g., lower prices for women in the North Central vs. higher prices in the North East and South East) suggest that local market conditions, competition levels, dominant business types (e.g., dealerships vs. independent shops), or even regional cultural factors may matter more for price quotes than the perceived gender of the caller alone. Alternatively, these factors may interact with gender perception.
The methodological choice to have callers defer on oil type specification introduces a potential confounding variable. The observed price differences, particularly the higher minimum price for women nationally and higher prices in certain regions, could stem from female callers being more frequently quoted for or steered towards higher-margin services (like synthetic oil changes) without explicit discussion of cheaper alternatives (like conventional oil).
5. Conclusion
Returning to the everyday version of our research question, “Do women get taken advantage of at the car mechanic?”, this study, based on AI-driven oil change quotes, suggests the answer is, well, complex and context-dependent.
While we did not find statistically significant evidence of a nationwide difference in the overall distribution of price quotes based solely on perceived gender via voice for this specific task, we observed potentially disadvantageous patterns for female-presenting callers in national quartile and minimum pricing. Crucially, substantial regional variations exist, with female callers receiving higher median quotes in some areas (North East, South East, South West) and lower quotes in others (North Central, South Central).
The primary implication is that while blatant, widespread national price discrimination based solely on voice for a simple oil change quote may not be occurring, or may not be detectable with this methodology, subtle differences and sizable regional disparities warrant attention. Local market dynamics appear highly influential. So always do some independent research before trusting the first quote you get over the phone.
Technological Demonstration and Time Efficiency
Beyond addressing the primary research question, this study served as a practical demonstration of our conversational AI capabilities for large-scale, standardized data collection. The calling campaigns for the male and female agents together involved over 1,464 minutes of conversation with mechanics, across more than 1,400 call attempts to over 1,000 different locations, and resulted in information being collected, analyzed, and cataloged from more than 770 garages.
We previously estimated that manually conducting this volume of calls and the associated data logging would require 90 or more person-hours of direct work and at least two people to place the calls. Done manually, the project would likely extend over several weeks to obtain the volume of data that we collected in just a few days. The benefits included not only the dramatic acceleration of the project timeline through parallelism but also consistent interaction protocols, inherent scalability, automated data management that reduced manual effort, and lower overhead than coordinating an equivalent human effort. The speed advantage was a particularly compelling outcome of using this technology for research data collection at scale.
Want your own study done? Shoot us an email at hello@cobbery.com.