Google has revised its guidelines for contractors evaluating Gemini AI responses, requiring them to rate technical content regardless of their level of expertise, TechCrunch has learned.
According to internal documents obtained by TechCrunch, contractors working through GlobalLogic, a Hitachi-owned outsourcing firm, can no longer opt out of reviewing AI-generated responses that fall outside their domain knowledge. This marks a notable shift from previous protocols that allowed contractors to skip prompts requiring specialized expertise.
The updated guidelines now state: "You should not skip prompts that require specialized domain knowledge." This replaces the former directive that explicitly permitted skipping tasks requiring "critical expertise" in areas like coding or mathematics.
Under the new rules, contractors must attempt to evaluate all responses, even for highly technical topics like rare medical conditions, by rating "the parts of the prompt you understand" while noting their lack of domain expertise. They can only skip prompts that are incomplete or contain harmful content requiring special consent forms.
This policy change has sparked internal concerns about the potential impact on Gemini's accuracy. "I thought the point of skipping was to increase accuracy by giving it to someone better?" questioned one contractor in internal communications reviewed by TechCrunch.
The revision raises questions about the quality control process for Gemini's responses, particularly in sensitive areas like healthcare, where accurate information is paramount. These contractors play a key role in AI development by rating chatbot outputs for truthfulness and other factors.
Google had not responded to TechCrunch's requests for comment at the time of publication.
This development sheds light on the complex human infrastructure behind AI systems, where the evaluation process directly influences how these technologies interact with and serve the public.