Systematic evaluation of ChatGPT performance in providing renewable energy information
Summary
This study evaluates how accurately ChatGPT provides general renewable energy information, comparing its responses with those of human experts through lexical and semantic similarity measures and a Gemini-based evaluation.
Key quotes
ChatGPT responses were more accurate and relevant than those of human experts’ responses on renewable energy prompts.
ChatGPT should be viewed as a supportive, not authoritative, tool, most effective when its outputs are verified through up-to-date, expert, and location-specific sources.
The research utilizes a handcrafted dataset of 63 prompts to compare AI-generated responses with human expert data. It employs Word2Vec embeddings and Gemini as an independent assessor to determine semantic alignment.
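To make the distinction between lexical and semantic similarity concrete, the following is a minimal sketch and not the study's actual pipeline. It assumes a tiny hand-made embedding table (`TOY_VECTORS`, with invented values) in place of trained Word2Vec vectors, uses Jaccard word overlap as a stand-in lexical measure, and averages word vectors before taking cosine similarity as one common way to compare two responses. All function names are hypothetical.

```python
import math

# Hypothetical 3-dimensional vectors with invented values.
# A real evaluation would load trained Word2Vec embeddings instead.
TOY_VECTORS = {
    "solar": [0.9, 0.1, 0.0],
    "panels": [0.8, 0.2, 0.1],
    "generate": [0.4, 0.4, 0.6],
    "produce": [0.5, 0.3, 0.6],
    "electricity": [0.3, 0.3, 0.8],
    "power": [0.3, 0.4, 0.7],
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def jaccard_similarity(text_a, text_b):
    """Lexical measure: shared words divided by all distinct words."""
    words_a, words_b = set(tokenize(text_a)), set(tokenize(text_b))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mean_vector(text, vectors):
    """Average the vectors of the words found in the embedding table."""
    found = [vectors[w] for w in tokenize(text) if w in vectors]
    if not found:
        return None
    return [sum(component) / len(found) for component in zip(*found)]


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_similarity(text_a, text_b, vectors=TOY_VECTORS):
    """Semantic measure: cosine similarity of the averaged word vectors."""
    vec_a, vec_b = mean_vector(text_a, vectors), mean_vector(text_b, vectors)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine_similarity(vec_a, vec_b)


if __name__ == "__main__":
    ai_answer = "solar panels generate electricity"
    expert_answer = "solar panels produce power"
    print("lexical :", round(jaccard_similarity(ai_answer, expert_answer), 3))
    print("semantic:", round(semantic_similarity(ai_answer, expert_answer), 3))
```

In this toy case the two answers share only half of their words, so the lexical score is low, while the averaged vectors point in nearly the same direction, so the semantic score is high. That gap is the reason a study of this kind would report both types of measure.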