Q&A: In ChatGPT we trust?

Generative AI like ChatGPT may one day make online search engines easier to use, but for now users need to fact-check AI’s responses, according to Penn State researchers

UNIVERSITY PARK, Pa. — A real human wrote this article, albeit with the help of transcription software. ChatGPT, or another large language model, probably would have composed it much more quickly, but artificial intelligence (AI) systems are susceptible to hallucinating, or generating incorrect information, so could you trust the results?

The accuracy of generative AI systems matters, especially as more people use AI to search for answers online and as search engines incorporate AI into their systems. Penn State News spoke with S. Shyam Sundar, the James P. Jimirro Professor of Media Effects at Penn State and director of the Center for Socially Responsible Artificial Intelligence, and graduate student Yongnam Jung about their research into what makes people trust ChatGPT and other online information sources, and the potential future of AI and online search engines.

Q: Are people using ChatGPT as a search engine?

Sundar: Anecdotal evidence suggests that people are turning to ChatGPT for a first response, where previously they used Google search. For example, two New York lawyers used ChatGPT when compiling a brief for a case, and the judge later found that the precedents that ChatGPT cited were bogus. My lab conducted a very small, preliminary study that did not find any data to support those anecdotes. Our participants tended mostly to use Google first, followed by Wikipedia, but these were mostly people in higher education who have been bombarded with information the last couple of years about the shortcomings of generative AI. So, it's clearly not a representative sample. Our interest is in finding out which features of ChatGPT, Google search and Wikipedia make a user prone to trust the platforms.

Jung: Our study participants indicated that they use ChatGPT for specific use cases, such as to improve their writing or to refer to a specific format, like a resume. They also use it to search for information, but they don’t trust the results. Previous studies and news articles have suggested that users sometimes show blind trust in ChatGPT, but our focus group interviews suggested that this blind trust is not always the case. Our participants said that they use ChatGPT to search for information, but they are skeptical about the results because they don’t include reference information like Wikipedia and Google do.

Q: How does ChatGPT compare to Google search and Wikipedia?

Sundar: The primary difference is the conversationality of ChatGPT: the fact that it is a chat interface that goes back and forth in response to your specific question. Every message it gives you is contingent upon what you put in and upon what you put in before that. In this respect, it seems very much like a butler serving you. The more it knows you, the more it personalizes the information for you, and then it gives you exactly what you want by pointing specifically to your question, whereas Google might just return a whole bunch of results based on a keyword match. People prompt-engineer ChatGPT to be their language buddy or to be their companion. It seems intuitive and authoritative, like it knows what it's talking about. The responses are well-organized. All these features can make it seem more trustworthy. But what users often don't realize is that ChatGPT gives generic, generally applicable answers.

Jung: Participants said they trust the platforms for different reasons. They really like that Google provides diverse search results. They also like features like labeling sponsored ads because it demonstrates that Google is trying to be transparent. For Wikipedia, participants liked the edit function because the fact that anybody can edit an entry means that when something is wrong, somebody will correct it. That’s why they trust information from Wikipedia. Regarding ChatGPT, they really like the interactive features, that they can have a human-like conversation with ChatGPT, which increases trust in the system.

Q: We’re starting to see online search engines incorporate generative AI into their results. Where do you see the future of search engines and AI headed?

Sundar: Most search engines, information providers, chatbots and customer service agents have adopted large language modeling technology to improve the handoff of information to users. They have improved usability, so information comes across as much more conversational and chattier than the traditional way of delivering information. From a communication perspective, large language models (LLMs) have revolutionized all these different technologies in terms of showing them the path toward better communication with users and more focused, conversational interaction. To that end, we’ve come much closer to the idea of AI as another entity that you could ask questions just like you would a human being.

To go through Google’s search engine output, or even Wikipedia’s output, you need to have a certain level of skill to derive usable insights. When these search engines incorporate LLMs, they cut through the clutter for users, who no longer need a special skill or an extra step to derive intelligence from the output. Instead, it can basically tell me what I need to know. That said, LLMs are known to hallucinate. They are not known to be particularly factual because they are based on the probability of occurrence of the next word or sentence in the history of human-generated text. Search engine technology, on the other hand, is based on information retrieval. It’s querying databases, searching online and scraping information. Marrying these two has the promise of overcoming the deficits of each: On the one hand, LLMs overcome the deficit of traditional search engines in terms of their ability to have a conversation with users, while on the other hand, search engines overcome the problem that LLMs have of hallucinating by providing verified information with references or links.

Q: What are some best practices to keep in mind when using ChatGPT to find specific, accurate answers?

Sundar: Users have to be much more systematic in the way they process the information. They have to evaluate the information for the central message. Is it specific enough to my question, or does it seem very general? Often, just because people ask a specific question, they think the response is also specific, but actually it’s a very generic response. They need to see if the output is something super-specific to their situation, and if it is, they need to cross-verify it with another information source. Ideally, they would do that with a non-LLM technology. For example, if I get an output from ChatGPT, I can go to the classic Google search engine to see if I get something similar.

Users also should be thinking about the authenticity of information. To what extent is the information based on well-sourced data from credible information sources versus stringing together words? The lawyers in the New York case could have gone to LexisNexis, which is a database of court cases, to see if there was a specific case by that name and learn more about it. Often people use AI like ChatGPT in a hurry, and that's the danger. People rush to obtain information that may not be fully vetted by the users. The responses can also have baked-in biases that we may not realize.

Jung: People need to better understand how generative AI works. Even though an AI model may refer to a diverse dataset to respond to questions, unlike a search engine or Wikipedia, which pull existing information, generative AI creates new information. This information may not always be true or current.

Since generative AI provides diverse interactions, if you ask questions and you’re still not getting a clear answer, you can refine your prompts to get more specific answers. Just make sure you verify that answer using a search engine or another platform.

Last Updated July 10, 2024
