Building an AI Policy: Is AI a Trustworthy Research Partner?
What happens when components of trust collide with Large Language Models?
AI-enabled tools may be a way to support our mission of producing practical social science for a better world, but these tools also present new risks in security, ethics, work quality, and other domains. Our team is striving to understand how this technology could improve our processes and work output, as well as the new risks and challenges it poses. As part of our process of building an AI policy, one core question we keep coming back to is this: is AI trustworthy?
Trust is a complex phenomenon that emerges from the combination of different qualities and behaviors. Chief among these are competence, reliability, transparency, and benevolence. AI tools present known, significant issues in each of these areas. This blog post explores how those issues play out in research contexts.
Competence Not Guaranteed
Competence describes our perception of another's abilities (i.e., how knowledgeable, capable, and skillful they are). The large language models (LLMs) under the hood of platforms like ChatGPT and Gemini can produce cogent summaries and editorial feedback, yet, because of the way they process text, they struggle to answer seemingly simple questions like "How many Rs are there in the word strawberry?" It's also important to keep in mind that LLMs do not check facts by default. Disclaimers like "ChatGPT is AI and can make mistakes. Check important info" and "Gemini is AI and can make mistakes" now appear automatically after entering a prompt in these tools to manage users' expectations of competence.
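To make the "strawberry" example concrete, here is a minimal Python sketch of tokenization, the text-processing step behind that failure. It assumes the open-source tiktoken package is installed; the exact splits vary by model and encoding, so treat the output as illustrative rather than definitive.

```python
# Minimal sketch of why letter-counting questions trip up LLMs: the model never
# sees individual characters, only integer token IDs for chunks of text.
# Assumes the open-source tiktoken package is installed (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an encoding used by several OpenAI models
tokens = enc.encode("strawberry")

print("Token IDs the model actually receives:", tokens)
print("Text chunks those IDs stand for:",
      [enc.decode_single_token_bytes(t).decode("utf-8") for t in tokens])
```

Because the model works with those numeric IDs rather than letters, a question about letter counts asks it to reason about information it never directly represents.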
Additionally, LLMs produce unexpected, inappropriate, or erroneous (but confidently delivered!) responses called hallucinations. Efforts to reduce hallucination can succeed in specific, anticipated situations, but researchers now believe that hallucinations are an inherent limitation of this technology and cannot be eliminated entirely.
Reliability and the Transformer Architecture
Reliability is a key component of trust. When we say someone is reliable, what we're saying is that we're confident they can deliver on their promises—that they're dependable and can meet our expectations on a regular basis. That's another thing AI struggles with.
To understand why, it helps to know a bit about how AI works. A key term here is "generative pre-trained transformer" (which, if you're wondering, is where the "GPT" in ChatGPT comes from). The term refers to the underlying technology of LLMs: a statistical model trained on large amounts of text that produces new text, one small chunk at a time, in response to a user's prompt. Because each chunk is sampled from a probability distribution over many possible continuations, using the exact same prompt twice may yield two different results.
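Here is a minimal sketch of that sampling step, using an invented four-word vocabulary and invented scores; real models choose among tens of thousands of possible tokens at every step of generation.

```python
# Minimal sketch of temperature sampling, the step that lets two runs of the
# same prompt diverge. The vocabulary and scores below are invented for
# illustration only.
import math
import random

def sample_next_token(logits, temperature=0.8):
    """Pick the index of the next token from a temperature-scaled softmax."""
    scaled = [score / temperature for score in logits]
    max_score = max(scaled)
    exps = [math.exp(score - max_score) for score in scaled]
    probs = [e / sum(exps) for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

vocab = ["reliable", "unpredictable", "helpful", "uncertain"]
logits = [2.1, 1.9, 1.7, 0.5]  # hypothetical scores for the next word

# The same "prompt" (the same scores) can yield different words on each run.
for run in (1, 2):
    print(f"Run {run}:", vocab[sample_next_token(logits)])
```

Running this twice can print different words, which mirrors what users see when an identical prompt returns a different answer.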
That's far from the standard of reliability required for a trustworthy research tool. For tasks such as brainstorming ideas, this might not be such a critical issue, but risk mitigation steps like fact-checking and human review of responses would always be required.
Transparency, Data, and Complexity
The very foundations LLMs are built on lack transparency. Like other kinds of machine learning models, LLMs are a reflection of the data used to train them, and all the choices people made while collecting, creating, and sharing that data. Training data for most publicly available LLMs includes an enormous amount of text from sources like Wikipedia, Reddit, crawled websites, and corpora of books, but the exact sources are unclear. The data used to create any machine learning model reflect both human decisions about what to include and how to use it, and the messy reality of society. Statistical models trained on human decisions may be intended to support more objective and transparent outcomes than any one person's decision-making process, but they have been shown to produce racially biased risk assessments in the criminal justice system, hiring decisions that favor men over women, and biased lending decisions.
In a research process, we expect to understand how the tools we use produce an output. A thermometer measures temperature based on explainable physical properties. A survey measures attitudes based on a person's reflections. An open coding process making sense of qualitative data can be described and repeated. However, because of the complex underlying statistical model, it's not at all clear how an LLM produced a given output. In other kinds of statistical models, it's possible to calculate how much the training data has affected the likelihood of an outcome, but this isn't possible with extremely complex LLMs. Echoing the challenge of reliability, we have no way of tracing why responses may differ from one occasion to another, or of correcting for inappropriate differences. For certain tasks, like brainstorming, this may not be a problem. For other tasks, like supporting a position with evidence, it is.
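For contrast, here is a minimal sketch, with invented data, of the kind of interpretable statistical model alluded to above, where the influence of an input on a prediction can be read off directly rather than remaining hidden inside billions of parameters.

```python
# Minimal sketch of an interpretable model: in a simple linear regression we
# can read off exactly how much each input shifts the prediction. The data
# below is invented for illustration only.
import numpy as np

hours = np.array([1, 2, 3, 4, 5], dtype=float)         # hypothetical hours of study
scores = np.array([52, 58, 61, 68, 74], dtype=float)   # hypothetical test scores

slope, intercept = np.polyfit(hours, scores, deg=1)
print(f"Each extra hour of study adds about {slope:.1f} points "
      f"(baseline {intercept:.1f}).")
# An LLM offers no comparable coefficient tying its training text to a given answer.
```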
Benevolence and Sycophancy
One of the most important aspects of trust is benevolence, or the idea that someone cares about your well-being. In the workplace, we have seen that employee perceptions of leader benevolence are associated with organizational trust (even in times of trouble). The more employees believe that leaders care about them, the more trust they place in the institution. In our research, we've also seen the key role benevolence plays in building public trust in institutions—particularly in connection with behaviors that communicate high regard for the well-being of staff, visitors, and the broader community.
Because commercially available LLMs are designed with a business goal of retaining user engagement, they display a pattern of sycophancy—that is, the tendency to agree with and flatter the user. This pattern has been widely observed, and even resulted in a technical change to ChatGPT in April 2025 to reduce "overly flattering or agreeable" behavior. Users prefer and trust sycophantic AI, even though this type of interaction distorts judgment and reduces people's willingness to take responsibility for and repair interpersonal conflicts. Researchers have noted that the models' tendency to sacrifice the truth to form agreeable responses seems to be a general behavior driven by users' preferences. Is someone who agrees with everything you say worthy of trust, or are they acting more like an enabler? Does a tool with this pattern of responses belong in a research setting, where critically engaging with and questioning ideas is a core component of the process?
Let's Put it To Work
So, is AI a trustworthy research partner? Our investigation (which is very much an ongoing thing!) indicates that this question can only be answered on a case-by-case basis. Different tasks in the process of research highlight different components of trust and different potential issues. For data analysis tasks (where we need accurate information and must be able to trace back how we arrived at an answer), the lack of competence, reliability, and transparency represents a significant barrier to trusting the output. For other kinds of tasks, like summarization, proofreading, transcription, or brainstorming, the output may be as good as or better than a person's, and it is possible to check that an AI's performance is up to standard. Even so, any output of an LLM still needs human review.
As Knology moves forward in building an AI policy in this new world of possibilities, examining what these tools can do and should do will become a key part of our work. The elements of trust apply to the research process, and our policies and procedures for using these tools will need to account for these unique issues. Engaging with the opportunities and risks is essential to ensure that any AI adoption serves our broader mission.
References
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias: Risk assessments in criminal sentencing. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems (Vol. 33, pp. 1877-1901). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
Cheng, M., Lee, C., Khadpe, P., Yu, S., Han, D., & Jurafsky, D. (2026). Sycophantic AI decreases prosocial intentions and promotes dependence. Science, 391(6792), eaec8352. https://doi.org/10.1126/science.aec8352
Cooper, Y. (2018, October 10). Amazon ditched AI recruiting tool that favored men for technical jobs. The Guardian. https://www.theguardian.com/technology/2018/oct/10/amazon-hiring-ai-gender-bias-recruiting-engine
Xu, Z., Jain, S., & Kankanhalli, M. (2024). Hallucination is inevitable: An innate limitation of Large Language Models (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2401.11817
About This Article
Want to learn more about our trust research? The best places to start are our "Trust 101" primer and our Trust Framework. From there, take a look at our work on why benevolence is key to perceptions of trustworthiness, our research into how people assess the trustworthiness of zoos and aquariums, a conversation about the role of trust in public health, some findings pertaining to the role of partnerships in trust-building, and our thoughts about how moral motives factor into considerations of trust and trustworthiness. You can also browse through a collection of outputs from our Culture of Trust project! And for more on LLMs, see our earlier post about building trustworthy AI.
Photo courtesy of Igor Omilaev @ Unsplash