The first rigorous studies of AI's impact on productivity have now been published, and they tell a more nuanced story than either AI optimists or skeptics typically acknowledge. For healthcare organizations contemplating AI adoption, the findings from adjacent industries offer a provisional but useful map of where value is likely to emerge and what conditions determine whether it does.

This analysis synthesizes findings from published research on AI-assisted work, with particular attention to the patterns most relevant to healthcare, insurance, and member-driven organizations. It does not make predictions about specific tools or vendors. Its aim is to characterize what the evidence shows, note where findings are preliminary, and identify the operational implications most relevant to healthcare leaders.

What the Research Shows

Several studies published between 2022 and 2024 provide the most rigorous evidence available on AI's productivity effects in knowledge work environments. One appeared in a peer-reviewed journal; the others are cited here as a working paper and a preprint (see Citations & Sources). The findings are broadly consistent across different study designs, sectors, and task types.

A widely cited study by Noy and Zhang, published in Science in 2023, examined the effect of AI writing assistance on a sample of college-educated professionals performing realistic writing tasks. Participants who used AI assistance completed tasks 11 minutes faster on average, a 40 percent reduction in time, and produced output that independent evaluators rated as higher quality. Critically, the largest gains accrued to workers who started with lower baseline performance, a finding that recurs across multiple studies.

Key Finding

In a controlled study of AI-assisted writing tasks, participants using AI completed work 40 percent faster and received higher quality ratings from independent evaluators — with the largest gains among lower-performing workers (Noy & Zhang, Science, 2023).
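As a back-of-envelope check, the two figures reported above (11 minutes saved, a 40 percent reduction in time) together imply an average unassisted task time. The Python sketch below works that out. It assumes both figures describe the same average, and the function and variable names are illustrative rather than taken from the study.

```python
def implied_baseline_minutes(minutes_saved: float, fractional_reduction: float) -> float:
    """Unassisted task time implied by an absolute and a relative time saving."""
    if not 0 < fractional_reduction < 1:
        raise ValueError("fractional_reduction must be strictly between 0 and 1")
    return minutes_saved / fractional_reduction


# Figures as quoted in the text above (Noy & Zhang, 2023).
MINUTES_SAVED = 11.0
FRACTIONAL_REDUCTION = 0.40

# Assumption: both figures refer to the same average task time.
baseline = implied_baseline_minutes(MINUTES_SAVED, FRACTIONAL_REDUCTION)
assisted = baseline - MINUTES_SAVED

print(f"Implied unassisted time: {baseline:.1f} minutes")
print(f"Implied assisted time:   {assisted:.1f} minutes")
```

On these assumptions the tasks took roughly 27.5 minutes unassisted and 16.5 minutes with AI assistance, which gives a sense of the task scale the 40 percent figure applies to.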

A second major study, by Brynjolfsson, Li, and Raymond, examined AI adoption in a large technology company's customer support operations. Customer service agents provided with an AI assistant that suggested responses handled 14 percent more issues per hour, with quality improvements concentrated among newer and lower-skilled workers. The most experienced and highest-performing agents saw essentially no benefit — and in some cases saw modest performance declines, possibly because the AI suggestions disrupted established workflows.

A study of GitHub Copilot, an AI coding assistant, found that developers using the tool completed an assigned coding task about 55 percent faster than those without access (Peng et al., 2023). Again, the effect appeared stronger for less experienced developers. The likely explanation is that AI surfaces knowledge and suggests approaches that experienced workers already possess, so it adds little for them while providing meaningful uplift to those still developing expertise.
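The three headline figures above are not on the same scale: two are reductions in time per task, while the customer support result is an increase in issues handled per hour. A minimal Python sketch of the conversion follows. It assumes that throughput is simply the inverse of time per task and reads "55 percent faster" as 55 percent less time. Both are simplifying assumptions, and the function names and scale labels are illustrative.

```python
def time_reduction_from_throughput_gain(gain: float) -> float:
    """Fractional cut in time per task implied by a fractional throughput gain."""
    if gain <= -1:
        raise ValueError("gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + gain)


def throughput_gain_from_time_reduction(reduction: float) -> float:
    """Fractional throughput gain implied by a fractional cut in time per task."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction) - 1.0


# Headline figures as quoted in the text; the scale labels are this sketch's reading.
reported = [
    ("Writing tasks (Noy & Zhang)", "time_reduction", 0.40),
    ("Customer support (Brynjolfsson et al.)", "throughput_gain", 0.14),
    ("Coding task (GitHub Copilot study)", "time_reduction", 0.55),
]

common_scale = {}
for study, scale, value in reported:
    if scale == "time_reduction":
        reduction, gain = value, throughput_gain_from_time_reduction(value)
    else:
        reduction, gain = time_reduction_from_throughput_gain(value), value
    common_scale[study] = (reduction, gain)
    print(f"{study}: {reduction:.0%} less time per task, {gain:.0%} more tasks per hour")
```

Under these assumptions, a 14 percent throughput gain corresponds to roughly 12 percent less time per issue, while a 40 percent time reduction corresponds to roughly 67 percent more tasks per hour. The customer support effect is therefore smaller relative to the other two than the raw percentages suggest.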

The Experience-Level Effect

The consistency of the experience-level finding across study contexts is striking and has direct implications for healthcare operations. In each of the studies reviewed above, less experienced workers benefited more than their more experienced peers. This does not appear to be simply an artifact of low performers having more room to improve: the proposed mechanism is that AI tools encode and make accessible knowledge that experts have internalized through experience, knowledge that novices would otherwise need years to accumulate.

"AI appears to function less like a tool that makes everyone better at what they already do, and more like a mechanism for compressing the learning curve — accelerating the development of capability in workers who have not yet reached expert performance."

Synthesis of Brynjolfsson et al. (2023) and Noy & Zhang (2023)

For healthcare organizations, this finding is operationally significant. Healthcare work, whether clinical, administrative, or operational, is characterized by steep learning curves, long onboarding periods, and persistent performance variation between experienced and inexperienced staff. If AI tools can compress the time required for new employees to reach acceptable performance levels, the organizational implications extend well beyond simple task automation.

Healthcare contact centers, revenue cycle operations, and administrative processing functions all exhibit the structural characteristics — high task volume, significant experience-based performance variation, and costly onboarding — most likely to respond to AI assistance in the ways documented in the research literature.

What the Research Does Not Show

Intellectual honesty about the limits of the current evidence base is as important as characterizing the positive findings. Several caveats deserve explicit acknowledgment.

First, most productivity research has been conducted on relatively well-defined, discrete tasks — writing assignments, customer support queries, coding problems — that differ in important ways from the complex, interdependent, and high-stakes work common in healthcare settings. Whether similar gains transfer to clinical documentation, prior authorization review, or complex case management has not been rigorously established.

Second, most studies measure short-term outcomes in controlled or quasi-experimental settings. Long-term effects — including whether productivity gains persist as novelty fades, whether skill atrophy occurs, and how team dynamics shift — are largely unexamined.

Third, the quality improvements observed in some studies were measured by proxies — evaluator ratings, customer satisfaction scores — that may not translate meaningfully to healthcare quality metrics, which carry different stakes and are subject to regulatory definition.

Adoption and Workflow Design as Determinants of Outcome

Perhaps the most practically important finding from the research literature is that technology access alone does not determine outcomes — adoption behavior and workflow design are equally important. Studies that have examined variation in AI tool usage find significant differences in how individuals and teams integrate AI assistance into existing workflows, and these differences predict outcome variation as much as technical capability differences between tools.

Organizational Implication

Research on technology adoption consistently finds that the gap between potential and realized productivity gains is explained more by adoption patterns and workflow integration than by the capabilities of the underlying technology.

This finding has a direct implication for healthcare organizations: the return on AI investment is not primarily determined by which tool is selected, but by how deliberately the organization manages adoption, workflow redesign, and the behavioral change required to realize potential gains. Organizations that deploy AI without investing in these complementary factors are unlikely to capture the productivity improvements the research documents.

McKinsey's analysis of AI adoption across industries found that organizations in the top quartile of AI adoption maturity — defined by deliberate governance, workflow integration, and change management practices — outperform lower-maturity organizations not just in productivity metrics but in overall AI return on investment by factors of three to five.

Implications for Healthcare Organizations

The first wave of AI productivity research, while conducted primarily outside healthcare, yields several findings with direct relevance to healthcare operations:

The opportunity is real but uneven. AI tools demonstrably improve productivity in knowledge work tasks — but the magnitude of improvement varies significantly by task type, worker experience level, and adoption quality. Healthcare leaders should resist both the skeptic's dismissal and the vendor's promise of universal gains.

Administrative functions may offer the strongest near-term opportunity. The task characteristics that predict the largest AI productivity gains — high volume, well-defined outputs, significant experience-based performance variation — are most prevalent in administrative, operational, and service functions. Clinical applications may follow, but the administrative opportunity is present now.

New employee productivity is a meaningful lever. If the experience-level effect observed in research replicates in healthcare settings, AI tools could meaningfully compress onboarding time and reduce the performance gap between new and experienced staff — with significant implications for workforce economics and care continuity.

Adoption investment is not optional. Organizations that treat AI deployment as a technology project rather than an organizational change initiative are unlikely to capture available productivity gains. Deliberate adoption strategy — including training, workflow redesign, and performance monitoring — is a necessary complement to technology access.

The research on AI productivity is still accumulating. Healthcare-specific evidence is limited. But the patterns emerging from the first rigorous studies suggest that the opportunity is real, that its distribution is predictable, and that the organizations most likely to capture it are those that bring the same analytical discipline to AI adoption that they bring to clinical and operational improvement more broadly.

Citations & Sources

  1. Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192.
  2. Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work. National Bureau of Economic Research Working Paper No. 31161.
  3. Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: Evidence from GitHub Copilot. arXiv preprint arXiv:2302.06590.
  4. McKinsey Global Institute. (2023). The economic potential of generative AI: The next productivity frontier. McKinsey & Company.
  5. Chui, M., Hazan, E., Roberts, R., Singla, A., Smaje, K., Sukharevsky, A., Yee, L., & Zemmel, R. (2023). The economic potential of generative AI. McKinsey Quarterly.